1 Vomuro

Existsnode In Pl/Sql What Is The Assignment Operator

EXISTSNODE

Syntax


Description of the illustration existsnode.gif

Purpose

determines whether traversal of an XML document using a specified path results in any nodes. It takes as arguments the instance containing an XML document and a XPath string designating a path. The optional must resolve to a value that specifies a default mapping or namespace mapping for prefixes, which Oracle Database uses when evaluating the XPath expression(s).

The argument defaults to the namespace of the root element. If you refer to any subelement in , then you must specify , and you must specify the "who" prefix in both of these arguments.

The return value is :

  • 0 if no nodes remain after applying the XPath traversal on the document

  • 1 if any nodes remain

Examples

The following example tests for the existence of the node in the XML path of the column of the sample table :

SELECT warehouse_id, warehouse_name FROM warehouses WHERE EXISTSNODE(warehouse_spec, '/Warehouse/Docks') = 1 ORDER BY warehouse_id, warehouse_name; WAREHOUSE_ID WAREHOUSE_NAME ------------ ----------------------------------- 1 Southlake, Texas 2 San Francisco 4 Seattle, Washington

7
Searching XML Data with Oracle Text

This chapter explains the use of Oracle Text functionality in indexing and querying XML data. It contains the following sections:

Searching XML Data with Oracle Text

This chapter describes the following aspects of Oracle Text:

  • How to create a section group and index your XML document(s)
  • How to build an XML query application with Oracle Text, to search and retrieve data from your XML document(s)
  • Using Oracle Text to search data

Introducing Oracle Text


Note:

Oracle Text is a strictly server-based implementation.


Oracle Text (aka interMedia Text) can be used to search XML documents. It extends Oracle9i by indexing any text or document stored in Oracle. It can also search documents in the file system and URLs.

Oracle Text enables the following:

  • Content-based queries, such as, finding text and documents which contain particular words, using familiar, standard SQL.
  • File-based text applications to use Oracle9i to manage text and documents in an integrated fashion with traditional relational information.
  • Concept searching of English language documents.
  • Theme analysis of English language documents using the theme/gist package.
  • Highlighting hit words. With Oracle Text, you can render a document in different ways. For example, you can present documents with query terms highlighted, either the "words" of a word query or the "themes" of an ABOUT query in English. Use the CTX_DOC.MARKUP or HIGHLIGHT procedures for this.
  • Highlighting hit words. With Oracle Text, you can render a document in different ways. For example, you can present documents with query terms highlighted, either the "words" of a word query or the "themes" of an ABOUT query in English. Also, you can present XML documents with the INPATH/HASPATH query element(s) highlighted. Use CTX_DOC.MARKUP or HIGHLIGHT procedures for this.
  • With Oracle Text you can use PL/SQL packages for document presentation and thesaurus maintenance.

You can query XML data stored in the database directly, without using Oracle Text. However, Oracle Text is useful for boosting query performance.

Accessing Oracle Text

Oracle Text is a standard feature that comes with every Oracle9i Standard, Enterprise, and Personal edition license. It needs to be selected during installation. No special installation instructions are required.

Oracle Text is essentially a set of schema objects owned by CTXSYS. These objects are linked to the Oracle kernel. The schema objects are present when you perform an Oracle9i installation.

Oracle Text Now Supports XMLType

You can now perform Oracle Text searches on tables containing columns.

Further Oracle Text Examples

You can find more examples for Oracle Text and for creating section group indexes at the following site: http://otn.oracle.com/products/text

Assumptions Made in This Chapter's Examples

XML text is aVARCHAR2 or CLOB type in an Oracle9i database table with character semantics. Oracle Text can also deal with documents in a file system or in URLs, but we are not considering these document types in this chapter.

To simplify the examples included in this chapter they use a subset of the Oracle Text options and make the following assumptions:

  • All XML data here is represented using US-ASCII, a 7 bit character set.
  • Issues about whether a character such as "*" is treated as white space or as part of a word are not included.
  • Storage characteristics of the Oracle schema object that implement the Oracle Text index are not considered.
  • They focus on the SECTION GROUP parameter in the or statement. Other parameter types available for and , are , , , , and .

    Here is an example of using SECTION GROUP in :

    CREATE INDEX my_index ON my_table ( my_column ) INDEXTYPE IS ctxsys.context PARAMETERS ( 'SECTION GROUP my_section_group' ) ;
  • Specifically, the examples focus on using and , and .
  • Tagged or marked up data. In this chapter, the examples focus on how to handle XML data. Oracle Text handles many other kinds of data besides XML data.

Oracle Text Users and Roles

With Oracle Text you can use the following users/roles:

  • user to administer users
  • role to create and delete Oracle Text preferences and use Oracle Text PL/SQL packages

User CTXSYS

User is created at install time. Administer Oracle Text users as this user. User has the following privileges:

  • Modify system-defined preferences
  • Drop and modify other user preferences
  • Call procedures in the PL/SQL package to start servers and set system-parameters
  • Start a ctxsrv server
  • Query all system-defined views
  • Perform all the tasks of a user with the role

Role CTXAPP

Any user can create an Oracle Text index and issue a Text query. For additional tasks, use the role. This is a system-defined role that enables you to perform the following tasks:

  • Create and delete Oracle Text preferences
  • Use Oracle Text PL/SQL packages, such as the package

Querying with the CONTAINS Operator

Oracle Text's main purpose is to provide an implementation for the operator. The operator can be used in the WHERE clause of a SELECT statement to specify the query expression for a Text query.

CONTAINS Syntax

Here is the syntax:

...WHERE CONTAINS([schema.]column,text_query VARCHAR2,[label NUMBER])

where:

Table 7-1 CONTAINS Operator: Syntax Description
SyntaxDescription

[schema.] column

Specifies the text column to be searched on. This column must have a Text index associated with it.

text_query

Specifies the query expression that defines your search in column.

label

Optionally specifies the label that identifies the score generated by theoperator.

For each row selected,returns a number between 0 and 100 that indicates how relevant the document row is to the query. The number 0 means that Oracle found no matches in the row. You can obtain this score with the SCORE operator.


Note:

You must use the SCORE operator with a label to obtain this number.


Example 7-1 Using a Simple SELECT Statement with CONTAINS

The following example illustrates how the operator is used in a SELECT statement:

SELECT id FROM my_table WHERE CONTAINS (my_column, 'receipts') > 0

The'receipts' parameter of the operator is called the "Text Query Expression".


Note:

The SQL statement with the operator requires an Oracle Text index in order to run.


Example 7-2 Using the Score Operator with a Label to Obtain the Relevance

The following example searches for all documents in the text column that contain the word Oracle. The score for each row is selected with the operator using a label of 1:

SELECT SCORE(1), title from newsindex WHERE CONTAINS(text, 'oracle', 1) > 0 ORDER BY SCORE(1) DESC;

The operator must always be followed by the > 0 syntax. This specifies that the score value calculated by the operator must be greater than zero for the row selected.

When the operator is called, such as in a clause, the operator must reference the label value as shown in the example.

Using the WITHIN Operator to Narrow Query Down to Document Sections

When documents have internal structure such as in HTML and XML, you can define document sections using embedded tags before you index. This enables you to query within the sections using the operator.


Note:

This is only true for , but not true for AUTO_ or .


Introducing SECTION_GROUPS

You can query within attribute sections when you index with either , , or your section group type. Consider the following XML document:

<book title="Tale of Two Cities">It was the best of times.</book>

XML_SECTION_GROUP

If you use , you can specify any of the following sections:

  • Zone sections
  • Field sections
  • Attribute section
  • Special sections

This chapter only focuses on Zone, Field, and Attribute sections. For more information on Special sections see Oracle Text Reference and Oracle Text Application Developer's Guide.

Zone Sections: CTX_DLL.ADD_ZONE_SECTION Procedure

The syntax for this is:

CTX_DDL.ADD_ZONE_SECTION( group_name in varchar2, section_name in varchar2, tag in varchar2);

To define a as a Zone section, create an and define the Zone section as follows:

EXEC ctx_ddl_create_section_group('myxmlgroup', 'XML_SECTION_GROUP'); EXEC ctx_ddl.add_zone_section('myxmlgroup', 'chapter', 'chapter');

When you define Zone section as such and index the document set, you can query the XML Zone section as follows:

'Cities within chapter'

Field Sections: CTX_DLL.ADD_FIELD_SECTION Procedure

The syntax for this is:

CTX_DDL.ADD_FIELD_SECTION( group_name in varchar2, section_name in varchar2, tag in varchar2);

To define a as a Field section, create an and define the Field section as follows:

EXEC ctx_ddl_create_section_group('myxmlgroup', 'XML_SECTION_GROUP'); EXEC ctx_ddl.add_field_section('myxmlgroup', 'abstract', 'abstract');

When you define Field section as such and index the document set, you can query the XML Field section as follows:

'Cities within abstract'

Attribute Section: CTX_DLL.ADD_ATTR_SECTION Procedure

The syntax for this is:

CTX_DDL.ADD_ATTR_SECTION( group_name in varchar2, section_name in varchar2, tag in varchar2);

To define the attribute as an Attribute section, create an and define the Attribute section as follows:

EXEC ctx_ddl_create_section_group('myxmlgroup', 'XML_SECTION_GROUP'); EXEC ctx_ddl.add_attr_section('myxmlgroup', 'booktitle', 'book@title');

When you define the Attribute section as such and index the document set, you can query the XML attribute text as follows:

'Cities within booktitle'

Constraints for Querying Attribute or Field Sections

The following constraints apply to querying within Attribute or Field sections:

  • Regular queries on attribute text will not work unless qualified in a clause. Using the following XML document: <book title="Tale of Two Cities">It was the best of times.</book>

    querying on Tale will not work unless qualified with ' title@book'.

  • You cannot use Attribute or Field sections in a nested query.
  • Phrases ignore attribute text. For example, if the original document looked like: ....Now is the time for all good <word type="noun"> men </word> to come to the aid......

    The search would result in a regular query's, "good men", and ignore the intervening attribute text.

AUTO_ SECTION_GROUP/ PATH_SECTION_GROUP for INPATH and HASPATH

When you use the or to index XML documents, Oracle9i automatically creates sections.

To search on Tale within the Attribute section , include the following clause in your SELECT statement:

  • If you are using : ... WHERE CONTAINS ('Tale INPATH booktitle')>0;
  • If you are using ... WHERE CONTAINS ('Tale INPATH title@book')>0;

Dynamically Adding Sections or Stop Section Using ALTER INDEX

The syntax for is:

ALTER INDEX [schema.]index REBUILD [ONLINE] [PARAMETERS (paramstring)];

where

paramstring = 'replace [datastore datastore_pref] [filter filter_pref] [lexer lexer_pref] [wordlist wordlist_pref] [storage storage_pref] [stoplist stoplist] [section group section_group] [memory memsize] | ... | add zone section section_name tag tag | add field section section_name tag tag [(VISIBLE | INVISIBLE)] | add attr section section_name tag tag@attr | add stop section tag'

The added section applies only to documents indexed after this operation. Thus for the change to take effect, you must manually re-index any existing documents that contain the tag. The index is not rebuilt by this statement.

WITHIN Syntax for Section Querying

Here is the syntax for querying sections:

...WHERE CONTAINS(text,'XML WITHIN title') >0;...

This searches for expression within a section. If you are using the following restrictions apply to the pre-defined zone, field, or attribute section:

  • If section is a zone, expression can contain one or more operators (nested ) whose section is a zone or special section.
  • If section is a field or attribute section, expression cannot contain another operator.

You can combine and nest clauses. For finer grained searches of XML sections, you can use clauses inside select statements.

WITHIN Operator Limitations

The operator has the following limitations:

  • You cannot embed the clause in a phrase. For example, you cannot write: term1 section term2
  • You cannot combine with expansion operators, such as $ ! and *.
  • Since is a reserved word, you must escape the word with braces to search on it.

INPATH or HASPATH Operators Search Using XPath-Like Expressions

Path Indexing and Path Querying with Oracle Text

In Oracle9i Oracle Text introduced a new section type and new query operators which support an XPath-like query language. Indexes of type context with XML path searching are able to perform very complex section searches on XML documents. Here are the basic concepts of path indexing and path querying.

Path Indexing

Section searching is enabled by defining section groups. To use XML path searching, the Oracle Text index must be created with the new section group, as follows:

begin ctx_ddl.create_section_group('mypathgroup','PATH_SECTION_GROUP'); end;

To create the Oracle Text index use this command:

create index order_idx on library_catalog(text) indextype is ctxsys.context parameters ('SECTION GROUP mypathgroup');

Path Querying

The Oracle Text path query language is based on W3C XPath. For Oracle9i Release 1 (9.0.1) and higher, you can use the and operators to express path queries.

Using INPATH Operator for Path Searching in XML Documents

You can use operator to perform path searching in XML documents. Table 7-2 summarizes the ways you can use the operator for path searching.

Table 7-2 Path Searching XML Documents Using the INPATH Operator  
Path Search FeatureSyntaxDescription

Simple Tag Searching

virginia INPATH (//STATE)

Finds all documents where the word "virginia" appears between <STATE> and </STATE>. The STATE element can appear at any level of the document structure.

Case-sensitivity

virginia INPATH (STATE)

virginia INPATH (State)

Tags and attribute names in path searching are case-sensitive. virginia STATE -- finds <STATE>virginia</STATE> but NOT <State>virginia</State>. To find the latter you must do virginia State.

Top-Level Tag Searching

virginia INPATH (Legal)

virginia INPATH (/Legal)

For example, the following query finds Quijote where it occurs between <order> and </order>:

select id from library_catalog where contains(text,'Quijote INPATH(order)') > 0;

Here <order> must be the top level tag.

Finds all documents where "virginia" appears in a Legal element which is the top-level tag.'Legal' MUST be the top-level tag of the document.'virginia' may appear anywhere in this tag regardless of other intervening tags. For example:

<?xml version="1.0" standalone="yes"?>

<!-- <?xml-stylesheet type="text/xsl" href="./xsl/vacourtfiling(html).xsl"?> -->

<Legal>

<CourtFiling>

<Filing ID="f001" FilingType="Civil">

<LeadDocument>

<CaseCaption>

<CourtInformation>

<Location>

<Address>

<AddressState>VIRGINIA</AddressState>

</Address> ... </Legal>

Any Level Tag Searching

virginia INPATH (//Address)

For example, a double slash indicates "any number of levels" down. The following query finds Quijote inside a <title> tag that occurs at the top level or any lower level:

select id from library_catalog

where contains(text,'Quijote INPATH(//title)') > 0;

'Virginia' can appear anywhere within an 'Address' tag, which may appear within any other tags. for example:

<?xml version="1.0" standalone="yes"?>

<!-- <?xml-stylesheet type="text/xsl" href="./xsl/vacourtfiling(html).xsl"?> -->

<Legal>

<CourtFiling>

<Filing ID="f001" FilingType="Civil">

<LeadDocument>

<CaseCaption>

<CourtInformation>

<Location>

<Address>

<AddressState> VIRGINIA </AddressState>... </Legal>

Direct Parentage Path Searching

virginia INPATH (//CourtInformation/Location)

for example:

select id from library_catalog where contains(text,'virginia INPATH(order/item)') > 0;

Finds all documents where "virginia" appears in a element which is a direct child of a element. For example:

<?xml version="1.0" standalone="yes"?>

<!-- <?xml-stylesheet type="text/xsl" href="./xsl/vacourtfiling(html).xsl"?> -->

<Legal>

<CourtFiling>

<Filing ID="f001" FilingType="Civil">

<LeadDocument>

<CaseCaption>

<CourtInformation>

<Location>

<Address>

<AddressState> VIRGINIA </AddressState>

</Address>... </CourtInformation>

Single-Level Wildcard Searching

virginia INPATH(A/*/B)

'virginia INPATH (//CaseCaption/*/Location)'

Finds all documents where "virginia" appears in a B element which is a grandchild of an A element. For instance, <A><D><B>virginia</B></D></A>. The intermediate element does not need to be an indexed XML tag. For example:

<?xml version="1.0" standalone="yes"?>

<!-- <?xml-stylesheet type="text/xsl" href="./xsl/vacourtfiling(html).xsl"?> -->

<Legal>

<CourtFiling>

<Filing ID="f001" FilingType="Civil">

<LeadDocument>

<CaseCaption>

<CourtInformation>

<Location>

<Address>

<AddressState>VIRGINIA</AddressState>...

</Legal>

Multi-level Wildcard Searching

'virginia INPATH (Legal/*/Filing/*/*/CourtInformation)'

'Legal' must be a top-level tag, and there must be exactly one tag-level between 'Legal' and 'Filing', and two between 'Filing' and 'CourtInformation'. 'Virginia' may then appear anywhere within 'CourtInformation'. For example:

<?xml version="1.0" standalone="yes"?>

<!-- <?xml-stylesheet type="text/xsl" href="./xsl/vacourtfiling(html).xsl"?> -->

<Legal>

<CourtFiling>

<Filing ID="f001" FilingType="Civil">

<LeadDocument>

<CaseCaption>

<CourtInformation>

<Location>

<Address>

<AddressState>VIRGINIA</AddressState>

</Address>

</Location>

<CourtName>

IN THE CIRCUIT COURT OF LOUDOUN COUNTY

</CourtName>

</CourtInformation>....

Descendant Searching

virginia INPATH(A//B)

Finds all documents where "virginia" appears in a B element which is some descendant (any level) of an A element.

Attribute Searching

virginia INPATH(A/@B)

Finds all documents where "virginia" appears in the B attribute of an A element. You can search within an attribute value using the syntax <tag>/@<attribute>:

select id from library_catalog where contains(text,'dvd (//item/@type)') > 0; AND and OR

You can use boolean AND and OR to combine existence or equality predicates in a test.

select id from library_catalog where contains(text,'Levy or Cervantes (//title)') >0;

Descendant/Attribute Existence Testing

virginia INPATH (A[B])

You can search for documents using the any-level tag searching:

select id from library_catalog where contains (text,'Quijote INPATH(/order/title)') > 0;

You can also use the "*" as a single level wildcard. The * matches exactly one level.:

select id from library_catalog where contains (text,'Cervantes INPATH(/order/*/author)') > 0;

Finds all documents where "virginia" appears in an A element which has a B element as a direct child.

  • virginia INPATH A[.//B] -- Finds all documents where "virginia" appears in an A element which has a B element as a descendant (any level).
  • virginia INPATH A[@B] -- Finds all documents where "virginia" appears in an A element which has a B attribute

Attribute Value Testing

Within Equality

virginia INPATH A[@B = "foo"]

That means that:

virginia INPATH (A[@B = "pot of gold"]), would, with the default lexer and stoplist, match any of the following:

<A B="POT OF GOLD">virginia</A>

By default, lexing is case-independent, so "pot" matches "POT", <A B="POT BLACK GOLD">virginia</A>

By default, "of" is a stopword, and, in a query, would match any word in that position, <A B=" Pot OF Gold ">virginia</A>

Finds all documents where "virginia" appears in an A element which has a B attribute whose value is "foo".

  • Only equality is supported as a test. Range operators and functions are not supported.
  • The left-hand-side of the equality MUST be an attribute or tag. Literals here are not allowed.
  • The right-hand-side must be a literal. Tags and attributes here are not allowed.

Within equality (See "Using INPATH Operator for Path Searching in XML Documents") is used to evaluate the test.

Whitespace is mainly ignored in text indexing. Again, lexing is case-independent:

<A B="pot_of_gold">virginia</A>

Underscore is a non-alphabetic character, and is not a join character by default. As a result, it is treated more or less as whitespace and breaks up that string into three words.

Example:

select id from library_catalog where contains(text,'(Bob the Builder) (//item[@type="dvd"])') > 0;

The following will not return rows:

select id from library_catalog where contains(text,'(Bob the Builder) (//item[@type="book"])') > 0;

Numeric Equality

virginia INPATH (A[@B = 5])

Numeric literals are allowed. But they are treated as text. The within equality is used to evaluate. This means that the query does NOT match. That is, <A B="5.0">virginia</A> does not match A[@B=5] where "5.0", a decimal is not considered the same as 5, an integer.

Conjunctive Testing

virginia INPATH (A[B AND C])

virginia INPATH (A[B AND @C = "foo"])...

Predicates can be conjunctively combined.

Combining Path and Node Tests

virginia INPATH (A[@B = "foo"]/C/D)

virginia INPATH(A//B[@C]/D[E])...

Node tests can be applied to any node in the path.

Using HASPATH Operator for Path Searching in XML Documents

Use the operator to find all XML documents that contain a specified section path. is used when you want to test for path existence. It is also very useful for section equality testing. To find all XML documents where an order has an item within it:

select id from library_catalog where contains(text,'HASPATH(order/item)') > 0;

will return all documents where the top-level tag is a order element which has a item element as a direct child.

In Oracle9i, Oracle Text introduces a new section type and new query operators which support an XPath-like query language. Indexes of type context with XML path searching are able to perform very complex section searches on XML documents. Here are more examples of path querying using and . Assuming the following XML document:

<?xml version="1.0"?> <order> <item type="book"> <title>Crypto</title> <author>Levi</author> </item> <item type="dvd"> <title> Bob the Builder</title> <author>Auerbach</author> </item> <item type="book"> <title>Don Quijote</title> <author>Cervantes</author> </item> </order>

In general, use and operators only when your index has been created with . Use of enables path searching. Path searching extends the syntax of the operator so that the section name operand (right-hand-side) is a path instead of a section name.

Using HASPATH Operator for Path Existence Searching


Note:

The operator functions in a similar fashion to the existsNode() in .


Only use the operator when your index has been created with the . The syntax for the operator is:

  • WHERE CONTAINS(column, 'HASPATH(path)'...): Here searches an XML document set and returns a score of 100 for all documents where path exists. Parent and child paths are separated with the / character, for example, A/B/C. For example, the query: ...WHERE CONTAINS (col,'HASPATH(A/B/C)')>0;

    finds and returns a score of 100 for the document:

    <A><B><C>Virginia</C></B></A>

    without having to reference Virginia at all.

  • WHERE CONTAINS(column, 'HASPATH(A="value")'...): Here the clause searches an XML document set and returns a score of 100 for all documents that have element A with content value and only that value. is used to test equality. This is the "Section Equality Testing" feature of the operator. The query: ...WHERE CONTAINS virginia INPATH A

    finds <A>virginia</A>, but it also finds <A>virginia state</A>. To limit the query to the term virginia and nothing else, you can use a section equality test with the operator. For example:

    ... WHERE CONTAINS (col,'HASPATH(A="virginia")'

    finds and returns a score of 100 only for the first document, and not the second.

Tag Value Equality Testing

You can do tag value equality test with :

select id from library_catalog

where CONTAINS(text,'HASPATH (//author="Auerbach")') >0;

Building a Query Application with Oracle Text

To build a Oracle Text query application carry out the following steps:

  1. Create a section preference group. Before you create a section group and Oracle text index you must first determine the role you will need and grant the appropriate privilege. See "Oracle Text Users and Roles", and grant the appropriate privilege.

    After creating and preparing your data, you are ready to perform the next step. See "Step 1. Create a Section Group Preference".

  2. Add sections or stop_sections
  3. Create an Oracle Text index based on the section group you created. Using the section preference created, you then create an Oracle Text index. See Building a Query Application with Oracle Text.
  4. Build your query application using the operator. Now you can finish building your query application. See "Building a Query Application with Oracle Text".

What Role Do You Need?

First determine the role you need. See Oracle Text Reference and "Oracle Text Users and Roles", and grant the appropriate privilege as follows:

CONNECT system/manager GRANT ctxapp to scott; CONNECT scott/tiger

Step 1. Create a Section Group Preference

The first thing you must do is create a preference. This section describes how to create section preferences using , , and . Table 7-3 describes the groups and summarizes their features.

Table 7-3 Comparing Oracle Text Section Groups
Section GroupDescription

XML_SECTION_GROUP

Use this group type for indexing XML documents and for defining sections in XML documents.

AUTO_SECTION_GROUP

Use this group type to automatically create a zone section for each start-tag/end-tag pair in an XML document. The section names derived from XML tags are case-sensitive as in XML. Attribute sections are created automatically for XML tags that have attributes. Attribute sections are named in the form attribute@tag. Stop sections, empty tags, processing instructions, and comments are not indexed. The following limitations apply to automatic section groups:

  • You cannot add zone, field or special sections to an automatic section group.
  • Automatic sectioning does not index XML document types (root elements.) However, you can define stop-sections with document type.
  • The length of the indexed tags including prefix and namespace cannot exceed 64 characters. Tags longer than this are not indexed.

PATH_SECTION_GROUP

Use this group type to index XML documents. Behaves like the . With this section group you can do path searching with the and operators. Queries are case-sensitive for tag and attribute names.

How is PATH_SECTION_GROUP Similar to AUTO_SECTION_GROUP?

Documents are assumed to be XML, Every tag and every attribute is indexed by default, Stop sections can be added to prevent certain tags from being indexed, Only stop sections can be added , and sections cannot be added, When indexing XML document collections, you do not need to explicitly define sections as Oracle automatically does this for you.

How Does PATH_SECTION_GROUP Differ From AUTO_SECTION_GROUP?

Path Searching is allowed at query time (see "Case Study: Searching XML-Based Conference Proceedings" and "You can use INPATH operator to perform path searching in XML documents. Table 7-2 summarizes the ways you can use the INPATH operator for path searching.") with the new and operators, Tag and attribute names are case-sensitive in queries.


Note:

If you are using the or to index an XML document collection, you need not explicitly define sections since the system does this for you during indexing.


Deciding Which Section Group to Use

How do you determine which section groups is best for your application? This depends on your application. Table 7-4 lists some general guidelines to help you decide which of the , , or section groups to use when indexing your XML documents, and why.

Table 7-4 Guidelines for Choosing XML_, AUTO_, or PATH_ Section Groups
Application CriteriaXML_section_...AUTO_section_...PATH_section_...

You are using XPATH search features

--

--

Yes

You know the layout and structure of your XML documents, and you can predefine the sections on which users are most likely to search.

Yes

--

--

You do not know which tags users are most likely to search.

--

Yes

--

Query performance, in general

Fastest

Little slower than XML_section_...

Little slower than AUTO_section_...

Indexing performance, in general

Fastest

Little slower than XML_section_...

Little slower than AUTO_section_...

Index size

Smallest

Little larger than XML_section_...

Little larger than AUTO_section_...

Other features

Mappings can be defined so that tags in one or different DTDs can be mapped to one section. Good for DTD evolution and data aggregation.

Simplest. No need to define mapping, add_stop_section can be used to ignore some sections.

Designed for more sophisticated XPATH- like queries

Creating a Section Preference with XML_SECTION_GROUP

The following command creates a section group called, xmlgroup, with the group type:

EXEC ctx_ddl.create_section_group('myxmlgroup', 'XML_SECTION_GROUP');

Creating a Section Preference with AUTO_SECTION_GROUP

You can set up your indexing operation to automatically create sections from XML documents using the section group . Here, Oracle creates zone sections for XML tags. Attribute sections are created for those tags that have attributes, and these attribute sections are named in the form "tag@attribute."

The following command creates a section group called autogroup with the group type. This section group automatically creates sections from tags in XML documents.

EXEC ctx_ddl.create_section_group('autogroup', 'AUTO_SECTION_GROUP');

Note:

You can add attribute sections only to XML section groups. When you use , attribute sections are created automatically. Attribute sections created automatically are named in the form tag@attribute.


Creating a Section Preference with PATH_SECTION_GROUP

To enable path section searching, index your XML document with . For example:

EXEC ctx_ddl.create_section_group('xmlpathgroup', 'PATH_SECTION_GROUP');

Step 2. Set the Preference's Attributes

To set the preference's attributes for , use the following procedures:

  • Add_Zone_Section
  • Add_Attr_Section
  • Add_Field_Section
  • Add_Special_Section

To set the preference's attributes for and , use the following procedures:

There are corresponding . sections and . section commands.

2.1 XML_SECTION_GROUP: Using CTX_DDL.add_zone_section

The syntax for .add_zone_section follows:

CTX_DDL.Add_Zone_Section ( group_name => 'my_section_group' /* whatever you called it in the preceding section */ section_name => 'author' /* what you want to call this section */ tag => 'my_tag' /* what represents it in XML */ );

where 'my_tag' implies opening with <my_tag> and closing with </my_tag>.

add_zone_section Guidelines

add_zone_section guidelines are listed here:

  • Call . for each tag in your XML document that you need to search on.

2.2 XML_SECTION_GROUP: Using CTX_DDL.Add_Attr_Section

The syntax for . follows:

CTX_DDL.Add_Attr_Section ( /* call this as many times as you need to describe the attribute sections */ group_name => 'my_section_group' /* whatever you called it in the preceding section */ section_name => 'author' /* what you want to call this section */ tag => 'my_tag' /* what represents it in XML */ );

where 'my_tag' implies opening with <my_tag> and closing with </my_tag>.

Add_Attr_Section Guidelines

Add_Attr_Section guidelines are listed here:

  • Consider meta_data attribute author:

    <meta_data author = "John Smith" title="How to get to Mars">

adds an attribute section to an XML section group. This procedure is useful for defining attributes in XML documents as sections. This enables searching XML attribute text with the operator.

The section_name:

  • Is the name used for queries on the attribute text.
  • Cannot contain the colon (:) or dot (.) characters.
  • Must be unique within group_name.
  • Is case-insensitive.
  • Can be no more than 64 bytes.

The tag specifies the name of the attribute in tag@attr format. This is case-sensitive.


Note:

In the procedure, you can have many tags all represented by the same section name at query time. Explained in another way, the names used as the arguments of the keyword can be different from the actual XML tag names. That is many tags can be mapped to the same name at query time. This feature enhances query usability.


2.3 XML_SECTION_GROUP: Using CTX_DDL.Add_Field_Section

The syntax for . follows:

CTX_DDL.Add_Field_Section ( group_name => 'my_section_group' /* whatever you called it in the preceding section */ section_name => 'qq' /* what you want to call this section */ tag => 'my_tag' /* what represents it in XML */ ); visible => TRUE or FALSE );

Add_Field_Section Guidelines

Add_Field_Section guidelines are listed here:

  • Searches using Field_Sections are faster than those using Zone_Section.
  • Visible attribute: This is available in Add_Field_Section but not available in the Add_Zone_section. If is set to TRUE then the text within the Field section will be indexed as part of the enclosing document. For example: <state> Virginia </state> CTX_DDL.Add_Field_Section ( group_name => 'my_section_group' section_name => 'state' tag => 'state' visible => TRUE or FALSE );

    If is set to TRUE, then searching on Virginia without specifying the Field section produces a hit.

    If is set to FALSE, then searching on Virginia without specifying the Field section does not produce a hit.

How Attr_Section Differs from Field_Section

Attribute section differs from Field section in the following ways:

  • Attribute text is considered invisible, hence the following clause: WHERE CONTAINS (..., '... jeeves',...)...

    does NOT find the document. This is similar to when Field sections have set to FALSE. Unlike Field sections, however, Attribute section within searches can distinguish between occurrences. Consider the document:

    <comment author="jeeves"> I really like Oracle Text </comment> <comment author="bertram"> Me too </comment>

    the query:

    WHERE CONTAINS (...,'(cryil and bertram) WITHIN author', ...)...

    will NOT find the document, because "jeeves" and "bertram" do not occur within the SAME attribute text.

  • Attribute section names cannot overlap with zone or field section names although you can map more than one to a single section name. Attribute sections do not support default values. Given the document: <!DOCTYPE foo [ <!ELEMENT foo (bar)> <!ELEMENT bar (#PCDATA)> <!ATTLIST bar rev CDATA "8i"> ]> <foo> <bar>whatever</bar> </foo>

    and attribute section:

    ctx_ddl.add_attr_section('mysg','barrev','bar@rev');

    the query:

    8i within barrev does not hit the document, although in XML semantics, the "bar" element has a default value for its "rev" attribute.

2.5 AUTO_SECTION_GROUP: Using CtX_DDL.Add_Stop_Section

CtX_DDL.Add_Stop_Section ( group_name => 'my_section_group' /* whatever you called it in the preceding section */ section_name => 'qq' /* what you want to call this section */ );

Step 3. Create an Index Using the Section Preference Created in Step 2

Create an index depending on which section group you used to create a preference:

Creating an Index Using XML_SECTION_GROUP

To index your XML document when you have used , you can use the following statement:

CREATE INDEX myindex ON docs(htmlfile) INDEXTYPE IS ctxsys.context parameters('section group xmlgroup');

Creating an Index Using AUTO_SECTION_GROUP

The following statement creates the index, myindex, on a column containing XML files using the :

CREATE INDEX myindex ON xmldocs(xmlfile) INDEXTYPE IS ctxsys.context PARAMETERS ('section group autogroup');

Creating an Index Using PATH_SECTION_GROUP

To index your XML document when you have used , you can use the following statement:

CREATE INDEX myindex ON xmldocs(xmlfile) INDEXTYPE IS ctxsys.context PARAMETERS ('section group xmlpathgroup');

Example 7-3 Creating an Index Using XML_SECTION_GROUP

EXEC ctx_ddl_create_section_group('myxmlgroup', 'XML_SECTION_GROUP'); /* ADDING A FIELD SECTION */ EXEC ctx_ddl.Add_Field_Section /* THIS IS KEY */ ( group_name =>'my_section_group', section_name =>'author',/* do this for EVERY tag used after "WITHIN" */ tag =>'author' ); EXEC ctx_ddl.Add_Field_Section /* THIS IS KEY */ ( group_name =>'my_section_group', section_name =>'document',/*do this for EVERY tag after "WITHIN" */ tag =>'document' ); ... / /* ADDING AN ATTRIBUTE SECTION */ EXEC ctx_ddl.add_attr_section('myxmlgroup', 'booktitle', 'book@title'); /* The more sections you add to your index, the longer your search will take.*/ /* Useful for defining attributes in XML documents as sections. This allows*/ /* you to search XML attribute text using the WITHIN operator.*/ /* The section name: /* ** Is used for WITHIN queries on the attribute text. ** Cannot contain the colon (:) or dot (.) characters. ** Must be unique within group_name. ** Is case-insensitive. ** Can be no more than 64 bytes. ** The tag specifies the name of the attribute in tag@attr format. This is case-sensitive. */ /* Names used as arguments of the keyword WITHIN can be different from the actual XML tag names. Many tags can be mapped to the same name at query time.*/ /* Call CTX_DDL.Add_Zone_Section for each tag in your XML document that you need to search on. */ EXEC ctx_ddl.add_zone_section('myxmlgroup', 'mybooksec', 'mydocname(book)'); CREATE INDEX my_index ON my_table ( my_column ) INDEXTYPE IS ctxsys.context PARAMETERS ( 'SECTION GROUP my_section_group' ); SELECT my_column FROM my_table WHERE CONTAINS(my_column, 'smith WITHIN author') > 0;

Step 4. Create Your Query Syntax

See the section, "Querying with the CONTAINS Operator" for information about how to use the operator in query statements.

Querying Within Attribute Sections

You can query within attribute sections when you index with either or as your section group type.

Assume you have an XML document as follows:

<book title="Tale of Two Cities">It was the best of times.</book>

You can define the section as the attribute section title. You can do so with the . procedure or dynamically after indexing with ALTER INDEX.


Note:

When you use the to index XML documents, the system automatically creates attribute sections and names them in the form .


If you use the , you can name attribute sections anything with ..

To search on Tale within the attribute section title, issue the following query:

WHERE CONTAINS (...,'Tale WITHIN title', ...)

When you define the TITLE attribute section as such and index the document set, you can query the XML attribute text as follows:

... WHERE CONTAINS (...,'Cities WITHIN booktitle', ....)...

When you define the AUTHOR attribute section as such and index the document set, you can query the XML attribute text as follows:

... WHERE 'England WITHIN authors'

Example 7-4 Querying an XML Document

This example does the following:

  1. Creates and populates table res_xml
  2. Creates an index, section_group, and preferences
  3. Paramaterizes the preferences
  4. Runs a test query against res_xml drop table res_xml; CREATE TABLE res_xml ( pk NUMBER PRIMARY KEY , text CLOB ) ; insert into res_xml values(111, 'ENTITY chap8 "Chapter 8, <q>Keeping it Tidy: the XML Rule Book </q>"> this is the document section'); commit; --- --- script to create index on res_xml --- --- cleanup, in case we have run this before DROP INDEX res_index ;

Leave a Comment

(0 Comments)

Your email address will not be published. Required fields are marked *