IBM FileNet P8, Version 5.2            

Queries

The essential parts of a Content Engine search are a SQL statement, contained in a SearchSQL instance, and the object store or object stores searched, contained in a SearchScope object. Content searches are specified through the CONTAINS operator in the SQL statement.

The SQL Statement

There are helper methods on the SearchSQL class to assist you in constructing a SQL statement. Alternatively, you can construct a SQL statement independently and pass it to a SearchSQL instance as a string. SQL statements need to follow the IBM® FileNet® standard, which generally conforms to SQL-92, with extensions for IBM FileNet specific constructs. See SQL Syntax Reference for a complete description.

The SearchSQL helper methods are supplied for assistance in building SQL statements, and cannot provide the level of specification you can achieve with an independently constructed statement. However, in a development environment, you can use the helper methods for initial construction of the SQL statement, then use the SearchSQL.toString method to get the SQL statement string and manually refine the SQL statement.
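
As a minimal sketch of the independent-construction route, the plain-Java helper below assembles a statement string that could then be passed to a SearchSQL instance. The QueryText class and its methods are hypothetical and are not part of the Content Engine API. Doubling single quotes inside a string literal is the SQL-92 convention and is assumed to apply here; confirm it against the SQL Syntax Reference.

```java
/** Hypothetical helper for illustration; not part of the Content Engine API. */
final class QueryText {
    private QueryText() {}

    /** Wraps a value in single quotes, doubling embedded quotes (SQL-92 style). */
    static String quote(String literal) {
        return "'" + literal.replace("'", "''") + "'";
    }

    /** Builds a query for documents whose title starts with the given prefix. */
    static String titlesStartingWith(String prefix) {
        return "SELECT DocumentTitle FROM Document"
             + " WHERE DocumentTitle LIKE " + quote(prefix + "%")
             + " ORDER BY DocumentTitle";
    }

    public static void main(String[] args) {
        // The resulting string is what would be handed to a SearchSQL instance.
        System.out.println(titlesStartingWith("C"));
    }
}
```

The sketch does not escape LIKE wildcard characters that might appear in the prefix itself; a real application would need to decide how to treat them.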

The Search Scope

The SearchScope methods execute the SQL statement on one or more object stores to find objects (IndependentObject instances), database rows (RepositoryRow instances), or metadata (ClassDescription instances).

You can use the SearchScope class to search one or more object stores using a single query. To issue a query on multiple object stores, call the constructor for SearchScope with an array of object stores, similar to the following:

// os1 and os2 are ObjectStore instances that were obtained earlier.
ObjectStore[] osArray = new ObjectStore[] {os1, os2};
SearchScope objStores = new SearchScope(osArray, MergeMode.INTERSECTION);

Then use the SearchScope instance (objStores) to execute a query. The query will merge the results from the object stores, and return them in a single, ordered list.

For example, if the "SELECT DocumentTitle FROM Document WHERE DocumentTitle LIKE 'C%' ORDER BY DocumentTitle" search statement is executed on a list of two object stores, the results (in a single collection) might be:

Cars 
City 
Concrete 
Cows 

Cars and Concrete might come from the first object store, and City and Cows might come from the second object store. Note that the results from the different object stores are intermingled in the list, ordered by the ORDER BY clause of the search statement.
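
The intermingling can be pictured with a small stand-alone sketch. It does not call the Content Engine API and does not describe how the server actually merges results; it only shows two per-store lists, each already sorted by DocumentTitle, combined into one ordered list. The OrderedMerge class is hypothetical, and plain String comparison stands in for whatever collation the server applies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of merging per-store results that are already ordered. */
final class OrderedMerge {
    private OrderedMerge() {}

    /** Merges two ascending lists into one ascending list. */
    static List<String> merge(List<String> first, List<String> second) {
        List<String> merged = new ArrayList<>(first.size() + second.size());
        int i = 0;
        int j = 0;
        while (i < first.size() && j < second.size()) {
            // Take the smaller head element; ties go to the first list.
            if (first.get(i).compareTo(second.get(j)) <= 0) {
                merged.add(first.get(i++));
            } else {
                merged.add(second.get(j++));
            }
        }
        // Append whatever remains in either list.
        while (i < first.size()) merged.add(first.get(i++));
        while (j < second.size()) merged.add(second.get(j++));
        return merged;
    }

    public static void main(String[] args) {
        List<String> firstStore = List.of("Cars", "Concrete");
        List<String> secondStore = List.of("City", "Cows");
        System.out.println(merge(firstStore, secondStore)); // [Cars, City, Concrete, Cows]
    }
}
```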

Matching Classes and Properties

Classes and properties are defined in each object store. A class or property in one object store is considered to be the same class or property existing in another object store only if the compared classes or properties have matching GUIDs. Having the same name does not indicate that the compared classes or properties are the same.

GUID values are stored in properties on both the ClassDefinition and PropertyDefinition classes.

A ClassDefinition object has both an Id property that is a GUID, and an AliasIds property that is a list of GUIDs. The Id property contains the GUID that is usually used to identify ClassDefinition objects. The AliasId properties can alternatively be used to identify these objects. Two ClassDefinition objects from two different object stores are considered to be the same if the value of either the Id property or AliasId property of one ClassDefinition object matches the value of the corresponding property on the other ClassDefinition object.

For example, the query "SELECT * from DocSubClass" executed on a list of two object stores might return objects named DocSubClass from both object stores, but if these objects do not have the same Id or AliasId property value, they will not be recognized as the same object. Attempting to query both object stores using the name DocSubClass will not return any rows from the second object store. (However, the object named DocSubClass in the second object store can be referenced using the string format of the ClassDefinition.Id property, rather than the name.)

PropertyDefinition objects have Id, PrimaryId, and AliasId properties. For PropertyDefinition objects, the PrimaryId property is used to identify the object, rather than the Id property. (Note that the PrimaryId property is the same as the Id property of the PropertyTemplate object to which the property refers.) Two PropertyDefinition objects from two different object stores are therefore considered to be the same if either the PrimaryId or AliasId property value of one PropertyDefinition object matches the value of the corresponding property on the other PropertyDefinition object, provided that both PropertyDefinition objects are on ClassDefinition objects that also match.

The AliasId properties for both ClassDefinition and PropertyDefinition objects are cumulative. For instance, suppose four objects are to be merged from object stores A, B, C, D, with the Id and AliasId values shown below (using single digit integers for brevity):

Object Store   Class Id   Alias Id   Ids of Class
OS-A           1          2          1,2
OS-B           2          3          1,2,3
OS-C           3          4          1,2,3,4
OS-D           4          (none)     1,2,3,4

The values in the Ids of Class column indicate the cumulative object GUIDs, and if matched by any Id or AliasId of another object, will result in the merging of the two objects for the purposes of the query. So, all of the objects in the table are aliased together as the same object. Note that this example illustrates how IDs are matched; a class alias scheme this complex in a real deployment is unlikely.
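
The end result of the cumulative example (all four classes treated as one) can be reproduced with a small stand-alone sketch. It groups IDs with a simple union-find structure, which is a technique chosen here for illustration and not a description of the server's algorithm. The AliasGroups class is hypothetical, and integers stand in for GUIDs as in the first table above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration: groups class IDs that are linked through alias IDs. */
final class AliasGroups {
    // Maps each ID to its parent; a root ID maps to itself.
    private final Map<Integer, Integer> parent = new HashMap<>();

    /** Returns the representative ID of the group that contains id. */
    int find(int id) {
        parent.putIfAbsent(id, id);
        int root = id;
        while (parent.get(root) != root) {
            root = parent.get(root);
        }
        return root;
    }

    /** Registers a class: its Id and all of its alias IDs end up in one group. */
    void addClass(int classId, List<Integer> aliasIds) {
        int root = find(classId);
        for (int alias : aliasIds) {
            parent.put(find(alias), root);
        }
    }

    /** True if the two IDs were linked, directly or through a chain of aliases. */
    boolean sameClass(int idA, int idB) {
        return find(idA) == find(idB);
    }

    public static void main(String[] args) {
        AliasGroups groups = new AliasGroups();
        groups.addClass(1, List.of(2)); // OS-A
        groups.addClass(2, List.of(3)); // OS-B
        groups.addClass(3, List.of(4)); // OS-C
        groups.addClass(4, List.of());  // OS-D
        System.out.println(groups.sameClass(1, 4)); // true
    }
}
```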

The typical aliasing scheme is:

Object Store   Class Id   Alias Id   Ids of Class
OS-A           1          (none)
OS-B           2          1
OS-C           3          1
OS-D           3          1

Duplicate matches are not allowed for alias IDs, which means that a single object cannot match more than one other object, and a single property cannot match more than one other property. If alias IDs are set up so that duplicate matches occur, an exception will be thrown and the multiple object store query will not be allowed for any objects across that combination of object stores (not just the objects that contain the duplicate alias IDs).

The system administrator should normally create the classes and properties on one object store, and then export those definitions from that object store and import them to any other object store that needs to support queries across object stores. This export/import operation ensures that the IDs of the classes and properties are the same in each object store. The imported names will also be the same, which is a good practice to follow.

If the object stores that must support queries across object stores contain pre-existing objects with different IDs, then the alias IDs must be used as the alternate identifier. In this case, the system administrator must assign alias IDs to the intended matching objects and properties on each object store. When assigning alias IDs, the ClassDefinition.Id property of an object in one object store is assigned to the AliasId list of that object in another object store. Additionally, the PropertyDefinition.PrimaryId property of a property in one object store is assigned to the AliasId list of the property in another object store.

Note: If two object stores need matching objects, the alias IDs for the corresponding objects need to be assigned on only one of the two object stores.

Class and Property Names

When determining names of classes or properties, it is the first object store in which the class is encountered that determines the name. For example, suppose there is an object named "apple" in the first object store, and "orange" in a second object store, and that both objects have the same GUID value for their Id property. For the purpose of an object store query that executes across both object stores, any reference to the object having the name "apple" would match both the apple and orange objects. Any name reference to the object having the name "orange" would throw an undefined class exception.
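
The apple and orange case can be modeled with a short stand-alone sketch. It is only an illustration of the naming rule, assuming each object store is reduced to a map from class GUID to local name; the MergedNames class, the "guid-1" value, and the exception type are hypothetical and do not come from the Content Engine API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical illustration: the first store that defines a GUID decides its name. */
final class MergedNames {
    private final Map<String, String> guidByName = new LinkedHashMap<>();

    /** Each map is one object store (GUID to local class name), given in search order. */
    MergedNames(List<Map<String, String>> storesInOrder) {
        Map<String, String> nameByGuid = new LinkedHashMap<>();
        for (Map<String, String> store : storesInOrder) {
            // putIfAbsent keeps the name from the first store that defines the GUID.
            store.forEach(nameByGuid::putIfAbsent);
        }
        nameByGuid.forEach((guid, name) -> guidByName.put(name, guid));
    }

    /** Returns the GUID for a class name, or fails if the merged metadata lacks the name. */
    String resolve(String className) {
        String guid = guidByName.get(className);
        if (guid == null) {
            throw new NoSuchElementException("Undefined class: " + className);
        }
        return guid;
    }

    public static void main(String[] args) {
        MergedNames names = new MergedNames(List.of(
                Map.of("guid-1", "apple"),    // first object store
                Map.of("guid-1", "orange"))); // second object store
        System.out.println(names.resolve("apple")); // guid-1
        try {
            names.resolve("orange");
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage()); // Undefined class: orange
        }
    }
}
```

Reversing the order of the two maps makes "orange" the resolvable name instead, which is the order dependence that the next paragraph discusses.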

Because the search order of the object stores can affect name-based queries, use the same object store order whenever you perform queries across object stores. Doing so is also more efficient. Merging object stores A and B does not produce the same results as merging B and A, so the server must cache order-dependent merged metadata for each ordering (A and B, and also B and A). Changing the order from one query to the next can cause excessive amounts of metadata to be cached, resulting in either too much memory being consumed by the cache, or thrashing as metadata is flushed from the cache to restrict its size and then reconstituted later.

Merge Mode

The merge mode specified for a query across object stores affects how classes and properties are merged. There are two merge modes: intersection and union (MergeMode.INTERSECTION and MergeMode.UNION).

For an intersection merge, only objects and properties defined in all object stores are present in the merged metadata, and only these objects and properties may be referenced in a search. Any class or property that exists in one object store, but does not have a matching class or property in every object store, is excluded from the merged metadata, and cannot be used in a search.

For a union merge, all classes and properties from all object stores are present in the merged metadata, and all classes and properties can be returned.
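
The difference between the two modes can be sketched with ordinary set operations. The stand-alone example below is illustrative only: each object store is reduced to a set of property identifiers that are assumed to have been matched by GUID or alias already, and the MergedMetadata class and the "id-n" values are hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of union and intersection merges over property IDs. */
final class MergedMetadata {
    private MergedMetadata() {}

    /** Union: every property from every store is present. */
    static Set<String> union(List<Set<String>> stores) {
        Set<String> merged = new TreeSet<>();
        for (Set<String> store : stores) {
            merged.addAll(store);
        }
        return merged;
    }

    /** Intersection: only properties that every store defines are present. */
    static Set<String> intersection(List<Set<String>> stores) {
        Set<String> merged = new TreeSet<>();
        if (stores.isEmpty()) {
            return merged;
        }
        merged.addAll(stores.get(0));
        for (Set<String> store : stores) {
            merged.retainAll(store);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Set<String>> stores = List.of(
                Set.of("id-1", "id-2", "id-3"),  // first object store
                Set.of("id-2", "id-3", "id-4")); // second object store
        System.out.println(union(stores));        // [id-1, id-2, id-3, id-4]
        System.out.println(intersection(stores)); // [id-2, id-3]
    }
}
```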

As an example, assume the following:

(Note that OS1 is the first object store in the collection.) The following custom properties then exist for "Alpha" in each object store:

If you specify MergeMode.UNION, the properties returned are:

If you specify MergeMode.INTERSECTION, the properties returned are:

Attempts to select either PropertyA or PropertyD will result in an undefined property exception.

If the classes had the same GUIDs for the same names, but the properties had different GUIDs and were not aliased, the MergeMode.UNION for the above example would have the following properties:

If you executed the select statement "SELECT * FROM Alpha", the result would be a row with ten columns for each object store that contained a row. Each column in the rows returned would be non-null only if the row came from the preceding object store in the list.

If the select statement was "SELECT PropertyA, PropertyB, PropertyC, PropertyD FROM Alpha", PropertyA would come only from OS1 and would be null for rows from any other object store. Similarly, PropertyB would come only from OS1, PropertyC from OS1, and PropertyD from OS2. You could not select just PropertyB from OS3 based on the property name, so this configuration is not very useful, illustrating why you need to put alias IDs on properties (or export/import across object stores to make the IDs match); otherwise, the query results might not be meaningful.

Returned Objects

For queries across object stores, when a property having the same GUID does not have the same name in each object store, the type of objects returned will affect the property name: If RepositoryRow objects are returned, the property gets the name from the first object store in which it is defined, and the name is the same for rows from any subsequent object store in the list. If IndependentObject objects are returned, the property will be named according to each object store in which it is defined.

RepositoryRow objects differ from IndependentObject objects in some notable ways:

As an example, suppose you execute the statement "SELECT apple FROM someclass" against a list of two object stores, where the property "apple" in the first object store matches (by GUID) a property named "orange" in the second object store. A query that returns RepositoryRow objects will always return properties named apple, regardless of which object store they came from. A query that returns IndependentObject objects will return a property name of apple for data from the first object store and a property name of orange for data from the second object store. If this were not the case, attempts to do updates using the IndependentObject objects returned from the second object store would generate the error "Property apple not defined".

When RepositoryRow objects are returned, properties can be renamed in the results. For instance, you could call SearchScope.fetchRows to execute "SELECT Owner AS Bob FROM Document". In the results, each RepositoryRow object would then have a property named Bob.

However, you cannot use the AS clause when returning IndependentObject objects. IndependentObject objects can be used in a subsequent update, so, in the preceding RepositoryRow example, there probably would be no property named Bob to update, nor would it be useful to try to update a (possible) property named Bob using "Owner".

Content Searches

Content (full-text) searches include in the query words or phrases that might be stored in objects, or in string properties of these objects. For the content in an object or its string properties to be searched, you must enable content-based retrieval (CBR) for the object and any of its string properties to be included in a content search. This is controlled by the (boolean) value of the IsCBREnabled property on the following classes:

The IsCBREnabled property can be enabled only for Document, Annotation, CustomObject, and Folder objects.

A content search is initiated by a CONTAINS function in the SQL statement contained in SearchSQL. The CONTAINS function can search content in all properties, or in a single property.

See CBR Queries for more information about the CONTAINS functions, and Content-Based Retrieval for information about administrative interfaces for full-text information.

Note: Full-text queries can take a considerable amount of time to execute. Some queries can finish in a few seconds, while others could potentially run for hours. Your applications should be written to allow the user to set a timeout; a single default value is probably not sufficient. The user settings should ensure that either the query does not run longer than desired, or that the timeout value is high enough to enable the query to finish execution. Note that the timeout value is the time required to fetch a page for a continuable query, not the time to fetch all pages for the query.

Stored Searches

A StoredSearch object can be one of two types: stored search or search template. Both types are persisted to an object store and are designed for performing searches multiple times.

Note: Stored searches are only available for use when the Stored Search Extensions add-on is installed.

The content of a StoredSearch object is the search criteria in the form of an XML string. It is subclassed from the Document object, so when you instantiate a StoredSearch object, you can work with it in the same ways as you work with a Document object (such as checking out the stored search, setting its content, checking it back in, filing it into a folder, and deleting it).

A StoredSearch object is identified as a stored search or a search template by the value of the searchtype element in the XML. The StoredSearch object can query for Document, Folder or CustomObject objects. The XML objecttype attribute identifies the object type for the query.

Only one of the object types (Document, Folder, or Custom Object) can be specified per search clause in the XML. Each search clause must be handled as an individual query, requiring a separate SearchScope call to execute each search clause.

You can create stored searches and search templates using Search Designer in Workplace XT and by saving the XML in a StoredSearch object in an object store. All stored searches must conform to the Stored Search schema. Use the SearchScope methods fetchObjects and fetchRows having StoredSearch in their signature to execute a stored search.

Using the SearchTemplate* classes (those classes having "SearchTemplate" as a prefix), you can make runtime modifications to the stored search or search template XML that has been persisted in a StoredSearch object. The XML modifications are passed to a SearchScope call in a SearchTemplateParameters instance.

See Searching for Objects Using a Stored Search for more information.

Stored Search Type

A stored search predefines a query to retrieve Document, Folder or Custom Object objects (or subclasses of those classes) from one or more object stores. Only one object type can be specified per search clause.

Search Template Type

A search template can provide some or all of the search criteria and values for the query that it defines. The template design gives the user the opportunity to modify the values of writeable properties before executing the search. The search template identifies how the fields are to be processed (which ones require the user to assign a value, which fields are automatically pre-assigned, which fields can be modified or are read-only, and so on).

Search templates support Document, Folder, or Custom Object substitution at runtime, enabling users to select documents, folders, or custom objects different from those specified in the search template XML. The specified objects are modified or replaced individually based on the itemid attribute of the relevant XML element.



Last updated: October 2013

© Copyright IBM Corporation 2014.