DB2 Version 9.7 for Linux, UNIX, and Windows

XML schema structuring recommendations for decomposition

You can reduce the amount of memory that annotated XML schema decomposition requires by adjusting the order of elements in your annotated XML schema.

For very large documents, following these recommendations can determine whether the document can be decomposed without increasing the amount of memory available to the DB2® database server. For sibling elements that are annotated for decomposition, place elements of simple type before sibling elements of complex type in the annotated schema. Similarly, place sibling elements that have the maxOccurs attribute set to 1 before siblings that have maxOccurs > 1.
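The maxOccurs ordering can be sketched with a small fragment. This is a minimal illustration, not taken from the product documentation: the type name orderType, the element names, the rowSet ORDERS, and the column names are all hypothetical, and the fragment assumes that a recurring element can be mapped to the same rowSet as its non-recurring siblings.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:db2-xdb="http://www.ibm.com/xmlns/prod/db2/xdb1">
  <!-- Hypothetical type: all names and mappings here are illustrative only -->
  <xs:complexType name="orderType">
    <xs:sequence>
      <!-- maxOccurs defaults to 1, so these siblings are placed first -->
      <xs:element name="orderId" type="xs:integer"
                  db2-xdb:rowSet="ORDERS" db2-xdb:column="ORDERID"/>
      <xs:element name="customer" type="xs:string"
                  db2-xdb:rowSet="ORDERS" db2-xdb:column="CUSTOMER"/>
      <!-- Recurring sibling (maxOccurs > 1) is placed last -->
      <xs:element name="note" type="xs:string" maxOccurs="unbounded"
                  db2-xdb:rowSet="ORDERS" db2-xdb:column="NOTE"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="order" type="orderType"/>
</xs:schema>
```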

The memory that annotated schema decomposition consumes is affected by the structure of the XML schema, because each item that forms a row must be held in memory until all of the items that form the row are processed. These schema structuring recommendations organize the items of a row in such a way as to minimize the number of items that must be kept in memory.

The following example shows the recommended XML schema structuring for mapped sibling elements contrasted with the less optimal structuring. Notice how <wrapper>, which is a recurring element of complex type, is placed before <status>, which is of simple type, in the less optimal example. Placing <wrapper> after the <id> and <status> elements improves decomposition runtime efficiency.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:db2-xdb="http://www.ibm.com/xmlns/prod/db2/xdb1"> 
  <!-- Recommended structuring with simple types placed before
       the recurring element <wrapper>, which is of complex type --> 
  <xs:complexType name="typeA"> 
    <xs:sequence> 
      <xs:element name="id" type="xs:integer"
                  db2-xdb:rowSet="relA" db2-xdb:column="ID" /> 
      <xs:element name="status" type="xs:string"
                  db2-xdb:rowSet="relA" db2-xdb:column="status" /> 
      <xs:element name="wrapper" type="typeX" maxOccurs="unbounded"/> 
    </xs:sequence> 
  </xs:complexType> 

  <!-- Less optimal structuring with recurring complex type element
       appearing before the simple type element -->
  <!--
  <xs:complexType name="typeA"> 
    <xs:sequence> 
      <xs:element name="id" type="xs:integer"
                  db2-xdb:rowSet="relA" db2-xdb:column="ID" /> 
      <xs:element name="wrapper" type="typeX" maxOccurs="unbounded"/> 
      <xs:element name="status" type="xs:string"
                  db2-xdb:rowSet="relA" db2-xdb:column="status" /> 
    </xs:sequence> 
  </xs:complexType> --> 

  <xs:complexType name="typeX"> 
    <xs:sequence> 
      <xs:element name="elem1" type="xs:string"
                  db2-xdb:rowSet="relA" db2-xdb:column="elem1" /> 
      <xs:element name="elem2" type="xs:long"
                  db2-xdb:rowSet="relA" db2-xdb:column="elem2" /> 
    </xs:sequence> 
  </xs:complexType> 

  <xs:element name="A" type="typeA" /> 

</xs:schema>

Note that <id>, <status>, <elem1>, and <elem2> are mapped to the same rowSet, that is, together they form a row. Memory associated with a row is released when a row is complete. In the less optimal case presented above, none of the rows associated with the rowSet relA can be considered complete until the <status> element is reached in the document. The <wrapper> element must be processed first, however, as it occurs before the <status> element. This means that all instances of <wrapper> must be buffered in memory until the <status> element is reached (or the end of <A> is reached, if <status> is absent from the document).
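To make the buffering concrete, here is a small instance document that would match the less optimal version of typeA. The element values are made up for illustration, and the comments only restate the behavior described above.

```xml
<A>
  <id>1</id>
  <!-- Every wrapper instance is buffered, because status has not been seen yet -->
  <wrapper>
    <elem1>first</elem1>
    <elem2>100</elem2>
  </wrapper>
  <wrapper>
    <elem1>second</elem1>
    <elem2>200</elem2>
  </wrapper>
  <!-- Only at this point can the rows of relA be completed and their memory released -->
  <status>active</status>
</A>
```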

The impact of this structure becomes significant if an element has a large number of instances. For example, if there were 10 000 instances of the <wrapper> element, all 10 000 instances would have to be held in memory until the rowSet was complete. In the optimal case presented above, however, memory associated with the rows of rowSet relA can be released when <elem2> is reached.
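For comparison, an instance document that matches the recommended version of typeA might look like the following sketch. The values are hypothetical. The comments mark the points at which, according to the description above, a row of relA is complete.

```xml
<A>
  <id>1</id>
  <status>active</status>
  <wrapper>
    <elem1>first</elem1>
    <elem2>100</elem2>
    <!-- Row complete here: id, status, elem1, and elem2 are all known -->
  </wrapper>
  <wrapper>
    <elem1>second</elem1>
    <elem2>200</elem2>
    <!-- Next row complete here; earlier wrapper instances need not stay buffered -->
  </wrapper>
</A>
```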