DB2 Version 9.7 for Linux, UNIX, and Windows

db2-xdb:normalization decomposition annotation

The db2-xdb:normalization annotation specifies the normalization of whitespace in the XML data to be inserted or to be substituted for $DECOMP_CONTENT (when used with db2-xdb:expression).

db2-xdb:normalization belongs to the set of decomposition annotations that can be added to an XML schema document to describe the mappings between elements and attributes of XML documents to DB2® base tables. The decomposition process uses the annotated XML schema to determine how elements and attributes of an XML document should be decomposed into DB2 tables.

Annotation type

Attribute of <xs:element> or <xs:attribute>, or attribute of <db2-xdb:rowSetMapping>

How to specify

db2-xdb:normalization is specified in any of the following ways (where value represents a valid value for the annotation):
  • <xs:element db2-xdb:normalization="value" />
  • <xs:attribute db2-xdb:normalization="value" />
  • <db2-xdb:rowSetMapping db2-xdb:normalization="value">
      <db2-xdb:rowSet>value</db2-xdb:rowSet>
      …
    </db2-xdb:rowSetMapping>

Namespace

http://www.ibm.com/xmlns/prod/db2/xdb1

Valid values

One of the following case-sensitive tokens:
  • canonical
  • original (default)
  • whitespaceStrip
Note: The db2-xdb:normalization attribute is valid only for mappings between certain XML schema types and SQL character types. Refer to the Details section for the list of supported XML schema types that can be normalized for SQL character columns.

Details

When inserting XML values into character type target columns (CHAR, VARCHAR, LONG VARCHAR, CLOB, DBCLOB, GRAPHIC, VARGRAPHIC, LONG VARGRAPHIC), it might be necessary to normalize the data being inserted. Different types of normalization can be specified with the db2-xdb:normalization annotation; the valid values, which are case-sensitive settings are:
canonical
The XML value is converted to its canonical form, according to its XML schema type, before being inserted into the target column or substituted for occurrences of $DECOMP_CONTENT in the same mapping as this db2-xdb:normalization annotation.
original
The original character data, following any processing by an XML parser, of the element content or attribute value (depending on whether this mapping is for an XML element or XML attribute) is inserted into the target column or substituted for occurrences of $DECOMP_CONTENT in the same mapping as this db2-xdb:normalization annotation. If the db2-xdb:normalization attribute is not specified for a mapping where this annotation is relevant, then the decomposition process normalizes data according to the "original" setting.
whitespaceStrip
The XML value has all leading and trailing whitespace removed, and consecutive whitespace is collapsed into a single whitespace character, before being inserted into the target column or substituted for occurrences of $DECOMP_CONTENT in the same mapping as this db2-xdb:normalization annotation.
db2-xdb:normalization is applicable when an element or attribute of (or derived from) one of these atomic XML schema types is mapped to a column of character type (CHAR, VARCHAR, LONG VARCHAR, CLOB, DBCLOB, GRAPHIC, VARGRAPHIC, and LONG VARGRAPHIC).
  • byte, unsigned byte
  • integer, positiveInteger, negativeInteger, nonPositiveInteger, nonNegativeInteger
  • int, unsignedInt
  • long, unsignedLong
  • short, unsignedShort
  • decimal
  • float
  • double
  • boolean
  • time
  • date
  • dateTime
db2-xdb:normalization will be ignored if specified on any other types. Note that these are XML schema types for which the W3C recommendation, XML Schema Part 2: Datatypes Second Edition, has a canonical representation.

Because the db2-xdb:normalization annotation is valid only for certain XML schema to SQL character type mappings, when the annotation is specified for unsupported mappings, it is ignored.

Example

The following example shows how whitespace normalization can be controlled with the db2-xdb:normalization annotation. The annotated schema is presented first.

      <xs:element name="author">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="firstname" type="xs:string"
                        db2-xdb:rowSet="AUTHORS" db2-xdb:column="FIRSTNAME" />
            <xs:element name="lastname" type="xs:string"
                        db2-xdb:rowSet="AUTHORS" db2-xdb:column="SURNAME"
                        db2-xdb:normalization="whitespaceStrip" />
            <xs:element name="activeStatus" type="xs:boolean"
                        db2-xdb:rowSet="AUTHORS" db2-xdb:column="ACTIVE"
                        db2-xdb:normalization="canonical" />
            <xs:attribute name="ID" type="xs:integer"
                        db2-xdb:rowSet="AUTHORS" db2-xdb:column="AUTHID"
                        db2-xdb:normalization="whitespaceStrip" />
	  </xs:sequence>
        </xs:complexType>
      </xs:element>

The <author> element that is being mapped is presented next (notable whitespaces are represented below by the '_' underscore character for the purpose of demonstration), followed by the resulting AUTHORS table after the decomposition has completed.

 <author ID="__22">
    <firstname>Ann</firstname>
    <lastname>__Brown_</lastname>
    <activeStatus>1</activeStatus>
  </author>
Table 1. AUTHORS
AUTHID FIRSTNAME SURNAME ACTIVE NUMBOOKS
22 Ann __Brown_ true NULL

The "whitespaceStrip" setting causes the leading whitespace from the "ID" attribute to be removed before the value is inserted into the target table. Notice, however, that the leading and trailing whitespace from the <lastname> element is not stripped, even though the "whitespaceStrip" setting was specified. This is because the <lastname> element has an XML schema type of string, which is not an applicable type for db2-xdb:normalization. The <activeStatus> child element of <author> is defined as a boolean type, and the canonical representation of boolean types is either the literal "true" or "false". The "canonical" setting for the <activeStatus> element results in the canonical form of "1", which is "true", being inserted into the ACTIVE column of the AUTHORS table.

If in the XML schema presented above, the "ID" attribute had been annotated with db2-xdb:normalization="original" instead, then the original value from the document, "__22" (where the underscore character represents whitespace), would have been inserted into the AUTHID column.