DB2 9.1 XML/XQuery support


I investigated the XML and XQuery support in IBM DB2 9.1 (as of 2009-10-31, I think the latest DB2 release is 9.7). The XML tooling in DB2 9.1 is called pureXML.

DB2 9.1 provides support for 'XML' data type in SQL (which is now part of the ISO/IEC SQL specification). We can now define table columns, of type 'XML' in DB2. This makes it possible to store an entire XML document in a particular field of a relational DB2 table. Within this 'XML' field, an XML document is stored in it's native hierarchical form (the XML is stored in a native XML store, whereas the pointer to the XML document is kept in the field). This approach of XML storage into relational tables, is different to traditional shredding (can also be thought as decomposition) of XML documents into relational tables. Shredding of XML documents into relational tables can affect the application performance, if the XML Schema is very vast and maps to large number of relational tables and columns. Joining a large number of relational tables, to get a business application view can be a major cause of poor application performance.

The SQL/relational model and XML are now closely tied in DB2. We can write XML access code in SQL, which allows many interesting operations to be performed. XML within the database can be accessed using XQuery (either directly, or from within SQL).

Following are the three ways in which XML data can be accessed within DB2:
1) Through XQuery only
2) XQuery that can invoke SQL
3) Through SQL only

I developed a small sample application in DB2 9.1, illustrating the features of XML and XQuery in DB2 database. Following are the steps (along with screen displays) to design this application:

Step 1: Create a database with XML support
Use DB2 control center to do this.

Following is a screen shot to create a database in DB2 9.1:



The option, "Enable database for XML" must be selected.

Step 2: Register a XML Schema with XML Schema Repository (XSR)
Since we'll be storing XML in the database (in a table named 'CUSTOMER'), we want to validate the XML document with a XML Schema, each time we will insert or update the XML data (which is a good architectural decision for software applications). Using the XML Schema validation approach, only a valid XML document could be stored in the database. XML validation must generally be done, to maintain integrity of business domain objects, stored as XML data.

To perform this step, we must register the XML Schema in DB2 (so it get's an identity in the database).

Please copy the XML Schema somewhere on the file system (I kept it on the Windows desktop, for this sample application).

The example Schema I used is (named customer.xsd):

<?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="customer">
            <xs:complexType>
                <xs:sequence>
                    <xs:element name="name" type="xs:string" />
                    <xs:element name="address">
                        <xs:complexType>
                            <xs:sequence>
                                <xs:element name="street" type="xs:string" />
                                <xs:element name="city" type="xs:string" />
                                <xs:element name="state" type="xs:string" />
                                <xs:element name="pin" type="xs:string" />
                                <xs:element name="country" type="xs:string" />
                            </xs:sequence>
                        </xs:complexType>
                    </xs:element>
                </xs:sequence>
                <xs:attribute name="id" type="xs:string" />
            </xs:complexType>
    </xs:element>
</xs:schema>


The XML Schema Repository looks like following (this is an empty XSR):
 


Perform following steps from DB2 Command Line processor.

db2 => connect to sampledb

Database Connection Information

Database server = DB2/NT 9.1.0
SQL authorization ID = GANDHIMU
Local database alias = SAMPLEDB

db2 => register xmlschema 'http://mukul/customer' from 'file://C:/Documents and Settings/gandhimu/Desktop/customer.xsd' as gandhimu.customerschema
DB20000I The REGISTER XMLSCHEMA command completed successfully.

db2 => complete xmlschema gandhimu.customerschema
DB20000I The COMPLETE XMLSCHEMA command completed successfully.

Please note: The two commands, register xmlschema and complete xmlschema must be performed in this sequence, to register a XML Schema in XSR.

Now a Schema is registered into the XSR. The XSR view now looks like below:
 


Step 3: Create a table having column of type XML
The table structure is shown below:



Please note: The column 'INFO' is of type XML.

Step 4: Store records in the table
Keep XML documents to be stored in a relational table, somewhere on file system (I kept it on the Windows desktop).

Following is a sample XML file (named customer.xml):

<?xml version="1.0" encoding="UTF-8"?>
<customer id="1000">
    <name>XYZ Ltd.</name>
    <address>
        <street>DLF City, Phase 2</street>
        <city>Gurgaon</city>
        <state>Haryana</state>
        <pin>122001</pin>
        <country>India</country>
    </address>
</customer>

To insert records in the table, we'll use a Java program. Following is a sample Java program, which will insert records in the table:

import java.sql.PreparedStatement;
import java.sql.DriverManager;
import java.sql.Connection;
import java.io.FileInputStream;
import java.io.File;

public class testXMLDb {

    public static void main(String[] args) {
        try {
            PreparedStatement insertStmt = null;
            String sqls = null;
            int cid = 1000;

            sqls = "INSERT INTO CUSTOMER (id, info) VALUES (?, XMLVALIDATE(? ACCORDING TO XMLSCHEMA ID gandhimu.customerschema))";

            String url = "jdbc:db2://localhost:50000/SAMPLEDB";

            Class.forName("com.ibm.db2.jcc.DB2Driver").newInstance();
            Connection conn = DriverManager.getConnection(url, "uid", "pwd");

            insertStmt = conn.prepareStatement(sqls);
            insertStmt.setInt(1, cid);
            File file = new File("C:\\Documents and Settings\\gandhimu\\Desktop\\customer.xml");
            insertStmt.setBinaryStream(2, new FileInputStream(file), (int)file.length());
            insertStmt.executeUpdate();
        }
        catch(Exception ex) {
            ex.printStackTrace();
        }
    }

}


Please note:
The insert statement uses a clause, XMLVALIDATE, which will cause validation of the XML document using the XML Schema stored in DB2's 'XML Schema Repository'.

Insert two records in the table (for this sample). The populated table looks like following in the DB2 control center:
 


Step 5: Retrieve data from the table using XQuery
Following are few SQL query examples, for this sample application:

1) SELECT id
FROM customer c
WHERE XMLCAST(XMLQUERY('$cust/customer/name/text()' PASSING c.info AS "cust") AS VARCHAR(100)) = 'XYZ Ltd.'

2) SELECT XMLCAST(XMLQUERY('$cust/customer/name/text()' PASSING c.info AS "cust") AS VARCHAR(100))
FROM customer c
where id = 1000

The first query retrieves an usual traditional SQL column. The 'where' clause compares data stored in XML document with a usual BIGINT relational field.

The second query retrieves the customer name stored in the XML document (in the table column, named INFO).

This illustrates that XQuery and SQL are closely tied in DB2.

Notes: To update XML data in DB2 9.1, we have to update the entire XML document (commonly in a client application, like Java). The upcoming DB2 version 9.5 supports the emerging W3C standard, "XQuery Update facility", which will allow updations of XML document in the database itself, at a fine grained level like an XML element or attribute.

2008-11-03: I tried some of the XQuery update features in DB2 9.5, which is now available for the community. I found the XQuery update facility in DB2 9.5 to be working well.

If readers wish to know how XQuery update facility works in DB2, this article is an useful reference:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0710nicola/.

Conclusions:
This hybrid SQL/XML database concept is a useful facility, and a paradigm shift in relational and XML storage which will allow many useful applications to be built. I feel that software applications should exploit the combined power of SQL and XML into their applications.


References:
1. http://www-01.ibm.com/software/data/db2/
2. http://www-01.ibm.com/software/data/db2/xml/


Home


Last Updated: Oct 31, 2009