Dr. Jon A. Lind, InfoSphere Warehouse Product Manager, DB2 Warehousing Product Management,
Shamit Bagchi, DB2 & PureData Data Warehouse Product Marketing
An example of the DB2 Tech Preview in action.
Let's take an example: a system with 32 cores and a 10TB table in which each row has 100 columns and which holds 10 years of data. We then run the query "SELECT COUNT(*) FROM MYTABLE WHERE YEAR = '2010'" and want sub-second results. Today that would be a challenge for nearly all database management solutions. Before the technology preview, a sub-second response to this query would not have been possible without an index, MQT, MDC, or similar structure. How can we possibly achieve this with the technology preview?
First, we compress the data in the table by 10x, so the table occupies only 1TB on disk. The query accesses only 1 column, or 1/100 of the columns in the table (1%, which is 10GB of the 1TB). Using data skipping, we can skip over 9 of the 10 years and look at only 1 year (now 1GB of data). Dividing the scan across 32 cores, each core then processes only about 32MB of data. The scan also runs faster on encoded data (say 4x faster than a traditional scan), so scanning 32MB takes about as long as scanning 8MB on a traditional system. In the end, each core handles the equivalent of only 8MB of data, which makes a sub-second response easy to achieve.
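The arithmetic above can be reproduced with a short back-of-the-envelope script. This is only a sketch: the function name `estimate_per_core_scan` and its parameters are made up for illustration, and it assumes the data is spread evenly across columns and years and that 1GB is 1000MB, so the results come out slightly below the rounded figures in the text.

```python
# Back-of-the-envelope estimate of how much data each core has to scan.
# All input figures come from the example in the text; the function and
# parameter names are hypothetical, and data is assumed to be spread
# evenly across columns and years.

MB = 10 ** 6
TB = 10 ** 12


def estimate_per_core_scan(table_bytes, compression_ratio, total_columns,
                           columns_read, total_years, years_read,
                           cores, encoded_scan_speedup):
    """Return the intermediate data volumes (in bytes) for each step."""
    compressed = table_bytes / compression_ratio                 # after compression
    column_slice = compressed * columns_read / total_columns     # only the columns touched
    after_skipping = column_slice * years_read / total_years     # data skipping by year
    per_core = after_skipping / cores                            # parallel scan
    traditional_equivalent = per_core / encoded_scan_speedup     # scan on encoded data
    return {
        "compressed": compressed,
        "column_slice": column_slice,
        "after_skipping": after_skipping,
        "per_core": per_core,
        "traditional_equivalent": traditional_equivalent,
    }


result = estimate_per_core_scan(
    table_bytes=10 * TB,
    compression_ratio=10,
    total_columns=100,
    columns_read=1,
    total_years=10,
    years_read=1,
    cores=32,
    encoded_scan_speedup=4,
)

for step, size in result.items():
    print(f"{step}: {size / MB:.2f} MB")
```

With decimal units the per-core figure lands at roughly 31MB and the traditional-equivalent figure at just under 8MB, in line with the rounded 32MB and 8MB used above.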