Multi-core parallelism, known in DB2 as intra-partition parallelism,
is enabled through the intra_parallel configuration parameter. It allows
query execution to be parallelized by dividing the work among subagents.
As a result, CPU resources on a multi-core machine
can be better utilized and query elapsed time can be reduced.
It is especially beneficial to long-running queries.
Also, it does not
require any form of data partitioning.
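As a rough illustration of dividing work among subagents, here is a minimal Python sketch. It is not DB2's implementation: the names split_rows and parallel_sum are hypothetical, threads stand in for subagents, and the "query" is just a sum over a list of rows. Each subagent computes a partial result on its chunk, and a coordinator combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, degree):
    """Split rows into `degree` contiguous chunks of near-equal size."""
    chunk, extra = divmod(len(rows), degree)
    chunks, start = [], 0
    for i in range(degree):
        # The first `extra` chunks take one additional row each.
        end = start + chunk + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def parallel_sum(rows, degree):
    """Each stand-in subagent sums its chunk; the coordinator adds the partial sums."""
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partials = list(pool.map(sum, split_rows(rows, degree)))
    return sum(partials)


rows = list(range(1, 1001))
total = parallel_sum(rows, degree=4)  # degree plays the role of the query degree
```

Note that no partitioning of the stored data is assumed here; the rows are split only at execution time, which mirrors the point that this feature needs no data partitioning.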
DB2 has supported intra-partition parallelism for many
releases. In the new InfoSphere
Warehouse 10 and DB2 10, we’ve given it a major update. In particular, we’ve significantly improved
its scalability as the degree of parallelism (a.k.a. query degree) increases. This
is achieved by improved load balancing, more efficient parallelization
techniques, and reduced latch contention.
In Version 10, we’ve introduced a new operator called
REBAL to improve load balancing in intra-partition parallelism. REBAL redistributes rows so that all
subagents do an equal share of the work. Our optimizer
looks for sources of imbalance in a query’s access plan and injects REBAL
operators where appropriate. We've also implemented more efficient parallelization techniques, e.g., completing a group-by
on unique keys without any partitioned sort, improved costing of access plan parallelization transformations,
and exploitation of stream partitioning. Finally, we’ve put a lot of
effort into reducing latch contention, e.g., in hash joins, partitioned sort, and
the prefetcher queue. Together, these changes
improve the scalability of intra-partition parallelism compared to
previous releases.
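To make the idea of rebalancing concrete, here is a small Python sketch of redistributing rows across subagents. It is only an illustration under stated assumptions: the function name rebalance is hypothetical, and round-robin redistribution is our choice for the example, not a description of how REBAL is implemented inside DB2.

```python
def rebalance(streams):
    """Redistribute rows round-robin so every subagent gets a near-equal share.

    `streams` holds one list of rows per subagent (possibly very uneven).
    """
    degree = len(streams)
    balanced = [[] for _ in range(degree)]
    i = 0
    for stream in streams:
        for row in stream:
            balanced[i % degree].append(row)
            i += 1
    return balanced


# A skewed input: one subagent holds 90 of the 100 rows, another holds none.
skewed = [list(range(90)), list(range(90, 96)), list(range(96, 100)), []]
balanced = rebalance(skewed)
sizes = [len(s) for s in balanced]  # [25, 25, 25, 25]
```

Without the redistribution step, the query would finish only when the subagent holding 90 rows is done while the others sit idle; after it, every subagent has the same amount of work.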
In Version 10 we've also added
new support for mixed workloads.
Specifically, individual applications or workloads can now dynamically
throttle the degree of parallelism to optimize performance for the types of
queries being executed.
Typically, a transactional workload consists of short
insert, update, and delete transactions, and does not benefit from parallelization. In some cases, the overhead of intra-partition parallelism even hurts throughput. In contrast, a data warehouse workload
benefits greatly from parallelization, as its queries are often processor-intensive
and long-running. In previous
releases it was only possible to turn intra-partition parallelism on or off and to
control the degree of parallelism for the whole instance. Thus, in a mixed environment with
transactional and data warehouse workloads, it was difficult to use this
feature for one workload without hurting the other. In the new release, with WLM
workload control, we can now configure different workloads to use the optimal
parallelism settings (ON, OFF, and query degree). Hence, optimal performance can now be
achieved for mixed workloads.
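The difference between instance-wide and per-workload control can be pictured with a simplified model in Python. This is a sketch, not DB2's actual precedence rules: WorkloadSetting and effective_degree are hypothetical names, and the model simply assumes that a workload-level setting, when present, takes effect in place of the instance-wide one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkloadSetting:
    """Hypothetical per-workload parallelism settings."""
    parallel: Optional[bool] = None  # None means "inherit the instance setting"
    degree: Optional[int] = None     # None means "use the instance degree"


def effective_degree(instance_parallel, instance_degree, setting):
    """Return the degree of parallelism a query in this workload would run with."""
    parallel = instance_parallel if setting.parallel is None else setting.parallel
    if not parallel:
        return 1  # parallelism OFF: the query runs serially
    return instance_degree if setting.degree is None else setting.degree


# Assumed example: instance-wide parallelism is OFF, with a default degree of 4.
oltp = WorkloadSetting(parallel=False)            # transactional workload: OFF
warehouse = WorkloadSetting(parallel=True, degree=8)  # warehouse workload: ON, degree 8

oltp_degree = effective_degree(False, 4, oltp)
warehouse_degree = effective_degree(False, 4, warehouse)
```

In this model the transactional workload stays serial while the warehouse workload runs with a degree of 8 on the same instance, which is the combination that was not possible when only one instance-wide setting existed.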
Multi-core parallelism will take our warehouse performance to a whole new level in InfoSphere Warehouse 10 and DB2 10.
Michael Kwok, Ph.D.
Manager, DB2 LUW / ISW Warehouse Performance
IBM Toronto Lab