You use analytic functions to compute cumulative, moving,
centered, and reporting aggregates.
IBM® Netezza® SQL
analytic functions compute an aggregate value that is based on a group
of rows. But unlike aggregate functions, they return multiple rows
for each group. Analytic functions answer the following questions:
- What is the running total?
- What are the percentages within a group?
- What are the top n queries?
- What is the moving average?
The
Netezza SQL
functions can be divided into the following families. Some functions
are used in more than one family:
- Reporting/Window aggregate family
- Use reporting functions to compare values at different levels
of aggregation because they can do multiple passes over the data in
a single query block and then return the same aggregate value for
every row in a partition. You can use reporting functions in percent-of-total
and market share calculations. For instance, you might want to know
regional sales levels as a percent of national sales.
Window functions
answer questions such as “what is the 12-week moving average of a
stock price?” or “what was the cumulative sum of sales per each region?”
For
all SQL aggregate functions, reporting functions provide reporting
aggregate processing, while window functions provide moving and cumulative
processing. These SQL aggregate functions include: avg, sum, min, max, count, variance,
and stddev.
- Lag/Lead family
- Use these functions to compare different rows of a table by specifying
an offset from the current row. You can use these functions to analyze
change and variation. The functions include lag and lead.
- First/Last family
- Use these functions to specify sorted aggregate groups and return
the first or last value of each group. For example, you can query
bank statements for beginning and ending monthly balances. These functions
include first_value and last_value.
- Ranking family
- These functions help to answer questions such as “what are the
top 10 and bottom 10 selling items?” The functions examine the entire
output before they display an answer. These functions include ntile, dense_rank, percent_rank, cume_dist,
and rank.
- Row count family
- The row_number function assigns a number to
each row that is based on its position within the window partition.
It is similar to dense_rank and rank,
but unlike the rank functions, it counts ties (peer
rows, that is, rows that match on the order by column).
- Hypothetical set family
- These functions give the rank or percentile that a row would have
if inserted into a specified data set. These functions include dense_rank, percent_rank, cume_dist,
and rank.
- Inverse distribution functions family
- These functions give the value in a data set that corresponds
to a specified percentile. These functions include percentile_cont and percentile_disc,
however, these two functions are not supported as window aggregates.