IBM PureData System for Analytics, Version 7.1

Hypothetical set functions

The following is an example of a hypothetical set function:

SELECT grp, percent_rank(1500) WITHIN GROUP(ORDER BY sal) PR,
rank(1500) WITHIN GROUP(ORDER BY sal) rank, COUNT(*)
FROM salary_info GROUP BY grp ORDER BY grp;
GRP PR                 RANK COUNT(*)
--- ------------------ ---- --------
1   0.0000000000000000 1    2
2   1.0000000000000000 5    4
3   0.5000000000000000 3    4
(3 rows)

To see how these results are computed, consider the data in table salary_info:

GRP A  SAL
--- -- ----
1   1  3000
1   2  3000
2   3  800
2   4  950
2   5  1100
2   6  1300
3   7  1250
3   8  1250
3   9  1500
3   10 1600

For the group with grp = 3, the “sal” set ordered by column value is {1250, 1250, 1500, 1600}. The argument of percent_rank supplies a hypothetical row, which is “added” to the “sal” set. The ordered “sal” set is now {1250, 1250, 1500, 1500, 1600}.

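Two values (1250 and 1250) sort before the hypothetical 1500, so its rank in the “sal” set is 3. percent_rank(1500) is (rank of “1500” in the “sal” set - 1) / (number of rows in the set, including the hypothetical row, - 1) = (3 - 1) / (5 - 1) = 2/4 = 0.5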
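The same arithmetic can be checked for every group with a short Python sketch. This is an illustration only, not how the database evaluates the query: the function names are hypothetical, the rows are copied from the salary_info table above, and the sketch assumes ascending order with no NULL values.

```python
from collections import defaultdict

# Rows of salary_info as (grp, a, sal), copied from the table above.
SALARY_INFO = [
    (1, 1, 3000), (1, 2, 3000),
    (2, 3, 800), (2, 4, 950), (2, 5, 1100), (2, 6, 1300),
    (3, 7, 1250), (3, 8, 1250), (3, 9, 1500), (3, 10, 1600),
]


def hypothetical_rank(values, candidate):
    """Rank the candidate would receive if added to values (ascending order)."""
    # Ties share the lowest rank, so only strictly smaller values count.
    return 1 + sum(1 for v in values if v < candidate)


def hypothetical_percent_rank(values, candidate):
    """(rank - 1) / (rows including the hypothetical row - 1)."""
    if not values:
        return 0.0  # assumption: an empty set yields 0; GROUP BY never produces one
    rank = hypothetical_rank(values, candidate)
    rows_with_candidate = len(values) + 1
    return (rank - 1) / (rows_with_candidate - 1)


def summarize(rows, candidate):
    """Return (grp, percent_rank, rank, count) per group, ordered by grp."""
    by_group = defaultdict(list)
    for grp, _a, sal in rows:
        by_group[grp].append(sal)
    return [
        (grp,
         hypothetical_percent_rank(sals, candidate),
         hypothetical_rank(sals, candidate),
         len(sals))
        for grp, sals in sorted(by_group.items())
    ]


if __name__ == "__main__":
    for row in summarize(SALARY_INFO, 1500):
        print(row)
```

Note that COUNT(*) counts only the real rows in each group (4 for grp = 3), while the denominator of percent_rank uses the set that includes the hypothetical row.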

The following is an example of a hypothetical set function with multiple arguments:

SELECT region, rank(1800, 15) WITHIN GROUP(ORDER BY amt, 
profit_margin), dense_rank(1800, 15) WITHIN GROUP(ORDER BY amt, 
profit_margin) FROM sales_tbl GROUP BY region;
 REGION    | RANK | DENSE_RANK
-----------+------+------------
 Central   |    5 |          4
 Northeast |    3 |          3
 Northwest |    1 |          1
 Southwest |    1 |          1
(4 rows)
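With multiple arguments, the hypothetical row (1800, 15) is compared against each (amt, profit_margin) pair in the order given by the ORDER BY list. A minimal Python sketch of that comparison follows. The contents of sales_tbl are not shown in this topic, so the sample rows are invented purely for illustration and do not correspond to any region in the output above. The function names are hypothetical, and ascending order with no NULL values is assumed.

```python
def rank_of(rows, candidate):
    """Hypothetical rank: 1 + number of rows that sort strictly before candidate."""
    # Python compares tuples element by element, which mirrors
    # ORDER BY amt, profit_margin in ascending order.
    return 1 + sum(1 for r in rows if r < candidate)


def dense_rank_of(rows, candidate):
    """Hypothetical dense rank: 1 + number of distinct rows sorting before candidate."""
    return 1 + len({r for r in rows if r < candidate})


if __name__ == "__main__":
    # Invented (amt, profit_margin) rows; NOT the actual contents of sales_tbl.
    sample_rows = [(1200, 10), (1200, 10), (1800, 12), (1800, 20), (2500, 5)]
    candidate = (1800, 15)
    print(rank_of(sample_rows, candidate))        # 4
    print(dense_rank_of(sample_rows, candidate))  # 3
```

In this sample, three rows sort before (1800, 15), giving a rank of 4. Two of those rows are duplicates, so only two distinct values precede the hypothetical row and the dense rank is 3. This is the same reason RANK and DENSE_RANK can differ for a region in the output above.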