已选择48行。
接下来我们再来看分析函数与我们上次学的ROLLUP、CUBE、GROUPING SETS的结合使用:
SELECT
prd_type_id,SUM(amount),
RANK() OVER (ORDER BY SUM(amount) DESC NULLS LAST) AS rank
FROM all_sales
WHERE year=2003
GROUP BY ROLLUP(prd_type_id)
ORDER BY rank;
已选择42行。
SQL> SELECT
2 prd_type_id,emp_id,SUM(amount),
3 RANK() OVER (ORDER BY SUM(amount) DESC NULLS LAST) AS rank
4 FROM all_sales
5 WHERE year=2003
6 GROUP BY GROUPING SETS(prd_type_id,emp_id)
7 ORDER BY prd_type_id,emp_id;
PRD_TYPE_ID EMP_ID SUM(AMOUNT) RANK
----------- ---------- ----------- ----------
1 905081.84 1
2 186381.22 10
3 478270.91 2
4 402751.16 5
5 11
21 466997.84 3
22 432297.84 4
23 323077.84 7
24 333370.84 6
25 187970.84 9
26 228769.93 8
已选择11行。
2)CUME-DIST()和PERCENT-RANK()函数
下面这个例子说明了CUME-DIST()与PERCENT-RANK()的使用,它得到的是销量的累积分布和百分比排名:
SQL> SELECT
2 prd_type_id,SUM(amount),
3 CUME_DIST() OVER (ORDER BY SUM(amount) DESC) AS cume_dist,
4 PERCENT_RANK() OVER (ORDER BY SUM(amount) DESC) AS percent_rank
5 FROM all_sales
6 WHERE year=2003
7 GROUP BY prd_type_id
8 ORDER BY prd_type_id;
PRD_TYPE_ID SUM(AMOUNT) CUME_DIST PERCENT_RANK
----------- ----------- ---------- ------------
1 905081.84 .4 .25
2 186381.22 1 1
3 478270.91 .6 .5
4 402751.16 .8 .75
5 .2 0
3)NTILE()函数的使用
前面我们已经介绍了这个函数的作用就是把记录结果集分成N部分的意思,这个函数的参数为NTILE(buckets),这个bucket参数指定了分片的片数,下面我们看例子来说明
SQL> SELECT
2 prd_type_id,SUM(amount),
3 NTILE(2) OVER (ORDER BY SUM(amount) DESC) AS ntile
4 FROM all_sales
5 WHERE year=2003
6 AND amount IS NOT NULL
7 GROUP BY prd_type_id
8 ORDER BY prd_type_id;
PRD_TYPE_ID SUM(AMOUNT) NTILE
----------- ----------- ----------
1 905081.84 1
2 186381.22 2
3 478270.91 1
4 402751.16 2
注意这里的N为2,因此分成了下面的1,2两片
SQL> SELECT
2 prd_type_id,SUM(amount),
3 NTILE(3) OVER (ORDER BY SUM(amount) DESC) AS ntile
4 FROM all_sales
5 WHERE year=2003
6 AND amount IS NOT NULL
7 GROUP BY prd_type_id
8 ORDER BY prd_type_id;
PRD_TYPE_ID SUM(AMOUNT) NTILE
----------- ----------- ----------
1 905081.84 1
2 186381.22 3
3 478270.91 1
4 402751.16 2
注意这里的N为3,因此分成了下面的1,2,3三片,这里我的看法是当分片不均时,都是向上最加(即有两个1片)
当N=4时就与RANK相同了
PRD_TYPE_ID SUM(AMOUNT) NTILE
----------- ----------- ----------
1 905081.84 1
2 186381.22 4
3 478270.91 2
4 402751.16 3`
4)ROW-NUMBER()函数
SQL> SELECT
2 prd_type_id,SUM(amount),
3 ROW_NUMBER() OVER (ORDER BY SUM(amount) DESC) AS row_number
4 FROM all_sales
5 WHERE year=2003
6 GROUP BY prd_type_id
7 ORDER BY prd_type_id;
PRD_TYPE_ID SUM(AMOUNT) ROW_NUMBER
----------- ----------- ----------
1 905081.84 2
2 186381.22 5
3 478270.91 3
4 402751.16 4
5 1
这里ROW-NUMBER()函数就相当于RANK()函数。
总结:在上面介绍的这些评级函数中其中RANK()、DENSE-RANK()、PERCENT-RANK()函数是比较常用的(相对于其他几个而言),因此我们最好要掌握而其他几个大家只要知道了解就可以了。
反百分点函数的使用
PERCENTILE-DISC(X)函数与CUME-DIST相反,它在每一个分组中检查累积分布的数值,直到找到大于或等于X的值。
PERCENTILE-CONT(X)函数与PERCENT-RANK()相反,在每一个分组中检查百分比排名的值,直到
找到大于或等于X的值。
下面我们来看个例子获取百分点大于等于0.6的销售总量:
SQL> SELECT
2 PERCENTILE_CONT(0.6) WITHIN GROUP (ORDER BY SUM(amount) DESC) AS percentile_cont,
3 PERCENTILE_DISC(0.6) WITHIN GROUP (ORDER BY SUM(amount) DESC) AS percentile_disc
4 FROM all_sales
5 WHERE year=2003
6 GROUP BY prd_type_id;
PERCENTILE_CONT PERCENTILE_DISC
--------------- ---------------
417855.11 402751.16
窗口函数
窗口函数主要用来计算一定的记录范围内、一定的值域内、或一段时间内的累积和及移动平均值等。之所以叫“窗口”因为处理结果中使用了一个滑动的查询结果集范围。
1).计算累积和
下面这个例子是计算出2003年从1月到12月的累积销量。
SQL> SELECT
2 month 月份,SUM(amount) AS 月总销量,
3 SUM(SUM(amount)) OVER
4 (ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 月累积销量 --定义了窗口的起点和终点
5 FROM all_sales
6 WHERE year=2003
7 GROUP BY month
8 ORDER BY month;
已选择12行。
那如果是计算6月到12月的累积销量呢!!!
SQL> SELECT
2 month 月份,SUM(amount) AS 月总销量,
3 SUM(SUM(amount)) OVER
4 (ORDER BY month ROWS UNBOUNDED PRECEDING) AS 月累积销量
5 FROM all_sales
6 WHERE year=2003
7 AND month BETWEEN 6 AND 12 --6和12换为相应的月就可以了
8 GROUP BY month
9 ORDER BY month;
已选择7行。
2).计算移动平均值
计算本月与前三个月之间销量的移动平均值
SQL> SELECT
2 month 月份,SUM(amount) AS 月总销量,
3 AVG(SUM(amount)) OVER
4 (ORDER BY month ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS 三月平均累积销量
5 FROM all_sales
6 WHERE year=2003
7 GROUP BY month
8 ORDER BY month;
已选择12行。
3).计算中心平均值
计算当前月份前、后各一个月内的销量移动平均值:
SQL> SELECT
2 month 月份,SUM(amount) AS 月总销量,
3 AVG(SUM(amount)) OVER
4 (ORDER BY month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS 平均累积销量
5 FROM all_sales
6 WHERE year=2003
7 GROUP BY month
8 ORDER BY month;
月份 月总销量 平均累积销量
---------- ---------- ------------
12 137336.17 148779.075
4.FIRST-VALUE()和LAST-VALUE()函数的使用:
下面这个例子是用FIRST-VALUE()和LAST-VALUE()来获得前一个月和后一个月的销量:
SQL> SELECT
2 month 月份,SUM(amount) AS 月总销量,
3 FIRST_VALUE(SUM(amount)) OVER
4 (ORDER BY month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS 前月销量,
5 LAST_VALUE(SUM(amount)) OVER
6 (ORDER BY month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS 后月销量
7 FROM all_sales
8 WHERE year=2003
9 GROUP BY month
10 ORDER BY month;
已选择12行。
还有一个例子是计算当前月与前、后各一个月的比率
SQL> SELECT
2 month 月份,SUM(amount) AS 月总销量,
3 SUM(amount)/FIRST_VALUE(SUM(amount)) OVER
4 (ORDER BY month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS 当月与前月比率,
5 SUM(amount)/LAST_VALUE(SUM(amount)) OVER
6 (ORDER BY month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS 当月与后月比率
7 FROM all_sales
8 WHERE year=2003
9 GROUP BY month
10 ORDER BY month;