An Analysis of a Rolling Aggregation Experiment in the Flink SQL Client
阿新 • Published: 2021-02-04
Tags: Flink
Basic concepts
stddev
stddev is short for Standard Deviation.
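As a quick illustration of what a standard deviation measures, here is a minimal Python sketch using the standard library's `statistics` module. The three temperature values are made up for the example. Note that there are two common variants: the sample standard deviation (divides by n - 1) and the population standard deviation (divides by n). Flink SQL exposes these as STDDEV_SAMP and STDDEV_POP; which one plain STDDEV corresponds to is worth confirming in the Flink documentation.

```python
import statistics

# Hypothetical temperature readings, for illustration only.
temps = [20.0, 22.0, 24.0]

# Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))
sample_sd = statistics.stdev(temps)

# Population standard deviation: sqrt(sum((x - mean)^2) / n)
population_sd = statistics.pstdev(temps)

print("sample:", sample_sd)
print("population:", population_sd)
```

The larger the spread of the readings around their mean, the larger both values become.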
Next, let's analyze an example of a rolling aggregation executed in Flink SQL:
SELECT
  measurement_time,
  city,
  temperature,
  AVG(CAST(temperature AS FLOAT)) OVER last_minute AS avg_temperature_minute,
  MAX(temperature) OVER last_minute AS max_temperature_minute,
  MIN(temperature) OVER last_minute AS min_temperature_minute,
  STDDEV(CAST(temperature AS FLOAT)) OVER last_minute AS stdev_temperature_minute
FROM temperature_measurements
WINDOW last_minute AS (
  PARTITION BY city
  ORDER BY measurement_time
  RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW
);
The breakdown is as follows:
Field | Explanation |
measurement_time, | Selects the measurement time |
city, | Selects the city |
temperature, | Selects the temperature |
AVG(CAST(temperature AS FLOAT)) OVER last_minute AS avg_temperature_minute, | Average temperature (last minute) |
MAX(temperature) OVER last_minute AS max_temperature_minute, | Maximum temperature (last minute) |
MIN(temperature) OVER last_minute AS min_temperature_minute, | Minimum temperature (last minute) |
STDDEV(CAST(temperature AS FLOAT)) OVER last_minute AS stdev_temperature_minute | Standard deviation of the temperature (last minute) |
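Summary:
① Based on the breakdown above, what does Rolling Aggregation mean?
For every row, the query computes up-to-date statistics (AVG/MAX/MIN/STDDEV) of a variable (temperature) over the preceding minute, separately for each city.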
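② Where does the SQL express the "rolling" part?
In the last_minute window.
Its RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW clause moves the one-minute range forward with every row, so the statistics (the aggregation) are continuously recomputed over the most recent minute. That is what makes it rolling.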
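To make the window semantics concrete, the following Python sketch mimics what the last_minute window computes for each row. It is an illustration only, not how Flink evaluates the query internally. The function name `rolling_stats`, the sample cities, and the readings are all hypothetical. It also assumes that STDDEV behaves like a sample standard deviation and yields no value (None here) when the window holds a single row.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def rolling_stats(rows, window=timedelta(minutes=1)):
    """Simulate PARTITION BY city ORDER BY measurement_time
    RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW.

    rows: list of (measurement_time, city, temperature) tuples.
    """
    results = []
    for ts, city, temp in rows:
        # Same partition (city), timestamp within [ts - window, ts].
        in_window = [
            v for (t, c, v) in rows
            if c == city and ts - window <= t <= ts
        ]
        results.append({
            "measurement_time": ts,
            "city": city,
            "temperature": temp,
            "avg": mean(in_window),
            "max": max(in_window),
            "min": min(in_window),
            # Assumption: sample standard deviation, undefined for one row.
            "stddev": stdev(in_window) if len(in_window) > 1 else None,
        })
    return results


# Hypothetical readings, for illustration only.
t0 = datetime(2021, 2, 4, 12, 0, 0)
rows = [
    (t0, "Chicago", 20.0),
    (t0 + timedelta(seconds=40), "Chicago", 24.0),
    (t0 + timedelta(seconds=45), "Berlin", 5.0),
    (t0 + timedelta(seconds=90), "Chicago", 28.0),
]

results = rolling_stats(rows)
for r in results:
    print(r["measurement_time"].time(), r["city"], r["avg"], r["max"], r["min"], r["stddev"])
```

For the last Chicago row, the first reading (90 seconds earlier) has already fallen out of the one-minute range, so only the 24.0 and 28.0 readings contribute. The Berlin row is unaffected by the Chicago readings because the window is partitioned by city.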
Reference:
[1] The difference between Rows Over Window and Range Over Window