elasticsearch(18): some aggregation operations in ES
GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "group_by_color": {        // bucket definition: group documents by color
      "terms": {
        "field": "color"
      },
      "aggs": {
        "avg_price": {         // average price within each group
          "avg": {
            "field": "price"
          }
        },
        "max_price": {         // maximum price within each group
          "max": {
            "field": "price"
          }
        },
        "min_price": {         // minimum price within each group
          "min": {
            "field": "price"
          }
        },
        "sum_price": {         // total price within each group
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}
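To make the result of the request above concrete, here is a minimal Python sketch of what a terms bucket with avg/max/min/sum sub-aggregations computes. The function name stats_by_color and the sample documents are hypothetical and only for illustration; they are not taken from the tvs index.

```python
from collections import defaultdict


def stats_by_color(docs):
    """Group documents by color, then compute per-group price metrics."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["color"]].append(doc["price"])
    return {
        color: {
            "avg_price": sum(prices) / len(prices),
            "max_price": max(prices),
            "min_price": min(prices),
            "sum_price": sum(prices),
        }
        for color, prices in groups.items()
    }


# Hypothetical sample data (an assumption, not real index contents).
sample_docs = [
    {"color": "red", "price": 1000},
    {"color": "red", "price": 3000},
    {"color": "blue", "price": 2500},
]
stats = stats_by_color(sample_docs)
```

Each key of stats plays the role of one bucket in the response, and the inner dictionary mirrors the four metric aggregations.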
2 histogram
Similar to terms: it takes a field and groups documents into buckets of a fixed interval on that field.
GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "group_by_color": {
      "histogram": {
        "field": "price",
        "interval": 2000       // TVs are bucketed by price: 0-2000, 2000-4000, ...
      },
      "aggs": {
        "sum_price": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}
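The bucketing rule can be sketched in Python as follows. This is an illustration under assumptions: histogram_sum and the sample prices are hypothetical, and the sketch keeps only non-empty buckets, whereas Elasticsearch may also return empty ones depending on min_doc_count.

```python
import math
from collections import defaultdict


def histogram_sum(prices, interval):
    """Assign each price to a fixed-width bucket and sum the prices per bucket."""
    buckets = defaultdict(float)
    for price in prices:
        # The bucket key is the lower bound of the interval containing the price.
        key = math.floor(price / interval) * interval
        buckets[key] += price
    return dict(sorted(buckets.items()))


# Hypothetical prices (an assumption, not real index contents).
sample_prices = [1000, 1200, 2000, 2500, 3000, 8000]
result = histogram_sum(sample_prices, 2000)
```

A price of exactly 2000 lands in the bucket keyed 2000, not in the bucket keyed 0, because each bucket includes its lower bound and excludes its upper bound.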
3.date histogram
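Groups documents into buckets by a date interval.
GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "group_by": {
      "date_histogram": {
        "field": "sold_date",      // bucket by this field
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 0,        // return a bucket (with count 0) even when no documents fall in it
        "extended_bounds": {
          "min": "2016-01-01",     // start of the range
          "max": "2017-01-31"      // end of the range
        }
      }
    }
  }
}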
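The combined effect of min_doc_count 0 and extended_bounds can be sketched in Python: every month in the range gets a bucket, even when its count is zero. The function month_buckets and the sample dates are hypothetical. As a simplification, the sketch ignores dates outside the bounds, which is not how extended_bounds behaves in Elasticsearch (it extends the range but does not filter documents).

```python
from collections import Counter
from datetime import date


def month_buckets(sold_dates, bounds_min, bounds_max):
    """Count documents per calendar month, emitting empty months as zero."""
    counts = Counter((d.year, d.month) for d in sold_dates)
    year, month = bounds_min.year, bounds_min.month
    buckets = []
    while (year, month) <= (bounds_max.year, bounds_max.month):
        # The key is the first day of the month, formatted like "yyyy-MM-dd".
        key = date(year, month, 1).strftime("%Y-%m-%d")
        buckets.append((key, counts.get((year, month), 0)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return buckets


# Hypothetical sale dates (an assumption, not real index contents).
sample_dates = [date(2016, 5, 18), date(2016, 5, 20), date(2016, 11, 2)]
buckets = month_buckets(sample_dates, date(2016, 1, 1), date(2017, 1, 31))
```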
Drill-down
GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "group_by": {                  // 1: top-level aggregation
      "date_histogram": {          // 2: bucket by month
        "field": "sold_date",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2016-01-01",
          "max": "2017-01-31"
        }
      },
      "aggs": {                    // 3: sub-aggregations within each month
        "group_by_brand": {
          "terms": {
            "field": "brand"
          },
          "aggs": {                // 4: sub-aggregation within each brand
            "sum_price": {
              "sum": {
                "field": "price"
              }
            }
          }
        },
        "sum_total": {             // directly under 3: monthly total across all brands
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}
4 global
GET /tvs/sales/_search
{
  "size": 0,
  "query": {
    "match": {
      "brand": "長虹"
    }
  },
  "aggs": {
    "single_brand_avg_price": {    // average price of 長虹 (Changhong) TVs only, i.e. the query scope
      "avg": {
        "field": "price"
      }
    },
    "all": {                       // aggregation name (arbitrary)
      "global": {},                // ignore the query scope: average price over all TVs
      "aggs": {
        "all_brand_avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
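The difference between a query-scoped aggregation and a global one can be sketched in Python. The function scoped_and_global_avg, the brand labels "A" and "B", and the prices are hypothetical placeholders.

```python
from statistics import mean


def scoped_and_global_avg(docs, brand):
    """Return (average price of matching docs, average price of all docs)."""
    # Query scope: only documents matching the brand.
    matched = [d["price"] for d in docs if d["brand"] == brand]
    # Global scope: every document, regardless of the query.
    everything = [d["price"] for d in docs]
    return mean(matched), mean(everything)


# Hypothetical sample data (an assumption, not real index contents).
sample_docs = [
    {"brand": "A", "price": 1000},
    {"brand": "A", "price": 3000},
    {"brand": "B", "price": 5000},
]
single_brand_avg, all_brand_avg = scoped_and_global_avg(sample_docs, "A")
```

Returning both numbers from one call mirrors how a single request yields the per-brand average and the overall average side by side for comparison.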
5 order
GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "group_by_color": {
      "terms": {
        "field": "brand",
        "order": {
          "avg_price": "desc"      // sort the buckets by average price, descending
        }
      },
      "aggs": {                    // within each bucket, run one more aggregation
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
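Ordering buckets by a sub-aggregation amounts to grouping, computing the metric per group, and then sorting by that metric. A minimal Python sketch, with the hypothetical function brands_by_avg_price and made-up sample data:

```python
from collections import defaultdict


def brands_by_avg_price(docs):
    """Group by brand, compute the average price, sort by it in descending order."""
    prices = defaultdict(list)
    for doc in docs:
        prices[doc["brand"]].append(doc["price"])
    averages = {brand: sum(p) / len(p) for brand, p in prices.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data (an assumption, not real index contents).
sample_docs = [
    {"brand": "A", "price": 1000},
    {"brand": "A", "price": 3000},
    {"brand": "B", "price": 5000},
    {"brand": "C", "price": 1500},
]
ranking = brands_by_avg_price(sample_docs)
```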
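6.cardinality
Essentially a count(distinct) operation. The result is approximate, with an error rate of about 5%, but it is fast, typically returning within 100 ms. To control the accuracy further, use the precision_threshold parameter, for example:
GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "distinct_brand": {
      "cardinality": {
        "field": "brand",
        "precision_threshold": 100   // with fewer than 100 distinct brands, the count is expected to be nearly exact
      }
    }
  }
}
The precision_threshold parameter costs memory: about precision_threshold * 8 bytes, so the example above uses roughly 100 * 8 = 800 bytes.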
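As a reference point, the exact operation that cardinality approximates, plus the memory arithmetic above, can be written in Python. Both function names are hypothetical, and the set-based count is not the approximation algorithm Elasticsearch uses internally; it only shows what is being estimated.

```python
def exact_cardinality(values):
    """Exact count(distinct): memory grows with the number of distinct values."""
    return len(set(values))


def estimated_memory_bytes(precision_threshold):
    """Rough memory cost from the note above: precision_threshold * 8 bytes."""
    return precision_threshold * 8


# Hypothetical brand values (an assumption, not real index contents).
distinct_brands = exact_cardinality(["A", "B", "A", "C", "B"])
memory_for_100 = estimated_memory_bytes(100)
```

The trade-off is that the exact approach must remember every distinct value, while the approximate aggregation keeps its memory bounded by the threshold.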
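7.percentiles
GET /website/logs/_search
{
  "size": 0,
  "aggs": {
    "latency_percentiles": {
      "percentiles": {
        "field": "latency",
        "percents": [
          50,    // 50th percentile: 50% of requests have a latency at or below this value
          95,    // 95th percentile: 95% of requests have a latency at or below this value
          99     // 99th percentile: 99% of requests have a latency at or below this value
        ]
      }
    },
    "latency_avg": {
      "avg": {
        "field": "latency"
      }
    }
  }
}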
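A small Python sketch shows what a percentile means and why it is reported alongside the average. It uses the exact nearest-rank definition on hypothetical latencies; Elasticsearch computes percentiles approximately, so its numbers on real data may differ slightly. The function name percentile is an assumption for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds (an assumption, not real log data).
latencies = [100, 120, 130, 150, 180, 200, 250, 300, 900, 2000]
p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
p99 = percentile(latencies, 99)
average = sum(latencies) / len(latencies)
```

On this sample the median is far below the average, because a few slow requests pull the average up; the high percentiles expose that tail directly.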
8.percentile ranks
GET /website/logs/_search
{
  "size": 0,
  "aggs": {
    "group_by_province": {
      "terms": {
        "field": "province"
      },
      "aggs": {
        "latency_percentile_ranks": {
          "percentile_ranks": {
            "field": "latency",
            "values": [
              200,     // percentage of requests with a latency within 200 ms
              1000     // percentage of requests with a latency within 1000 ms
            ]
          }
        }
      }
    }
  }
}
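Percentile ranks answer the inverse question of percentiles: given a latency threshold, what share of requests stayed within it. A minimal Python sketch for a single group (the request above computes this per province); the function percentile_rank and the latencies are hypothetical, and the computation here is exact rather than approximate.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are at or below the threshold."""
    within = sum(1 for v in values if v <= threshold)
    return 100.0 * within / len(values)


# Hypothetical latencies in milliseconds (an assumption, not real log data).
latencies = [100, 120, 130, 150, 180, 200, 250, 300, 900, 2000]
within_200ms = percentile_rank(latencies, 200)
within_1000ms = percentile_rank(latencies, 1000)
```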