Elasticsearch Query Commands
- Query methods
- Inserting data
- Viewing the index
- term query
- Token case
- terms query
- match query
- match_phrase query
- multi_match
- match_all query
- bool query
- Limiting the number of results
- Selecting returned fields
- Sorting
- Range query
- Wildcard query
- Fuzzy query
- Scoring
- aggregation queries
- metrics aggregation : avg
- metrics aggregation : avg & histogram
- metrics aggregation : max/min/sum
- metrics aggregation : boxplot
- metrics aggregation : cardinality
- metrics aggregation : extended_stats
- metrics aggregation : geo
- metrics aggregation : matrix_stats
- bucket aggregation : adjacency_matrix
- bucket aggregation : composite
- bucket aggregation : date_histogram/auto_date_histogram
- bucket aggregation : terms/filter/filters
- bucket aggregation : range/date range
- Bucket counts combined with metrics (e.g. avg)
- Pipeline aggregations : avg_bucket
- Pipeline aggregations : cumulative_sum
- Pipeline aggregations : max_bucket
- Script execution
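Query methods
ES has its own query syntax (the Query DSL), which differs from SQL. It also provides a JDBC driver and similar packages that can run the corresponding SQL.
The examples here use ES's own query syntax.
Inserting data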
curl -X POST 'http://localhost:9200/my_index/_doc' -H 'Content-Type: application/json' -d '{ "name": "Wang", "title": "software designer", "age": 35, "address": {"city": "guangzhou", "district": "tianhe"}, "content": "I want to do some AI machine learning works" }'
This automatically creates the index my_index, the _doc mapping type, and a mapping for each field.
Viewing the index
curl localhost:9200/my_index?pretty
{
"my_index" : {
"aliases" : { },
"mappings" : {
"_doc" : {
"properties" : {
"address" : {
"properties" : {
"city" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"district" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"age" : {
"type" : "long"
},
"content" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1628737053188",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "-NHgaqt4R_SQs2KHd0aJwQ",
"version" : {
"created" : "6050199"
},
"provided_name" : "my_index"
}
}
}
}
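The response lists the index settings and mappings.
term query
Requires an exact match: the query value is not analyzed, whereas text data is indexed as analyzed tokens by default.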
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"term":{
"content":"machine"
}
}
}'
This finds the document.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"term":{
"content":"machine learning"
}
}
}'
This finds nothing.
The value "machine learning" must match a term exactly, but the data is indexed token by token and there is no token "machine learning".
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"term":{
"content.keyword":"machine"
}
}
}'
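This finds nothing.
The keyword sub-field holds the original, unanalyzed value rather than the tokens, and the original value is not exactly "machine".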
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"term":{
"content.keyword":"I want to do some AI machine learning works"
}
}
}'
This finds the document.
The query value equals the original value exactly.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"term":{
"address.city":"guangzhou"
}
}
}'
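This finds the document through the inner object field address.city.
Token case
A term query against a text field needs a lowercase value: the default (standard) analyzer lowercases tokens at index time, while the value of a term query is not analyzed at all.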
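The term behaviour described so far can be imitated in a few lines of Python. This is only a rough sketch: `analyze` is a hypothetical stand-in for the standard analyzer (split on non-alphanumerics, then lowercase), and the helper names are made up for illustration, not part of any ES client.

```python
import re

def analyze(text):
    """Hypothetical stand-in for the standard analyzer: split and lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

content = "I want to do some AI machine learning works"

# What the analyzed text field `content` roughly looks like in the index.
indexed_tokens = set(analyze(content))

def term_on_text(value):
    # The term value is NOT analyzed; it must equal one indexed token exactly.
    return value in indexed_tokens

def term_on_keyword(value):
    # The keyword sub-field keeps the whole original string as one term.
    return value == content

print(term_on_text("machine"))           # single token is present
print(term_on_text("machine learning"))  # no such single token
print(term_on_text("AI"), term_on_text("ai"))  # only the lowercase form is indexed
print(term_on_keyword("machine"), term_on_keyword(content))
```

The uppercase "AI" fails for the same reason "machine learning" does: neither string exists as a token in the index.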
terms query
Matches if any one of the given values exactly matches a term.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"terms":{
"content": ["machine", "learning"]
}
}
}'
This finds the document; both machine and learning are indexed tokens.
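match query
Analyzed matching: the query text is analyzed into tokens, and by default a document matches if any one of those tokens matches.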
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match":{
"content": "machine learning"
}
}
}'
This finds the document; both tokens match.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match":{
"content": "works machine AI"
}
}
}'
This finds the document; all tokens match, and their order does not matter.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match":{
"content": "machine factory"
}
}
}'
This finds the document; one matching token (machine) is enough.
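The three match examples above can be reproduced with a small Python sketch. It assumes a simplified analyzer (`analyze`, hypothetical) and models only whether a document matches, not how it is scored. The `operator` argument mirrors the `operator` parameter of the match query, which defaults to `or`.

```python
import re

def analyze(text):
    """Hypothetical stand-in for the standard analyzer: split and lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def match(query, field_text, operator="or"):
    """Return True if the analyzed query matches the analyzed field."""
    doc_tokens = set(analyze(field_text))
    hits = [tok in doc_tokens for tok in analyze(query)]
    if not hits:
        return False
    return all(hits) if operator == "and" else any(hits)

content = "I want to do some AI machine learning works"

print(match("machine learning", content))
print(match("works machine AI", content))
print(match("machine factory", content))
print(match("machine factory", content, operator="and"))
```

With `operator="and"` every query token has to be present, so "machine factory" no longer matches.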
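match_phrase query
The query text is treated as one phrase: it is analyzed, and a document matches only if the field contains the tokens consecutively and in the same order.
(By contrast, a term query on the keyword sub-field requires the original value to equal the query value exactly; match_phrase is a containment relationship.)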
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match_phrase":{
"content": "machine factory"
}
}
}'
This finds nothing, because the phrase machine factory does not occur.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match_phrase":{
"content": "works machine AI"
}
}
}'
This finds nothing: the field contains all three tokens, but match_phrase treats works machine AI as one phrase, and the tokens do not occur in that order.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match_phrase":{
"content": "AI machine learning"
}
}
}'
This finds the document, because AI machine learning occurs as a contiguous phrase.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match_phrase":{
"content": {
"query": "some AI works",
"slop" : 2
}
}
}
}'
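This finds the document.
some AI works does not occur as a contiguous phrase, but slop 2 allows the query tokens to be moved by up to two positions; here works sits two positions later than the phrase expects, because machine learning comes in between.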
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match_phrase":{
"content": {
"query": "some AI works",
"slop" : 1
}
}
}
}'
This finds nothing; a slop of 1 is still not enough to match the phrase.
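The slop behaviour can be approximated in Python. This is a simplified model, not Lucene's actual matcher: for every query token it takes the document position minus the token's offset in the query, and calls the spread between the largest and smallest of those shifts the required slop. The function names and the analyzer are assumptions made for illustration.

```python
import re
from itertools import product

def analyze(text):
    """Hypothetical stand-in for the standard analyzer: split and lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def required_slop(query, field_text):
    """Smallest slop under which the phrase would match, or None if a token is missing."""
    doc_tokens = analyze(field_text)
    query_tokens = analyze(query)
    positions = [[i for i, tok in enumerate(doc_tokens) if tok == q] for q in query_tokens]
    if not query_tokens or any(not p for p in positions):
        return None
    best = None
    # Try every combination of occurrences and keep the tightest one.
    for combo in product(*positions):
        shifts = [pos - offset for offset, pos in enumerate(combo)]
        spread = max(shifts) - min(shifts)
        best = spread if best is None else min(best, spread)
    return best

def match_phrase(query, field_text, slop=0):
    needed = required_slop(query, field_text)
    return needed is not None and needed <= slop

content = "I want to do some AI machine learning works"
print(required_slop("some AI works", content))
print(match_phrase("some AI works", content, slop=2))
print(match_phrase("some AI works", content, slop=1))
```

Under this model "AI machine learning" needs a slop of 0, "some AI works" needs 2, and "works machine AI" needs 5, which agrees with the results reported above.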
multi_match
Runs a match query against several fields; a document matches if any one field matches.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"multi_match":{
"query": "machine learning",
"fields" : ["title", "content"]
}
}
}'
This finds the document; content satisfies the query.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"multi_match":{
"query": "designer",
"fields" : ["title", "content"]
}
}
}'
This finds the document; title satisfies the query.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"multi_match":{
"query": "manager",
"fields" : ["title", "content"]
}
}
}'
This finds nothing; neither title nor content satisfies the query.
match_all query
Returns all documents.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"match_all":{}
}
}'
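No condition is specified.
bool query
A compound query: a document matches only if all of the combined clauses are satisfied.
Each clause is one of must, filter, should, must_not:
must: the clause must match, and it contributes to the score
filter: the clause must match, but it does not contribute to the score
should: at least a certain number of the should clauses must match (set by the minimum_should_match parameter); matching clauses contribute to the score
must_not: the clause must not match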
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"bool":{
"must": [
{
"term": {"address.city": "guangzhou"}
},
{
"match": {"content": "machine learning"}
}
],
"must_not": {
"range": {"age": {"gt": 35}}
},
"filter": {
"match": {"title": "designer"}
},
"should": [
{
"term": {"name": "Li"}
},
{
"match_phrase": {"content": "AI machine"}
}
],
"minimum_should_match" : 1
}
}
}'
This finds the document, because every clause under bool is satisfied.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"bool":{
"must": [
{
"term": {"address.city": "guangzhou"}
},
{
"match": {"content": "machine learning"}
}
],
"must_not": {
"range": {"age": {"from": 35, "to": 40}}
},
"filter": {
"match": {"title": "designer"}
},
"should": [
{
"term": {"name": "Li"}
},
{
"match_phrase": {"content": "AI machine"}
}
],
"minimum_should_match" : 1
}
}
}'
This finds nothing, because the document's age of 35 falls inside the must_not range.
bool queries can be nested.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"bool":{
"must": [
{
"bool": {
"should": [
{
"term": {"name": "Li"}
},
{
"match_phrase": {"content": "AI machine"}
}
]
}
},
{
"bool": {
"filter": {
"match": {"title": "designer"}
}
}
}
]
}
}
}'
Here must contains several bool queries.
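The way the four clause types combine can be sketched in Python. Here each clause is a plain predicate function, the document is flattened into a dict, and scoring is ignored entirely; `bool_matches` and its parameters are hypothetical names. The default for `minimum_should_match` follows the documented rule: 1 when the bool query has no must or filter clause, otherwise 0.

```python
def bool_matches(doc, must=(), filter_=(), should=(), must_not=(), minimum_should_match=None):
    """Decide whether doc satisfies a bool query made of predicate clauses."""
    if minimum_should_match is None:
        minimum_should_match = 0 if (must or filter_) else 1
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return not should or matched_should >= minimum_should_match

doc = {
    "name": "Wang",
    "title": "software designer",
    "age": 35,
    "city": "guangzhou",
    "content": "I want to do some AI machine learning works",
}

common = dict(
    must=[lambda d: d["city"] == "guangzhou", lambda d: "machine" in d["content"]],
    filter_=[lambda d: "designer" in d["title"]],
    should=[lambda d: d["name"] == "Li", lambda d: "AI machine" in d["content"]],
    minimum_should_match=1,
)

# must_not: age > 35, which the document (age 35) does not violate.
first = bool_matches(doc, must_not=[lambda d: d["age"] > 35], **common)
# must_not: 35 <= age <= 40, which excludes the document.
second = bool_matches(doc, must_not=[lambda d: 35 <= d["age"] <= 40], **common)
print(first, second)
```

A nested bool query corresponds to passing another call to `bool_matches` as one of the predicates.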
Limiting the number of results
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"from":0,
"size":2,
"query": {
"term":{
"title":"designer"
}
}
}'
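Starts from the first hit (from is a zero-based offset) and returns at most 2 hits.
Selecting returned fields
Works like selecting specific columns with SQL SELECT.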
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"_source":["name","age"],
"query": {
"term":{
"title":"designer"
}
}
}'
Returns only the name and age fields of the matching documents.
Sorting
Specify the field to sort by.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"_source":["name","age"],
"query": {
"term":{
"title":"designer"
}
},
"sort": [
{
"age": {"order": "desc"}
}
]
}'
The results are sorted by age in descending order.
Range query
Supports from, to, gte, lte, gt, lt and so on, for example
{
"query": {
"range": {
"date": {
"gte": "2021-08-01",
"lte": "2021-08-02",
"relation": "within",
"format": "yyyy-MM-dd"
}
}
}
}
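relation can also be CONTAINS or INTERSECTS (the default).
relation matters because the date field can itself hold a range (a range field type such as date_range), for example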
"date": {"gte":"2021-08-01","lte":"2021-08-03"}
within: the field's range lies entirely inside the query range
contains: the field's range entirely contains the query range
intersects: the field's range overlaps the query range
The date format can be set with format.
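The three relation values come down to simple interval comparisons. The following Python sketch uses closed intervals (gte/lte on both sides); the function name and the tuple representation are assumptions for illustration only.

```python
from datetime import date

def relation_matches(field_range, query_range, relation="intersects"):
    """Compare a stored range against a query range, both as (low, high) closed intervals."""
    f_lo, f_hi = field_range
    q_lo, q_hi = query_range
    if relation == "within":
        return q_lo <= f_lo and f_hi <= q_hi
    if relation == "contains":
        return f_lo <= q_lo and q_hi <= f_hi
    if relation == "intersects":
        return f_lo <= q_hi and q_lo <= f_hi
    raise ValueError(f"unknown relation: {relation}")

# Stored value: 2021-08-01 .. 2021-08-03; query: 2021-08-01 .. 2021-08-02.
stored = (date(2021, 8, 1), date(2021, 8, 3))
query = (date(2021, 8, 1), date(2021, 8, 2))

for relation in ("within", "contains", "intersects"):
    print(relation, relation_matches(stored, query, relation))
```

With these values the stored range extends one day past the query range, so it contains and intersects the query range but is not within it.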
Wildcard query
Supports * and ?
* matches zero or more characters
? matches any single character
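The two wildcard characters map directly onto regular expressions. The Python sketch below translates a pattern and tests it against a single indexed term. The helper names are hypothetical, and the request shown in the comment assumes the example index from earlier sections.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * and ? into a compiled regular expression; escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def wildcard_matches(pattern, term):
    # The pattern has to cover the whole term, not just a part of it.
    return wildcard_to_regex(pattern).fullmatch(term) is not None

# Corresponding request body, e.g. {"query": {"wildcard": {"title": "des*"}}}
print(wildcard_matches("des*", "designer"))
print(wildcard_matches("de?igner", "designer"))
print(wildcard_matches("des?", "designer"))
```

On an analyzed text field the pattern is compared with individual tokens, so it cannot span several words.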
Fuzzy query
Finds terms similar to the query term.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"query": {
"fuzzy":{
"title":"desinger"
}
}
}'
desinger is misspelled, but the document is still found.
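Fuzzy matching is based on edit distance, where swapping two adjacent characters counts as a single edit by default. The Python sketch below computes that distance (the optimal string alignment variant) and applies the AUTO fuzziness thresholds: 0 edits for terms of 1 to 2 characters, 1 edit for 3 to 5, and 2 edits for longer terms. The function names are made up for illustration.

```python
def osa_distance(a, b):
    """Edit distance with insert, delete, substitute and adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

def auto_fuzziness(term):
    """Maximum number of edits allowed under AUTO fuzziness."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def fuzzy_matches(query_term, indexed_term):
    return osa_distance(query_term, indexed_term) <= auto_fuzziness(query_term)

print(osa_distance("desinger", "designer"))
print(fuzzy_matches("desinger", "designer"))
```

The misspelling only swaps n and g, so it is one edit away from designer and well inside the two edits allowed for an eight-character term.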
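Scoring
Each hit carries a _score field that expresses how relevant the document is to the query.
Factors that determine the score include
Term frequency: the more often the term occurs in the document, the higher the weight
Inverse document frequency: the more documents across the index contain the term, the lower the weight (for example and/the)
Field length: the longer the field, the lower the weight (a match in a short field counts for more)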
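These three factors can be seen in the textbook BM25 formula, the similarity ES has used by default since 5.0. The Python sketch below scores one term in one document. It is the textbook form, so absolute numbers will not necessarily equal the _score values ES reports; the parameter names are assumptions, while k1 = 1.2 and b = 0.75 are the usual defaults.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, num_docs, docs_with_term, k1=1.2, b=0.75):
    """Textbook BM25 score of a single term for a single document."""
    # Inverse document frequency: rarer terms get a larger weight.
    idf = math.log(1 + (num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Length normalisation: longer-than-average fields are penalised.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)

base = bm25_term_score(tf=1, doc_len=10, avg_doc_len=10, num_docs=3, docs_with_term=1)
more_frequent = bm25_term_score(tf=3, doc_len=10, avg_doc_len=10, num_docs=3, docs_with_term=1)
common_term = bm25_term_score(tf=1, doc_len=10, avg_doc_len=10, num_docs=3, docs_with_term=3)
longer_field = bm25_term_score(tf=1, doc_len=40, avg_doc_len=10, num_docs=3, docs_with_term=1)

print(base, more_frequent, common_term, longer_field)
```

Raising the term frequency raises the score, while a term found in every document or a longer field lowers it.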
aggregation queries
Insert more data
curl -X POST 'http://localhost:9200/my_index/_doc' -H 'Content-Type: application/json' -d '{
"name": "Wang",
"title": "software designer",
"age": 35,
"address": {"city": "guangzhou", "district": "tianhe"},
"content": "I want to do some AI machine learning works",
"kpi": 3.2,
"date": "2021-01-01T08:00:00Z"
}'
curl -X POST 'http://localhost:9200/my_index/_doc' -H 'Content-Type: application/json' -d '{
"name": "Li",
"title": "senior software designer",
"age": 30,
"address": {"city": "guangzhou", "district": "tianhe"},
"content": "I want to do some K8S works",
"kpi": 4.0,
"date": "2021-01-01T10:00:00Z"
}'
curl -X POST 'http://localhost:9200/my_index/_doc' -H 'Content-Type: application/json' -d '{
"name": "Zhang",
"title": "Test Engineer",
"age": 25,
"address": {"city": "guangzhou", "district": "tianhe"},
"content": "I want to do some auto-test works",
"kpi": 4.5,
"date": "2021-06-01T09:00:00Z"
}'
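Aggregation queries fall into the following categories
Bucket aggregations: group documents into buckets and count the documents in each bucket
Metrics aggregations: compute metrics such as the average or maximum over the documents in each group
Pipeline aggregations: perform further calculations on the output of other aggregations
Some of the aggregations are illustrated below.
For the full list, see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html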
metrics aggregation : avg
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"query": {
"match":{
"title": "designer Engineer"
}
},
"aggs": {
"kpi_avg": {
"avg": {
"field": "kpi",
"missing": 3.5
}
}
}
}'
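aggs is a keyword; aggregations is also accepted
kpi_avg is a user-chosen name that appears in the result
avg is a keyword that requests the avg operation; field names the field to average, and missing gives the default value to use for documents that lack the field
Without a query, the aggregation runs over all documents
Unless size is set to 0, the matching documents are returned in addition to the aggregation result
With size set to 0, only the aggregation result is returned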
"aggregations" : {
"kpi_avg" : {
"value" : 3.900000015894572
}
}
Several aggregations can be requested in one search.
metrics aggregation : avg & histogram
When the data is of type histogram (this must be specified when the index is created)
curl -X PUT 'http://localhost:9200/my_index_histogram' -H 'Content-Type: application/json' -d '{
"mappings" : {
"properties" : {
"my_histogram" : {
"type" : "histogram"
}
}
}
}'
curl -X POST 'http://localhost:9200/my_index_histogram/_doc' -H 'Content-Type: application/json' -d '{
"name": "Zhao",
"title": "manager",
"age": 30,
"address": {"city": "guangzhou", "district": "tianhe"},
"my_histogram": {
"values" : [3.5, 4.0, 4.5],
"counts" : [1, 2, 3]
}
}'
avg is computed as (3.5*1 + 4.0*2 + 4.5*3) / (1+2+3) = 25 / 6
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index_histogram/_search?pretty -d '{
"size": 0,
"aggs": {
"score_avg": {
"avg": {
"field": "my_histogram"
}
}
}
}'
Result
"aggregations" : {
"score_avg" : {
"value" : 4.166666666666667
}
}
Fields in an automatically created index are not of type histogram by default.
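The avg over a histogram field is a count-weighted mean, which is easy to check by hand. A minimal Python sketch, with a hypothetical function name and the values and counts from the document indexed above:

```python
def histogram_avg(values, counts):
    """Weighted mean: each value contributes as many times as its count."""
    weighted_sum = sum(v * c for v, c in zip(values, counts))
    return weighted_sum / sum(counts)

score_avg = histogram_avg([3.5, 4.0, 4.5], [1, 2, 3])
print(score_avg)  # 25 / 6
```

The weighted sum is 3.5 + 8.0 + 13.5 = 25 over 6 observations, which agrees with the 4.1666... returned by the aggregation.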
metrics aggregation : max/min/sum
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"kpi_avg": {
"max": {
"field": "kpi"
}
}
}
}'
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"kpi_avg": {
"min": {
"field": "kpi"
}
}
}
}'
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"kpi_avg": {
"sum": {
"field": "kpi"
}
}
}
}'
Computes min/max/sum and so on.
metrics aggregation : boxplot
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"query": {
"match":{
"title": "designer Engineer"
}
},
"aggs": {
"kpi_avg": {
"boxplot": {
"field": "kpi",
"missing": 3.5
}
}
}
}'
Result
"aggregations" : {
"kpi_avg" : {
"min" : 3.200000047683716,
"max" : 4.5,
"q1" : 3.400000035762787,
"q2" : 4.0,
"q3" : 4.375,
"lower" : 3.200000047683716,
"upper" : 4.5
}
}
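Box plot
q1 : lower quartile (25%)
q2 : median (50%)
q3 : upper quartile (75%)
lower : the smallest value that is not less than q1 - 1.5 * (q3 - q1)
upper : the largest value that is not greater than q3 + 1.5 * (q3 - q1)
min : minimum
max : maximum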
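The lower and upper whiskers follow from q1 and q3 with a little arithmetic. The Python sketch below takes the quartiles as given (ES estimates them approximately, so they are copied from the result above rather than recomputed) and derives the whiskers. The function name is hypothetical.

```python
def whiskers(values, q1, q3):
    """Return (lower, upper): the extreme values still inside the 1.5 * IQR fences."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    lower = min(v for v in values if v >= low_fence)
    upper = max(v for v in values if v <= high_fence)
    return lower, upper

kpis = [3.2, 4.0, 4.5]
lower, upper = whiskers(kpis, q1=3.4, q3=4.375)
print(lower, upper)
```

No kpi value falls outside the fences here, so lower and upper coincide with min and max, as in the result above.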
metrics aggregation : cardinality
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"query": {
"match":{
"title": "designer Engineer"
}
},
"aggs": {
"kpi_count": {
"cardinality": {
"field": "kpi"
}
}
}
}'
Result
"aggregations" : {
"kpi_count" : {
"value" : 3
}
}
Equivalent to count(distinct field); the count is approximate for high-cardinality fields.
metrics aggregation : extended_stats
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"query": {
"match":{
"title": "designer Engineer"
}
},
"aggs": {
"kpi_stats": {
"extended_stats": {
"field": "kpi"
}
}
}
}'
Result
"aggregations" : {
"kpi_stats" : {
"count" : 3,
"min" : 3.200000047683716,
"max" : 4.5,
"avg" : 3.900000015894572,
"sum" : 11.700000047683716,
"sum_of_squares" : 46.49000030517578,
"variance" : 0.28666664441426565,
"variance_population" : 0.28666664441426565,
"variance_sampling" : 0.4299999666213985,
"std_deviation" : 0.5354125926930237,
"std_deviation_population" : 0.5354125926930237,
"std_deviation_sampling" : 0.6557438269792545,
"std_deviation_bounds" : {
"upper" : 4.970825201280619,
"lower" : 2.8291748305085243,
"upper_population" : 4.970825201280619,
"lower_population" : 2.8291748305085243,
"upper_sampling" : 5.2114876698530805,
"lower_sampling" : 2.588512361936063
}
}
}
Returns a range of statistics in one request.
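Most of these numbers can be recomputed by hand, which helps in telling the population and sampling variants apart. The Python sketch below uses the exact kpi values 3.2, 4.0 and 4.5, so the last digits differ slightly from the ES output, which stores 3.2 as a 32-bit float. The function name is hypothetical; sigma = 2 is the default width of std_deviation_bounds.

```python
import math

def extended_stats(values, sigma=2.0):
    """Recompute a subset of the extended_stats output for a list of numbers."""
    n = len(values)
    avg = sum(values) / n
    squared_dev = sum((v - avg) ** 2 for v in values)
    variance_population = squared_dev / n          # divide by n
    variance_sampling = squared_dev / (n - 1) if n > 1 else float("nan")  # divide by n - 1
    std_population = math.sqrt(variance_population)
    return {
        "count": n,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": sum(values),
        "variance_population": variance_population,
        "variance_sampling": variance_sampling,
        "std_deviation_population": std_population,
        "upper_population": avg + sigma * std_population,
        "lower_population": avg - sigma * std_population,
    }

stats = extended_stats([3.2, 4.0, 4.5])
for name, value in stats.items():
    print(name, value)
```

The plain variance and std_deviation fields in the result above carry the same values as their population counterparts.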
metrics aggregation : geo
Create data of a geo type
curl -X PUT 'http://localhost:9200/my_index_geo' -H 'Content-Type: application/json' -d '{
"mappings" : {
"properties" : {
"location" : {
"type" : "geo_point"
}
}
}
}'
curl -X POST 'http://localhost:9200/my_index_geo/_doc' -H 'Content-Type: application/json' -d '{
"location": "52.374081,4.912350"
}'
curl -X POST 'http://localhost:9200/my_index_geo/_doc' -H 'Content-Type: application/json' -d '{
"location": "52.369219,4.901618"
}'
curl -X POST 'http://localhost:9200/my_index_geo/_doc' -H 'Content-Type: application/json' -d '{
"location": "52.371667,4.914722"
}'
curl -X POST 'http://localhost:9200/my_index_geo/_doc' -H 'Content-Type: application/json' -d '{
"location": "51.222900,4.405200"
}'
The centroid can be computed
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index_geo/_search?pretty -d '{
"size": 0,
"aggs": {
"centroid": {
"geo_centroid": {
"field": "location"
}
}
}
}'
Result
"aggregations" : {
"centroid" : {
"location" : {
"lat" : 52.08446673466824,
"lon" : 4.783472470007837
},
"count" : 4
}
}
Bounds and other geo metrics are also available.
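The centroid returned above is consistent with a plain arithmetic mean of the latitudes and of the longitudes. The Python sketch below recomputes it from the four "lat,lon" strings that were indexed. The function name is hypothetical, and small differences in the last digits are expected because ES stores geo points with limited precision.

```python
def centroid(points):
    """Average latitude and longitude of points given as "lat,lon" strings."""
    lats, lons = [], []
    for point in points:
        lat, lon = point.split(",")
        lats.append(float(lat))
        lons.append(float(lon))
    return {"lat": sum(lats) / len(lats), "lon": sum(lons) / len(lons), "count": len(points)}

locations = [
    "52.374081,4.912350",
    "52.369219,4.901618",
    "52.371667,4.914722",
    "51.222900,4.405200",
]

center = centroid(locations)
print(center)
```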
metrics aggregation : matrix_stats
Computes mean, variance, skewness, kurtosis, covariance, correlation, and so on, across several numeric fields.
bucket aggregation : adjacency_matrix
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"people_group": {
"adjacency_matrix": {
"filters": {
"grpA" : { "terms" : { "title" : ["designer", "engineer"] }},
"grpB" : { "terms" : { "title" : ["senior", "software"] }},
"grpC" : { "terms" : { "content" : ["test", "pv"] }}
}
}
}
}
}'
Result
"aggregations" : {
"people_group" : {
"buckets" : [
{
"key" : "grpA",
"doc_count" : 3
},
{
"key" : "grpA&grpB",
"doc_count" : 2
},
{
"key" : "grpA&grpC",
"doc_count" : 1
},
{
"key" : "grpB",
"doc_count" : 2
},
{
"key" : "grpC",
"doc_count" : 1
}
]
}
}
Counts the documents matching each filter, and each pair of filters whose intersection is not empty.
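The buckets above can be reproduced with set intersections. In the Python sketch below each filter is reduced to "the field contains any of these tokens", `analyze` is a hypothetical stand-in for the standard analyzer, and the three documents are the ones inserted for the aggregation examples.

```python
import re
from itertools import combinations

def analyze(text):
    """Hypothetical stand-in for the standard analyzer: split and lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

docs = [
    {"title": "software designer", "content": "I want to do some AI machine learning works"},
    {"title": "senior software designer", "content": "I want to do some K8S works"},
    {"title": "Test Engineer", "content": "I want to do some auto-test works"},
]

filters = {
    "grpA": ("title", {"designer", "engineer"}),
    "grpB": ("title", {"senior", "software"}),
    "grpC": ("content", {"test", "pv"}),
}

def matching_docs(field, terms):
    # Indexes of the documents whose field shares at least one token with terms.
    return {i for i, doc in enumerate(docs) if terms & set(analyze(doc[field]))}

def adjacency_matrix(filters):
    hits = {name: matching_docs(field, terms) for name, (field, terms) in filters.items()}
    buckets = {name: len(ids) for name, ids in hits.items() if ids}
    for a, b in combinations(sorted(hits), 2):
        both = hits[a] & hits[b]
        if both:  # empty intersections produce no bucket
            buckets[f"{a}&{b}"] = len(both)
    return buckets

print(adjacency_matrix(filters))
```

grpB&grpC is absent because no document matches both filters.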
bucket aggregation : composite
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"composite_group": {
"composite": {
"sources": [
{"title" : { "terms" : { "field" : "title.keyword"}}},
{"age" : { "terms" : { "field" : "age"}}}
]
}
}
}
}'
The keyword sub-field (unanalyzed) must be used; aggregating on the analyzed text field does not work by default.
Result
"aggregations" : {
"composite_group" : {
"after_key" : {
"title" : "software designer",
"age" : 35
},
"buckets" : [
{
"key" : {
"title" : "Test Engineer",
"age" : 25
},
"doc_count" : 1
},
{
"key" : {
"title" : "senior software designer",
"age" : 30
},
"doc_count" : 1
},
{
"key" : {
"title" : "software designer",
"age" : 35
},
"doc_count" : 1
}
]
}
}
As you can see, the result resembles
select
field_a, field_b, count(*)
group by
field_a, field_b
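The same grouping can be written in Python with a Counter keyed by a tuple, one tuple element per composite source. This is only an in-memory analogy using the three example documents; it does not model the paging that after_key enables.

```python
from collections import Counter

docs = [
    {"title": "software designer", "age": 35},
    {"title": "senior software designer", "age": 30},
    {"title": "Test Engineer", "age": 25},
]

# Key each document by (title, age), like GROUP BY title, age.
composite_buckets = Counter((doc["title"], doc["age"]) for doc in docs)

for (title, age), doc_count in sorted(composite_buckets.items()):
    print(title, age, doc_count)
```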
Besides terms, a source can also be Histogram, Date histogram, or GeoTile grid
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"composite": {
"sources": [
{
"date": {
"date_histogram": {
"field": "date",
"calendar_interval": "1d"
}
}
}
]
}
}
}
}'
For example, this Date histogram source truncates the date field to the day and then groups by day
"aggregations" : {
"my_buckets" : {
"after_key" : {
"date" : 1622505600000
},
"buckets" : [
{
"key" : {
"date" : 1609459200000
},
"doc_count" : 2
},
{
"key" : {
"date" : 1622505600000
},
"doc_count" : 1
}
]
}
}
The date format can be set with the format field.
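The bucket keys in this result are the start of each day in epoch milliseconds (UTC). The Python sketch below truncates the three example timestamps in the same way. The function name is hypothetical, and it assumes the timestamp format used when the documents were inserted.

```python
from collections import Counter
from datetime import datetime, timezone

def day_bucket_key(iso_timestamp):
    """Truncate a UTC timestamp to midnight and return epoch milliseconds."""
    moment = datetime.strptime(iso_timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    day_start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return int(day_start.timestamp()) * 1000

dates = ["2021-01-01T08:00:00Z", "2021-01-01T10:00:00Z", "2021-06-01T09:00:00Z"]
day_buckets = Counter(day_bucket_key(d) for d in dates)

for key, doc_count in sorted(day_buckets.items()):
    print(key, doc_count)
```

The two documents dated 2021-01-01 share one key, and the one dated 2021-06-01 gets its own.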
bucket aggregation : date_histogram/auto_date_histogram
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"date_histogram": {
"field": "date",
"calendar_interval": "1d"
}
}
}
}'
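Counts per day. Note that buckets run from the minimum to the maximum value: in this example there is one bucket per day from 2021-01-01 to 2021-06-01, even where the count is 0.
calendar_interval accepts only a single unit such as 1d; multiples of days must use fixed_interval.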
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"date_histogram": {
"field": "date",
"fixed_interval": "2d"
}
}
}
}'
auto_date_histogram is similar to date_histogram, but you specify a target number of buckets and the system picks the interval automatically to get as close to that target as possible.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"auto_date_histogram": {
"field": "date",
"buckets": 3
}
}
}
}'
Result
"aggregations" : {
"my_buckets" : {
"buckets" : [
{
"key_as_string" : "2021-01-01T00:00:00.000Z",
"key" : 1609459200000,
"doc_count" : 2
},
{
"key_as_string" : "2021-04-01T00:00:00.000Z",
"key" : 1617235200000,
"doc_count" : 1
}
],
"interval" : "3M"
}
}
The system chose 3M as the interval.
bucket aggregation : terms/filter/filters
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"terms": {
"field": "age"
}
}
}
}'
Groups by a field and counts each group, equivalent to select age, count(*) from table group by age
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"filter": { "term": {"title": "designer"}}
}
}
}'
Counts the documents matching one value of a field, roughly equivalent to select count(*) from table where title like '%designer%'
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"filters": {
"filters": {
"title": { "match": {"title": "designer"}},
"age": { "match": {"age": 35}}
}
}
}
}
}'
Counts each filter separately, equivalent to running two filter aggregations.
bucket aggregation : range/date range
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"age_range": {
"range": {
"field": "age",
"ranges": [
{ "to": 35 },
{ "from": 30, "to": 35 },
{ "from": 35 }
]
}
}
}
}'
Result
"aggregations" : {
"age_range" : {
"buckets" : [
{
"key" : "*-35.0",
"to" : 35.0,
"doc_count" : 2
},
{
"key" : "30.0-35.0",
"from" : 30.0,
"to" : 35.0,
"doc_count" : 1
},
{
"key" : "35.0-*",
"from" : 35.0,
"doc_count" : 1
}
]
}
}
Counts the documents in each age range (from is inclusive, to is exclusive).
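The counts above depend on the boundary rule: from is inclusive and to is exclusive, and ranges may overlap. The Python sketch below applies that rule to the three example ages and builds keys in the same style as the result. The function name is hypothetical.

```python
def range_buckets(values, ranges):
    """Count values per range; `from` is inclusive, `to` is exclusive."""
    buckets = {}
    for spec in ranges:
        low, high = spec.get("from"), spec.get("to")
        low_label = "*" if low is None else str(float(low))
        high_label = "*" if high is None else str(float(high))
        buckets[f"{low_label}-{high_label}"] = sum(
            1 for v in values
            if (low is None or v >= low) and (high is None or v < high)
        )
    return buckets

ages = [35, 30, 25]
age_range = range_buckets(ages, [{"to": 35}, {"from": 30, "to": 35}, {"from": 35}])
print(age_range)
```

Age 35 lands only in the last bucket, while age 30 is counted in both of the first two because those ranges overlap.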
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_range": {
"date_range": {
"field": "date",
"ranges": [
{ "to": "now-3M/M" },
{ "from": "now-3M/M" }
]
}
}
}
}'
Result
"aggregations" : {
"my_range" : {
"buckets" : [
{
"key" : "*-2021-05-01T00:00:00.000Z",
"to" : 1.6198272E12,
"to_as_string" : "2021-05-01T00:00:00.000Z",
"doc_count" : 2
},
{
"key" : "2021-05-01T00:00:00.000Z-*",
"from" : 1.6198272E12,
"from_as_string" : "2021-05-01T00:00:00.000Z",
"doc_count" : 1
}
]
}
}
Counts the documents in each time range.
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_range": {
"date_range": {
"field": "date",
"ranges": [
{ "key": "Older", "to":"2021-05-01" },
{ "key": "Newer", "from":"2021-05-01" }
]
}
}
}
}'
Explicit dates can be used as well.
Bucket counts combined with metrics (e.g. avg)
curl -X GET -H "Content-Type: application/json" localhost:9200/my_index/_search?pretty -d '{
"size": 0,
"aggs": {
"my_buckets": {
"filter": { "term": {"title": "designer"}},
"aggs": {
"avg_age": { "avg": { "field": "age" } }
}
}
}
}'
Result
"aggregations" : {
"my_buckets" : {
"doc_count" : 2,
"avg_age" : {
"value" : 32.5
}
}
}
As you can see, this both counts the documents in the bucket and computes the average over that bucket.
Pipeline aggregations : avg_bucket
POST _search
{
"size": 0,
"aggs": {
"sales_per_month": {
"date_histogram": {
"field": "date",
"calendar_interval": "month"
},
"aggs": {
"sales": {
"sum": {
"field": "price"
}
}
}
},
"avg_monthly_sales": {
"avg_bucket": {
"buckets_path": "sales_per_month>sales",
"gap_policy": "skip",
"format": "#,##0.00;(#,##0.00)"
}
}
}
}
Result
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/01/01 00:00:00",
"key": 1420070400000,
"doc_count": 3,
"sales": {
"value": 550.0
}
},
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2,
"sales": {
"value": 60.0
}
},
{
"key_as_string": "2015/03/01 00:00:00",
"key": 1425168000000,
"doc_count": 2,
"sales": {
"value": 375.0
}
}
]
},
"avg_monthly_sales": {
"value": 328.33333333333333,
"value_as_string": "328.33"
}
}
Computes the sum for each bucket, then the average of those bucket sums.
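The pipeline step itself is just a mean over the per-bucket values. The Python sketch below starts from the three monthly sums in the result above and, in the spirit of gap_policy skip, leaves out buckets that have no value (represented here as None, an assumption of this sketch). The function name is hypothetical.

```python
def avg_bucket(bucket_values):
    """Average of the per-bucket metric values, skipping empty buckets (None)."""
    present = [v for v in bucket_values if v is not None]
    return sum(present) / len(present) if present else None

monthly_sales = [550.0, 60.0, 375.0]
avg_monthly_sales = avg_bucket(monthly_sales)
print(avg_monthly_sales)
print(avg_bucket([550.0, None, 375.0]))
```

(550 + 60 + 375) / 3 gives the 328.33 shown as avg_monthly_sales.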
Pipeline aggregations : cumulative_sum
POST /sales/_search
{
"size": 0,
"aggs": {
"sales_per_month": {
"date_histogram": {
"field": "date",
"calendar_interval": "month"
},
"aggs": {
"sales": {
"sum": {
"field": "price"
}
},
"cumulative_sales": {
"cumulative_sum": {
"buckets_path": "sales"
}
}
}
}
}
}
Result
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/01/01 00:00:00",
"key": 1420070400000,
"doc_count": 3,
"sales": {
"value": 550.0
},
"cumulative_sales": {
"value": 550.0
}
},
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2,
"sales": {
"value": 60.0
},
"cumulative_sales": {
"value": 610.0
}
},
{
"key_as_string": "2015/03/01 00:00:00",
"key": 1425168000000,
"doc_count": 2,
"sales": {
"value": 375.0
},
"cumulative_sales": {
"value": 985.0
}
}
]
}
}
Computes the sum for each bucket, then the running total of the bucket sums up to and including each bucket.
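A running total like this is what itertools.accumulate produces. A short Python sketch over the three monthly sums from the result above:

```python
from itertools import accumulate

monthly_sales = [550.0, 60.0, 375.0]

# Each entry is the sum of all bucket values up to and including that bucket.
cumulative_sales = list(accumulate(monthly_sales))

for sales, running_total in zip(monthly_sales, cumulative_sales):
    print(sales, running_total)
```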
Pipeline aggregations : max_bucket
POST /sales/_search
{
"size": 0,
"aggs": {
"sales_per_month": {
"date_histogram": {
"field": "date",
"calendar_interval": "month"
},
"aggs": {
"sales": {
"sum": {
"field": "price"
}
}
}
},
"max_monthly_sales": {
"max_bucket": {
"buckets_path": "sales_per_month>sales"
}
}
}
}
Result
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/01/01 00:00:00",
"key": 1420070400000,
"doc_count": 3,
"sales": {
"value": 550.0
}
},
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2,
"sales": {
"value": 60.0
}
},
{
"key_as_string": "2015/03/01 00:00:00",
"key": 1425168000000,
"doc_count": 2,
"sales": {
"value": 375.0
}
}
]
},
"max_monthly_sales": {
"keys": ["2015/01/01 00:00:00"],
"value": 550.0
}
}
Computes the sum for each bucket, then picks the bucket with the largest sum.
Script execution
Script queries are supported; they are omitted here.