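Facet statistics (called aggregations in later versions)
Although the official site stresses that facets will be removed from Elasticsearch in a future release and recommends aggregations instead, I still use facets at work. While reading "Mastering Elasticsearch" I came across its introduction to facets, which is practical and easy to follow, so I translated part of it here for anyone who needs it.
When using the Elasticsearch faceting mechanism, keep one thing in mind: facet results are computed only over the results of the query. If you put a filter outside the query object, that filter does not restrict the documents the facet counts.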
Let's look at an example.
First, index a few documents into the books index with the following commands:
curl -XPUT 'localhost:9200/books/book/1' -d '{
"id":"1", "title":"Test book 1", "category":"book",
"price":29.99
}'
curl -XPUT 'localhost:9200/books/book/2' -d '{
"id":"2", "title":"Test book 2", "category":"book",
"price":39.99
}'
curl -XPUT 'localhost:9200/books/book/3' -d '{
"id":"3", "title":"Test comic 1","category":"comic",
"price":11.99
}'
curl -XPUT 'localhost:9200/books/book/4' -d '{
"id":"4", "title":"Test comic 2","category":"comic",
"price":15.99
}'
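Now let's see how faceting behaves when a query and a filter are used together. We run a simple query that returns every document in the books index. We also include a filter that limits the results to the book category, and a range facet on the price field that reports how many documents are priced below 30 and how many are priced from 30 up. The full request is: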
{
  "query": {
    "match_all": {}
  },
  "filter": {
    "term": {
      "category": "book"
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      }
    }
  }
}
Running it returns the following result:
{
…
"hits":{
"total":2,
"max_score": 1.0,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 1.0,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 1.0,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 3,
"min": 11.99,
"max": 29.99,
"total_count": 3,
"total": 57.97,
"mean": 19.323333333333334
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
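The numbers in the facet section of this response can be checked by hand. The short Python sketch below recomputes the two buckets over the prices of all four indexed documents. It is only an illustration: the helper `range_stats` is a hypothetical name, and it assumes that `from` is inclusive and `to` is exclusive.

```python
# Prices of the four documents indexed above (ids 1 to 4).
PRICES = [29.99, 39.99, 11.99, 15.99]


def range_stats(values, lo=None, hi=None):
    """Summarise the values v with lo <= v < hi; a bound of None is open.

    Hypothetical helper; assumes "from" is inclusive and "to" is exclusive.
    """
    hits = [v for v in values
            if (lo is None or v >= lo) and (hi is None or v < hi)]
    if not hits:
        return {"count": 0, "total": 0.0, "mean": 0.0}
    total = sum(hits)
    return {
        "count": len(hits),
        "min": min(hits),
        "max": max(hits),
        "total": round(total, 2),  # rounded to hide floating-point noise
        "mean": total / len(hits),
    }


below_30 = range_stats(PRICES, hi=30)   # bucket {"to": 30}
from_30 = range_stats(PRICES, lo=30)    # bucket {"from": 30}
print(below_30)
print(from_30)
```

The first bucket holds three prices (11.99, 15.99 and 29.99), which is only possible if the two comic documents were counted as well, even though they do not appear in the hits.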
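The result shows that although the filter keeps only documents whose category field is book, the facet was not computed on those documents alone. It ran over every document in the books index, because the query is match_all. In other words, the facet calculation ignores the filter. But what if the filter is part of the query, as in a filtered query? Consider the next request: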
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "category": "book"
        }
      }
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      }
    }
  }
}
Response:
{
...
"hits":{
"total": 2,
"max_score": 1.0,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 1.0,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 1.0,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 1,
"min": 29.99,
"max": 29.99,
"total_count": 1,
"total": 29.99,
"mean": 29.99
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
The response shows that this time the filter does restrict the set of documents the facet is computed on.
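The difference between the two requests can be modelled in a few lines of Python. This is a simplified in-memory sketch, not how Elasticsearch is implemented: `DOCS`, `is_book` and `count_ranges` are hypothetical names, and the range bounds are assumed to be inclusive for `from` and exclusive for `to`.

```python
# In-memory copy of the four documents indexed earlier.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def is_book(doc):
    """Stand-in for the term filter on category == "book"."""
    return doc["category"] == "book"


def count_ranges(docs):
    """Count documents priced below 30 and at or above 30."""
    return {
        "to_30": sum(d["price"] < 30 for d in docs),
        "from_30": sum(d["price"] >= 30 for d in docs),
    }


# Case 1: match_all query plus a top-level filter.
query_results = list(DOCS)                                  # match_all
hits_top_level = [d for d in query_results if is_book(d)]   # filter narrows the hits only
facet_top_level = count_ranges(query_results)               # facet sees the query results

# Case 2: the same filter placed inside a filtered query.
query_results = [d for d in DOCS if is_book(d)]             # filter is part of the query
hits_filtered = query_results
facet_filtered = count_ranges(query_results)                # facet sees two documents

print(len(hits_top_level), facet_top_level)
print(len(hits_filtered), facet_filtered)
```

Both cases return the same two hits. Only the facet input differs: four documents in the first case, two in the second.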
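Now suppose we want to compute the facet only for books whose title field contains the term "2". We could add a second filter to the query, but that would also narrow the query results, which is not what we want. What we need instead is a facet filter.
Placing facet_filter at the same level as the facet type lets us restrict the documents the facet is computed on. For example, to limit the facet calculation to documents whose title field contains "2", the request becomes: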
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "category": "book"
        }
      }
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      },
      "facet_filter": {
        "term": {
          "title": "2"
        }
      }
    }
  }
}
Response:
{
...
"hits":{
"total":2,
"max_score": 1.0,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 1.0,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 1.0,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 0,
"total_count": 0,
"total": 0.0,
"mean": 0.0
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
As shown above, the facet is now computed over a single document, while the query results are unchanged.
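The effect of facet_filter can be sketched the same way in Python. The names `DOCS`, `has_term` and `count_ranges` are hypothetical, and `has_term` is only a rough stand-in for a term filter: it assumes the title was analyzed into lowercase, whitespace-separated tokens.

```python
# In-memory copy of the four documents indexed earlier.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def has_term(doc, field, term):
    """Rough stand-in for a term filter on an analyzed text field.

    Assumption: the field was split on whitespace and lowercased.
    """
    return term in doc[field].lower().split()


def count_ranges(docs):
    """Count documents priced below 30 and at or above 30."""
    return {
        "to_30": sum(d["price"] < 30 for d in docs),
        "from_30": sum(d["price"] >= 30 for d in docs),
    }


# Query: filtered query keeping only category == "book".
hits = [d for d in DOCS if d["category"] == "book"]

# facet_filter narrows the facet input only; the hits stay as they are.
facet_docs = [d for d in hits if has_term(d, "title", "2")]
facet = count_ranges(facet_docs)

print([d["id"] for d in hits])
print([d["id"] for d in facet_docs], facet)
```

Only the document with id 2 passes both the query and the facet filter, which matches the empty first bucket and the single 39.99 entry in the second bucket of the response.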
Now suppose we want to query only the documents whose category field is "book", but compute the facet over every document in the index. How do we do that?
Here is the request:
{
  "query": {
    "term": {
      "category": "book"
    }
  },
  "facets": {
    "price": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 30 },
          { "from": 30 }
        ]
      },
      "global": true
    }
  }
}
Response:
{
...
"hits":{
"total":2,
"max_score": 0.30685282,
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "1",
"_score": 0.30685282,
"_source": {
"id": "1",
"title": "Test book 1",
"category": "book",
"price": 29.99
}
},
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 0.30685282,
"_source": {
"id": "2",
"title": "Test book 2",
"category": "book",
"price": 39.99
}
}
]
},
"facets": {
"price": {
"_type": "range",
"ranges": [
{
"to": 30.0,
"count": 3,
"min": 11.99,
"max": 29.99,
"total_count": 3,
"total": 57.97,
"mean": 19.323333333333334
},
{
"from": 30.0,
"count": 1,
"min": 39.99,
"max": 39.99,
"total_count": 1,
"total": 39.99,
"mean": 39.99
}
]
}
}
}
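This is the benefit global brings to facets: the facet ignores the query and is computed over every document in the index.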
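As a last illustration, the global switch can be modelled in Python as a choice of facet input. This is a sketch under the same simplifying assumptions as before, and `DOCS`, `facet_input` and `count_ranges` are hypothetical names, not part of any Elasticsearch client.

```python
# In-memory copy of the four documents indexed earlier.
DOCS = [
    {"id": "1", "title": "Test book 1", "category": "book", "price": 29.99},
    {"id": "2", "title": "Test book 2", "category": "book", "price": 39.99},
    {"id": "3", "title": "Test comic 1", "category": "comic", "price": 11.99},
    {"id": "4", "title": "Test comic 2", "category": "comic", "price": 15.99},
]


def count_ranges(docs):
    """Count documents priced below 30 and at or above 30."""
    return {
        "to_30": sum(d["price"] < 30 for d in docs),
        "from_30": sum(d["price"] >= 30 for d in docs),
    }


def facet_input(all_docs, query_hits, is_global=False):
    """Pick the documents a facet runs on.

    With is_global=True the query is ignored and the whole index is used.
    """
    return all_docs if is_global else query_hits


# Query: term query on category == "book".
hits = [d for d in DOCS if d["category"] == "book"]

scoped = count_ranges(facet_input(DOCS, hits))                    # default scope
global_ = count_ranges(facet_input(DOCS, hits, is_global=True))   # "global": true

print(scoped)
print(global_)
```

With the default scope the facet would see only the two book documents. With the global flag it sees all four, which matches the counts of 3 and 1 in the response.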