Elasticsearch Forum in Practice: Multi-Field Search with the best fields Strategy Using dis_max
Prepare the data
PUT /forum/post/_bulk
{"index":{"_id":1}}
{"title":"java php", "content":" kibana forum open MIjMReACTGaN564AnCZuHg"}
{"index":{"_id":2}}
{"title":"elasticsearch php", "content":"post open 4508327"}
{"index":{"_id":3}}
{"title":"elasticsearch hadoop", "content":"java kibana green open"}
Run the following query and observe the results.
GET /forum/post/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title": "java kibana" } },
        { "match": { "content": "java kibana" } }
      ]
    }
  }
}
Result analysis
#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.43398,
    "hits" : [
      {
        "_index" : "forum",
        "_type" : "post",
        "_id" : "1",
        "_score" : 1.43398,
        "_source" : {
          "title" : "java php",
          "content" : " kibana forum open MIjMReACTGaN564AnCZuHg"
        }
      },
      {
        "_index" : "forum",
        "_type" : "post",
        "_id" : "3",
        "_score" : 1.3988109,
        "_source" : {
          "title" : "elasticsearch hadoop",
          "content" : "java kibana green open"
        }
      }
    ]
  }
}
We expected doc3 to come first (its content field matches both java and kibana), but doc1 was ranked ahead of it.
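How each document's relevance score is computed, in simplified form: add up the scores of the matching queries, multiply by the number of matched queries, and divide by the total number of queries. (The multiply-and-divide step is the coordination factor of older Elasticsearch versions. On the version used here, doc3's score of 1.3988109 is identical in the bool response above and in the dis_max response later in this post, so the bool query is effectively just summing the scores of the matching queries. Either way, doc1 comes out ahead because two of its fields each contribute a partial match.)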
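Work out doc1's score
"match": {"title": "java kibana"}: doc1 gets a score for this query (its title contains java)
"match": {"content": "java kibana"}: doc1 gets a score for this query too (its content contains kibana)
So the two scores are added together, for example 1.1 + 1.2 = 2.3
Number of matched queries = 2
Total number of queries = 2
2.3 * 2 / 2 = 2.3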
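Work out doc3's score
"match": {"title": "java kibana"}: doc3 gets no score for this query (its title contains neither java nor kibana)
"match": {"content": "java kibana"}: doc3 gets a score for this query (its content contains both java and kibana)
So only one query contributes a score, for example 2.3
Number of matched queries = 1
Total number of queries = 2
2.3 * 1 / 2 = 1.15
doc3's score = 1.15 < doc1's score = 2.3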
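The arithmetic above can be sketched in Python. The sub-scores 1.1, 1.2, and 2.3 are this article's made-up illustration values, not real Elasticsearch output, and coord_style_score is a hypothetical helper name. It follows the simplified formula as stated here rather than Elasticsearch's actual scoring code.

```python
def coord_style_score(clause_scores):
    """Sum the scores of the matching clauses, then scale by matched / total clauses."""
    matched = [s for s in clause_scores if s > 0]
    return sum(matched) * len(matched) / len(clause_scores)


# Illustrative per-clause scores in the order (title match, content match).
# A score of 0.0 stands for "this clause did not match the document".
doc1_score = coord_style_score([1.1, 1.2])  # both clauses match partially
doc3_score = coord_style_score([0.0, 2.3])  # only the content clause matches

print("doc1:", round(doc1_score, 2))
print("doc3:", round(doc3_score, 2))
print("doc1 ranks first:", doc1_score > doc3_score)
```

Under this formula a document is rewarded for matching in more fields, even when no single field matches the whole query, which is why doc1 outranks doc3.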
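Solution
The best fields strategy means that a document in which a single field matches as many of the keywords as possible should rank first, rather than a document in which many fields each match only a few of the keywords.
The dis_max query takes, among its sub-queries, the score of the single highest-scoring query as the document's score.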
GET /forum/post/_search
{
  "query": {
    "dis_max": {
      "queries": [
        {
          "match": {
            "title": "java kibana"
          }
        },
        {
          "match": {
            "content": "java kibana"
          }
        }
      ]
    }
  }
}
#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.3988109,
    "hits" : [
      {
        "_index" : "forum",
        "_type" : "post",
        "_id" : "3",
        "_score" : 1.3988109,
        "_source" : {
          "title" : "elasticsearch hadoop",
          "content" : "java kibana green open"
        }
      },
      {
        "_index" : "forum",
        "_type" : "post",
        "_id" : "1",
        "_score" : 0.9808291,
        "_source" : {
          "title" : "java php",
          "content" : " kibana forum open MIjMReACTGaN564AnCZuHg"
        }
      }
    ]
  }
}
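The difference between the two responses can be reproduced with a small Python sketch that ranks the documents once by the sum of their per-query scores and once by the maximum. The scores are taken from the responses in this post. doc1's second sub-score is not shown anywhere in the output, so it is inferred here by subtraction, on the assumption that the bool score is the plain sum of the two match scores. The rank helper is a hypothetical name used for illustration only.

```python
# Per-query scores inferred from the two responses in this post.
# doc3 matches only one query; its score is the same in both responses.
DOC3_SCORES = [1.3988109]
# doc1: best single query score (from the dis_max response) plus the remainder
# of its bool score. The remainder is an inferred value, not shown in any response.
DOC1_SCORES = [0.9808291, 1.43398 - 0.9808291]


def rank(docs, combine):
    """Return document ids ordered by combined score, highest first."""
    scored = {doc_id: combine(scores) for doc_id, scores in docs.items()}
    return sorted(scored, key=scored.get, reverse=True)


docs = {"1": DOC1_SCORES, "3": DOC3_SCORES}

bool_order = rank(docs, sum)     # bool/should behaviour: add the query scores
dis_max_order = rank(docs, max)  # dis_max behaviour: keep only the best query score

print("bool order:", bool_order)
print("dis_max order:", dis_max_order)
```

Summing lets doc1's two partial matches add up past doc3's single strong match, while taking the maximum lets the single best field decide, which is the ordering the best fields strategy is after.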