Elasticsearch (4): Complex Search in Elasticsearch

阿新 • Published: 2019-01-10

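Query-string search is handy for ad hoc searches from the command line, but it has its limitations (see Search Lite). Elasticsearch provides a rich, flexible query language called the Query DSL, which lets us build more complicated and robust queries.
The domain-specific language (DSL) is specified as a JSON request body. We can rewrite the earlier search for all employees named Smith like this: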
GET /megacorp/employee/_search
{
    "query" : {
        "match" : {
            "last_name" : "Smith"
        }
    }
}
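The result is the same as for the previous query, but a few things have changed. For one, we no longer use the query-string parameter; a request body takes its place. This request body is built with JSON and uses a match query (one of several query types, which we will learn about later).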

More Complicated Searches

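Now let's try a more complicated search. We still want employees whose last name is Smith, but this time only those older than 30. The query changes slightly to use a filter, which lets us run structured searches efficiently.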

GET /megacorp/employee/_search
{
    "query" : {
        "bool": {
            "must": {
                "match" : {
                    "last_name" : "smith"  
                }
            },
            "filter": {
                "range" : {
                    "age" : { "gt" : 30 }  
                }
            }
        }
    }
}

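The must clause is the same match query that we used before.
The filter clause is a range filter that finds all documents with an age greater than 30; gt stands for "greater than".
Don't worry about the syntax too much for now; it is covered in more detail later. Just note that we added a filter that performs a range search and reused the earlier match query. The result now contains only one employee, Jane Smith, who is 32.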

{
   ...
   "hits": {
      "total":      1,
      "max_score":  0.30685282,
      "hits": [
         {
            ...
            "_source": {
               "first_name":  "Jane",
               "last_name":   "Smith",
               "age":         32,
               "about":       "I like to collect rock albums",
               "interests": [ "music" ]
            }
         }
      ]
   }
}

A Brief Introduction to bool

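Occur: Description

must: The clause (query) must appear in matching documents and contributes to the score.

filter: The clause (query) must appear in matching documents. Unlike must, however, the score of the query is ignored. Filter clauses are executed in filter context, which means that scoring is ignored and the clauses are considered for caching.

should: The clause (query) should appear in matching documents. If the bool query is in query context and has a must or filter clause, a document matches the bool query even if none of the should clauses match; in that case the should clauses only influence the score. If the bool query is in filter context, or has neither a must nor a filter clause, at least one of the should clauses must match a document for it to match the bool query. This behavior can be controlled explicitly with the minimum_should_match parameter.

must_not: The clause (query) must not appear in matching documents. Clauses are executed in filter context, which means that scoring is ignored and the clauses are considered for caching. Because scoring is ignored, a score of 0 is returned for all documents.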

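In short: must means the clause has to match; filter narrows the matched results without affecting the score; should means at least one of the clauses should match; must_not means the clause must not match.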
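To make the four occurrence types concrete, here is a minimal in-memory sketch of how they could combine. It is a toy model, not Elasticsearch's implementation: the class BoolSketch, the Employee type, and the "score" (a plain count of matching must and should clauses) are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Toy model of bool-query semantics; not the Elasticsearch implementation.
class BoolSketch {
    // Hypothetical document type used only for this illustration.
    static final class Employee {
        final String lastName;
        final int age;
        Employee(String lastName, int age) { this.lastName = lastName; this.age = age; }
    }

    // Returns -1 when the document does not match; otherwise a toy score equal
    // to the number of must and should clauses that matched.
    static int score(Employee e,
                     List<Predicate<Employee>> must,
                     List<Predicate<Employee>> filter,
                     List<Predicate<Employee>> should,
                     List<Predicate<Employee>> mustNot) {
        int score = 0;
        for (Predicate<Employee> p : must) {
            if (!p.test(e)) return -1;  // every must clause is required...
            score++;                     // ...and contributes to the score
        }
        for (Predicate<Employee> p : filter) {
            if (!p.test(e)) return -1;  // required, but never changes the score
        }
        for (Predicate<Employee> p : mustNot) {
            if (p.test(e)) return -1;   // a single hit excludes the document
        }
        int shouldHits = 0;
        for (Predicate<Employee> p : should) {
            if (p.test(e)) shouldHits++;
        }
        // Without must or filter clauses, at least one should clause must match.
        boolean shouldRequired = must.isEmpty() && filter.isEmpty() && !should.isEmpty();
        if (shouldRequired && shouldHits == 0) return -1;
        return score + shouldHits;
    }

    // The query from this article: must(last name is Smith) + filter(age > 30).
    static int smithOlderThan30(Employee e) {
        Predicate<Employee> isSmith = x -> x.lastName.equalsIgnoreCase("smith");
        Predicate<Employee> olderThan30 = x -> x.age > 30;
        List<Predicate<Employee>> none = Collections.emptyList();
        return score(e, Collections.singletonList(isSmith),
                Collections.singletonList(olderThan30), none, none);
    }

    public static void main(String[] args) {
        System.out.println(smithOlderThan30(new Employee("Smith", 32))); // 1
        System.out.println(smithOlderThan30(new Employee("Smith", 25))); // -1
    }
}
```

Note that the filter clause can reject a document but never adds to the score, which is the practical difference from must.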

Demonstrating the bool Query with the Java Client

term

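Add a method:

    /*
     * A simple bool query: find employees whose last name is Smith and who are older than 30.
     * must: last name is Smith
     * filter: age greater than 30
     * termQuery and rangeQuery are static imports from org.elasticsearch.index.query.QueryBuilders.
     */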
    private static void findEmployeeByAgeAndName(Client client) {
        SearchRequestBuilder request = client.prepareSearch("megacorp1")
                .setTypes("employee1")
                .setSearchType(SearchType.DFS_QUERY_THEN_FETCH) 
                .setQuery(QueryBuilders.boolQuery().must(termQuery("last_name","Smith")).filter(rangeQuery("age").gt(30)));
//      SearchResponse response = request.get();
        printResponseHits(request.get());
    }

Wrap the result printing in a helper method:

    // Print every hit in the search response
    private static void printResponseHits(SearchResponse response) {
        SearchHits searchHits = response.getHits();
        Iterator<SearchHit> iterator = searchHits.iterator();
        while(iterator.hasNext()) {
            SearchHit hit = iterator.next();
            String index = hit.getIndex();
            String type = hit.getType();
            String id = hit.getId();
            float score = hit.getScore();
            System.out.println("index="+index+" type="+type+" id="+id+" score="+score+" source-->"+hit.getSourceAsString());
        }
    }

Add the call to the main method:

// 5. Find employees with the last name Smith and filter by age: bool query example
findEmployeeByAgeAndName(client);

Result:
index=megacorp1 type=employee1 id=2 score=1.2809339 source-->{"first_name":"Jane","last_name":"Smith","age":"32","about":"I like to collect rock albums","interests":["music"]}
If you are curious, step into request in a debugger to inspect the bool request it builds.

Head Plugin Example

(image not available)

match

We can now write the same search with match.
Change the previous example as follows:

SearchRequestBuilder request = client.prepareSearch("megacorp1")
                .setTypes("employee1")
                .setSearchType(SearchType.DFS_QUERY_THEN_FETCH) 
                .setQuery(QueryBuilders.boolQuery().must(matchQuery("last_name","Smith")).filter(rangeQuery("age").gt(30)));
//      SearchResponse response = request.get();
        printResponseHits(request.get());

Calling the method again returns:
index=megacorp1 type=employee1 id=2 score=1.3862944 source-->{"first_name":"Jane","last_name":"Smith","age":"32","about":"I like to collect rock albums","interests":["music"]}

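Head Plugin Example

(image not available)
The hits are the same, because the index used here (megacorp1) does not analyze (tokenize) the field.
Let's repeat the experiment on the earlier megacorp index, whose fields are analyzed: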

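term: rock climbing

(image not available)

match: rock climbing

(image not available)
As the match results show, documents are sorted by relevance score. The document that matches the whole query comes first, and documents that match only one of the words follow.
By default, Elasticsearch sorts results by relevance score, that is, by how well each document matches the query. The first and highest-scoring result is obvious: John Smith's about field clearly says "rock climbing".
But why was Jane Smith returned as well? Because her about field mentions "rock". Since it contains only "rock" and not "climbing", her relevance score is lower than John's.
This is a good example of how Elasticsearch searches within full-text fields and returns the most relevant results first. The concept of relevance is very important in Elasticsearch, and it is completely foreign to traditional relational databases, where a record either matches or it does not.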
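The ordering described above can be imitated with a deliberately crude stand-in for relevance: count how many distinct query terms occur in the field. This is only a sketch. Elasticsearch's real scoring formula is more involved, and the class OverlapRankingSketch, the analyze helper (split on non-alphanumerics, lowercase), and the counting score are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy ranking by term overlap; not Elasticsearch's actual scoring.
class OverlapRankingSketch {
    // Rough stand-in for an analyzer: split on non-alphanumerics, lowercase.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    // Toy relevance: how many distinct query terms occur in the field.
    static int matchedTerms(String fieldText, String query) {
        Set<String> docTerms = new HashSet<>(analyze(fieldText));
        int hits = 0;
        for (String term : new LinkedHashSet<>(analyze(query))) {
            if (docTerms.contains(term)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        String query = "rock climbing";
        List<String> abouts = new ArrayList<>();
        abouts.add("I like to collect rock albums"); // Jane Smith
        abouts.add("I love to go rock climbing");    // John Smith
        // Highest toy score first, mirroring the default sort by relevance.
        abouts.sort((a, b) -> Integer.compare(matchedTerms(b, query), matchedTerms(a, query)));
        for (String about : abouts) {
            System.out.println(matchedTerms(about, query) + "  " + about);
        }
    }
}
```

Under this toy measure John's text matches both query terms and Jane's only one, so John is listed first, which is the same ordering the article reports.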

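As for why the earlier term query on last_name returned nothing, that is still an open question for me.
I tried adding a field named lastname, without the underscore, and term still found nothing.
I also tried changing the value to "Smith Smith", with a space in the middle, and term still found nothing.
Explanations are welcome.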
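One likely explanation for the term query finding nothing on last_name, assuming the field in megacorp is an analyzed text field using the default standard analyzer: the analyzer lowercases the text at index time, so "Smith" is stored as the token smith, while a term query compares its input to the stored tokens without analyzing it. The sketch below imitates that behavior in plain Java; the class TermVsMatchSketch and its analyze helper are rough, hypothetical stand-ins, not Elasticsearch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy comparison of term and match against an analyzed field.
class TermVsMatchSketch {
    // Rough stand-in for an analyzer: split on non-alphanumerics, lowercase.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    // term: the query text is compared to the indexed tokens exactly as given.
    static boolean termMatches(List<String> indexedTokens, String queryText) {
        return indexedTokens.contains(queryText);
    }

    // match: the query text is analyzed first, then any token may match.
    static boolean matchMatches(List<String> indexedTokens, String queryText) {
        for (String token : analyze(queryText)) {
            if (indexedTokens.contains(token)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("Smith");             // stored as [smith]
        System.out.println(termMatches(indexed, "Smith"));   // false
        System.out.println(termMatches(indexed, "smith"));   // true
        System.out.println(matchMatches(indexed, "Smith"));  // true
    }
}
```

If this assumption holds, querying term with the lowercase value smith, or running term against a non-analyzed field as in megacorp1, should find the documents; the same reasoning would explain why "Smith Smith" is not found either, since it is stored as two lowercase tokens.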

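Phrase Search

Finding individual words in a field is all well and good, but sometimes you want to match an exact sequence of words, or phrase. For instance, we could run a query that matches only employee records containing both "rock" and "climbing", with the two words next to each other as the phrase "rock climbing".
To do this, we use a slight variation of the match query called match_phrase: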

GET /megacorp/employee/_search
{
    "query" : {
        "match_phrase" : {
            "about" : "rock climbing"
        }
    }
}

To no one's surprise, this returns only John Smith's document.

{
   ...
   "hits": {
      "total":      1,
      "max_score":  0.23013961,
      "hits": [
         {
            ...
            "_score":         0.23013961,
            "_source": {
               "first_name":  "John",
               "last_name":   "Smith",
               "age":         25,
               "about":       "I love to go rock climbing",
               "interests": [ "sports", "music" ]
            }
         }
      ]
   }
}
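The difference between match and match_phrase can be sketched as a check on token order: the phrase matches only when the query tokens appear consecutively, and in the same order, in the field. This is a simplified illustration; the class PhraseSketch and its analyze helper are hypothetical stand-ins, and Elasticsearch itself works with token positions stored in the index rather than by re-scanning the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy phrase matching on token sequences; not the Elasticsearch implementation.
class PhraseSketch {
    // Rough stand-in for an analyzer: split on non-alphanumerics, lowercase.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    // True when the phrase tokens occur as one contiguous, ordered run.
    static boolean phraseMatches(String fieldText, String phrase) {
        List<String> docTokens = analyze(fieldText);
        List<String> phraseTokens = analyze(phrase);
        if (phraseTokens.isEmpty()) return false;
        return Collections.indexOfSubList(docTokens, phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(phraseMatches("I love to go rock climbing", "rock climbing"));    // true
        System.out.println(phraseMatches("I like to collect rock albums", "rock climbing")); // false
    }
}
```

Jane Smith's text contains "rock" but not the full run "rock climbing", which is why a match query returns her and match_phrase does not.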

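Java Client Demo

Add a method:

    /**
     * match_phrase query.
     * Matches only employee records that contain both "rock" and "climbing",
     * with the two words next to each other as the phrase "rock climbing".
     * @param client the client
     * @param field  the field to search
     * @param phrase the phrase to match
     */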
    private static void findEmployeesWithOneUniqueMatchPhrase(Client client, String field, String phrase) {
        SearchRequestBuilder request = client.prepareSearch("megacorp")
                .setTypes("employee")
                .setSearchType(SearchType.DFS_QUERY_THEN_FETCH) 
                .setQuery(QueryBuilders.boolQuery().must(matchPhraseQuery(field, phrase)));
        printResponseHits(request.get());
    }

Call it from the main method:

// 6. match_phrase query: match only employee records that contain both "rock" and "climbing", next to each other as the phrase "rock climbing".
findEmployeesWithOneUniqueMatchPhrase(client, "about", "rock climbing");

I added some more data first.
Result:
index=megacorp type=employee id=5 score=0.6449836 source-->{"first_name":"John","last_name":"Smith1","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}
index=megacorp type=employee id=8 score=0.6449836 source-->{"first_name":"John","last_name":"蜂蜜柚子","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}
index=megacorp type=employee id=9 score=0.6449836 source-->{"first_name":"John","last_name":"蜂蜜","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}
index=megacorp type=employee id=10 score=0.6449836 source-->{"first_name":"John","last_name":"Smith Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}
index=megacorp type=employee id=6 score=0.6449836 source-->{"first_name":"John","last_name":"Smith 1","age":26,"about":"I love to go rock climbing","interests":["sports","art"]}
index=megacorp type=employee id=1 score=0.6449836 source-->{"first_name":"John","last_name":"Smith","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}
index=megacorp type=employee id=7 score=0.6449836 source-->{"first_name":"John","last_name":"蜂蜜柚子蜂蜜","age":25,"about":"I love to go rock climbing","interests":["sports","music"]}

Every returned document contains the phrase "rock climbing" in its about field, exactly as expected.

Head Plugin Example

{"query":{"match_phrase":{"about":"rock climbing"}}}

Sorry, the images could not be uploaded!
