Elasticsearch Search Engine Study Notes (Part 5)
阿新 • Published: 2021-06-23
Search features
Data preparation
1. Custom dictionary
慕課網
慕課
課網
慕
課
網
2. Create the index shop
3. Create the mappings
POST /shop/_mapping
(before 7.x: POST /shop/_mapping/_doc)
{
  "properties": {
    "id": { "type": "long" },
    "age": { "type": "integer" },
    "username": { "type": "keyword" },
    "nickname": { "type": "text", "analyzer": "ik_max_word" },
    "money": { "type": "float" },
    "desc": { "type": "text", "analyzer": "ik_max_word" },
    "sex": { "type": "byte" },
    "birthday": { "type": "date" },
    "face": { "type": "text", "index": false }
  }
}
4. Load the data
Each document is posted to its own URL, with the document id as the last path segment.

POST /shop/_doc/1001
{ "id": 1001, "age": 18, "username": "imoocAmazing", "nickname": "慕課網", "money": 88.8, "desc": "我在慕課網學習java和前端,學習到了很多知識", "sex": 0, "birthday": "1992-12-24", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1002
{ "id": 1002, "age": 19, "username": "justbuy", "nickname": "周杰棍", "money": 77.8, "desc": "今天上下班都很堵,車流量很大", "sex": 1, "birthday": "1993-01-24", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1003
{ "id": 1003, "age": 20, "username": "bigFace", "nickname": "飛翔的巨鷹", "money": 66.8, "desc": "慕課網團隊和導遊坐飛機去海外旅遊,去了新馬泰和歐洲", "sex": 1, "birthday": "1996-01-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1004
{ "id": 1004, "age": 22, "username": "flyfish", "nickname": "水中魚", "money": 55.8, "desc": "昨天在學校的池塘裡,看到有很多魚在游泳,然後就去慕課網上課了", "sex": 0, "birthday": "1988-02-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1005
{ "id": 1005, "age": 25, "username": "gotoplay", "nickname": "ps遊戲機", "money": 155.8, "desc": "今年生日,女友送了我一臺play station遊戲機,非常好玩,非常不錯", "sex": 1, "birthday": "1989-03-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1006
{ "id": 1006, "age": 19, "username": "missimooc", "nickname": "我叫小慕", "money": 156.8, "desc": "我叫凌雲慕,今年20歲,是一名律師,我在琦䯲星球做演講", "sex": 1, "birthday": "1993-04-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1007
{ "id": 1007, "age": 19, "username": "msgame", "nickname": "gamexbox", "money": 1056.8, "desc": "明天去進貨,最近微軟處理很多遊戲機,還要買xbox遊戲卡帶", "sex": 1, "birthday": "1985-05-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1008
{ "id": 1008, "age": 19, "username": "muke", "nickname": "慕學習", "money": 1056.8, "desc": "大學畢業後,可以到imooc.com進修", "sex": 1, "birthday": "1995-06-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1009
{ "id": 1009, "age": 22, "username": "shaonian", "nickname": "騷年輪", "money": 96.8, "desc": "騷年在大學畢業後,考研究生去了", "sex": 1, "birthday": "1998-07-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1010
{ "id": 1010, "age": 30, "username": "tata", "nickname": "隔壁老王", "money": 100.8, "desc": "隔壁老外去國外出差,帶給我很多好吃的", "sex": 1, "birthday": "1988-07-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1011
{ "id": 1011, "age": 31, "username": "sprder", "nickname": "皮特帕克", "money": 180.8, "desc": "它是一個超級英雄", "sex": 1, "birthday": "1989-08-14", "face": "https://www.imooc.com/static/img/index/logo.png" }

POST /shop/_doc/1012
{ "id": 1012, "age": 31, "username": "super hero", "nickname": "super hero", "money": 188.8, "desc": "BatMan, GreenArrow, SpiderMan, IronMan... are all Super Hero", "sex": 1, "birthday": "1980-08-14", "face": "https://www.imooc.com/static/img/index/logo.png" }
Querying with request parameters (QueryString)
GET /shop/_doc/_search?q=desc:慕課網
GET /shop/_doc/_search?q=nickname:慕&q=age:25
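When such a URL is built from code rather than typed into a console, the Chinese search text has to be percent-encoded. The following is a minimal Python sketch of that step, using only the standard library; `query_string_url` is a hypothetical helper name, and nothing is sent to a server.

```python
from urllib.parse import quote, unquote

def query_string_url(field, value, index="shop"):
    """Build the path for a ?q=field:value search, percent-encoding the value."""
    return f"/{index}/_doc/_search?q={field}:{quote(str(value))}"

url = query_string_url("desc", "慕課網")
print(url)           # encoded form, safe to put on the wire
print(unquote(url))  # decoded form, same as the request written by hand
```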
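DSL queries
QueryString is rarely used: once the parameters get complex, the URL becomes hard to build, so most queries are written in the DSL instead.
# Search
POST /shop/_doc/_search
{
  "query": {
    "match": { "desc": "慕課網" }
  }
}

# Check whether a field exists
{
  "query": {
    "exists": { "field": "desc" }
  }
}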
Query all documents
GET /shop/_doc/_search
or
POST /shop/_doc/_search
{
  "query": { "match_all": {} },
  "_source": ["id", "nickname", "age"]
}
Pagination
POST /shop/_doc/_search
{
  "query": { "match_all": {} },
  "from": 0,
  "size": 10
}

# the same page, returning only selected fields
{
  "query": { "match_all": {} },
  "_source": ["id", "nickname", "age"],
  "from": 0,
  "size": 10
}
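In these requests `from` is the zero-based offset of the first hit and `size` is the number of hits per page, so a page number has to be converted into an offset. A small Python sketch of that conversion, assuming 1-based page numbers; `page_body` is a hypothetical helper name.

```python
def page_body(page, size=10):
    """Return a match_all search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be >= 1")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # offset of the first hit on this page
        "size": size,
    }

print(page_body(1))      # first page starts at offset 0
print(page_body(3, 10))  # third page starts at offset 20
```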
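term exact search vs. match analyzed search
A term query takes what the user typed, for example "慕課網強大", and searches for it as one whole keyword; the input is not analyzed into tokens first.
A match query analyzes the user's input into tokens and then searches for those tokens.
POST /shop/_doc/_search
{
  "query": {
    "term": { "desc": "慕課網" }
  }
}

# compared with
{
  "query": {
    "match": { "desc": "慕課網" }
  }
}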
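The difference can be illustrated with a toy model in Python. This is only a sketch: `toy_analyze` is a hypothetical greedy longest-match tokenizer over a made-up three-word vocabulary, standing in for a real analyzer; it is not how ik_max_word works internally.

```python
VOCAB = {"慕課網", "強大", "學習"}  # hypothetical toy dictionary

def toy_analyze(text, vocab=VOCAB):
    """Greedy longest-match split; a stand-in for a real analyzer."""
    tokens, i = [], 0
    max_len = max(len(w) for w in vocab)
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            # unknown characters fall through to single-character tokens
            if piece in vocab or n == 1:
                tokens.append(piece)
                i += n
                break
    return tokens

def term_hit(indexed_tokens, query):
    # term: the query string is looked up as one token, without analysis
    return query in indexed_tokens

def match_hit(indexed_tokens, query):
    # match: the query is analyzed first; any shared token counts as a hit
    return any(tok in indexed_tokens for tok in toy_analyze(query))

indexed = set(toy_analyze("慕課網學習"))  # tokens stored for one document
print(term_hit(indexed, "慕課網"))       # True: this exact token was indexed
print(term_hit(indexed, "慕課網強大"))   # False: no such single token exists
print(match_hit(indexed, "慕課網強大"))  # True: the token 慕課網 is shared
```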
terms: matching any of several terms
POST /shop/_doc/_search
{
  "query": {
    "terms": { "desc": ["慕課網", "學習", "騷年"] }
  }
}
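match_phrase: phrase matching
match returns a document as soon as any token matches. match_phrase is stricter: every token from the query must appear among the text field's tokens, in the same order and consecutively.
slop: how many tokens may be skipped between the query terms. It counts tokens (words), not characters.
POST /shop/_doc/_search
{
  "query": {
    "match_phrase": {
      "desc": {
        "query": "大學 畢業 研究生",
        "slop": 2
      }
    }
  }
}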
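The effect of slop can be sketched with a simplified Python model. The assumptions are significant: the token list for the document is a hypothetical tokenization, not actual ik_max_word output, and `phrase_match` only handles query tokens that appear in their original order.

```python
def phrase_match(doc_tokens, query_tokens, slop=0):
    """True if the query tokens occur in order with at most `slop`
    other tokens skipped in total (simplified, in-order only)."""
    if not query_tokens:
        return False
    for start, tok in enumerate(doc_tokens):
        if tok != query_tokens[0]:
            continue
        pos, found = start, True
        for want in query_tokens[1:]:
            try:
                # nearest occurrence after the previous token
                pos = doc_tokens.index(want, pos + 1)
            except ValueError:
                found = False
                break
        if found:
            skipped = (pos - start) - (len(query_tokens) - 1)
            if skipped <= slop:
                return True
    return False

# hypothetical tokens for "騷年在大學畢業後,考研究生去了"
doc = ["騷年", "在", "大學", "畢業", "後", "考", "研究生", "去了"]
query = ["大學", "畢業", "研究生"]
print(phrase_match(doc, query, slop=2))  # True: 後 and 考 are skipped
print(phrase_match(doc, query, slop=1))  # False: two skips are needed
```

Under this tokenization, two tokens sit between 畢業 and 研究生, which is why the request above needs a slop of at least 2.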
match (operator)
operator
or: after the search text is analyzed, a document is returned if at least one token matches.
and: after the search text is analyzed, every token must match.
POST /shop/_doc/_search
{
  "query": {
    "match": { "desc": "xbox遊戲機" }
  }
}

# equivalent to
{
  "query": {
    "match": {
      "desc": {
        "query": "xbox遊戲機",
        "operator": "or"
      }
    }
  }
}

# roughly: select * from shop where desc='xbox' or|and desc='遊戲機'
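The two operators amount to "any token" versus "all tokens". A minimal Python sketch of that rule, working on token lists that are assumed for illustration (they are not actual analyzer output); `match_tokens` is a hypothetical name.

```python
def match_tokens(doc_tokens, query_tokens, operator="or"):
    """Apply the or/and rule to already-analyzed tokens."""
    if not query_tokens:
        return False
    doc_set = set(doc_tokens)
    hits = [tok in doc_set for tok in query_tokens]
    if operator == "or":
        return any(hits)   # one matching token is enough
    if operator == "and":
        return all(hits)   # every token must match
    raise ValueError("operator must be 'or' or 'and'")

query = ["xbox", "遊戲機"]               # assumed tokens of "xbox遊戲機"
doc_a = ["買", "xbox", "遊戲卡帶"]        # contains only xbox
doc_b = ["微軟", "xbox", "遊戲機"]        # contains both tokens
print(match_tokens(doc_a, query, "or"))   # True
print(match_tokens(doc_a, query, "and"))  # False
print(match_tokens(doc_b, query, "and"))  # True
```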
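match (minimum_should_match)
minimum_should_match
minimum_should_match: the minimum match threshold. The required number of matching tokens is [number of tokens after analysis] x percentage, rounded down to a whole number. For example, with the value set to 70%: if the user's search text is analyzed into 10 tokens, 10 x 70% = 7, so desc must match at least 7 tokens for the document to be returned; if it is analyzed into 8 tokens, 8 x 70% = 5.6, so desc must match at least 5 tokens.
minimum_should_match can also be set to a plain number, which is then the required count of matching tokens.
POST /shop/_doc/_search
{
  "query": {
    "match": {
      "desc": {
        "query": "女友生日送我好玩的xbox遊戲機",
        "minimum_should_match": "60%"
      }
    }
  }
}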
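The arithmetic in the example can be written out as a short Python sketch. It covers only the two forms described here, a positive percentage string and a positive count; `required_matches` is a hypothetical helper name.

```python
def required_matches(token_count, minimum_should_match):
    """Number of query tokens that must match.

    Accepts a percentage string such as "70%" or a plain count such as 3.
    """
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        return token_count * percent // 100  # integer division rounds down
    return int(minimum_should_match)

print(required_matches(10, "70%"))  # 7
print(required_matches(8, "70%"))   # 5, since 5.6 is rounded down
print(required_matches(8, 3))       # 3
```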
Searching by document primary key (ids)
GET /shop/_doc/1001
or
POST /shop/_doc/_search
{
  "query": {
    "ids": {
      "type": "_doc",
      "values": ["1001", "1010", "1008"]
    }
  }
}