ES Study Notes 4: Query DSL

queries and filters

Although we refer to the query DSL, in reality there are two DSLs: the query DSL and the filter DSL. Query clauses and filter clauses are similar in nature, but have slightly different purposes.

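filter: the answer is a plain yes or no; it is fast, can be cached, and is generally used for exact-value lookups.

query: measures how relevant each result is to the search text; it cannot be cached and is generally used for full-text search.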

most important queries and filters

term filter
{"term":{"field":"value"}}

terms filter
{"terms":{"field":["a","b"]}}

range filter
{"range":{"age":{"gte":20,"lt":30}}}

exists and missing filter
The exists and missing filters are used to find documents in which the specified field either has one or more values (exists) or doesn’t have any values (missing). They are similar in nature to IS NULL (missing) and IS NOT NULL (exists) in SQL.

bool filter

Used to combine multiple clauses.

must should must_not

{
  "query":{
    "bool":{
      "must":{
        "match":{
          "text":"fadsfdasfds"
        }
      }
    }
  }
}
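To make the three clause types concrete, here is a minimal Python sketch that treats each clause as a set of matching document IDs. It only models the filter semantics (must = AND, should = OR, must_not = NOT); the function name `bool_filter` and the sample IDs are made up for illustration and are not part of Elasticsearch.

```python
def bool_filter(all_docs, must=(), should=(), must_not=()):
    """Combine sets of matching document IDs the way a bool filter combines clauses."""
    result = set(all_docs)
    for clause in must:
        # Every must clause has to match (AND).
        result &= clause
    if should:
        # At least one should clause has to match (OR).
        result &= set().union(*should)
    for clause in must_not:
        # No must_not clause may match (NOT).
        result -= clause
    return result


# Hypothetical clause results: each set holds the IDs one leaf filter matched.
all_docs = {1, 2, 3, 4, 5}
kept = bool_filter(
    all_docs,
    must=[{1, 2, 3, 4}],
    should=[{1}, {2}, {5}],
    must_not=[{2}],
)
print(kept)
```

In this toy model the must clause narrows the candidates to documents 1 to 4, the should clauses keep 1 and 2, and must_not drops 2.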

QUERIES:

MATCH

The match query should be the standard query that you reach for whenever you want to query for a full-text or exact value in almost any field.

If you run a match query against a full-text field, it will analyze the query string by using the correct analyzer for that field before executing the search:

{"match":{"tweet":"About Search"}}

If you use it on a field containing an exact value, such as a number, a date, a Boolean, or a not_analyzed string field, then it will search for that exact value:

{"match":{"age":26}}
{"match":{"date":"2014-09-01"}}
{"match":{"public":true}}
{"match":{"tag":"full_text"}}

For exact-value searches, you probably want to use a filter instead of a query, as a filter will be cached.

MULTI_MATCH

bool query

combining queries with filters

GET /_search
{"query":{"filtered":{"query":{"match":{"email":"business opportunity"}},"filter":{"term":{"folder":"inbox"}}}}}

just a filter

While in query context, if you need to use a filter without a query (for instance, to match all emails in the inbox), you can just omit the query:

GET /_search
{"query":{"filtered":{"filter":{"term":{"folder":"inbox"}}}}}
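The two request bodies above differ only in whether the query part is present. A small Python sketch of a helper that builds either form follows; `filtered_body` is a hypothetical name, and the code only assembles the JSON in the `filtered` syntax these notes use, without sending it anywhere.

```python
import json


def filtered_body(filter_clause, query_clause=None):
    """Build a `filtered` search body; the query part is optional."""
    filtered = {"filter": filter_clause}
    if query_clause is not None:
        filtered["query"] = query_clause
    return {"query": {"filtered": filtered}}


# Filter only: every email in the inbox.
inbox_only = filtered_body({"term": {"folder": "inbox"}})

# Query plus filter: relevance-ranked search restricted to the inbox.
inbox_search = filtered_body(
    {"term": {"folder": "inbox"}},
    {"match": {"email": "business opportunity"}},
)

print(json.dumps(inbox_only))
print(json.dumps(inbox_search))
```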

You seldom need to use a query as a filter, but we have included it for completeness' sake. The only time you may need it is when you need to use full-text matching while in filter context.

finding multiple exact values

GET /my_store/products/_search
{"query":{"filtered":{"filter":{"terms":{"price":[20,30]}}}}}

contains, but does not equal

GET /my_index/my_type/_search
{"query":{"filtered":{"filter":{"bool":{"must":[{"term":{"tags":"search"}},{"term":{"tag_count":1}}]}}}}}

When used on date fields, the range filter supports date math operations. For example, if we want to find all documents that have a timestamp sometime in the last hour:

"range":{"timestamp":{"gt":"now-1h"}}

Date math can also be applied to an actual date rather than a placeholder like now. Add a double pipe (||) after the date and follow it with a date math expression:

"range":{"timestamp":{"gt":"2014-01-01 00:00:00","lt":"2014-01-01 00:00:00||+1M"}}

The lt bound here is January 1, 2014 plus one month.
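As a rough illustration of what these date math expressions evaluate to, the following Python sketch computes the same bounds with the standard library. It is not how Elasticsearch parses date math; `in_last_hour` and `add_months` are hypothetical helpers, and `add_months` assumes the day of month also exists in the target month.

```python
from datetime import datetime, timedelta


def in_last_hour(timestamp, now):
    """Same test as {"gt": "now-1h"}, for a fixed value of `now`."""
    return timestamp > now - timedelta(hours=1)


def add_months(moment, months):
    """Add calendar months (assumes the day exists in the target month)."""
    index = moment.year * 12 + (moment.month - 1) + months
    return moment.replace(year=index // 12, month=index % 12 + 1)


now = datetime(2014, 9, 1, 12, 0, 0)
recent = in_last_hour(datetime(2014, 9, 1, 11, 30, 0), now)
stale = in_last_hour(datetime(2014, 9, 1, 10, 59, 0), now)

# January 1, 2014 plus one month.
upper_bound = add_months(datetime(2014, 1, 1), 1)
print(recent, stale, upper_bound)
```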

dealing with null values

GET /my_index/posts/_search
{"query":{"filtered":{"filter":{"exists":{"field":"tags"}}}}}
GET /my_index/posts/_search
{"query":{"filtered":{"filter":{"missing":{"field":"tags"}}}}}

all about caching

The filter cache is kept up to date in real time, so there is no need to worry about cache expiry. Leaf filters have to consult the inverted index on disk, so it makes sense to cache them. Compound filters, on the other hand, use fast bit logic to combine the bitsets resulting from their inner clauses, so it is efficient to recalculate them every time.

Certain leaf filters, however, are not cached by default, because it doesn’t make sense to do so:

  • Script filters: The results from script filters cannot be cached because the meaning of the script is opaque to Elasticsearch.
  • Geo-filters: The geolocation filters, which we cover in more detail in Geolocation, are usually used to filter results based on the geolocation of a specific user. Since each user has a unique geolocation, it is unlikely that geo-filters will be reused, so it makes no sense to cache them.
  • Date ranges: Date ranges that use the now function (for example "now-1h") result in values accurate to the millisecond. Every time the filter is run, now returns a new time. Older filters will never be reused, so caching is disabled by default. However, when using now with rounding (for example, now/d rounds to the nearest day), caching is enabled by default.

Sometimes the default caching strategy is not correct. Perhaps you have a complicated bool expression that is reused several times in the same query. Or you have a filter on a date field that will never be reused. The default caching strategy can be overridden on almost any filter by setting the _cache flag:
{"range":{"timestamp":{"gt":"2014-01-02 16:15:14"},"_cache":false}}
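The split between cached leaf filters and recomputed compound filters can be pictured with a toy model. In this Python sketch a bitset is a plain integer (bit i is set when document i matches), leaf results are cached in a dict, and the compound AND is recomputed on every call. All names and sample documents are made up; this is not Elasticsearch's actual cache implementation.

```python
docs = [
    {"folder": "inbox", "read": False},
    {"folder": "inbox", "read": True},
    {"folder": "spam", "read": False},
]

cache = {}                    # (field, value) -> cached bitset
stats = {"index_scans": 0}    # how often a leaf filter had to scan the documents


def term_bitset(field, value):
    """Leaf filter: return a bitset of matching docs, computing it only once."""
    key = (field, value)
    if key not in cache:
        stats["index_scans"] += 1
        bits = 0
        for i, doc in enumerate(docs):
            if doc.get(field) == value:
                bits |= 1 << i
        cache[key] = bits
    return cache[key]


def must_all(*bitsets):
    """Compound filter: cheap bitwise AND, recomputed every time."""
    result = (1 << len(docs)) - 1
    for bits in bitsets:
        result &= bits
    return result


unread_inbox = must_all(term_bitset("folder", "inbox"), term_bitset("read", False))
again = must_all(term_bitset("folder", "inbox"), term_bitset("read", False))
print(bin(unread_inbox), stats["index_scans"])
```

The second call reuses both cached leaf bitsets, so only the bitwise AND is repeated.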

filter order

The more selective a filter is, the earlier it should appear. For example, if filter a matches 10,000 documents and filter b matches 10, place b before a. Cached filters are very fast, so they should be placed before filters that are not cacheable.
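The effect of ordering can be seen in a toy model where each filter is checked only against the documents that survived the previous one. This Python sketch counts those checks; it is a simplification for intuition, not a description of Elasticsearch's execution engine, and the two predicates are invented.

```python
def run_filters(doc_ids, filters):
    """Apply filters in order; return the survivors and the number of checks made."""
    checks = 0
    survivors = list(doc_ids)
    for predicate in filters:
        checks += len(survivors)
        survivors = [d for d in survivors if predicate(d)]
    return survivors, checks


def broad(doc_id):
    """Matches 10,000 of the 20,000 documents."""
    return doc_id < 10000


def narrow(doc_id):
    """Matches 10 of the 20,000 documents."""
    return doc_id % 2000 == 0


doc_ids = range(20000)
hits_broad_first, checks_broad_first = run_filters(doc_ids, [broad, narrow])
hits_narrow_first, checks_narrow_first = run_filters(doc_ids, [narrow, broad])
print(checks_broad_first, checks_narrow_first)
```

Both orders return the same documents, but running the narrow filter first leaves far fewer candidates for the second filter to examine.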

full-text search

Term-based queries

Queries like the term or fuzzy queries are low-level queries that have no analysis phase. They operate on a single term. A term query for the term Foo looks for that exact term in the inverted index and calculates the TF/IDF relevance _score for each document that contains the term.

It is important to remember that the term query looks in the inverted index for the exact term only; it won’t match any variants like foo or FOO. It doesn’t matter how the term came to be in the index, just that it is. If you were to index ["Foo","Bar"] into an exact value not_analyzed field, or Foo Bar into an analyzed field with the whitespace analyzer, both would result in having the two terms Foo and Bar in the inverted index.
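A tiny in-memory inverted index makes the point about exact terms visible. The Python sketch below is an illustration only: the two analyzers are crude stand-ins (whitespace splitting with and without lowercasing), and every name in it is hypothetical.

```python
from collections import defaultdict


def whitespace_analyzer(text):
    """Split on whitespace only; case is preserved."""
    return text.split()


def lowercase_analyzer(text):
    """Split on whitespace and lowercase every token."""
    return text.lower().split()


def build_index(docs, analyzer):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index[term].add(doc_id)
    return index


def term_query(index, term):
    """Exact lookup: the term is not analyzed before the lookup."""
    return index.get(term, set())


docs = {1: "Foo Bar"}
preserved = build_index(docs, whitespace_analyzer)
lowered = build_index(docs, lowercase_analyzer)

print(term_query(preserved, "Foo"))  # finds document 1
print(term_query(preserved, "foo"))  # finds nothing: only Foo is in this index
print(term_query(lowered, "Foo"))    # finds nothing: only foo is in this index
```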

Full-text queries

Queries like the match or query_string queries are high-level queries that understand the mapping of a field:

  • If you use them to query a date or integer field, they will treat the query string as a date or integer, respectively.
  • If you query an exact value (not_analyzed) string field, they will treat the whole query string as a single term.
  • But if you query a full-text (analyzed) field, they will first pass the query string through the appropriate analyzer to produce the list of terms to be queried.

a single-word query

Our first example explains what happens when we use the match query to search within a full-text field for a single word:

GET /my_index/my_type/_search
{"query":{"match":{"title":"QUICK!"}}}

Elasticsearch executes the preceding match query as follows:

  1. Check the field type.

    The title field is a full-text (analyzed) string field, which means that the query string should be analyzed too.

  2. Analyze the query string.

    The query string QUICK! is passed through the standard analyzer, which results in the single term quick. Because we have just a single term, the match query can be executed as a single low-level term query.

  3. Find matching docs.

    The term query looks up quick in the inverted index and retrieves the list of documents that contain that term—in this case, documents 1, 2, and 3.

  4. Score each doc.

    The term query calculates the relevance _score for each matching document, by combining the term frequency (how often quick appears in the title field of each document), with the inverse document frequency (how often quick appears in the title field in all documents in the index), and the length of each field (shorter fields are considered more relevant). See "What Is Relevance?".
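The four steps can be sketched end to end in Python. This is a simplified model under stated assumptions: the four sample titles are invented for the example (chosen so that documents 1, 2, and 3 contain quick), `analyze` is a rough stand-in for the standard analyzer, and the score uses a simplified classic TF/IDF shape (square root of term frequency, a logarithmic inverse document frequency, and a length norm of one over the square root of the field length). Real scores from Elasticsearch will differ.

```python
import math
import re

# Assumed sample documents; not taken from the notes above.
docs = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
}


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def match_single_term(query_string):
    """Run a single-word match: analyze, look up, then score each matching doc."""
    term = analyze(query_string)[0]                       # step 2: analyze the query string
    analyzed = {doc_id: analyze(text) for doc_id, text in docs.items()}
    matching = {d: tokens for d, tokens in analyzed.items() if term in tokens}  # step 3
    idf = 1 + math.log(len(docs) / (len(matching) + 1))   # rarer terms weigh more
    scores = {}
    for doc_id, tokens in matching.items():               # step 4: score each doc
        tf = math.sqrt(tokens.count(term))                # more occurrences, higher score
        norm = 1 / math.sqrt(len(tokens))                 # shorter fields, higher score
        scores[doc_id] = tf * idf * norm
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


ranking, scores = match_single_term("QUICK!")
print(ranking)
```

In this model the short title of document 1 ranks first, and document 3 beats document 2 because quick appears in it twice.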

multiword queries

GET /my_index/my_type/_search
{"query":{"match":{"title":{"query":"BROWN DOG!","operator":"and"}}}}

controlling precision

GET /my_index/my_type/_search
{"query":{"match":{"title":{"query":"quick brown dog","minimum_should_match":"75%"}}}}
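The difference between the default behaviour, "operator":"and", and "minimum_should_match":"75%" comes down to how many of the analyzed query terms a document must contain. The Python sketch below models only that counting rule. The `matches` helper and the titles are hypothetical, the analyzer is a crude stand-in, and the percentage is rounded down (so 75% of three terms requires two), with a floor of one term assumed for simplicity.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def matches(title, query, operator="or", minimum_should_match=None):
    """Decide whether a title matches, based on how many query terms it contains."""
    query_terms = analyze(query)
    title_terms = set(analyze(title))
    hits = sum(1 for term in query_terms if term in title_terms)
    if operator == "and":
        required = len(query_terms)          # every term must be present
    elif minimum_should_match is not None:
        # Integer percentage, rounded down; at least one term is assumed to be required.
        required = max(1, len(query_terms) * minimum_should_match // 100)
    else:
        required = 1                         # default "or": any one term is enough
    return hits >= required


print(matches("Brown fox brown dog", "BROWN DOG!", operator="and"))
print(matches("The quick brown fox", "BROWN DOG!", operator="and"))
print(matches("The quick brown fox", "quick brown dog", minimum_should_match=75))
print(matches("The lazy dog", "quick brown dog", minimum_should_match=75))
```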

