ES Study Notes 4: Query DSL

queries and filters

Although we refer to the query DSL, in reality there are two DSLs: the query DSL and the filter DSL. Query clauses and filter clauses are similar in nature, but have slightly different purposes.

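filter: the result is a plain yes or no. Filters are fast, can be cached, and are generally used for exact-value lookups.

query: measures how relevant each result is to the search text. Queries cannot be cached and are generally used for full-text search.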

most important queries and filters

term filter (field stands in for the name of the field being filtered)

{"term":{"field":"value"}}

terms filter

{"terms":{"field":["a","b"]}}

range filter

{"range":{"age":{"gte":20,"lt":30}}}

exists and missing filter

The exists and missing filters are used to find documents in which the specified field either has one or more values (exists) or doesn't have any values (missing). It is similar in nature to IS_NULL (missing) and NOT IS_NULL (exists) in SQL.

bool filter

Used to combine clauses into compound queries. It accepts three kinds of clauses: must, should, and must_not.

{"query":{"bool":{"must":{"match":{"text":"fadsfdasfds"}}}}}

QUERIES:

MATCH

The match query should be the standard query that you reach for whenever you want to query for a full-text or exact value in almost any field.

If you run a match query against a full-text field, it will analyze the query string by using the correct analyzer for that field before executing the search:

{"match":{"tweet":"About Search"}}

If you use it on a field containing an exact value, such as a number, a date, a Boolean, or a not_analyzed string field, then it will search for that exact value:

{"match":{"age":26}}
{"match":{"date":"2014-09-01"}}
{"match":{"public":true}}
{"match":{"tag":"full_text"}}

For exact-value searches, you probably want to use a filter instead of a query, as a filter will be cached.

MULTI_MATCH

bool query

combining queries with filters

GET /_search
{"query":{"filtered":{"query":{"match":{"email":"business opportunity"}},"filter":{"term":{"folder":"inbox"}}}}}

just a filter

While in query context, if you need to use a filter without a query (for instance, to match all emails in the inbox), you can just omit the query:

GET /_search
{"query":{"filtered":{"filter":{"term":{"folder":"inbox"}}}}}
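The two request bodies above differ only in whether a query clause sits next to the filter. As a minimal sketch, the helper below (filtered_query is a hypothetical name, not part of any Elasticsearch client) builds both shapes as plain Python dictionaries, assuming the older filtered syntax used throughout these notes.

```python
import json


def filtered_query(filter_clause, query_clause=None):
    """Build a search body in the `filtered` style used in these notes.

    The query clause is optional: when it is omitted, only the filter
    is sent, which corresponds to the "just a filter" request.
    """
    filtered = {"filter": filter_clause}
    if query_clause is not None:
        filtered["query"] = query_clause
    return {"query": {"filtered": filtered}}


inbox = {"term": {"folder": "inbox"}}
with_query = filtered_query(inbox, {"match": {"email": "business opportunity"}})
filter_only = filtered_query(inbox)

print(json.dumps(with_query))
print(json.dumps(filter_only))
```

The printed JSON can be pasted as the body of a GET /_search request.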

You seldom need to use a query as a filter, but we have included it for completeness' sake. The only time you may need it is when you need to use full-text matching while in filter context.

finding multiple exact values

GET /my_store/products/_search
{"query":{"filtered":{"filter":{"terms":{"price":[20,30]}}}}}

contains, but does not equal

GET /my_index/my_type/_search
{"query":{"filtered":{"filter":{"bool":{"must":[{"term":{"tags":"search"}},{"term":{"tag_count":1}}]}}}}}
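A term filter on a multi-valued field matches any document that contains the term, which is why the request above adds a second condition on tag_count. The toy model below illustrates this with in-memory sets. The sample documents and the term_filter helper are made up for illustration and do not reflect how Elasticsearch is implemented.

```python
# Hypothetical sample documents keyed by id.
docs = {
    1: {"tags": ["search"], "tag_count": 1},
    2: {"tags": ["search", "open_source"], "tag_count": 2},
    3: {"tags": ["open_source"], "tag_count": 1},
}


def term_filter(field, value):
    """Return the ids of documents whose field contains the exact value."""
    def matches(doc):
        stored = doc.get(field)
        values = stored if isinstance(stored, list) else [stored]
        return value in values
    return {doc_id for doc_id, doc in docs.items() if matches(doc)}


# "Contains search": documents 1 and 2 both qualify.
contains_search = term_filter("tags", "search")

# "Equals search": also require exactly one tag (bool must = intersection).
only_search = contains_search & term_filter("tag_count", 1)

print(sorted(contains_search), sorted(only_search))
```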

When used on date fields, the range filter supports date math operations. For example, if we want to find all documents that have a timestamp sometime in the last hour:

"range":{"timestamp":{"gt":"now-1h"}}

Date math can also be applied to an actual date rather than to a placeholder like now. Add a double pipe (||) after the date and follow it with a date math expression. The upper bound in this example is January 1, 2014 plus one month:

"range":{"timestamp":{"gt":"2014-01-01 00:00:00","lt":"2014-01-01 00:00:00||+1M"}}
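To make the now-1h expression concrete, here is a rough sketch of how such an expression could be resolved to a timestamp. It is a simplification that only handles now followed by an optional offset in seconds, minutes, hours, or days. The function resolve_now is a hypothetical helper, and the real date math syntax supports more (rounding, months, anchored dates).

```python
import re
from datetime import datetime, timedelta

# Only a subset of units; months and rounding are deliberately left out.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def resolve_now(expression, now):
    """Resolve expressions such as "now", "now-1h" or "now+2d"."""
    match = re.fullmatch(r"now(?:([+-])(\d+)([smhd]))?", expression)
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    sign, amount, unit = match.groups()
    if sign is None:
        return now
    delta = timedelta(**{UNITS[unit]: int(amount)})
    return now + delta if sign == "+" else now - delta


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2014, 9, 1, 12, 0, 0)
lower_bound = resolve_now("now-1h", fixed_now)
print(lower_bound)
```

Because the resolved value changes every time the expression is evaluated, a filter built from an unrounded now is effectively a different filter on each request.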

dealing with null values

GET /my_index/posts/_search
{"query":{"filtered":{"filter":{"exists":{"field":"tags"}}}}}

GET /my_index/posts/_search
{"query":{"filtered":{"filter":{"missing":{"field":"tags"}}}}}
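The two filters above are complements of each other. The toy model below sketches which documents each one would select, assuming that a field counts as having a value only if at least one non-null value is stored. The sample posts and helper functions are hypothetical.

```python
# Hypothetical sample posts keyed by id.
posts = {
    1: {"tags": ["search"]},
    2: {"tags": ["search", "open_source"]},
    3: {"other_field": "some data"},   # field absent
    4: {"tags": None},                 # explicit null
    5: {"tags": ["search", None]},     # one real value plus a null
}


def indexed_values(doc, field):
    """Return the non-null values of a field, treating scalars as one-item lists."""
    stored = doc.get(field)
    values = stored if isinstance(stored, list) else [stored]
    return [value for value in values if value is not None]


def exists(field):
    """Ids of documents with at least one value in the field."""
    return {doc_id for doc_id, doc in posts.items() if indexed_values(doc, field)}


def missing(field):
    """Ids of documents with no value in the field."""
    return set(posts) - exists(field)


print(sorted(exists("tags")), sorted(missing("tags")))
```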

all about caching

The cache is updated in real time, so there is no need to worry about cached results expiring.

Leaf filters have to consult the inverted index on disk, so it makes sense to cache them. Compound filters, on the other hand, use fast bit logic to combine the bitsets resulting from their inner clauses, so it is efficient to recalculate them every time.

Certain leaf filters, however, are not cached by default, because it doesn't make sense to do so:

Script filters: The results from script filters cannot be cached because the meaning of the script is opaque to Elasticsearch.

Geo-filters: The geolocation filters, which we cover in more detail in Geolocation, are usually used to filter results based on the geolocation of a specific user. Since each user has a unique geolocation, it is unlikely that geo-filters will be reused, so it makes no sense to cache them.

Date ranges: Date ranges that use the now function (for example "now-1h") result in values accurate to the millisecond. Every time the filter is run, now returns a new time. Older filters will never be reused, so caching is disabled by default. However, when using now with rounding (for example, now/d rounds to the nearest day), caching is enabled by default.

Sometimes the default caching strategy is not correct. Perhaps you have a complicated bool expression that is reused several times in the same query. Or you have a filter on a date field that will never be reused. The default caching strategy can be overridden on almost any filter by setting the _cache flag:

{"range":{"timestamp":{"gt":"2014-01-02 16:15:14"},"_cache":false}}
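The split between cached leaf filters and recomputed compound filters can be pictured with a small model. In this sketch, each leaf filter produces a bitset (a Python integer with one bit per document) that is cached after the first lookup, while the bool combination is plain bit logic. The class BitsetCache and its methods are hypothetical and only illustrate the idea.

```python
class BitsetCache:
    """Toy model: leaf filter results are cached, compound results are not."""

    def __init__(self, docs):
        self.docs = docs
        self.cache = {}
        self.index_lookups = 0  # counts the "expensive" leaf evaluations

    def leaf(self, field, value):
        """Return the bitset for field == value, computing it at most once."""
        key = (field, value)
        if key not in self.cache:
            self.index_lookups += 1
            bits = 0
            for position, doc in enumerate(self.docs):
                if doc.get(field) == value:
                    bits |= 1 << position
            self.cache[key] = bits
        return self.cache[key]

    def bool_must(self, *bitsets):
        """Combine leaf bitsets with AND; cheap enough to redo every time."""
        result = (1 << len(self.docs)) - 1
        for bits in bitsets:
            result &= bits
        return result


def positions(bits):
    """List the document positions whose bit is set."""
    return [i for i in range(bits.bit_length()) if bits >> i & 1]


emails = [
    {"folder": "inbox", "read": False},
    {"folder": "inbox", "read": True},
    {"folder": "spam", "read": False},
]
cache = BitsetCache(emails)
unread_inbox = cache.bool_must(cache.leaf("folder", "inbox"), cache.leaf("read", False))
# Running the same compound filter again reuses both cached leaf bitsets.
again = cache.bool_must(cache.leaf("folder", "inbox"), cache.leaf("read", False))
print(positions(unread_inbox), cache.index_lookups)
```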

filter order

The more selective a filter is, the earlier it should be placed. For example, if filter a returns 10,000 results and filter b returns 10, then b should be placed before a. Cached filters are very fast, so they should be placed before filters that are not cacheable.
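The effect of ordering can be estimated with a simplified cost model in which each filter is only checked against the documents that survived the previous one. This is an assumption made for illustration, not a description of Elasticsearch internals. The function run_filters and the two predicates are hypothetical.

```python
def run_filters(docs, filters):
    """Apply filters in order and count how many per-document checks were made."""
    checks = 0
    survivors = docs
    for predicate in filters:
        checks += len(survivors)
        survivors = [doc for doc in survivors if predicate(doc)]
    return survivors, checks


docs = list(range(100_000))


def broad(doc):
    """Matches 10,000 of the 100,000 documents."""
    return doc % 10 == 0


def narrow(doc):
    """Matches only 10 of the 100,000 documents."""
    return doc % 10_000 == 0


broad_first, broad_first_checks = run_filters(docs, [broad, narrow])
narrow_first, narrow_first_checks = run_filters(docs, [narrow, broad])
print(broad_first_checks, narrow_first_checks)
```

Both orders return the same documents, but placing the narrow filter first leaves far fewer documents for the second filter to examine.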

full-text search

Term-based queries

Queries like the term or fuzzy queries are low-level queries that have no analysis phase. They operate on a single term. A term query for the term Foo looks for that exact term in the inverted index and calculates the TF/IDF relevance _score for each document that contains the term.

It is important to remember that the term query looks in the inverted index for the exact term only; it won't match any variants like foo or FOO. It doesn't matter how the term came to be in the index, just that it is. If you were to index ["Foo","Bar"] into an exact value not_analyzed field, or Foo Bar into an analyzed field with the whitespace analyzer, both would result in having the two terms Foo and Bar in the inverted index.
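The exact-match behaviour can be sketched with a tiny inverted index. In this toy model the index maps each term to the set of document ids containing it, and a term lookup is a plain dictionary access with no analysis. The sample documents and helper names are hypothetical, and whitespace splitting stands in for the whitespace analyzer.

```python
from collections import defaultdict


def build_index(docs, analyze):
    """Map every term produced by the analyzer to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


# str.split with no arguments splits on whitespace and keeps the original case.
index = build_index({1: "Foo Bar", 2: "foo baz"}, str.split)


def term_query(term):
    """Look up the exact term; the query string is not analyzed in any way."""
    return index.get(term, set())


print(term_query("Foo"), term_query("foo"), term_query("FOO"))
```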

Full-text queries

Queries like the match or query_string queries are high-level queries that understand the mapping of a field:

If you use them to query a date or integer field, they will treat the query string as a date or integer, respectively.
If you query an exact value (not_analyzed) string field, they will treat the whole query string as a single term.
But if you query a full-text (analyzed) field, they will first pass the query string through the appropriate analyzer to produce the list of terms to be queried.

a single-word query

Our first example explains what happens when we use the match query to search within a full-text field for a single word:

GET /my_index/my_type/_search
{"query":{"match":{"title":"QUICK!"}}}

Elasticsearch executes the preceding match query as follows:

1. Check the field type. The title field is a full-text (analyzed) string field, which means that the query string should be analyzed too.

2. Analyze the query string. The query string QUICK! is passed through the standard analyzer, which results in the single term quick. Because we have just a single term, the match query can be executed as a single low-level term query.

3. Find matching docs. The term query looks up quick in the inverted index and retrieves the list of documents that contain that term, in this case documents 1, 2, and 3.

4. Score each doc. The term query calculates the relevance _score for each matching document by combining the term frequency (how often quick appears in the title field of each document), the inverse document frequency (how often quick appears in the title field in all documents in the index), and the length of each field (shorter fields are considered more relevant). See What Is Relevance?.
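The four steps can be walked through in a small model. The sketch below uses a rough stand-in for the standard analyzer (lowercase, split on non-alphanumerics) and a simplified score that multiplies the three factors named above. The titles are invented so that documents 1, 2, and 3 contain quick, and the exact formula is an assumption for illustration that omits normalization factors a real engine applies.

```python
import math
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase and split into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Hypothetical titles, analyzed at "index time".
titles = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the quick dog",
    3: "The quick brown fox jumps over the lazy dog",
    4: "Brown fox brown dog",
}
analyzed = {doc_id: analyze(title) for doc_id, title in titles.items()}


def match_single_word(query_string):
    """Analyze the query, find matching docs, and score each of them."""
    terms = analyze(query_string)
    if len(terms) != 1:
        raise ValueError("this sketch only handles single-term queries")
    term = terms[0]
    hits = {doc_id: tokens for doc_id, tokens in analyzed.items() if term in tokens}
    # Inverse document frequency: rarer terms weigh more.
    idf = 1 + math.log(len(analyzed) / (len(hits) + 1))
    scores = {}
    for doc_id, tokens in hits.items():
        tf = math.sqrt(tokens.count(term))      # term frequency
        length_norm = 1 / math.sqrt(len(tokens))  # shorter fields score higher
        scores[doc_id] = tf * idf * length_norm
    return scores


scores = match_single_word("QUICK!")
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this model document 1 ranks first because its title is short, and document 2 outranks document 3 because quick appears in it twice.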

multiword queries

GET /my_index/my_type/_search
{"query":{"match":{"title":{"query":"BROWN DOG!","operator":"and"}}}}

controlling precision

GET /my_index/my_type/_search
{"query":{"match":{"title":{"query":"quick brown dog","minimum_should_match":"75%"}}}}
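With three query terms, 75% works out to 2.25, which is rounded down so that two of the three terms must match. The sketch below models that arithmetic, assuming percentages only, at least one required term, and a title that has already been analyzed into a set of terms. The function names are hypothetical.

```python
def required_matches(term_count, minimum_should_match):
    """Number of query terms that must match, for a percentage such as "75%"."""
    percent = int(minimum_should_match.rstrip("%"))
    # Integer division rounds down; at least one term must always match.
    return max(1, term_count * percent // 100)


def matches(title_terms, query_terms, minimum_should_match="75%"):
    """Check whether enough of the query terms appear in the title."""
    needed = required_matches(len(query_terms), minimum_should_match)
    found = sum(1 for term in query_terms if term in title_terms)
    return found >= needed


query_terms = ["quick", "brown", "dog"]
print(required_matches(len(query_terms), "75%"))
print(matches({"the", "quick", "brown", "fox"}, query_terms))
print(matches({"the", "quick", "cat"}, query_terms))
```

Setting the value to "100%" in this model behaves like the "operator":"and" request shown earlier, since every term is then required.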
