ES Study Notes 4: Query DSL
queries and filters
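Although we refer to the query DSL, in reality there are two DSLs: the query DSL and the filter DSL. Query clauses and filter clauses are similar in nature, but have slightly different purposes.
filter: the answer is a plain yes or no. Filters are fast, can be cached, and are generally used for exact-value lookups.
query: asks how relevant each document is to the search text. Queries cannot be cached and are generally used for full-text search.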
most important queries and filters
range filter
{"range":{"age":{"gte":20,"lt":30}}}
exists and missing filter
The exists and missing filters are used to find documents in which the specified field either has one or more values (exists) or doesn't have any values (missing). It is similar in nature to IS_NULL (missing) and NOT IS_NULL (exists) in SQL.
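As a rough illustration of the semantics only, here is a plain-Python sketch over in-memory dicts. It is not Elasticsearch code; the helper names and the sample posts are made up for this note, and it assumes that an empty list or a null counts as "no value".

```python
def has_value(doc, field):
    """True if the field holds at least one non-null value."""
    value = doc.get(field)
    if isinstance(value, list):
        return any(v is not None for v in value)
    return value is not None


def exists(docs, field):
    """Emulates the exists filter: keep docs where the field has a value."""
    return [d for d in docs if has_value(d, field)]


def missing(docs, field):
    """Emulates the missing filter: keep docs where the field has no value."""
    return [d for d in docs if not has_value(d, field)]


# Hypothetical sample documents.
posts = [
    {"id": 1, "tags": ["search"]},
    {"id": 2, "tags": ["search", "open_source"]},
    {"id": 3, "other_field": "some data"},
    {"id": 4, "tags": None},
    {"id": 5, "tags": ["search", None]},
]

print([d["id"] for d in exists(posts, "tags")])   # [1, 2, 5]
print([d["id"] for d in missing(posts, "tags")])  # [3, 4]
```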
bool filter
Used to combine several clauses into one compound condition.
Clauses: must, should, must_not
{
  "query": {
    "bool": {
      "must": {
        "query": {
          "match": {
            "text": "fadsfdasfds"
          }
        }
      }
    }
  }
}
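To make the three clause types concrete, here is a toy emulation in plain Python. It assumes the bool filter semantics where every must clause has to match, no must_not clause may match, and at least one should clause has to match when any are given. The function names and the sample products are hypothetical.

```python
def bool_filter(docs, must=(), should=(), must_not=()):
    """Keep docs passing every must, no must_not, and (if given) one should."""
    kept = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        if should and not any(clause(doc) for clause in should):
            continue
        kept.append(doc)
    return kept


def term(field, value):
    """Toy stand-in for a term filter: exact equality on one field."""
    return lambda doc: doc.get(field) == value


# Hypothetical sample data.
products = [
    {"id": 1, "price": 10, "in_stock": True},
    {"id": 2, "price": 20, "in_stock": True},
    {"id": 3, "price": 30, "in_stock": True},
    {"id": 4, "price": 30, "in_stock": False},
]

hits = bool_filter(
    products,
    must=[term("in_stock", True)],
    should=[term("price", 20), term("price", 30)],
    must_not=[term("id", 3)],
)
print([d["id"] for d in hits])  # [2]
```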
QUERIES:
MATCH
The match query should be the standard query that you reach for whenever you want to query for a full-text or exact value in almost any field.
If you run a match query against a full-text field, it will analyze the query string by using the correct analyzer for that field before executing the search:
{"match":{"tweet":"About Search"}}
If you use it on a field containing an exact value, such as a number, a date, a Boolean, or a not_analyzed string field, then it will search for that exact value:
{"match":{"age":26}}
{"match":{"date":"2014-09-01"}}
{"match":{"public":true}}
{"match":{"tag":"full_text"}}
For exact-value searches, you probably want to use a filter instead of a query, as a filter will be cached.
MULTI_MATCH
bool query
combining queries with filters
GET /_search
{"query":{"filtered":{"query":{"match":{"email":"business opportunity"}},"filter":{"term":{"folder":"inbox"}}}}}
just a filter
While in query context, if you need to use a filter without a query (for instance, to match all emails in the inbox), you can just omit the query:
GET /_search
{"query":{"filtered":{"filter":{"term":{"folder":"inbox"}}}}}
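The two request bodies above differ only in whether a query clause is present. A small helper can build both shapes; this is a sketch that assumes the filtered syntax used throughout these notes, and the function name is made up.

```python
import json


def filtered(filter_clause, query_clause=None):
    """Build a `filtered` search body; the query part is optional."""
    inner = {"filter": filter_clause}
    if query_clause is not None:
        inner["query"] = query_clause
    return {"query": {"filtered": inner}}


# Query plus filter: match emails about "business opportunity" in the inbox.
with_query = filtered(
    {"term": {"folder": "inbox"}},
    {"match": {"email": "business opportunity"}},
)

# Filter only: every email in the inbox.
filter_only = filtered({"term": {"folder": "inbox"}})

print(json.dumps(with_query, indent=2))
print(json.dumps(filter_only, indent=2))
```

The printed JSON can be sent as the body of GET /_search.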
You seldom need to use a query as a filter, but we have included it for completeness' sake. The only time you may need it is when you need to use full-text matching while in filter context.
finding multiple exact values
GET /my_store/products/_search
{"query":{"filtered":{"filter":{"terms":{"price":[20,30]}}}}}
contains, but does not equal
GET /my_index/my_type/_search
{"query":{"filtered":{"filter":{"bool":{"must":[{"term":{"tags":"search"}},{"term":{"tag_count":1}}]}}}}}
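Why the extra tag_count clause is needed: a term filter on a multi-value field matches when the field contains the term, not when it equals it. The toy Python below shows the difference on two hypothetical documents; term_matches is a made-up helper, not an Elasticsearch call.

```python
def term_matches(doc, field, value):
    """Toy term filter: true if the field contains the value."""
    values = doc.get(field)
    if not isinstance(values, list):
        values = [values]
    return value in values


# Hypothetical documents; tag_count is maintained at index time.
docs = [
    {"id": 1, "tags": ["search"], "tag_count": 1},
    {"id": 2, "tags": ["search", "open_source"], "tag_count": 2},
]

contains = [d["id"] for d in docs if term_matches(d, "tags", "search")]
equals = [
    d["id"]
    for d in docs
    if term_matches(d, "tags", "search") and term_matches(d, "tag_count", 1)
]

print(contains)  # [1, 2]
print(equals)    # [1]
```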
When used on date fields, the range filter supports date math operations. For example, if we want to find all documents that have a timestamp sometime in the last hour:
"range":{"timestamp":{"gt":"now-1h"}}
dealing with null values
GET /my_index/posts/_search
{"query":{"filtered":{"filter":{"exists":{"field":"tags"}}}}}
GET /my_index/posts/_search
{"query":{"filtered":{"filter":{"missing":{"field":"tags"}}}}}
all about caching
The cache is updated in real time, so there is no need to worry about cached results expiring.
Leaf filters have to consult the inverted index on disk, so it makes sense to cache them. Compound filters, on the other hand, use fast bit logic to combine the bitsets resulting from their inner clauses, so it is efficient to recalculate them every time.
Certain leaf filters, however, are not cached by default, because it doesn't make sense to do so:
Script filters
The results from script filters cannot be cached because the meaning of the script is opaque to Elasticsearch.
Geo-filters
The geolocation filters, which we cover in more detail in Geolocation, are usually used to filter results based on the geolocation of a specific user. Since each user has a unique geolocation, it is unlikely that geo-filters will be reused, so it makes no sense to cache them.
Date ranges
Date ranges that use the now function (for example "now-1h") result in values accurate to the millisecond. Every time the filter is run, now returns a new time. Older filters will never be reused, so caching is disabled by default. However, when using now with rounding (for example, now/d rounds to the nearest day), caching is enabled by default.
Sometimes the default caching strategy is not correct. Perhaps you have a complicated bool expression that is reused several times in the same query. Or you have a filter on a date field that will never be reused. The default caching strategy can be overridden on almost any filter by setting the _cache flag:
{"range":{"timestamp":{"gt":"2014-01-02 16:15:14"},"_cache":false}}
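The reasoning about now can be checked with a few lines of Python. This sketch only models how the resolved boundary value changes between two runs; it assumes that rounding with now/d can be approximated by truncating to the start of the UTC day, and the function names are made up.

```python
from datetime import datetime, timedelta, timezone


def resolve_now_minus_1h(now):
    """Boundary for "now-1h": differs on every millisecond."""
    return now - timedelta(hours=1)


def resolve_now_rounded_to_day(now):
    """Assumed model of "now/d": truncate to the start of the day."""
    return now.replace(hour=0, minute=0, second=0, microsecond=0)


# Two runs of the same filter, one millisecond apart.
run1 = datetime(2014, 1, 2, 16, 15, 14, 123000, tzinfo=timezone.utc)
run2 = run1 + timedelta(milliseconds=1)

# Unrounded: the boundary never repeats, so a cached result is never reused.
print(resolve_now_minus_1h(run1) == resolve_now_minus_1h(run2))  # False

# Rounded: the boundary is stable all day, so caching pays off.
print(resolve_now_rounded_to_day(run1) == resolve_now_rounded_to_day(run2))  # True
```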
filter order
The more selective a filter is, the earlier it should be placed. For example, if filter a returns 10,000 results and filter b returns 10, put b before a. Cached filters are very fast, so they should be placed before filters that are not cacheable.
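A toy cost model shows why order matters. This is not how Elasticsearch evaluates filters internally (it combines bitsets); the sketch simply counts how many per-document checks a naive sequential pipeline performs, using made-up data where filter a keeps 10,000 documents and filter b keeps 10.

```python
def run_filters(docs, filters):
    """Apply filters in order; return survivors and the number of checks made."""
    checks = 0
    remaining = docs
    for keep in filters:
        checks += len(remaining)
        remaining = [d for d in remaining if keep(d)]
    return remaining, checks


docs = list(range(100_000))


def filter_a(d):
    return d % 10 == 0  # keeps 10,000 documents


def filter_b(d):
    return d % 10_000 == 0  # keeps 10 documents


hits_ab, checks_ab = run_filters(docs, [filter_a, filter_b])
hits_ba, checks_ba = run_filters(docs, [filter_b, filter_a])

print(hits_ab == hits_ba)  # True: same result either way
print(checks_ab)           # 110000
print(checks_ba)           # 100010
```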
full-text search
Term-based queries
Queries like the term or fuzzy queries are low-level queries that have no analysis phase. They operate on a single term. A term query for the term Foo looks for that exact term in the inverted index and calculates the TF/IDF relevance _score for each document that contains the term.
It is important to remember that the term query looks in the inverted index for the exact term only; it won't match any variants like foo or FOO.
It doesn't matter how the term came to be in the index, just that it is. If you were to index ["Foo","Bar"] into an exact value not_analyzed field, or Foo Bar into an analyzed field with the whitespace analyzer, both would result in having the two terms Foo and Bar in the inverted index.
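A minimal inverted index in Python makes the exact-term behaviour visible. This is a sketch: str.split stands in for the whitespace analyzer, and build_index and term_query are made-up helpers, not Elasticsearch APIs.

```python
from collections import defaultdict


def build_index(docs, analyzer):
    """Map each term produced by the analyzer to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index[term].add(doc_id)
    return index


def term_query(index, term):
    """Look up the exact term; no analysis is applied to the input."""
    return sorted(index.get(term, set()))


# str.split approximates the whitespace analyzer: split only, no lowercasing.
index = build_index({1: "Foo Bar"}, str.split)

print(sorted(index))             # ['Bar', 'Foo']
print(term_query(index, "Foo"))  # [1]
print(term_query(index, "foo"))  # []
print(term_query(index, "FOO"))  # []
```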
Queries like the match or query_string queries are high-level queries that understand the mapping of a field:
- If you use them to query a date or integer field, they will treat the query string as a date or integer, respectively.
- If you query an exact value (not_analyzed) string field, they will treat the whole query string as a single term.
- But if you query a full-text (analyzed) field, they will first pass the query string through the appropriate analyzer to produce the list of terms to be queried.
a single-word query
Our first example explains what happens when we use the match query to search within a full-text field for a single word:
GET /my_index/my_type/_search
{"query":{"match":{"title":"QUICK!"}}}
Elasticsearch executes the preceding match query as follows:
- Check the field type. The title field is a full-text (analyzed) string field, which means that the query string should be analyzed too.
- Analyze the query string. The query string QUICK! is passed through the standard analyzer, which results in the single term quick. Because we have just a single term, the match query can be executed as a single low-level term query.
- Find matching docs. The term query looks up quick in the inverted index and retrieves the list of documents that contain that term; in this case, documents 1, 2, and 3.
- Score each doc. The term query calculates the relevance _score for each matching document, by combining the term frequency (how often quick appears in the title field of each document), with the inverse document frequency (how often quick appears in the title field in all documents in the index), and the length of each field (shorter fields are considered more relevant). See What Is Relevance?
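The four steps can be walked through with a toy implementation. Everything here is an assumption made for illustration: the sample titles, the regex tokenizer standing in for the standard analyzer, and a simplified TF/IDF-style formula (square root of term frequency, times a log-based inverse document frequency, times one over the square root of the field length). It is not the scoring code Elasticsearch actually runs.

```python
import math
import re

# Hypothetical sample titles.
docs = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
}


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def match_single_term(query_string, docs):
    # Step 2: analyze the query string; this sketch handles one term only.
    terms = analyze(query_string)
    if len(terms) != 1:
        raise ValueError("this sketch handles single-term queries only")
    term = terms[0]

    # Step 3: find the documents whose analyzed title contains the term.
    analyzed = {doc_id: analyze(title) for doc_id, title in docs.items()}
    hits = {doc_id: toks for doc_id, toks in analyzed.items() if term in toks}

    # Step 4: score with a simplified TF/IDF and field-length norm.
    idf = 1 + math.log(len(docs) / (len(hits) + 1))
    scores = {}
    for doc_id, toks in hits.items():
        tf = math.sqrt(toks.count(term))
        norm = 1 / math.sqrt(len(toks))
        scores[doc_id] = tf * idf * norm
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


ranking, scores = match_single_term("QUICK!", docs)
print(sorted(scores))  # [1, 2, 3]
print(ranking)         # [1, 3, 2]
```

With this toy formula, document 1 ranks first because its title is short, and document 3 beats document 2 because quick appears twice in it.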
multiword queries
GET /my_index/my_type/_search
{"query":{"match":{"title":{"query":"BROWN DOG!","operator":"and"}}}}
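The effect of the operator parameter can be sketched in a few lines of Python. The sample titles and the regex tokenizer are assumptions made for illustration; with "or" any term is enough, while "and" requires every term to be present.

```python
import re

# Hypothetical sample titles.
docs = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
}


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def match(query_string, docs, operator="or"):
    """Return ids of docs whose title contains any ("or") or all ("and") terms."""
    terms = analyze(query_string)
    combine = all if operator == "and" else any
    return sorted(
        doc_id
        for doc_id, title in docs.items()
        if combine(term in analyze(title) for term in terms)
    )


print(match("BROWN DOG!", docs, operator="or"))   # [1, 2, 3, 4]
print(match("BROWN DOG!", docs, operator="and"))  # [2, 3, 4]
```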
controlling precision
GET /my_index/my_type/_search
{"query":{"match":{"title":{"query":"quick brown dog","minimum_should_match":"75%"}}}}
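With a percentage, the required number of matching terms is rounded down, so 75% of three terms means two terms must match. A toy Python version, using made-up sample titles and a regex tokenizer in place of the real analyzer:

```python
import re

# Hypothetical sample titles.
docs = {
    1: "The quick brown fox",
    2: "The quick brown fox jumps over the lazy dog",
    3: "The quick brown fox jumps over the quick dog",
    4: "Brown fox brown dog",
}


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def required_matches(term_count, percent):
    """Number of terms that must match, rounded down."""
    return term_count * percent // 100


def match_min(query_string, docs, percent):
    """Return ids of docs containing at least the required number of query terms."""
    terms = analyze(query_string)
    needed = required_matches(len(terms), percent)
    hits = []
    for doc_id, title in docs.items():
        tokens = set(analyze(title))
        if sum(term in tokens for term in terms) >= needed:
            hits.append(doc_id)
    return sorted(hits)


print(required_matches(3, 75))                 # 2
print(match_min("quick brown dog", docs, 75))  # [1, 2, 3, 4]
print(match_min("quick brown dog", docs, 100)) # [2, 3]
```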