Elasticsearch --- Study Notes (2)
These are my own study notes; for details, please refer to the official ES documentation.
9. Notes ------ SQL plugin
Once the SQL plugin is installed, there are two ways to query data:
- As before, directly in the URL, using _sql + "SQL query statement"
curl -XPOST http://172.16.150.149:29200/_sql?pretty -d "SELECT * FROM facebook"
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "pretty",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ668ZcHFL4sAFl7IMI",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ67I_dHFL4sAFl7IMJ",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "123",
      "_score" : 1.0,
      "_source" : {
        "title" : "change version num",
        "text" : "changing...",
        "views" : 0,
        "tags" : [ "testing" ]
      }
    } ]
  }
}
- Through the SQL plugin's visual interface
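The URL-based way can also be driven from a script. Below is a minimal sketch, assuming the plugin accepts the SQL text as the POST body exactly as in the curl call above; build_sql_request is a hypothetical helper name, and the request is only built here, not sent.

```python
import urllib.request


def build_sql_request(base_url, sql):
    """Build (but do not send) a POST request to the plugin's _sql endpoint.

    Assumption: the SQL statement goes in the request body, as in the
    curl example in these notes.
    """
    url = base_url.rstrip("/") + "/_sql?pretty"
    return urllib.request.Request(url, data=sql.encode("utf-8"), method="POST")


req = build_sql_request("http://172.16.150.149:29200", "SELECT * FROM facebook")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it against a live node:
#   urllib.request.urlopen(req).read()
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything touches the cluster.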
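10. Notes ------ GET multiple documents
The mget API takes a docs array as its parameter. Each element holds the metadata of one document to retrieve: _index, _type and _id.
When _index and _type are the same for every document, you can put them in the URL and simply pass an ids array.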
curl -i -XGET http://172.16.150.149:29200/facebook/blog/_mget?pretty -d '{"ids":["123","888"]}'
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 504

{
  "docs" : [ {
    "_index" : "facebook",
    "_type" : "blog",
    "_id" : "123",
    "_version" : 121,
    "found" : true,
    "_source" : {
      "title" : "change version num",
      "text" : "changing...",
      "views" : 0,
      "tags" : [ "testing" ]
    }
  }, {
    "_index" : "facebook",
    "_type" : "blog",
    "_id" : "888",
    "_version" : 1,
    "found" : true,
    "_source" : {
      "title" : "website",
      "text" : "new test is made",
      "date" : "2018/10/17"
    }
  } ]
}
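The choice between the docs form and the ids shortcut can be sketched as follows. This is only an illustration: build_mget_body is a hypothetical helper, and the second index/type used in the usage example is made up for the demonstration.

```python
def build_mget_body(docs):
    """Pick the mget URL path and request body for a list of documents.

    docs: list of (index, type, id) tuples.
    If every document shares the same index and type, both go into the
    URL and the body is just {"ids": [...]}; otherwise each element of
    "docs" carries its own _index, _type and _id.
    """
    if not docs:
        raise ValueError("need at least one document")
    pairs = {(index, doc_type) for index, doc_type, _ in docs}
    if len(pairs) == 1:
        index, doc_type = next(iter(pairs))
        return f"/{index}/{doc_type}/_mget", {"ids": [d for _, _, d in docs]}
    body = {"docs": [{"_index": i, "_type": t, "_id": d} for i, t, d in docs]}
    return "/_mget", body


same = build_mget_body([("facebook", "blog", "123"), ("facebook", "blog", "888")])
mixed = build_mget_body([("facebook", "blog", "123"), ("twitter", "newtype", "970")])
print(same)
print(mixed)
```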
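11. Notes ------ bulk operations
Why are line breaks required?
The reason is performance. Because every action sits on its own line, the node can read the request line by line, parse only the action line to find out where the operation belongs, and pass the raw lines on as they are, instead of parsing the whole body into one large JSON structure first. This keeps the memory and garbage-collection cost in the JVM low.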
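To make the line-by-line idea concrete, here is a minimal sketch of a reader for a bulk body. It is not how Elasticsearch itself is implemented; split_bulk_body is a hypothetical name, and the sample documents are made up. The point is that only the action line is parsed while the source line is kept as a raw string.

```python
import json


def split_bulk_body(body):
    """Yield (action, metadata, raw_source) for each operation in a bulk body.

    Only the action line is parsed as JSON. The source line that follows
    index/create/update is returned untouched; delete has no source line.
    """
    lines = iter(body.split("\n"))
    for line in lines:
        if not line.strip():
            continue  # skip blank lines, e.g. after the final newline
        (action, metadata), = json.loads(line).items()
        raw_source = None
        if action != "delete":
            raw_source = next(lines, None)
        yield action, metadata, raw_source


body = (
    '{"index":{"_index":"facebook","_type":"blog","_id":"1"}}\n'
    '{"title":"website"}\n'
    '{"delete":{"_index":"facebook","_type":"blog","_id":"2"}}\n'
)
ops = list(split_bulk_body(body))
for op in ops:
    print(op)
```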
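The bulk API executes in the following order:
The client sends the bulk request to Node 1 (the master in this example, acting as the coordinating node).
Node 1 builds one bulk request per node and forwards these requests in parallel to every node that hosts an involved primary shard.
The primary shard executes the operations one after another, in order. As each operation succeeds, the primary forwards the new document (or the deletion) to its replica shards in parallel, then moves on to the next operation. Once all replica shards report success for all operations, the node reports success to the coordinating node, which collects the responses and returns them to the client.
This also shows that a bulk request is not atomic: each operation succeeds or fails on its own.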
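The non-atomic behaviour can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions (a single dict instead of shards, and apply_bulk is a hypothetical name); the status codes mirror the usual per-item results, such as 409 for a create on an existing id.

```python
def apply_bulk(store, operations):
    """Apply operations one by one; a failure does not undo earlier ones.

    store: dict mapping document id -> source.
    operations: list of (action, doc_id, source) tuples.
    Returns one result per operation, in order.
    """
    results = []
    for action, doc_id, source in operations:
        if action == "create":
            if doc_id in store:
                # create refuses to overwrite; later operations still run
                results.append({"_id": doc_id, "status": 409})
                continue
            store[doc_id] = source
            results.append({"_id": doc_id, "status": 201})
        elif action == "index":
            status = 200 if doc_id in store else 201
            store[doc_id] = source
            results.append({"_id": doc_id, "status": status})
        elif action == "delete":
            existed = store.pop(doc_id, None) is not None
            results.append({"_id": doc_id, "status": 200 if existed else 404})
        else:
            raise ValueError(f"unsupported action: {action}")
    return results


store = {"1": {"title": "website"}}
results = apply_bulk(store, [
    ("create", "1", {"title": "duplicate"}),  # fails, id already exists
    ("create", "2", {"title": "new"}),        # still succeeds
])
print(results)
```

This is why a client should check the result of every item in a bulk response rather than only the overall HTTP status.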
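The problem I ran into was how to enter real line breaks in the request body rather than line continuations.
I found a solution on GitHub (tested by myself on Ubuntu): add -H 'Content-Type: application/json'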
curl -H 'Content-Type: application/json' -i -XPOST http://172.16.150.149:29200/_bulk -d '
{"create":{"_index":"twitter","_type":"newtype","_id":970}}
{"title":"website","text":"blog is making"}
{"create":{"_index":"user","_type":"doc","_id":"2"}}
{"title":"website","text":"blog is making"}
'
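After that you can break lines freely. Note the closing ' at the end; if you forget to type it and just press Enter, the shell simply starts another line.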
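When the request is generated by a script, the line breaks are easier to get right by building the body programmatically. A minimal sketch, assuming each create action is followed by a source line; build_bulk_body and the sample source document are hypothetical.

```python
import json


def build_bulk_body(operations):
    """Serialize operations into the newline-delimited bulk format.

    operations: list of (action, metadata, source_or_None) tuples.
    Each JSON object goes on its own line, and the body ends with a
    newline, which the bulk API requires.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if source is not None:  # delete carries no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body([
    ("create", {"_index": "twitter", "_type": "newtype", "_id": 970},
     {"title": "website"}),  # hypothetical source document
    ("delete", {"_index": "user", "_type": "doc", "_id": "2"}, None),
])
print(bulk_body, end="")
```

json.dumps without an indent never emits a raw newline, so every object is guaranteed to stay on one line.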
12. Overview ------ what routing does
The documentation explains how ES stores data; this is just a brief note for the record.
shard = hash(routing) % number_of_primary_shards
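routing is a variable value. It defaults to the document's _id but can also be set to a custom value. The routing value is passed through a hash function to produce a number, which is then divided by number_of_primary_shards (the number of primary shards) to get the remainder. This remainder, which always falls between 0 and number_of_primary_shards-1, is the number of the shard where the document lives.
This explains why the number of primary shards has to be fixed when the index is created and never changed afterwards: if the number changed, all previous routing results would become invalid and the documents could no longer be found.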
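The formula and its consequence can be tried out with a small sketch. Note the assumption: zlib.crc32 is used here only as a stand-in hash function for illustration, not the hash Elasticsearch actually uses, and shard_for is a hypothetical name.

```python
import zlib


def shard_for(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards.

    CRC32 is a stand-in hash chosen for this illustration only.
    """
    return zlib.crc32(str(routing).encode("utf-8")) % number_of_primary_shards


ids = [str(i) for i in range(100)]
before = {doc_id: shard_for(doc_id, 3) for doc_id in ids}  # index created with 3 primaries
after = {doc_id: shard_for(doc_id, 4) for doc_id in ids}   # what 4 primaries would give
moved = [doc_id for doc_id in ids if before[doc_id] != after[doc_id]]
print(f"{len(moved)} of {len(ids)} documents would map to a different shard")
```

Any document in moved would be looked up on a shard that does not hold it, which is the "can no longer be found" situation described above.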
13. Notes ------- empty search
A search with no query specified:
GET /_search
curl -XGET http://172.16.150.149:29200/facebook/_search?pretty
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "pretty",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "888",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "new test is made",
        "date" : "2018/10/17"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ668ZcHFL4sAFl7IMI",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ67I_dHFL4sAFl7IMJ",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "123",
      "_score" : 1.0,
      "_source" : {
        "title" : "change version num",
        "text" : "changing...",
        "views" : 0,
        "tags" : [ "testing" ]
      }
    } ]
  }
}
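Meaning of the main fields
took: the time the query took, in milliseconds.
timed_out: whether the query timed out. A timeout can be set on the request to limit how long to wait for results from the nodes and shards; once it expires, the results gathered so far are returned and the connection is closed.
hits: the total number of matching documents, plus the _index, _type, _id and other details of each returned document.
_shards: shard information (total, successful, failed).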
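The fields above can be read from a parsed response like this. A minimal sketch, assuming the response has the same shape as the one recorded in these notes (in particular, hits.total being a plain number); summarize_search is a hypothetical helper and the sample dict is a trimmed copy of the output above.

```python
def summarize_search(response):
    """Pull the main fields out of a parsed _search response."""
    shards = response["_shards"]
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "all_shards_ok": shards["successful"] == shards["total"],
        "total": hits["total"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


sample = {
    "took": 1,
    "timed_out": False,
    "_shards": {"total": 3, "successful": 3, "failed": 0},
    "hits": {
        "total": 5,
        "max_score": 1.0,
        "hits": [
            {"_index": "facebook", "_type": "blog", "_id": "pretty", "_score": 1.0},
            {"_index": "facebook", "_type": "blog", "_id": "888", "_score": 1.0},
        ],
    },
}
summary = summarize_search(sample)
print(summary)
```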