Elasticsearch: Study Notes (2)

These are my own study notes only; for details, please see the official Elasticsearch documentation.

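9. Notes: the SQL plugin

Once the SQL plugin is installed, there are two ways to query data:

  • Call _sql directly in the URL and pass the SQL query statement as the request body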

    curl -XPOST 'http://172.16.150.149:29200/_sql?pretty' -d "SELECT * FROM facebook"
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "failed" : 0
      },
      "hits" : {
        "total" : 4,
        "max_score" : 1.0,
        "hits" : [ {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "pretty",
          "_score" : 1.0,
          "_source" : {
            "title" : "website",
            "text" : "blog is making",
            "date" : "2018/1016"
          }
        }, {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "AWZ668ZcHFL4sAFl7IMI",
          "_score" : 1.0,
          "_source" : {
            "title" : "website",
            "text" : "blog is making",
            "date" : "2018/1016"
          }
        }, {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "AWZ67I_dHFL4sAFl7IMJ",
          "_score" : 1.0,
          "_source" : {
            "title" : "website",
            "text" : "blog is making",
            "date" : "2018/1016"
          }
        }, {
          "_index" : "facebook",
          "_type" : "blog",
          "_id" : "123",
          "_score" : 1.0,
          "_source" : {
            "title" : "change version num",
            "text" : "changing...",
            "views" : 0,
            "tags" : [ "testing" ]
          }
        } ]
      }
    }
    
  • Use the SQL plugin's visual web interface

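10. Notes: GET multiple documents

The mget API takes a docs array as its parameter. Each element holds the metadata of one document to retrieve: _index, _type, and _id.

When _index and _type are the same for every document, you can put them in the URL and pass just an ids array.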

curl -i -XGET 'http://172.16.150.149:29200/facebook/blog/_mget?pretty' -d '{"ids":["123","888"]}'
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 504

{
  "docs" : [ {
    "_index" : "facebook",
    "_type" : "blog",
    "_id" : "123",
    "_version" : 121,
    "found" : true,
    "_source" : {
      "title" : "change version num",
      "text" : "changing...",
      "views" : 0,
      "tags" : [ "testing" ]
    }
  }, {
    "_index" : "facebook",
    "_type" : "blog",
    "_id" : "888",
    "_version" : 1,
    "found" : true,
    "_source" : {
      "title" : "website",
      "text" : "new test is made",
      "date" : "2018/10/17"
    }
  } ]
}
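
The two request shapes described above (a full docs array versus a plain ids array) can be sketched in Python. This is only an illustration of how the request body is put together; the helper name build_mget_request and its tuple input are my own assumptions, not part of any Elasticsearch client.

```python
import json


def build_mget_request(docs):
    """Return (path, body) for an mget call.

    docs is a list of (index, type, id) tuples. The function name and the
    input shape are hypothetical, chosen only for this sketch.
    """
    indices = {d[0] for d in docs}
    types = {d[1] for d in docs}
    if len(indices) == 1 and len(types) == 1:
        # Same _index and _type everywhere: put them in the URL, send only ids.
        index, doc_type = docs[0][0], docs[0][1]
        path = "/{}/{}/_mget".format(index, doc_type)
        body = {"ids": [d[2] for d in docs]}
    else:
        # Mixed indices or types: each element carries its own metadata.
        path = "/_mget"
        body = {"docs": [{"_index": i, "_type": t, "_id": d} for i, t, d in docs]}
    return path, json.dumps(body)


path, body = build_mget_request([("facebook", "blog", "123"), ("facebook", "blog", "888")])
print(path)
print(body)
```

With the two facebook/blog documents from the example above, this produces the same path and ids body that the curl command sends.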

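11. Notes: bulk operations

Why are newlines required?
It comes down to performance. Each line is a self-contained piece of the request that Elasticsearch can read directly, without first parsing the whole body into one large structure, which reduces memory pressure on the JVM.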

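The bulk API executes in the following order:

The client sends a bulk request to Node 1 (the master node in this example, acting as the coordinating node).

Node 1 builds one bulk request per node and forwards these requests in parallel to every node that hosts an involved primary shard.

Each primary shard executes its operations one after another, in order. As each operation succeeds, the primary forwards the new document (or the deletion) to its replica shards in parallel, then moves on to the next operation. Once all replica shards report success for all operations, the node reports success to the coordinating node, which collates the responses and returns them to the client.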

This also shows that a bulk request is not atomic: each operation succeeds or fails on its own.
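
Because the operations are independent, a client usually has to inspect the result of each item rather than trust the overall HTTP status. Below is a minimal Python sketch of that check. It assumes the usual bulk response layout (a top-level errors flag and an items array with one entry per operation); the sample_response dict is hand-written for illustration and is not real output from my cluster, and failed_items is a hypothetical helper name.

```python
def failed_items(bulk_response):
    """Collect (action, _id, status) for every operation that reported an error."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a one-key dict such as {"create": {...}} or {"index": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result.get("_id"), result.get("status")))
    return failures


# Hand-written sample for illustration only (not captured from a real cluster).
sample_response = {
    "errors": True,
    "items": [
        {"create": {"_index": "twitter", "_type": "newtype", "_id": "970", "status": 201}},
        {"create": {"_index": "user", "_type": "doc", "_id": "2", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}

for action, doc_id, status in failed_items(sample_response):
    print("failed:", action, doc_id, status)
```

In this sample the first create went through while the second did not, which is exactly the partial outcome that non-atomic behavior allows.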

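The problem I ran into myself: how do I type a real newline in the shell, rather than a line continuation?

I found the solution on GitHub (I tested it on Ubuntu): add -H 'Content-Type: application/json'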

 curl -H 'Content-Type: application/json' -i -XPOST http://172.16.150.149:29200/_bulk -d '
{"create":{"_index":"twitter","_type":"newtype","_id":970}}
{ "create": { "_index": "user", "_type": "doc", "_id": "2" }}
'

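After that you can break lines freely. Remember the closing ' at the end; if you forget to type it and just press Enter, the shell simply starts another line and keeps waiting.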
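
If typing the newline-delimited body by hand is awkward, it can also be generated. The following Python sketch builds such a body: one JSON object per line, with a trailing newline after the last line. The helper name build_bulk_body and the sample documents are assumptions made for illustration; only the index, type, and id values are borrowed from the curl command above.

```python
import json


def build_bulk_body(operations):
    """Build a newline-delimited bulk body.

    operations is a list of (action_line, source) pairs; source is None for
    actions that carry no document, such as delete.
    """
    lines = []
    for action_line, source in operations:
        lines.append(json.dumps(action_line))
        if source is not None:
            lines.append(json.dumps(source))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    # The document content here is made up for the example.
    ({"create": {"_index": "twitter", "_type": "newtype", "_id": 970}}, {"title": "hello"}),
    ({"delete": {"_index": "user", "_type": "doc", "_id": "2"}}, None),
])
print(body, end="")
```

json.dumps never emits a raw newline inside an object, so each action or document is guaranteed to stay on a single line.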

12. Understanding: what routing does

The documentation explains how ES decides where to store a document; this is just a brief note on it.

shard = hash(routing) % number_of_primary_shards

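routing is a variable value. It defaults to the document's _id, but it can also be set to a custom value. The routing value is passed through a hash function to produce a number, and that number is divided by number_of_primary_shards (the number of primary shards) to obtain the remainder. This remainder, which always lies between 0 and number_of_primary_shards-1, is the number of the shard where the document lives.

This explains why the number of primary shards has to be decided when the index is created and never changed afterwards: if the number changed, every previously computed routing result would become invalid, and the documents could no longer be found.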
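
The formula can be tried out with a small Python sketch. Note the assumption: zlib.crc32 is used here purely as a stand-in hash so the example runs with the standard library; Elasticsearch uses its own hash function (Murmur3), so the shard numbers printed here will not match a real cluster. The function name shard_for is hypothetical.

```python
import zlib


def shard_for(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards, with CRC32 as a stand-in hash."""
    h = zlib.crc32(routing.encode("utf-8"))  # non-negative integer in Python 3
    return h % number_of_primary_shards


ids = [str(i) for i in range(100)]
with_3 = {doc_id: shard_for(doc_id, 3) for doc_id in ids}
with_4 = {doc_id: shard_for(doc_id, 4) for doc_id in ids}

# Documents whose computed shard changes when the shard count goes from 3 to 4.
moved = [doc_id for doc_id in ids if with_3[doc_id] != with_4[doc_id]]
print(len(moved), "of", len(ids), "documents would map to a different shard")
```

A lookup after such a change would go to the newly computed shard, while the document still sits on the old one, which is why the shard count is fixed at creation time.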

13. Notes: empty search

No query is specified:

GET /_search

  curl -XGET http://172.16.150.149:29200/facebook/_search?pretty
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "pretty",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "888",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "new test is made",
        "date" : "2018/10/17"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ668ZcHFL4sAFl7IMI",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "AWZ67I_dHFL4sAFl7IMJ",
      "_score" : 1.0,
      "_source" : {
        "title" : "website",
        "text" : "blog is making",
        "date" : "2018/1016"
      }
    }, {
      "_index" : "facebook",
      "_type" : "blog",
      "_id" : "123",
      "_score" : 1.0,
      "_source" : {
        "title" : "change version num",
        "text" : "changing...",
        "views" : 0,
        "tags" : [ "testing" ]
      }
    } ]
  }
}

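Meaning of the main fields

took: how long the search took, in milliseconds.

timed_out: whether the search timed out. A timeout can be set on the request to limit how long to wait for results from each node and shard; when it expires, the results gathered so far are returned and the connection is closed.

hits: the total number of matching documents, plus the metadata of each hit, such as _index, _type, and _id.

_shards: shard information (how many shards were queried, and how many succeeded or failed).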
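
To make the field list concrete, here is a small Python sketch that pulls these fields out of a search response. The sample dict is a trimmed copy of the output above (only two of the five hits, without _source), and summarize is a hypothetical helper name. It assumes hits.total is a plain number, as it is in the output shown in these notes.

```python
def summarize(response):
    """Extract the main fields of a search response into a flat dict."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "shards_failed": response["_shards"]["failed"],
        "total": hits["total"],  # a plain number in the output shown above
        "ids": [h["_id"] for h in hits["hits"]],
    }


# Trimmed copy of the empty-search output above: two of the five hits.
sample = {
    "took": 1,
    "timed_out": False,
    "_shards": {"total": 3, "successful": 3, "failed": 0},
    "hits": {
        "total": 5,
        "max_score": 1.0,
        "hits": [
            {"_index": "facebook", "_type": "blog", "_id": "pretty", "_score": 1.0},
            {"_index": "facebook", "_type": "blog", "_id": "888", "_score": 1.0},
        ],
    },
}

print(summarize(sample))
```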