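ElasticSearch RestHighLevelClient Tutorial (3): Delete and Delete By Query
Preface
Deleting documents is an essential part of working with ES. According to the official API documentation, there are two ways to delete: the first deletes a document directly by index, type, and id; the second is delete by query, the so-called Delete By Query API.
With the first approach the id is a unique identifier, so as long as the document exists, the targeted delete is guaranteed to hit it.
The second approach, delete by query, effectively first searches for the documents that match the condition and then deletes them one by one by document ID. You must therefore check the query condition and confirm the scope of the results, or you may delete many documents by mistake.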
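When working with RestHighLevelClient, the first API poses no problem. For the second, although a DeleteByQueryRequest class is provided, the client does not yet appear to expose a way to execute it, so the workaround is to search first and then delete the hits by ID. This is not as fast as Delete By Query, but for now it is the only roundabout way to get the job done.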
Another option is to use RestClient, build the JSON body by hand, and send the HTTP request directly.
Main text
Preparing the data
/PUT http://{{host}}:{{port}}/delete_demo
{
  "mappings": {
    "demo": {
      "properties": {
        "content": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
/POST http://{{host}}:{{port}}/_bulk
{"index":{"_index":"delete_demo","_type":"demo" }}
{"content":"test1"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1 add"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test2"}
Note: in a bulk request, every line of data must end with a newline, and the last line must be followed by a newline as well.
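The newline rule is easy to break when the bulk body is assembled in code. Below is a minimal sketch, using only the JDK, of building such a body so that every line, including the last, ends with a newline. The class BulkBodyBuilder and its build method are hypothetical helpers for illustration and are not part of any Elasticsearch client; the sketch also assumes the values need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of any Elasticsearch client): assembles the
// newline-delimited body expected by the _bulk endpoint.
class BulkBodyBuilder {

    // Emits one action line and one source line per document.
    // Assumption: index, type and content contain no characters that need JSON escaping.
    static String build(String index, String type, List<String> contents) {
        StringBuilder body = new StringBuilder();
        for (String content : contents) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type).append("\"}}\n");
            // The '\n' is appended after the last source line too.
            body.append("{\"content\":\"").append(content).append("\"}\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = build("delete_demo", "demo",
                Arrays.asList("test1", "test1", "test1 add", "test2"));
        System.out.print(body);
    }
}
```

Appending the newline inside the loop, rather than joining the lines with a separator, is what guarantees the trailing newline after the final line.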
{
"took": 7,
"errors": false,
"items": [
{
"index": {
"_index": "delete_demo",
"_type": "demo",
"_id": "AWExGSdW00f4t28WAPen",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": true,
"status": 201
}
},
{
"index": {
"_index": "delete_demo",
"_type": "demo",
"_id": "AWExGSdW00f4t28WAPeo",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": true,
"status": 201
}
},
{
"index": {
"_index": "delete_demo",
"_type": "demo",
"_id": "AWExGSdW00f4t28WAPep",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": true,
"status": 201
}
},
{
"index": {
"_index": "delete_demo",
"_type": "demo",
"_id": "AWExGSdW00f4t28WAPeq",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": true,
"status": 201
}
}
]
}
Deleting by ID
API format
/DELETE http://{{host}}:{{port}}/delete_demo/demo/AWExGSdW00f4t28WAPen
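Java client
import java.io.IOException;

import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.Before;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;

public class ElkDaoTest extends BaseTest {

    @Autowired
    private RestHighLevelClient rhlClient;

    private String index;
    private String type;
    private String id;

    @Before
    public void prepare() {
        index = "delete_demo";
        type = "demo";
        // ID of the second document returned by the bulk request above
        id = "AWExGSdW00f4t28WAPeo";
    }

    @Test
    public void delete() {
        // A document is addressed by index, type and id
        DeleteRequest deleteRequest = new DeleteRequest(index, type, id);
        DeleteResponse response = null;
        try {
            response = rhlClient.delete(deleteRequest);
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Prints null if the request failed with an IOException
        System.out.println(response);
    }
}
The document is likewise deleted successfully.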
Delete By Query
API approach
First, restore the data to the original four documents.
/POST http://{{host}}:{{port}}/delete_demo/demo/_delete_by_query
{
"query":{
"match":{
"content":"test1"
}
}
}
{
"took": 14,
"timed_out": false,
"total": 3,
"deleted": 3,
"batches": 1,
"version_conflicts": 0,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0,
"failures": []
}
/GET http://{{host}}:{{port}}/delete_demo/demo/_search
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "delete_demo",
"_type": "demo",
"_id": "AWExKDse00f4t28WAafF",
"_score": 1,
"_source": {
"content": "test2"
}
}
]
}
}
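The result shows that three documents were deleted, namely test1, test1, and test1 add, leaving only test2. Clearly every document matched by the query was deleted; test1 add is included because the match query runs against the analyzed text field, which contains the token test1.
With a term query, deletion likewise follows whatever the query matches.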
/POST http://{{host}}:{{port}}/delete_demo/demo/_delete_by_query
{
"query":{
"term":{
"content.keyword":"test1"
}
}
}
{
"took": 6,
"timed_out": false,
"total": 2,
"deleted": 2,
"batches": 1,
"version_conflicts": 0,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0,
"failures": []
}
This confirms that Delete By Query is a search-then-delete process.
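To make the difference between the two responses above concrete, here is a toy in-memory model of the search-then-delete flow. It is not Elasticsearch code: the class DeleteByQueryModel, its method names, and the ids "1" to "4" are hypothetical, and splitting on whitespace is only a simplified stand-in for how the text field is analyzed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy in-memory model, NOT Elasticsearch code: it only mimics the
// search-then-delete behaviour seen in the two responses above.
class DeleteByQueryModel {

    // id -> value of the "content" field
    private final Map<String, String> docs = new LinkedHashMap<>();

    void index(String id, String content) {
        docs.put(id, content);
    }

    // Simplified stand-in for a match query on the analyzed "content" field:
    // a document matches if any whitespace-separated, lower-cased token equals the term.
    List<String> matchIds(String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            String[] tokens = e.getValue().toLowerCase(Locale.ROOT).split("\\s+");
            if (Arrays.asList(tokens).contains(wanted)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    // Stand-in for a term query on "content.keyword": the whole value must be equal.
    List<String> termIds(String value) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            if (e.getValue().equals(value)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    // Second step: delete every hit by id and report how many were removed.
    int deleteAll(List<String> ids) {
        int deleted = 0;
        for (String id : ids) {
            if (docs.remove(id) != null) {
                deleted++;
            }
        }
        return deleted;
    }

    int size() {
        return docs.size();
    }

    // The four sample documents from the data preparation step, with hypothetical ids.
    static DeleteByQueryModel sample() {
        DeleteByQueryModel model = new DeleteByQueryModel();
        model.index("1", "test1");
        model.index("2", "test1");
        model.index("3", "test1 add");
        model.index("4", "test2");
        return model;
    }

    public static void main(String[] args) {
        DeleteByQueryModel byMatch = sample();
        System.out.println("match deleted: " + byMatch.deleteAll(byMatch.matchIds("test1"))); // 3
        DeleteByQueryModel byTerm = sample();
        System.out.println("term deleted: " + byTerm.deleteAll(byTerm.termIds("test1")));    // 2
    }
}
```

In this model, as in the responses above, the number of deleted documents is determined entirely by what the search step returns, which is why the query should be checked before it is used for deletion.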
Java client
Using RestHighLevelClient
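import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.Before;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;

public class ElkDaoTest extends BaseTest {

    @Autowired
    private RestHighLevelClient rhlClient;

    private String index;
    private String type;
    private String deleteText;

    @Before
    public void prepare() {
        index = "delete_demo";
        type = "demo";
        deleteText = "test1";
    }

    @Test
    public void delete() {
        try {
            // Step 1: search for the documents to delete.
            // Note: a search returns at most 10 hits by default, so only those hits get deleted.
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            sourceBuilder.timeout(new TimeValue(2, TimeUnit.SECONDS));
            TermQueryBuilder termQueryBuilder1 = QueryBuilders.termQuery("content.keyword", deleteText);
            sourceBuilder.query(termQueryBuilder1);
            SearchRequest searchRequest = new SearchRequest(index);
            searchRequest.types(type);
            searchRequest.source(sourceBuilder);
            SearchResponse response = rhlClient.search(searchRequest);

            // Collect the IDs of the matching documents
            SearchHits hits = response.getHits();
            List<String> docIds = new ArrayList<>(hits.getHits().length);
            for (SearchHit hit : hits) {
                docIds.add(hit.getId());
            }

            // Step 2: delete the hits by ID in a single bulk request
            BulkRequest bulkRequest = new BulkRequest();
            for (String id : docIds) {
                DeleteRequest deleteRequest = new DeleteRequest(index, type, id);
                bulkRequest.add(deleteRequest);
            }
            // An empty bulk request fails validation, so skip it when nothing matched
            if (bulkRequest.numberOfActions() > 0) {
                rhlClient.bulk(bulkRequest);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}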
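Restore the data and run the code above. A search then returns only the two documents test1 add and test2, so the delete by query succeeded. The search output is not repeated here.
Using RestClient
As mentioned earlier in this series, rhlClient is a wrapper around RestClient, and some of its features are still being completed and are not yet implemented in Java. In that case, simply use restClient to call the ES service directly over HTTP.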
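import java.io.IOException;
import java.util.Collections;

import org.apache.http.HttpEntity;
import org.apache.http.entity.ContentType;
import org.apache.http.nio.entity.NStringEntity;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.junit.Before;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;

public class ElkDaoTest extends BaseTest {

    @Autowired
    private RestClient restClient;

    private String index;
    private String type;
    private String deleteText;

    @Before
    public void prepare() {
        index = "delete_demo";
        type = "demo";
        deleteText = "test1";
    }

    @Test
    public void delete() {
        // Same endpoint as the API call shown earlier
        String endPoint = "/" + index + "/" + type + "/_delete_by_query";
        String source = generateQueryString();
        HttpEntity entity = new NStringEntity(source, ContentType.APPLICATION_JSON);
        try {
            restClient.performRequest("POST", endPoint, Collections.<String, String> emptyMap(), entity);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Builds {"query":{"term":{"content.keyword":"<deleteText>"}}}
    public String generateQueryString() {
        // The IndexRequest is only used as a holder to turn the builder into a JSON string
        IndexRequest indexRequest = new IndexRequest();
        XContentBuilder builder;
        try {
            builder = JsonXContent.contentBuilder()
                    .startObject()
                        .startObject("query")
                            .startObject("term")
                                .field("content.keyword", deleteText)
                            .endObject()
                        .endObject()
                    .endObject();
            indexRequest.source(builder);
        } catch (IOException e) {
            e.printStackTrace();
        }
        String source = indexRequest.source().utf8ToString();
        return source;
    }
}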
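After running it, the two test1 documents are likewise deleted, so the feature works. The advantage is that it does not need two HTTP requests, which saves time.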
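Since this approach boils down to one POST with a hand-built JSON body, the same request can be sketched with the JDK's own java.net.http API (Java 11 or later) instead of RestClient. This is a swapped-in client for illustration only, not the method used above. The class DeleteByQueryHttpSketch, its helper methods, and the address http://localhost:9200 are assumptions, and the body builder assumes the field and value need no JSON escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch using the JDK HTTP client (Java 11+) in place of RestClient.
class DeleteByQueryHttpSketch {

    // Builds {"query":{"term":{"<field>":"<value>"}}}.
    // Assumption: field and value contain no characters that need JSON escaping.
    static String termQueryBody(String field, String value) {
        return "{\"query\":{\"term\":{\"" + field + "\":\"" + value + "\"}}}";
    }

    // Builds (but does not send) the POST request for the _delete_by_query endpoint.
    static HttpRequest buildRequest(String baseUrl, String index, String type, String body) {
        URI uri = URI.create(baseUrl + "/" + index + "/" + type + "/_delete_by_query");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // http://localhost:9200 is an assumed address; adjust it to your cluster.
        HttpRequest request = buildRequest("http://localhost:9200", "delete_demo", "demo",
                termQueryBody("content.keyword", "test1"));
        System.out.println(request.method() + " " + request.uri());
        // Sending needs a running cluster, for example:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Keeping request construction separate from sending makes it possible to inspect the endpoint and body before anything is deleted.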
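Summary
As far as deletion is concerned, RestHighLevelClient is not yet complete, so it has to be combined with the flexibility of RestClient to get the functionality we want.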