Elasticsearch in Practice (2): Integrating a Search Service into Spring Boot Microservices

阿新 • Published: 2019-02-13

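For setting up an Elasticsearch cluster with Docker, see the previous post: Elasticsearch in Practice (1): Building an Elasticsearch Cluster with Docker. This post describes how to integrate Elasticsearch into a Spring Boot application. It is based on Elasticsearch 2.4.4, Spring Boot 1.5.3.RELEASE, and spring-data-elasticsearch 2.1.3.RELEASE.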

Official Elasticsearch APIs

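Elasticsearch offers several APIs. You can use the official Java API directly (see the Elasticsearch Java API documentation).

For projects built on Spring, spring-data-elasticsearch is another option. It is annotation-driven, so index documents need no XML configuration, and it is simple to use. Its repository interfaces extend Spring Data's CrudRepository, the same base interface that JpaRepository builds on, so teams already using JPA can pick it up quickly.
Elasticsearch also exposes an HTTP API. Its resources follow a RESTful style, which makes them easy to remember, and the official site is the quickest place to look when a specific case comes up.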
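To make the RESTful resource style concrete, here is a minimal sketch that builds the /{index}/{type}/{id} style URLs used by Elasticsearch 2.x. The class name EsRestPaths and the localhost base URL are hypothetical; the index and type names are borrowed from the example later in this article.

```java
import java.net.URI;

/** Builds REST resource URIs in the /{index}/{type}/{id} style of Elasticsearch 2.x. */
final class EsRestPaths {

    private EsRestPaths() {
    }

    /** URI of a single document, usable with GET, PUT or DELETE. */
    static URI document(String baseUrl, String index, String type, String id) {
        return URI.create(baseUrl + "/" + index + "/" + type + "/" + id);
    }

    /** URI of the search endpoint for one type within an index. */
    static URI search(String baseUrl, String index, String type) {
        return URI.create(baseUrl + "/" + index + "/" + type + "/_search");
    }

    public static void main(String[] args) {
        // The base URL and the document id are placeholders.
        System.out.println(document("http://localhost:9200", "orders-test", "order-document", "1"));
        System.out.println(search("http://localhost:9200", "orders-test", "order-document"));
    }
}
```

The same pattern covers the curl commands shown later in this article, which only add query parameters such as pretty or routing.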

Integrating with spring-data-elasticsearch in a Spring Boot project

A Spring Boot project can integrate Elasticsearch through spring-data-elasticsearch. Official link:

spring-data-elasticsearch Doc. Developers familiar with JPA and Spring Data Commons should find spring-data-elasticsearch easy to pick up. The first step in a Gradle project is to add 'org.springframework.data:spring-data-elasticsearch:2.1.3.RELEASE' and 'org.springframework.boot:spring-boot-starter-data-elasticsearch:your_springboot_version'. For CRUD operations on indexed data, the main interface is ElasticsearchRepository, which extends CrudRepository from Spring Data's base repository package. Its main methods are:

@NoRepositoryBean
public interface CrudRepository<T, ID extends Serializable> extends Repository<T, ID> {

    /**
     * Saves a given entity. Use the returned instance for further operations as the save operation might have changed the
     * entity instance completely.
     * 
     * @param entity
     * @return the saved entity
     */
    <S extends T> S save(S entity);

    /**
     * Saves all given entities.
     * 
     * @param entities
     * @return the saved entities
     * @throws IllegalArgumentException in case the given entity is {@literal null}.
     */
    <S extends T> Iterable<S> save(Iterable<S> entities);

    /**
     * Retrieves an entity by its id.
     * 
     * @param id must not be {@literal null}.
     * @return the entity with the given id or {@literal null} if none found
     * @throws IllegalArgumentException if {@code id} is {@literal null}
     */
    T findOne(ID id);

    /**
     * Returns whether an entity with the given id exists.
     * 
     * @param id must not be {@literal null}.
     * @return true if an entity with the given id exists, {@literal false} otherwise
     * @throws IllegalArgumentException if {@code id} is {@literal null}
     */
    boolean exists(ID id);

    /**
     * Returns all instances of the type.
     * 
     * @return all entities
     */
    Iterable<T> findAll();

    /**
     * Returns all instances of the type with the given IDs.
     * 
     * @param ids
     * @return
     */
    Iterable<T> findAll(Iterable<ID> ids);

    /**
     * Returns the number of entities available.
     * 
     * @return the number of entities
     */
    long count();

    /**
     * Deletes the entity with the given id.
     * 
     * @param id must not be {@literal null}.
     * @throws IllegalArgumentException in case the given {@code id} is {@literal null}
     */
    void delete(ID id);

    /**
     * Deletes a given entity.
     * 
     * @param entity
     * @throws IllegalArgumentException in case the given entity is {@literal null}.
     */
    void delete(T entity);

    /**
     * Deletes the given entities.
     * 
     * @param entities
     * @throws IllegalArgumentException in case the given {@link Iterable} is {@literal null}.
     */
    void delete(Iterable<? extends T> entities);

    /**
     * Deletes all entities managed by the repository.
     */
    void deleteAll();
    ...
}

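Working with Elasticsearch documents (@Document) is similar to working with database tables (@Entity) in JPA. You can query with findByXX-style methods or define a custom @Query. For unusual query scenarios during development, the examples in the spring-data-elasticsearch source code cover most cases. Project repository: spring-data-elasticsearch Git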
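The findByXX convention works because Spring Data derives the query from the method name. The toy sketch below illustrates the idea by splitting the part after findBy on And. It is a simplification for illustration only, not Spring Data's actual parser, and the class name DerivedQueryNames is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of how a findBy method name maps to document properties. */
final class DerivedQueryNames {

    private static final String PREFIX = "findBy";

    private DerivedQueryNames() {
    }

    /**
     * Splits the part after "findBy" on "And" (only when followed by an
     * upper-case letter) and lower-cases the first letter of each piece,
     * e.g. findByParentIdAndPrice -> [parentId, price].
     */
    static List<String> properties(String methodName) {
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            throw new IllegalArgumentException("Not a findBy method: " + methodName);
        }
        List<String> result = new ArrayList<>();
        for (String part : methodName.substring(PREFIX.length()).split("And(?=[A-Z])")) {
            if (!part.isEmpty()) {
                result.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(properties("findByParentIdAndPrice"));
    }
}
```

Each extracted property becomes one criterion of the generated query, which is why the method name must match the field names of the document class.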

Configuring the search service at startup with spring-boot-starter-data-elasticsearch

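Spring Boot can wire many service beans at startup. The configuration class below uses spring-boot-starter-data-elasticsearch (with spring-data-elasticsearch 2.1.3.RELEASE) to connect a Spring Boot microservice to the Elasticsearch cluster at startup, and to expose the ElasticsearchTemplate that application code injects with @Autowired: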

import org.apache.commons.lang3.StringUtils;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.repository.config.EnableElasticsearchRepositories;

import java.net.InetAddress;

/**
 * Uses ES 2.4.4, because Spring Boot 1.5.x and the current spring-data-elasticsearch release support ES 2.x at most.
 * <p>
 * Created by lijingyao on 2017/5/17 16:32.
 */
@Configuration
@EnableElasticsearchRepositories(basePackages = "com.puregold.ms")
public class SearchConfig {

    // Assumes three nodes (one master, two replicas). In production, set the real IPs (internal or external) in the properties file.
    @Value("${elasticsearch.host1}")
    private String esHost;// master node

    @Value("${elasticsearch.host2:}") 
    private String esHost2;//replica node

    @Value("${elasticsearch.host3:}")
    private String esHost3;//replica node

    @Value("${elasticsearch.port}")
    private int esPort;

    @Value("${elasticsearch.clustername}")
    private String esClusterName;

    @Bean
    public TransportClient transportClient() throws Exception {

        Settings settings = Settings.settingsBuilder()
                .put("cluster.name", esClusterName)
                .build();

        TransportClient transportClient = TransportClient.builder()
                .settings(settings)
                .build()
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(esHost), esPort));
        if (StringUtils.isNotEmpty(esHost2)) {
            transportClient.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(esHost2), esPort));
        }
        if (StringUtils.isNotEmpty(esHost3)) {
            transportClient.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(esHost3), esPort));
        }
        return transportClient;
    }

    @Bean
    public ElasticsearchTemplate elasticsearchTemplate() throws Exception {
        return new ElasticsearchTemplate(transportClient());
    }

}
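The three host properties above are handled by repeated if blocks. As a minimal sketch, the same filtering can be written once for any number of hosts. The class name EsHosts and the sample addresses are hypothetical, and the addresses are left unresolved, so name resolution remains the caller's job.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Collects the configured, non-blank Elasticsearch hosts into socket addresses. */
final class EsHosts {

    private EsHosts() {
    }

    /** Returns one unresolved socket address per non-blank host, preserving order. */
    static List<InetSocketAddress> addresses(int port, String... hosts) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String host : hosts) {
            if (host != null && !host.trim().isEmpty()) {
                result.add(InetSocketAddress.createUnresolved(host.trim(), port));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical values: the second host is empty, as the "${elasticsearch.host2:}" default allows.
        System.out.println(addresses(9300, "10.0.0.1", "", "10.0.0.3"));
    }
}
```

A loop over the resulting list could then replace the repeated addTransportAddress calls in the configuration class.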

Annotation-based example with spring-data-elasticsearch

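Creating indices and documents: just as with JPA's @Entity and @Table, you annotate the document entity class with @Document, and the Elasticsearch index and mapping are created or updated when the Spring Boot application starts.
The example below contains two documents, OrderDocument and DetailOrderDocument, in a parent-child relationship; see the official description: indexing-parent-child. Elasticsearch supports several ways of modeling related documents. When defining a parent-child relationship, note that a child is routed by its parent's id, and both the parent's id and the child's parent-id property must be of type String. Otherwise startup fails with: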

nested exception is java.lang.IllegalArgumentException: Parent ID property should be String
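The reason a child must be routed by its parent's id is that Elasticsearch picks a shard as hash(routing) % number_of_primary_shards, and parent and child have to live on the same shard. The sketch below illustrates this with String.hashCode() as a stand-in for the real hash function; the class name RoutingSketch and the sample ids are hypothetical.

```java
/** Illustrates shard selection by routing value; not the real Elasticsearch hash. */
final class RoutingSketch {

    private RoutingSketch() {
    }

    /** shard = hash(routing) mod numberOfShards, with String.hashCode() as the stand-in hash. */
    static int shardFor(String routing, int numberOfShards) {
        return Math.floorMod(routing.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        int shards = 10;
        String parentId = "order-1001";          // hypothetical parent id
        String childId = "detail-1";             // hypothetical child id

        int parentShard = shardFor(parentId, shards);        // a parent is routed by its own id
        int childShard = shardFor(parentId, shards);         // a child is routed by the parent id
        int shardIfRoutedByOwnId = shardFor(childId, shards); // may differ from the parent's shard

        System.out.println("parent shard: " + parentShard);
        System.out.println("child shard (routed by parent id): " + childShard);
        System.out.println("child shard if routed by its own id: " + shardIfRoutedByOwnId);
    }
}
```

This is also why requests that address a child document directly need a routing value, as discussed in the notes further down.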

OrderDocument

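import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldIndex;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = OrderDocument.INDEX, type = OrderDocument.ORDER_TYPE, refreshInterval = "-1")
public class OrderDocument {

    public static final String INDEX = "orders-test";
    public static final String ORDER_TYPE = "order-document";
    public static final String DETAIL_TYPE = "order-detail-document";

    @Id
    private String id;

    // Order note: not analyzed, but still searchable by exact value
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String note;

    // Order name: tokenized with the ik analyzer
    @Field(type = FieldType.String, searchAnalyzer = "ik", analyzer = "ik")
    private String name;

    // Order price
    @Field(type = FieldType.Long)
    private Long price;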

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getNote() {
        return note;
    }

    public void setNote(String note) {
        this.note = note;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Long getPrice() {
        return price;
    }

    public void setPrice(Long price) {
        this.price = price;
    }
}

DetailOrderDocument

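import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
import org.springframework.data.elasticsearch.annotations.Parent;

@Document(indexName = OrderDocument.INDEX, type = OrderDocument.DETAIL_TYPE, shards = 10, replicas = 2, refreshInterval = "-1")
public class DetailOrderDocument {

    @Id
    private String id;

    // Declares the parent-child relationship to the main order
    @Field(type = FieldType.String, store = true)
    @Parent(type = OrderDocument.ORDER_TYPE)
    private String parentId;

    // Detail order price
    @Field(type = FieldType.Long)
    private Long price;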

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getParentId() {
        return parentId;
    }

    public void setParentId(String parentId) {
        this.parentId = parentId;
    }

    public Long getPrice() {
        return price;
    }

    public void setPrice(Long price) {
        this.price = price;
    }
}

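This creates two document types in the "orders-test" index. The @Id annotation maps to the Elasticsearch _id. It can be generated automatically or assigned when the document is created, but it must be unique.
After startup, you can inspect the index structure with curl -XGET. The result looks like this: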

{
  "orders-test" : {
    "aliases" : { },
    "mappings" : {
      "order-detail-document" : {
        "_parent" : {
          "type" : "order-document"
        },
        "_routing" : {
          "required" : true
        },
        "properties" : {
          "parentId" : {
            "type" : "string",
            "store" : true
          },
          "price" : {
            "type" : "long"
          }
        }
      },
      "order-document" : {
        "properties" : {
          "name" : {
            "type" : "string",
            "analyzer" : "ik"
          },
          "note" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "price" : {
            "type" : "long"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "-1",
        "number_of_shards" : "10",
        "creation_date" : "1511403448676",
        "store" : {
          "type" : "fs"
        },
        "number_of_replicas" : "2",
        "uuid" : "sHA5s7kEQA2AWCAA8-aBlQ",
        "version" : {
          "created" : "2040499"
        }
      }
    },
    "warmers" : { }
  }
}

Also, the code above sets the shards and replicas attributes of @Document, and the resulting index settings show "number_of_shards" : "10" and "number_of_replicas" : "2". If they are not specified, the defaults are number_of_shards=5 and number_of_replicas=1. For other defaults, see the source of public @interface Document.
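As a quick worked example of what these two settings mean for cluster size: every primary shard gets number_of_replicas copies, so the total number of shard copies is shards * (1 + replicas). The class name ShardMath is hypothetical.

```java
/** Arithmetic behind number_of_shards and number_of_replicas. */
final class ShardMath {

    private ShardMath() {
    }

    /** Total shard copies in the cluster: each primary plus its replicas. */
    static int totalShardCopies(int shards, int replicas) {
        return shards * (1 + replicas);
    }

    public static void main(String[] args) {
        // Settings from the example index: 10 primaries, 2 replicas each -> 30 copies
        System.out.println(totalShardCopies(10, 2));
        // Defaults: 5 primaries, 1 replica each -> 10 copies
        System.out.println(totalShardCopies(5, 1));
    }
}
```

With three nodes, as assumed in the configuration class, the example settings place 10 shard copies on each node when they are spread evenly.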

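Auto-generated ids containing special characters
Calling findOne with such an id fails with the error below. Use findById instead; an exact terms query also works. The response is a routing_missing_exception, which suggests that the get-by-id request behind findOne is sent without the routing that a child type requires, whereas findById runs as a search.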

{
  "error" : {
    "root_cause" : [ {
      "type" : "routing_missing_exception",
      "reason" : "routing is required for [XX]/[YY]/[yourid]",
      "index" : "forests"
    } ],
    "type" : "routing_missing_exception",
    "reason" : "routing is required for [XX]/[YY]/[yourid]",
    "index" : "forests"
  },
  "status" : 400
}

Repositories&ElasticsearchTemplate

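Once the documents are defined, data access goes through an interface that extends spring-data-elasticsearch's ElasticsearchRepository. The CrudRepository conventions cover basic query, save, and update operations. Two simple query methods are shown below.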

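import java.util.List;

import org.springframework.data.domain.Sort;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

public interface DetailOrderDocumentRepository extends ElasticsearchRepository<DetailOrderDocument, String> {

    // All detail orders of one parent order, in the given sort order
    List<DetailOrderDocument> findByParentId(String parentId, Sort sort);

    // Lookup by id, executed as a search rather than a get
    DetailOrderDocument findById(String id);
}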

For more complex queries, you can write an @Query on the repository interface, or use ElasticsearchTemplate for more flexible custom queries:

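import org.apache.commons.lang3.StringUtils;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.domain.Sort;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.stereotype.Component;

import static org.elasticsearch.index.query.QueryBuilders.boolQuery;
import static org.elasticsearch.index.query.QueryBuilders.rangeQuery;
import static org.elasticsearch.index.query.QueryBuilders.termQuery;

@Component
public class OrderManager {

    @Autowired
    private ElasticsearchTemplate elasticsearchTemplate;

    public Page<OrderDocument> queryPagedOrders(Integer pageNo, Integer pageSize, String name, Long minPrice, Long maxPrice) {
        // Default sort: price ascending. To support richer sorting, consider collecting all sort rules in one enum.
        // Note that PageRequest page numbers are zero-based.
        Pageable pageable = new PageRequest(pageNo, pageSize, new Sort(new Sort.Order(Sort.Direction.ASC, "price")));

        NativeSearchQueryBuilder nbq = new NativeSearchQueryBuilder()
                .withIndices(OrderDocument.INDEX)
                .withTypes(OrderDocument.ORDER_TYPE)
                .withSearchType(SearchType.DEFAULT)
                .withPageable(pageable);

        BoolQueryBuilder bqb = boolQuery();
        // Match the order name. termQuery does not analyze its input, so name must equal a single indexed token.
        if (StringUtils.isNotEmpty(name)) {
            bqb.must(termQuery("name", name));
        }
        // Price range: minPrice <= price <= maxPrice
        if (minPrice != null && minPrice >= 0) {
            bqb.filter(rangeQuery("price").gte(minPrice));
        }
        if (maxPrice != null && maxPrice >= 0) {
            bqb.filter(rangeQuery("price").lte(maxPrice));
        }

        Page<OrderDocument> page = elasticsearchTemplate.queryForPage(nbq.withQuery(bqb).build(), OrderDocument.class);

        return page;
    }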


}
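For comparison with the HTTP API, the sketch below builds a request body in the REST query DSL that is roughly equivalent to the bool query assembled above. It is an illustration under assumptions: the class name OrderQueryJson is hypothetical, paging and sorting are left out, name is assumed to need no JSON escaping, and the exact JSON the Java builders serialize may differ in form.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds a bool query body (must + filter) as a JSON string. */
final class OrderQueryJson {

    private OrderQueryJson() {
    }

    static String build(String name, Long minPrice, Long maxPrice) {
        List<String> must = new ArrayList<>();
        List<String> filter = new ArrayList<>();
        if (name != null && !name.isEmpty()) {
            // Assumes name contains no quotes or backslashes.
            must.add("{\"term\":{\"name\":\"" + name + "\"}}");
        }
        if (minPrice != null && minPrice >= 0) {
            filter.add("{\"range\":{\"price\":{\"gte\":" + minPrice + "}}}");
        }
        if (maxPrice != null && maxPrice >= 0) {
            filter.add("{\"range\":{\"price\":{\"lte\":" + maxPrice + "}}}");
        }
        return "{\"query\":{\"bool\":{\"must\":[" + String.join(",", must)
                + "],\"filter\":[" + String.join(",", filter) + "]}}}";
    }

    public static void main(String[] args) {
        // Hypothetical search: orders named "phone" priced between 10 and 500.
        System.out.println(build("phone", 10L, 500L));
    }
}
```

The must clause contributes to scoring, while the filter clauses only include or exclude documents, which is why the price range is placed in filter.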

Other notes

Deleting an index that has a parent-child mapping
An ordinary index can be deleted directly with:

curl -XDELETE 'http://yourip:9200/orders-test/?pretty'

With a parent-child mapping, however, rebuilding the index after deletion causes an exception when Spring Boot starts:
can't add a _parent field that points to an already existing type, that isn't already a parent
The fix is to set createIndex = false (the default is true) in @Document, on the parent document only. The index can then be deleted freely and is rebuilt at startup.

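Changing a field's analyzer
The official guidance on updating mappings: mapping-intro
In short, Elasticsearch does not allow changing how an existing mapped field is indexed (an analyzed field cannot be switched to not_analyzed). You can add a new field with its own analyzer (such as ik); otherwise the only option is to delete the index and rebuild it.
Take this field from the example code: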

    @Field(type = FieldType.String, searchAnalyzer = "ik", analyzer = "ik")
    private String name;

Once the index has been created, the name field can no longer be changed to not_analyzed, so decide carefully when first designing the index documents.

Paging and large result sets
Queries that page deep into a large index fail with the following error:

Failed to execute phase [query], all shards failed; shardFailures {[X-XXXXX][YYYY][0]: RemoteTransportException[[your-node][yourip:9300][indices:data/read/search[phase/query]]]; nested:
 QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [99020].
  See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter.]; }{[X-XXXXX][YYYY][1]: RemoteTransportException[[
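The limit in this error applies to from + size, where from is the offset of the requested page. A minimal sketch of the check, assuming zero-based page numbers as in PageRequest; the class name ResultWindow is hypothetical, and the page number 4950 with page size 20 is just one combination that yields the 99020 seen in the message.

```java
/** Checks a page request against the index.max_result_window limit. */
final class ResultWindow {

    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    private ResultWindow() {
    }

    /** from + size for a zero-based page number. */
    static long windowEnd(int pageNo, int pageSize) {
        return (long) pageNo * pageSize + pageSize;
    }

    /** True if the page can be served by a normal search request. */
    static boolean fits(int pageNo, int pageSize, int maxResultWindow) {
        return windowEnd(pageNo, pageSize) <= maxResultWindow;
    }

    public static void main(String[] args) {
        System.out.println(windowEnd(4950, 20));                        // 99020
        System.out.println(fits(4950, 20, DEFAULT_MAX_RESULT_WINDOW));  // false
        System.out.println(fits(499, 20, DEFAULT_MAX_RESULT_WINDOW));   // true: 9980 + 20 = 10000
    }
}
```

Running such a check before calling the search lets the service reject or redirect deep paging requests instead of failing on every shard.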

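The limit can be changed for an index (index_name below) with the following command. It is an index-level setting, but raising it is not recommended because it increases memory pressure on the ES nodes.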

curl -XPUT "http://your_cluster:9200/index_name/_settings" -d '{ "index" : { "max_result_window" : 500000 } }'

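This works around the large result set problem, but the API's performance suffers: average response time goes up by roughly 200-300 ms. The scroll API is the recommended alternative:
Elasticsearch provides scan and scroll for large result sets. In Spring Data Elasticsearch, ElasticsearchTemplate exposes both, as shown below. See also: about scroll.
The search API returns a single "page" of results, whereas the scroll API can retrieve large numbers of results (or even all of them), much like a cursor in a traditional database.
Example usage: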

String scrollId = elasticsearchTemplate.scan(nbq.withQuery(bqb).build(), 1000, false);

Page<OrderDocument> page = elasticsearchTemplate.scroll(scrollId, 2000L, OrderDocument.class);
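The two calls above fetch only the first batch. To read the whole result set, scroll is called repeatedly until a page comes back empty. The sketch below shows that loop in plain Java; the class name ScrollLoop and the fetchNextPage function are hypothetical stand-ins, and in a real service fetchNextPage would wrap elasticsearchTemplate.scroll(scrollId, 2000L, OrderDocument.class) and return the page content.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Cursor-style loop: keep fetching with the scroll id until a page is empty. */
final class ScrollLoop {

    private ScrollLoop() {
    }

    static <T> List<T> drain(String scrollId, Function<String, List<T>> fetchNextPage) {
        List<T> all = new ArrayList<>();
        List<T> page = fetchNextPage.apply(scrollId);
        while (page != null && !page.isEmpty()) {
            all.addAll(page);
            page = fetchNextPage.apply(scrollId);
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake backend with two batches followed by an empty one.
        Iterator<List<String>> batches = Arrays.asList(
                Arrays.asList("order-1", "order-2"),
                Arrays.asList("order-3"),
                Collections.<String>emptyList()).iterator();
        System.out.println(drain("fake-scroll-id", id -> batches.next()));
    }
}
```

For very large indices it is usually better to process each batch inside the loop rather than collect everything into one list.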

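Viewing all node configuration

The result also lists all installed plugins, which is useful for checking whether an analyzer plugin was installed successfully.
Viewing mapping information:

Deleting a child document and adding another:
See the official documentation, Indexing parent and child. Deleting and querying via curl both need the parent id because, as described earlier, a child document is routed by its parent id. Add routing as follows.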

curl -XDELETE 'http://your_cluster:9200/orders-test/order-detail-document/_query?routing=parent_order_id&pretty' -H 'Content-Type: application/json' -d'
{
   "query": {
      "bool": {
         "must": [
               { "term" : 
               { "id" : "detail_order_id" } 
               }
         ]
      }
   }
}
'

The same applies to queries:

http://your_cluster:9200/orders-test/order-detail-document/_search?routing=parent_order_id&q=id:detail_order_id&pretty

Likewise, routing must be specified when adding a document.

Update

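The latest Spring Data Elasticsearch supports Elasticsearch 5.x. It is still a release candidate; once the release version is out, the service will be upgraded and a follow-up post on the upgrade will be published. The current version compatibility information is here:
Spring-data-elasticsearch version matrix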
