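Getting Started with Elasticsearch, Part 2 (Queries and Spring Boot Integration)
Initialization
First, load the JSON file downloaded from the official website into ES with the following commands: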
curl -H "Content-Type: application/json" -XPOST 'localhost:9200/bank/account/_bulk?pretty&refresh' --data-binary "@accounts.json"
curl 'localhost:9200/_cat/indices?v'
search API
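Now we can start querying. There are two ways to run a search: pass the search parameters in the REST request URI, or put them in the REST request body. The request-body form is more expressive and also easier to read.
Let's look at the URI form first. The following statement returns all documents in the bank index.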
GET /bank/_search?q=*&sort=account_number:asc&pretty
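In the parameter list, q=* matches all documents, sort=account_number:asc sorts the results by account_number in ascending order, and pretty asks for the response to be returned as pretty-printed JSON.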
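As a rough illustration of how the URI form is put together, the sketch below assembles the same path and query string with Python's standard library. The helper name build_search_uri is made up for this example, and pretty is written as pretty=true here, which is an assumption of the sketch rather than something the request above requires.

```python
from urllib.parse import urlencode


def build_search_uri(index, **params):
    """Return the path and query string for a URI search on `index`.

    Hypothetical helper for illustration only; it does not send a request.
    """
    # Keep '*' and ':' readable instead of percent-encoding them.
    return f"/{index}/_search?" + urlencode(params, safe="*:")


uri = build_search_uri("bank", q="*", sort="account_number:asc", pretty="true")
print(uri)
```

Every search option becomes one more key=value pair in the URI, which is why longer queries quickly become hard to read in this form.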
Here is the response; the explanations are in the comments:
{
  "took" : 63,
  // whether the search timed out
  "timed_out" : false,
  // how many shards were searched
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  // search results
  "hits" : {
    // total number of documents matching the search
    "total" : 1000,
    "max_score" : null,
    // array of results; the first 10 are returned by default
    "hits" : [ {
      "_index" : "bank",
      "_type" : "account",
      "_id" : "0",
      // sort key
      "sort": [0],
      "_score" : null,
      "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"[email protected]","city":"Hobucken","state":"CO"}
    }, {
      "_index" : "bank",
      "_type" : "account",
      "_id" : "1",
      "sort": [1],
      "_score" : null,
      "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"[email protected]","city":"Brogan","state":"IL"}
    }, ...
    ]
  }
}
The same search can be sent using the request body:
GET /bank/_search
{
"query": { "match_all": {} },
"sort": [
{ "account_number": "asc" }
]
}
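The result is the same.
Additional parameters control how many results are returned:
// Return a single document
GET /bank/_search
{
  "query": { "match_all": {} },
  "size": 1
}
// Return 10 documents starting at offset 10 (the 11th through the 20th)
GET /bank/_search
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}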
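Since from is a zero-based offset, turning a page number into a request body is simple arithmetic. The sketch below shows one way to do it; page_body and its parameters are names invented for this example, not part of any Elasticsearch client.

```python
import json


def page_body(page, page_size=10):
    """Build a match_all request body for a zero-based page number.

    Hypothetical helper: `from` is the offset of the first hit to return,
    `size` is the number of hits to return.
    """
    if page < 0 or page_size < 0:
        raise ValueError("page and page_size must be non-negative")
    return {
        "query": {"match_all": {}},
        "from": page * page_size,
        "size": page_size,
    }


# The second page (page index 1) skips the first 10 hits.
print(json.dumps(page_body(1)))
```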
The following sorts the results by balance in descending order:
GET /bank/_search
{
"query": { "match_all": {} },
"sort": { "balance": { "order": "desc" } }
}
By default, _source contains the complete document. If we don't want every field of the document returned, we can use the following:
GET /bank/_search
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"]
}
Here is the response:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 999,
"max_score": 1,
"hits": [
{
"_index": "bank",
"_type": "account",
"_id": "25",
"_score": 1,
"_source": {
"account_number": 25,
"balance": 40540
}
}
]
}
}
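The effect of the _source parameter on each hit can be pictured as a projection of the stored document onto the requested fields. The sketch below only models that effect on a plain dictionary. filter_source is a made-up name, and apart from account_number and balance the sample field values are invented.

```python
def filter_source(source, fields):
    """Keep only the requested top-level fields of a document.

    Toy model of the result of "_source": [...]; not how Elasticsearch
    implements source filtering.
    """
    return {name: source[name] for name in fields if name in source}


# account_number and balance come from the response above;
# the remaining values are invented for illustration.
account = {
    "account_number": 25,
    "balance": 40540,
    "city": "Exampleville",
    "state": "XX",
}

print(filter_source(account, ["account_number", "balance"]))
```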
Next, filtering by field. The following selects the account whose account_number is 20:
GET /bank/_search
{
"query": { "match": { "account_number": 20 } }
}
The following returns documents whose address contains the term mill or the term lane:
GET /bank/_search
{
"query": { "match": { "address": "mill lane" } }
}
To match only addresses that contain the phrase mill lane:
GET /bank/_search
{
"query": { "match_phrase": { "address": "mill lane" } }
}
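The difference between match and match_phrase can be sketched with a toy model that lowercases the text and splits it on whitespace. This is a simplification: the real analysis and scoring in Elasticsearch are more involved. The function names, and the two sample addresses containing "Mill", are illustrative assumptions.

```python
def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """True if the field shares at least one term with the query (OR semantics)."""
    field_terms = set(tokens(field_value))
    return any(term in field_terms for term in tokens(query))


def match_phrase(field_value, query):
    """True if the query terms occur in the field consecutively and in order."""
    field_terms, query_terms = tokens(field_value), tokens(query)
    n = len(query_terms)
    return n > 0 and any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )


# The first two addresses are illustrative; the last two appear in the response above.
addresses = ["198 Mill Lane", "990 Mill Road", "880 Holmes Lane", "244 Columbus Place"]

print([a for a in addresses if match(a, "mill lane")])
print([a for a in addresses if match_phrase(a, "mill lane")])
```

In this model, match keeps any address containing mill or lane, while match_phrase keeps only the address where the two words sit next to each other in that order.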
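Next, the bool query.
The following bool query returns documents whose address contains both mill and lane. Unlike the match_phrase query above, the two terms do not have to be adjacent: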
GET /bank/_search
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
must means that every listed query must match. Now consider the following:
GET /bank/_search
{
"query": {
"bool": {
"should": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
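should means that at least one of the listed queries must match (this holds when, as here, the bool query has no must clause).
The following requires that the address contain neither mill nor lane: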
GET /bank/_search
{
"query": {
"bool": {
"must_not": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
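must_not requires that a document match none of the listed queries.
The clauses can be combined. The following returns accounts whose age is 40 and whose state is not ID: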
GET /bank/_search
{
"query": {
"bool": {
"must": [
{ "match": { "age": "40" } }
],
"must_not": [
{ "match": { "state": "ID" } }
]
}
}
}
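The way must, should and must_not combine can be sketched as plain boolean logic over predicates. This toy evaluator ignores scoring and the filter clause, and the names bool_query and term_in as well as the sample documents are assumptions made for the example.

```python
def term_in(field, term):
    """Return a predicate that checks whether `term` is one of the field's words."""
    return lambda doc: term.lower() in str(doc[field]).lower().split()


def bool_query(doc, must=(), should=(), must_not=()):
    """Evaluate bool-style clauses, each a predicate taking a document."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # With no must clause, at least one should clause has to match.
    if should and not must:
        return any(clause(doc) for clause in should)
    return True


# Invented sample documents.
accounts = [
    {"id": 1, "age": 40, "state": "ID"},
    {"id": 2, "age": 40, "state": "CO"},
    {"id": 3, "age": 29, "state": "CO"},
]

# Same shape as the combined query above: age is 40 and state is not ID.
selected = [
    doc["id"]
    for doc in accounts
    if bool_query(doc, must=[term_in("age", "40")], must_not=[term_in("state", "ID")])
]
print(selected)
```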
We can use a filter to find accounts whose balance is between 20000 and 30000, inclusive:
GET /bank/_search
{
"query": {
"bool": {
"must": { "match_all": {} },
"filter": {
"range": {
"balance": {
"gte": 20000,
"lte": 30000
}
}
}
}
}
}
Note that the match query inside must does not support gte and lte; a range condition has to be expressed with the range query.
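As a small sanity check on what gte and lte mean, the sketch below applies the same inclusive bounds to a list of numbers. in_range is a hypothetical helper, and only the first and last balances come from the sample response earlier; the others are invented to sit on and between the bounds.

```python
def in_range(value, gte=None, lte=None):
    """True if value satisfies the given inclusive lower and upper bounds."""
    return (gte is None or value >= gte) and (lte is None or value <= lte)


balances = [16623, 20000, 25000, 30000, 39225]

# Both bounds are inclusive, so 20000 and 30000 themselves are kept.
print([b for b in balances if in_range(b, gte=20000, lte=30000)])
```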
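Grouping: in addition to the hits, ES can return an aggregations section in the response, and the aggs parameter specifies how the documents are grouped. Setting size to 0 suppresses the hits so that only the aggregation results are returned. For example: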
GET /bank/_search
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state.keyword"
}
}
}
}
The statement above is roughly equivalent to the following SQL:
SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC
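The same grouping can be sketched in a few lines of Python, which may help when reading the buckets in the response. The function name and sample documents are invented, and the size default of 10 mirrors the default number of buckets a terms aggregation returns.

```python
from collections import Counter


def group_by_count(docs, field, size=10):
    """Count documents per distinct value of `field`, largest groups first."""
    counts = Counter(doc[field] for doc in docs)
    return counts.most_common(size)


# Invented sample documents.
accounts = [
    {"state": "CO"}, {"state": "IL"}, {"state": "CO"},
    {"state": "TX"}, {"state": "IL"}, {"state": "CO"},
]

print(group_by_count(accounts, "state"))
```

Each (value, count) pair plays the role of one bucket, with the value as key and the count as doc_count.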
The following computes the average balance for each state:
GET /bank/_search
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state.keyword"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}
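Note that aggs is used twice: the average_balance aggregation is nested inside group_by_state. Nesting aggregations this way lets us compute further summaries over each bucket of the outer aggregation.
Here is a further example of nested aggs: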
GET /bank/_search
{
"size": 0,
"aggs": {
"group_by_age": {
"range": {
"field": "age",
"ranges": [
{
"from": 20,
"to": 30
},
{
"from": 30,
"to": 40
},
{
"from": 40,
"to": 50
}
]
},
"aggs": {
"group_by_gender": {
"terms": {
"field": "gender.keyword"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}
}
}
This query first groups the documents by age bracket, then by gender, and finally computes the average balance for each group. The response is:
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 999,
"max_score": 0,
"hits": []
},
"aggregations": {
"group_by_age": {
"buckets": [
{
"key": "20.0-30.0",
"from": 20,
"to": 30,
"doc_count": 450,
"group_by_gender": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "M",
"doc_count": 231,
"average_balance": {
"value": 27400.982683982686
}
},
{
"key": "F",
"doc_count": 219,
"average_balance": {
"value": 25341.260273972603
}
}
]
}
},
{
"key": "30.0-40.0",
"from": 30,
"to": 40,
"doc_count": 504,
"group_by_gender": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "F",
"doc_count": 253,
"average_balance": {
"value": 25670.869565217392
}
},
{
"key": "M",
"doc_count": 251,
"average_balance": {
"value": 24288.239043824702
}
}
]
}
},
{
"key": "40.0-50.0",
"from": 40,
"to": 50,
"doc_count": 45,
"group_by_gender": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "M",
"doc_count": 24,
"average_balance": {
"value": 26474.958333333332
}
},
{
"key": "F",
"doc_count": 21,
"average_balance": {
"value": 27992.571428571428
}
}
]
}
}
]
}
}
}
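To make the nesting easier to follow, here is a rough Python equivalent of the three levels: age bracket, then gender, then average balance. It is a sketch over in-memory dictionaries rather than what Elasticsearch does internally. The function name is made up, the bucket keys are written as "20-30" instead of "20.0-30.0", and the last two sample documents are invented.

```python
from collections import defaultdict


def average_balance_by_age_and_gender(docs, ranges):
    """Group docs by age range, then by gender, and average the balances."""
    buckets = {}
    for low, high in ranges:
        balances_by_gender = defaultdict(list)
        for doc in docs:
            # As in the range aggregation, `from` is inclusive and `to` is exclusive.
            if low <= doc["age"] < high:
                balances_by_gender[doc["gender"]].append(doc["balance"])
        buckets[f"{low}-{high}"] = {
            gender: sum(values) / len(values)
            for gender, values in balances_by_gender.items()
        }
    return buckets


accounts = [
    {"age": 29, "gender": "F", "balance": 16623},  # from the first sample response
    {"age": 32, "gender": "M", "balance": 39225},  # from the first sample response
    {"age": 30, "gender": "M", "balance": 1000},   # invented
    {"age": 25, "gender": "F", "balance": 20000},  # invented
]

result = average_balance_by_age_and_gender(accounts, [(20, 30), (30, 40), (40, 50)])
print(result)
```

The account aged 30 lands in the 30-40 bracket rather than 20-30, because the upper bound of each range is exclusive.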
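Integrating Elasticsearch with Spring Boot
Spring Boot uses spring-data-elasticsearch, but at the time of writing its latest release does not yet support ES 5, so we test against an older ES version. The ES version used is 2.3.2, with spring-data-elasticsearch 2.1.0, spring-boot 1.5.1, and spring-boot-starter-data-elasticsearch 1.5.1.RELEASE.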
- pom.xml
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
<version>1.5.1.RELEASE</version>
</dependency>
- application.properties
# ES
spring.data.elasticsearch.repositories.enabled = true
spring.data.elasticsearch.cluster-nodes = 127.0.0.1:9300
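- Entity class (Account)
Note that neither indexName nor type may contain uppercase letters; otherwise an error is raised.
import java.io.Serializable;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
@Document(indexName = "bank", type = "account")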
public class Account implements Serializable{
@Id
private Long id;
private Integer account_number;
private Long balance;
private String firstname;
private String lastname;
private Integer age;
private String gender;
private String address;
private String employer;
private String email;
private String city;
private String state;
// getters and setters omitted
}
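- Repository for accessing ES
This is very simple: just extend ElasticsearchRepository.
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
public interface AccountRepository extends ElasticsearchRepository<Account,Long> {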
}
- service
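Note that when saving, if the index for the document does not exist yet, ES creates it automatically. The id must be set explicitly when saving a document; otherwise ES uses null as the document id.
import java.util.List;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.data.elasticsearch.core.query.SearchQuery;
import org.springframework.stereotype.Service;
@Service
public class AccountServiceEsImpl {
    @Autowired
    AccountRepository accountRepository;
    /**
     * Save an account and return its id.
     */
    public Long save(Account account) {
        Account accountSaved = accountRepository.save(account);
        return accountSaved.getId();
    }
    /**
     * Filter accounts by address.
     * @return the first page (up to 10) of accounts whose address matches "Beijing"
     */
    public List<Account> queryByAddress() {
        // First page, 10 results per page
        Pageable page = new PageRequest(0, 10);
        // bool query with a single must clause on the address field
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
        queryBuilder.must(QueryBuilders.matchQuery("address", "Beijing"));
        SearchQuery query =
            new NativeSearchQueryBuilder().withQuery(queryBuilder).withPageable(page).build();
        Page<Account> pages = accountRepository.search(query);
        return pages.getContent();
    }
}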