Python Elasticsearch: grouped statistics
阿新 • Published: 2018-12-07
Aggregations:
query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"company_id": company_id}},
                {"term": {"subject_type": 2}},
            ],
            "must_not": [],
            "should": [],
        }
    },
    "size": 0,  # number of hits to return (0 = aggregation results only)
    "aggs": {
        "group_by_real_name": {
            "terms": {"field": "real_name.keyword"},
            "aggs": {  # nested (sub-)aggregation
                "group_by_screen_id": {
                    "terms": {"field": "screen_id"},
                }
            },
        }
    },
}
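To read the result of the query above, it may help to see how the nested buckets come back. The sketch below assumes the query was sent with something like `es.search(index=..., body=query)` from the elasticsearch Python client, and that the response has the usual shape: each terms aggregation returns a list of buckets with `key` and `doc_count`, and a nested aggregation appears inside each bucket under its own name. The helper name `flatten_buckets` and the sample response are hypothetical, made up for illustration.

```python
def flatten_buckets(response):
    """Flatten nested terms buckets into (real_name, screen_id, doc_count) rows."""
    rows = []
    outer = response["aggregations"]["group_by_real_name"]["buckets"]
    for name_bucket in outer:
        # The sub-aggregation sits inside each outer bucket under its own name.
        for screen_bucket in name_bucket["group_by_screen_id"]["buckets"]:
            rows.append(
                (name_bucket["key"], screen_bucket["key"], screen_bucket["doc_count"])
            )
    return rows


# Hypothetical response, trimmed to the fields used here.
sample_response = {
    "aggregations": {
        "group_by_real_name": {
            "buckets": [
                {
                    "key": "Alice",
                    "doc_count": 3,
                    "group_by_screen_id": {
                        "buckets": [
                            {"key": 1, "doc_count": 2},
                            {"key": 2, "doc_count": 1},
                        ]
                    },
                }
            ]
        }
    }
}

for row in flatten_buckets(sample_response):
    print(row)
```

Note that a terms aggregation returns only the top 10 buckets by default; adding a `"size"` entry inside `"terms"` raises that limit if more groups are expected.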
Setting fielddata=true:
Run this in a system terminal:
curl -i -H "Content-Type:application/json" -XPUT 127.0.0.1:9200/your_index/_mapping/your_type/?pretty -d'{"your_type":{"properties":{"your_field_name":{"type":"text","fielddata":true}}}}'
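Replace the placeholders above (the index, type, and field names) with your own. On Linux the command seems to work as is; on Windows you apparently need to install curl first. I have never actually set fielddata=true myself; I used real_name.keyword as shown above. For example: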
curl -i -H "Content-Type:application/json" -XPUT 127.0.0.1:9200/event/_mapping/koala-index/?pretty -d'{"koala-index":{"properties":{"real_name":{"type":"text","fielddata":true}}}}'
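The same request body can also be built in Python rather than typed by hand, which avoids quoting mistakes in the shell. This is only a sketch: `build_fielddata_mapping` is a hypothetical helper, and the commented-out `put_mapping` call assumes an older elasticsearch-py client that still accepts mapping types (`doc_type`), matching the typed URL in the curl command.

```python
import json


def build_fielddata_mapping(doc_type, field_name):
    """Build the body sent by the curl command for one text field."""
    return {
        doc_type: {
            "properties": {
                field_name: {"type": "text", "fielddata": True}
            }
        }
    }


body = build_fielddata_mapping("koala-index", "real_name")
print(json.dumps(body))

# Assumption, not run here: with a client and cluster that still support
# mapping types, the call would look roughly like
# es.indices.put_mapping(index="event", doc_type="koala-index", body=body)
```

Enabling fielddata on a text field loads its terms into heap memory, which is why aggregating on the `.keyword` sub-field, as the article does, is usually the lighter option.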
Distinct counts:
query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"company_id": company_id}},
            ],
            "must_not": [],
            "should": [],
        }
    },
    "aggs": {
        "group_by_screen_id": {
            "terms": {"field": "screen_id"},
            "aggs": {
                "group_by_subject_type": {
                    "terms": {"field": "subject_type"},
                    "aggs": {
                        "distinct_subject_ids": {  # distinct count
                            "cardinality": {"field": "subject_id"}
                        }
                    },
                }
            },
        }
    },
    "sort": [
        {"timestamp": {"order": "desc"}}
    ],
    "from": (page - 1) * size,
    "size": size,  # hits per page; use 0 to return only the aggregation results
}
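A sketch of pulling the distinct counts out of the response, under the same assumption that the query was sent with the elasticsearch Python client and the response has the standard aggregation shape. A cardinality aggregation reports its result under `value`, and that number is an approximation rather than an exact count. The helper names `distinct_counts` and `page_offset` and the sample response are hypothetical.

```python
def distinct_counts(response):
    """Map (screen_id, subject_type) to the approximate distinct subject_id count."""
    counts = {}
    for screen in response["aggregations"]["group_by_screen_id"]["buckets"]:
        for subject in screen["group_by_subject_type"]["buckets"]:
            key = (screen["key"], subject["key"])
            counts[key] = subject["distinct_subject_ids"]["value"]
    return counts


def page_offset(page, size):
    """Offset for "from": pages are 1-based, so page 1 starts at offset 0."""
    return (page - 1) * size


# Hypothetical response, trimmed to the fields used here.
sample_response = {
    "aggregations": {
        "group_by_screen_id": {
            "buckets": [
                {
                    "key": 7,
                    "doc_count": 5,
                    "group_by_subject_type": {
                        "buckets": [
                            {"key": 0, "doc_count": 3, "distinct_subject_ids": {"value": 2}},
                            {"key": 2, "doc_count": 2, "distinct_subject_ids": {"value": 1}},
                        ]
                    },
                }
            ]
        }
    }
}

print(distinct_counts(sample_response))
print(page_offset(3, 20))
```

The `from` and `size` values page through the matching hits only; the aggregation buckets are computed over all documents matched by the query regardless of the page requested.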