Elasticsearch Analytics and Aggregations

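Elasticsearch is good for more than full-text search: its aggregation features are also very handy for analytics. The examples below walk through them.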

1. Preparing the data

{"index":{ "_index": "books", "_type": "IT", "_id": "1" }}
{"id":"1","title":"Java程式設計思想","language":"java","author":"Bruce Eckel","price":70.20,"year":    2007,"description":"Java學習必讀經典,殿堂級著作!贏得了全球程式設計師的廣泛讚譽。"}

{"index":{ "_index": "books", "_type": "IT", "_id": "2" }}
{"id":"2","title":"Java程式效能優化","language":"java","author":"葛一鳴","price":46.50,"year":     2012,"description":"讓你的Java程式更快、更穩定。深入剖析軟體設計層面、程式碼層面、JVM虛擬機器層面的優化方法"}

{"index":{ "_index": "books", "_type": "IT", "_id": "3" }}
{"id":"3","title":"Python科學計算","language":"python","author":"張若愚","price":81.40,"year":    2016,"description":"零基礎學python,光碟中作者獨家整合開發winPython執行環境,涵蓋了Python各個擴充套件庫"}

{"index":{ "_index": "books", "_type": "IT", "_id": "4" }}
{"id":"4","title":"Python基礎教程","language":"python","author":"張若愚","price":54.50,"year": 2014,"description":"經典的Python入門教程,層次鮮明,結構嚴謹,內容翔實"}

{"index":{ "_index": "books", "_type": "IT", "_id": "5" }}
{"id":"5","title":"JavaScript高階程式設計","language":"javascript","author":"Nicholas C.Zakas","price":66.40,"year":2012,"description":"JavaScript技術經典名著"}

Save these five records to books.json, then bulk import them:

curl -XPOST "http://localhost:9200/_bulk?pretty" --data-binary @books.json
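
The bulk file is newline-delimited JSON: each document takes an action line followed by a source line, and the last line must end with a newline. As a rough sketch, a body of that shape could be generated with Python's standard library. The helper name `to_bulk_lines` and the shortened two-record list are illustrative assumptions, not part of the original setup.

```python
import json

# Two of the five records, shortened for illustration.
books = [
    {"id": "1", "title": "Java程式設計思想", "language": "java", "price": 70.20, "year": 2007},
    {"id": "5", "title": "JavaScript高階程式設計", "language": "javascript", "price": 66.40, "year": 2012},
]


def to_bulk_lines(docs, index="books", doc_type="IT"):
    """Build a _bulk request body: one action line, then one source line per document."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The final line must be terminated by a newline as well.
    return "\n".join(lines) + "\n"


bulk_body = to_bulk_lines(books)
print(bulk_body, end="")
```

Redirecting the printed output into books.json would give a file that can be passed to the curl command above with --data-binary.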

2. Group By: counting documents per group

Run the command:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
  "size": 0,
  "aggs": {
    "per_count": {
      "terms": {
        "field": "language"
      }
    }
  }
}'

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "per_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "java",
        "doc_count" : 2
      }, {
        "key" : "python",
        "doc_count" : 2
      }, {
        "key" : "javascript",
        "doc_count" : 1
      } ]
    }
  }
}

Grouped by programming language, there are 2 java books, 2 python books, and 1 javascript book.
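
To cross-check these counts without a cluster, the same grouping can be sketched in plain Python. This only mirrors what the terms aggregation computes on the five sample records; it is not how Elasticsearch implements it.

```python
from collections import Counter

# The "language" field of the five sample books, in document order.
languages = ["java", "java", "python", "python", "javascript"]

# A terms aggregation builds one bucket per distinct value and counts its documents.
buckets = Counter(languages)

for key, doc_count in buckets.most_common():
    print(key, doc_count)
```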

3. Max

Run the following command to find the highest price:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}'

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_price" : {
      "value" : 81.4
    }
  }
}

4. Min

In the same way, find the price of the cheapest book:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
  "size": 0,
  "aggs": {
    "min_price": {
      "min": {
        "field": "price"
      }
    }
  }
}'

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "min_price" : {
      "value" : 46.5
    }
  }
}

5. Average

Group the books by language and compute the average price within each group:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
  "size": 0,
  "aggs": {
    "per_count": {
      "terms": {
        "field": "language"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}'

Result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "per_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "java",
        "doc_count" : 2,
        "avg_price" : {
          "value" : 58.35
        }
      }, {
        "key" : "python",
        "doc_count" : 2,
        "avg_price" : {
          "value" : 67.95
        }
      }, {
        "key" : "javascript",
        "doc_count" : 1,
        "avg_price" : {
          "value" : 66.4
        }
      } ]
    }
  }
}
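
The nested aggregation first buckets the documents by language and then averages the price inside each bucket. A minimal local sketch of that two-step computation, using the (language, price) pairs from the sample data:

```python
from collections import defaultdict

# (language, price) pairs taken from the five sample books.
books = [
    ("java", 70.20),
    ("java", 46.50),
    ("python", 81.40),
    ("python", 54.50),
    ("javascript", 66.40),
]

# Step 1: bucket the prices by language (the outer terms aggregation).
prices_by_language = defaultdict(list)
for language, price in books:
    prices_by_language[language].append(price)

# Step 2: average within each bucket (the inner avg aggregation).
avg_price = {
    language: sum(prices) / len(prices)
    for language, prices in prices_by_language.items()
}

for language, value in avg_price.items():
    print(language, round(value, 2))
```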

6. Sum

Compute the total price of the five books:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "sum_price": {
      "sum": {
        "field": "price"
      }
    }
  }
}'

Result:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "sum_price" : {
      "value" : 319.0
    }
  }
}

7. Basic statistics

The stats aggregation returns the count, minimum, maximum, average, and sum of a field in a single request:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
  "size": 0,
  "aggs": {
    "grades_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}'

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "grades_stats" : {
      "count" : 5,
      "min" : 46.5,
      "max" : 81.4,
      "avg" : 63.8,
      "sum" : 319.0
    }
  }
}
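
The five numbers in this response can be reproduced from the sample prices with Python built-ins. This is a quick sanity check, not a description of how Elasticsearch computes them across shards.

```python
# Prices of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

stats = {
    "count": len(prices),
    "min": min(prices),
    "max": max(prices),
    "avg": sum(prices) / len(prices),
    "sum": sum(prices),
}

for name, value in stats.items():
    print(name, round(value, 2))
```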

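8. Extended statistics

The extended_stats aggregation additionally returns the sum of squares, variance, standard deviation, and standard deviation bounds: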

curl -XPOST "http://localhost:9200/books/_search?pretty" -d'
{
  "size": 0,
  "aggs": {
    "grades_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}
'

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "grades_stats" : {
      "count" : 5,
      "min" : 46.5,
      "max" : 81.4,
      "avg" : 63.8,
      "sum" : 319.0,
      "sum_of_squares" : 21095.46,
      "variance" : 148.65199999999967,
      "std_deviation" : 12.19229264740638,
      "std_deviation_bounds" : {
        "upper" : 88.18458529481276,
        "lower" : 39.41541470518724
      }
    }
  }
}
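
The extra fields are consistent with a population variance (dividing by n, not n - 1) and with bounds placed two standard deviations on either side of the average. The sketch below recomputes them from the sample prices under those assumptions:

```python
import math

# Prices of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

n = len(prices)
avg = sum(prices) / n
sum_of_squares = sum(p * p for p in prices)

# Population variance: mean of the squares minus the square of the mean.
variance = sum_of_squares / n - avg ** 2
std_deviation = math.sqrt(variance)

# Bounds at avg +/- 2 standard deviations.
upper = avg + 2 * std_deviation
lower = avg - 2 * std_deviation

print(round(sum_of_squares, 2), round(variance, 3), round(std_deviation, 4))
print(round(upper, 4), round(lower, 4))
```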

9. Percentiles

Compute the default percentiles (1, 5, 25, 50, 75, 95, 99) of the year field:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
    "size": 0,
    "aggs": {
        "load_time_outlier": {
            "percentiles": {
                "field": "year"
            }
        }
    }
}
'

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "load_time_outlier" : {
      "values" : {
        "1.0" : 2007.2,
        "5.0" : 2008.0000000000002,
        "25.0" : 2012.0,
        "50.0" : 2012.0,
        "75.0" : 2014.0,
        "95.0" : 2015.6000000000001,
        "99.0" : 2015.92
      }
    }
  }
}
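
Elasticsearch computes percentiles with an approximation algorithm, so results on large data sets are estimates. For these five values, the output above happens to agree with plain linear interpolation between the sorted years, which the following sketch reproduces. The `percentile` helper is an illustrative assumption, not an Elasticsearch function.

```python
def percentile(sorted_values, percent):
    """Linear interpolation between the two nearest ranks of a sorted list."""
    position = (percent / 100) * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    fraction = position - low
    return sorted_values[low] + fraction * (sorted_values[high] - sorted_values[low])


# The "year" field of the five sample books, sorted ascending.
years = sorted([2007, 2012, 2016, 2014, 2012])

values = {p: percentile(years, p) for p in (1, 5, 25, 50, 75, 95, 99)}

for p, value in values.items():
    print(p, round(value, 2))
```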

10. Range buckets

Count the books priced below 50, from 50 up to 80, and 80 or above:

curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{
    "size": 0,
    "aggs": {
        "price_ranges": {
            "range": {
                "field": "price",
                "ranges": [{
                    "to": 50
                }, {
                    "from": 50,
                    "to": 80
                }, {
                    "from": 80
                }]
            }
        }
    }
}
'

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_ranges" : {
      "buckets" : [ {
        "key" : "*-50.0",
        "to" : 50.0,
        "to_as_string" : "50.0",
        "doc_count" : 1
      }, {
        "key" : "50.0-80.0",
        "from" : 50.0,
        "from_as_string" : "50.0",
        "to" : 80.0,
        "to_as_string" : "80.0",
        "doc_count" : 3
      }, {
        "key" : "80.0-*",
        "from" : 80.0,
        "from_as_string" : "80.0",
        "doc_count" : 1
      } ]
    }
  }
}
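
In a range aggregation each bucket includes its from value and excludes its to value, so a book priced exactly 50 or 80 would land in the higher bucket. A minimal sketch of that rule applied to the sample prices; the `in_range` helper is an illustrative name:

```python
# Prices of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

# (from, to) pairs; None means the bucket is open on that side.
ranges = [(None, 50), (50, 80), (80, None)]


def in_range(value, lower, upper):
    """Lower bound inclusive, upper bound exclusive."""
    return (lower is None or value >= lower) and (upper is None or value < upper)


doc_counts = [sum(in_range(p, lower, upper) for p in prices) for lower, upper in ranges]

for (lower, upper), count in zip(ranges, doc_counts):
    print(lower, upper, count)
```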