1. 首页
  2. elasticsearch教程

15-十五、Elasticsearch 教程: 聚合计算

聚合框架用于收集搜索查询选择的所有数据。该框架由许多构建块组成,有助于构建复杂的数据摘要

下面的 JSON 对象使用聚合函数的一般请求正文格式


"aggregations" : { "<aggregation_name>" : { "<aggregation_type>" : { <aggregation_body> } [,"meta" : { [<meta_data_body>] } ]? [,"aggregations" : { [<sub_aggregation>]+ } ]? } }

Elasticsearch 提供了大量的聚合函数,它们都有各自不同的目的

矩阵聚合 ( Metrics )

这些聚合函数可以根据聚合文档的字段值计算度量值,而且有时可以从脚本生成一些值

数字矩阵既可以是单值,也可以是平均聚合或多值统计等

平均数聚合 ( avg )

该聚合函数用于计算文档中出现的任何数字字段的平均值

例如


POST http://localhost:9200/user_admin/_search?pretty

请求正文


{ "aggs":{ "avg_money":{"avg":{"field":"money"}} } }

返回响应结果


{ "took" : 160, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 3, "max_score" : 1.0, "hits" : [ { "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "雅少", "description" : "虚怀若谷", "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044", "location" : [ 104.094537, 30.640174 ], "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "站长", "description" : "搜云库技术团队 ,教程 ", "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007", "location" : [ 116.432727, 39.937732 ], "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0, "_source" : { "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发", "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083", "location" : [ 116.346346, 39.999333 ], "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9" } } ] }, "aggregations" : { "avg_money" : { "value" : 1780321.6666666667 } } }

如果一个或多个聚合文档中不存在此值,默认情况下它们会被忽略

我们可以在聚合中添加缺失字段来设置缺失字段的默认值


{ "aggs":{ "avg_money":{ "avg":{ "field":"money" "missing":0 } } } }

基数聚合 ( cardinality )

基数聚合 ( cardinality ) 用于计算特定字段的不同值的计数

例如


POST http://localhost:9200/user*/_search?pretty

请求正文


{ "aggs":{ "distinct_nickname_count":{"cardinality":{"field":"nickname"}} } }

响应内容


{ "error" : { "root_cause" : [ { "type" : "illegal_argument_exception", "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead." } ], "type" : "search_phase_execution_exception", "reason" : "all shards failed", "phase" : "query", "grouped" : true, "failed_shards" : [ { "shard" : 0, "index" : "user", "node" : "4zwAMlTzRCaioBeOE9PaNw", "reason" : { "type" : "illegal_argument_exception", "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead." } } ], "caused_by" : { "type" : "illegal_argument_exception", "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.", "caused_by" : { "type" : "illegal_argument_exception", "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [nickname] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead." } } }, "status" : 400 }

很明显响应出错了,提示 Fielddata 要单独加载

好吧,那我们先运行下面的请求来修改下


PUT http://localhost:9200/user*/_mapping/user

请求正文


{ "properties": { "nickname": { "type": "text", "fielddata": true } } }

响应内容


{"acknowledged":true}

然后重新发起刚刚报错的请求,响应如下


{ "took" : 186, "timed_out" : false, "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 1.0, "hits" : [ { "_index" : "user", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "枫晚", "description" : "停车坐爰枫林晚", "street" : "苏州大学", "city" : "Suzhou", "state" : "Jiangsu", "zip" : "215006", "location" : [ 120.65426, 31.30797 ], "money" : 10235, "tags" : [ "Java", "Android" ], "vitality" : "3.5" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "雅少", "description" : "虚怀若谷", "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044", "location" : [ 104.094537, 30.640174 ], "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8" } }, { "_index" : "user", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "question", "description" : "问题少年也是少年", "street" : "张江高科技园区", "city" : "Shanghai", "state" : "Shanghai", "zip" : "201204", "location" : [ 121.60632, 31.199305 ], "money" : 13648, "tags" : [ "VUE", "HTML" ], "vitality" : "8.8" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "站长", "description" : "搜云库技术团队 ,教程 ", "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007", "location" : [ 116.432727, 39.937732 ], "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0, "_source" : { "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发", "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083", "location" : [ 116.346346, 39.999333 ], "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9" } } ] }, "aggregations" : { "distinct_nickname_count" : { "value" : 9 } } }

扩展统计聚合 ( extended_stats )

此聚合用于生成有关聚合文档中特定数字字段的所有统计信息

例如


POST http://localhost:9200/user_admin/user/_search?pretty

请求正文


{ "aggs" : { "money_stats" : { "extended_stats" : { "field" : "money" } } } }

响应内容


{ "took" : 12, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 3, "max_score" : 1.0, "hits" : [ { "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "雅少", "description" : "虚怀若谷", "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044", "location" : [ 104.094537, 30.640174 ], "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "站长", "description" : "搜云库技术团队 ,教程 ", "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007", "location" : [ 116.432727, 39.937732 ], "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0, "_source" : { "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发", "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083", "location" : [ 116.346346, 39.999333 ], "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9" } } ] }, "aggregations" : { "money_stats" : { "count" : 3, "min" : 68023.0, "max" : 5201814.0, "avg" : 1780321.6666666667, "sum" : 5340965.0, "sum_of_squares" : 2.7068555211509E13, "variance" : 5.853306500366889E12, "std_deviation" : 2419360.762756743, "std_deviation_bounds" : { "upper" : 6619043.192180153, "lower" : -3058399.858846819 } } } }

最大值聚合 ( max )

最大值聚合用于查找聚合文档中特定数字字段的最大值

例如


POST http://localhost:9200/user*/_search

请求正文


{ "aggs" : { "max_money" : { "max" : { "field" : "money" } } } }

响应内容


{ "took" : 22, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 3, "max_score" : 1.0, "hits" : [ { "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "雅少", "description" : "虚怀若谷", "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044", "location" : [ 104.094537, 30.640174 ], "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "站长", "description" : "搜云库技术团队 ,教程 ", "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007", "location" : [ 116.432727, 39.937732 ], "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0, "_source" : { "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发", "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083", "location" : [ 116.346346, 39.999333 ], "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9" } } ] }, "aggregations" : { "max_money" : { "value" : 5201814.0 } } }

最小值聚合 ( min )

最小值聚合用于查找聚合文档中特定数字字段的最小值

例如


POST http://localhost:9200/user*/_search?pretty

请求正文


{ "aggs" : { "min_money" : { "min" : { "field" : "money" } } } }

响应内容


{ "took" : 20, "timed_out" : false, "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 1.0, "hits" : [ { "_index" : "user", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "枫晚", "description" : "停车坐爰枫林晚", "street" : "苏州大学", "city" : "Suzhou", "state" : "Jiangsu", "zip" : "215006", "location" : [ 120.65426, 31.30797 ], "money" : 10235, "tags" : [ "Java", "Android" ], "vitality" : "3.5" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "雅少", "description" : "虚怀若谷", "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044", "location" : [ 104.094537, 30.640174 ], "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8" } }, { "_index" : "user", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "question", "description" : "问题少年也是少年", "street" : "张江高科技园区", "city" : "Shanghai", "state" : "Shanghai", "zip" : "201204", "location" : [ 121.60632, 31.199305 ], "money" : 13648, "tags" : [ "VUE", "HTML" ], "vitality" : "8.8" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "站长", "description" : "搜云库技术团队 ,教程 ", "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007", "location" : [ 116.432727, 39.937732 ], "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0, "_source" : { "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发", "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083", "location" : [ 116.346346, 39.999333 ], "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9" } } ] }, "aggregations" : { "min_money" : { "value" : 10235.0 } } }

求和聚合 ( sum )

求和聚合 ( sum ) 用于计算聚合文档中特定数字字段的总和

例如


POST http://localhost:9200/user*/_search?pretty

请求正文


{ "aggs" : { "total_money" : { "sum" : { "field" : "money" } } } }

返回响应


{ "took" : 10, "timed_out" : false, "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 1.0, "hits" : [ { "_index" : "user", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "枫晚", "description" : "停车坐爰枫林晚", "street" : "苏州大学", "city" : "Suzhou", "state" : "Jiangsu", "zip" : "215006", "location" : [ 120.65426, 31.30797 ], "money" : 10235, "tags" : [ "Java", "Android" ], "vitality" : "3.5" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "2", "_score" : 1.0, "_source" : { "nickname" : "雅少", "description" : "虚怀若谷", "street" : "四川大学", "city" : "Chengdu", "state" : "Sichuan", "zip" : "610044", "location" : [ 104.094537, 30.640174 ], "money" : 68023, "tags" : [ "Python", "HTML" ], "vitality" : "7.8" } }, { "_index" : "user", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "question", "description" : "问题少年也是少年", "street" : "张江高科技园区", "city" : "Shanghai", "state" : "Shanghai", "zip" : "201204", "location" : [ 121.60632, 31.199305 ], "money" : 13648, "tags" : [ "VUE", "HTML" ], "vitality" : "8.8" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "1", "_score" : 1.0, "_source" : { "nickname" : "站长", "description" : "搜云库技术团队 ,教程 ", "street" : "东四十条", "city" : "Beijing", "state" : "Beijing", "zip" : "100007", "location" : [ 116.432727, 39.937732 ], "money" : 5201814, "tags" : [ "PHP", "Python" ], "vitality" : "9.0" } }, { "_index" : "user_admin", "_type" : "user", "_id" : "3", "_score" : 1.0, "_source" : { "nickname" : "歌者", "description" : "程序设计也是设计,研发新菜也是研发", "street" : "五道口", "city" : "Beijing", "state" : "Beijing", "zip" : "100083", "location" : [ 116.346346, 39.999333 ], "money" : 71128, "tags" : [ "Java", "Scala" ], "vitality" : "6.9" } } ] }, "aggregations" : { "total_money" : { "value" : 5364848.0 } } }

此外,还存在一些其它聚合函数用于计算地理位置,如地理边界聚合和地理质心聚合

批量聚合 ( Bucket )

这些聚合包含了许多具有统一标准的不同类型的桶聚合,它们用于确定文档是否应该属于某个桶。

下面我们将会罗列这些桶聚合

子聚合

批量聚合会生成一组文档,这些文档将映射到父桶中

参数 type 用于定义父索引

例如,假如我们有一个品牌及其不同的模型,然后模型类型将包含以下 _parent 字段


{ "model" : { "_parent" : { "type" : "brand" } } }

还有很多其它的特殊的批量集合,在某些特定的情况下很好用,我们罗列如下


、Date Histogram 聚合 、Date Range 聚合 、Filter 聚合 、Filters 聚合 、Geo Distance 聚合 、GeoHash grid 聚合 、Global 聚合 、Histogram 聚合 、IPv4 Range 聚合 、Missing 聚合 、Nested 聚合 、Range 聚合 、Reverse nested 聚合 、Sampler 聚合 、Significant Terms 聚合 、Terms 聚合

聚合元数据

可以在请求时使用 meta 参数添加关于聚合的一些数据,然后就可以在响应时获取到这些数据


POST http://localhost:9200/user*/report/_search?pretty

请求正文


{ "aggs" : { "min_money" : { "avg" : { "field" : "money" } , "meta" :{"dsc" :"Lowest Moneys"} } } }

响应内容


{ "took" : 30, "timed_out" : false, "_shards" : { "total" : 10, "successful" : 10, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] }, "aggregations" : { "min_money" : { "meta" : { "dsc" : "Lowest Moneys" }, "value" : null } } }

希望读者能够给小编留言,也可以点击[此处扫下面二维码关注微信公众号](https://www.ycbbs.vip/?p=28 "此处扫下面二维码关注微信公众号")

看完两件小事

如果你觉得这篇文章对你挺有启发,我想请你帮我两个小忙:

  1. 关注我们的 GitHub 博客,让我们成为长期关系
  2. 把这篇文章分享给你的朋友 / 交流群,让更多的人看到,一起进步,一起成长!
  3. 关注公众号 「方志朋」,公众号后台回复「666」 免费领取我精心整理的进阶资源教程
  4. JS中文网,Javascriptc中文网是中国领先的新一代开发者社区和专业的技术媒体,一个帮助开发者成长的社区,是给开发者用的 Hacker News,技术文章由为你筛选出最优质的干货,其中包括:Android、iOS、前端、后端等方面的内容。目前已经覆盖和服务了超过 300 万开发者,你每天都可以在这里找到技术世界的头条内容。

    本文著作权归作者所有,如若转载,请注明出处

    转载请注明:文章转载自「 Java极客技术学习 」https://www.javajike.com

    标题:15-十五、Elasticsearch 教程: 聚合计算

    链接:https://www.javajike.com/article/1258.html

« 16-十六、Elasticsearch 教程: 索引 API
14-十四、Elasticsearch 教程: 搜索 API»

相关推荐

QR code