ElasticSearch Metric Aggregations

Uncategorized, 3 years ago (2022), 程序员胖胖胖虎阿



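We will study aggregation analysis in Es from three angles:

Metric aggregations
Bucket aggregations
Pipeline aggregations

Today we start with metric aggregations, which are the simplest of the three.

The following are the video notes:

Note that these notes are only a brief record of the video, so they are fairly terse; see the video for the complete content.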

22.1 Max Aggregation

Computes the maximum value. For example, to find the highest book price:

GET books/_search
{
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}

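The query result is as follows:

(Screenshot image-20201202154048167 from the original post is not reproduced here.)

A default value can be supplied for documents that lack the field: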

GET books/_search
{
  "aggs": {
    "max_price": {
      "max": {
        "field": "price",
        "missing": 1000
      }
    }
  }
}

If a document has no price field, the aggregation treats its price as 1000.
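
To make the effect of missing concrete, here is a minimal Python sketch that mimics what the aggregation computes. The sample documents and the helper name max_with_missing are made up for illustration; this is not how Elasticsearch works internally.

```python
from typing import Iterable, Optional


def max_with_missing(docs: Iterable[dict], field: str,
                     missing: Optional[float] = None) -> Optional[float]:
    """Mimic a max aggregation: documents without `field` are skipped,
    unless `missing` is given, in which case they count as that value."""
    values = []
    for doc in docs:
        if field in doc:
            values.append(doc[field])
        elif missing is not None:
            values.append(missing)
    return max(values) if values else None


# Hypothetical sample data: the second book has no price.
books = [{"name": "A", "price": 35.0}, {"name": "B"}, {"name": "C", "price": 269.0}]

print(max_with_missing(books, "price"))                # 269.0
print(max_with_missing(books, "price", missing=1000))  # 1000
```

Note that a large default such as 1000 can become the maximum itself, so the default should be chosen with the aggregation in mind.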

The maximum can also be computed with a script:

GET books/_search
{
  "aggs": {
    "max_price": {
      "max": {
        "script": {
          "source": "if(doc['price'].size()!=0){doc.price.value}"
        }
      }
    }
  }
}

When using a script, first check doc['price'].size()!=0 to determine whether the document actually has the field.

22.2 Min Aggregation

Computes the minimum value. Usage is essentially the same as Max Aggregation:

GET books/_search
{
  "aggs": {
    "min_price": {
      "min": {
        "field": "price",
        "missing": 1000
      }
    }
  }
}

With a script:

GET books/_search
{
  "aggs": {
    "min_price": {
      "min": {
        "script": {
          "source": "if(doc['price'].size()!=0){doc.price.value}"
        }
      }
    }
  }
}

22.3 Avg Aggregation

Computes the average:

GET books/_search
{
  "aggs": {
    "avg_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

With a script:

GET books/_search
{
  "aggs": {
    "avg_price": {
      "avg": {
        "script": {
          "source": "if(doc['price'].size()!=0){doc.price.value}"
        }
      }
    }
  }
}

22.4 Sum Aggregation

Computes the sum:

GET books/_search
{
  "aggs": {
    "sum_price": {
      "sum": {
        "field": "price"
      }
    }
  }
}

With a script:

GET books/_search
{
  "aggs": {
    "sum_price": {
      "sum": {
        "script": {
          "source": "if(doc['price'].size()!=0){doc.price.value}"
        }
      }
    }
  }
}

22.5 Cardinality Aggregation

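The cardinality aggregation counts distinct values. It is similar to COUNT(DISTINCT field) in SQL.

The text type is an analyzed type, and aggregations on it are not allowed by default. If you want to aggregate on a text field, you must set its fielddata property to true. This makes aggregation possible, but the result is not exact, because the aggregation runs on the analyzed terms rather than on the original values. For exact results, map the field as keyword or give it a keyword sub-field.

Option 1:

Redefine the books index: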

PUT books
{
  "mappings": {
    "properties": {
      "name":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "publish":{
        "type": "text",
        "analyzer": "ik_max_word",
        "fielddata": true
      },
      "type":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "author":{
        "type": "keyword"
      },
      "info":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "price":{
        "type": "double"
      }
    }
  }
}

After defining the index, re-insert the data (see the earlier videos).

Now we can query the total number of publishers:

GET books/_search
{
  "aggs": {
    "publish_count": {
      "cardinality": {
        "field": "publish"
      }
    }
  }
}

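The query result is as follows:

(The result screenshot from the original post is not reproduced here.)

This aggregation may be inaccurate. The alternative is to map publish as keyword, or to give it a keyword sub-field: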

PUT books
{
  "mappings": {
    "properties": {
      "name":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "publish":{
        "type": "keyword"
      },
      "type":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "author":{
        "type": "keyword"
      },
      "info":{
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "price":{
        "type": "double"
      }
    }
  }
}

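After re-inserting the data and re-running the same cardinality query, the result is as follows:

(The result screenshot from the original post is not reproduced here.)

Comparing the two results shows that the fielddata approach gives an inaccurate count.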
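
The difference can be illustrated with a small Python sketch. It assumes a plain whitespace split as a stand-in for ik_max_word, and the publisher names are invented; the only point is that an analyzed text field exposes tokens to the aggregation, while a keyword field exposes the whole value as one term.

```python
def distinct_terms(values, analyze):
    """Count the distinct indexed terms, which is what a cardinality
    aggregation sees for the field."""
    terms = set()
    for value in values:
        terms.update(analyze(value))
    return len(terms)


def text_analyzer(value):
    # Assumption: whitespace tokenization stands in for ik_max_word.
    return value.lower().split()


def keyword_analyzer(value):
    # A keyword field indexes the whole value as a single term.
    return [value]


# Hypothetical data: three books from two publishers.
publishers = [
    "Higher Education Press",
    "Higher Education Press",
    "Tsinghua University Press",
]

print(distinct_terms(publishers, text_analyzer))     # 5 distinct tokens
print(distinct_terms(publishers, keyword_analyzer))  # 2 distinct publishers
```

Separately from the mapping issue, the cardinality aggregation is itself an approximate count on large value sets, whereas this sketch uses exact sets.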

22.6 Stats Aggregation

Basic statistics: returns count, max, min, avg, and sum in a single request:

GET books/_search
{
  "aggs": {
    "stats_query": {
      "stats": {
        "field": "price"
      }
    }
  }
}

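22.7 Extended Stats Aggregation

Extended statistics. Compared with stats, it additionally returns the sum of squares, the variance, the standard deviation, and the interval of the mean plus or minus two standard deviations: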

GET books/_search
{
  "aggs": {
    "es": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}
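
As a rough guide to what these extra numbers mean, the following Python sketch computes them by hand. It assumes the population variance and a bounds width of two standard deviations; the sample prices and the dictionary keys are illustrative, so check the actual response of your Elasticsearch version for the exact field names.

```python
import math


def extended_stats(values, sigma=2.0):
    """Compute the basic stats plus sum of squares, variance,
    standard deviation, and the avg +/- sigma * std bounds."""
    count = len(values)
    total = sum(values)
    avg = total / count
    sum_of_squares = sum(v * v for v in values)
    # Population variance: mean of squared deviations from the average.
    variance = sum((v - avg) ** 2 for v in values) / count
    std = math.sqrt(variance)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": std,
        "bounds": (avg - sigma * std, avg + sigma * std),
    }


# Hypothetical prices chosen so the numbers come out round.
prices = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
result = extended_stats(prices)
print(result["avg"], result["variance"], result["std_deviation"], result["bounds"])
```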

22.8 Percentiles Aggregation

Percentile statistics:

GET books/_search
{
  "aggs": {
    "p": {
      "percentiles": {
        "field": "price",
        "percents": [
          1,
          5,
          10,
          15,
          25,
          50,
          75,
          95,
          99
        ]
      }
    }
  }
}
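
For intuition, here is a minimal Python sketch of an exact percentile with linear interpolation. Elasticsearch computes percentiles with an approximation algorithm, so its results on large data sets may differ slightly from an exact calculation like this one; the function name and sample prices are made up for illustration.

```python
def percentile(sorted_values, p):
    """Exact percentile with linear interpolation.
    `sorted_values` must be sorted ascending; `p` is in [0, 100]."""
    if not sorted_values:
        raise ValueError("no values")
    rank = (p / 100) * (len(sorted_values) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


# Hypothetical prices.
prices = sorted([10.0, 20.0, 30.0, 40.0, 50.0])

for p in (1, 5, 10, 15, 25, 50, 75, 95, 99):
    print(p, percentile(prices, p))
```

The 50th percentile is the median, so half of the prices are at or below the value reported for 50.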

22.9 Value Count Aggregation

Counts the values of a field, which for a single-valued field is the number of documents that contain it:

GET books/_search
{
  "aggs": {
    "count": {
      "value_count": {
        "field": "price"
      }
    }
  }
}
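
The counting rule can be sketched in Python as follows. The sample documents are invented; the sketch assumes that a document without the field contributes nothing and that a multi-valued field contributes one count per value.

```python
def value_count(docs, field):
    """Count the values found in `field` across all documents."""
    count = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # the document has no value for this field
        # A list models a multi-valued field: every value is counted.
        count += len(value) if isinstance(value, list) else 1
    return count


# Hypothetical documents: one price, no price, and two prices.
books = [{"price": 35.0}, {"name": "no price"}, {"price": [20.0, 25.0]}]

print(value_count(books, "price"))  # 3
```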



This article is shared from the WeChat official account 江南一点雨 (a_javaboy).

Copyright notice: published by 程序员胖胖胖虎阿 on August 31, 2022 at 11:16 PM.
When reposting, please credit: ElasticSearch 指标聚合 | 胖虎的工具箱-编程导航
