Integrating Logstash with Kafka (Part 2)
Mapping configuration

```json
{
  "mappings": {
    "_default_": {
      "_all": {"enabled": true},
      "dynamic_templates": [
        {
          "my_template": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "fields": {
                "raw": {"type": "string", "index": "not_analyzed"}
              }
            }
          }
        }
      ]
    },
    "api": {
      "properties": {
        "timestamp": {"format": "strict_date_optional_time||epoch_millis", "type": "date"},
        "message": {"type": "string", "index": "not_analyzed"},
        "level": {"type": "string"},
        "host": {"type": "string"},
        "logs": {
          "properties": {
            "uid": {"type": "long"},
            "status": {"type": "string"},
            "did": {"type": "long"},
            "device-id": {"type": "string"},
            "device_id": {"type": "string"},
            "errorMsg": {"type": "string"},
            "rpid": {"type": "string"},
            "url": {"type": "string"},
            "errorStatus": {"type": "long"},
            "ip": {"type": "string"},
            "timestamp": {"type": "string", "index": "not_analyzed"},
            "hb_uid": {"type": "long"},
            "duid": {"type": "string"},
            "request": {"type": "string"},
            "name": {"type": "string"},
            "errorCode": {"type": "string"},
            "ua": {"type": "string"},
            "server_timestamp": {"type": "long"},
            "bid": {"type": "long"}
          }
        },
        "path": {"type": "string", "index": "not_analyzed"},
        "type": {"type": "string", "index": "not_analyzed"},
        "@timestamp": {"format": "strict_date_optional_time||epoch_millis", "type": "date"},
        "@version": {"type": "string", "index": "not_analyzed"}
      }
    }
  }
}
```

By default, Elasticsearch analyzes string fields with its own default analyzer (splitting on spaces, dots, slashes, and so on). Analysis is essential for search and relevance scoring, but it significantly hurts indexing throughput and aggregation performance. The Logstash template therefore defines so-called "multi-field" mappings: each analyzed string field automatically gets a sub-field with a ".raw" suffix whose analysis is disabled. In short, when you want aggregation results for the url field, do not use "url" directly; use "url.raw" as the field name.
dynamic_templates is used here because of the nested logs structure: even though the fields inside logs were defined as not_analyzed, newly created index data still ended up analyzed (the cause is unclear). Analyzed fields cannot be used for statistics in Kibana, so dynamic_templates adds a raw sub-field to every dynamically mapped string field. The sub-field's name is the original field name plus a ".raw" suffix (for example, logs.name becomes logs.name.raw), which exists precisely to work around the fact that analyzed fields cannot be aggregated. All .raw fields are not_analyzed, so statistics and aggregations can run on the .raw variant (logs.name.raw), while full-text search continues to use the original field (logs.name).
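To make the .raw convention concrete, here is a sketch of the query-DSL shape this implies: full-text search on the analyzed field, aggregation on the not_analyzed sub-field. The index name api-* is an assumption for illustration; only the field names come from the mapping above.

```json
POST /api-*/_search
{
  "size": 0,
  "query": {
    "match": {"logs.name": "timeout"}
  },
  "aggs": {
    "top_names": {
      "terms": {"field": "logs.name.raw", "size": 10}
    }
  }
}
```

Running the terms aggregation against logs.name instead would bucket individual analyzed tokens rather than whole field values, which is exactly the problem the .raw sub-fields avoid.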
Another thing to note: fields that need exact matching should be set to not_analyzed (for example, ID fields or fields with enumerable values), while fields that need full-text search should be analyzed (for example, log details or concrete error messages). Otherwise, a full-text search in Kibana returns correct results but without highlighting. The reason is that full-text search queries the _all field by default, while the highlighted results are returned from the _source fields. Also, since Kibana's full-text search defaults to the _all field, _all must be enabled when the mapping is created in ES.
Highlighting cannot be applied to non-string fields. Fields of integer, long, and other non-string types must be indexed with a string representation as well, so that those fields can be searched with highlighting.
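This is why the final mapping below adds an as_string multi-field to the numeric fields. A highlight request can then target the string variants; a sketch (the index name api-* and the search term are assumptions):

```json
POST /api-*/_search
{
  "query": {
    "match": {"_all": "404"}
  },
  "highlight": {
    "fields": {
      "logs.errorMsg": {},
      "logs.errorStatus.as_string": {}
    }
  }
}
```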
Also remember to refresh the index pattern in Kibana every time the ES mapping file is modified.
The final, modified ES mapping is as follows:
```json
{
  "mappings": {
    "_default_": {
      "_all": {"enabled": true},
      "dynamic_templates": [
        {
          "my_template": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "fields": {
                "raw": {"type": "string", "index": "not_analyzed"}
              }
            }
          }
        }
      ]
    },
    "api": {
      "properties": {
        "timestamp": {"format": "strict_date_optional_time||epoch_millis", "type": "date"},
        "message": {"type": "string", "index": "not_analyzed"},
        "level": {"type": "string", "index": "not_analyzed"},
        "host": {"type": "string", "index": "not_analyzed"},
        "logs": {
          "properties": {
            "uid": {"type": "string", "index": "not_analyzed"},
            "status": {"type": "string", "index": "not_analyzed"},
            "did": {"type": "long", "fields": {"as_string": {"type": "string", "index": "not_analyzed"}}},
            "device-id": {"type": "string", "index": "not_analyzed"},
            "device_id": {"type": "string", "index": "not_analyzed"},
            "errorMsg": {"type": "string", "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
            "rpid": {"type": "string", "index": "not_analyzed"},
            "url": {"type": "string", "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
            "errorStatus": {"type": "long", "fields": {"as_string": {"type": "string", "index": "not_analyzed"}}},
            "ip": {"type": "string", "index": "not_analyzed"},
            "timestamp": {"type": "string", "index": "not_analyzed"},
            "hb_uid": {"type": "long", "fields": {"as_string": {"type": "string", "index": "not_analyzed"}}},
            "duid": {"type": "string", "index": "not_analyzed"},
            "request": {"type": "string", "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
            "name": {"type": "string", "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
            "errorCode": {"type": "string", "index": "not_analyzed"},
            "ua": {"type": "string", "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
            "server_timestamp": {"type": "long"},
            "bid": {"type": "long", "fields": {"as_string": {"type": "string", "index": "not_analyzed"}}}
          }
        },
        "path": {"type": "string", "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
        "type": {"type": "string", "index": "not_analyzed"},
        "@timestamp": {"format": "strict_date_optional_time||epoch_millis", "type": "date"},
        "@version": {"type": "string", "index": "not_analyzed"}
      }
    }
  }
}
```
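As an illustration of the convention this mapping encodes (not part of the original configuration), a small hypothetical helper can resolve which variant of a field is safe to aggregate on: the field itself if it is non-string or not_analyzed, otherwise a not_analyzed multi-field such as .raw. The helper name and the trimmed sample mapping are assumptions for this sketch.

```python
def aggregatable_field(properties, path):
    """Resolve a dotted field path, e.g. "logs.name" -> "logs.name.raw",
    against a mapping "properties" dict like the ones in this post."""
    parts = path.split(".")
    node = properties
    for i, part in enumerate(parts):
        node = node[part]
        if i < len(parts) - 1:
            node = node["properties"]  # descend into the nested object
    # Non-string fields and not_analyzed strings can be aggregated directly.
    if node.get("type") != "string" or node.get("index") == "not_analyzed":
        return path
    # Otherwise fall back to a not_analyzed multi-field (e.g. .raw).
    for sub_name, sub_mapping in node.get("fields", {}).items():
        if sub_mapping.get("index") == "not_analyzed":
            return path + "." + sub_name
    raise ValueError("no aggregatable variant for " + path)

# Trimmed sample of the mapping above, for demonstration only.
sample = {
    "logs": {
        "properties": {
            "name": {"type": "string",
                     "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
            "errorCode": {"type": "string", "index": "not_analyzed"},
            "bid": {"type": "long"},
        }
    }
}

print(aggregatable_field(sample, "logs.name"))       # logs.name.raw
print(aggregatable_field(sample, "logs.errorCode"))  # logs.errorCode
print(aggregatable_field(sample, "logs.bid"))        # logs.bid
```

This mirrors what Kibana itself does when it offers the not_analyzed variant of a field for visualizations.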
