Integrating Logstash with Kafka (Part 3) Previously, we implemented reading trac through Logstash

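The fields name, request, path, ua, url, and errorMsg are set to analyzed, but we also need to run statistics on them, so each of these fields gets an additional .raw sub-field that is used for aggregations.
One small trick here: we did not convert the long fields to string. Instead, under each long field we created an as_string sub-field, which is of type string and not_analyzed. With this in place, a full-text search in Kibana highlights the long field as well; what actually gets highlighted is the string sub-field under the long field. For example, below is a search that hits the logs.bid field. logs.bid is of type long, but because we created logs.bid.as_string under it, the field that appears in the highlight is logs.bid.as_string.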

..."highlight": {
  "logs.duid": ["@kibana-highlighted-field@wasl6@/kibana-highlighted-field@"],
  "logs.bid.as_string": ["@kibana-highlighted-field@79789714801950720@/kibana-highlighted-field@"],
  "type": ["@kibana-highlighted-field@api@/kibana-highlighted-field@"],
  "logs.request": ["GET /@kibana-highlighted-field@api@/kibana-highlighted-field@/hongbao/realname/info"]
}...
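The mapping described above (a .raw sub-field on analyzed text fields, and an as_string sub-field on long fields) could be sketched as follows. This is only an illustration: it uses the pre-5.x `string` / `not_analyzed` mapping syntax that matches the terms in this article, and it assumes that all the listed fields live under the `logs` object, which the article only confirms for logs.request and logs.bid. The helper names are hypothetical.

```python
import json


def analyzed_with_raw():
    """Analyzed string field plus a not_analyzed .raw sub-field for aggregations."""
    return {
        "type": "string",
        "index": "analyzed",
        "fields": {
            "raw": {"type": "string", "index": "not_analyzed"},
        },
    }


def long_with_as_string():
    """Long field plus a not_analyzed string sub-field that highlighting can use."""
    return {
        "type": "long",
        "fields": {
            "as_string": {"type": "string", "index": "not_analyzed"},
        },
    }


def build_mapping():
    # Assumption: every field sits under the "logs" object.
    text_fields = ["name", "request", "path", "ua", "url", "errorMsg"]
    properties = {name: analyzed_with_raw() for name in text_fields}
    properties["bid"] = long_with_as_string()
    return {"properties": {"logs": {"properties": properties}}}


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
```

With a mapping shaped like this, aggregations would target logs.request.raw, while highlighting on a numeric hit would report logs.bid.as_string.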

Kibana query request

{"size": 500,"highlight": {"pre_tags": ["@kibana-highlighted-field@"],"post_tags": ["@/kibana-highlighted-field@"],"fields": {"*": {}},"require_field_match": false,"fragment_size": 2147483647},"query": {"filtered": {"query": {"query_string": {"query": "keyword","analyze_wildcard": true}}}},"fields": ["*","_source"]}

Kibana's full-text search here uses the query_string syntax. The commonly used parameters are:

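query: the query text, which can use simple Lucene syntax
default_field: the field to search when the query does not name one; the default is _all
analyze_wildcard: by default, wildcard terms in a query string are not analyzed; when this is set to true, a best effort is made to analyze them as well. (From the documentation: By default, wildcards terms in a query string are not analyzed. By setting this value to true, a best effort will be made to analyze those as well.)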
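As a rough illustration of how the three parameters above fit together, the following sketch builds a search body with a plain query_string query. The function name and its defaults are hypothetical, and the body is deliberately simpler than the request Kibana itself sends (no filtered wrapper and no highlight section).

```python
import json


def build_search_body(keyword, default_field="_all", analyze_wildcard=True, size=500):
    """Build a search body that uses a query_string query."""
    return {
        "size": size,
        "query": {
            "query_string": {
                "query": keyword,                      # simple Lucene syntax
                "default_field": default_field,        # field used when none is named
                "analyze_wildcard": analyze_wildcard,  # try to analyze wildcard terms
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("qu?ck bro*"), indent=2))
```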

The relevant passage from the official ES documentation is quoted below:

Wildcards

Wildcard searches can be run on individual terms, using ? to replace a single character, and * to replace zero or more characters:

qu?ck bro*

Be aware that wildcard queries can use an enormous amount of memory and perform very badly - just think how many terms need to be queried to match the query string "a* b* c*".

Warning: Allowing a wildcard at the beginning of a word (eg "*ing") is particularly heavy, because all terms in the index need to be examined, just in case they match. Leading wildcards can be disabled by setting allow_leading_wildcard to false.

Wildcarded terms are not analyzed by default - they are lowercased (lowercase_expanded_terms defaults to true) but no further analysis is done, mainly because it is impossible to accurately analyze a word that is missing some of its letters. However, by setting analyze_wildcard to true, an attempt will be made to analyze wildcarded words before searching the term list for matching terms.
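The wildcard semantics quoted above (? for exactly one character, * for zero or more, terms lowercased before matching) can be modeled with a toy term list. This is not how Lucene works internally; it is a sketch that scans every term, which mirrors the worst case the documentation warns about for leading wildcards. The function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a ?/* wildcard pattern into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z", re.DOTALL)


def expand_wildcard(pattern, terms):
    """Return the indexed terms matched by the pattern, lowercasing it first."""
    regex = wildcard_to_regex(pattern.lower())
    return [term for term in terms if regex.match(term)]


if __name__ == "__main__":
    terms = ["quick", "quack", "quirk", "brown", "bro", "bring", "king"]
    for pattern in ["qu?ck", "bro*", "*ing"]:
        print(pattern, expand_wildcard(pattern, terms))
```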

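Problems encountered and solutions

Q: Our previous architecture was Flume + Kafka + Logstash + ES. With Flume as the shipper, the header fields it adds (type, host, path, and so on) are serialized into Kafka with StringSerializer, but Logstash cannot parse the header fields that Flume serialized.

A: Replace the shipper with Logstash, so that the shipper and the indexer use the same serialization and deserialization.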
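The principle behind this fix, that both ends must agree on one wire format, can be sketched without Kafka at all. The example below assumes JSON as the shared format, which the article does not state; the function names and sample values are hypothetical, and the shipper and indexer are reduced to a pair of encode and decode functions.

```python
import json


def shipper_encode(message, event_type, host, path):
    """Shipper side: put the metadata fields and the log line into one JSON payload."""
    event = {"message": message, "type": event_type, "host": host, "path": path}
    return json.dumps(event).encode("utf-8")


def indexer_decode(payload):
    """Indexer side: decode with the same format, so every field comes back intact."""
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    payload = shipper_encode("GET /api/info", "api", "web-1", "/var/log/app.log")
    print(indexer_decode(payload))
```

Because the metadata travels inside the payload rather than in producer-specific headers, the consumer needs no knowledge of how the shipper was implemented.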