Name: fusion-log-indexer
Owner: Lucidworks
Description: Watch a directory for logs and send each line to a Fusion pipeline as a PipelineDocument using grok for parsing.
Created: 2015-12-08 19:09:41.0
Updated: 2017-12-02 06:55:28.0
Pushed: 2017-11-16 07:43:26.0
Homepage: null
Size: 1828
Language: Java
This project offers up a number of tools designed to quickly and efficiently get logs into Fusion. It supports a pluggable Parsing strategy (with implementations for Grok, DNS, JSON and NoOp) as well as a number of preconfigured Grok patterns similar to what is available in Logstash and other engines.
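The pluggable parsing strategy can be pictured as a small interface that each implementation (Grok, DNS, JSON, NoOp) fulfills. The sketch below is illustrative only; the interface and class names are assumptions, not the project's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a pluggable line-parsing strategy; the real
// project's interface and class names may differ.
interface LineParser {
    /** Parse one raw log line into field/value pairs for a PipelineDocument. */
    Map<String, Object> parseLine(String line);
}

/** Pass-through parser: the whole line becomes a single field (a NoOp strategy). */
class NoOpParser implements LineParser {
    public Map<String, Object> parseLine(String line) {
        Map<String, Object> fields = new HashMap<>();
        fields.put("message", line);
        return fields;
    }
}

public class ParserDemo {
    public static void main(String[] args) {
        LineParser parser = new NoOpParser(); // swap in a Grok or JSON parser here
        Map<String, Object> doc = parser.parseLine("hello world");
        System.out.println(doc.get("message")); // prints "hello world"
    }
}
```

The point of the interface is that the watcher/sender machinery never needs to know which parser is in use.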
After cloning the repository, build the project on the command line (it is a Maven project, so `mvn clean package` should produce the build). The output JAR file is in the `target` directory.
Run the indexer with:

```
java -jar ./target/fusion-log-indexer-1.0-exe.jar
```

The following example watches logs from an old Lucidworks Search system and sends them, 500 at a time, to the `my_collection` collection using the default pipeline:

```
java -jar ./target/fusion-log-indexer-1.0-exe.jar -dir ~/projects/content/lucid/lucidfind/logs/ \
  -fusion "http://localhost:8764/api/apollo/index-pipelines/my_collection-default/collections/my_collection/index" \
  -fusionUser USER_HERE -fusionPass PASSWORD_HERE -senderThreads 4 -fusionBatchSize 500 --verbose \
  -lineParserConfig sample-properties/lws-grok-parser.properties
```
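The `-fusionBatchSize 500` and `-senderThreads 4` options suggest the indexer buffers parsed documents and posts them in fixed-size batches. A minimal sketch of that batching step (the `toBatches` helper is illustrative, not the project's API):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchDemo {
    /** Split parsed documents into fixed-size batches (e.g. -fusionBatchSize 500). */
    public static <T> List<List<T>> toBatches(List<T> docs, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            // copy the sublist so each batch is independent of the source list
            batches.add(new ArrayList<>(docs.subList(i, Math.min(i + batchSize, docs.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 1200; i++) docs.add(i);
        List<List<Integer>> batches = toBatches(docs, 500);
        // 1200 docs at batch size 500 -> 3 batches (500, 500, 200)
        System.out.println(batches.size() + " batches, last has " + batches.get(2).size());
    }
}
```

Batching trades a little latency for far fewer HTTP round trips to the Fusion pipeline endpoint.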
Nagios example:

```
java -jar ./target/fusion-log-indexer-1.0-exe.jar -dir ~/projects/content/nagios/ \
  -fusion "http://localhost:8764/api/apollo/index-pipelines/nagios-default/collections/nagios/index" \
  -fusionUser USER -fusionPass PASSWORD -lineParserConfig sample-properties/nagios-grok-parser.properties
```
Let's look at how to parse a Solr log file that mixes single-line and multi-line log messages (such as stack traces). Specifically, we'll parse the following snippet from a log generated by Solr 6.5.1:
```
INFO  - 2017-06-01 13:58:13.153; [ ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/cores params={indexInfo=false&wt=json&_=1496325489976} status=0 QTime=0
INFO  - 2017-06-01 13:58:13.165; [ ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/info/system params={wt=json&_=1496325489977} status=0 QTime=12
INFO  - 2017-06-01 13:58:13.169; [ ] org.apache.solr.handler.admin.CollectionsHandler; Invoked Collection Action :list with params action=LIST&wt=json&_=1496325489977 and sendToOCPQueue=true
INFO  - 2017-06-01 13:58:13.170; [ ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/collections params={action=LIST&wt=json&_=1496325489977} status=0 QTime=0
ERROR - 2017-06-01 13:58:23.840; [c:gettingstarted s:shard1 r:core_node1 x:gettingstarted_shard1_replica1] org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: undefined field: "notafield"
        at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1239)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:438)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:405)
        at org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:803)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.solr.request.SimpleFacets$3.execute(SimpleFacets.java:742)
        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:818)
        at org.apache.solr.handler.component.FacetComponent.getFacetCounts(FacetComponent.java:329)
        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:273)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2440)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:298)
INFO  - 2017-06-01 13:58:23.842; [c:gettingstarted s:shard1 r:core_node1 x:gettingstarted_shard1_replica1] org.apache.solr.core.SolrCore; [gettingstarted_shard1_replica1] webapp=/solr path=/select params={q=*:*&facet.field=notafield&indent=on&facet=on&wt=json&_=1496325493119} hits=32 status=400 QTime=61
```
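Before any grok pattern can be applied to this log, the stack-trace lines must be joined onto the log record that precedes them. A common heuristic, sketched below, is to treat any line that does not begin with a log level as a continuation of the previous record; this is only an illustration, not necessarily how the project's `SolrLogParser` works internally:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MultiLineDemo {
    // A new record starts with a log level; anything else ("\tat ...",
    // "Caused by: ...") is appended to the previous record.
    static final Pattern RECORD_START =
            Pattern.compile("^(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\\b.*");

    public static List<String> groupRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (RECORD_START.matcher(line).matches() || current == null) {
                if (current != null) records.add(current.toString());
                current = new StringBuilder(line);   // start a new record
            } else {
                current.append('\n').append(line);   // continuation line
            }
        }
        if (current != null) records.add(current.toString());
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "ERROR - 2017-06-01 13:58:23.840; [...] undefined field: \"notafield\"",
                "\tat org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1239)",
                "INFO  - 2017-06-01 13:58:23.842; [...] hits=32 status=400 QTime=61");
        // two records: the ERROR (with its stack trace attached) and the INFO line
        System.out.println(groupRecords(lines).size()); // prints 2
    }
}
```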
A grok pattern for this log, from the `resources/patterns/solr` file, could be:

```
SOLR_651_LOG4J %{LOGLEVEL:level_s} - %{TIMESTAMP_ISO8601:logdate}; \[(?:%{DATA:mdc_s}| )\] %{DATA:category_s}; \[(?:%{DATA:core_s}| )\] %{JAVALOGMESSAGE:logmessage}
```
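To see roughly what this pattern captures, here is a hand-expanded `java.util.regex` approximation. It is simplified (the `[core]` capture is folded into the message) and assumes lines of the form `INFO  - <timestamp>; [mdc] category; ...`, which is what the pattern's `%{LOGLEVEL}` prefix expects:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GrokRegexDemo {
    // Simplified, hand-expanded equivalent of the SOLR_651_LOG4J grok pattern;
    // the real grok library compiles named captures in much the same way.
    static final Pattern SOLR_651_LOG4J = Pattern.compile(
            "(?<level>[A-Z]+)\\s*- (?<logdate>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3}); "
            + "\\[(?<mdc>[^\\]]*)\\] (?<category>[^;]+); (?<logmessage>.*)");

    public static void main(String[] args) {
        String line = "INFO  - 2017-06-01 13:58:13.153; [ ] "
                + "org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null "
                + "path=/admin/cores params={indexInfo=false&wt=json&_=1496325489976} status=0 QTime=0";
        Matcher m = SOLR_651_LOG4J.matcher(line);
        if (m.matches()) {
            System.out.println(m.group("level"));    // INFO
            System.out.println(m.group("logdate"));  // 2017-06-01 13:58:13.153
            System.out.println(m.group("category")); // org.apache.solr.servlet.HttpSolrCall
        }
    }
}
```

Each named group corresponds to a grok capture such as `level_s` or `logdate`, which the indexer turns into fields on the PipelineDocument.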
NOTE: You don't need to worry about multiple spaces as the parser collapses multiple whitespace characters down to a single space automatically.
Notice that the timestamp in the log has the format `yyyy-MM-dd HH:mm:ss.SSS`. Consequently, you'll need to set the following property in your log parser properties file:

```
FieldFormat=yyyy-MM-dd HH:mm:ss.SSS
```
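The format string follows the standard Java date-pattern conventions, so you can sanity-check it against a timestamp from the log with `java.time`:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DateFormatDemo {
    public static void main(String[] args) {
        // Same pattern as the FieldFormat property above
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
        LocalDateTime ts = LocalDateTime.parse("2017-06-01 13:58:13.153", fmt);
        System.out.println(ts.getHour()); // prints 13
    }
}
```

If the pattern and the log disagree (for example, missing milliseconds), parsing throws a `DateTimeParseException`, which is the quickest way to spot a bad `FieldFormat` value.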
Lastly, if you want to parse more fields from search requests, you can set the following property:

```
RequestGrokPattern=%{SOLR_6_REQUEST}
```
The final parser properties file you'll need to parse the example log above is:

```
PatternFile=patterns/grok-patterns
Pattern=%{SOLR_651_LOG4J}
timestampFieldName=timestamp_tdt
FieldName=logdate
FieldFormat=yyyy-MM-dd HH:mm:ss.SSS
messageFieldName=message_txt_en
RequestGrokPattern=%{SOLR_6_REQUEST}
```
To parse the example log, save the example log entries above into `solr_example/solr.log` and then run:

```
java -jar target/fusion-log-indexer-1.0-exe.jar -dir solr_example \
  -lineParserClass parsers.SolrLogParser -lineParserConfig solr_log_parser.properties \
  -parseOnly
```
Please submit a pull request against the master branch with your changes.
For Grok, we are using the https://github.com/thekrakken/java-grok/ implementation, which is a little thin on documentation. However, there are some useful tools available for learning and working with Grok. Additionally, see the `src/main/resources/patterns` directory for examples ranging from Apache logs to MongoDB to Nagios.
Useful Sites: