fusion-log-indexer

This project offers a number of tools designed to quickly and efficiently get logs into Fusion. It supports a pluggable parsing strategy (with implementations for Grok, DNS, JSON and NoOp) as well as a number of preconfigured Grok patterns similar to those available in Logstash and other engines.

Features

  1. Fast, multithreaded, lightweight client for installation on machines to be monitored
  2. Pluggable log parsing logic with support for a variety of formats, including Grok, JSON and DNS
  3. Ability to watch directories and automatically index new content
  4. Integration with Lucidworks Fusion

Getting Started

Prerequisites

  1. Maven (http://maven.apache.org)
  2. Java 1.7 or later
  3. Lucidworks Fusion

Building

After cloning the repository, do the following on the command line:

  1. mvn package (add -DskipTests if you want to skip the tests)

The output JAR file (fusion-log-indexer-1.0-exe.jar) is in the target directory.

Running

  1. To see all options: java -jar ./target/fusion-log-indexer-1.0-exe.jar

Basic Examples

  1. Watch and send logs from the old Lucidworks Search system, 500 at a time, to the my_collection collection using the default pipeline: java -jar ./target/fusion-log-indexer-1.0-exe.jar -dir ~/projects/content/lucid/lucidfind/logs/ -fusion "http://localhost:8764/api/apollo/index-pipelines/my_collection-default/collections/my_collection/index" -fusionUser USER_HERE -fusionPass PASSWORD_HERE -senderThreads 4 -fusionBatchSize 500 --verbose -lineParserConfig sample-properties/lws-grok-parser.properties

  2. Nagios example: java -jar ./target/fusion-log-indexer-1.0-exe.jar -dir ~/projects/content/nagios/ -fusion "http://localhost:8764/api/apollo/index-pipelines/nagios-default/collections/nagios/index" -fusionUser USER -fusionPass PASSWORD -lineParserConfig sample-properties/nagios-grok-parser.properties
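
In both examples the -fusion argument points at a Fusion index-pipeline endpoint. Its general shape (the upper-case placeholders are yours to fill in) is:

http://FUSION_HOST:8764/api/apollo/index-pipelines/PIPELINE_ID/collections/COLLECTION/index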

Multi-line Parsing Example

Let's see how to handle parsing of a Solr log file that contains both single-line and multi-line log messages (such as stack traces). Specifically, we'll see how to parse the following snippet from a log generated by Solr 6.5.1:

INFO  - 2017-06-01 13:58:13.153; [   ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/cores params={indexInfo=false&wt=json&_=1496325489976} status=0 QTime=0
INFO  - 2017-06-01 13:58:13.165; [   ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/info/system params={wt=json&_=1496325489977} status=0 QTime=12
INFO  - 2017-06-01 13:58:13.169; [   ] org.apache.solr.handler.admin.CollectionsHandler; Invoked Collection Action :list with params action=LIST&wt=json&_=1496325489977 and sendToOCPQueue=true
INFO  - 2017-06-01 13:58:13.170; [   ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/collections params={action=LIST&wt=json&_=1496325489977} status=0 QTime=0
ERROR - 2017-06-01 13:58:23.840; [c:gettingstarted s:shard1 r:core_node1 x:gettingstarted_shard1_replica1] org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: undefined field: "notafield"
        at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1239)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:438)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:405)
        at org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:803)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.solr.request.SimpleFacets$3.execute(SimpleFacets.java:742)
        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:818)
        at org.apache.solr.handler.component.FacetComponent.getFacetCounts(FacetComponent.java:329)
        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:273)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2440)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:298)
INFO  - 2017-06-01 13:58:23.842; [c:gettingstarted s:shard1 r:core_node1 x:gettingstarted_shard1_replica1] org.apache.solr.core.SolrCore; [gettingstarted_shard1_replica1]  webapp=/solr path=/select params={q=*:*&facet.field=notafield&indent=on&facet=on&wt=json&_=1496325493119} hits=32 status=400 QTime=61

A grok pattern from the resources/patterns/solr file for this log could be:

SOLR_651_LOG4J %{LOGLEVEL:level_s} - %{TIMESTAMP_ISO8601:logdate}; \[(?:%{DATA:mdc_s}| )\] %{DATA:category_s}; \[(?:%{DATA:core_s}| )\] %{JAVALOGMESSAGE:logmessage}

NOTE: You don't need to worry about multiple spaces as the parser collapses multiple whitespace characters down to a single space automatically.
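
Applied to the first INFO line of the snippet above, that pattern yields roughly the following fields (the field names come straight from the pattern; the exact document shape depends on the rest of the parser configuration):

level_s      INFO
logdate      2017-06-01 13:58:13.153
mdc_s        (empty, from the [   ] block)
category_s   org.apache.solr.servlet.HttpSolrCall
core_s       admin
logmessage   webapp=null path=/admin/cores params={indexInfo=false&wt=json&_=1496325489976} status=0 QTime=0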

Notice that the timestamp in the log has the format yyyy-MM-dd HH:mm:ss.SSS. Consequently, you'll need to set the following property in your log parser properties file:

dateFieldFormat=yyyy-MM-dd HH:mm:ss.SSS
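
The format string appears to follow Java's SimpleDateFormat conventions, so a quick standalone way to sanity-check it against a timestamp from the log (this snippet is not part of the indexer itself) is:

import java.text.SimpleDateFormat;
import java.util.Date;

public class DateFormatCheck {
  public static void main(String[] args) throws Exception {
    // Same pattern as the dateFieldFormat property above
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
    Date parsed = fmt.parse("2017-06-01 13:58:13.153");
    System.out.println(parsed); // prints the parsed date in the local time zone
  }
}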

Lastly, if you want to parse more fields from search requests, you can set the following property:

solrRequestGrokPattern=%{SOLR_6_REQUEST}

The final parser properties file you'll need to parse the example log above (saved as solr_log_parser.properties, which the command below references) is:

grokPatternFile=patterns/grok-patterns
grokPattern=%{SOLR_651_LOG4J}
iso8601TimestampFieldName=timestamp_tdt
dateFieldName=logdate
dateFieldFormat=yyyy-MM-dd HH:mm:ss.SSS
logMessageFieldName=message_txt_en
solrRequestGrokPattern=%{SOLR_6_REQUEST}

To parse the example log, save the example log entries above into solr_example/solr.log and then run:

java -jar target/fusion-log-indexer-1.0-exe.jar -dir solr_example \
  -lineParserClass parsers.SolrLogParser -lineParserConfig solr_log_parser.properties \
  -parseOnly
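
Once the parsed output looks right, you could drop -parseOnly and add the Fusion connection options from the basic examples above to actually index the documents. For instance (the solrlogs pipeline and collection names here are placeholders for your own):

java -jar target/fusion-log-indexer-1.0-exe.jar -dir solr_example \
  -lineParserClass parsers.SolrLogParser -lineParserConfig solr_log_parser.properties \
  -fusion "http://localhost:8764/api/apollo/index-pipelines/solrlogs-default/collections/solrlogs/index" \
  -fusionUser USER_HERE -fusionPass PASSWORD_HERE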

Contributing

Please submit a pull request against the master branch with your changes.

Grokking Grok

For Grok, we use the implementation at https://github.com/thekrakken/java-grok, which is a little thin on documentation. However, there are some useful tools available for learning and working with Grok. Additionally, see the src/main/resources/patterns directory for examples ranging from Apache logs to MongoDB to Nagios.
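
As a rough illustration of what the library does under the hood, here is a minimal standalone sketch that uses java-grok directly. It assumes the 0.1.x line of the library (the io.thekraken package name and the Grok.create/compile/match API changed in later releases) and the bundled grok-patterns file; adjust paths and versions to match your setup.

import io.thekraken.grok.api.Grok;
import io.thekraken.grok.api.Match;
import java.util.Map;

public class GrokDemo {
  public static void main(String[] args) throws Exception {
    // Load a base pattern file, e.g. the one bundled under src/main/resources/patterns
    Grok grok = Grok.create("src/main/resources/patterns/grok-patterns");
    // Compile a named pattern, then match a single log line against it
    grok.compile("%{COMBINEDAPACHELOG}");
    String line = "127.0.0.1 - - [01/Jun/2017:13:58:13 -0700] \"GET /solr HTTP/1.1\" 200 1234 \"-\" \"curl/7.54.0\"";
    Match gm = grok.match(line);
    gm.captures();
    // Field name -> extracted value (clientip, verb, request, response, ...)
    Map<String, Object> fields = gm.toMap();
    System.out.println(fields);
  }
}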

Useful Sites:

  1. http://grokconstructor.appspot.com/do/match
  2. Syntax: http://grokconstructor.appspot.com/RegularExpressionSyntax.txt
