Multiline source won't include java stacktrace in message #13654
-
We need some help using the multiline feature. We're using a log like the following:
In our /etc/vector/vector.toml, we have the following transform, which successfully extracts the timestamp, the log severity, the class name and the message when the application is working properly:
However, this does not handle the stacktrace, so in the 'sources' section we added a 'multiline' configuration as follows:
We've been reading the multiline documentation. According to it, with the config we have, the 'halt_before' mode should pick up the line that begins a stacktrace event (i.e., the one starting with a '[' character) and continue aggregating lines until it reaches the condition line, the next line beginning with a '[' character. However, no aggregation occurs, and only the lines with the timestamps appear in the output:
What are we missing here? Is there some setting that we have misconfigured? Or does multiline not work as documented? We have also tried using the 'continue_through' mode and changing the 'condition_pattern' to `^[\s]+` to at least catch the indented lines, but that did not work either. Given that the stacktraces also have lines beginning with 'Caused by', it made more sense to use the 'halt_before' mode. Any help would be much appreciated. We are using Vector versions 0.23.0 and 0.22.3.
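For anyone following along, here is a rough Python sketch of the grouping that 'halt_before' is expected to perform, as described above. It is not Vector's implementation: the function name `aggregate_halt_before` and the sample lines are illustrative assumptions, and the timeout flush is approximated by emitting whatever is still buffered at the end of the input.

```python
import re


def aggregate_halt_before(lines, start_pattern, condition_pattern):
    """Group lines into events, emulating the halt_before behaviour described above."""
    start = re.compile(start_pattern)
    condition = re.compile(condition_pattern)
    events = []
    current = None  # lines of the event being built, or None when no event is open

    for line in lines:
        if current is not None and not condition.search(line):
            # Condition not met yet: the line belongs to the open event.
            current.append(line)
            continue
        if current is not None:
            # Condition matched: halt *before* this line and emit the event.
            events.append("\n".join(current))
            current = None
        if start.search(line):
            current = [line]  # this line opens a new event
        else:
            events.append(line)  # not part of any multiline event

    if current is not None:
        # Stand-in for the timeout flush: emit the last buffered event.
        events.append("\n".join(current))
    return events


# Illustrative sample lines in the same shape as the log in this thread.
sample = [
    "[2022-07-18T11:46:25,410+1000] DEBUG [java.class] - first line",
    "[2022-07-18T11:46:25,418+1000] ERROR [java.class] - Begin event threw exception",
    " at stacktrace.info(javaClass.java:77)",
    "Error on line 304 column 11 of filename.xsl:",
    "[2022-07-18T11:46:25,418+1000] ERROR [java.class] - Content parsing failed - see exception",
]
events = aggregate_halt_before(sample, r"^\[", r"^\[")
for event in events:
    print(repr(event))
```

With both patterns set to `^\[`, every line that does not start with '[' is folded into the preceding timestamped line, which is the result we expected from Vector.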
Replies: 6 comments 3 replies
-
Hi @jonathanderham-streamotion! The multiline configuration you are using seems to work for me. With the below config:
data_dir = "/tmp/vector/"
[sources.source0]
type = "file"
include = ["/tmp/tmp.log"]
[sources.source0.multiline]
mode = "halt_before"
start_pattern = "^\\["
condition_pattern = "^\\["
timeout_ms = 1000
[sinks.sink0]
type = "console"
inputs = ["source0"]
[sinks.sink0.encoding]
codec = "json"
In the output I see:
{"file":"/tmp/tmp.log","host":"COMP-J4C4P27K9Q","message":"[2022-07-18T11:46:25,410+1000] DEBUG [java.class] - first line","source_type":"file","timestamp":"2022-07-21T14:31:59.838198Z"}
{"file":"/tmp/tmp.log","host":"COMP-J4C4P27K9Q","message":"[2022-07-18T11:46:25,418+1000] ERROR [java.class] - Begin event threw exception java.lang.Exception: class name\n at stacktrace.info(javaClass.java:77)\n at stacktrace.info(javaClass.java:276)\n at stacktrace.info(javaClass.java:1350)\n stacktrace.info(javaClass.java:377)\n at net.sf.saxon.event.ProxyReceiver.startElement(ProxyReceiver.java:140)\n ...\nError on line 304 column 11 of filename.xsl:\n SXCH0003 org.xml.sax.SAXParseException; systemId:\n file:/filepath/filename.xsl;","source_type":"file","timestamp":"2022-07-21T14:31:59.838258Z"}
{"file":"/tmp/tmp.log","host":"COMP-J4C4P27K9Q","message":"[2022-07-18T11:46:25,418+1000] ERROR [java.class] - Content parsing failed - see exception","source_type":"file","timestamp":"2022-07-21T14:31:59.838263Z"}
{"file":"/tmp/tmp.log","host":"COMP-J4C4P27K9Q","message":"[2022-07-18T11:46:27,240+1000] DEBUG [java.class] - Logging Provider: org.jboss.logging.Slf4jLoggerProvider","source_type":"file","timestamp":"2022-07-21T14:32:00.839232Z"}
Which seems to have correctly handled the multi-line log events. Could you share your whole Vector configuration? I'm guessing there is something else amiss.
-
Hi @jszwedko, Here is my whole config:
You will notice that the 'inputs' value for my sink is the TOML key of my transform, not the TOML key of my source, as I want to print the transformed output. This seems to be the only difference. When I change the sink's input to the source ('logs' in my case, 'source0' in yours) like you have done, I do in fact get the stacktrace multiline output. However, the rest of the output is then different, which we do not want: the JSON has 'file' and 'host' keys rather than 'application_id', 'log_type', and 'class', for example, because the transform has not been applied. Our output is being sent to Elasticsearch and needs to have the right keys to be indexed properly. We need the log to include the multiline output and be transformed, not one or the other.
-
Hi @jszwedko, Would adding a reduce transform in between the source and the remap transform help achieve my goals? I have tried this but was unsuccessful.
-
Hi @jszwedko, When I added (?m) to the regex (I just copied the whole regex you had above), I got the following output, which shows all the correct groupings but none of the stacktrace lines:
My config now looks like the following:
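One possible explanation for "correct groupings but no stacktrace lines", offered as an assumption since the fix is not shown in the thread as captured here: `(?m)` only changes where `^` and `$` match, while `.` still stops at a newline unless the `s` flag is set. The sketch below demonstrates this with Python's `re` module as a stand-in; to my understanding the `m` and `s` flags mean the same thing in the Rust regex syntax that Vector's remap language uses. The pattern and the sample event are hypothetical, not the regex from the config above.

```python
import re

# Hypothetical pattern: the real regex from the thread is not reproduced here.
PATTERN = (
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>\w+)\s+"
    r"\[(?P<class>[^\]]+)\]\s+-\s+(?P<message>.*)"
)

# A multiline event as the file source would hand it to the transform:
# one string with embedded newlines.
event = (
    "[2022-07-18T11:46:25,418+1000] ERROR [java.class] - Begin event threw exception\n"
    " at stacktrace.info(javaClass.java:77)\n"
    " at stacktrace.info(javaClass.java:276)"
)

# (?m): ^ and $ match at line boundaries, but '.' still excludes '\n',
# so the message group ends at the first newline.
multiline_only = re.match("(?m)" + PATTERN, event)

# (?s): '.' also matches '\n', so the message group runs to the end of the event.
dotall = re.match("(?s)" + PATTERN, event)

print(repr(multiline_only.group("message")))
print(repr(dotall.group("message")))
```

If this is what is happening, the other capture groups would look correct in both cases, and only the 'message' field would differ.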
-
@jszwedko can you show me your output?
-
@jszwedko huzzah we have a winner.
Thanks so much for your help!