Commit db94f0c

Merge pull request #14266 from mburke5678/logging-json-parsing
Logging json parsing
2 parents dd26120 + 0ef763d commit db94f0c

2 files changed: +75 -0 lines changed

logging/efk-logging-fluentd.adoc

Lines changed: 2 additions & 0 deletions
@@ -40,5 +40,7 @@ include::modules/efk-logging-fluentd-external.adoc[leveloffset=+1]
 
 include::modules/efk-logging-fluentd-throttling.adoc[leveloffset=+1]
 
+include::modules/efk-logging-fluentd-json.adoc[leveloffset=+1]
+
 
modules/efk-logging-fluentd-json.adoc

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,73 @@

// Module included in the following assemblies:
//
// * logging/efk-logging-fluentd.adoc

[id="efk-logging-fluentd-json-{context}"]
= Configuring Fluentd JSON parsing

You can configure Fluentd to inspect each log message to determine if the message is in *JSON* format and merge the message into the JSON payload document posted to Elasticsearch. This feature is disabled by default.

You can enable or disable this feature by editing the `MERGE_JSON_LOG` environment variable in the *fluentd* daemonset.

[IMPORTANT]
====
Enabling this feature comes with risks, including:

* Possible log loss due to Elasticsearch rejecting documents because of inconsistent type mappings.
* Potential buffer storage leak caused by rejected message cycling.
* Overwriting of data for fields with the same names.

The features in this topic should be used only by experienced Fluentd and Elasticsearch users.
====

.Prerequisite

* Set cluster logging to the unmanaged state. In the managed state, the Cluster Logging Operator reverts any changes made to the `fluentd` configuration map (see the sketch below).
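
For reference, one way to satisfy this prerequisite is to edit the ClusterLogging custom resource (`oc edit ClusterLogging instance`) and set the management state to `Unmanaged`. This is a minimal sketch assuming the default resource name `instance`; adjust it for your cluster:

----
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
spec:
  managementState: "Unmanaged"
----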

.Procedure

Use the following command to enable this feature:

----
oc set env ds/fluentd MERGE_JSON_LOG=true <1>
----
<1> Set this to `false` to disable this feature or `true` to enable it.
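
To illustrate what merging does, consider a container log message that is itself valid JSON. With `MERGE_JSON_LOG=true`, Fluentd parses the message and merges its keys into the top level of the record posted to Elasticsearch. The field names and values below are invented for the example and abbreviated:

----
Log message written by the application:

{"level":"error","code":404,"detail":"page not found"}

Keys merged into the document posted to Elasticsearch (abbreviated):

{
  "level": "error",
  "code": 404,
  "detail": "page not found",
  "kubernetes": { ... },
  ...
}
----

Note that `code` keeps its numeric JSON type, which matters for the index mapping discussion in the next section.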

[id="efk-logging-fluentd-json-string-{context}"]
== Setting MERGE_JSON_LOG and CDM_UNDEFINED_TO_STRING

You can use link:https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/README.md[environment variables] to modify your Fluentd configuration.

However, if you set the `CDM_UNDEFINED_TO_STRING` environment variable to `true` when `MERGE_JSON_LOG` is also set to `true`, and you have already generated conflicting fields, Elasticsearch returns a *400* error until the indices roll over for the next day. The error occurs because when `MERGE_JSON_LOG=true`, Fluentd adds fields with data types other than *string*. When you set `CDM_UNDEFINED_TO_STRING=true`, Fluentd attempts to add those same fields as *string* values, which results in the Elasticsearch *400* error.

When Fluentd rolls over the indices for the next day's logs, it creates a brand-new index with updated field definitions, and the *400* error no longer occurs.
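
As a hypothetical illustration of such a conflict, suppose an application logs a field whose JSON value is a number. The field name below is invented for the example:

----
With MERGE_JSON_LOG=true, the first document of the day creates a
numeric mapping for the field:

{"statuscode": 200}

A later record handled under CDM_UNDEFINED_TO_STRING=true is submitted
with a string value, conflicting with the mapping already in the index:

{"statuscode": "200"}
----

Per the behavior described above, Elasticsearch rejects the second document with a *400* error; the conflict clears only when the next day's index is created with fresh mappings.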

Records that have *hard* errors, such as schema violations and corrupted data, cannot be retried. Fluentd sends those records for error handling. If you link:https://docs.fluentd.org/v1.0/articles/config-file#@error-label[add a `<label @ERROR>` section] to your Fluentd config, as the last `<label>`, you can handle these records as needed.

For example:

----
data:
  fluent.conf:

    ....

    <label @ERROR>
      <match **>
        @type file
        path /var/log/fluent/dlq
        time_slice_format %Y%m%d
        time_slice_wait 10m
        time_format %Y%m%dT%H%M%S%z
        compress gzip
      </match>
    </label>
----

This section writes error records to the link:https://www.elastic.co/guide/en/logstash/current/dead-letter-queues.html[Elasticsearch dead letter queue (DLQ) file]. See link:https://docs.fluentd.org/v0.12/articles/out_file[the fluentd documentation] for more information about the file output.

You can then edit the file to clean up the records manually, reformat the file for use with the Elasticsearch `/_bulk` index API, and use cURL to add those records. For more information on the Elasticsearch Bulk API, see link:https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docs-bulk.html[the Elasticsearch documentation].
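
A minimal sketch of that last step, assuming the cleaned-up records have been reformatted into bulk (NDJSON) format in a hypothetical file named `dlq-records.json`, with Elasticsearch reachable at `localhost:9200`; the index and type names are placeholders:

----
Contents of dlq-records.json, one action line before each record:

{ "index" : { "_index" : "logging-index", "_type" : "fluentd" } }
{ "level" : "error", "detail" : "recovered record" }

Send the file with cURL:

curl -XPOST 'localhost:9200/_bulk' \
     -H 'Content-Type: application/x-ndjson' \
     --data-binary @dlq-records.json
----

The bulk API requires newline-delimited JSON, and the data file must end with a newline.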
