Avoid unnecessary decompression from buffer #72

Open
@khuongduybui

Problem

I use compress gzip in the buffer section (built-in, no plugin) together with compress_request true in this http output plugin.
Fluentd gunzips each buffer chunk it reads from disk, and this plugin then recompresses the data before sending it.

Steps to replicate

# Upload configuration for Syslog events
<match syslog.events>
  @type http
  endpoint_url "https://<redacted>"
  http_method post

  # send compressed events
  compress_request true
  serializer json
  buffered true
  bulk_request true
  # specify recoverable/repeatable status codes
  recoverable_status_codes 404, 500, 502, 503, 504

  # every 5 minutes or every 10 MBs
  <buffer tag,time>
    @type file
    path /shared/logs_5/buffer/syslog
    timekey 5m
    timekey_wait 0m
    timekey_use_utc true
    chunk_limit_size 10MB
    compress gzip
    total_limit_size 50GB
    overflow_action drop_oldest_chunk
    retry_timeout 7d
    retry_max_interval 3600
  </buffer>
  <format>
    @type json
    add_newline true
  </format>
</match>

Expected Behavior or What you need to ask

According to the Fluentd documentation (https://docs.fluentd.org/configuration/buffer-section#:~:text=Fluentd%20will%20decompress,plugin%20as%20is):

Fluentd will decompress these compressed chunks automatically before passing them to the output plugin (The exceptional case is when the output plugin can transfer data in compressed form. In this case, the data will be passed to the plugin as is).

Can we somehow let Fluentd know that this output plugin can transfer data in compressed form, so that it skips the decompression and recompression?
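
To make the request concrete, here is a minimal Ruby sketch of the pass-through I have in mind. It uses only the standard library. `FakeGzipChunk` and `build_request` are hypothetical names invented for illustration; they are not the real Fluentd chunk class or this plugin's code. The `compressed: :gzip` keyword on `read` is my assumption about how an output plugin might ask for the stored bytes untouched, and it should be checked against the Fluentd source.

```ruby
require "zlib"

# Hypothetical stand-in for a buffer chunk stored with `compress gzip`.
# It is NOT the real Fluentd chunk class; it only imitates the idea that a
# caller can ask for the stored bytes without decompression.
class FakeGzipChunk
  def initialize(plain_text)
    @stored = Zlib.gzip(plain_text) # what would sit on disk in the buffer
  end

  # compressed: :gzip -> return the stored bytes untouched
  # anything else     -> gunzip first
  def read(compressed: nil)
    compressed == :gzip ? @stored : Zlib.gunzip(@stored)
  end
end

# Hypothetical helper: build the HTTP body and headers for one chunk.
def build_request(chunk, compress_request:, buffer_compressed:)
  if compress_request && buffer_compressed
    # Pass-through: no gunzip and no second gzip.
    [chunk.read(compressed: :gzip), { "Content-Encoding" => "gzip" }]
  elsif compress_request
    # Current behavior as described above: decompress, then recompress.
    [Zlib.gzip(chunk.read), { "Content-Encoding" => "gzip" }]
  else
    [chunk.read, {}]
  end
end

chunk = FakeGzipChunk.new(%({"message":"hello"}\n))
body, headers = build_request(chunk, compress_request: true, buffer_compressed: true)
puts headers["Content-Encoding"]
puts Zlib.gunzip(body)
```

One consequence of this design: the request body would be exactly what the format section wrote into the buffer (newline-delimited JSON in my configuration), so it only helps if the endpoint accepts that layout.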

The main reason we looked into this is that Fluentd sometimes fails to decompress the gzipped buffer chunks and then chokes on them, retrying for up to one week under the same retry logic that we put in place for cases like network loss. We would rather have Fluentd pass the bad chunks to this plugin, which would send them as-is to our endpoint in the cloud, where we have all the processing power to attempt to recover or discard them without clogging the pipeline.
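
For illustration, this is roughly the kind of recovery we could attempt on the receiving side. It is a sketch, not our actual endpoint code; `salvage_gzip` and its `slice_size` parameter are hypothetical names. It feeds the gzip bytes to Ruby's streaming `Zlib::Inflate` in small slices and keeps whatever was inflated before the stream broke or ended early.

```ruby
require "zlib"

# Hypothetical helper: gunzip as much of `bytes` as possible.
# Returns [recovered_text, error_message_or_nil].
def salvage_gzip(bytes, slice_size: 1024)
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 16) # +16 selects gzip framing
  recovered = String.new # binary string, grows as slices are inflated
  error = nil
  offset = 0
  begin
    # Stop once the gzip stream has ended; anything after that is not part of it.
    while offset < bytes.bytesize && !inflater.finished?
      recovered << inflater.inflate(bytes.byteslice(offset, slice_size))
      offset += slice_size
    end
    error = "truncated gzip stream" unless inflater.finished?
  rescue Zlib::Error => e
    # Output from the failing slice is lost; earlier slices are kept.
    # Note: after a checksum failure the kept data is not guaranteed correct.
    error = "#{e.class}: #{e.message}"
  ensure
    inflater.close
  end
  [recovered, error]
end

original = (1..5000).map { |i| %({"id":#{i}}\n) }.join
gz = Zlib.gzip(original)
text, err = salvage_gzip(gz.byteslice(0, gz.bytesize / 2)) # simulate a cut-off chunk
puts "recovered #{text.bytesize} of #{original.bytesize} bytes (#{err})"
```

With newline-delimited JSON, the last recovered line may be incomplete, so the receiver would drop everything after the final newline before parsing.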

Using Fluentd and out_http plugin versions

  • OS version: Debian 11
  • Bare Metal or Within Docker or Kubernetes or other: official Docker image
  • Fluentd version: 1.16.1
  • out_http plugin 1.3.4

Installed gems:
abbrev (default: 0.1.0)
async (1.31.0)
async-http (0.60.1)
async-io (1.34.3)
async-pool (0.4.0)
base64 (default: 0.1.1)
benchmark (default: 0.2.0)
bigdecimal (default: 3.1.1)
bson (4.15.0)
bundler (default: 2.3.26)
cgi (default: 0.3.6)
concurrent-ruby (1.2.2)
console (1.16.2)
cool.io (1.7.1)
csv (default: 3.2.5)
date (default: 3.2.2)
debug (1.6.3)
delegate (default: 0.2.0)
did_you_mean (default: 1.6.1)
digest (default: 3.1.0)
drb (default: 2.1.0)
english (default: 0.7.1)
erb (default: 2.2.3)
error_highlight (default: 0.3.0)
etc (default: 1.3.0)
fcntl (default: 1.0.1)
fiber-local (1.0.0)
fiddle (default: 1.1.0)
fileutils (default: 1.6.0)
find (default: 0.1.1)
fluent-config-regexp-type (1.0.0)
fluent-plugin-mongo (1.6.0)
fluent-plugin-multi-format-parser (1.0.0)
fluent-plugin-out-http (1.3.4)
fluent-plugin-prometheus (2.1.0)
fluent-plugin-rewrite-tag-filter (2.4.0)
fluentd (1.16.1)
forwardable (default: 1.3.2)
getoptlong (default: 0.1.1)
http_parser.rb (0.8.0)
io-console (default: 0.5.11)
io-nonblock (default: 0.1.0)
io-wait (default: 0.2.1)
ipaddr (default: 1.2.4)
irb (default: 1.4.1)
json (2.6.3, default: 2.6.1)
logger (default: 1.5.0)
matrix (0.4.2)
minitest (5.15.0)
mongo (2.18.3)
msgpack (1.7.0)
mutex_m (default: 0.1.1)
net-ftp (0.1.3)
net-http (default: 0.3.0)
net-imap (0.2.3)
net-pop (0.1.1)
net-protocol (default: 0.1.2)
net-smtp (0.3.1)
nio4r (2.5.9)
nkf (default: 0.1.1)
observer (default: 0.1.1)
oj (3.14.3)
open-uri (default: 0.2.0)
open3 (default: 0.1.1)
openssl (default: 3.0.1)
optparse (default: 0.2.0)
ostruct (default: 0.5.2)
pathname (default: 0.2.0)
power_assert (2.0.1)
pp (default: 0.3.0)
prettyprint (default: 0.1.1)
prime (0.1.2)
prometheus-client (4.2.2)
protocol-hpack (1.4.2)
protocol-http (0.24.1)
protocol-http1 (0.15.0)
protocol-http2 (0.15.1)
pstore (default: 0.1.1)
psych (default: 4.0.4)
racc (default: 1.6.0)
rake (13.0.6)
rbs (2.7.0)
rdoc (default: 6.4.0)
readline (default: 0.0.3)
readline-ext (default: 0.1.4)
reline (default: 0.3.1)
resolv (default: 0.2.1)
resolv-replace (default: 0.1.0)
rexml (3.2.5)
rinda (default: 0.1.1)
rss (0.2.9)
ruby2_keywords (default: 0.0.5)
securerandom (default: 0.2.0)
serverengine (2.3.2)
set (default: 1.0.2)
shellwords (default: 0.1.0)
sigdump (0.2.4)
singleton (default: 0.1.1)
stringio (default: 3.0.1)
strptime (0.2.5)
strscan (default: 3.0.1)
syslog (default: 0.1.0)
tempfile (default: 0.1.2)
test-unit (3.5.3)
time (default: 0.2.2)
timeout (default: 0.2.0)
timers (4.3.5)
tmpdir (default: 0.1.2)
traces (0.9.1)
tsort (default: 0.1.0)
typeprof (0.21.3)
tzinfo (2.0.6)
tzinfo-data (1.2023.3)
un (default: 0.2.0)
uri (default: 0.12.1)
weakref (default: 0.1.1)
webrick (1.8.1)
yajl-ruby (1.4.3)
yaml (default: 0.2.0)
zlib (default: 2.1.1)