Filter Buffer
This filter receives logs, stores them in Redis, and forwards them at the right moment to the right output filter. Its main uses are: waiting a certain amount of time and then sending all the logs received during that time (set interval to what you need and required_log_lines to 1), waiting until enough logs have accumulated before sending them (set a low interval and required_log_lines to what you need), or a mix of both (set both interval and required_log_lines to what you need). interval and required_log_lines are set in the Buffer's config file; see the Config section of this wiki page.
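For instance, these two output entries illustrate the two basic modes (values are arbitrary, and the other output fields described in the Config section are omitted here): the first flushes whatever was received every 5 minutes, the second checks every 10 seconds and sends as soon as at least 500 logs are buffered:
{
  "outputs": [
    {"interval": 300, "required_log_lines": 1},
    {"interval": 10, "required_log_lines": 500}
  ]
}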
Filter code
0x62756672
Dependencies
No specific dependencies
Config
Example of Darwin configuration for this filter:
Note that for this filter, next_filter, nb_thread and cache_size are mandatory (for legacy reasons) but have no impact on the filter's behavior.
{
  "buffer_1": {
    "exec_path": "/path/to/darwin/build/darwin_buffer",
    "config_file": "/path/to/filter.conf",
    "output": "LOG",
    "next_filter": "",
    "nb_thread": 1,
    "log_level": "DEBUG",
    "cache_size": 0
  }
}
The config file is made of 3 parts: the Redis configuration, the input format, and the outputs.
redis_socket_path: the Unix socket of the Redis server on which to store logs.
{
  "redis_socket_path": "path/to/redis/server/socket.sock"
}
input_format: an array of JSON objects; this array must contain all the values needed by the output filters.
It represents the list of all arguments that must always be provided as input on each call to the filter.
Each object contains:
- name: the name of the value
- type: the type of the value (string or int only for now)
For example, an input_format for a SOFA connection would be (see the Sofa documentation):
{
  "input_format": [
    {"name": "ip", "type": "string"},
    {"name": "hostname", "type": "string"},
    {"name": "os", "type": "string"},
    {"name": "proto", "type": "string"},
    {"name": "port", "type": "string"}
  ]
}
For each connection between the Buffer filter and another filter, you need to provide to Buffer the same data you would have to provide to the corresponding filter, except for the Anomaly filter: for an Anomaly connection, you need to provide the same data as for Tanomaly.
For example, an input_format for an anomaly connection would be (see the Tanomaly documentation):
{
  "input_format": [
    {"name": "net_src_ip", "type": "string"},
    {"name": "net_dst_ip", "type": "string"},
    {"name": "net_dst_port", "type": "string"},
    {"name": "ip_proto", "type": "string"}
  ]
}
outputs: an array of JSON objects containing the configurations used to connect to the output filters.
Each object contains the following fields (a minimal example follows the list):
- filter_type:
  - fsofa: specific output type suited for the SOFA filter
  - fanomaly: specific output type with post-processing, suited for the UNAD filter
  - sum: generic output type used for accumulating inputs over time, can be used for the VAML/VAST filters
- filter_socket_path: the socket of the filter receiving the resulting output
- interval: the interval (in seconds) between two sending tries (logs are sent only if at least required_log_lines of them are buffered)
- required_log_lines: the number of logs required before sending to the output filter
- redis_lists: an array of JSON objects, each containing a source and a list name (see the Body section of this page)
  - source: a source name, which will always be the first element of a log line (an empty source in the config will match all sources in the body)
  - name: the name of the Redis storage associated to the source
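Here is a minimal, illustrative single-output configuration (the socket path and list name are placeholders): every 60 seconds, if at least one log from source_1 is buffered, everything is flushed to the filter listening on the given socket:
{
  "outputs": [
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "/path/to/output/filter.sock",
      "interval": 60,
      "required_log_lines": 1,
      "redis_lists": [{
        "source": "source_1",
        "name": "buffer_minimal_list"
      }]
    }
  ]
}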
Correspondence with input formats:
Each filter_type takes its data from specific inputs.
Here is the list of valid input_format fields for each supported output type (a sketch of a sum configuration follows the list):
- filter_type: fsofa
  - ip
  - hostname
  - os
  - proto
  - port
- filter_type: fanomaly
  - net_src_ip
  - net_dst_ip
  - net_dst_port
  - ip_proto
- filter_type: sum
  - decimal
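Since none of the examples on this page show a sum output, here is an illustrative sketch of what one could look like. The socket path and list name are placeholders, and declaring decimal as an int is an assumption:
{
  "input_format": [
    {"name": "decimal", "type": "int"}
  ],
  "outputs": [
    {
      "filter_type": "sum",
      "filter_socket_path": "/path/to/output/sum_consumer.sock",
      "interval": 60,
      "required_log_lines": 1,
      "redis_lists": [{
        "source": "",
        "name": "buffer_sum_all_sources"
      }]
    }
  ]
}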
Here is an example of an outputs configuration with a more complex split between sources and Redis lists:
{
  "outputs": [
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "path/to/output/anomaly/socket.sock",
      "interval": 300,
      "required_log_lines": 11,
      "redis_lists": [{
        "source": "source_1",
        "name": "anomaly_list_1"
      },
      {
        "source": "source_2",
        "name": "anomaly_list_2"
      }]
    },
    {
      "filter_type": "fsofa",
      "filter_socket_path": "path/to/output/sofa/socket.sock",
      "interval": 200,
      "required_log_lines": 15,
      "redis_lists": [{
        "source": "source_1",
        "name": "sofa_list_1"
      },
      {
        "source": "source_3",
        "name": "sofa_list_3"
      },
      {
        "source": "",
        "name": "sofa_list_all"
      }]
    }
  ]
}
In the above example, a log line from:
- source_1 will be stored in anomaly_list_1 AND sofa_list_1
- source_2 will be stored in anomaly_list_2
- source_3 will be stored in sofa_list_3
Additionally, all log lines will end up in sofa_list_all.
Any filter linked to a Buffer will receive from a log line only the data it needs, ignoring the other data in the line. The only error handling is done on the type of the data; content validation is left to the output filter (but for output types such as fanomaly, post-processing will guarantee at least the format of the data given to the output filter).
You can have several output filters, each of them having one or more source/name Redis storage pair(s).
Finally, here is a complete config file example, with the Redis parameters, an input format and several outputs:
{
  "redis_socket_path": "/var/sockets/redis/redis.sock",
  "input_format": [
    {"name": "net_src_ip", "type": "string"},
    {"name": "net_dst_ip", "type": "string"},
    {"name": "net_dst_port", "type": "string"},
    {"name": "ip_proto", "type": "string"},
    {"name": "ip", "type": "string"},
    {"name": "hostname", "type": "string"},
    {"name": "os", "type": "string"},
    {"name": "proto", "type": "string"},
    {"name": "port", "type": "string"}
  ],
  "outputs": [
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "/var/sockets/darwin/anomaly.sock",
      "interval": 300,
      "required_log_lines": 11,
      "redis_lists": [{
        "source": "source_1",
        "name": "darwin_buffer_anomaly_source_1"
      },
      {
        "source": "source_2",
        "name": "darwin_buffer_anomaly_source_2"
      }]
    },
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "/var/sockets/darwin/anomaly_2.sock",
      "interval": 300,
      "required_log_lines": 11,
      "redis_lists": [{
        "source": "",
        "name": "darwin_buffer_anomaly_all_sources"
      }]
    },
    {
      "filter_type": "fsofa",
      "filter_socket_path": "/var/sockets/darwin/sofa.sock",
      "interval": 100,
      "required_log_lines": 15,
      "redis_lists": [{
        "source": "source_1",
        "name": "darwin_buffer_sofa_1"
      },
      {
        "source": "source_2",
        "name": "darwin_buffer_sofa_2"
      }]
    }
  ]
}
Body
Unlike the other filters, the content of the body changes depending on the Buffer's config.
If we take this input_format as an example:
{
  "input_format": [
    {"name": "net_src_ip", "type": "string"},
    {"name": "net_dst_ip", "type": "string"},
    {"name": "net_dst_port", "type": "string"},
    {"name": "ip_proto", "type": "string"}
  ]
}
The body would look like this:
[
  ["<source>", "<net_src_ip>", "<net_dst_ip>", "<net_dst_port>", "<ip_proto>"],
  ...
]
The order of the fields in input_format does not matter, as long as you keep the same order in the body.
Every line of the body must start with the source and contain as many elements as the input_format part of the config (+1 for the source).
An empty source in the body will only match outputs whose configured source is empty.
Here is an example of a body:
[
  ["source_1", "127.0.0.1", "127.0.0.4", "22", "6"],
  ["source_1", "127.0.0.2", "127.0.0.5", "24", "6"],
  ["source_2", "127.0.0.1", "127.0.0.3", "22", "17"],
  ["", "127.0.0.2", "127.0.0.1", "21", "17"],
  ...
]
When you have two different filters in output, you need to provide a complete log line, which will be split between the different filters according to what each one needs. The order of the fields in a body line is therefore really important, as it is the only way to ensure that a value ends up in the correct filter.
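As an illustration, here is what a body line could look like for the complete config file example above (values are arbitrary): the first four fields after the source feed the fanomaly outputs, and the last five feed the fsofa output:
[
  ["source_1", "192.168.1.10", "192.168.1.20", "443", "6", "192.168.1.10", "host-1", "Linux", "tcp", "443"]
]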
In case of multiple sources for a same filter, each Redis storage is treated independently and its data is sent separately from the Buffer to the filter. Multiple sources are relevant when you want a single filter to treat different data sets coming from different origins.
This filter returns 0 if the line was correctly parsed, and 101 if not.
It does not raise alerts.