Filter Buffer
This filter receives logs, stores them in Redis, and forwards them at the right moment to the right output filter. Its main uses are: waiting a certain amount of time and then sending all the logs received during that time (set interval to what you need and required_log_lines to 1), waiting until enough logs have accumulated before sending them (set a low interval and required_log_lines to what you need), or a mix of both (set both interval and required_log_lines to what you need). interval and required_log_lines are set in the Buffer's config file; see the Config section of this wiki page.
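For instance, these two output entries illustrate the two basic modes (values are arbitrary, and the other output fields described in the Config section are omitted here): the first flushes whatever was received every 5 minutes, the second checks every 10 seconds and sends as soon as at least 500 logs are buffered:
{
  "outputs": [
    {"interval": 300, "required_log_lines": 1},
    {"interval": 10, "required_log_lines": 500}
  ]
}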
Filter code
0x62756672
Dependencies
No specific dependencies
Config
Example of Darwin configuration for this filter:
Note that for this filter, next_filter, nb_thread and cache_size are mandatory (for legacy reasons) but have no impact on the filter's behavior.
{
  "buffer_1": {
    "exec_path": "/path/to/darwin/build/darwin_buffer",
    "config_file": "/path/to/filter.conf",
    "output": "LOG",
    "next_filter": "",
    "nb_thread": 1,
    "log_level": "DEBUG",
    "cache_size": 0
  }
}
The config file is made of 3 parts: the Redis configuration, the input format, and the outputs.
redis_socket_path: the Unix socket of the Redis server on which to store logs.
{
  "redis_socket_path": "path/to/redis/server/socket.sock"
}
input_format: an array of JSON objects; this array must contain all the values needed by the output filters.
It represents the list of all arguments that must always be provided as input on each call to the filter.
Each object contains:
- name: the name of the value
- type: the type of the value (string or int only for now)
For example, an input_format for a SOFA connection would be (see the Sofa documentation):
{
  "input_format": [
    {"name": "ip", "type": "string"},
    {"name": "hostname", "type": "string"},
    {"name": "os", "type": "string"},
    {"name": "proto", "type": "string"},
    {"name": "port", "type": "string"}
  ]
}
For each connection between the Buffer filter and another filter, you need to provide to Buffer the same data you would have to provide to the corresponding filter, except for the Anomaly filter: for an Anomaly connection, you need to provide the same data as for Tanomaly.
For example, an input_format for an anomaly connection would be (see the Tanomaly documentation):
{
  "input_format": [
    {"name": "net_src_ip", "type": "string"},
    {"name": "net_dst_ip", "type": "string"},
    {"name": "net_dst_port", "type": "string"},
    {"name": "ip_proto", "type": "string"}
  ]
}
outputs: an array of JSON objects containing the configurations used to connect to the output filters.
Each object contains the following fields (a minimal example follows the list):
- filter_type:
  - fsofa: specific output type suited for the SOFA filter
  - fanomaly: specific output type with post-processing, suited for the UNAD filter
  - sum: generic output type used for accumulating inputs over time, can be used for the VAML/VAST filters
- filter_socket_path: the socket of the filter receiving the resulting output
- interval: the interval (in seconds) between two sending tries (logs are sent only if at least required_log_lines of them are buffered)
- required_log_lines: the number of logs required before sending to the output filter
- redis_lists: an array of JSON objects, each containing a source and a list name (see the Body section of this page)
  - source: a source name, which will always be the first element of a log line (an empty source in the config will match all sources in the body)
  - name: the name of the Redis storage associated to the source
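Here is a minimal, illustrative single-output configuration (the socket path and list name are placeholders): every 60 seconds, if at least one log from source_1 is buffered, everything is flushed to the filter listening on the given socket:
{
  "outputs": [
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "/path/to/output/filter.sock",
      "interval": 60,
      "required_log_lines": 1,
      "redis_lists": [{
        "source": "source_1",
        "name": "buffer_minimal_list"
      }]
    }
  ]
}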
Correspondence with input formats:
Each filter_type takes its data from specific inputs.
Here is the list of valid input_format fields for each supported output type (a sketch of a sum configuration follows the list):
- filter_type: fsofa
  - ip
  - hostname
  - os
  - proto
  - port
- filter_type: fanomaly
  - net_src_ip
  - net_dst_ip
  - net_dst_port
  - ip_proto
- filter_type: sum
  - decimal
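Since none of the examples on this page show a sum output, here is an illustrative sketch of what one could look like. The socket path and list name are placeholders, and declaring decimal as an int is an assumption:
{
  "input_format": [
    {"name": "decimal", "type": "int"}
  ],
  "outputs": [
    {
      "filter_type": "sum",
      "filter_socket_path": "/path/to/output/sum_consumer.sock",
      "interval": 60,
      "required_log_lines": 1,
      "redis_lists": [{
        "source": "",
        "name": "buffer_sum_all_sources"
      }]
    }
  ]
}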
Here is an example of an outputs configuration with a more complex split between sources and Redis lists:
{
  "outputs": [
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "path/to/output/anomaly/socket.sock",
      "interval": 300,
      "required_log_lines": 11,
      "redis_lists": [{
        "source": "source_1",
        "name": "anomaly_list_1"
      },
      {
        "source": "source_2",
        "name": "anomaly_list_2"
      }]
    },
    {
      "filter_type": "fsofa",
      "filter_socket_path": "path/to/output/sofa/socket.sock",
      "interval": 200,
      "required_log_lines": 15,
      "redis_lists": [{
        "source": "source_1",
        "name": "sofa_list_1"
      },
      {
        "source": "source_3",
        "name": "sofa_list_3"
      },
      {
        "source": "",
        "name": "sofa_list_all"
      }]
    }
  ]
}
In the above example, a log line from:
- source_1 will be stored in anomaly_list_1 AND sofa_list_1
- source_2 will be stored in anomaly_list_2
- source_3 will be stored in sofa_list_3
Additionally, all log lines will end up in sofa_list_all.
Any filter linked to a Buffer will receive from a log line only the data it needs, ignoring the other data in the line. The only error handling is done on the type of the data; content validation is left to the output filter (but for output types such as fanomaly, post-processing will guarantee at least the format of the data given to the output filter).
You can have several output filters, each of them having one or more source/name Redis storage pair(s).
Finally, here is a complete config file example, with the Redis parameters, an input format and several outputs:
{
  "redis_socket_path": "/var/sockets/redis/redis.sock",
  "input_format": [
    {"name": "net_src_ip", "type": "string"},
    {"name": "net_dst_ip", "type": "string"},
    {"name": "net_dst_port", "type": "string"},
    {"name": "ip_proto", "type": "string"},
    {"name": "ip", "type": "string"},
    {"name": "hostname", "type": "string"},
    {"name": "os", "type": "string"},
    {"name": "proto", "type": "string"},
    {"name": "port", "type": "string"}
  ],
  "outputs": [
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "/var/sockets/darwin/anomaly.sock",
      "interval": 300,
      "required_log_lines": 11,
      "redis_lists": [{
        "source": "source_1",
        "name": "darwin_buffer_anomaly_source_1"
      },
      {
        "source": "source_2",
        "name": "darwin_buffer_anomaly_source_2"
      }]
    },
    {
      "filter_type": "fanomaly",
      "filter_socket_path": "/var/sockets/darwin/anomaly_2.sock",
      "interval": 300,
      "required_log_lines": 11,
      "redis_lists": [{
        "source": "",
        "name": "darwin_buffer_anomaly_all_sources"
      }]
    },
    {
      "filter_type": "fsofa",
      "filter_socket_path": "/var/sockets/darwin/sofa.sock",
      "interval": 100,
      "required_log_lines": 15,
      "redis_lists": [{
        "source": "source_1",
        "name": "darwin_buffer_sofa_1"
      },
      {
        "source": "source_2",
        "name": "darwin_buffer_sofa_2"
      }]
    }
  ]
}
Body
Unlike the other filters, the content of the body changes depending on the Buffer's config.
If we take this input_format as an example:
{
  "input_format": [
    {"name": "net_src_ip", "type": "string"},
    {"name": "net_dst_ip", "type": "string"},
    {"name": "net_dst_port", "type": "string"},
    {"name": "ip_proto", "type": "string"}
  ]
}
The body would look like this:
[
  ["<source>", "<net_src_ip>", "<net_dst_ip>", "<net_dst_port>", "<ip_proto>"],
  ...
]
The order of the fields in input_format does not matter, as long as you keep the same order in the body.
Every line of the body must start with the source and contain as many elements as the input_format part of the config (+1 for the source).
An empty source in the body will only match outputs whose configured source is empty.
Here is an example of a body:
[
  ["source_1", "127.0.0.1", "127.0.0.4", "22", "6"],
  ["source_1", "127.0.0.2", "127.0.0.5", "24", "6"],
  ["source_2", "127.0.0.1", "127.0.0.3", "22", "17"],
  ["", "127.0.0.2", "127.0.0.1", "21", "17"],
  ...
]
When you have two different filters in output, you need to provide a complete log line, which will be split between the different filters according to what each one needs. The order of the fields in a body line is therefore really important, as it is the only way to ensure that a value ends up in the correct filter.
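As an illustration, here is what a body line could look like for the complete config file example above (values are arbitrary): the first four fields after the source feed the fanomaly outputs, and the last five feed the fsofa output:
[
  ["source_1", "192.168.1.10", "192.168.1.20", "443", "6", "192.168.1.10", "host-1", "Linux", "tcp", "443"]
]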
In case of multiple sources for a same filter, each Redis storage is treated independently and its data is sent separately from the Buffer to the filter. Multiple sources are relevant when you want a single filter to treat different data sets coming from different origins.
This filter returns 0 if the line was correctly parsed, and 101 if not.
It does not raise alerts.