enhancement(splunk_hec sink): Use a response cookie to route ack checks to the same Splunk indexer #23156
…ks to same indexer Closes vectordotdev#19417. This is particularly useful when running Splunk in a clustered environment with multiple indexer hosts. In this environment, acknowledgement IDs are frequently duplicated across multiple indexers (they all start at `0` and count upwards as they receive requests with the same `X-Splunk-Request-Channel` header, so there will be lots of reuse). One common way to distinguish between multiple hosts behind a load balancer is to return a cookie to specify which indexer to respond back to. This is the recommended way, for instance, to set up an AWS ELB for a Splunk indexer cluster such that it has cookie stickiness enabled: https://docs.splunk.com/Documentation/AddOns/released/Firehose/ConfigureanELB https://community.splunk.com/t5/Getting-Data-In/How-to-configure-the-load-balancer-to-handle-HEC-data/td-p/742116
Left a small suggestion, but it's non blocking.
@@ -5,6 +5,18 @@ base: components: sinks: splunk_hec_logs: configuration: {
description: "Splunk HEC acknowledgement configuration."
required: false
type: object: options: {
cookie_name: {
description: """
The name of a cookie to extract from the Splunk HEC response to use when querying for acknowledgements.
- The name of a cookie to extract from the Splunk HEC response to use when querying for acknowledgements.
+ Specifies the name of a cookie to extract from the Splunk HEC response and use when querying for acknowledgements.
@estherk15 Thanks for the review! I've changed this and added a bit more text describing why a cookie is useful here (specifically, that it routes the ack request to the correct indexer).
Summary
This is particularly useful when running Splunk in a clustered environment with multiple indexer hosts. In this environment, acknowledgement IDs are frequently duplicated across multiple indexers (they all start at `0` and count upwards as they receive requests with the same `X-Splunk-Request-Channel` header, so there will be lots of reuse). One common way to distinguish between multiple hosts behind a load balancer is to return a cookie to specify which indexer to respond back to. This is the recommended way, for instance, to set up an AWS ELB for a Splunk indexer cluster such that it has cookie stickiness enabled:
https://docs.splunk.com/Documentation/AddOns/released/Firehose/ConfigureanELB
https://community.splunk.com/t5/Getting-Data-In/How-to-configure-the-load-balancer-to-handle-HEC-data/td-p/742116
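To make the cookie routing idea concrete, here is a rough sketch of pulling a named cookie out of the HEC response headers so it can be echoed back on the ack query. This is not the code in this PR: `sticky_cookie` is a hypothetical helper, the header values are made up, and `AWSALB` is only an example cookie name. It uses plain string handling from the standard library rather than any HTTP crate.

```rust
/// Hypothetical helper (not part of Vector): look for the cookie called `name`
/// among raw `Set-Cookie` header values and return the `name=value` pair that
/// would be sent back in a `Cookie` request header.
fn sticky_cookie(set_cookie_headers: &[&str], name: &str) -> Option<String> {
    set_cookie_headers.iter().find_map(|header| {
        // The cookie pair is everything before the first `;`. Attributes such
        // as `Path` or `Expires` come after it and are not echoed back.
        let pair = header.split(';').next()?.trim();
        let (key, value) = pair.split_once('=')?;
        (key.trim() == name).then(|| format!("{}={}", key.trim(), value.trim()))
    })
}

fn main() {
    // Made-up response headers for illustration only.
    let headers = [
        "session=abc123; Path=/",
        "AWSALB=node-7-token; Expires=Wed, 01 Jan 2031 00:00:00 GMT; Path=/",
    ];

    match sticky_cookie(&headers, "AWSALB") {
        // The ack query would carry this header so the load balancer
        // forwards it to the indexer that issued the ack ID.
        Some(cookie) => println!("Cookie: {cookie}"),
        None => println!("no sticky cookie in response"),
    }
}
```

If the named cookie is absent, the helper returns `None`, and the ack query would simply be sent without a `Cookie` header.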
I'm not actually sure how acknowledgements would have worked with multiple indexers previously. From what I can tell, given the way it is structured with the hashmap keyed by ack ID, it would only ever work with single-indexer clusters due to the ack collision issue and not being able to find which indexer to properly query for acknowledgement details.
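The collision I'm describing can be shown with a toy model. This is a simplified sketch, not Vector's actual data structures: the indexer names, the `(cookie, ack ID)` key, and both functions are assumptions made up for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Track pending acks keyed by ack ID alone (toy model of the old behaviour).
/// Returns how many distinct pending entries survive.
fn pending_by_id(responses: &[(&str, u64)]) -> usize {
    let mut pending: HashMap<u64, &str> = HashMap::new();
    for &(indexer, ack_id) in responses {
        // A repeated ack ID from another indexer overwrites the earlier entry.
        pending.insert(ack_id, indexer);
    }
    pending.len()
}

/// Track pending acks keyed by (cookie, ack ID), so equal IDs issued by
/// different indexers stay distinct.
fn pending_by_cookie_and_id(responses: &[(&str, u64)]) -> usize {
    let mut pending: HashSet<(&str, u64)> = HashSet::new();
    for &(cookie, ack_id) in responses {
        pending.insert((cookie, ack_id));
    }
    pending.len()
}

fn main() {
    // Two indexers on the same channel both start counting ack IDs at 0.
    let responses = [("indexer-a", 0), ("indexer-b", 0), ("indexer-a", 1)];

    println!("keyed by ack ID: {}", pending_by_id(&responses)); // 2, one ack lost
    println!(
        "keyed by (cookie, ack ID): {}",
        pending_by_cookie_and_id(&responses)
    ); // 3, all acks tracked
}
```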
I'm also very new to writing Rust, so some of the stuff I've written might not be the best way to do things. Feedback is very welcome!
Change Type
Is this a breaking change?
How did you test this PR?
I tested with some added unit and integration tests, as well as with a local config, running `cargo run --release -- --config test_config.yaml`. This was sending to our Splunk cluster with 50+ indexers behind an AWS ALB, and no events were dropped and no error logs were generated over ~10 hours.
Does this PR include user facing changes?
References