Skip to content

Ensure all restHeaders from ActionPlugin.getRestHeaders are carried to threadContext for tracing #5396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

cwperks
Copy link
Member

@cwperks cwperks commented Jun 13, 2025

Description

This PR ensures that allowlisted headers from ActionPlugin.getRestHeaders() are carried from HTTP Headers -> ThreadContext headers in order to trace a request end-to-end through the cluster.

  • Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation)

Bug fix

Issues Resolved

Resolves #4799

Check List

  • New functionality includes testing
  • New functionality has been documented
  • New Roles/Permissions have a corresponding security dashboards plugin PR
  • API changes companion pull request created
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…o threadContext for tracing

Signed-off-by: Craig Perkins <cwperx@amazon.com>
requestHeadersToCopy,
getThreadContext().getHeader(ConfigConstants.OPENDISTRO_SECURITY_REQUEST_HEADERS).split(",")
);
requestHeadersToCopy.remove(Task.X_OPAQUE_ID); // Special case where this header is preserved during stashContext.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file are necessary because currently only X_OPAQUE_ID is not lost when stashing here: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/common/util/concurrent/ThreadContext.java#L157-L163

Ideally, this is solved another way through the core. Either:

  1. Change X_OPAQUE_ID to a persistent header and remove the logic in stashContext. persistentHeaders are a special category of headers that are not stashable.
  2. Update the logic in ThreadContext to not stash any of the headers from https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/action/ActionModule.java#L581-L586

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, option 1 is better.

@cwperks
Copy link
Member Author

cwperks commented Jun 13, 2025

@sgup432 This PR would need additional tests, but this is what the general changes could look like if just done within the confines of the security plugin.

cwperks added 3 commits June 13, 2025 13:52
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Copy link

codecov bot commented Jun 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.77%. Comparing base (a581bc6) to head (560507d).
Report is 2 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #5396   +/-   ##
=======================================
  Coverage   72.76%   72.77%           
=======================================
  Files         397      397           
  Lines       24591    24606   +15     
  Branches     3741     3747    +6     
=======================================
+ Hits        17894    17906   +12     
- Misses       4867     4872    +5     
+ Partials     1830     1828    -2     
Files with missing lines Coverage Δ
...opensearch/security/filter/SecurityRestFilter.java 85.40% <100.00%> (+0.32%) ⬆️
...g/opensearch/security/support/ConfigConstants.java 95.23% <ø> (ø)
...search/security/transport/SecurityInterceptor.java 77.10% <100.00%> (+1.00%) ⬆️

... and 7 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@cwperks cwperks marked this pull request as ready for review June 23, 2025 17:43
@@ -130,7 +131,8 @@ public void testShouldAllowAtRestAndBlockAtTransport() {
// granted at Rest layer
auditLogsRule.assertExactlyOne(privilegePredicateRESTLayer(GRANTED_PRIVILEGES, REST_ONLY, GET, "/" + PROTECTED_API));
// missing at Transport layer
auditLogsRule.assertExactlyOne(
auditLogsRule.assertExactlyScanAll(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was tricky fixing the tests in this file. The reason these started to fail is because of the logic here to separate REST-Layer audit logs from Transport-Layer audit logs. We should find a more robust way to handle this and aim to have robust assertions in the tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why scan all to only evaluate 1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these assertions need to be revisited after this bugfix.

From what I've gathered, the test audit log tries to split messages into 2 distinct categories:

  • REST-Layer messages
  • Transport-Layer messages

The previous logic to sort these was not working properly and relied on the present of audit_transport_headers to determine if the message was from the transport-layer.

I'm using this method here because it was the only appropriate one I found, but I think the right fix is either not to separate the messages into each category or to make sure the logic to separate the messages is correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, would you mind opening an issue (good-first issue?) to address this in future.

if (xOpaqueId != null) {
threadContext.putHeader(Task.X_OPAQUE_ID, xOpaqueId);
for (Map.Entry<String, String> header : tmpHeaders.entrySet()) {
threadContext.putHeader(header.getKey(), header.getValue());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq, will values be null for any header?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, null is not possible. There could be empty

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need specific headers or all headers? Copying/populating all headers might be costly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 Could this cause perf regression?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the linked issue: #4799

There is a bug with ActionPlugin.getRestHeaders() where the headers are getting dropped because the security plugin is not carrying them from rest -> transport and then also propagating them internally in the transport layer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think this would have a high performance penalty.

requestHeadersToCopy,
getThreadContext().getHeader(ConfigConstants.OPENDISTRO_SECURITY_REQUEST_HEADERS).split(",")
);
requestHeadersToCopy.remove(Task.X_OPAQUE_ID); // Special case where this header is preserved during stashContext.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, option 1 is better.

@@ -130,7 +131,8 @@ public void testShouldAllowAtRestAndBlockAtTransport() {
// granted at Rest layer
auditLogsRule.assertExactlyOne(privilegePredicateRESTLayer(GRANTED_PRIVILEGES, REST_ONLY, GET, "/" + PROTECTED_API));
// missing at Transport layer
auditLogsRule.assertExactlyOne(
auditLogsRule.assertExactlyScanAll(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why scan all to only evaluate 1?

if (xOpaqueId != null) {
threadContext.putHeader(Task.X_OPAQUE_ID, xOpaqueId);
for (Map.Entry<String, String> header : tmpHeaders.entrySet()) {
threadContext.putHeader(header.getKey(), header.getValue());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 Could this cause perf regression?

@@ -9,6 +9,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Create a mechanism for plugins to explicitly declare actions they need to perform with their assigned PluginSubject ([#5341](https://github.com/opensearch-project/security/pull/5341))

### Changed
- Ensure all restHeaders from ActionPlugin.getRestHeaders are carried to threadContext for tracing ([#5396](https://github.com/opensearch-project/security/pull/5396))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation for this change? Without this how hard is it be debug?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the issue as described here: #4799

WLM (workload management) found this issue bc they have logic to tag queries and assign them to groups. It was noticed that after overridding ActionPlugin.getRestHeaders() the the headers they expected to trace around with the entire request handling process (i.e. from rest down to transport and carried node-to-node) were getting dropped bc of logic in the security repo to control what headers are sent internally. Not all HTTP Headers should be copied to the threadcontext and carried around, but the one's explicitly allowed via ActionPlugin.getRestHeaders() should be.

cwperks and others added 5 commits July 20, 2025 20:37
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] SecurityRestFilter drops the headers from ThreadContext
4 participants