Skip to content

Conversation

@AlonZivony
Copy link
Contributor

@AlonZivony AlonZivony commented Sep 17, 2025

1. Explain what the PR does

A simple implementation idea for supporting the changes of tail calls upon changes in the events in use.
This allows both changes by policy and fallbacks/failures to work properly with the tailcalls.

Implementation

As the changes are hard to track, this PR implement a very simple idea - instead of tracking the diff, we can create for each change the tailcalls from scratches.
It works like this:

  1. Add a watcher option for the dependency manager to track any change at all to the events (whether it is a fallback, add or remove).
  2. Add the logic of cleaning all of the tail call maps
  3. Repopulate the tail call maps with current values.

This might not be the most efficient solution, but it should be the easiest.
In the future we can think how to improve the efficiency with delaying the rebuilding of the maps until it is really needed so we can prevent multiple rebuildings in a row

2. Explain how to test it

3. Other comments

Copilot AI review requested due to automatic review settings September 17, 2025 17:08
@AlonZivony AlonZivony marked this pull request as draft September 17, 2025 17:08
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements support for dynamically updating eBPF tail calls when event dependencies change in the Tracee runtime security tool. The implementation adds state change notifications to the dependency manager and automatically rebuilds tail call mappings when events are added, removed, or when fallbacks are applied.

Key changes:

  • Added state change watchers to the dependency manager that trigger when the dependency tree changes
  • Implemented tail call map rebuilding functionality that clears and repopulates all tail call mappings
  • Connected dependency state changes to tail call updates through a subscription mechanism

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
pkg/events/dependencies/manager.go Added state change subscription mechanism and triggers for dependency tree modifications
pkg/ebpf/tracee.go Implemented tail call map rebuilding functionality and registered state change watcher

var keysToDelete []uint32
for iterator.Next() {
keyBytes := iterator.Key()
if len(keyBytes) >= 4 { // uint32 is 4 bytes
Copy link

Copilot AI Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unsafe pointer conversion lacks bounds checking. If keyBytes has fewer than 4 bytes, this will cause a buffer overflow. Add a check to ensure len(keyBytes) == 4 before the conversion.

Suggested change
if len(keyBytes) >= 4 { // uint32 is 4 bytes
if len(keyBytes) == 4 { // uint32 is 4 bytes

Copilot uses AI. Check for mistakes.
Comment on lines +1562 to +1565
err := t.rebuildAllTailCalls()
if err != nil {
logger.Errorw("Failed to rebuild tailcalls after dependency change", "error", err)
}
Copy link

Copilot AI Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callback function could potentially deadlock since it's called within the dependency manager's mutex and rebuildAllTailCalls() may need to access dependency state. Consider using a goroutine or channel to defer the tail call rebuilding to avoid potential deadlocks.

Suggested change
err := t.rebuildAllTailCalls()
if err != nil {
logger.Errorw("Failed to rebuild tailcalls after dependency change", "error", err)
}
go func() {
err := t.rebuildAllTailCalls()
if err != nil {
logger.Errorw("Failed to rebuild tailcalls after dependency change", "error", err)
}
}()

Copilot uses AI. Check for mistakes.
Comment on lines +698 to +706
// Now delete all the collected keys
for _, key := range keysToDelete {
err := bpfMap.DeleteKey(unsafe.Pointer(&key))
if err != nil {
// Log but don't fail on individual delete errors
logger.Debugw("Failed to delete tailcall map entry", "map", bpfMap.Name(), "key", key, "error", err)
}
}

Copy link

Copilot AI Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silently ignoring delete errors could mask serious issues like corrupted BPF maps or permission problems. Consider collecting and returning accumulated errors, or at least logging at a higher severity level for certain error types.

Suggested change
// Now delete all the collected keys
for _, key := range keysToDelete {
err := bpfMap.DeleteKey(unsafe.Pointer(&key))
if err != nil {
// Log but don't fail on individual delete errors
logger.Debugw("Failed to delete tailcall map entry", "map", bpfMap.Name(), "key", key, "error", err)
}
}
// Now delete all the collected keys, accumulate errors
var deleteErrors []error
for _, key := range keysToDelete {
err := bpfMap.DeleteKey(unsafe.Pointer(&key))
if err != nil {
// Log at warning level and accumulate error
logger.Warnw("Failed to delete tailcall map entry", "map", bpfMap.Name(), "key", key, "error", err)
deleteErrors = append(deleteErrors, fmt.Errorf("map %s key %d: %w", bpfMap.Name(), key, err))
}
}
if len(deleteErrors) > 0 {
return errors.Join(deleteErrors...)
}

Copilot uses AI. Check for mistakes.
@AlonZivony
Copy link
Contributor Author

I made it quickly with Cursor, so upon deciding that it is what we want to implement I will improve the names and logic.
This gives us a rough scale of the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant