Description
The purpose and use-cases of the new component
This issue is a follow-up to a presentation on enhancing the enrichment capabilities of the Collector, given at the Collector SIG meeting (July 23rd).
The feedback was to create an issue to get the discussion going, so here it is:
The OpenTelemetry Collector currently supports limited enrichment types, mostly focusing on self-contained parsing and contextual metadata. To improve the versatility of the Collector in comparison to other data collectors and transformation tools, we should expand its capabilities to include other enrichment types.
The original document “Enrichment in OTel Collector” introduces a taxonomy of enrichment types and their current support in the Collector:
- Type 1: Self-Contained Parsing & Derivation (supported)
- Type 2: Reference Data Lookup (Static or Semi-Static) (very limited support)
- Type 3: Dynamic External Enrichment (Live Lookups) (not supported)
- Type 4: Contextual Metadata Enrichment (supported)
- Type 5: Cross-Event Correlation & Aggregation (not supported)
- Type 6: Analytical & ML-Based Enrichment (not supported)
Of this list, judging by similar tools (a comparison is included in the original document), types 2 and 3 are the strongest candidates to add to the Collector, since supporting them would facilitate migrating workloads from those tools to the Collector.
From this problem statement, we could consider introducing a Lookup Processor aimed at handling both static reference data lookups (Type 2) and dynamic external enrichment (Type 3).
The processor would support:
- Local lookups: Using static or semi-static data sources such as CSV, JSON, or inline key/value pairs.
- Remote lookups: Dynamic enrichment from APIs, DNS, databases, or cache systems like Redis or Memcached (see the interface sketch below).
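To make the local/remote split concrete, here is a minimal sketch in Go of what a pluggable source abstraction could look like. All identifiers (`lookupprocessor`, `LookupSource`, the method signatures) are hypothetical and not part of any existing Collector API:

```go
// Package lookupprocessor is a hypothetical home for the proposed component.
package lookupprocessor

import "context"

// LookupSource resolves a key to an enrichment value. Local sources
// (CSV, JSON, inline key/value pairs) and remote sources (HTTP, DNS,
// Redis, Memcached) would both implement this interface, so new source
// types could be added without touching the processor core.
type LookupSource interface {
	// Lookup returns the value for key, whether a match was found, and
	// any transient error (e.g. a remote timeout).
	Lookup(ctx context.Context, key string) (value string, found bool, err error)

	// Start loads or connects the source; a semi-static file source
	// could also begin a refresh loop here (cf. refresh_interval).
	Start(ctx context.Context) error

	// Shutdown releases resources such as file watchers or connections.
	Shutdown(ctx context.Context) error
}
```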
Example configuration for the component
```yaml
processors:
  lookupprocessor/json:
    source: json
    path: "/tmp/file.json"
    json_path: .process.name
    source_attribute: process.name
    target_attribute: aws.log.group.names
    timeout: 25ms
    refresh_interval: 1h
  lookupprocessor/http:
    source: http
    url: "https://my.url/query"
    params:
      org: resource.attributes["foo"]
    target_attribute: org
    timeout: 1s
    cache:
      size: 100
      ttl: 60 # seconds
```
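To spell out the intended semantics of this draft: the value of the attribute named by source_attribute would be used as the lookup key and the resolved value written to target_attribute; refresh_interval would control how often a semi-static file is re-read, while the cache block bounds repeated calls to a remote source. The exact field names and behavior are, of course, open for discussion.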
Implementing such a processor requires careful consideration to stay aligned with the long-term vision, namely:
- Progressive implementation, starting with basic local and then remote lookup capabilities.
- Modular structure facilitating easy addition of new lookup sources.
- Built-in caching and timeout mechanisms for performance optimization (a rough sketch follows below).
- Comprehensive and useful observability metrics (success/failure rates, latency percentiles).
This is not an exhaustive list of concerns, nor does it offer solutions; it is simply an acknowledgment that these points will have to be addressed.
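To illustrate the caching and timeout points from the list above, here is a rough sketch of a TTL cache that could wrap any source. It reuses the hypothetical LookupSource interface from the earlier sketch; size-bounded eviction and metrics emission are deliberately left out:

```go
package lookupprocessor

import (
	"context"
	"sync"
	"time"
)

type cacheEntry struct {
	value   string
	found   bool
	expires time.Time
}

// cachedSource wraps any LookupSource with a TTL cache and a per-lookup
// timeout, loosely matching the cache and timeout settings in the draft
// configuration above.
type cachedSource struct {
	next    LookupSource
	ttl     time.Duration
	timeout time.Duration

	mu      sync.Mutex
	entries map[string]cacheEntry
}

func newCachedSource(next LookupSource, ttl, timeout time.Duration) *cachedSource {
	return &cachedSource{next: next, ttl: ttl, timeout: timeout, entries: map[string]cacheEntry{}}
}

func (c *cachedSource) Lookup(ctx context.Context, key string) (string, bool, error) {
	c.mu.Lock()
	if e, ok := c.entries[key]; ok && time.Now().Before(e.expires) {
		c.mu.Unlock()
		return e.value, e.found, nil // cache hit: skip the underlying source
	}
	c.mu.Unlock()

	// Cache miss: consult the underlying source under a bounded deadline.
	ctx, cancel := context.WithTimeout(ctx, c.timeout)
	defer cancel()
	value, found, err := c.next.Lookup(ctx, key)
	if err != nil {
		return "", false, err // transient failures are not cached
	}

	c.mu.Lock()
	c.entries[key] = cacheEntry{value: value, found: found, expires: time.Now().Add(c.ttl)}
	c.mu.Unlock()
	return value, found, nil
}

// Start and Shutdown simply delegate to the wrapped source.
func (c *cachedSource) Start(ctx context.Context) error    { return c.next.Start(ctx) }
func (c *cachedSource) Shutdown(ctx context.Context) error { return c.next.Shutdown(ctx) }
```

Note that this sketch also caches negative results (found == false); whether to do that, along with eviction policy and metric names, is exactly the kind of detail the proposal would need to settle.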
Telemetry data types supported
Logs, Metrics, Traces, Profiles
Code Owner(s)
Sponsor (optional)
No response
Additional context
Similar ideas and alternative solutions have been suggested in the past:
- New component: GRPC processor/connector #20888
- New component: Alert Manager Receiver and Exporter #18526
- Generic resource detector #29627
- New component: Enrich Attributes based on key matching from yaml or csv definition #40526
- Ability to enrich telemetry with additional resource metadata from an inventory datasource #40936
- New component: DNS lookup processor #34398
Tip
React with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding `+1` or `me too`, to help us triage it. Learn more here.