From 4e3618d45a1fa812f6b7403c293599915ff364c5 Mon Sep 17 00:00:00 2001 From: Nate Archer Date: Mon, 26 May 2025 08:46:54 -0500 Subject: [PATCH 1/5] Rework Data Prepper section Signed-off-by: Nate Archer --- _data-prepper/getting-started.md | 3 + _data-prepper/getting-started/concepts.md | 20 +++ .../getting-started/getting-started.md | 9 ++ .../getting-started/install-and-configure.md | 52 ++++++++ .../getting-started/run-data-prepper.md | 124 ++++++++++++++++++ _data-prepper/index.md | 55 +------- .../configuring-log4j.md | 2 + .../managing-data-prepper.md | 4 +- .../managing-data-prepper/monitoring.md | 2 + .../managing-data-prepper/peer-forwarder.md | 2 + .../source-coordination.md | 2 + .../pipelines/{ => functions}/cidrcontains.md | 0 .../pipelines/{ => functions}/contains.md | 0 .../pipelines/{ => functions}/functions.md | 2 + .../pipelines/{ => functions}/get-metadata.md | 0 .../pipelines/{ => functions}/has-tags.md | 0 .../pipelines/{ => functions}/join.md | 0 .../pipelines/{ => functions}/length.md | 0 18 files changed, 223 insertions(+), 54 deletions(-) create mode 100644 _data-prepper/getting-started/concepts.md create mode 100644 _data-prepper/getting-started/getting-started.md create mode 100644 _data-prepper/getting-started/install-and-configure.md create mode 100644 _data-prepper/getting-started/run-data-prepper.md rename _data-prepper/pipelines/{ => functions}/cidrcontains.md (100%) rename _data-prepper/pipelines/{ => functions}/contains.md (100%) rename _data-prepper/pipelines/{ => functions}/functions.md (96%) rename _data-prepper/pipelines/{ => functions}/get-metadata.md (100%) rename _data-prepper/pipelines/{ => functions}/has-tags.md (100%) rename _data-prepper/pipelines/{ => functions}/join.md (100%) rename _data-prepper/pipelines/{ => functions}/length.md (100%) diff --git a/_data-prepper/getting-started.md b/_data-prepper/getting-started.md index 5dc90316d0f..10aea92bdf9 100644 --- a/_data-prepper/getting-started.md +++ 
b/_data-prepper/getting-started.md @@ -6,6 +6,8 @@ redirect_from: - /clients/data-prepper/get-started/ --- + + # Getting started with OpenSearch Data Prepper OpenSearch Data Prepper is an independent component, not an OpenSearch plugin, that converts data for use with OpenSearch. It's not bundled with the all-in-one OpenSearch installation packages. @@ -54,6 +56,7 @@ Configuration files are read from specific subdirectories in the application's h 2. `config/data-prepper-config.yaml`: Used for the Data Prepper server configuration. You can supply your own pipeline configuration file path followed by the server configuration file path. However, this method will not be supported in a future release. See the following example: + ``` bin/data-prepper pipelines.yaml data-prepper-config.yaml ``` diff --git a/_data-prepper/getting-started/concepts.md b/_data-prepper/getting-started/concepts.md new file mode 100644 index 00000000000..d15e074d2be --- /dev/null +++ b/_data-prepper/getting-started/concepts.md @@ -0,0 +1,20 @@ +--- +layout: default +title: Concepts +nav_order: 10 +grand_parent: OpenSearch Data Prepper +parent: Getting started with OpenSearch Data Prepper +--- + +## Key concepts and fundamentals + +Data Prepper ingests data through customizable [pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/). These pipelines consist of pluggable components that you can customize to fit your needs, even allowing you to plug in your own implementations. 
A Data Prepper pipeline consists of the following components: + +- One [source]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/sources/) +- One or more [sinks]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sinks/sinks/) +- (Optional) One [buffer]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/buffers/buffers/) +- (Optional) One or more [processors]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/processors/) + +Each pipeline contains two required components: `source` and `sink`. If a `buffer`, a `processor`, or both are missing from the pipeline, then Data Prepper uses the default `bounded_blocking` buffer and a no-op processor. Note that a single instance of Data Prepper can have one or more pipelines. + + \ No newline at end of file diff --git a/_data-prepper/getting-started/getting-started.md b/_data-prepper/getting-started/getting-started.md new file mode 100644 index 00000000000..790f35cd522 --- /dev/null +++ b/_data-prepper/getting-started/getting-started.md @@ -0,0 +1,9 @@ +--- +layout: default +title: Getting started with OpenSearch Data Prepper +nav_order: 5 +parent: OpenSearch Data Prepper +has_children: yes +redirect_from: + - /clients/data-prepper/get-started/ +--- \ No newline at end of file diff --git a/_data-prepper/getting-started/install-and-configure.md b/_data-prepper/getting-started/install-and-configure.md new file mode 100644 index 00000000000..f36f01a251a --- /dev/null +++ b/_data-prepper/getting-started/install-and-configure.md @@ -0,0 +1,52 @@ +## 1. Installing Data Prepper + +There are two ways to install Data Prepper: you can run the Docker image or build from source. + +The easiest way to use Data Prepper is by running the Docker image. We suggest that you use this approach if you have [Docker](https://www.docker.com) available. 
Run the following command: + +``` +docker pull opensearchproject/data-prepper:latest +``` +{% include copy.html %} + +If you have special requirements that require you to build from source, or if you want to contribute, see the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md). + +## 2. Configuring Data Prepper + +Two configuration files are required to run a Data Prepper instance. Optionally, you can configure a Log4j 2 configuration file. See [Configuring Log4j]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-log4j/) for more information. The following list describes the purpose of each configuration file: + +* `pipelines.yaml`: This file describes which data pipelines to run, including sources, processors, and sinks. +* `data-prepper-config.yaml`: This file contains Data Prepper server settings that allow you to interact with exposed Data Prepper server APIs. +* `log4j2-rolling.properties` (optional): This file contains Log4j 2 configuration options and can be a JSON, YAML, XML, or .properties file type. + +For Data Prepper versions earlier than 2.0, the `.jar` file expects the pipeline configuration file path to be followed by the server configuration file path. See the following configuration path example: + +``` +java -jar data-prepper-core-$VERSION.jar pipelines.yaml data-prepper-config.yaml +``` + +Optionally, you can add `"-Dlog4j.configurationFile=config/log4j2.properties"` to the command to pass a custom Log4j 2 configuration file. If you don't provide a properties file, Data Prepper defaults to the `log4j2.properties` file in the `shared-config` directory. + + +Starting with Data Prepper 2.0, you can launch Data Prepper by using the following `data-prepper` script that does not require any additional command line arguments: + +``` +bin/data-prepper +``` + +Configuration files are read from specific subdirectories in the application's home directory: +1. 
`pipelines/`: Used for pipeline configurations. Pipeline configurations can be written in one or more YAML files. +2. `config/data-prepper-config.yaml`: Used for the Data Prepper server configuration. + +You can supply your own pipeline configuration file path followed by the server configuration file path. However, this method will not be supported in a future release. See the following example: + +``` +bin/data-prepper pipelines.yaml data-prepper-config.yaml +``` + +The Log4j 2 configuration file is read from the `config/log4j2.properties` file located in the application's home directory. + +To configure Data Prepper, see the following information for each use case: + +* [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/): Learn how to collect trace data and customize a pipeline that ingests and transforms that data. +* [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/): Learn how to set up Data Prepper for log observability. \ No newline at end of file diff --git a/_data-prepper/getting-started/run-data-prepper.md b/_data-prepper/getting-started/run-data-prepper.md new file mode 100644 index 00000000000..476604a9370 --- /dev/null +++ b/_data-prepper/getting-started/run-data-prepper.md @@ -0,0 +1,124 @@ +--- +layout: default +title: Running Data Prepper +nav_order: 15 +grand_parent: OpenSearch Data Prepper +parent: Getting started with OpenSearch Data Prepper +--- + +## Defining a pipeline + +Create a Data Prepper pipeline file named `pipelines.yaml`, similar to the following sample configuration: + +```yml +simple-sample-pipeline: + workers: 2 + delay: "5000" + source: + random: + sink: + - stdout: +``` +{% include copy.html %} + +### Basic pipeline configurations + +To understand how the pipeline components function within a Data Prepper configuration, see the following examples. Each pipeline configuration uses a `yaml` file format. 
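The structural rules a pipeline definition must satisfy — one required `source`, one or more `sink` entries, with `buffer` and `processor` optional and defaulting to `bounded_blocking` and a no-op processor — can be sketched as a quick sanity check. The following Python sketch applies those rules to a plain-dictionary form of the `simple-sample-pipeline` shown above; the `validate_pipeline` helper is illustrative only and is not part of Data Prepper.

```python
# Illustrative sketch only: checks the structural rules for a pipeline
# definition (one required source, one or more sinks; buffer and
# processor are optional and fall back to defaults).
# validate_pipeline is a hypothetical helper, not a Data Prepper API.
def validate_pipeline(name, definition):
    """Return a list of problems found in a pipeline definition dict."""
    problems = []
    if not definition.get("source"):
        problems.append(f"{name}: a 'source' is required")
    if not definition.get("sink"):
        problems.append(f"{name}: at least one 'sink' is required")
    # A missing buffer or processor is fine: Data Prepper supplies defaults.
    definition.setdefault("buffer", {"bounded_blocking": {}})
    definition.setdefault("processor", [])
    return problems

# Dictionary form of the simple-sample-pipeline configuration above.
simple_sample_pipeline = {
    "workers": 2,
    "delay": "5000",
    "source": {"random": {}},
    "sink": [{"stdout": {}}],
}

print(validate_pipeline("simple-sample-pipeline", simple_sample_pipeline))  # []
```

A definition that omits the `sink` key would instead produce a problem entry, mirroring how Data Prepper treats `source` and `sink` as required while quietly filling in the optional components.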
For more information and examples, see [Pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/). + +#### Minimal configuration + +The following minimal pipeline configuration reads from the file source and writes the data to another file on the same path. It uses the default options for the `buffer` and `processor` components. + +```yml +sample-pipeline: + source: + file: + path: + sink: + - file: + path: +``` + +#### Comprehensive configuration + +The following comprehensive pipeline configuration uses both required and optional components: + +```yml +sample-pipeline: + workers: 4 # Number of workers + delay: 100 # in milliseconds, how often the workers should run + source: + file: + path: + buffer: + bounded_blocking: + buffer_size: 1024 # max number of events the buffer will accept + batch_size: 256 # max number of events the buffer will drain for each read + processor: + - string_converter: + upper_case: true + sink: + - file: + path: +``` + +In the given pipeline configuration, the `source` component reads string events from the `input-file` and pushes the data to a bounded buffer with a maximum size of `1024`. The `workers` component specifies `4` concurrent threads that will process events from the buffer, each reading a maximum of `256` events from the buffer every `100` milliseconds. Each worker runs the `string_converter` processor, which converts the strings to uppercase and writes the processed output to the `output-file`. + +## 4. Running Data Prepper + +Run the following command with your pipeline configuration YAML file: + +```bash +docker run --name data-prepper \ + -v /${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml \ + opensearchproject/data-prepper:latest + +``` +{% include copy.html %} + +The example pipeline configuration above demonstrates a simple pipeline with a source (`random`) sending data to a sink (`stdout`).
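To make the buffer and worker semantics described above concrete, here is a minimal Python simulation — not Data Prepper code — in which a bounded queue stands in for the `bounded_blocking` buffer and worker threads drain it in batches of up to `batch_size` events, applying an uppercase conversion the way `string_converter` does:

```python
import queue
import threading

# Minimal simulation (not Data Prepper code) of the comprehensive
# configuration: a bounded buffer drained in batches by worker threads
# that uppercase each string event.
BUFFER_SIZE, BATCH_SIZE, WORKERS = 1024, 256, 4

buffer = queue.Queue(maxsize=BUFFER_SIZE)  # stands in for bounded_blocking
output = []
output_lock = threading.Lock()

def worker():
    while True:
        batch = []
        while len(batch) < BATCH_SIZE:  # drain at most batch_size events per read
            try:
                batch.append(buffer.get_nowait())
            except queue.Empty:
                break
        if not batch:
            return  # buffer drained; a real worker would wait `delay` ms and retry
        processed = [event.upper() for event in batch]  # string_converter, upper_case
        with output_lock:
            output.extend(processed)

# Source: write string events into the buffer (put() blocks when the buffer is full).
for i in range(1000):
    buffer.put(f"event-{i}")

threads = [threading.Thread(target=worker) for _ in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(output))  # 1000 -- every event is processed exactly once
```

Because the queue hands each event to exactly one worker, every event is uppercased once regardless of how the batches are interleaved across the four threads, which is the property the real buffer and worker pool provide.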
For examples of more advanced pipeline configurations, see [Pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/). + +After starting Data Prepper, you should see log output and some UUIDs after a few seconds: + +``` +2021-09-30T20:19:44,147 [main] INFO com.amazon.dataprepper.pipeline.server.DataPrepperServer - Data Prepper server running at :4900 +2021-09-30T20:19:44,681 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer +2021-09-30T20:19:45,183 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer +2021-09-30T20:19:45,687 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer +2021-09-30T20:19:46,191 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer +2021-09-30T20:19:46,694 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer +2021-09-30T20:19:47,200 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer +2021-09-30T20:19:49,181 [simple-test-pipeline-processor-worker-1-thread-1] INFO com.amazon.dataprepper.pipeline.ProcessWorker - simple-test-pipeline Worker: Processing 6 records from buffer +07dc0d37-da2c-447e-a8df-64792095fb72 +5ac9b10a-1d21-4306-851a-6fb12f797010 +99040c79-e97b-4f1d-a70b-409286f2a671 +5319a842-c028-4c17-a613-3ef101bd2bdd +e51e700e-5cab-4f6d-879a-1c3235a77d18 +b4ed2d7e-cf9c-4e9d-967c-b18e8af35c90 +``` +The remainder of this page provides examples for running Data Prepper from the Docker image. If you +built it from source, refer to the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md) for more information. + +However you configure your pipeline, you'll run Data Prepper the same way.
You run the Docker +image and modify both the `pipelines.yaml` and `data-prepper-config.yaml` files. + +For Data Prepper 2.0 or later, use this command: + +``` +docker run --name data-prepper -p 4900:4900 -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml -v ${PWD}/data-prepper-config.yaml:/usr/share/data-prepper/config/data-prepper-config.yaml opensearchproject/data-prepper:latest +``` +{% include copy.html %} + +For Data Prepper versions earlier than 2.0, use this command: + +``` +docker run --name data-prepper -p 4900:4900 -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines.yaml -v ${PWD}/data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml opensearchproject/data-prepper:1.x +``` +{% include copy.html %} + +Once Data Prepper is running, it processes data until it is shut down. When you are finished, shut it down with the following command: + +``` +POST /shutdown +``` +{% include copy-curl.html %} \ No newline at end of file diff --git a/_data-prepper/index.md b/_data-prepper/index.md index 63ff2fd07c1..8b9df7eaaa2 100644 --- a/_data-prepper/index.md +++ b/_data-prepper/index.md @@ -16,61 +16,10 @@ redirect_from: OpenSearch Data Prepper is a server-side data collector capable of filtering, enriching, transforming, normalizing, and aggregating data for downstream analysis and visualization. Data Prepper is the preferred data ingestion tool for OpenSearch. It is recommended for most data ingestion use cases in OpenSearch and for processing large, complex datasets. -With Data Prepper you can build custom pipelines to improve the operational view of applications. Two common use cases for Data Prepper are trace analytics and log analytics. [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/) can help you visualize event flows and identify performance problems.
[Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/) equips you with tools to enhance your search capabilities, conduct comprehensive analysis, and gain insights into your applications' performance and behavior. +With Data Prepper you can build custom pipelines to improve the operational view of applications. Two common use cases for Data Prepper are trace analytics and log analytics. -## Key concepts and fundamentals +- [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/) can help you visualize event flows and identify performance problems. -- - [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/) equips you with tools to enhance your search capabilities, conduct comprehensive analysis, and gain insights into your applications' performance and behavior. -Data Prepper ingests data through customizable [pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/). These pipelines consist of pluggable components that you can customize to fit your needs, even allowing you to plug in your own implementations. A Data Prepper pipeline consists of the following components: - -- One [source]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/sources/) -- One or more [sinks]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sinks/sinks/) -- (Optional) One [buffer]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/buffers/buffers/) -- (Optional) One or more [processors]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/processors/) - -Each pipeline contains two required components: `source` and `sink`. If a `buffer`, a `processor`, or both are missing from the pipeline, then Data Prepper uses the default `bounded_blocking` buffer and a no-op processor. Note that a single instance of Data Prepper can have one or more pipelines. 
- -## Basic pipeline configurations - -To understand how the pipeline components function within a Data Prepper configuration, see the following examples. Each pipeline configuration uses a `yaml` file format. For more information, see [Pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/) for more information and examples. - -### Minimal configuration - -The following minimal pipeline configuration reads from the file source and writes the data to another file on the same path. It uses the default options for the `buffer` and `processor` components. - -```yml -sample-pipeline: - source: - file: - path: - sink: - - file: - path: -``` - -### Comprehensive configuration - -The following comprehensive pipeline configuration uses both required and optional components: - -```yml -sample-pipeline: - workers: 4 # Number of workers - delay: 100 # in milliseconds, how often the workers should run - source: - file: - path: - buffer: - bounded_blocking: - buffer_size: 1024 # max number of events the buffer will accept - batch_size: 256 # max number of events the buffer will drain for each read - processor: - - string_converter: - upper_case: true - sink: - - file: - path: -``` - -In the given pipeline configuration, the `source` component reads string events from the `input-file` and pushes the data to a bounded buffer with a maximum size of `1024`. The `workers` component specifies `4` concurrent threads that will process events from the buffer, each reading a maximum of `256` events from the buffer every `100` milliseconds. Each `workers` component runs the `string_converter` processor, which converts the strings to uppercase and writes the processed output to the `output-file`. 
## Next steps diff --git a/_data-prepper/managing-data-prepper/configuring-log4j.md b/_data-prepper/managing-data-prepper/configuring-log4j.md index fe256e0da5e..ac4010bed05 100644 --- a/_data-prepper/managing-data-prepper/configuring-log4j.md +++ b/_data-prepper/managing-data-prepper/configuring-log4j.md @@ -5,6 +5,8 @@ parent: Managing OpenSearch Data Prepper nav_order: 20 --- + + # Configuring Log4j You can configure logging using Log4j in OpenSearch Data Prepper. diff --git a/_data-prepper/managing-data-prepper/managing-data-prepper.md b/_data-prepper/managing-data-prepper/managing-data-prepper.md index 204510be248..1e42b965f7e 100644 --- a/_data-prepper/managing-data-prepper/managing-data-prepper.md +++ b/_data-prepper/managing-data-prepper/managing-data-prepper.md @@ -7,4 +7,6 @@ nav_order: 20 # Managing OpenSearch Data Prepper -You can perform administrator functions for OpenSearch Data Prepper, including system configuration, interacting with core APIs, Log4j configuration, and monitoring. You can set up peer forwarding to coordinate multiple Data Prepper nodes when using stateful aggregation. \ No newline at end of file +You can perform administrator functions for OpenSearch Data Prepper, including system configuration, interacting with core APIs, Log4j configuration, and monitoring. You can set up peer forwarding to coordinate multiple Data Prepper nodes when using stateful aggregation. 
+ + \ No newline at end of file diff --git a/_data-prepper/managing-data-prepper/monitoring.md b/_data-prepper/managing-data-prepper/monitoring.md index cb29e49a518..10199b90100 100644 --- a/_data-prepper/managing-data-prepper/monitoring.md +++ b/_data-prepper/managing-data-prepper/monitoring.md @@ -4,6 +4,8 @@ title: Monitoring parent: Managing OpenSearch Data Prepper nav_order: 25 --- + + # Monitoring OpenSearch Data Prepper with metrics diff --git a/_data-prepper/managing-data-prepper/peer-forwarder.md b/_data-prepper/managing-data-prepper/peer-forwarder.md index 9d54aef87c9..13ad9361f7e 100644 --- a/_data-prepper/managing-data-prepper/peer-forwarder.md +++ b/_data-prepper/managing-data-prepper/peer-forwarder.md @@ -5,6 +5,8 @@ nav_order: 12 parent: Managing OpenSearch Data Prepper --- + + # Peer forwarder Peer forwarder is an HTTP service that performs peer forwarding of an `event` between OpenSearch Data Prepper nodes for aggregation. This HTTP service uses a hash-ring approach to aggregate events and determine which Data Prepper node should handle a given trace before rerouting it to that node. Currently, peer forwarder is supported by the `aggregate`, `service_map_stateful`, and `otel_traces_raw` [processors]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/processors/). diff --git a/_data-prepper/managing-data-prepper/source-coordination.md b/_data-prepper/managing-data-prepper/source-coordination.md index 5dc85e50a7c..5a77ba501e1 100644 --- a/_data-prepper/managing-data-prepper/source-coordination.md +++ b/_data-prepper/managing-data-prepper/source-coordination.md @@ -5,6 +5,8 @@ nav_order: 35 parent: Managing OpenSearch Data Prepper --- + + # Source coordination _Source coordination_ is the concept of coordinating and distributing work between OpenSearch Data Prepper data sources in a multi-node environment.
Some data sources, such as Amazon Kinesis or Amazon Simple Queue Service (Amazon SQS), handle coordination natively. Other data sources, such as OpenSearch, Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB, and JDBC/ODBC, do not support source coordination. diff --git a/_data-prepper/pipelines/cidrcontains.md b/_data-prepper/pipelines/functions/cidrcontains.md similarity index 100% rename from _data-prepper/pipelines/cidrcontains.md rename to _data-prepper/pipelines/functions/cidrcontains.md diff --git a/_data-prepper/pipelines/contains.md b/_data-prepper/pipelines/functions/contains.md similarity index 100% rename from _data-prepper/pipelines/contains.md rename to _data-prepper/pipelines/functions/contains.md diff --git a/_data-prepper/pipelines/functions.md b/_data-prepper/pipelines/functions/functions.md similarity index 96% rename from _data-prepper/pipelines/functions.md rename to _data-prepper/pipelines/functions/functions.md index caed78ac550..44689578609 100644 --- a/_data-prepper/pipelines/functions.md +++ b/_data-prepper/pipelines/functions/functions.md @@ -10,6 +10,8 @@ has_children: true OpenSearch Data Prepper offers a range of built-in functions that can be used within expressions to perform common data preprocessing tasks, such as calculating lengths, checking for tags, retrieving metadata, searching for substrings, checking IP address ranges, and joining list elements. 
These functions include the following: + + - [`cidrContains()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/cidrcontains/) - [`contains()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/contains/) - [`getMetadata()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/get-metadata/) diff --git a/_data-prepper/pipelines/get-metadata.md b/_data-prepper/pipelines/functions/get-metadata.md similarity index 100% rename from _data-prepper/pipelines/get-metadata.md rename to _data-prepper/pipelines/functions/get-metadata.md diff --git a/_data-prepper/pipelines/has-tags.md b/_data-prepper/pipelines/functions/has-tags.md similarity index 100% rename from _data-prepper/pipelines/has-tags.md rename to _data-prepper/pipelines/functions/has-tags.md diff --git a/_data-prepper/pipelines/join.md b/_data-prepper/pipelines/functions/join.md similarity index 100% rename from _data-prepper/pipelines/join.md rename to _data-prepper/pipelines/functions/join.md diff --git a/_data-prepper/pipelines/length.md b/_data-prepper/pipelines/functions/length.md similarity index 100% rename from _data-prepper/pipelines/length.md rename to _data-prepper/pipelines/functions/length.md From 54a4ef37bfdfb5c6ad70431c52fccc977d26b6f9 Mon Sep 17 00:00:00 2001 From: Archer Date: Thu, 19 Jun 2025 12:10:17 -0500 Subject: [PATCH 2/5] Fix links Signed-off-by: Archer --- _data-prepper/getting-started/concepts.md | 2 +- _data-prepper/getting-started/install-and-configure.md | 9 +++++++++ .../managing-data-prepper/configuring-data-prepper.md | 2 +- _data-prepper/pipelines/functions/cidrcontains.md | 2 ++ _data-prepper/pipelines/functions/contains.md | 2 ++ _data-prepper/pipelines/functions/functions.md | 2 ++ _data-prepper/pipelines/functions/get-metadata.md | 2 ++ _data-prepper/pipelines/functions/has-tags.md | 2 ++ _data-prepper/pipelines/functions/join.md | 2 ++ _data-prepper/pipelines/functions/length.md | 2 ++ 10 files changed, 25 insertions(+), 2 deletions(-) diff --git 
a/_data-prepper/getting-started/concepts.md b/_data-prepper/getting-started/concepts.md index d15e074d2be..10b2d9e4b14 100644 --- a/_data-prepper/getting-started/concepts.md +++ b/_data-prepper/getting-started/concepts.md @@ -6,7 +6,7 @@ grand_parent: OpenSearch Data Prepper parent: Getting started with OpenSearch Data Prepper --- -## Key concepts and fundamentals +# Key concepts and fundamentals Data Prepper ingests data through customizable [pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/). These pipelines consist of pluggable components that you can customize to fit your needs, even allowing you to plug in your own implementations. A Data Prepper pipeline consists of the following components: diff --git a/_data-prepper/getting-started/install-and-configure.md b/_data-prepper/getting-started/install-and-configure.md index f36f01a251a..f2da871f510 100644 --- a/_data-prepper/getting-started/install-and-configure.md +++ b/_data-prepper/getting-started/install-and-configure.md @@ -1,3 +1,12 @@ +--- +layout: default +title: Install and configure OpenSearch Data Prepper +nav_order: 10 +grand_parent: OpenSearch Data Prepper +parent: Getting started with OpenSearch Data Prepper +--- + + ## 1. Installing Data Prepper There are two ways to install Data Prepper: you can run the Docker image or build from source. diff --git a/_data-prepper/managing-data-prepper/configuring-data-prepper.md b/_data-prepper/managing-data-prepper/configuring-data-prepper.md index ab5f3aa0667..52b3f775d3b 100644 --- a/_data-prepper/managing-data-prepper/configuring-data-prepper.md +++ b/_data-prepper/managing-data-prepper/configuring-data-prepper.md @@ -103,7 +103,7 @@ check_interval | No | Duration | Specifies the time between checks of the heap s ### Extension plugins -Data Prepper provides support for user-configurable extension plugins. 
Extension plugins are common configurations shared across pipeline plugins, such as [sources, buffers, processors, and sinks]({{site.url}}{{site.baseurl}}/data-prepper/index/#key-concepts-and-fundamentals). +Data Prepper provides support for user-configurable extension plugins. Extension plugins are common configurations shared across pipeline plugins, such as [sources, buffers, processors, and sinks]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/concepts/). ### AWS extension plugins diff --git a/_data-prepper/pipelines/functions/cidrcontains.md b/_data-prepper/pipelines/functions/cidrcontains.md index 898f1bc1f58..65a1a7eb606 100644 --- a/_data-prepper/pipelines/functions/cidrcontains.md +++ b/_data-prepper/pipelines/functions/cidrcontains.md @@ -4,6 +4,8 @@ title: cidrContains() parent: Functions grand_parent: Pipelines nav_order: 5 +redirect_from: + - /data-prepper/pipelines/cidrcontains/ --- # cidrContains() diff --git a/_data-prepper/pipelines/functions/contains.md b/_data-prepper/pipelines/functions/contains.md index 657f66bd28e..565acdc08ee 100644 --- a/_data-prepper/pipelines/functions/contains.md +++ b/_data-prepper/pipelines/functions/contains.md @@ -4,6 +4,8 @@ title: contains() parent: Functions grand_parent: Pipelines nav_order: 10 +redirect_from: + - /data-prepper/pipelines/contains/ --- # contains() diff --git a/_data-prepper/pipelines/functions/functions.md b/_data-prepper/pipelines/functions/functions.md index 44689578609..1a12278207e 100644 --- a/_data-prepper/pipelines/functions/functions.md +++ b/_data-prepper/pipelines/functions/functions.md @@ -4,6 +4,8 @@ title: Functions parent: Pipelines nav_order: 10 has_children: true +redirect_from: + - /data-prepper/pipelines/functions/ --- # Functions diff --git a/_data-prepper/pipelines/functions/get-metadata.md b/_data-prepper/pipelines/functions/get-metadata.md index fc89ed51d6c..e0753322050 100644 --- a/_data-prepper/pipelines/functions/get-metadata.md +++ 
b/_data-prepper/pipelines/functions/get-metadata.md @@ -4,6 +4,8 @@ title: getMetadata() parent: Functions grand_parent: Pipelines nav_order: 15 +redirect_from: + - /data-prepper/pipelines/get-metadata/ --- # getMetadata() diff --git a/_data-prepper/pipelines/functions/has-tags.md b/_data-prepper/pipelines/functions/has-tags.md index 85058429936..e65b541f5d9 100644 --- a/_data-prepper/pipelines/functions/has-tags.md +++ b/_data-prepper/pipelines/functions/has-tags.md @@ -4,6 +4,8 @@ title: hasTags() parent: Functions grand_parent: Pipelines nav_order: 20 +redirect_from: + - /data-prepper/pipelines/has-tags/ --- # hasTags() diff --git a/_data-prepper/pipelines/functions/join.md b/_data-prepper/pipelines/functions/join.md index 3a4d77d5c2e..17305bec3d0 100644 --- a/_data-prepper/pipelines/functions/join.md +++ b/_data-prepper/pipelines/functions/join.md @@ -4,6 +4,8 @@ title: join() parent: Functions grand_parent: Pipelines nav_order: 25 +redirect_from: + - /data-prepper/pipelines/join/ --- # join() diff --git a/_data-prepper/pipelines/functions/length.md b/_data-prepper/pipelines/functions/length.md index fca4b10df2a..53a620687f2 100644 --- a/_data-prepper/pipelines/functions/length.md +++ b/_data-prepper/pipelines/functions/length.md @@ -4,6 +4,8 @@ title: length() parent: Functions grand_parent: Pipelines nav_order: 30 +redirect_from: + - /data-prepper/pipelines/length/ --- # length() From 793150b30e27c0050e6b75455190248ce5c9dbd3 Mon Sep 17 00:00:00 2001 From: Archer Date: Thu, 19 Jun 2025 12:38:34 -0500 Subject: [PATCH 3/5] Add cards and intros for new pages Signed-off-by: Archer --- _data-prepper/getting-started.md | 163 ------------------ .../getting-started/getting-started.md | 21 ++- .../getting-started/install-and-configure.md | 7 + .../getting-started/run-data-prepper.md | 4 + _data-prepper/index.md | 26 ++- .../pipelines/functions/functions.md | 30 +++- 6 files changed, 72 insertions(+), 179 deletions(-) delete mode 100644 _data-prepper/getting-started.md 
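For intuition, a few of the expression functions whose pages are relocated above behave roughly like the following Python analogues. These are illustrative sketches only: the `dp_`-prefixed names and their signatures are assumptions for this example and do not reflect Data Prepper's actual implementation.

```python
import ipaddress

# Rough Python analogues of a few Data Prepper expression functions.
# Illustrative only: names and signatures are assumptions, not the
# actual Data Prepper implementation.

def dp_contains(haystack, needle):      # contains(): substring/element check
    return needle in haystack

def dp_join(delimiter, items):          # join(): joins list elements with a delimiter
    return delimiter.join(str(item) for item in items)

def dp_length(value):                   # length(): length of a value
    return len(value)

def dp_cidr_contains(ip, *cidrs):      # cidrContains(): is the IP in any CIDR range?
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)

print(dp_cidr_contains("10.0.1.5", "10.0.0.0/16", "192.168.0.0/24"))  # True
```

For the exact semantics and argument forms of each function, see the individual function pages being moved in this patch.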
diff --git a/_data-prepper/getting-started.md b/_data-prepper/getting-started.md deleted file mode 100644 index 10aea92bdf9..00000000000 --- a/_data-prepper/getting-started.md +++ /dev/null @@ -1,163 +0,0 @@ ---- -layout: default -title: Getting started with OpenSearch Data Prepper -nav_order: 5 -redirect_from: - - /clients/data-prepper/get-started/ ---- - - - -# Getting started with OpenSearch Data Prepper - -OpenSearch Data Prepper is an independent component, not an OpenSearch plugin, that converts data for use with OpenSearch. It's not bundled with the all-in-one OpenSearch installation packages. - -If you are migrating from Open Distro Data Prepper, see [Migrating from Open Distro]({{site.url}}{{site.baseurl}}/data-prepper/migrate-open-distro/). -{: .note} - -## 1. Installing Data Prepper - -There are two ways to install Data Prepper: you can run the Docker image or build from source. - -The easiest way to use Data Prepper is by running the Docker image. We suggest that you use this approach if you have [Docker](https://www.docker.com) available. Run the following command: - -``` -docker pull opensearchproject/data-prepper:latest -``` -{% include copy.html %} - -If you have special requirements that require you to build from source, or if you want to contribute, see the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md). - -## 2. Configuring Data Prepper - -Two configuration files are required to run a Data Prepper instance. Optionally, you can configure a Log4j 2 configuration file. See [Configuring Log4j]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-log4j/) for more information. The following list describes the purpose of each configuration file: - -* `pipelines.yaml`: This file describes which data pipelines to run, including sources, processors, and sinks. 
-* `data-prepper-config.yaml`: This file contains Data Prepper server settings that allow you to interact with exposed Data Prepper server APIs. -* `log4j2-rolling.properties` (optional): This file contains Log4j 2 configuration options and can be a JSON, YAML, XML, or .properties file type. - -For Data Prepper versions earlier than 2.0, the `.jar` file expects the pipeline configuration file path to be followed by the server configuration file path. See the following configuration path example: - -``` -java -jar data-prepper-core-$VERSION.jar pipelines.yaml data-prepper-config.yaml -``` - -Optionally, you can add `"-Dlog4j.configurationFile=config/log4j2.properties"` to the command to pass a custom Log4j 2 configuration file. If you don't provide a properties file, Data Prepper defaults to the `log4j2.properties` file in the `shared-config` directory. - - -Starting with Data Prepper 2.0, you can launch Data Prepper by using the following `data-prepper` script that does not require any additional command line arguments: - -``` -bin/data-prepper -``` - -Configuration files are read from specific subdirectories in the application's home directory: -1. `pipelines/`: Used for pipeline configurations. Pipeline configurations can be written in one or more YAML files. -2. `config/data-prepper-config.yaml`: Used for the Data Prepper server configuration. - -You can supply your own pipeline configuration file path followed by the server configuration file path. However, this method will not be supported in a future release. See the following example: - -``` -bin/data-prepper pipelines.yaml data-prepper-config.yaml -``` - -The Log4j 2 configuration file is read from the `config/log4j2.properties` file located in the application's home directory. 
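The configuration layout described above can be sketched as follows. This is an illustrative layout only; any file name not mentioned in the surrounding text is an assumption:

```
data-prepper/            # application home directory
├── bin/
│   └── data-prepper                 # launch script (Data Prepper 2.0+)
├── config/
│   ├── data-prepper-config.yaml     # server configuration
│   └── log4j2.properties            # Log4j 2 configuration
└── pipelines/
    └── pipelines.yaml               # one or more pipeline YAML files
```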
- -To configure Data Prepper, see the following information for each use case: - -* [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/): Learn how to collect trace data and customize a pipeline that ingests and transforms that data. -* [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/): Learn how to set up Data Prepper for log observability. - -## 3. Defining a pipeline - -Create a Data Prepper pipeline file named `pipelines.yaml` using the following configuration: - -```yml -simple-sample-pipeline: - workers: 2 - delay: "5000" - source: - random: - sink: - - stdout: -``` -{% include copy.html %} - -## 4. Running Data Prepper - -Run the following command with your pipeline configuration YAML. - -```bash -docker run --name data-prepper \ - -v /${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml \ - opensearchproject/data-prepper:latest - -``` -{% include copy.html %} - -The example pipeline configuration above demonstrates a simple pipeline with a source (`random`) sending data to a sink (`stdout`). For examples of more advanced pipeline configurations, see [Pipelines]({{site.url}}{{site.baseurl}}/clients/data-prepper/pipelines/). 
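As a sketch of what a slightly fuller pipeline might look like, the following configuration adds a processor stage between source and sink. The `grok` pattern and `opensearch` sink settings shown here are illustrative assumptions, not part of this documentation change:

```yml
log-pipeline:
  source:
    http:                             # accepts log events over HTTP
  processor:
    - grok:
        match:
          log: ["%{COMMONAPACHELOG}"] # parse Apache common log lines
  sink:
    - opensearch:
        hosts: ["https://localhost:9200"]
        index: apache-logs
```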
- -After starting Data Prepper, you should see log output and some UUIDs after a few seconds: - -```yml -2021-09-30T20:19:44,147 [main] INFO com.amazon.dataprepper.pipeline.server.DataPrepperServer - Data Prepper server running at :4900 -2021-09-30T20:19:44,681 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer -2021-09-30T20:19:45,183 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer -2021-09-30T20:19:45,687 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer -2021-09-30T20:19:46,191 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer -2021-09-30T20:19:46,694 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer -2021-09-30T20:19:47,200 [random-source-pool-0] INFO com.amazon.dataprepper.plugins.source.RandomStringSource - Writing to buffer -2021-09-30T20:19:49,181 [simple-test-pipeline-processor-worker-1-thread-1] INFO com.amazon.dataprepper.pipeline.ProcessWorker - simple-test-pipeline Worker: Processing 6 records from buffer -07dc0d37-da2c-447e-a8df-64792095fb72 -5ac9b10a-1d21-4306-851a-6fb12f797010 -99040c79-e97b-4f1d-a70b-409286f2a671 -5319a842-c028-4c17-a613-3ef101bd2bdd -e51e700e-5cab-4f6d-879a-1c3235a77d18 -b4ed2d7e-cf9c-4e9d-967c-b18e8af35c90 -``` -The remainder of this page provides examples for running Data Prepper from the Docker image. If you -built it from source, refer to the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md) for more information. - -However you configure your pipeline, you'll run Data Prepper the same way. You run the Docker -image and modify both the `pipelines.yaml` and `data-prepper-config.yaml` files. 
- -For Data Prepper 2.0 or later, use this command: - -``` -docker run --name data-prepper -p 4900:4900 -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml -v ${PWD}/data-prepper-config.yaml:/usr/share/data-prepper/config/data-prepper-config.yaml opensearchproject/data-prepper:latest -``` -{% include copy.html %} - -For Data Prepper versions earlier than 2.0, use this command: - -``` -docker run --name data-prepper -p 4900:4900 -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines.yaml -v ${PWD}/data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml opensearchproject/data-prepper:1.x -``` -{% include copy.html %} - -Once Data Prepper is running, it processes data until it is shut down. Once you are done, shut it down with the following command: - -``` -POST /shutdown -``` -{% include copy-curl.html %} - -### Additional configurations - -For Data Prepper 2.0 or later, the Log4j 2 configuration file is read from `config/log4j2.properties` in the application's home directory. By default, it uses `log4j2-rolling.properties` in the *shared-config* directory. - -For Data Prepper 1.5 or earlier, optionally add `"-Dlog4j.configurationFile=config/log4j2.properties"` to the command if you want to pass a custom log4j2 properties file. If no properties file is provided, Data Prepper defaults to the log4j2.properties file in the *shared-config* directory. - -## Next steps - -Trace analytics is an important Data Prepper use case. If you haven't yet configured it, see [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/). - -Log ingestion is also an important Data Prepper use case. To learn more, see [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/). - -To learn how to run Data Prepper with a Logstash configuration, see [Migrating from Logstash]({{site.url}}{{site.baseurl}}/data-prepper/migrating-from-logstash-data-prepper/). 
- -For information on how to monitor Data Prepper, see [Monitoring]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/monitoring/). - -## More examples - -For more examples of Data Prepper, see [examples](https://github.com/opensearch-project/data-prepper/tree/main/examples/) in the Data Prepper repo. diff --git a/_data-prepper/getting-started/getting-started.md b/_data-prepper/getting-started/getting-started.md index 790f35cd522..bd3473a0e8f 100644 --- a/_data-prepper/getting-started/getting-started.md +++ b/_data-prepper/getting-started/getting-started.md @@ -4,6 +4,25 @@ title: Getting started with OpenSearch Data Prepper nav_order: 5 parent: OpenSearch Data Prepper has_children: yes +has_toc: false redirect_from: - /clients/data-prepper/get-started/ ---- \ No newline at end of file + - /data-prepper/getting-started/ +items: + - heading: "Understand key concepts" + description: "Learn about the core components and architecture of Data Prepper." + link: "/data-prepper/getting-started/concepts/" + - heading: "Install and configure Data Prepper" + description: "Set up Data Prepper for your environment and configure basic settings." + link: "/data-prepper/getting-started/install-and-configure/" + - heading: "Run Data Prepper" + description: "Start the service and verify that Data Prepper is running correctly." + link: "/data-prepper/getting-started/run-data-prepper/" +--- + +# Getting started with OpenSearch Data Prepper + +This section provides the foundational steps for using OpenSearch Data Prepper. It covers the initial setup, introduces core concepts, and guides you through creating and managing Data Prepper pipelines. Whether your focus is on log collection, trace analysis, or specific use cases, these resources will help you begin working effectively with Data Prepper. 
+ +{% include list.html list_items=page.items%} + diff --git a/_data-prepper/getting-started/install-and-configure.md b/_data-prepper/getting-started/install-and-configure.md index f2da871f510..736e2a5cbeb 100644 --- a/_data-prepper/getting-started/install-and-configure.md +++ b/_data-prepper/getting-started/install-and-configure.md @@ -6,6 +6,13 @@ grand_parent: OpenSearch Data Prepper parent: Getting started with OpenSearch Data Prepper --- +# Install and configure OpenSearch Data Prepper + +This page guides you through the process of installing and configuring OpenSearch Data Prepper. You can install Data Prepper using a pre-built Docker image or by building the project from source, depending on your environment and requirements. + +After installation, you must configure a set of required files that define how Data Prepper runs and processes data. This includes specifying pipeline definitions, server settings, and optional logging configurations. Configuration details vary slightly depending on the version you are using. + +Use this guide to prepare your environment and set up Data Prepper for trace analytics, log ingestion, or other supported use cases. ## 1. Installing Data Prepper diff --git a/_data-prepper/getting-started/run-data-prepper.md b/_data-prepper/getting-started/run-data-prepper.md index 476604a9370..48614d07cf8 100644 --- a/_data-prepper/getting-started/run-data-prepper.md +++ b/_data-prepper/getting-started/run-data-prepper.md @@ -6,6 +6,10 @@ grand_parent: OpenSearch Data Prepper parent: Getting started with OpenSearch Data Prepper --- +This section explains how to run OpenSearch Data Prepper using a defined pipeline configuration. Before starting the service, you must create a valid pipeline YAML file that defines the data flow—from source to sink—with optional processors and buffers. + +You can run Data Prepper using a Docker container or a local build, depending on your setup. 
This page provides examples for running the Docker image, along with configuration options for different versions of Data Prepper. Once launched, Data Prepper begins processing data according to the specified pipeline and continues until it is manually shut down. + ## Defining a pipeline Create a Data Prepper pipeline file named `pipelines.yaml`, similar to the following sample configuration: diff --git a/_data-prepper/index.md b/_data-prepper/index.md index 8b9df7eaaa2..505303d875c 100644 --- a/_data-prepper/index.md +++ b/_data-prepper/index.md @@ -10,6 +10,23 @@ redirect_from: - /clients/data-prepper/index/ - /monitoring-plugins/trace/data-prepper/ - /data-prepper/index/ +tutorial_cards: + - heading: "Trace analytics" + description: "Visualize event flows and find performance issues." + link: "/data-prepper/common-use-cases/trace-analytics/" + - heading: "Log analytics" + description: "Search, analyze, and gain insights from logs." + link: "/data-prepper/common-use-cases/log-analytics/" +items: + - heading: "Getting started with OpenSearch Data Prepper" + description: "Set up Data Prepper and start processing data." + link: "/data-prepper/getting-started/" + - heading: "Get familiar with Data Prepper pipelines" + description: "Learn how to build and configure pipelines." + link: "/data-prepper/pipelines/pipelines/" + - heading: "Explore common use cases" + description: "See how Data Prepper supports key use cases." + link: "/data-prepper/common-use-cases/common-use-cases/" --- # OpenSearch Data Prepper @@ -18,11 +35,8 @@ OpenSearch Data Prepper is a server-side data collector capable of filtering, en With Data Prepper you can build custom pipelines to improve the operational view of applications. Two common use cases for Data Prepper are trace analytics and log analytics. -- [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/) can help you visualize event flows and identify performance problems. 
-- - [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/) equips you with tools to enhance your search capabilities, conduct comprehensive analysis, and gain insights into your applications' performance and behavior. +{% include cards.html cards=page.tutorial_cards %} +## Using OpenSearch Data Prepper -## Next steps - -- [Getting started with OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/). -- [Get familiar with Data Prepper pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/). -- [Explore common use cases]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/common-use-cases/). +{% include list.html list_items=page.items%} \ No newline at end of file diff --git a/_data-prepper/pipelines/functions/functions.md b/_data-prepper/pipelines/functions/functions.md index 1a12278207e..1c5cb24d74f 100644 --- a/_data-prepper/pipelines/functions/functions.md +++ b/_data-prepper/pipelines/functions/functions.md @@ -6,17 +6,29 @@ nav_order: 10 has_children: true redirect_from: - /data-prepper/pipelines/functions/ +tutorial_cards: + - heading: "cidrContains()" + description: "Checks if an IP is in a CIDR block." + link: "/data-prepper/pipelines/cidrcontains/" + - heading: "contains()" + description: "Checks if a value exists in a string or list." + link: "/data-prepper/pipelines/contains/" + - heading: "getMetadata()" + description: "Retrieves metadata from a record." + link: "/data-prepper/pipelines/get-metadata/" + - heading: "hasTags()" + description: "Checks if a record has specific tags." + link: "/data-prepper/pipelines/has-tags/" + - heading: "join()" + description: "Combines list items into a string." + link: "/data-prepper/pipelines/join/" + - heading: "length()" + description: "Gets the length of a string or list." 
+ link: "/data-prepper/pipelines/length/" --- # Functions -OpenSearch Data Prepper offers a range of built-in functions that can be used within expressions to perform common data preprocessing tasks, such as calculating lengths, checking for tags, retrieving metadata, searching for substrings, checking IP address ranges, and joining list elements. These functions include the following: +OpenSearch Data Prepper offers a range of built-in functions that can be used within expressions to perform common data preprocessing tasks, such as calculating lengths, checking for tags, retrieving metadata, searching for substrings, checking IP address ranges, and joining list elements. - - -- [`cidrContains()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/cidrcontains/) -- [`contains()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/contains/) -- [`getMetadata()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/get-metadata/) -- [`hasTags()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/has-tags/) -- [`join()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/join/) -- [`length()`]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/length/) \ No newline at end of file +{% include cards.html cards=page.tutorial_cards %} \ No newline at end of file From 743db5d359f73bcb4a652908c0b9791dc8b5d005 Mon Sep 17 00:00:00 2001 From: Archer Date: Thu, 19 Jun 2025 13:31:11 -0500 Subject: [PATCH 4/5] Fix links Signed-off-by: Archer --- _data-prepper/getting-started/getting-started.md | 2 +- _data-prepper/migrating-from-logstash-data-prepper.md | 2 +- .../configuration/processors/convert-entry-type.md | 2 +- .../configuration/processors/delete-entries.md | 2 +- .../configuration/processors/mutate-string.md | 10 +++++----- .../pipelines/configuration/processors/rename-keys.md | 2 +- .../configuration/processors/trace-peer-forwarder.md | 2 +- 7 files changed, 11 insertions(+), 11 deletions(-) diff --git a/_data-prepper/getting-started/getting-started.md 
b/_data-prepper/getting-started/getting-started.md index bd3473a0e8f..b429564a6d0 100644 --- a/_data-prepper/getting-started/getting-started.md +++ b/_data-prepper/getting-started/getting-started.md @@ -3,7 +3,7 @@ layout: default title: Getting started with OpenSearch Data Prepper nav_order: 5 parent: OpenSearch Data Prepper -has_children: yes +has_children: true has_toc: false redirect_from: - /clients/data-prepper/get-started/ diff --git a/_data-prepper/migrating-from-logstash-data-prepper.md b/_data-prepper/migrating-from-logstash-data-prepper.md index 13548092dce..8e442f3ebc8 100644 --- a/_data-prepper/migrating-from-logstash-data-prepper.md +++ b/_data-prepper/migrating-from-logstash-data-prepper.md @@ -29,7 +29,7 @@ As of the Data Prepper 1.2 release, the following plugins from the Logstash conf ## Running Data Prepper with a Logstash configuration -1. To install Data Prepper's Docker image, see Installing Data Prepper in [Getting Started with OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started#1-installing-data-prepper). +1. To install Data Prepper's Docker image, see Installing Data Prepper in [Getting Started with OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#1-installing-data-prepper). 2. Run the Docker image installed in Step 1 by supplying your `logstash.conf` configuration. diff --git a/_data-prepper/pipelines/configuration/processors/convert-entry-type.md b/_data-prepper/pipelines/configuration/processors/convert-entry-type.md index cc707832ad7..4d191adbb85 100644 --- a/_data-prepper/pipelines/configuration/processors/convert-entry-type.md +++ b/_data-prepper/pipelines/configuration/processors/convert-entry-type.md @@ -47,7 +47,7 @@ type-conv-pipeline: ``` {% include copy.html %} -Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. 
For more information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). For example, before you run the `convert_entry_type` processor, if the `logs_json.log` file contains the following event record: diff --git a/_data-prepper/pipelines/configuration/processors/delete-entries.md b/_data-prepper/pipelines/configuration/processors/delete-entries.md index f30bccae232..5a1b940d09e 100644 --- a/_data-prepper/pipelines/configuration/processors/delete-entries.md +++ b/_data-prepper/pipelines/configuration/processors/delete-entries.md @@ -41,7 +41,7 @@ pipeline: ``` {% include copy.html %} -Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). 
For example, before you run the `delete_entries` processor, if the `logs_json.log` file contains the following event record: diff --git a/_data-prepper/pipelines/configuration/processors/mutate-string.md b/_data-prepper/pipelines/configuration/processors/mutate-string.md index b84e63ea61b..3c6269fdaac 100644 --- a/_data-prepper/pipelines/configuration/processors/mutate-string.md +++ b/_data-prepper/pipelines/configuration/processors/mutate-string.md @@ -53,7 +53,7 @@ pipeline: ``` {% include copy.html %} -Next, create a log file named `logs_json.log`. After that, replace the `path` of the file source in your `pipeline.yaml` file with your file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log`. After that, replace the `path` of the file source in your `pipeline.yaml` file with your file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). Before you run OpenSearch Data Prepper, the source appears in the following format: @@ -105,7 +105,7 @@ pipeline: ``` {% include copy.html %} -Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with your file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with your file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). 
Before you run Data Prepper, the source appears in the following format: @@ -150,7 +150,7 @@ pipeline: ``` {% include copy.html %} -Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with the correct file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with the correct file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). Before you run Data Prepper, the source appears in the following format: @@ -195,7 +195,7 @@ pipeline: ``` {% include copy.html %} -Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with the correct file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with the correct file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). Before you run Data Prepper, the source appears in the following format: @@ -241,7 +241,7 @@ pipeline: ``` {% include copy.html %} -Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with the correct file path. 
For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with the correct file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). Before you run Data Prepper, the source appears in the following format: diff --git a/_data-prepper/pipelines/configuration/processors/rename-keys.md b/_data-prepper/pipelines/configuration/processors/rename-keys.md index a2f1711ebf4..c14d3c69b2e 100644 --- a/_data-prepper/pipelines/configuration/processors/rename-keys.md +++ b/_data-prepper/pipelines/configuration/processors/rename-keys.md @@ -44,7 +44,7 @@ pipeline: {% include copy.html %} -Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). 
For example, before you run the `rename_keys` processor, if the `logs_json.log` file contains the following event record: diff --git a/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md b/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md index 2665b985f72..4f5b70f1076 100644 --- a/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md +++ b/_data-prepper/pipelines/configuration/processors/trace-peer-forwarder.md @@ -14,7 +14,7 @@ You should use `trace_peer_forwarder` for Trace Analytics pipelines when you hav ## Usage -To get started with `trace_peer_forwarder`, first configure [peer forwarder]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/peer-forwarder/). Then create a `pipeline.yaml` file and specify `trace peer forwarder` as the processor. You can configure `peer forwarder` in your `data-prepper-config.yaml` file. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper). +To get started with `trace_peer_forwarder`, first configure [peer forwarder]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/peer-forwarder/). Then create a `pipeline.yaml` file and specify `trace peer forwarder` as the processor. You can configure `peer forwarder` in your `data-prepper-config.yaml` file. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/install-and-configure/#2-configuring-data-prepper). See the following example `pipeline.yaml` file: From 3720bfed08f0c2808ee3663f175ca92d763ef41b Mon Sep 17 00:00:00 2001 From: Archer Date: Thu, 26 Jun 2025 11:13:40 -0500 Subject: [PATCH 5/5] Add data prepper getting started. 
Signed-off-by: Archer --- _data-prepper/getting-started/getting-started.md | 1 - 1 file changed, 1 deletion(-) diff --git a/_data-prepper/getting-started/getting-started.md b/_data-prepper/getting-started/getting-started.md index b429564a6d0..ce7634541d4 100644 --- a/_data-prepper/getting-started/getting-started.md +++ b/_data-prepper/getting-started/getting-started.md @@ -2,7 +2,6 @@ layout: default title: Getting started with OpenSearch Data Prepper nav_order: 5 -parent: OpenSearch Data Prepper has_children: true has_toc: false redirect_from: