
[Phase 1] Rework Data Prepper section #9976


Open · wants to merge 6 commits into main
160 changes: 0 additions & 160 deletions _data-prepper/getting-started.md

This file was deleted.

20 changes: 20 additions & 0 deletions _data-prepper/getting-started/concepts.md
@@ -0,0 +1,20 @@
---
layout: default
title: Concepts
nav_order: 10
grand_parent: OpenSearch Data Prepper
parent: Getting started with OpenSearch Data Prepper
---

# Key concepts and fundamentals

Data Prepper ingests data through customizable [pipelines]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/). These pipelines consist of pluggable components that you can customize to fit your needs; you can even plug in your own implementations. A Data Prepper pipeline consists of the following components:

- One [source]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/sources/)
- One or more [sinks]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sinks/sinks/)
- (Optional) One [buffer]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/buffers/buffers/)
- (Optional) One or more [processors]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/processors/)

Each pipeline contains two required components: `source` and `sink`. If a `buffer`, a `processor`, or both are missing from the pipeline, then Data Prepper uses the default `bounded_blocking` buffer and a no-op processor. Note that a single instance of Data Prepper can have one or more pipelines.
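To make these components concrete, the following is a minimal sketch of a `pipelines.yaml` entry, assuming the built-in `random` source and `stdout` sink plugins are available in your distribution:

```
# A minimal pipeline: generates random data and writes it to the console.
simple-sample-pipeline:
  source:
    random:
  sink:
    - stdout:
```

Because this sketch defines no `buffer` or `processor`, Data Prepper falls back to the default `bounded_blocking` buffer and a no-op processor.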

<!----Add additional concepts here---->
27 changes: 27 additions & 0 deletions _data-prepper/getting-started/getting-started.md
@@ -0,0 +1,27 @@
---
layout: default
title: Getting started with OpenSearch Data Prepper
nav_order: 5
has_children: true
has_toc: false
redirect_from:
- /clients/data-prepper/get-started/
- /data-prepper/getting-started/
items:
- heading: "Understand key concepts"
description: "Learn about the core components and architecture of Data Prepper."
link: "/data-prepper/getting-started/concepts/"
- heading: "Install and configure Data Prepper"
description: "Set up Data Prepper for your environment and configure basic settings."
link: "/data-prepper/getting-started/install-and-configure/"
- heading: "Run Data Prepper"
description: "Start the service and verify that Data Prepper is running correctly."
link: "/data-prepper/getting-started/run-data-prepper/"
---

# Getting started with OpenSearch Data Prepper

This section provides the foundational steps for using OpenSearch Data Prepper. It covers the initial setup, introduces core concepts, and guides you through creating and managing Data Prepper pipelines. Whether your focus is on log collection, trace analysis, or specific use cases, these resources will help you begin working effectively with Data Prepper.

{% include list.html list_items=page.items %}

68 changes: 68 additions & 0 deletions _data-prepper/getting-started/install-and-configure.md
@@ -0,0 +1,68 @@
---
layout: default
title: Install and configure OpenSearch Data Prepper
nav_order: 10
grand_parent: OpenSearch Data Prepper
parent: Getting started with OpenSearch Data Prepper
---

# Install and configure OpenSearch Data Prepper

This page guides you through the process of installing and configuring OpenSearch Data Prepper. You can install Data Prepper using a pre-built Docker image or by building the project from source, depending on your environment and requirements.

After installation, you must configure a set of required files that define how Data Prepper runs and processes data. This includes specifying pipeline definitions, server settings, and optional logging configurations. Configuration details vary slightly depending on the version you are using.

Use this guide to prepare your environment and set up Data Prepper for trace analytics, log ingestion, or other supported use cases.

## 1. Installing Data Prepper

There are two ways to install Data Prepper: you can run the Docker image or build from source.

The easiest way to use Data Prepper is by running the Docker image. We suggest that you use this approach if you have [Docker](https://www.docker.com) available. Run the following command:

```
docker pull opensearchproject/data-prepper:latest
```
{% include copy.html %}
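After pulling the image, you can start a container. The following is a minimal sketch that assumes a Data Prepper 2.x image and the configuration paths used inside those images; verify the mount paths for your version:

```
# Mount a local pipeline definition into the container (path assumed for 2.x images).
docker run --name data-prepper \
  -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml \
  opensearchproject/data-prepper:latest
```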

If you have special requirements that require you to build from source, or if you want to contribute, see the [Developer Guide](https://github.com/opensearch-project/data-prepper/blob/main/docs/developer_guide.md).

## 2. Configuring Data Prepper

Two configuration files are required to run a Data Prepper instance. Optionally, you can supply a Log4j 2 configuration file. See [Configuring Log4j]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-log4j/) for more information. The following list describes the purpose of each configuration file:

* `pipelines.yaml`: This file describes which data pipelines to run, including sources, processors, and sinks.
* `data-prepper-config.yaml`: This file contains Data Prepper server settings that allow you to interact with exposed Data Prepper server APIs (see the sketch after this list).
* `log4j2-rolling.properties` (optional): This file contains Log4j 2 configuration options and can be a JSON, YAML, XML, or .properties file type.
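For example, a minimal `data-prepper-config.yaml` sketch can contain a single setting; the following assumes you want to disable SSL on the server APIs for local experimentation (not recommended for production):

```
# Disable TLS on the Data Prepper server APIs (local testing only).
ssl: false
```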

For Data Prepper versions earlier than 2.0, the `.jar` file expects the pipeline configuration file path to be followed by the server configuration file path, as shown in the following example:

```
java -jar data-prepper-core-$VERSION.jar pipelines.yaml data-prepper-config.yaml
```

Optionally, you can add `"-Dlog4j.configurationFile=config/log4j2.properties"` to the command to pass a custom Log4j 2 configuration file. If you don't provide a properties file, Data Prepper defaults to the `log4j2.properties` file in the `shared-config` directory.
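For example, the full command with a custom Log4j 2 configuration file looks like the following:

```
java "-Dlog4j.configurationFile=config/log4j2.properties" -jar data-prepper-core-$VERSION.jar pipelines.yaml data-prepper-config.yaml
```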


Starting with Data Prepper 2.0, you can launch Data Prepper by using the following `data-prepper` script, which does not require any additional command-line arguments:

```
bin/data-prepper
```

Configuration files are read from specific subdirectories in the application's home directory (see the example layout after this list):
1. `pipelines/`: Used for pipeline configurations. Pipeline configurations can be written in one or more YAML files.
2. `config/data-prepper-config.yaml`: Used for the Data Prepper server configuration.
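Assuming a single pipeline file named `pipelines.yaml`, the home directory layout looks similar to the following:

```
data-prepper/
├── bin/
│   └── data-prepper
├── config/
│   ├── data-prepper-config.yaml
│   └── log4j2.properties
└── pipelines/
    └── pipelines.yaml
```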

You can still supply your own pipeline configuration file path followed by the server configuration file path. However, this method will not be supported in a future release. See the following example:

```
bin/data-prepper pipelines.yaml data-prepper-config.yaml
```

The Log4j 2 configuration file is read from the `config/log4j2.properties` file located in the application's home directory.

To configure Data Prepper, see the following information for each use case:

* [Trace analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/trace-analytics/): Learn how to collect trace data and customize a pipeline that ingests and transforms that data.
* [Log analytics]({{site.url}}{{site.baseurl}}/data-prepper/common-use-cases/log-analytics/): Learn how to set up Data Prepper for log observability.