diff --git a/docs/best-practices/minimize_optimize_joins.md b/docs/best-practices/minimize_optimize_joins.md index f79b0a47ba5..959f8e9e5a1 100644 --- a/docs/best-practices/minimize_optimize_joins.md +++ b/docs/best-practices/minimize_optimize_joins.md @@ -26,7 +26,7 @@ For a full guide on denormalizing data in ClickHouse see [here](/data-modeling/d ## When JOINs are required {#when-joins-are-required} -When JOINs are required, ensure you’re using **at least version 24.12 and preferably the latest version**, as JOIN performance continues to improve with each new release. As of ClickHouse 24.12, the query planner now automatically places the smaller table on the right side of the join for optimal performance - a task that previously had to be done manually. Even more enhancements are coming soon, including more aggressive filter pushdown and automatic re-ordering of multiple joins. +When JOINs are required, ensure you're using **at least version 24.12 and preferably the latest version**, as JOIN performance continues to improve with each new release. As of ClickHouse 24.12, the query planner now automatically places the smaller table on the right side of the join for optimal performance - a task that previously had to be done manually. Even more enhancements are coming soon, including more aggressive filter pushdown and automatic re-ordering of multiple joins. Follow these best practices to improve JOIN performance: diff --git a/docs/cloud/changelogs/changelog-25_1-25_4.md b/docs/cloud/changelogs/changelog-25_1-25_4.md index 038dd45e061..3671f1980b1 100644 --- a/docs/cloud/changelogs/changelog-25_1-25_4.md +++ b/docs/cloud/changelogs/changelog-25_1-25_4.md @@ -274,7 +274,7 @@ sidebar_label: 'v25.4' * Don't fail silently if user executing `SYSTEM DROP REPLICA` doesn't have enough permissions. [#75377](https://github.com/ClickHouse/ClickHouse/pull/75377) ([Bharat Nallan](https://github.com/bharatnc)). * Add a ProfileEvent about the number of times any of system logs has failed to flush. [#75466](https://github.com/ClickHouse/ClickHouse/pull/75466) ([Alexey Milovidov](https://github.com/alexey-milovidov)). * Add check and logging for decrypting and decompressing. [#75471](https://github.com/ClickHouse/ClickHouse/pull/75471) ([Vitaly Baranov](https://github.com/vitlibar)). -* Added support for the micro sign (U+00B5) in the `parseTimeDelta` function. Now both the micro sign (U+00B5) and the Greek letter mu (U+03BC) are recognized as valid representations for microseconds, aligning ClickHouse's behavior with Go’s implementation ([see time.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/time.go#L983C19-L983C20) and [time/format.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/format.go#L1608-L1609)). [#75472](https://github.com/ClickHouse/ClickHouse/pull/75472) ([Vitaly Orlov](https://github.com/orloffv)). +* Added support for the micro sign (U+00B5) in the `parseTimeDelta` function. Now both the micro sign (U+00B5) and the Greek letter mu (U+03BC) are recognized as valid representations for microseconds, aligning ClickHouse's behavior with Go's implementation ([see time.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/time.go#L983C19-L983C20) and [time/format.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/format.go#L1608-L1609)). [#75472](https://github.com/ClickHouse/ClickHouse/pull/75472) ([Vitaly Orlov](https://github.com/orloffv)). 
* Replace server setting (`send_settings_to_client`) with client setting (`apply_settings_from_server`) that controls whether client-side code (e.g. parsing INSERT data and formatting query output) should use settings from server's `users.xml` and user profile. Otherwise only settings from client command line, session, and the query are used. Note that this only applies to native client (not e.g. HTTP), and doesn't apply to most of query processing (which happens on the server). [#75478](https://github.com/ClickHouse/ClickHouse/pull/75478) ([Michael Kolupaev](https://github.com/al13n321)). * Keeper improvement: disable digest calculation when committing to in-memory storage for better performance. It can be enabled with `keeper_server.digest_enabled_on_commit` config. Digest is still calculated when preprocessing requests. [#75490](https://github.com/ClickHouse/ClickHouse/pull/75490) ([Antonio Andelic](https://github.com/antonio2368)). * Push down filter expression from JOIN ON when possible. [#75536](https://github.com/ClickHouse/ClickHouse/pull/75536) ([Vladimir Cherkasov](https://github.com/vdimir)). @@ -621,7 +621,7 @@ sidebar_label: 'v25.4' * The universal installation script will propose installation even on macOS. [#74339](https://github.com/ClickHouse/ClickHouse/pull/74339) ([Alexey Milovidov](https://github.com/alexey-milovidov)). * Fix build when kerberos is not enabled. [#74771](https://github.com/ClickHouse/ClickHouse/pull/74771) ([flynn](https://github.com/ucasfl)). * Update to embedded LLVM 19. [#75148](https://github.com/ClickHouse/ClickHouse/pull/75148) ([Konstantin Bogdanov](https://github.com/thevar1able)). -* *Potentially breaking*: Improvement to set even more restrictive defaults. The current defaults are already secure. The user has to specify an option to publish ports explicitly. But when the `default` user doesn’t have a password set by `CLICKHOUSE_PASSWORD` and/or a username changed by `CLICKHOUSE_USER` environment variables, it should be available only from the local system as an additional level of protection. [#75259](https://github.com/ClickHouse/ClickHouse/pull/75259) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). +* *Potentially breaking*: Improvement to set even more restrictive defaults. The current defaults are already secure. The user has to specify an option to publish ports explicitly. But when the `default` user doesn't have a password set by `CLICKHOUSE_PASSWORD` and/or a username changed by `CLICKHOUSE_USER` environment variables, it should be available only from the local system as an additional level of protection. [#75259](https://github.com/ClickHouse/ClickHouse/pull/75259) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). * Integration tests have a 1-hour timeout for single batch of parallel tests running. When this timeout is reached `pytest` is killed without some logs. Internal pytest timeout is set to 55 minutes to print results from a session and not trigger external timeout signal. Closes [#75532](https://github.com/ClickHouse/ClickHouse/issues/75532). [#75533](https://github.com/ClickHouse/ClickHouse/pull/75533) ([Ilya Yatsishin](https://github.com/qoega)). * Make all clickhouse-server related actions a function, and execute them only when launching the default binary in `entrypoint.sh`. A long-postponed improvement was suggested in [#50724](https://github.com/ClickHouse/ClickHouse/issues/50724). Added switch `--users` to `clickhouse-extract-from-config` to get values from the `users.xml`. 
[#75643](https://github.com/ClickHouse/ClickHouse/pull/75643) ([Mikhail f. Shiryaev](https://github.com/Felixoid)). * For stress tests if server did not exit while we collected stacktraces via gdb additional wait time is added to make `Possible deadlock on shutdown (see gdb.log)` detection less noisy. It will only add delay for cases when test did not finish successfully. [#75668](https://github.com/ClickHouse/ClickHouse/pull/75668) ([Ilya Yatsishin](https://github.com/qoega)). diff --git a/docs/cloud/manage/billing.md b/docs/cloud/manage/billing.md index 8d608096086..3df0e975cc8 100644 --- a/docs/cloud/manage/billing.md +++ b/docs/cloud/manage/billing.md @@ -417,7 +417,7 @@ This dimension covers the compute units provisioned per service just for Postgre ClickPipes. Compute is shared across all Postgres pipes within a service. **It is provisioned when the first Postgres pipe is created and deallocated when no Postgres CDC pipes remain**. The amount of compute provisioned depends on your -organization’s tier: +organization's tier: | Tier | Cost | |------------------------------|-----------------------------------------------| @@ -426,7 +426,7 @@ organization’s tier: #### Example {#example} -Let’s say your service is in Scale tier and has the following setup: +Let's say your service is in Scale tier and has the following setup: - 2 Postgres ClickPipes running continuous replication - Each pipe ingests 500 GB of data changes (CDC) per month @@ -540,7 +540,7 @@ Postgres CDC ClickPipes pricing begins appearing on monthly bills starting **September 1st, 2025**, for all customers—both existing and new. Until then, usage is free. Customers have a **3-month window** starting from **May 29** (the GA announcement date) to review and optimize their usage if needed, although -we expect most won’t need to make any changes. +we expect most won't need to make any changes. @@ -550,7 +550,7 @@ we expect most won’t need to make any changes. No data ingestion charges apply while a pipe is paused, since no data is moved. However, compute charges still apply—either 0.5 or 1 compute unit—based on your -organization’s tier. This is a fixed service-level cost and applies across all +organization's tier. This is a fixed service-level cost and applies across all pipes within that service. diff --git a/docs/cloud/reference/changelog.md b/docs/cloud/reference/changelog.md index 051e9c5e72c..61c323b96f7 100644 --- a/docs/cloud/reference/changelog.md +++ b/docs/cloud/reference/changelog.md @@ -33,7 +33,7 @@ In addition to this ClickHouse Cloud changelog, please see the [Cloud Compatibil ## May 30, 2025 {#may-30-2025} -- We’re excited to announce general availability of **ClickPipes for Postgres CDC** +- We're excited to announce general availability of **ClickPipes for Postgres CDC** in ClickHouse Cloud. With just a few clicks, you can now replicate your Postgres databases and unlock blazing-fast, real-time analytics. The connector delivers faster data synchronization, latency as low as a few seconds, automatic schema changes, @@ -64,7 +64,7 @@ In addition to this ClickHouse Cloud changelog, please see the [Cloud Compatibil * Memory & CPU: Graphs for `CGroupMemoryTotal` (Allocated Memory), `CGroupMaxCPU` (allocated CPU), `MemoryResident` (memory used), and `ProfileEvent_OSCPUVirtualTimeMicroseconds` (CPU used) * Data Transfer: Graphs showing data ingress and egress from ClickHouse Cloud. Learn more [here](/cloud/manage/network-data-transfer). 
-- We’re excited to announce the launch of our new ClickHouse Cloud Prometheus/Grafana mix-in, +- We're excited to announce the launch of our new ClickHouse Cloud Prometheus/Grafana mix-in, built to simplify monitoring for your ClickHouse Cloud services. This mix-in uses our Prometheus-compatible API endpoint to seamlessly integrate ClickHouse metrics into your existing Prometheus and Grafana setup. It includes diff --git a/docs/data-modeling/projections.md b/docs/data-modeling/projections.md index 6df046ae8ca..44231b1795a 100644 --- a/docs/data-modeling/projections.md +++ b/docs/data-modeling/projections.md @@ -326,7 +326,7 @@ paid prices is streaming 2.17 million rows. When we directly used a second table optimized for this query, only 81.92 thousand rows were streamed from disk. The reason for the difference is that currently, the `optimize_read_in_order` -optimization mentioned above isn’t supported for projections. +optimization mentioned above isn't supported for projections. We inspect the `system.query_log` table to see that ClickHouse automatically used the two projections for the two queries above (see the diff --git a/docs/guides/best-practices/index.md b/docs/guides/best-practices/index.md index e65aa0a2151..1b7acbac54a 100644 --- a/docs/guides/best-practices/index.md +++ b/docs/guides/best-practices/index.md @@ -15,10 +15,10 @@ which covers the main concepts required to improve performance. |---------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | [Query Optimization Guide](/optimize/query-optimization) | A good place to start for query optimization, this simple guide describes common scenarios of how to use different performance and optimization techniques to improve query performance. | | [Primary Indexes Advanced Guide](/guides/best-practices/sparse-primary-indexes) | A deep dive into ClickHouse indexing including how it differs from other DB systems, how ClickHouse builds and uses a table's spare primary index and what some of the best practices are for indexing in ClickHouse. | -| [Query Parallelism](/optimize/query-parallelism) | Explains how ClickHouse parallelizes query execution using processing lanes and the max_threads setting. Covers how data is distributed across lanes, how max_threads is applied, when it isn’t fully used, and how to inspect execution with tools like EXPLAIN and trace logs. | +| [Query Parallelism](/optimize/query-parallelism) | Explains how ClickHouse parallelizes query execution using processing lanes and the max_threads setting. Covers how data is distributed across lanes, how max_threads is applied, when it isn't fully used, and how to inspect execution with tools like EXPLAIN and trace logs. | | [Partitioning Key](/optimize/partitioning-key) | Delves into ClickHouse partition key optimization. Explains how choosing the right partition key can significantly improve query performance by allowing ClickHouse to quickly locate relevant data segments. Covers best practices for selecting efficient partition keys and potential pitfalls to avoid. 
| | [Data Skipping Indexes](/optimize/skipping-indexes) | Explains data skipping indexes as a way to optimize performance. | -| [PREWHERE Optimization](/optimize/prewhere) | Explains how PREWHERE reduces I/O by avoiding reading unnecessary column data. Shows how it’s applied automatically, how the filtering order is chosen, and how to monitor it using EXPLAIN and logs. | +| [PREWHERE Optimization](/optimize/prewhere) | Explains how PREWHERE reduces I/O by avoiding reading unnecessary column data. Shows how it's applied automatically, how the filtering order is chosen, and how to monitor it using EXPLAIN and logs. | | [Bulk Inserts](/optimize/bulk-inserts) | Explains the benefits of using bulk inserts in ClickHouse. | | [Asynchronous Inserts](/optimize/asynchronous-inserts) | Focuses on ClickHouse's asynchronous inserts feature. It likely explains how asynchronous inserts work (batching data on the server for efficient insertion) and their benefits (improved performance by offloading insert processing). It might also cover enabling asynchronous inserts and considerations for using them effectively in your ClickHouse environment. | | [Avoid Mutations](/optimize/avoid-mutations) | Discusses the importance of avoiding mutations (updates and deletes) in ClickHouse. It recommends using append-only inserts for optimal performance and suggests alternative approaches for handling data changes. | diff --git a/docs/guides/best-practices/prewhere.md b/docs/guides/best-practices/prewhere.md index 6178d4f6877..8eadce85f2f 100644 --- a/docs/guides/best-practices/prewhere.md +++ b/docs/guides/best-practices/prewhere.md @@ -23,14 +23,14 @@ This guide explains how PREWHERE works, how to measure its impact, and how to tu ## Query processing without PREWHERE optimization {#query-processing-without-prewhere-optimization} -We’ll start by illustrating how a query on the [uk_price_paid_simple](/parts) table is processed without using PREWHERE: +We'll start by illustrating how a query on the [uk_price_paid_simple](/parts) table is processed without using PREWHERE: Query processing without PREWHERE optimization

-① The query includes a filter on the `town` column, which is part of the table’s primary key, and therefore also part of the primary index. +① The query includes a filter on the `town` column, which is part of the table's primary key, and therefore also part of the primary index. -② To accelerate the query, ClickHouse loads the table’s primary index into memory. +② To accelerate the query, ClickHouse loads the table's primary index into memory. ③ It scans the index entries to identify which granules from the town column might contain rows matching the predicate. @@ -50,13 +50,13 @@ The first three processing steps are the same as before: Query processing with PREWHERE optimization

-① The query includes a filter on the `town` column, which is part of the table’s primary key—and therefore also part of the primary index. +① The query includes a filter on the `town` column, which is part of the table's primary key—and therefore also part of the primary index. ② Similar to the run without the PREWHERE clause, to accelerate the query, ClickHouse loads the primary index into memory, ③ then scans the index entries to identify which granules from the `town` column might contain rows matching the predicate. -Now, thanks to the PREWHERE clause, the next step differs: Instead of reading all relevant columns up front, ClickHouse filters data column by column, only loading what’s truly needed. This drastically reduces I/O, especially for wide tables. +Now, thanks to the PREWHERE clause, the next step differs: Instead of reading all relevant columns up front, ClickHouse filters data column by column, only loading what's truly needed. This drastically reduces I/O, especially for wide tables. With each step, it only loads granules that contain at least one row that survived—i.e., matched—the previous filter. As a result, the number of granules to load and evaluate for each filter decreases monotonically: @@ -92,9 +92,9 @@ Note that ClickHouse processes the same number of rows in both the PREWHERE and ## PREWHERE optimization is automatically applied {#prewhere-optimization-is-automatically-applied} -The PREWHERE clause can be added manually, as shown in the example above. However, you don’t need to write PREWHERE manually. When the setting [`optimize_move_to_prewhere`](/operations/settings/settings#optimize_move_to_prewhere) is enabled (true by default), ClickHouse automatically moves filter conditions from WHERE to PREWHERE, prioritizing those that will reduce read volume the most. +The PREWHERE clause can be added manually, as shown in the example above. However, you don't need to write PREWHERE manually. When the setting [`optimize_move_to_prewhere`](/operations/settings/settings#optimize_move_to_prewhere) is enabled (true by default), ClickHouse automatically moves filter conditions from WHERE to PREWHERE, prioritizing those that will reduce read volume the most. -The idea is that smaller columns are faster to scan, and by the time larger columns are processed, most granules have already been filtered out. Since all columns have the same number of rows, a column’s size is primarily determined by its data type, for example, a `UInt8` column is generally much smaller than a `String` column. +The idea is that smaller columns are faster to scan, and by the time larger columns are processed, most granules have already been filtered out. Since all columns have the same number of rows, a column's size is primarily determined by its data type, for example, a `UInt8` column is generally much smaller than a `String` column. ClickHouse follows this strategy by default as of version [23.2](https://clickhouse.com/blog/clickhouse-release-23-02#multi-stage-prewhere--alexander-gololobov), sorting PREWHERE filter columns for multi-step processing in ascending order of uncompressed size. @@ -156,7 +156,7 @@ The same number of rows was processed (2.31 million), but thanks to PREWHERE, Cl For deeper insight into how ClickHouse applies PREWHERE behind the scenes, use EXPLAIN and trace logs. 
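As a quick, hedged illustration of the behavior described above, the sketch below reuses the `uk.uk_price_paid_simple` table and the `town = 'LONDON'` filter that appear throughout these guides (the exact predicates, and the column order ClickHouse chooses, will differ for your own queries). It shows the same filter written with an explicit PREWHERE clause, and how to switch the automatic rewrite off for a single query when comparing the two paths:

```sql runnable=false
-- Explicit PREWHERE: the town filter is evaluated first, and the remaining
-- columns are read only for granules that survived it.
SELECT max(price)
FROM uk.uk_price_paid_simple
PREWHERE town = 'LONDON';

-- Relying on the automatic rewrite: with the default
-- optimize_move_to_prewhere = 1, ClickHouse moves this WHERE condition
-- to PREWHERE on its own.
SELECT max(price)
FROM uk.uk_price_paid_simple
WHERE town = 'LONDON';

-- Disabling the rewrite for one query, e.g. to compare read volume
-- in system.query_log.
SELECT max(price)
FROM uk.uk_price_paid_simple
WHERE town = 'LONDON'
SETTINGS optimize_move_to_prewhere = 0;
```

Comparing the `read_bytes` of these runs in `system.query_log` is a simple way to observe the reduction in I/O described above.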
-We inspect the query’s logical plan using the [EXPLAIN](/sql-reference/statements/explain#explain-plan) clause: +We inspect the query's logical plan using the [EXPLAIN](/sql-reference/statements/explain#explain-plan) clause: ```sql EXPLAIN PLAN actions = 1 SELECT @@ -177,9 +177,9 @@ Prewhere info ... ``` -We omit most of the plan output here, as it’s quite verbose. In essence, it shows that all three column predicates were automatically moved to PREWHERE. +We omit most of the plan output here, as it's quite verbose. In essence, it shows that all three column predicates were automatically moved to PREWHERE. -When reproducing this yourself, you’ll also see in the query plan that the order of these predicates is based on the columns’ data type sizes. Since we haven’t enabled column statistics, ClickHouse uses size as the fallback for determining the PREWHERE processing order. +When reproducing this yourself, you'll also see in the query plan that the order of these predicates is based on the columns' data type sizes. Since we haven't enabled column statistics, ClickHouse uses size as the fallback for determining the PREWHERE processing order. If you want to go even further under the hood, you can observe each individual PREWHERE processing step by instructing ClickHouse to return all test-level log entries during query execution: ```sql diff --git a/docs/guides/best-practices/query-parallelism.md b/docs/guides/best-practices/query-parallelism.md index a3fc65cb28e..8347fe1a357 100644 --- a/docs/guides/best-practices/query-parallelism.md +++ b/docs/guides/best-practices/query-parallelism.md @@ -26,7 +26,7 @@ We use an aggregation query on the [uk_price_paid_simple](/parts) dataset to ill ## Step-by-step: How ClickHouse parallelizes an aggregation query {#step-by-step-how-clickHouse-parallelizes-an-aggregation-query} -When ClickHouse ① runs an aggregation query with a filter on the table’s primary key, it ② loads the primary index into memory to ③ identify which granules need to be processed, and which can be safely skipped: +When ClickHouse ① runs an aggregation query with a filter on the table's primary key, it ② loads the primary index into memory to ③ identify which granules need to be processed, and which can be safely skipped: Index analysis @@ -63,11 +63,11 @@ In ClickHouse Cloud, this same parallelism is achieved through [parallel replica ## Monitoring query parallelism {#monitoring-query-parallelism} -Use these tools to verify that your query fully utilizes available CPU resources and to diagnose when it doesn’t. +Use these tools to verify that your query fully utilizes available CPU resources and to diagnose when it doesn't. -We’re running this on a test server with 59 CPU cores, which allows ClickHouse to fully showcase its query parallelism. +We're running this on a test server with 59 CPU cores, which allows ClickHouse to fully showcase its query parallelism. -To observe how the example query is executed, we can instruct the ClickHouse server to return all trace-level log entries during the aggregation query. For this demonstration, we removed the query’s predicate—otherwise, only 3 granules would be processed, which isn’t enough data for ClickHouse to make use of more than a few parallel processing lanes: +To observe how the example query is executed, we can instruct the ClickHouse server to return all trace-level log entries during the aggregation query. 
For this demonstration, we removed the query's predicate—otherwise, only 3 granules would be processed, which isn't enough data for ClickHouse to make use of more than a few parallel processing lanes: ```sql runnable=false SELECT max(price) @@ -115,7 +115,7 @@ FROM Note: Read the operator plan above from bottom to top. Each line represents a stage in the physical execution plan, starting with reading data from storage at the bottom and ending with the final processing steps at the top. Operators marked with `× 59` are executed concurrently on non-overlapping data regions across 59 parallel processing lanes. This reflects the value of `max_threads` and illustrates how each stage of the query is parallelized across CPU cores. -ClickHouse’s [embedded web UI](/interfaces/http) (available at the `/play` endpoint) can render the physical plan from above as a graphical visualization. In this example, we set `max_threads` to `4` to keep the visualization compact, showing just 4 parallel processing lanes: +ClickHouse's [embedded web UI](/interfaces/http) (available at the `/play` endpoint) can render the physical plan from above as a graphical visualization. In this example, we set `max_threads` to `4` to keep the visualization compact, showing just 4 parallel processing lanes: Query pipeline @@ -158,7 +158,7 @@ MergeTreeSelect(pool: PrefetchedReadPool, algorithm: Thread) × 30 As shown in the operator plan extract above, even though `max_threads` is set to `59`, ClickHouse uses only **30** concurrent streams to scan the data. -Now let’s run the query: +Now let's run the query: ```sql runnable=false SELECT max(price) @@ -204,7 +204,7 @@ WHERE town = 'LONDON'; └───────────────────────────────────────────────────────┘ ``` -Regardless of the configured `max_threads` value, ClickHouse only allocates additional parallel processing lanes when there’s enough data to justify them. The "max" in `max_threads` refers to an upper limit, not a guaranteed number of threads used. +Regardless of the configured `max_threads` value, ClickHouse only allocates additional parallel processing lanes when there's enough data to justify them. The "max" in `max_threads` refers to an upper limit, not a guaranteed number of threads used. What "enough data" means is primarily determined by two settings, which define the minimum number of rows (163,840 by default) and the minimum number of bytes (2,097,152 by default) that each processing lane should handle: @@ -216,15 +216,15 @@ For clusters with shared storage (e.g. ClickHouse Cloud): * [merge_tree_min_rows_for_concurrent_read_for_remote_filesystem](https://clickhouse.com/docs/operations/settings/settings#merge_tree_min_rows_for_concurrent_read_for_remote_filesystem) * [merge_tree_min_bytes_for_concurrent_read_for_remote_filesystem](https://clickhouse.com/docs/operations/settings/settings#merge_tree_min_bytes_for_concurrent_read_for_remote_filesystem) -Additionally, there’s a hard lower limit for read task size, controlled by: +Additionally, there's a hard lower limit for read task size, controlled by: * [Merge_tree_min_read_task_size](https://clickhouse.com/docs/operations/settings/settings#merge_tree_min_read_task_size) + [merge_tree_min_bytes_per_task_for_remote_reading](https://clickhouse.com/docs/operations/settings/settings#merge_tree_min_bytes_per_task_for_remote_reading) :::warning Don't modify these settings -We don’t recommend modifying these settings in production. 
They’re shown here solely to illustrate why `max_threads` doesn’t always determine the actual level of parallelism. +We don't recommend modifying these settings in production. They're shown here solely to illustrate why `max_threads` doesn't always determine the actual level of parallelism. ::: -For demonstration purposes, let’s inspect the physical plan with these settings overridden to force maximum concurrency: +For demonstration purposes, let's inspect the physical plan with these settings overridden to force maximum concurrency: ```sql runnable=false EXPLAIN PIPELINE SELECT @@ -258,9 +258,9 @@ This demonstrates that for queries on small datasets, ClickHouse will intentiona ## Where to find more information {#where-to-find-more-information} -If you’d like to dive deeper into how ClickHouse executes queries in parallel and how it achieves high performance at scale, explore the following resources: +If you'd like to dive deeper into how ClickHouse executes queries in parallel and how it achieves high performance at scale, explore the following resources: -* [Query Processing Layer – VLDB 2024 Paper (Web Edition)](/academic_overview#4-query-processing-layer) - A detailed breakdown of ClickHouse’s internal execution model, including scheduling, pipelining, and operator design. +* [Query Processing Layer – VLDB 2024 Paper (Web Edition)](/academic_overview#4-query-processing-layer) - A detailed breakdown of ClickHouse's internal execution model, including scheduling, pipelining, and operator design. * [Partial aggregation states explained](https://clickhouse.com/blog/clickhouse_vs_elasticsearch_mechanics_of_count_aggregations#-multi-core-parallelization) - A technical deep dive into how partial aggregation states enable efficient parallel execution across processing lanes. diff --git a/docs/integrations/data-ingestion/azure-data-factory/using_http_interface.md b/docs/integrations/data-ingestion/azure-data-factory/using_http_interface.md index 7bbc6454d2e..1f211c7a33c 100644 --- a/docs/integrations/data-ingestion/azure-data-factory/using_http_interface.md +++ b/docs/integrations/data-ingestion/azure-data-factory/using_http_interface.md @@ -94,7 +94,7 @@ Azure Data Factory can handle this encoding automatically using its built-in ::: Now you can send JSON-formatted data to this URL. The data should match the -structure of the target table. Here’s a simple example using curl, assuming a +structure of the target table. Here's a simple example using curl, assuming a table with three columns: `col_1`, `col_2`, and `col_3`. ```text curl \ @@ -104,13 +104,13 @@ curl \ You can also send a JSON array of objects, or JSON Lines (newline-delimited JSON objects). Azure Data Factory uses the JSON array format, which works -perfectly with ClickHouse’s `JSONEachRow` input. +perfectly with ClickHouse's `JSONEachRow` input. -As you can see, for this step you don’t need to do anything special on the ClickHouse +As you can see, for this step you don't need to do anything special on the ClickHouse side. The HTTP interface already provides everything needed to act as a REST-like endpoint — no additional configuration required. -Now that we’ve made ClickHouse behave like a REST endpoint, it's time to +Now that we've made ClickHouse behave like a REST endpoint, it's time to configure Azure Data Factory to use it. In the next steps, we'll create an Azure Data Factory instance, set up a Linked @@ -187,7 +187,7 @@ Data Factory instance. 9. 
Back in the main form select Basic authentication, enter the username and password used to connect to your ClickHouse HTTP interface, click **Test - connection**. If everything is configured correctly, you’ll see a success + connection**. If everything is configured correctly, you'll see a success message. New Linked Service Check Connection @@ -257,7 +257,7 @@ Data](https://clickhouse.com/docs/getting-started/example-datasets/environmental New Dataset Query 6. Click OK to save the expression. Click Test connection. If everything is - configured correctly, you’ll see a Connection successful message. Click Publish + configured correctly, you'll see a Connection successful message. Click Publish all at the top of the page to save your changes. New Dataset Connection Successful diff --git a/docs/integrations/data-ingestion/clickpipes/postgres/index.md b/docs/integrations/data-ingestion/clickpipes/postgres/index.md index 45db0681778..ff7513b65e8 100644 --- a/docs/integrations/data-ingestion/clickpipes/postgres/index.md +++ b/docs/integrations/data-ingestion/clickpipes/postgres/index.md @@ -16,7 +16,7 @@ import select_destination_db from '@site/static/images/integrations/data-ingesti import ch_permissions from '@site/static/images/integrations/data-ingestion/clickpipes/postgres/ch-permissions.jpg' import Image from '@theme/IdealImage'; -# Ingesting Data from Postgres to ClickHouse (using CDC) +# Ingesting data from Postgres to ClickHouse (using CDC) You can use ClickPipes to ingest data from your source Postgres database into ClickHouse Cloud. The source Postgres database can be hosted on-premises or in the cloud including Amazon RDS, Google Cloud SQL, Azure Database for Postgres, Supabase and others. @@ -134,7 +134,7 @@ You can configure the Advanced settings if needed. A brief description of each s 7. You can select the tables you want to replicate from the source Postgres database. While selecting the tables, you can also choose to rename the tables in the destination ClickHouse database as well as exclude specific columns. :::warning - If you are defining a Ordering Key in ClickHouse differently from the Primary Key in Postgres, please don’t forget to read all the [considerations](/integrations/clickpipes/postgres/ordering_keys) around it! + If you are defining an ordering key in ClickHouse differently from the primary key in Postgres, don't forget to read all the [considerations](/integrations/clickpipes/postgres/ordering_keys) around it. ::: ### Review permissions and start the ClickPipe {#review-permissions-and-start-the-clickpipe} diff --git a/docs/managing-data/core-concepts/index.md b/docs/managing-data/core-concepts/index.md index efb2d7c0f6e..326345cf1cb 100644 --- a/docs/managing-data/core-concepts/index.md +++ b/docs/managing-data/core-concepts/index.md @@ -14,5 +14,5 @@ you will learn some of the core concepts of how ClickHouse works. | [Table partitions](/partitions) | Learn what table partitions are and what they are used for. | | [Table part merges](/merges) | Learn what table part merges are and what they are used for. | | [Table shards and replicas](/shards) | Learn what table shards and replicas are and what they are used for. | -| [Primary indexes](/primary-indexes) | Introduces ClickHouse’s sparse primary index and how it helps efficiently skip unnecessary data during query execution. Explains how the index is built and used, with examples and tools for observing its effect. Links to a deep dive for advanced use cases and best practices. 
| +| [Primary indexes](/primary-indexes) | Introduces ClickHouse's sparse primary index and how it helps efficiently skip unnecessary data during query execution. Explains how the index is built and used, with examples and tools for observing its effect. Links to a deep dive for advanced use cases and best practices. | | [Architectural Overview](/academic_overview) | A concise academic overview of all components of the ClickHouse architecture, based on our VLDB 2024 scientific paper. | diff --git a/docs/managing-data/core-concepts/primary-indexes.md b/docs/managing-data/core-concepts/primary-indexes.md index 9d2a50b4df8..4a1dee57805 100644 --- a/docs/managing-data/core-concepts/primary-indexes.md +++ b/docs/managing-data/core-concepts/primary-indexes.md @@ -14,7 +14,7 @@ import Image from '@theme/IdealImage'; :::tip Looking for advanced indexing details? -This page introduces ClickHouse’s sparse primary index, how it’s built, how it works, and how it helps accelerate queries. +This page introduces ClickHouse's sparse primary index, how it's built, how it works, and how it helps accelerate queries. For advanced indexing strategies and deeper technical detail, see the [primary indexes deep dive](/guides/best-practices/sparse-primary-indexes). ::: @@ -26,7 +26,7 @@ For advanced indexing strategies and deeper technical detail, see the [primary i
-The sparse primary index in ClickHouse helps efficiently identify [granules](https://clickhouse.com/docs/guides/best-practices/sparse-primary-indexes#data-is-organized-into-granules-for-parallel-data-processing)—blocks of rows—that might contain data matching a query’s condition on the table’s primary key columns. In the next section, we explain how this index is constructed from the values in those columns. +The sparse primary index in ClickHouse helps efficiently identify [granules](https://clickhouse.com/docs/guides/best-practices/sparse-primary-indexes#data-is-organized-into-granules-for-parallel-data-processing)—blocks of rows—that might contain data matching a query's condition on the table's primary key columns. In the next section, we explain how this index is constructed from the values in those columns. ### Sparse primary index creation {#sparse-primary-index-creation} @@ -38,7 +38,7 @@ As a [reminder](https://clickhouse.com/docs/parts), in our ① example table wit

-For processing, each column’s data is ④ logically divided into granules—each covering 8,192 rows—which are the smallest units ClickHouse’s data processing mechanics work with. +For processing, each column's data is ④ logically divided into granules—each covering 8,192 rows—which are the smallest units ClickHouse's data processing mechanics work with. This granule structure is also what makes the primary index **sparse**: instead of indexing every row, ClickHouse stores ⑤ the primary key values from just one row per granule—specifically, the first row. This results in one index entry per granule: @@ -59,9 +59,9 @@ We sketch how the sparse primary index is used for query acceleration with anoth ① The example query includes a predicate on both primary key columns: `town = 'LONDON' AND street = 'OXFORD STREET'`. -② To accelerate the query, ClickHouse loads the table’s primary index into memory. +② To accelerate the query, ClickHouse loads the table's primary index into memory. -③ It then scans the index entries to identify which granules might contain rows matching the predicate—in other words, which granules can’t be skipped. +③ It then scans the index entries to identify which granules might contain rows matching the predicate—in other words, which granules can't be skipped. ④ These potentially relevant granules are then loaded and [processed](/optimize/query-parallelism) in memory, along with the corresponding granules from any other columns required for the query. @@ -118,7 +118,7 @@ LIMIT 10; └───────┴────────────────┴──────────────────┘ ``` -Lastly, we use the [EXPLAIN](/sql-reference/statements/explain) clause to see how the primary indexes of all data parts are used to skip granules that can’t possibly contain rows matching the example query’s predicates. These granules are excluded from loading and processing: +Lastly, we use the [EXPLAIN](/sql-reference/statements/explain) clause to see how the primary indexes of all data parts are used to skip granules that can't possibly contain rows matching the example query's predicates. These granules are excluded from loading and processing: ```sql EXPLAIN indexes = 1 SELECT @@ -196,7 +196,7 @@ SELECT count() FROM uk.uk_price_paid_simple; For a deeper look at how sparse primary indexes work in ClickHouse, including how they differ from traditional database indexes and best practices for using them, check out our detailed indexing [deep dive](/guides/best-practices/sparse-primary-indexes). -If you’re interested in how ClickHouse processes data selected by the primary index scan in a highly parallel way, see the query parallelism guide [here](/optimize/query-parallelism). +If you're interested in how ClickHouse processes data selected by the primary index scan in a highly parallel way, see the query parallelism guide [here](/optimize/query-parallelism). diff --git a/docs/materialized-view/incremental-materialized-view.md b/docs/materialized-view/incremental-materialized-view.md index 2c4867c917a..650f4e205c6 100644 --- a/docs/materialized-view/incremental-materialized-view.md +++ b/docs/materialized-view/incremental-materialized-view.md @@ -1077,7 +1077,7 @@ Although our ordering of the arrival of rows from each view is the same, this is ### When to use parallel processing {#materialized-views-when-to-use-parallel} -Enabling `parallel_view_processing=1` can significantly improve insert throughput, as shown above, especially when multiple Materialized Views are attached to a single table. 
However, it’s important to understand the trade-offs: +Enabling `parallel_view_processing=1` can significantly improve insert throughput, as shown above, especially when multiple Materialized Views are attached to a single table. However, it's important to understand the trade-offs: - **Increased insert pressure**: All Materialized Views are executed simultaneously, increasing CPU and memory usage. If each view performs heavy computation or JOINs, this can overload the system. - **Need for strict execution order**: In rare workflows where the order of view execution matters (e.g., chained dependencies), parallel execution may lead to inconsistent state or race conditions. While possible to design around this, such setups are fragile and may break with future versions. @@ -1090,7 +1090,7 @@ In general, enable `parallel_view_processing=1` when: - You have multiple independent Materialized Views - You're aiming to maximize insert performance -- You're aware of the system’s capacity to handle concurrent view execution +- You're aware of the system's capacity to handle concurrent view execution Leave it disabled when: - Materialized Views have dependencies on one another diff --git a/docs/migrations/postgres/overview.md b/docs/migrations/postgres/overview.md index 581189ca0a1..ed3173ef9ee 100644 --- a/docs/migrations/postgres/overview.md +++ b/docs/migrations/postgres/overview.md @@ -33,11 +33,11 @@ Real-time Change Data Capture (CDC) can be implemented in ClickHouse using [Clic In some cases, a more straightforward approach like manual bulk loading followed by periodic updates may be sufficient. This strategy is ideal for one-time migrations or situations where real-time replication is not required. It involves loading data from PostgreSQL to ClickHouse in bulk, either through direct SQL `INSERT` commands or by exporting and importing CSV files. After the initial migration, you can periodically update the data in ClickHouse by syncing changes from PostgreSQL at regular intervals. -The bulk load process is simple and flexible but comes with the downside of no real-time updates. Once the initial data is in ClickHouse, updates won’t be reflected immediately, so you must schedule periodic updates to sync the changes from PostgreSQL. This approach works well for less time-sensitive use cases, but it introduces a delay between when data changes in PostgreSQL and when those changes appear in ClickHouse. +The bulk load process is simple and flexible but comes with the downside of no real-time updates. Once the initial data is in ClickHouse, updates won't be reflected immediately, so you must schedule periodic updates to sync the changes from PostgreSQL. This approach works well for less time-sensitive use cases, but it introduces a delay between when data changes in PostgreSQL and when those changes appear in ClickHouse. ### Which strategy to choose? {#which-strategy-to-choose} -For most applications that require fresh, up-to-date data in ClickHouse, real-time CDC through ClickPipes is the recommended approach. It provides continuous data syncing with minimal setup and maintenance. On the other hand, manual bulk loading with periodic updates is a viable option for simpler, one-off migrations or workloads where real-time updates aren’t critical. +For most applications that require fresh, up-to-date data in ClickHouse, real-time CDC through ClickPipes is the recommended approach. It provides continuous data syncing with minimal setup and maintenance. 
On the other hand, manual bulk loading with periodic updates is a viable option for simpler, one-off migrations or workloads where real-time updates aren't critical. --- diff --git a/docs/use-cases/data_lake/unity_catalog.md b/docs/use-cases/data_lake/unity_catalog.md index 33d656e1a17..bcd4848ef26 100644 --- a/docs/use-cases/data_lake/unity_catalog.md +++ b/docs/use-cases/data_lake/unity_catalog.md @@ -118,7 +118,7 @@ SELECT count(*) FROM `uniform.delta_hits` ``` :::note Backticks required -Backticks are required because ClickHouse doesn’t support more than one namespace. +Backticks are required because ClickHouse doesn't support more than one namespace. ::: To inspect the table DDL: diff --git a/docs/use-cases/observability/clickstack/alerts.md b/docs/use-cases/observability/clickstack/alerts.md index 493fd78da28..63c1249fcc2 100644 --- a/docs/use-cases/observability/clickstack/alerts.md +++ b/docs/use-cases/observability/clickstack/alerts.md @@ -16,7 +16,7 @@ import search_alert from '@site/static/images/use-cases/observability/search_ale After entering a [search](/use-cases/observability/clickstack/search), you can create an alert to be notified when the number of events (logs or spans) matching the search exceeds or falls below a threshold. -### Creating an Alert {#creating-an-alert} +### Creating an alert {#creating-an-alert} You can create an alert by clicking the `Alerts` button on the top right of the `Search` page. @@ -26,7 +26,7 @@ The `grouped by` value allows the search to be subject to an aggregation e.g. `S Search alerts -### Common Alert Scenarios {#common-alert-scenarios} +### Common alert scenarios {#common-alert-scenarios} Here are a few common alert scenarios that you can use HyperDX for: diff --git a/docs/use-cases/observability/clickstack/architecture.md b/docs/use-cases/observability/clickstack/architecture.md index 861b556e5c3..f1e00cce062 100644 --- a/docs/use-cases/observability/clickstack/architecture.md +++ b/docs/use-cases/observability/clickstack/architecture.md @@ -12,11 +12,11 @@ import architecture from '@site/static/images/use-cases/observability/clickstack The ClickStack architecture is built around three core components: **ClickHouse**, **HyperDX**, and a **OpenTelemetry (OTel) collector**. A **MongoDB** instance provides storage for the application state. Together, they provide a high-performance, open-source observability stack optimized for logs, metrics, and traces. -## Architecture Overview {#architecture-overview} +## Architecture overview {#architecture-overview} Architecture -## ClickHouse: The database engine {#clickhouse} +## ClickHouse: the database engine {#clickhouse} At the heart of ClickStack is ClickHouse, a column-oriented database designed for real-time analytics at scale. It powers the ingestion and querying of observability data, enabling: @@ -37,7 +37,7 @@ ClickStack includes a pre-configured OpenTelemetry (OTel) collector to ingest te The collector exports telemetry to ClickHouse in efficient batches. It supports optimized table schemas per data source, ensuring scalable performance across all signal types. -## HyperDX: The interface {#hyperdx} +## HyperDX: the interface {#hyperdx} HyperDX is the user interface for ClickStack. 
It offers: diff --git a/docs/use-cases/observability/clickstack/config.md b/docs/use-cases/observability/clickstack/config.md index 35558c80b85..b39cae8ce8a 100644 --- a/docs/use-cases/observability/clickstack/config.md +++ b/docs/use-cases/observability/clickstack/config.md @@ -18,11 +18,11 @@ The following configuration options are available for each component of ClickSta If using the [All in One](/use-cases/observability/clickstack/deployment/all-in-one), [HyperDX Only](/use-cases/observability/clickstack/deployment/hyperdx-only) or [Local Mode](/use-cases/observability/clickstack/deployment/local-mode-only) simply pass the desired setting via an environment variable e.g. -```bash +```shell docker run -e HYPERDX_LOG_LEVEL='debug' -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one ``` -### Docker compose {#docker-compose} +### Docker Compose {#docker-compose} If using the [Docker Compose](/use-cases/observability/clickstack/deployment/docker-compose) deployment guide, the [`.env`](https://github.com/hyperdxio/hyperdx/blob/main/.env) file can be used to modify settings. @@ -40,11 +40,11 @@ services: ### Helm {#helm} -#### Customizing values (Optional) {#customizing-values} +#### Customizing values (optional) {#customizing-values} You can customize settings by using `--set` flags e.g. -```bash +```shell helm install my-hyperdx hyperdx/hdx-oss-v2 \ --set replicaCount=2 \ --set resources.limits.cpu=500m \ @@ -62,7 +62,7 @@ helm install my-hyperdx hyperdx/hdx-oss-v2 \ Alternatively edit the `values.yaml`. To retrieve the default values: -```sh +```shell helm show values hyperdx/hdx-oss-v2 > values.yaml ``` diff --git a/docs/use-cases/observability/clickstack/deployment/all-in-one.md b/docs/use-cases/observability/clickstack/deployment/all-in-one.md index 04bd513f6cd..ce971354239 100644 --- a/docs/use-cases/observability/clickstack/deployment/all-in-one.md +++ b/docs/use-cases/observability/clickstack/deployment/all-in-one.md @@ -34,7 +34,7 @@ This option includes authentication, enabling the persistence of dashboards, ale The following will run an OpenTelemetry collector (on port 4317 and 4318) and the HyperDX UI (on port 8080). -```bash +```shell docker run -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one ``` @@ -60,7 +60,7 @@ To ingest data see ["Ingesting data"](/use-cases/observability/clickstack/ingest To persist data and settings across restarts of the container, users can modify the above docker command to mount the paths `/data/db`, `/var/lib/clickhouse` and `/var/log/clickhouse-server`. For example: -```bash +```shell # ensure directories exist mkdir -p .volumes/db .volumes/ch_data .volumes/ch_logs # modify command to mount paths @@ -87,7 +87,7 @@ If you need to customize the application (8080) or API (8000) ports that HyperDX Customizing the OpenTelemetry ports can simply be changed by modifying the port forwarding flags. For example, replacing `-p 4318:4318` with `-p 4999:4318` to change the OpenTelemetry HTTP port to 4999. -```bash +```shell docker run -p 8080:8080 -p 4317:4317 -p 4999:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one ``` @@ -97,7 +97,7 @@ This distribution can be used with ClickHouse Cloud. 
While the local ClickHouse For example: -```bash +```shell export CLICKHOUSE_ENDPOINT= export CLICKHOUSE_USER= export CLICKHOUSE_PASSWORD= diff --git a/docs/use-cases/observability/clickstack/deployment/docker-compose.md b/docs/use-cases/observability/clickstack/deployment/docker-compose.md index e4a3e98b7ed..7f14b4e6ec9 100644 --- a/docs/use-cases/observability/clickstack/deployment/docker-compose.md +++ b/docs/use-cases/observability/clickstack/deployment/docker-compose.md @@ -46,7 +46,7 @@ These ports enable integrations with a variety of telemetry sources and make the To deploy with Docker Compose clone the HyperDX repo, change into the directory and run `docker-compose up`: -```bash +```shell git clone git@github.com:hyperdxio/hyperdx.git cd hyperdx # switch to the v2 branch @@ -86,7 +86,7 @@ If prompted to create a source, retain all default values and complete the `Tabl Users can modify settings for the stack, such as the version used, through the environment variable file: -```bash +```shell user@example-host hyperdx % cat .env # Used by docker-compose.yml # Used by docker-compose.yml @@ -126,7 +126,7 @@ This distribution can be used with ClickHouse Cloud. Users should: - Remove the ClickHouse service from the [`docker-compose.yaml`](https://github.com/hyperdxio/hyperdx/blob/86465a20270b895320eb21dca13560b65be31e68/docker-compose.yml#L89) file. This is optional if testing, as the deployed ClickHouse instance will simply be ignored - although waste local resources. If removing the service, ensure [any references](https://github.com/hyperdxio/hyperdx/blob/86465a20270b895320eb21dca13560b65be31e68/docker-compose.yml#L65) to the service such as `depends_on` are removed. - Modify the OTel collector to use a ClickHouse Cloud instance by setting the environment variables `CLICKHOUSE_ENDPOINT`, `CLICKHOUSE_USER` and `CLICKHOUSE_PASSWORD` in the compose file. Specifically, add the environment variables to the OTel collector service: - ```bash + ```shell otel-collector: image: ${OTEL_COLLECTOR_IMAGE_NAME}:${IMAGE_VERSION} environment: diff --git a/docs/use-cases/observability/clickstack/deployment/helm.md b/docs/use-cases/observability/clickstack/deployment/helm.md index fe0218e2cb7..11c97410dcf 100644 --- a/docs/use-cases/observability/clickstack/deployment/helm.md +++ b/docs/use-cases/observability/clickstack/deployment/helm.md @@ -45,11 +45,11 @@ The chart supports standard Kubernetes best practices, including: - Kubernetes cluster (v1.20+ recommended) - `kubectl` configured to interact with your cluster -### Add the HyperDX Helm Repository {#add-the-hyperdx-helm-repository} +### Add the HyperDX Helm repository {#add-the-hyperdx-helm-repository} Add the HyperDX Helm repository: -```sh +```shell helm repo add hyperdx https://hyperdxio.github.io/helm-charts helm repo update ``` @@ -58,7 +58,7 @@ helm repo update To install the HyperDX chart with default values: -```sh +```shell helm install my-hyperdx hyperdx/hdx-oss-v2 ``` @@ -66,7 +66,7 @@ helm install my-hyperdx hyperdx/hdx-oss-v2 Verify the installation: -```bash +```shell kubectl get pods -l "app.kubernetes.io/name=hdx-oss-v2" ``` @@ -76,7 +76,7 @@ When all pods are ready, proceed. Port forwarding allows us to access and set up HyperDX. Users deploying to production should instead expose the service via an ingress or load balancer to ensure proper network access, TLS termination, and scalability. Port forwarding is best suited for local development or one-off administrative tasks, not long-term or high-availability environments. 
-```bash +```shell kubectl port-forward \ pod/$(kubectl get pod -l app.kubernetes.io/name=hdx-oss-v2 -o jsonpath='{.items[0].metadata.name}') \ 8080:3000 @@ -99,16 +99,16 @@ You can override the default connection to the integrated ClickHouse instance. F For an example of using an alternative ClickHouse instance, see ["Create a ClickHouse Cloud connection"](/use-cases/observability/clickstack/getting-started#create-a-cloud-connection). -### Customizing values (Optional) {#customizing-values} +### Customizing values (optional) {#customizing-values} You can customize settings by using `--set` flags. For example: -```bash +```shell helm install my-hyperdx hyperdx/hdx-oss-v2 --set key=value Alternatively, edit the `values.yaml`. To retrieve the default values: -```sh +```shell helm show values hyperdx/hdx-oss-v2 > values.yaml ``` @@ -134,15 +134,15 @@ ingress: pathType: ImplementationSpecific ``` -```bash +```shell helm install my-hyperdx hyperdx/hdx-oss-v2 -f values.yaml ``` -### Using Secrets (Optional) {#using-secrets} +### Using secrets (optional) {#using-secrets} For handling sensitive data such as API keys or database credentials, use Kubernetes secrets. The HyperDX Helm charts provide default secret files that you can modify and apply to your cluster. -#### Using Pre-Configured Secrets {#using-pre-configured-secrets} +#### Using pre-configured secrets {#using-pre-configured-secrets} The Helm chart includes a default secret template located at [`charts/hdx-oss-v2/templates/secrets.yaml`](https://github.com/hyperdxio/helm-charts/blob/main/charts/hdx-oss-v2/templates/secrets.yaml). This file provides a base structure for managing secrets. @@ -163,20 +163,20 @@ data: Apply the secret to your cluster: -```sh +```shell kubectl apply -f secrets.yaml ``` -#### Creating a Custom Secret {#creating-a-custom-secret} +#### Creating a custom secret {#creating-a-custom-secret} If you prefer, you can create a custom Kubernetes secret manually: -```sh +```shell kubectl create secret generic hyperdx-secret \ --from-literal=API_KEY=my-secret-api-key ``` -#### Referencing a Secret {#referencing-a-secret} +#### Referencing a secret {#referencing-a-secret} To reference a secret in `values.yaml`: @@ -193,9 +193,9 @@ hyperdx: ## Using ClickHouse Cloud {#using-clickhouse-cloud} -If using ClickHouse Cloud users disable the ClickHouse instance deployed by the Helm chart and specify the Cloud Cloud credentials: +If using ClickHouse Cloud, disable the ClickHouse instance deployed by the Helm chart and specify the Cloud credentials: -```bash +```shell # specify ClickHouse Cloud credentials export CLICKHOUSE_URL= # full https url export CLICKHOUSE_USER= export CLICKHOUSE_PASSWORD= @@ -221,7 +221,7 @@ otel: clickhouseEndpoint: ${CLICKHOUSE_URL} ``` -```bash +```shell helm install my-hyperdx hyperdx/hdx-oss-v2 -f values.yaml # or if installed... # helm upgrade my-hyperdx hyperdx/hdx-oss-v2 -f values.yaml @@ -234,11 +234,11 @@ By default, this chart also installs ClickHouse and the OTel collector. However, To disable ClickHouse and the OTel collector, set the following values: -```bash +```shell helm install myrelease hyperdx-helm --set clickhouse.enabled=false --set clickhouse.persistence.enabled=false --set otel.enabled=false ``` -## Task Configuration {#task-configuration} +## Task configuration {#task-configuration} By default, there is one task in the chart setup as a cronjob, responsible for checking whether alerts should fire. 
Here are its configuration options: @@ -248,17 +248,17 @@ By default, there is one task in the chart setup as a cronjob, responsible for c | `tasks.checkAlerts.schedule` | Cron schedule for the check-alerts task | `*/1 * * * *` | | `tasks.checkAlerts.resources` | Resource requests and limits for the check-alerts task | See `values.yaml` | -## Upgrading the Chart {#upgrading-the-chart} +## Upgrading the chart {#upgrading-the-chart} To upgrade to a newer version: -```sh +```shell helm upgrade my-hyperdx hyperdx/hdx-oss-v2 -f values.yaml ``` To check available chart versions: -```sh +```shell helm search repo hyperdx ``` @@ -266,7 +266,7 @@ helm search repo hyperdx To remove the deployment: -```sh +```shell helm uninstall my-hyperdx ``` @@ -274,20 +274,20 @@ This will remove all resources associated with the release, but persistent data ## Troubleshooting {#troubleshooting} -### Checking Logs {#checking-logs} +### Checking logs {#checking-logs} -```sh +```shell kubectl logs -l app.kubernetes.io/name=hdx-oss-v2 ``` -### Debugging a Failed Install {#debugging-a-failed-instance} +### Debugging a failed install {#debugging-a-failed-instance} -```sh +```shell helm install my-hyperdx hyperdx/hdx-oss-v2 --debug --dry-run ``` -### Verifying Deployment {#verifying-deployment} +### Verifying deployment {#verifying-deployment} -```sh +```shell kubectl get pods -l app.kubernetes.io/name=hdx-oss-v2 ``` diff --git a/docs/use-cases/observability/clickstack/deployment/hyperdx-only.md b/docs/use-cases/observability/clickstack/deployment/hyperdx-only.md index 7cef48dce9f..23fe3c6f8da 100644 --- a/docs/use-cases/observability/clickstack/deployment/hyperdx-only.md +++ b/docs/use-cases/observability/clickstack/deployment/hyperdx-only.md @@ -34,7 +34,7 @@ In this mode, data ingestion is left entirely to the user. You can ingest data i Run the following command, modifying `YOUR_MONGODB_URI` as required. -```bash +```shell docker run -e MONGO_URI=mongodb://YOUR_MONGODB_URI -p 8080:8080 docker.hyperdx.io/hyperdx/hyperdx ``` diff --git a/docs/use-cases/observability/clickstack/deployment/local-mode-only.md b/docs/use-cases/observability/clickstack/deployment/local-mode-only.md index bb545362a7f..05d6416cfb9 100644 --- a/docs/use-cases/observability/clickstack/deployment/local-mode-only.md +++ b/docs/use-cases/observability/clickstack/deployment/local-mode-only.md @@ -11,11 +11,15 @@ import Image from '@theme/IdealImage'; import hyperdx_logs from '@site/static/images/use-cases/observability/hyperdx-logs.png'; import hyperdx_2 from '@site/static/images/use-cases/observability/hyperdx-2.png'; -This mode includes the UI with authentication disabled to allow for quick local use. -**User authentication is disabled for this distribution of HyperDX** +Similar to the [all-in-one image](/use-cases/observability/clickstack/deployment/docker-compose), this comprehensive Docker image bundles all ClickStack components: -It does not persist dashboards, saved searches, and alerts. +* **ClickHouse** +* **HyperDX** +* **OpenTelemetry (OTel) collector** (exposing OTLP on ports `4317` and `4318`) +* **MongoDB** (for persistent application state) + +**However, user authentication is disabled for this distribution of HyperDX** ### Suitable for {#suitable-for} @@ -32,7 +36,7 @@ It does not persist dashboards, saved searches, and alerts. Local mode deploys the HyperDX UI only, accessible on port 8080. 
-```bash +```shell docker run -p 8080:8080 docker.hyperdx.io/hyperdx/hyperdx-local ``` diff --git a/docs/use-cases/observability/clickstack/example-datasets/local-data.md b/docs/use-cases/observability/clickstack/example-datasets/local-data.md index 52b8d515cd6..0f6e3444035 100644 --- a/docs/use-cases/observability/clickstack/example-datasets/local-data.md +++ b/docs/use-cases/observability/clickstack/example-datasets/local-data.md @@ -41,7 +41,7 @@ Create a `otel-file-collector.yaml` file with the following content. **Important**: Populate the value `` with your ingestion API key copied above. -```yml +```yaml receivers: filelog: include: @@ -123,7 +123,7 @@ For more details on the OpenTelemetry (OTel) configuration structure, we recomme Run the following docker command to start an instance of the OTel collector. -```bash +```shell docker run --network=host --rm -it \ --user 0:0 \ -v "$(pwd)/otel-file-collector.yaml":/etc/otel/config.yaml \ diff --git a/docs/use-cases/observability/clickstack/example-datasets/sample-data.md b/docs/use-cases/observability/clickstack/example-datasets/sample-data.md index b28f7ec2146..99828480459 100644 --- a/docs/use-cases/observability/clickstack/example-datasets/sample-data.md +++ b/docs/use-cases/observability/clickstack/example-datasets/sample-data.md @@ -53,7 +53,7 @@ In order to populate the UI with sample data, download the following file: [Sample data](https://storage.googleapis.com/hyperdx/sample.tar.gz) -```bash +```shell # curl curl -O https://storage.googleapis.com/hyperdx/sample.tar.gz # or @@ -68,14 +68,14 @@ To load this data, we simply send it to the HTTP endpoint of the deployed OpenTe First, export the API key copied above. -```bash +```shell # export API key export CLICKSTACK_API_KEY= ``` Run the following command to send the data to the OTel collector: -```bash +```shell for filename in $(tar -tf sample.tar.gz); do endpoint="http://localhost:4318/v1/${filename%.json}" echo "loading ${filename%.json}" diff --git a/docs/use-cases/observability/clickstack/getting-started.md b/docs/use-cases/observability/clickstack/getting-started.md index a745a7213c0..5709eb1df35 100644 --- a/docs/use-cases/observability/clickstack/getting-started.md +++ b/docs/use-cases/observability/clickstack/getting-started.md @@ -40,7 +40,7 @@ This all-in-one image allows you to launch the full stack with a single command, The following will run an OpenTelemetry collector (on port 4317 and 4318) and the HyperDX UI (on port 8080). -```bash +```shell docker run -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one ``` @@ -49,7 +49,7 @@ To persist data and settings across restarts of the container, users can modify For example: -```bash +```shell # modify command to mount paths docker run \ -p 8080:8080 \ @@ -114,7 +114,7 @@ While we will use the `default` user to connect HyperDX, we recommend creating a Open a terminal and export the credentials copied above: -```bash +```shell export CLICKHOUSE_USER=default export CLICKHOUSE_ENDPOINT= export CLICKHOUSE_PASSWORD= @@ -122,7 +122,7 @@ export CLICKHOUSE_PASSWORD= Run the following docker command: -```bash +```shell docker run -e CLICKHOUSE_ENDPOINT=${CLICKHOUSE_ENDPOINT} -e CLICKHOUSE_USER=default -e CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD} -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one ``` @@ -163,11 +163,11 @@ Authentication is not supported. 
This mode is intended to be used for quick testing, development, demos and debugging use cases where authentication and settings persistence is not necessary. -### Hosted Version {#hosted-version} +### Hosted version {#hosted-version} You can use a hosted version of HyperDX in local mode available at [play.hyperdx.io](https://play.hyperdx.io). -### Self-Hosted Version {#self-hosted-version} +### Self-hosted version {#self-hosted-version} @@ -175,7 +175,7 @@ You can use a hosted version of HyperDX in local mode available at [play.hyperdx The self-hosted local mode image comes with an OpenTelemetry collector and a ClickHouse server pre-configured as well. This makes it easy to consume telemetry data from your applications and visualize it in HyperDX with minimal external setup. To get started with the self-hosted version, simply run the Docker container with the appropriate ports forwarded: -```bash +```shell docker run -p 8080:8080 docker.hyperdx.io/hyperdx/hyperdx-local ``` diff --git a/docs/use-cases/observability/clickstack/ingesting-data/collector.md b/docs/use-cases/observability/clickstack/ingesting-data/collector.md index 5bd61776145..bf851afc5a6 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/collector.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/collector.md @@ -36,7 +36,7 @@ If you are managing your own OpenTelemetry collector in a standalone deployment To deploy the ClickStack distribution of the OTel connector in a standalone mode, run the following docker command: -```bash +```shell docker run -e OPAMP_SERVER_URL=${OPAMP_SERVER_URL} -e CLICKHOUSE_ENDPOINT=${CLICKHOUSE_ENDPOINT} -e CLICKHOUSE_USER=default -e CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD} -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-otel-collector ``` @@ -60,14 +60,14 @@ All docker images, which include the OpenTelemetry collector, can be configured For example the all-in-one image: -```bash +```shell export OPAMP_SERVER_URL= export CLICKHOUSE_ENDPOINT= export CLICKHOUSE_USER= export CLICKHOUSE_PASSWORD= ``` -```bash +```shell docker run -e OPAMP_SERVER_URL=${OPAMP_SERVER_URL} -e CLICKHOUSE_ENDPOINT=${CLICKHOUSE_ENDPOINT} -e CLICKHOUSE_USER=default -e CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD} -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one ``` @@ -228,9 +228,9 @@ By default, inserts into ClickHouse are synchronous and idempotent if identical. From the collector's perspective, (1) and (2) can be hard to distinguish. However, in both cases, the unacknowledged insert can just be retried immediately. As long as the retried insert query contains the same data in the same order, ClickHouse will automatically ignore the retried insert if the original (unacknowledged) insert succeeded. -For this reason, the ClickStack distribution of the OTel collector uses the batch [batch processor](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). This ensures inserts are sent as consistent batches of rows satisfying the above requirements. If a collector is expected to have high throughput (events per second), and at least 5000 events can be sent in each insert, this is usually the only batching required in the pipeline. In this case the collector will flush batches before the batch processor's `timeout` is reached, ensuring the end-to-end latency of the pipeline remains low and batches are of a consistent size. 
+For this reason, the ClickStack distribution of the OTel collector uses the [batch processor](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md). This ensures inserts are sent as consistent batches of rows satisfying the above requirements. If a collector is expected to have high throughput (events per second), and at least 5000 events can be sent in each insert, this is usually the only batching required in the pipeline. In this case the collector will flush batches before the batch processor's `timeout` is reached, ensuring the end-to-end latency of the pipeline remains low and batches are of a consistent size. -### Use Asynchronous inserts {#use-asynchronous-inserts} +### Use asynchronous inserts {#use-asynchronous-inserts} Typically, users are forced to send smaller batches when the throughput of a collector is low, and yet they still expect data to reach ClickHouse within a minimum end-to-end latency. In this case, small batches are sent when the `timeout` of the batch processor expires. This can cause problems and is when asynchronous inserts are required. This issue is rare if users are sending data to the ClickStack collector acting as a Gateway - by acting as aggregators, they alleviate this problem - see [Collector roles](#collector-roles). diff --git a/docs/use-cases/observability/clickstack/ingesting-data/kubernetes.md b/docs/use-cases/observability/clickstack/ingesting-data/kubernetes.md index 07dbaa4c8be..caacdba2858 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/kubernetes.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/kubernetes.md @@ -17,7 +17,7 @@ This guide integrates the following: To send over application-level metrics or APM/traces, you'll need to add the corresponding language integration to your application as well. ::: -## Creating the OTel Helm Chart configuration files {#creating-the-otel-helm-chart-config-files} +## Creating the OTel Helm chart configuration files {#creating-the-otel-helm-chart-config-files} To collect logs and metrics from both each node and the cluster itself, we'll need to deploy two separate OpenTelemetry collectors. One will be deployed as a DaemonSet to collect logs and metrics from each node, and the other will be deployed as a deployment to collect logs and metrics from the cluster itself. @@ -113,7 +113,7 @@ config: - otlphttp ``` -### Creating the Deployment Configuration {#creating-the-deployment-configuration} +### Creating the deployment configuration {#creating-the-deployment-configuration} To collect Kubernetes events and cluster-wide metrics, we'll need to deploy a separate OpenTelemetry collector as a deployment. 
@@ -171,13 +171,13 @@ the [OpenTelemetry Helm Chart](https://github.com/open-telemetry/opentelemetry-h Add the OpenTelemetry Helm repo: -```bash copy +```shell copy helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts # Add OTel Helm repo ``` Install the chart with the above config: -```bash copy +```shell copy helm install my-opentelemetry-collector-deployment open-telemetry/opentelemetry-collector -f deployment.yaml helm install my-opentelemetry-collector-daemonset open-telemetry/opentelemetry-collector -f daemonset.yaml ``` diff --git a/docs/use-cases/observability/clickstack/ingesting-data/opentelemetry.md b/docs/use-cases/observability/clickstack/ingesting-data/opentelemetry.md index a8a418875d0..13127876c26 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/opentelemetry.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/opentelemetry.md @@ -45,7 +45,7 @@ To send data to ClickStack, point your OpenTelemetry instrumentation to the foll For most [language SDKs](/use-cases/observability/clickstack/sdks) and telemetry libraries that support OpenTelemetry, users can simply set `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable in your application: -```bash +```shell export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 ``` @@ -56,7 +56,7 @@ In addition, an authorization header containing the API ingestion key is require For language SDKs, this can either be set by an `init` function or via an`OTEL_EXPORTER_OTLP_HEADERS` environment variable e.g.: -```bash +```shell OTEL_EXPORTER_OTLP_HEADERS='authorization=' ``` diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/browser.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/browser.md index e19db263a5a..dd73220edab 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/browser.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/browser.md @@ -25,7 +25,7 @@ This guide integrates the following: - **XHR/Fetch/Websocket Requests** - **Exceptions** -## Getting Started {#getting-started} +## Getting started {#getting-started}
@@ -36,13 +36,13 @@ This guide integrates the following: Use the following command to install the [browser package](https://www.npmjs.com/package/@hyperdx/browser). -```bash +```shell npm install @hyperdx/browser ``` **Initialize ClickStack** -```js +```javascript import HyperDX from '@hyperdx/browser'; HyperDX.init({ @@ -115,7 +115,7 @@ with the user information. corresponding values, but can be omitted. Any other additional values can be specified and used to search for events. -```js +```javascript HyperDX.setGlobalAttributes({ userId: user.id, userEmail: user.email, @@ -131,7 +131,7 @@ If you're using React, you can automatically capture errors that occur within React error boundaries by passing your error boundary component into the `attachToReactErrorBoundary` function. -```js +```javascript // Import your ErrorBoundary (we're using react-error-boundary as an example) import { ErrorBoundary } from 'react-error-boundary'; @@ -148,7 +148,7 @@ event metadata. Example: -```js +```javascript HyperDX.addAction('Form-Completed', { formId: 'signup-form', formName: 'Signup Form', @@ -160,7 +160,7 @@ HyperDX.addAction('Form-Completed', { To enable or disable network capture dynamically, simply invoke the `enableAdvancedNetworkCapture` or `disableAdvancedNetworkCapture` function as needed. -```js +```javascript HyperDX.enableAdvancedNetworkCapture(); ``` @@ -174,7 +174,7 @@ download, etc. via [`PerformanceResourceTiming`](https://developer.mozilla.org/e If you're using `express` with `cors` packages, you can use the following snippet to enable the header: -```js +```javascript var cors = require('cors'); var onHeaders = require('on-headers'); diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/deno.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/deno.md index 394245a30d4..e82b48a4f36 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/deno.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/deno.md @@ -43,7 +43,7 @@ log.getLogger('my-otel-logger').info('Hello from Deno!'); ### Run the application {#run-the-application} -```sh +```shell OTEL_EXPORTER_OTLP_HEADERS="authorization=" \ OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \ OTEL_SERVICE_NAME="" \ diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/elixir.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/elixir.md index e14dd5ca4c1..ad65b2937aa 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/elixir.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/elixir.md @@ -50,7 +50,7 @@ config :logger, Afterwards you'll need to configure the following environment variables in your shell to ship telemetry to ClickStack: -```bash +```shell export HYPERDX_API_KEY='' \ OTEL_SERVICE_NAME='' ``` diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/golang.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/golang.md index ba49cbf7bd0..9711a330433 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/golang.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/golang.md @@ -29,7 +29,7 @@ instrumentation isn't required to get value out of tracing. To install the OpenTelemetry and HyperDX Go packages, use the command below. 
It is recommended to check out the [current instrumentation packages](https://github.com/open-telemetry/opentelemetry-go-contrib/tree/v1.4.0/instrumentation#instrumentation-packages) and install the necessary packages to ensure that the trace information is attached correctly. -```bash +```shell go get -u go.opentelemetry.io/otel go get -u github.com/hyperdxio/otel-config-go go get -u github.com/hyperdxio/opentelemetry-go @@ -40,7 +40,7 @@ go get -u github.com/hyperdxio/opentelemetry-logs-go For this example, we will be using `net/http/otelhttp`. -```sh +```shell go get -u go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp ``` @@ -149,7 +149,7 @@ func ExampleHandler(w http.ResponseWriter, r *http.Request) { For this example, we will be using `gin-gonic/gin`. -```sh +```shell go get -u go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin ``` @@ -234,7 +234,7 @@ func main() { Afterwards you'll need to configure the following environment variables in your shell to ship telemetry to ClickStack: -```sh +```shell export OTEL_EXPORTER_OTLP_ENDPOINT=https://localhost:4318 \ OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \ OTEL_SERVICE_NAME='' \ diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/index.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/index.md index bbaf8fbe9ef..ced169c0190 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/index.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/index.md @@ -12,13 +12,13 @@ Language SDKs are responsible for collecting telemetry from within your applicat In browser-based environments, SDKs may also be responsible for collecting **session data**, including UI events, clicks, and navigation thus enabling replays of user sessions. -## How It Works {#how-it-works} +## How it works {#how-it-works} -1. Your application uses a a ClickStack SDK (e.g., Node.js, Python, Go). These SDKs are based on the OpenTelemetry SDKs with additional features and usability enhancements. +1. Your application uses a ClickStack SDK (e.g., Node.js, Python, Go). These SDKs are based on the OpenTelemetry SDKs with additional features and usability enhancements. 2. The SDK collects and exports traces and logs via OTLP (HTTP or gRPC). 3. The OpenTelemetry collector receives the telemetry and writes it to ClickHouse via the configured exporters. -## Supported Languages {#supported-languages} +## Supported languages {#supported-languages} :::note OpenTelemetry compatibility While ClickStack offers its own language SDKs with enhanced telemetry and features, users can also use their existing OpenTelemetry SDKs seamlessly. @@ -40,11 +40,11 @@ While ClickStack offers its own language SDKs with enhanced telemetry and featur | React Native | React Native mobile applications | [Documentation](/use-cases/observability/clickstack/sdks/react-native) | | Ruby | Ruby on Rails applications and web services | [Documentation](/use-cases/observability/clickstack/sdks/ruby-on-rails) | -## Securing with API Key {#securing-api-key} +## Securing with API key {#securing-api-key} In order to send data to ClickStack via the OTel collector, SDKs will need to specify an ingestion API key. 
This can either be set using an `init` function in the SDK or an `OTEL_EXPORTER_OTLP_HEADERS` environment variable: -```bash +```shell OTEL_EXPORTER_OTLP_HEADERS='authorization=' ``` @@ -52,11 +52,11 @@ This API key is generated by the HyperDX application, and is available via the a For most [language SDKs](/use-cases/observability/clickstack/sdks) and telemetry libraries that support OpenTelemetry, users can simply set `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable in your application or specify it during initialization of the SDK: -```bash +```shell export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 ``` -## Kubernetes Integration {#kubernetes-integration} +## Kubernetes integration {#kubernetes-integration} All SDKs support automatic correlation with Kubernetes metadata (pod name, namespace, etc.) when running in a Kubernetes environment. This allows you to: diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/java.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/java.md index 8b9908a257b..50e77dbf7ef 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/java.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/java.md @@ -36,7 +36,7 @@ and place the JAR in your preferred directory. The JAR file contains the agent and instrumentation libraries. You can also use the following command to download the agent: -```bash +```shell curl -L -O https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar ``` @@ -44,7 +44,7 @@ curl -L -O https://github.com/open-telemetry/opentelemetry-java-instrumentation/ Afterwards you'll need to configure the following environment variables in your shell to ship telemetry to ClickStack: -```bash +```shell export JAVA_TOOL_OPTIONS="-javaagent:PATH/TO/opentelemetry-javaagent.jar" \ OTEL_EXPORTER_OTLP_ENDPOINT=https://localhost:4318 \ OTEL_EXPORTER_OTLP_HEADERS='authorization=' \ @@ -59,7 +59,7 @@ The `OTEL_EXPORTER_OTLP_HEADERS` environment variable contains the API Key avail ### Run the application with OpenTelemetry Java agent {#run-the-application-with-otel-java-agent} -```sh +```shell java -jar target/ ```
diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/nestjs.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/nestjs.md index 336f762267f..42964cde1dc 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/nestjs.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/nestjs.md @@ -30,7 +30,7 @@ integration to your application as well._ Import `HyperDXNestLoggerModule` into the root `AppModule` and use the `forRoot()` method to configure it. -```js +```javascript import { Module } from '@nestjs/common'; import { HyperDXNestLoggerModule } from '@hyperdx/node-logger'; @@ -49,7 +49,7 @@ export class AppModule {} Afterward, the winston instance will be available to inject across the entire project using the `HDX_LOGGER_MODULE_PROVIDER` injection token: -```js +```javascript import { Controller, Inject } from '@nestjs/common'; import { HyperDXNestLoggerModule, HyperDXNestLogger } from '@hyperdx/node-logger'; @@ -84,7 +84,7 @@ into the Logger class, forwarding all calls to it: Create the logger in the `main.ts` file -```js +```javascript import { HyperDXNestLoggerModule } from '@hyperdx/node-logger'; async function bootstrap() { @@ -102,7 +102,7 @@ bootstrap(); Change your main module to provide the Logger service: -```js +```javascript import { Logger, Module } from '@nestjs/common'; @Module({ @@ -113,7 +113,7 @@ export class AppModule {} Then inject the logger simply by type hinting it with the Logger from `@nestjs/common`: -```js +```javascript import { Controller, Logger } from '@nestjs/common'; @Controller('cats') diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/nextjs.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/nextjs.md index 391550b1bce..38e83a6ec8d 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/nextjs.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/nextjs.md @@ -31,7 +31,7 @@ To get started, you'll need to enable the Next.js instrumentation hook by settin **Example:** -```js +```javascript const nextConfig = { experimental: { instrumentationHook: true, @@ -57,14 +57,14 @@ module.exports = nextConfig; -```bash +```shell npm install @hyperdx/node-opentelemetry ``` -```bash +```shell yarn add @hyperdx/node-opentelemetry ``` @@ -75,7 +75,7 @@ yarn add @hyperdx/node-opentelemetry Create a file called `instrumentation.ts` (or `.js`) in your Next.js project root with the following contents: -```js +```javascript export async function register() { if (process.env.NEXT_RUNTIME === 'nodejs') { const { init } = await import('@hyperdx/node-opentelemetry'); diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/nodejs.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/nodejs.md index 4a34944035c..ad8c5612dd9 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/nodejs.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/nodejs.md @@ -30,14 +30,14 @@ Use the following command to install the [ClickStack OpenTelemetry package](http -```bash +```shell npm install @hyperdx/node-opentelemetry ``` -```bash +```shell yarn add @hyperdx/node-opentelemetry ``` @@ -46,12 +46,12 @@ yarn add @hyperdx/node-opentelemetry ### Initializing the SDK {#initializin-the-sdk} -To initialize the SDK, you'll need to call the `init` function at the top of the entry point of your application. +To initialize the SDK, you'll need to call the `init` function at the top of the entry point of your application. 
-```js +```javascript const HyperDX = require('@hyperdx/node-opentelemetry'); HyperDX.init({ @@ -63,7 +63,7 @@ HyperDX.init({ -```js +```javascript import * as HyperDX from '@hyperdx/node-opentelemetry'; HyperDX.init({ @@ -151,7 +151,7 @@ To enable this, you'll need to add the following code to the end of your applica -```js +```javascript const HyperDX = require('@hyperdx/node-opentelemetry'); HyperDX.init({ apiKey: 'YOUR_INGESTION_API_KEY', @@ -171,7 +171,7 @@ app.listen(3000); -```js +```javascript const Koa = require("koa"); const Router = require("@koa/router"); const HyperDX = require('@hyperdx/node-opentelemetry'); @@ -193,7 +193,7 @@ app.listen(3030); -```js +```javascript const HyperDX = require('@hyperdx/node-opentelemetry'); function myErrorHandler(error, req, res, next) { @@ -211,7 +211,7 @@ function myErrorHandler(error, req, res, next) { If you're having trouble with the SDK, you can enable verbose logging by setting the `OTEL_LOG_LEVEL` environment variable to `debug`. -```sh +```shell export OTEL_LOG_LEVEL=debug ``` @@ -242,7 +242,7 @@ tag and propagate identifiers yourself. with the corresponding values, but can be omitted. Any other additional values can be specified and used to search for events. -```ts +```typescript import * as HyperDX from '@hyperdx/node-opentelemetry'; app.use((req, res, next) => { @@ -262,7 +262,7 @@ Make sure to enable beta mode by setting `HDX_NODE_BETA_MODE` environment variable to 1 or by passing `betaMode: true` to the `init` function to enable trace attributes. -```sh +```shell export HDX_NODE_BETA_MODE=1 ``` @@ -316,14 +316,14 @@ Node.js `--require` flag. The CLI installation exposes a wider range of auto-ins -```bash +```shell HYPERDX_API_KEY='' OTEL_SERVICE_NAME='' npx opentelemetry-instrument index.js ``` -```bash +```shell HYPERDX_API_KEY='' OTEL_SERVICE_NAME='' ts-node -r '@hyperdx/node-opentelemetry/build/src/tracing' index.js ``` @@ -331,7 +331,7 @@ HYPERDX_API_KEY='' OTEL_SERVICE_NAME='' ts-no -```js +```javascript // Import this at the very top of the first file loaded in your application // You'll still specify your API key via the `HYPERDX_API_KEY` environment variable import { initSDK } from '@hyperdx/node-opentelemetry'; @@ -352,7 +352,7 @@ _The `OTEL_SERVICE_NAME` environment variable is used to identify your service i To enable uncaught exception capturing, you'll need to set the `HDX_NODE_EXPERIMENTAL_EXCEPTION_CAPTURE` environment variable to 1. -```sh +```shell HDX_NODE_EXPERIMENTAL_EXCEPTION_CAPTURE=1 ``` diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/python.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/python.md index ccfc61f70fa..2d64351bf47 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/python.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/python.md @@ -27,14 +27,14 @@ This guide integrates: Use the following command to install the [ClickStack OpenTelemetry package](https://pypi.org/project/hyperdx-opentelemetry/). -```bash +```shell pip install hyperdx-opentelemetry ``` Install the OpenTelemetry automatic instrumentation libraries for the packages used by your Python application. We recommend that you use the `opentelemetry-bootstrap` tool that comes with the OpenTelemetry Python SDK to scan your application packages and generate the list of available libraries. 
-```bash +```shell opentelemetry-bootstrap -a install ``` @@ -42,7 +42,7 @@ opentelemetry-bootstrap -a install Afterwards you'll need to configure the following environment variables in your shell to ship telemetry to ClickStack: -```bash +```shell export HYPERDX_API_KEY='' \ OTEL_SERVICE_NAME='' \ OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 @@ -54,7 +54,7 @@ _The `OTEL_SERVICE_NAME` environment variable is used to identify your service i Now you can run the application with the OpenTelemetry Python agent (`opentelemetry-instrument`). -```bash +```shell opentelemetry-instrument python app.py ``` @@ -105,7 +105,7 @@ By enabling network capture features, developers gain the capability to debug HTTP request headers and body payloads effectively. This can be accomplished simply by setting `HYPERDX_ENABLE_ADVANCED_NETWORK_CAPTURE` flag to 1. -```bash +```shell export HYPERDX_ENABLE_ADVANCED_NETWORK_CAPTURE=1 ``` @@ -133,7 +133,7 @@ When debug mode is activated, all telemetries will be printed to the console, allowing you to verify if your application is properly instrumented with the expected data. -```bash +```shell export DEBUG=true ``` diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/react-native.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/react-native.md index 265ac6fffd9..da822eedbb2 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/react-native.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/react-native.md @@ -21,7 +21,7 @@ This Guide Integrates: Use the following command to install the [ClickStack React Native package](https://www.npmjs.com/package/@hyperdx/otel-react-native). -```bash +```shell npm install @hyperdx/otel-react-native ``` @@ -29,7 +29,7 @@ npm install @hyperdx/otel-react-native Initialize the library as early in your app lifecycle as possible: -```js +```javascript import { HyperDXRum } from '@hyperdx/otel-react-native'; HyperDXRum.init({ @@ -50,7 +50,7 @@ with the user information. corresponding values, but can be omitted. Any other additional values can be specified and used to search for events. -```js +```javascript HyperDXRum.setGlobalAttributes({ userId: user.id, userEmail: user.email, @@ -66,7 +66,7 @@ To instrument applications running on React Native versions lower than 0.68, edit your `metro.config.js` file to force metro to use browser specific packages. For example: -```js +```javascript const defaultResolver = require('metro-resolver'); module.exports = { @@ -112,7 +112,7 @@ module.exports = { The following example shows how to instrument navigation: -```js +```javascript import { startNavigationTracking } from '@hyperdx/otel-react-native'; export default function App() { diff --git a/docs/use-cases/observability/clickstack/ingesting-data/sdks/ruby.md b/docs/use-cases/observability/clickstack/ingesting-data/sdks/ruby.md index 8eaf9df0b14..da880c9dd2b 100644 --- a/docs/use-cases/observability/clickstack/ingesting-data/sdks/ruby.md +++ b/docs/use-cases/observability/clickstack/ingesting-data/sdks/ruby.md @@ -28,7 +28,7 @@ _To send logs to ClickStack, please send logs via the [OpenTelemetry collector]( Use the following command to install the OpenTelemetry package. 
-```bash +```shell bundle add opentelemetry-sdk opentelemetry-instrumentation-all opentelemetry-exporter-otlp ``` @@ -77,7 +77,7 @@ end Afterwards you'll need to configure the following environment variables in your shell to ship telemetry to ClickStack: -```bash +```shell export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \ OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \ OTEL_SERVICE_NAME='' \ diff --git a/docs/use-cases/observability/clickstack/migration/elastic/concepts.md b/docs/use-cases/observability/clickstack/migration/elastic/concepts.md new file mode 100644 index 00000000000..4f97360fb90 --- /dev/null +++ b/docs/use-cases/observability/clickstack/migration/elastic/concepts.md @@ -0,0 +1,252 @@ +--- +slug: /use-cases/observability/clickstack/migration/elastic/concepts +title: 'Equivalent concepts in ClickStack and Elastic' +pagination_prev: null +pagination_next: null +sidebar_label: 'Equivalent concepts' +sidebar_position: 1 +description: 'Equivalent concepts - ClickStack and Elastic' +show_related_blogs: true +keywords: ['Elasticsearch'] +--- + +import Image from '@theme/IdealImage'; +import elasticsearch from '@site/static/images/use-cases/observability/elasticsearch.png'; +import clickhouse from '@site/static/images/use-cases/observability/clickhouse.png'; +import clickhouse_execution from '@site/static/images/use-cases/observability/clickhouse-execution.png'; +import elasticsearch_execution from '@site/static/images/use-cases/observability/elasticsearch-execution.png'; +import elasticsearch_transforms from '@site/static/images/use-cases/observability/es-transforms.png'; +import clickhouse_mvs from '@site/static/images/use-cases/observability/ch-mvs.png'; + +## Elastic Stack vs ClickStack {#elastic-vs-clickstack} + +Both Elastic Stack and ClickStack cover the core roles of an observability platform, but they approach these roles with different design philosophies. These roles include: + +- **UI and Alerting**: tools for querying data, building dashboards, and managing alerts. +- **Storage and Query Engine**: the backend systems responsible for storing observability data and serving analytical queries. +- **Data Collection and ETL**: agents and pipelines that gather telemetry data and process it before ingestion. + +The table below outlines how each stack maps its components to these roles: + +| **Role** | **Elastic Stack** | **ClickStack** | **Comments** | +|--------------------------|--------------------------------------------------|--------------------------------------------------|--------------| +| **UI & Alerting** | **Kibana** — dashboards, search, and alerts | **HyperDX** — real-time UI, search, and alerts | Both serve as the primary interface for users, including visualizations and alert management. HyperDX is purpose-built for observability and tightly coupled to OpenTelemetry semantics. | +| **Storage & Query Engine** | **Elasticsearch** — JSON document store with inverted index | **ClickHouse** — column-oriented database with vectorized engine | Elasticsearch uses an inverted index optimized for search; ClickHouse uses columnar storage and SQL for high-speed analytics over structured and semi-structured data. | +| **Data Collection** | **Elastic Agent**, **Beats** (e.g. Filebeat, Metricbeat) | **OpenTelemetry Collector** (edge + gateway) | Elastic supports custom shippers and a unified agent managed by Fleet. ClickStack relies on OpenTelemetry, allowing vendor-neutral data collection and processing. 
| +| **Instrumentation SDKs** | **Elastic APM agents** (proprietary) | **OpenTelemetry SDKs** (distributed by ClickStack) | Elastic SDKs are tied to the Elastic stack. ClickStack builds on OpenTelemetry SDKs for logs, metrics, and traces in major languages. | +| **ETL / Data Processing** | **Logstash**, ingest pipelines | **OpenTelemetry Collector** + ClickHouse materialized views | Elastic uses ingest pipelines and Logstash for transformation. ClickStack shifts compute to insert time via materialized views and OTel collector processors, which transform data efficiently and incrementally. | +| **Architecture Philosophy** | Vertically integrated, proprietary agents and formats | Open standard–based, loosely coupled components | Elastic builds a tightly integrated ecosystem. ClickStack emphasizes modularity and standards (OpenTelemetry, SQL, object storage) for flexibility and cost-efficiency. | + +ClickStack emphasizes open standards and interoperability, being fully OpenTelemetry-native from collection to UI. In contrast, Elastic provides a tightly coupled but more vertically integrated ecosystem with proprietary agents and formats. + +Given that **Elasticsearch** and **ClickHouse** are the core engines responsible for data storage, processing, and querying in their respective stacks, understanding how they differ is essential. These systems underpin the performance, scalability, and flexibility of the entire observability architecture. The following section explores the key differences between Elasticsearch and ClickHouse - including how they model data, handle ingestion, execute queries, and manage storage. + +## Elasticsearch vs ClickHouse {#elasticsearch-vs-clickhouse} + +ClickHouse and Elasticsearch organize and query data using different underlying models, but many core concepts serve similar purposes. This section outlines key equivalences for users familiar with Elastic, mapping them to their ClickHouse counterparts. While the terminology differs, most observability workflows can be reproduced - often more efficiently - in ClickStack. + +### Core structural concepts {#core-structural-concepts} + +| **Elasticsearch** | **ClickHouse / SQL** | **Description** | +|-------------------|----------------------|------------------| +| **Field** | **Column** | The basic unit of data, holding one or more values of a specific type. Elasticsearch fields can store primitives as well as arrays and objects. Fields can have only one type. ClickHouse also supports arrays and objects (`Tuples`, `Maps`, `Nested`), as well as dynamic types like [`Variant`](/sql-reference/data-types/variant) and [`Dynamic`](/sql-reference/data-types/dynamic) which allow a column to have multiple types. | +| **Document** | **Row** | A collection of fields (columns). Elasticsearch documents are more flexible by default, with new fields added dynamically based on the data (the type is inferred automatically). ClickHouse rows are schema-bound by default, with users needing to insert all columns for a row or a subset. The [`JSON`](/integrations/data-formats/json/overview) type in ClickHouse supports equivalent semi-structured dynamic column creation based on the inserted data. | +| **Index** | **Table** | The unit of query execution and storage. In both systems, queries run against indices or tables, which store rows/documents. | +| *Implicit* | Schema (SQL) | SQL schemas group tables into namespaces, often used for access control. 
Elasticsearch and ClickHouse don't have schemas, but both support row-and table-level security via roles and RBAC. | +| **Cluster** | **Cluster / Database** | Elasticsearch clusters are runtime instances that manage one or more indices. In ClickHouse, databases organize tables within a logical namespace, providing the same logical grouping as a cluster in Elasticsearch. A ClickHouse cluster is a distributed set of nodes, similar to Elasticsearch, but is decoupled and independent of the data itself. | + +### Data modeling and flexibility {#data-modeling-and-flexibility} + +Elasticsearch is known for its schema flexibility through [dynamic mappings](https://www.elastic.co/docs/manage-data/data-store/mapping/dynamic-mapping). Fields are created as documents are ingested, and types are inferred automatically - unless a schema is specified. ClickHouse is stricter by default — tables are defined with explicit schemas — but offers flexibility through [`Dynamic`](/sql-reference/data-types/dynamic), [`Variant`](/sql-reference/data-types/variant), and [`JSON`](/integrations/data-formats/json/overview) types. These enable ingestion of semi-structured data, with dynamic column creation and type inference similar to Elasticsearch. Similarly, the [`Map`](/sql-reference/data-types/map) type allows arbitrary key-value pairs to be stored - although a single type is enforced for both the key and value. + +ClickHouse's approach to type flexibility is more transparent and controlled. Unlike Elasticsearch, where type conflicts can cause ingestion errors, ClickHouse allows mixed-type data in [`Variant`](/sql-reference/data-types/variant) columns and supports schema evolution through the use of the [`JSON`](/integrations/data-formats/json/overview) type. + +If not using [`JSON`](/integrations/data-formats/json/overview), the schema is statically-defined. If values are not provided for a row, they will either be defined as [`Nullable`](/sql-reference/data-types/nullable) (not used in ClickStack) or revert to the default value for the type e.g. empty value for `String`. + +### Ingestion and transformation {#ingestion-and-transformation} + +Elasticsearch uses ingest pipelines with processors (e.g., `enrich`, `rename`, `grok`) to transform documents before indexing. In ClickHouse, similar functionality is achieved using [**incremental materialized views**](/materialized-view/incremental-materialized-view), which can [filter, transform](/materialized-view/incremental-materialized-view#filtering-and-transformation), or [enrich](/materialized-view/incremental-materialized-view#lookup-table) incoming data and insert results into target tables. You can also insert data to a `Null` table engine if you only need the output of the materialized view to be stored. This means that only the results of any materialized views are preserved, but the original data is discarded - thus saving storage space. + +For enrichment, Elasticsearch supports dedicated [enrich processors](https://www.elastic.co/docs/reference/enrich-processor/enrich-processor) to add context to documents. In ClickHouse, [**dictionaries**](/dictionary) can be used at both [query time](/dictionary#query-time-enrichment) and [ingest time](/dictionary#index-time-enrichment) to enrich rows - for example, to [map IPs to locations](/use-cases/observability/schema-design#using-ip-dictionaries) or apply [user agent lookups](/use-cases/observability/schema-design#using-regex-dictionaries-user-agent-parsing) on insert. 
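+
+To make this insert-time pattern concrete, the following is a minimal sketch - all names (`raw_logs`, `enriched_logs`, `enrich_logs_mv`, and a `service_owner_dict` dictionary keyed by service name, assumed to be defined elsewhere) are hypothetical - showing a materialized view that filters, transforms, and dictionary-enriches each inserted block before writing it to a target table:
+
+```sql
+-- Target table that stores the filtered and enriched rows.
+CREATE TABLE enriched_logs
+(
+    Timestamp   DateTime64(3),
+    ServiceName LowCardinality(String),
+    Team        String,
+    Severity    LowCardinality(String),
+    Body        String
+)
+ENGINE = MergeTree
+ORDER BY (ServiceName, Timestamp);
+
+-- Incremental materialized view: runs against every block inserted into raw_logs,
+-- transforming and enriching only the new rows and writing the result to enriched_logs.
+CREATE MATERIALIZED VIEW enrich_logs_mv TO enriched_logs AS
+SELECT
+    Timestamp,
+    ServiceName,
+    dictGet('service_owner_dict', 'team', tuple(ServiceName)) AS Team, -- ingest-time enrichment
+    lower(Severity) AS Severity,                                       -- transformation
+    Body
+FROM raw_logs
+WHERE Severity != 'DEBUG';                                             -- filtering
+```
+
+Because the view only ever sees newly inserted blocks, the transformation cost is paid once at ingest rather than on every query.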
+ + +### Query languages {#query-languages} + +Elasticsearch supports a [number of query languages](https://www.elastic.co/docs/explore-analyze/query-filter/languages) including [DSL](https://www.elastic.co/docs/explore-analyze/query-filter/languages/querydsl), [ES|QL](https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql), [EQL](https://www.elastic.co/docs/explore-analyze/query-filter/languages/eql) and [KQL](https://www.elastic.co/docs/explore-analyze/query-filter/languages/kql) (Lucene style) queries, but has limited support for joins — only **left outer joins** are available via [`ES|QL`](https://www.elastic.co/guide/en/elasticsearch/reference/8.x/esql-commands.html#esql-lookup-join). ClickHouse supports **full SQL syntax**, including [all join types](/sql-reference/statements/select/join#supported-types-of-join), [window functions](/sql-reference/window-functions), subqueries (and correlated), and CTEs. This is a major advantage for users needing to correlate between observability signals and business or infrastructure data. + +In ClickStack, [HyperDX provides a Lucene-compatible search interface](/use-cases/observability/clickstack/search) for ease of transition, alongside full SQL support via the ClickHouse backend. This syntax is comparable to the [Elastic query string](https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-query-string-query#query-string-syntax) syntax. For an exact comparison of this syntax, see ["Searching in ClickStack and Elastic"](/use-cases/observability/clickstack/migration/elastic/search). + +### File formats and interfaces {#file-formats-and-interfaces} + +Elasticsearch supports JSON (and [limited CSV](https://www.elastic.co/docs/reference/enrich-processor/csv-processor)) ingestion. ClickHouse supports over **70 file formats**, including Parquet, Protobuf, Arrow, CSV, and others — for both ingestion and export. This makes it easier to integrate with external pipelines and tools. + +Both systems offer a REST API, but ClickHouse also provides a **native protocol** for low-latency, high-throughput interaction. The native interface supports query progress, compression, and streaming more efficiently than HTTP, and is the default for most production ingestion. + +### Indexing and storage {#indexing-and-storage} + +Elasticsearch + +The concept of sharding is fundamental to Elasticsearch's scalability model. Each ① [**index**](https://www.elastic.co/blog/what-is-an-elasticsearch-index) is broken into **shards**, each of which is a physical Lucene index stored as segments on disk. A shard can have one or more physical copies called replica shards for resilience. For scalability, shards and replicas can be distributed over several nodes. A single shard ② consists of one or more immutable segments. A segment is the basic indexing structure of Lucene, the Java library providing the indexing and search features on which Elasticsearch is based. + +:::note Insert processing in Elasticsearch +Ⓐ Newly inserted documents Ⓑ first go into an in-memory indexing buffer that is flushed by default once per second. A routing formula is used to determine the target shard for flushed documents, and a new segment is written for the shard on disk. To improve query efficiency and enable the physical deletion of deleted or updated documents, segments are continuously merged in the background into larger segments until they reach a max size of 5 GB. It is, however, possible to force a merge into larger segments. 
+::: + +Elasticsearch recommends sizing shards to around [50 GB or 200 million documents](https://www.elastic.co/docs/deploy-manage/production-guidance/optimize-performance/size-shards) due to [JVM heap and metadata overhead](https://www.elastic.co/docs/deploy-manage/production-guidance/optimize-performance/size-shards#each-shard-has-overhead). There's also a hard limit of [2 billion documents per shard](https://www.elastic.co/docs/deploy-manage/production-guidance/optimize-performance/size-shards#troubleshooting-max-docs-limit). Elasticsearch parallelizes queries across shards, but each shard is processed using a **single thread**, making over-sharding both costly and counterproductive. This inherently tightly couples sharding to scaling, with more shards (and nodes) required to scale performance. + +Elasticsearch indexes all fields into [**inverted indices**](https://www.elastic.co/docs/manage-data/data-store/index-basics) for fast search, optionally using [**doc values**](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/doc-values) for aggregations, sorting and scripted field access. Numeric and geo fields use [Block K-D trees](https://users.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf) for searches on geospatial data and numeric and date ranges. + +Importantly, Elasticsearch stores the full original document in [`_source`](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/mapping-source-field) (compressed with `LZ4`, `Deflate` or `ZSTD`), while ClickHouse does not store a separate document representation. Data is reconstructed from columns at query time, saving storage space. This same capability is possible for Elasticsearch using [Synthetic `_source`](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/mapping-source-field#synthetic-source), with some [restrictions](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/mapping-source-field#synthetic-source-restrictions). Disabling of `_source` also has [implications](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/mapping-source-field#include-exclude) which don't apply to ClickHouse. + +In Elasticsearch, [index mappings](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html) (equivalent to table schemas in ClickHouse) control the type of fields and the data structures used for this persistence and querying. + +ClickHouse, by contrast, is **column-oriented** — every column is stored independently but always sorted by the table's primary/ordering key. This ordering enables [sparse primary indexes](/primary-indexes), which allow ClickHouse to skip over data during query execution efficiently. When queries filter by primary key fields, ClickHouse reads only the relevant parts of each column, significantly reducing disk I/O and improving performance — even without a full index on every column. + +ClickHouse + +ClickHouse also supports [**skip indexes**](/optimize/skipping-indexes), which accelerate filtering by precomputing index data for selected columns. These must be explicitly defined but can significantly improve performance. Additionally, ClickHouse lets users specify [compression codecs](/use-cases/observability/schema-design#using-codecs) and compression algorithms per column — something Elasticsearch does not support (its [compression](https://www.elastic.co/docs/reference/elasticsearch/index-settings/index-modules) only applies to `_source` JSON storage). 
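+
+As an illustration of these storage controls, the sketch below defines a hypothetical logs table (all names and codec choices are assumptions, not a recommended schema): the `ORDER BY` clause defines the sparse primary index, a skip index is declared on a non-key column, and compression codecs are assigned per column:
+
+```sql
+CREATE TABLE logs
+(
+    Timestamp   DateTime64(3) CODEC(Delta, ZSTD),  -- per-column codecs
+    ServiceName LowCardinality(String),
+    Host        LowCardinality(String),
+    Severity    LowCardinality(String),
+    TraceId     String CODEC(ZSTD(1)),
+    Body        String CODEC(ZSTD(3)),
+    -- Skip index: lets queries skip granules that cannot contain the searched trace ID.
+    INDEX idx_trace_id TraceId TYPE bloom_filter GRANULARITY 4
+)
+ENGINE = MergeTree
+-- The ordering key drives the sparse primary index used to prune data at query time.
+ORDER BY (ServiceName, Timestamp);
+```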
+ +ClickHouse also supports sharding, but its model is designed to favor **vertical scaling**. A single shard can store **trillions of rows** and continues to perform efficiently as long as memory, CPU, and disk permit. Unlike Elasticsearch, there is **no hard row limit** per shard. Shards in ClickHouse are logical — effectively individual tables — and do not require partitioning unless the dataset exceeds the capacity of a single node. This typically occurs due to disk size constraints, with sharding ① introduced only when horizontal scale-out is necessary - reducing complexity and overhead. In this case, similar to Elasticsearch, a shard will hold a subset of the data. The data within a single shard is organized as a collection of ② immutable data parts containing ③ several data structures. + +Processing within a ClickHouse shard is **fully parallelized**, and users are encouraged to scale vertically to avoid the network costs associated with moving data across nodes. + +:::note Insert processing in ClickHouse +Inserts in ClickHouse are **synchronous by default** — the write is acknowledged only after commit — but can be configured for **asynchronous inserts** to match Elastic-like buffering and batching. If [asynchronous data inserts](https://clickhouse.com/blog/asynchronous-data-inserts-in-clickhouse) are used, Ⓐ newly inserted rows first go into an Ⓑ in-memory insert buffer that is flushed by default once every 200 milliseconds. If multiple shards are used, a [distributed table](/engines/table-engines/special/distributed) is used for routing newly inserted rows to their target shard. A new part is written for the shard on disk. +::: + +### Distribution and replication {#distribution-and-replication} + +While both Elasticsearch and ClickHouse use clusters, shards, and replicas to ensure scalability and fault tolerance, their models differ significantly in implementation and performance characteristics. + +Elasticsearch uses a **primary-secondary** model for replication. When data is written to a primary shard, it is synchronously copied to one or more replicas. These replicas are themselves full shards distributed across nodes to ensure redundancy. Elasticsearch acknowledges writes only after all required replicas confirm the operation — a model that provides near **sequential consistency**, although **dirty reads** from replicas are possible before full sync. A **master node** coordinates the cluster, managing shard allocation, health, and leader election. + +Conversely, ClickHouse employs **eventual consistency** by default, coordinated by **Keeper** - a lightweight alternative to ZooKeeper. Writes can be sent to any replica directly or via a [**distributed table**](/engines/table-engines/special/distributed), which automatically selects a replica. Replication is asynchronous - changes are propagated to other replicas after the write is acknowledged. For stricter guarantees, ClickHouse [supports **sequential consistency**](/migrations/postgresql/appendix#sequential-consistency), where writes are acknowledged only after being committed across replicas, though this mode is rarely used due to its performance impact. Distributed tables unify access across multiple shards, forwarding `SELECT` queries to all shards and merging the results. For `INSERT` operations, they balance the load by evenly routing data across shards. ClickHouse's replication is highly flexible: any replica (a copy of a shard) can accept writes, and all changes are asynchronously synchronized to others. 
This architecture allows uninterrupted query serving during failures or maintenance, with resynchronization handled automatically - eliminating the need for primary-secondary enforcement at the data layer. + +:::note ClickHouse Cloud +In **ClickHouse Cloud**, the architecture introduces a shared-nothing compute model where a single **shard is backed by object storage**. This replaces traditional replica-based high availability, allowing the shard to be **read and written by multiple nodes simultaneously**. The separation of storage and compute enables elastic scaling without explicit replica management. +::: + +In summary: + +- **Elastic**: Shards are physical Lucene structures tied to JVM memory. Over-sharding introduces performance penalties. Replication is synchronous and coordinated by a master node. +- **ClickHouse**: Shards are logical and vertically scalable, with highly efficient local execution. Replication is asynchronous (but can be sequential), and coordination is lightweight. + +Ultimately, ClickHouse favors simplicity and performance at scale by minimizing the need for shard tuning while still offering strong consistency guarantees when needed. + +### Deduplication and routing {#deduplication-and-routing} + +Elasticsearch de-duplicates documents based on their `_id`, routing them to shards accordingly. ClickHouse does not store a default row identifier but supports **insert-time deduplication**, allowing users to retry failed inserts safely. For more control, `ReplacingMergeTree` and other table engines enable deduplication by specific columns. + +Index routing in Elasticsearch ensures specific documents are always routed to specific shards. In ClickHouse, users can define **shard keys** or use `Distributed` tables to achieve similar data locality. + +### Aggregations and execution model {#aggregations-execution-model} + +While both systems support the aggregation of data, ClickHouse offers significantly [more functions](/sql-reference/aggregate-functions/reference), including statistical, approximate, and specialized analytical functions. + +In observability use cases, one of the most common applications for aggregations is to count how often specific log messages or events occur (and alert in case the frequency is unusual). + +The equivalent to a ClickHouse `SELECT count(*) FROM ... GROUP BY ...` SQL query in Elasticsearch is the [terms aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html), which is an Elasticsearch [bucket aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket.html). + +ClickHouse's `GROUP BY` with a `count(*)` and Elasticsearch's terms aggregation are generally equivalent in terms of functionality, but they differ widely in their implementation, performance, and result quality. + +This aggregation in Elasticsearch [estimates results in "top-N" queries](https://www.elastic.co/docs/reference/aggregations/search-aggregations-bucket-terms-aggregation#terms-agg-doc-count-error) (e.g., top 10 hosts by count), when the queried data spans multiple shards. This estimation improves speed but can compromise accuracy. Users can reduce this error by [inspecting `doc_count_error_upper_bound`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#terms-agg-doc-count-error) and increasing the `shard_size` parameter — at the cost of increased memory usage and slower query performance. 
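+
+For reference, the ClickHouse side of the 'top 10 hosts by count' comparison above is an ordinary `GROUP BY` that returns exact counts with no tuning parameters - a minimal sketch over a hypothetical `logs` table with a `Host` column:
+
+```sql
+SELECT
+    Host,
+    count() AS events
+FROM logs
+GROUP BY Host
+ORDER BY events DESC
+LIMIT 10;
+```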
+ +Elasticsearch also requires a [`size` setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size) for all bucketed aggregations — there's no way to return all unique groups without explicitly setting a limit. High-cardinality aggregations risk hitting [`max_buckets` limits](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-settings.html#search-settings-max-buckets) or require paginating with a [composite aggregation](https://www.elastic.co/docs/reference/aggregations/bucket/composite-aggregation), which is often complex and inefficient. + +ClickHouse, by contrast, performs exact aggregations out of the box. Functions like `count(*)` return accurate results without needing configuration tweaks, making query behavior simpler and more predictable. + +ClickHouse imposes no size limits. You can perform unbounded group-by queries across large datasets. If memory thresholds are exceeded, ClickHouse [can spill to disk](https://clickhouse.com/docs/en/sql-reference/statements/select/group-by#group-by-in-external-memory). Aggregations that group by a prefix of the primary key are especially efficient, often running with minimal memory consumption. + +#### Execution model {#execution-model} + +The above differences can be attributed to the execution models of Elasticsearch and ClickHouse, which take fundamentally different approaches to query execution and parallelism. + +ClickHouse was designed to maximize efficiency on modern hardware. By default, ClickHouse runs a SQL query with N concurrent execution lanes on a machine with N CPU cores: + +ClickHouse execution + +On a single node, execution lanes split data into independent ranges allowing concurrent processing across CPU threads. This includes filtering, aggregation, and sorting. The local results from each lane are eventually merged, and a limit operator is applied, in case the query features a limit clause. + +Query execution is further parallelized by: +1. **SIMD vectorization**: operations on columnar data use [CPU SIMD instructions](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data) (e.g., [AVX512](https://en.wikipedia.org/wiki/AVX-512)), allowing batch processing of values. +2. **Cluster-level parallelism**: in distributed setups, each node performs query processing locally. [Partial aggregation states](https://clickhouse.com/blog/aggregate-functions-combinators-in-clickhouse-for-arrays-maps-and-states#working-with-aggregation-states) are streamed to the initiating node and merged. If the query's `GROUP BY` keys align with the [sharding keys](/architecture/horizontal-scaling#shard), merging can be [minimized or avoided entirely](/operations/settings/settings#distributed_group_by_no_merge). +
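+
+As a rough illustration of these execution lanes, the number of threads used by a query can be inspected and overridden per query - a minimal sketch, again assuming a hypothetical `logs` table:
+
+```sql
+-- EXPLAIN PIPELINE shows the processor graph, including how many parallel
+-- streams are used; max_threads caps the number of execution lanes.
+EXPLAIN PIPELINE
+SELECT ServiceName, count() AS events
+FROM logs
+GROUP BY ServiceName
+SETTINGS max_threads = 4;
+```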
+This model enables efficient scaling across cores and nodes, making ClickHouse well-suited for large-scale analytics. The use of *partial aggregation states* allows intermediate results from different threads and nodes to be merged without loss of accuracy. + +Elasticsearch, by contrast, assigns one thread per shard for most aggregations, regardless of how many CPU cores are available. These threads return shard-local top-N results, which are merged at the coordinating node. This approach can underutilize system resources and introduce potential inaccuracies in global aggregations, particularly when frequent terms are distributed across multiple shards. Accuracy can be improved by increasing the `shard_size` parameter, but this comes at the cost of higher memory usage and query latency. + +Elasticsearch execution + +In summary, ClickHouse executes aggregations and queries with finer-grained parallelism and greater control over hardware resources, while Elasticsearch relies on shard-based execution with more rigid constraints. + +For further details on the mechanics of aggregations in the respective technologies, we recommend the blog post ["ClickHouse vs. Elasticsearch: The Mechanics of Count Aggregations"](https://clickhouse.com/blog/clickhouse_vs_elasticsearch_mechanics_of_count_aggregations#elasticsearch). + +### Data management {#data-management} + +Elasticsearch and ClickHouse take fundamentally different approaches to managing time-series observability data — particularly around data retention, rollover, and tiered storage. + +#### Index lifecycle management vs native TTL {#lifecycle-vs-ttl} + +In Elasticsearch, long-term data management is handled through **Index Lifecycle Management (ILM)** and **Data Streams**. These features allow users to define policies that govern when indices are rolled over (e.g. after reaching a certain size or age), when older indices are moved to lower-cost storage (e.g. warm or cold tiers), and when they are ultimately deleted. This is necessary because Elasticsearch does **not support re-sharding**, and shards cannot grow indefinitely without performance degradation. To manage shard sizes and support efficient deletion, new indices must be created periodically and old ones removed — effectively rotating data at the index level. + +ClickHouse takes a different approach. Data is typically stored in a **single table** and managed using **TTL (time-to-live) expressions** at the column or partition level. Data can be **partitioned by date**, allowing efficient deletion without the need to create new tables or perform index rollovers. As data ages and meets the TTL condition, ClickHouse will automatically remove it — no additional infrastructure is required to manage rotation. + +#### Storage tiers and hot-warm architectures {#storage-tiers} + +Elasticsearch supports **hot-warm-cold-frozen** storage architectures, where data is moved between storage tiers with different performance characteristics. This is typically configured through ILM and tied to node roles in the cluster. + +ClickHouse supports **tiered storage** through native table engines like `MergeTree`, which can automatically move older data between different **volumes** (e.g., SSD to HDD to object storage) based on custom rules. This can mimic Elastic's hot-warm-cold approach — but without the complexity of managing multiple node roles or clusters. + +:::note ClickHouse Cloud +In **ClickHouse Cloud**, this becomes even more seamless: all data is stored on **object storage (e.g. 
S3)**, and compute is decoupled. Data can remain in object storage until queried, at which point it is fetched and cached locally (or in a distributed cache) — offering the same cost profile as Elastic's frozen tier, with better performance characteristics. This approach means no data needs to be moved between storage tiers, making hot-warm architectures redundant. +::: + +### Rollups vs incremental aggregates {#rollups-vs-incremental-aggregates} + +In Elasticsearch, **rollups** or **aggregates** are achieved using a mechanism called [**transforms**](https://www.elastic.co/guide/en/elasticsearch/reference/current/transforms.html). These are used to summarize time-series data at fixed intervals (e.g., hourly or daily) using a **sliding window** model. These are configured as recurring background jobs that aggregate data from one index and write the results to a separate **rollup index**. This helps reduce the cost of long-range queries by avoiding repeated scans of high-cardinality raw data. + +The following diagram sketches abstractly how transforms work (note that we use the blue color for all documents belonging to the same bucket for which we want to pre-calculate aggregate values): + +Elasticsearch transforms + +Continuous transforms use transform [checkpoints](https://www.elastic.co/guide/en/elasticsearch/reference/current/transform-checkpoints.html) based on a configurable check interval time (transform [frequency](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-transform.html) with a default value of 1 minute). In the diagram above, we assume ① a new checkpoint is created after the check interval time has elapsed. Now Elasticsearch checks for changes in the transforms' source index and detects three new `blue` documents (11, 12, and 13) that exist since the previous checkpoint. Therefore the source index is filtered for all existing `blue` documents, and, with a [composite aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html) (to utilize result [pagination](https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html)), the aggregate values are recalculated (and the destination index is updated with a document replacing the document containing the previous aggregation values). Similarly, at ② and ③, new checkpoints are processed by checking for changes and recalculating the aggregate values from all existing documents belonging to the same 'blue' bucket. + +ClickHouse takes a fundamentally different approach. Rather than re-aggregating data periodically, ClickHouse supports **incremental materialized views**, which transform and aggregate data **at insert time**. When new data is written to a source table, a materialized view executes a pre-defined SQL aggregation query on only the new **inserted blocks**, and writes the aggregated results to a target table. + +This model is made possible by ClickHouse's support for [**partial aggregate states**](https://clickhouse.com/docs/en/sql-reference/data-types/aggregatefunction) — intermediate representations of aggregation functions that can be stored and later merged. This allows users to maintain partially aggregated results that are fast to query and cheap to update. Since the aggregation happens as data arrives, there's no need to run expensive recurring jobs or re-summarize older data. 
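+
+As a minimal sketch of this pattern (the `logs` source table and its columns are hypothetical), an incremental materialized view maintaining per-minute counts per host might look as follows:
+
+```sql
+-- Hypothetical source table of raw events
+CREATE TABLE logs
+(
+    timestamp DateTime,
+    host String,
+    message String
+)
+ENGINE = MergeTree
+ORDER BY (host, timestamp);
+
+-- Target table holding partial aggregation states per host and minute
+CREATE TABLE logs_per_minute
+(
+    minute DateTime,
+    host String,
+    events AggregateFunction(count)
+)
+ENGINE = AggregatingMergeTree
+ORDER BY (host, minute);
+
+-- The view aggregates each newly inserted block at insert time
+CREATE MATERIALIZED VIEW logs_per_minute_mv TO logs_per_minute AS
+SELECT
+    toStartOfMinute(timestamp) AS minute,
+    host,
+    countState() AS events
+FROM logs
+GROUP BY minute, host;
+
+-- At query time, partial states are merged into exact totals
+SELECT
+    minute,
+    host,
+    countMerge(events) AS events
+FROM logs_per_minute
+GROUP BY minute, host
+ORDER BY minute;
+```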
+ +We sketch the mechanics of incremental materialized views abstractly (note that we use the blue color for all rows belonging to the same group for which we want to pre-calculate aggregate values): + +ClickHouse Materialized Views + +In the diagram above, the materialized view's source table already contains a data part storing some `blue` rows (1 to 10) belonging to the same group. For this group, there also already exists a data part in the view's target table storing a [partial aggregation state](https://www.youtube.com/watch?v=QDAJTKZT8y4) for the `blue` group. When ① ② ③ inserts into the source table with new rows take place, a corresponding source table data part is created for each insert, and, in parallel, (just) for each block of newly inserted rows, a partial aggregation state is calculated and inserted in the form of a data part into the materialized view's target table. ④ During background part merges, the partial aggregation states are merged, resulting in incremental data aggregation. + +Note that all [aggregate functions](https://clickhouse.com/docs/en/sql-reference/aggregate-functions/reference) (over 90 of them), including their combinations with aggregate function [combinators](https://www.youtube.com/watch?v=7ApwD0cfAFI), support [partial aggregation states](https://clickhouse.com/docs/en/sql-reference/data-types/aggregatefunction). + +For a more concrete example of Elasticsearch vs ClickHouse for incremental aggregates, see this [example](https://github.com/ClickHouse/examples/tree/main/blog-examples/clickhouse-vs-elasticsearch/continuous-data-transformation#continuous-data-transformation-example). + +The advantages of ClickHouse's approach include: + +- **Always-up-to-date aggregates**: materialized views are always in sync with the source table. +- **No background jobs**: aggregations are pushed to insert time rather than query time. +- **Better real-time performance**: ideal for observability workloads and real-time analytics where fresh aggregates are required instantly. +- **Composable**: materialized views can be layered or joined with other views and tables for more complex query acceleration strategies. +- **Different TTLs**: different TTL settings can be applied to the source table and target table of the materialized view. + +This model is particularly powerful for observability use cases where users need to compute metrics such as per-minute error rates, latencies, or top-N breakdowns without scanning billions of raw records per query. + +### Lakehouse support {#lakehouse-support} + +ClickHouse and Elasticsearch take fundamentally different approaches to lakehouse integration. ClickHouse is a fully-fledged query execution engine capable of executing queries over lakehouse formats such as [Iceberg](/sql-reference/table-functions/iceberg) and [Delta Lake](/sql-reference/table-functions/deltalake), as well as integrating with data lake catalogs such as [AWS Glue](/use-cases/data-lake/glue-catalog) and [Unity catalog](/use-cases/data-lake/unity-catalog). These formats rely on efficient querying of [Parquet](/interfaces/formats/Parquet) files, which ClickHouse fully supports. ClickHouse can read both Iceberg and Delta Lake tables directly, enabling seamless integration with modern data lake architectures. + +In contrast, Elasticsearch is tightly coupled to its internal data format and Lucene-based storage engine. It cannot directly query lakehouse formats or Parquet files, limiting its ability to participate in modern data lake architectures. 
Elasticsearch requires data to be transformed and loaded into its proprietary format before it can be queried. + +ClickHouse's lakehouse capabilities extend beyond just reading data: + +- **Data catalog integration**: ClickHouse supports integration with data catalogs like [AWS Glue](/use-cases/data-lake/glue-catalog), enabling automatic discovery and access to tables in object storage. +- **Object storage support**: native support for querying data residing in [S3](/engines/table-engines/integrations/s3), [GCS](/sql-reference/table-functions/gcs), and [Azure Blob Storage](/engines/table-engines/integrations/azureBlobStorage) without requiring data movement. +- **Query federation**: the ability to correlate data across multiple sources, including lakehouse tables, traditional databases, and ClickHouse tables using [external dictionaries](/dictionary) and [table functions](/sql-reference/table-functions). +- **Incremental loading**: support for continuous loading from lakehouse tables into local [MergeTree](/engines/table-engines/mergetree-family/mergetree) tables, using features like [S3Queue](/engines/table-engines/integrations/s3queue) and [ClickPipes](/integrations/clickpipes). +- **Performance optimization**: distributed query execution over lakehouse data using [cluster functions](/sql-reference/table-functions/cluster) for improved performance. + +These capabilities make ClickHouse a natural fit for organizations adopting lakehouse architectures, allowing them to leverage both the flexibility of data lakes and the performance of a columnar database. diff --git a/docs/use-cases/observability/clickstack/migration/elastic/index.md b/docs/use-cases/observability/clickstack/migration/elastic/index.md new file mode 100644 index 00000000000..91b64208a29 --- /dev/null +++ b/docs/use-cases/observability/clickstack/migration/elastic/index.md @@ -0,0 +1,22 @@ +--- +slug: /use-cases/observability/clickstack/migration/elastic +title: 'Migrating to ClickStack from Elastic' +pagination_prev: null +pagination_next: null +description: 'Landing page migrating to the ClickHouse Observability Stack from Elastic' +show_related_blogs: true +keywords: ['Elasticsearch'] +--- + +This guide provides a comprehensive approach to migrating from Elastic Stack to ClickStack. We focus on a parallel operation strategy that minimizes risk while leveraging ClickHouse's strengths in observability workloads. 
+ +| Section | Description | +|---------|-------------| +| [Introduction](/use-cases/observability/clickstack/migration/elastic/intro) | Overview of the migration process and key considerations | +| [Concepts](/use-cases/observability/clickstack/migration/elastic/concepts) | Understanding equivalent concepts between Elastic and ClickStack | +| [Types](/use-cases/observability/clickstack/migration/elastic/types) | Mapping Elasticsearch types to ClickHouse equivalents | +| [Search](/use-cases/observability/clickstack/migration/elastic/search) | Comparing search capabilities and query syntax | +| [Migrating Data](/use-cases/observability/clickstack/migration/elastic/migrating-data) | Strategies for data migration and parallel operation | +| [Migrating Agents](/use-cases/observability/clickstack/migration/elastic/migrating-agents) | Transitioning from Elastic agents to OpenTelemetry | +| [Migrating SDKs](/use-cases/observability/clickstack/migration/elastic/migrating-sdks) | Replacing Elastic APM agents with OpenTelemetry SDKs | + diff --git a/docs/use-cases/observability/clickstack/migration/elastic/intro.md b/docs/use-cases/observability/clickstack/migration/elastic/intro.md new file mode 100644 index 00000000000..66bed39785f --- /dev/null +++ b/docs/use-cases/observability/clickstack/migration/elastic/intro.md @@ -0,0 +1,31 @@ +--- +slug: /use-cases/observability/clickstack/migration/elastic/intro +title: 'Migrating to ClickStack from Elastic' +pagination_prev: null +pagination_next: null +sidebar_label: 'Overview' +sidebar_position: 0 +description: 'Overview for migrating to the ClickHouse Observability Stack from Elastic' +show_related_blogs: true +keywords: ['Elasticsearch'] +--- + +## Migrating to ClickStack from Elastic {#migrating-to-clickstack-from-elastic} + +This guide is intended for users migrating from the Elastic Stack — specifically those using Kibana to monitor logs, traces, and metrics collected via Elastic Agent and stored in Elasticsearch. It outlines equivalent concepts and data types in ClickStack, explains how to translate Kibana Lucene-based queries to HyperDX's syntax, and provides guidance on migrating both data and agents for a smooth transition. + +Before beginning a migration, it's important to understand the tradeoffs between ClickStack and the Elastic Stack. + +You should consider moving to ClickStack if: + +- You are ingesting large volumes of observability data and find Elastic cost-prohibitive due to inefficient compression and poor resource utilization. ClickStack can reduce storage and compute costs significantly — offering at least 10x compression on raw data. +- You experience poor search performance at scale or face ingestion bottlenecks. +- You want to correlate observability signals with business data using SQL, unifying observability and analytics workflows. +- You are committed to OpenTelemetry and want to avoid vendor lock-in. +- You want to take advantage of the separation of storage and compute in ClickHouse Cloud, enabling virtually unlimited scale — paying only for ingestion compute and object storage during idle periods. + +However, ClickStack may not be suitable if: + +- You use observability data primarily for security use cases and need a SIEM-focused product. +- Universal profiling is a critical part of your workflow. +- You require a business intelligence (BI) dashboarding platform. ClickStack intentionally has opinionated visual workflows for SREs and developers and is not designed as a Business Intelligence (BI) tool. 
For equivalent capabilities, we recommend using [Grafana with the ClickHouse plugin](/integrations/grafana) or [Superset](/integrations/superset).
diff --git a/docs/use-cases/observability/clickstack/migration/elastic/migrating-agents.md b/docs/use-cases/observability/clickstack/migration/elastic/migrating-agents.md
new file mode 100644
index 00000000000..2afbc1b6d56
--- /dev/null
+++ b/docs/use-cases/observability/clickstack/migration/elastic/migrating-agents.md
@@ -0,0 +1,412 @@
+---
+slug: /use-cases/observability/clickstack/migration/elastic/migrating-agents
+title: 'Migrating agents from Elastic'
+pagination_prev: null
+pagination_next: null
+sidebar_label: 'Migrating agents'
+sidebar_position: 5
+description: 'Migrating agents from Elastic'
+show_related_blogs: true
+keywords: ['ClickStack']
+---
+
+import Image from '@theme/IdealImage';
+import ingestion_key from '@site/static/images/use-cases/observability/ingestion-keys.png';
+import add_logstash_output from '@site/static/images/use-cases/observability/add-logstash-output.png';
+import agent_output_settings from '@site/static/images/use-cases/observability/agent-output-settings.png';
+import migrating_agents from '@site/static/images/use-cases/observability/clickstack-migrating-agents.png';
+
+## Migrating agents from Elastic {#migrating-agents-from-elastic}
+
+The Elastic Stack provides a number of Observability data collection agents. Specifically:
+
+- The [Beats family](https://www.elastic.co/beats) - such as [Filebeat](https://www.elastic.co/beats/filebeat), [Metricbeat](https://www.elastic.co/beats/metricbeat), and [Packetbeat](https://www.elastic.co/beats/packetbeat) - all based on the `libbeat` library. These Beats support [sending data to Elasticsearch, Kafka, Redis, or Logstash](https://www.elastic.co/docs/reference/beats/filebeat/configuring-output), with the Logstash output using the Lumberjack protocol.
+- The [`Elastic Agent`](https://www.elastic.co/elastic-agent) provides a unified agent capable of collecting logs, metrics, and traces. This agent can be centrally managed via the [Elastic Fleet Server](https://www.elastic.co/docs/reference/fleet/manage-elastic-agents-in-fleet) and supports output to Elasticsearch, Logstash, Kafka, or Redis.
+- Elastic also provides a distribution of the [OpenTelemetry Collector - EDOT](https://www.elastic.co/docs/reference/opentelemetry). While it currently cannot be orchestrated by the Fleet Server, it offers a more flexible and open path for users migrating to ClickStack.
+
+The best migration path depends on the agent(s) currently in use. In the sections that follow, we document migration options for each major agent type. Our goal is to minimize friction and, where possible, allow users to continue using their existing agents during the transition.
+
+## Preferred migration path {#prefered-migration-path}
+
+Where possible we recommend migrating to the [OpenTelemetry (OTel) Collector](https://opentelemetry.io/docs/collector/) for all log, metric, and trace collection, deploying the collector at the [edge in an agent role](/use-cases/observability/clickstack/ingesting-data/otel-collector#collector-roles). This represents the most efficient means of sending data and avoids architectural complexity and data transformation.
+
+:::note Why OpenTelemetry Collector?
+The OpenTelemetry Collector provides a sustainable and vendor-neutral solution for observability data ingestion. We recognize that some organizations operate fleets of thousands—or even tens of thousands—of Elastic agents.
For these users, maintaining compatibility with existing agent infrastructure may be critical. This documentation is designed to support this, while also helping teams gradually transition to OpenTelemetry-based collection.
+:::
+
+## ClickHouse OpenTelemetry endpoint {#clickhouse-otel-endpoint}
+
+All data is ingested into ClickStack via an **OpenTelemetry (OTel) collector** instance, which acts as the primary entry point for logs, metrics, traces, and session data. We recommend using the official [ClickStack distribution](/use-cases/observability/clickstack/ingesting-data/opentelemetry#installing-otel-collector) of the collector for this instance, if not [already bundled in your ClickStack deployment model](/use-cases/observability/clickstack/deployment).
+
+Users send data to this collector from [language SDKs](/use-cases/observability/clickstack/sdks) or through data collection agents collecting infrastructure metrics and logs (such as OTel collectors in an [agent](/use-cases/observability/clickstack/ingesting-data/otel-collector#collector-roles) role, or other technologies such as [Fluentd](https://www.fluentd.org/) or [Vector](https://vector.dev/)).
+
+**We assume this collector is available for all agent migration steps**.
+
+## Migrating from Beats {#migrating-to-beats}
+
+Users with extensive Beat deployments may wish to retain these when migrating to ClickStack.
+
+**Currently this option has only been tested with Filebeat, and is therefore appropriate for logs only.**
+
+Beats agents use the [Elastic Common Schema (ECS)](https://www.elastic.co/docs/reference/ecs), which is currently [in the process of being merged into the OpenTelemetry specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/0199-support-elastic-common-schema-in-opentelemetry.md) used by ClickStack. However, these [schemas still differ significantly](https://www.elastic.co/docs/reference/ecs/ecs-otel-alignment-overview), and users are currently responsible for transforming ECS-formatted events into OpenTelemetry format before ingestion into ClickStack.
+
+We recommend performing this transformation using [Vector](https://vector.dev), a lightweight and high-performance observability data pipeline that supports a powerful transformation language called Vector Remap Language (VRL).
+
+If your Filebeat agents are configured to send data to Kafka - a supported Beats output - Vector can consume those events from Kafka, apply schema transformations using VRL, and then forward them via OTLP to the OpenTelemetry Collector distributed with ClickStack.
+
+Alternatively, Vector also supports receiving events over the Lumberjack protocol used by Logstash. This enables Beats agents to send data directly to Vector, where the same transformation process can be applied before forwarding to the ClickStack OpenTelemetry Collector via OTLP.
+
+We illustrate both of these architectures below.
+
+Migrating agents
+
+In the following example, we provide the initial steps to configure Vector to receive log events from Filebeat via the Lumberjack protocol. We provide VRL for mapping the inbound ECS events to the OTel specification, before sending these to the ClickStack OpenTelemetry collector via OTLP. Users consuming events from Kafka can replace the Vector Logstash source with the [Kafka source](https://vector.dev/docs/reference/configuration/sources/kafka/) - all other steps remain the same.
+ + + +### Install vector {#install-vector} + +Install Vector using the [official installation guide](https://vector.dev/docs/setup/installation/). + +This can be installed on the same instance as your Elastic Stack OTel collector. + +Users can follow best practices with regards to architecture and security when [moving Vector to production](https://vector.dev/docs/setup/going-to-prod/). + +### Configure vector {#configure-vector} + +Vector should be configured to receive events over the Lumberjack protocol, imitating a Logstash instance. This can be achieved by configuring a [`logstash` source](https://vector.dev/docs/reference/configuration/sources/logstash/) for Vector: + +```yaml +sources: + beats: + type: logstash + address: 0.0.0.0:5044 + tls: + enabled: false # Set to true if you're using TLS + # The files below are generated from the steps at https://www.elastic.co/docs/reference/fleet/secure-logstash-connections#generate-logstash-certs + # crt_file: logstash.crt + # key_file: logstash.key + # ca_file: ca.crt + # verify_certificate: true +``` + +:::note TLS configuration +If Mutual TLS is required, generate certificates and keys using the Elastic guide ["Configure SSL/TLS for the Logstash output"](https://www.elastic.co/docs/reference/fleet/secure-logstash-connections#use-ls-output). These can then be specified in the configuration as shown above. +::: + + +Events will be received in ECS format. These can be converted to the OpenTelemetry schema using a Vector Remap Language (VRL) transformer. Configuration of this transformer is simple - with the script file held in a separate file: + +```yaml +transforms: + remap_filebeat: + inputs: ["beats"] + type: "remap" + file: 'beat_to_otel.vrl' +``` + +Note it receives events from the above `beats` source. Our remap script is shown below. This script has been tested with log events only but can form the basis for other formats. + +
+VRL - ECS to OTel + +```javascript +# Define keys to ignore at root level +ignored_keys = ["@metadata"] + +# Define resource key prefixes +resource_keys = ["host", "cloud", "agent", "service"] + +# Create separate objects for resource and log record fields +resource_obj = {} +log_record_obj = {} + +# Copy all non-ignored root keys to appropriate objects +root_keys = keys(.) +for_each(root_keys) -> |_index, key| { + if !includes(ignored_keys, key) { + val, err = get(., [key]) + if err == null { + # Check if this is a resource field + is_resource = false + if includes(resource_keys, key) { + is_resource = true + } + + # Add to appropriate object + if is_resource { + resource_obj = set(resource_obj, [key], val) ?? resource_obj + } else { + log_record_obj = set(log_record_obj, [key], val) ?? log_record_obj + } + } + } +} + +# Flatten both objects separately +flattened_resources = flatten(resource_obj, separator: ".") +flattened_logs = flatten(log_record_obj, separator: ".") + +# Process resource attributes +resource_attributes = [] +resource_keys_list = keys(flattened_resources) +for_each(resource_keys_list) -> |_index, field_key| { + field_value, err = get(flattened_resources, [field_key]) + if err == null && field_value != null { + attribute, err = { + "key": field_key, + "value": { + "stringValue": to_string(field_value) + } + } + if (err == null) { + resource_attributes = push(resource_attributes, attribute) + } + } +} + +# Process log record attributes +log_attributes = [] +log_keys_list = keys(flattened_logs) +for_each(log_keys_list) -> |_index, field_key| { + field_value, err = get(flattened_logs, [field_key]) + if err == null && field_value != null { + attribute, err = { + "key": field_key, + "value": { + "stringValue": to_string(field_value) + } + } + if (err == null) { + log_attributes = push(log_attributes, attribute) + } + } +} + +# Get timestamp for timeUnixNano (convert to nanoseconds) +timestamp_nano = if exists(.@timestamp) { + to_unix_timestamp!(parse_timestamp!(.@timestamp, format: "%Y-%m-%dT%H:%M:%S%.3fZ"), unit: "nanoseconds") +} else { + to_unix_timestamp(now(), unit: "nanoseconds") +} + +# Get message/body field +body_value = if exists(.message) { + to_string!(.message) +} else if exists(.body) { + to_string!(.body) +} else { + "" +} + +# Create the OpenTelemetry structure +. = { + "resourceLogs": [ + { + "resource": { + "attributes": resource_attributes + }, + "scopeLogs": [ + { + "scope": {}, + "logRecords": [ + { + "timeUnixNano": to_string(timestamp_nano), + "severityNumber": 9, + "severityText": "info", + "body": { + "stringValue": body_value + }, + "attributes": log_attributes + } + ] + } + ] + } + ] +} +``` + +
+ +Finally, transformed events can be sent to ClickStack via OpenTelemetry collector over OTLP. This requires the configuration of a OTLP sink in Vector, which takes events from the `remap_filebeat` transform as input: + +```yaml +sinks: + otlp: + type: opentelemetry + inputs: [remap_filebeat] # receives events from a remap transform - see below + protocol: + type: http # Use "grpc" for port 4317 + uri: http://localhost:4318/v1/logs # logs endpoint for the OTel collector + method: post + encoding: + codec: json + framing: + method: newline_delimited + headers: + content-type: application/json + authorization: ${YOUR_INGESTION_API_KEY} +``` + +The `YOUR_INGESTION_API_KEY` here is produced by ClickStack. You can find the key in the HyperDX app under `Team Settings → API Keys`. + +Ingestion keys + +Our final complete configuration is shown below: + +```yaml +sources: + beats: + type: logstash + address: 0.0.0.0:5044 + tls: + enabled: false # Set to true if you're using TLS + #crt_file: /data/elasticsearch-9.0.1/logstash/logstash.crt + #key_file: /data/elasticsearch-9.0.1/logstash/logstash.key + #ca_file: /data/elasticsearch-9.0.1/ca/ca.crt + #verify_certificate: true + + +transforms: + remap_filebeat: + inputs: ["beats"] + type: "remap" + file: 'beat_to_otel.vrl' + +sinks: + otlp: + type: opentelemetry + inputs: [remap_filebeat] + protocol: + type: http # Use "grpc" for port 4317 + uri: http://localhost:4318/v1/logs + method: post + encoding: + codec: json + framing: + method: newline_delimited + headers: + content-type: application/json +``` + +### Configure Filebeat {#configure-filebeat} + +Existing Filebeat installations simply need to be modified to send their events to Vector. This requires the configuration of a Logstash output - again, TLS can be optionally configured: + +```yaml +# ------------------------------ Logstash Output ------------------------------- +output.logstash: + # The Logstash hosts + hosts: ["localhost:5044"] + + # Optional SSL. By default is off. + # List of root certificates for HTTPS server verifications + #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"] + + # Certificate for SSL client authentication + #ssl.certificate: "/etc/pki/client/cert.pem" + + # Client Certificate Key + #ssl.key: "/etc/pki/client/cert.key" +``` + +
+ + +## Migrating from Elastic Agent {#migrating-from-elastic-agent} + +The Elastic Agent consolidates the different Elastic Beats into a single package. This agent integrates with [Elastic Fleet](https://www.elastic.co/docs/reference/fleet/fleet-server), allowing it to be centrally orchestrated and configured. + +Users with Elastic Agents deployed have several migration paths: + +- Configure the agent to send to a Vector endpoint over the Lumberjack protocol. **This has currently been tested for users collecting log data with the Elastic Agent only.** This can be centrally configured via the Fleet UI in Kibana. +- [Run the agent as Elastic OpenTelemetry Collector (EDOT)](https://www.elastic.co/docs/reference/fleet/otel-agent). The Elastic Agent includes an embedded EDOT Collector that allows you to instrument your applications and infrastructure once and send data to multiple vendors and backends. In this configuration, users can simply configure the EDOT collector to forward events to the ClickStack OTel collector over OTLP. **This approach supports all event types.** + +We demonstrate both of these options below. + +### Sending data via Vector {#sending-data-via-vector} + + + +#### Install and configure Vector {#install-configure-vector} + +Install and configure Vector using the [same steps](#install-vector) as those documented for migrating from Filebeat. + +#### Configure Elastic Agent {#configure-elastic-agent} + +Elastic Agent needs to be configured to send data via the Logstash protocol Lumberjack. This is a [supported deployment pattern](https://www.elastic.co/docs/manage-data/ingest/ingest-reference-architectures/ls-networkbridge) and can either be configured centrally or [via the agent configuration file `elastic-agent.yaml`](https://www.elastic.co/docs/reference/fleet/logstash-output) if deploying without Fleet. + +Central configuration through Kibana can be achieved by adding [an Output to Fleet](https://www.elastic.co/docs/reference/fleet/fleet-settings#output-settings). + +Add Logstash output + +This output can then be used in an [agent policy](https://www.elastic.co/docs/reference/fleet/agent-policy). This will automatically mean any agents using the policy will send their data to Vector. + +Agent settings + +Since this requires secure communication over TLS to be configured, we recommend the guide ["Configure SSL/TLS for the Logstash output"](https://www.elastic.co/docs/reference/fleet/secure-logstash-connections#use-ls-output), which can be followed with the user assuming their Vector instance assumes the role of Logstash. + +Note that this requires users to configure the Logstash source in Vector to also mutual TLS. Use the keys and certificates [generated in the guide](https://www.elastic.co/docs/reference/fleet/secure-logstash-connections#generate-logstash-certs) to configure the input appropriately. + +```yaml +sources: + beats: + type: logstash + address: 0.0.0.0:5044 + tls: + enabled: true # Set to true if you're using TLS. + # The files below are generated from the steps at https://www.elastic.co/docs/reference/fleet/secure-logstash-connections#generate-logstash-certs + crt_file: logstash.crt + key_file: logstash.key + ca_file: ca.crt + verify_certificate: true +``` + + + +### Run Elastic Agent as OpenTelemetry collector {#run-agent-as-otel} + +The Elastic Agent includes an embedded EDOT Collector that allows you to instrument your applications and infrastructure once and send data to multiple vendors and backends. 
+ +:::note Agent integrations and orchestration +Users running the EDOT collector distributed with Elastic Agent will not be able to exploit the [existing integrations offered by the agent](https://www.elastic.co/docs/reference/fleet/manage-integrations). Additionally, the collector cannot be centrally managed by Fleet - forcing the user to run the [agent in standalone mode](https://www.elastic.co/docs/reference/fleet/configure-standalone-elastic-agents), managing configuration themselves. +::: + +To run the Elastic Agent with the EDOT collector, see the [official Elastic guide](https://www.elastic.co/docs/reference/fleet/otel-agent-transform). Rather than configuring the Elastic endpoint, as indicated in the guide, remove existing `exporters` and configure the OTLP output - sending data to the ClickStack OpenTelemetry collector. For example, the configuration for the exporters becomes: + + +```yaml +exporters: + # Exporter to send logs and metrics to Elasticsearch Managed OTLP Input + otlp: + endpoint: localhost:4317 + headers: + authorization: ${YOUR_INGESTION_API_KEY} + tls: + insecure: true +``` + +The `YOUR_INGESTION_API_KEY` here is produced by ClickStack. You can find the key in the HyperDX app under `Team Settings → API Keys`. + +Ingestion keys + +If Vector has been configured to use mutual TLS, with the certificate and keys generated using the steps from the guide ["Configure SSL/TLS for the Logstash output"](https://www.elastic.co/docs/reference/fleet/secure-logstash-connections#use-ls-output), the `otlp` exporter will need to be configured accordingly e.g. + +```yaml +exporters: + # Exporter to send logs and metrics to Elasticsearch Managed OTLP Input + otlp: + endpoint: localhost:4317 + headers: + authorization: ${YOUR_INGESTION_API_KEY} + tls: + insecure: false + ca_file: /path/to/ca.crt + cert_file: /path/to/client.crt + key_file: /path/to/client.key +``` + +## Migrating from the Elastic OpenTelemetry collector {#migrating-from-elastic-otel-collector} + +Users already running the [Elastic OpenTelemetry Collector (EDOT)](https://www.elastic.co/docs/reference/opentelemetry) can simply reconfigure their agents to send to ClickStack OpenTelemetry collector via OTLP. The steps involved are identical to those outlined above for running the [Elastic Agent as an OpenTelemetry collector](#run-agent-as-otel). This approach can be used for all data types. diff --git a/docs/use-cases/observability/clickstack/migration/elastic/migrating-data.md b/docs/use-cases/observability/clickstack/migration/elastic/migrating-data.md new file mode 100644 index 00000000000..213b3a37f54 --- /dev/null +++ b/docs/use-cases/observability/clickstack/migration/elastic/migrating-data.md @@ -0,0 +1,652 @@ +--- +slug: /use-cases/observability/clickstack/migration/elastic/migrating-data +title: 'Migrating data to ClickStack from Elastic' +pagination_prev: null +pagination_next: null +sidebar_label: 'Migrating data' +sidebar_position: 4 +description: 'Migrating data to ClickHouse Observability Stack from Elastic' +show_related_blogs: true +keywords: ['ClickStack'] +--- + +## Parallel operation strategy {#parallel-operation-strategy} + +When migrating from Elastic to ClickStack for observability use cases, we recommend a **parallel operation** approach rather than attempting to migrate historical data. This strategy offers several advantages: + +1. 
**Minimal risk**: by running both systems concurrently, you maintain access to existing data and dashboards while validating ClickStack and familiarizing your users with the new system. +2. **Natural data expiration**: most observability data has a limited retention period (typically 30 days or less), allowing for a natural transition as data expires from Elastic. +3. **Simplified migration**: no need for complex data transfer tools or processes to move historical data between systems. +
+:::note Migrating data +We demonstrate an approach for migrating essential data from Elasticsearch to ClickHouse in the section ["Migrating data"](#migrating-data). This should not be used for larger datasets as it is rarely performant - limited by the ability for Elasticsearch to export efficiently, with only JSON format supported. +::: + +### Implementation steps {#implementation-steps} + +1. **Configure Dual Ingestion** +
+Set up your data collection pipeline to send data to both Elastic and ClickStack simultaneously. + +How this is achieved depends on your current agents for collection - see ["Migrating Agents"](/use-cases/observability/clickstack/migration/elastic/migrating-agents). + +2. **Adjust Retention Periods** +
+Configure Elastic's TTL settings to match your desired retention period. Set up the ClickStack [TTL](/use-cases/observability/clickstack/production#configure-ttl) to maintain data for the same duration. + +3. **Validate and Compare**: +
+- Run queries against both systems to ensure data consistency +- Compare query performance and results +- Migrate dashboards and alerts to ClickStack. This is currently a manual process. +- Verify that all critical dashboards and alerts work as expected in ClickStack + +4. **Gradual Transition**: +
+- As data naturally expires from Elastic, users will increasingly rely on ClickStack +- Once confidence in ClickStack is established, you can begin redirecting queries and dashboards + +### Long-term retention {#long-term-retention} + +For organizations requiring longer retention periods: + +- Continue running both systems in parallel until all data has expired from Elastic +- ClickStack [tiered storage](/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-multiple-volumes) capabilities can help manage long-term data efficiently. +- Consider using [materialized views](/materialized-view/incremental-materialized-view) to maintain aggregated or filtered historical data while allowing raw data to expire. + +### Migration timeline {#migration-timeline} + +The migration timeline will depend on your data retention requirements: + +- **30-day retention**: Migration can be completed within a month. +- **Longer retention**: Continue parallel operation until data expires from Elastic. +- **Historical data**: If absolutely necessary, consider using [Migrating data](#migrating-data) to import specific historical data. + +## Migrating settings {#migration-settings} + +When migrating from Elastic to ClickStack, your indexing and storage settings will need to be adapted to fit ClickHouse's architecture. While Elasticsearch relies on horizontal scaling and sharding for performance and fault tolerance and thus has multiple shards by default, ClickHouse is optimized for vertical scaling and typically performs best with fewer shards. + +### Recommended settings {#recommended-settings} + +We recommend starting with a **single shard** and scaling vertically. This configuration is suitable for most observability workloads and simplifies both management and query performance tuning. + +- **[ClickHouse Cloud](https://clickhouse.com/cloud)**: Uses a single-shard, multi-replica architecture by default. Storage and compute scale independently, making it ideal for observability use cases with unpredictable ingest patterns and read-heavy workloads. +- **ClickHouse OSS**: In self-managed deployments, we recommend: + - Starting with a single shard + - Scaling vertically with additional CPU and RAM + - Using [tiered storage](/observability/managing-data#storage-tiers) to extend local disk with S3-compatible object storage + - Using [`ReplicatedMergeTree`](/engines/table-engines/mergetree-family/replication) if high availability is required + - For fault tolerance, [1 replica of your shard](/engines/table-engines/mergetree-family/replication) is typically sufficient in Observability workloads. + +### When to shard {#when-to-shard} + +Sharding may be necessary if: + +- Your ingest rate exceeds the capacity of a single node (typically >500K rows/sec) +- You need tenant isolation or regional data separation +- Your total dataset is too large for a single server, even with object storage + +If you do need to shard, refer to [Horizontal scaling](/architecture/horizontal-scaling) for guidance on shard keys and distributed table setup. + +### Retention and TTL {#retention-and-ttl} + +ClickHouse uses [TTL clauses](/use-cases/observability/clickstack/production#configure-ttl) on MergeTree tables to manage data expiration. 
TTL policies can:
+
+- Automatically delete expired data
+- Move older data to cold object storage
+- Retain only recent, frequently queried logs on fast disk
+
+We recommend aligning your ClickHouse TTL configuration with your existing Elastic retention policies to maintain a consistent data lifecycle during the migration. For examples, see [ClickStack production TTL setup](/use-cases/observability/clickstack/production#configure-ttl).
+
+## Migrating data {#migrating-data}
+
+While we recommend parallel operation for most observability data, there are specific cases where direct data migration from Elasticsearch to ClickHouse may be necessary:
+
+- Small lookup tables used for data enrichment (e.g., user mappings, service catalogs)
+- Business data stored in Elasticsearch that needs to be correlated with observability data. ClickHouse's SQL capabilities and Business Intelligence integrations make this data easier to maintain and query than Elasticsearch's more limited query options.
+- Configuration data that needs to be preserved across the migration
+
+This approach is only viable for datasets under 10 million rows, as Elasticsearch's export capabilities are limited to JSON over HTTP and don't scale well for larger datasets.
+
+The following steps allow the migration of a single Elasticsearch index to ClickHouse.
+
+
+
+### Migrate schema {#migrate-scheme}
+
+Create a table in ClickHouse for the index being migrated from Elasticsearch. Users can map [Elasticsearch types to their ClickHouse equivalents](/use-cases/observability/clickstack/migration/elastic/types). Alternatively, users can simply rely on the JSON data type in ClickHouse, which will dynamically create columns of the appropriate type as data is inserted.
+
+Consider the following Elasticsearch mapping for an index containing `syslog` data:
+
+Elasticsearch mapping + +```javascripton +GET .ds-logs-system.syslog-default-2025.06.03-000001/_mapping +{ + ".ds-logs-system.syslog-default-2025.06.03-000001": { + "mappings": { + "_meta": { + "managed_by": "fleet", + "managed": true, + "package": { + "name": "system" + } + }, + "_data_stream_timestamp": { + "enabled": true + }, + "dynamic_templates": [], + "date_detection": false, + "properties": { + "@timestamp": { + "type": "date", + "ignore_malformed": false + }, + "agent": { + "properties": { + "ephemeral_id": { + "type": "keyword", + "ignore_above": 1024 + }, + "id": { + "type": "keyword", + "ignore_above": 1024 + }, + "name": { + "type": "keyword", + "fields": { + "text": { + "type": "match_only_text" + } + } + }, + "type": { + "type": "keyword", + "ignore_above": 1024 + }, + "version": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "cloud": { + "properties": { + "account": { + "properties": { + "id": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "availability_zone": { + "type": "keyword", + "ignore_above": 1024 + }, + "image": { + "properties": { + "id": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "instance": { + "properties": { + "id": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "machine": { + "properties": { + "type": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "provider": { + "type": "keyword", + "ignore_above": 1024 + }, + "region": { + "type": "keyword", + "ignore_above": 1024 + }, + "service": { + "properties": { + "name": { + "type": "keyword", + "fields": { + "text": { + "type": "match_only_text" + } + } + } + } + } + } + }, + "data_stream": { + "properties": { + "dataset": { + "type": "constant_keyword", + "value": "system.syslog" + }, + "namespace": { + "type": "constant_keyword", + "value": "default" + }, + "type": { + "type": "constant_keyword", + "value": "logs" + } + } + }, + "ecs": { + "properties": { + "version": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "elastic_agent": { + "properties": { + "id": { + "type": "keyword", + "ignore_above": 1024 + }, + "snapshot": { + "type": "boolean" + }, + "version": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "event": { + "properties": { + "agent_id_status": { + "type": "keyword", + "ignore_above": 1024 + }, + "dataset": { + "type": "constant_keyword", + "value": "system.syslog" + }, + "ingested": { + "type": "date", + "format": "strict_date_time_no_millis||strict_date_optional_time||epoch_millis", + "ignore_malformed": false + }, + "module": { + "type": "constant_keyword", + "value": "system" + }, + "timezone": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "host": { + "properties": { + "architecture": { + "type": "keyword", + "ignore_above": 1024 + }, + "containerized": { + "type": "boolean" + }, + "hostname": { + "type": "keyword", + "ignore_above": 1024 + }, + "id": { + "type": "keyword", + "ignore_above": 1024 + }, + "ip": { + "type": "ip" + }, + "mac": { + "type": "keyword", + "ignore_above": 1024 + }, + "name": { + "type": "keyword", + "ignore_above": 1024 + }, + "os": { + "properties": { + "build": { + "type": "keyword", + "ignore_above": 1024 + }, + "codename": { + "type": "keyword", + "ignore_above": 1024 + }, + "family": { + "type": "keyword", + "ignore_above": 1024 + }, + "kernel": { + "type": "keyword", + "ignore_above": 1024 + }, + "name": { + "type": "keyword", + "fields": { + "text": { + "type": "match_only_text" + } + } + }, + "platform": { + "type": "keyword", + "ignore_above": 1024 
+ }, + "type": { + "type": "keyword", + "ignore_above": 1024 + }, + "version": { + "type": "keyword", + "ignore_above": 1024 + } + } + } + } + }, + "input": { + "properties": { + "type": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "log": { + "properties": { + "file": { + "properties": { + "path": { + "type": "keyword", + "fields": { + "text": { + "type": "match_only_text" + } + } + } + } + }, + "offset": { + "type": "long" + } + } + }, + "message": { + "type": "match_only_text" + }, + "process": { + "properties": { + "name": { + "type": "keyword", + "fields": { + "text": { + "type": "match_only_text" + } + } + }, + "pid": { + "type": "long" + } + } + }, + "system": { + "properties": { + "syslog": { + "type": "object" + } + } + } + } + } + } +} +``` +
+ + +The equivalent ClickHouse table schema: + +
+ClickHouse schema + +```sql +SET enable_json_type = 1; + +CREATE TABLE logs_system_syslog +( + `@timestamp` DateTime, + `agent` Tuple( + ephemeral_id String, + id String, + name String, + type String, + version String), + `cloud` Tuple( + account Tuple( + id String), + availability_zone String, + image Tuple( + id String), + instance Tuple( + id String), + machine Tuple( + type String), + provider String, + region String, + service Tuple( + name String)), + `data_stream` Tuple( + dataset String, + namespace String, + type String), + `ecs` Tuple( + version String), + `elastic_agent` Tuple( + id String, + snapshot UInt8, + version String), + `event` Tuple( + agent_id_status String, + dataset String, + ingested DateTime, + module String, + timezone String), + `host` Tuple( + architecture String, + containerized UInt8, + hostname String, + id String, + ip Array(Variant(IPv4, IPv6)), + mac Array(String), + name String, + os Tuple( + build String, + codename String, + family String, + kernel String, + name String, + platform String, + type String, + version String)), + `input` Tuple( + type String), + `log` Tuple( + file Tuple( + path String), + offset Int64), + `message` String, + `process` Tuple( + name String, + pid Int64), + `system` Tuple( + syslog JSON) +) +ENGINE = MergeTree +ORDER BY (`host.name`, `@timestamp`) +``` + +
+ +Note that: + + - Tuples are used to represent nested structures instead of dot notation + - Used appropriate ClickHouse types based on the mapping: + - `keyword` → `String` + - `date` → `DateTime` + - `boolean` → `UInt8` + - `long` → `Int64` + - `ip` → `Array(Variant(IPv4, IPv6))`. We use a [`Variant(IPv4, IPv6)`](/sql-reference/data-types/variant) here as the field contains a mixture of [`IPv4`](/sql-reference/data-types/ipv4) and [`IPv6`](/sql-reference/data-types/ipv6). + - `object` → `JSON` for the syslog object whose structure is unpredictable. + - Columns `host.ip` and `host.mac` are explicit `Array` type, unlike in Elasticsearch where all types are arrays. + - An `ORDER BY` clause is added using timestamp and hostname for efficient time-based queries + - `MergeTree`, which is optimal for log data, is used as the engine type + +**This approach of statically defining the schema and using the JSON type selectively where required [is recommended](/integrations/data-formats/json/schema#handling-semi-structured-dynamic-structures).** + +This strict schema has a number of benefits: + +- **Data validation** – enforcing a strict schema avoids the risk of column explosion, outside of specific structures. +- **Avoids risk of column explosion**: although the JSON type scales to potentially thousands of columns, where subcolumns are stored as dedicated columns, this can lead to a column file explosion where an excessive number of column files are created that impacts performance. To mitigate this, the underlying [Dynamic type](/sql-reference/data-types/dynamic) used by JSON offers a [`max_dynamic_paths`](/sql-reference/data-types/newjson#reading-json-paths-as-sub-columns) parameter, which limits the number of unique paths stored as separate column files. Once the threshold is reached, additional paths are stored in a shared column file using a compact encoded format, maintaining performance and storage efficiency while supporting flexible data ingestion. Accessing this shared column file is, however, not as performant. Note, however, that the JSON column can be used with [type hints](/integrations/data-formats/json/schema#using-type-hints-and-skipping-paths). "Hinted" columns will deliver the same performance as dedicated columns. +- **Simpler introspection of paths and types**: although the JSON type supports [introspection functions](/sql-reference/data-types/newjson#introspection-functions) to determine the types and paths that have been inferred, static structures can be simpler to explore e.g. with `DESCRIBE`. +
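+With the statically typed schema above, nested `Tuple` fields are addressed with dot notation. A brief, hedged example of a time-bounded top-hosts query (the one-day window is illustrative):
+
+```sql
+SELECT
+    host.name AS host,
+    count() AS events
+FROM logs_system_syslog
+WHERE `@timestamp` >= now() - INTERVAL 1 DAY
+GROUP BY host
+ORDER BY events DESC
+LIMIT 10;
+```
+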
+Alternatively, users can simply create a table with one `JSON` column.
+
+```sql
+SET enable_json_type = 1;
+
+CREATE TABLE syslog_json
+(
+ `json` JSON(`host.name` String, `@timestamp` DateTime)
+)
+ENGINE = MergeTree
+ORDER BY (`json.host.name`, `json.@timestamp`)
+```
+
+:::note
+We provide type hints for the `host.name` and `@timestamp` columns in the JSON definition as we use them in the ordering/primary key. This helps ClickHouse know these columns won't be null and ensures it knows which sub-columns to use (there may be multiple for each type, so this is ambiguous otherwise).
+:::
+
+This latter approach, while simpler, is best for prototyping and data engineering tasks. For production, use `JSON` only for dynamic sub-structures where necessary.
+
+For more details on using the JSON type in schemas, and how to efficiently apply it, we recommend the guide ["Designing your schema"](/integrations/data-formats/json/schema).
+
+### Install `elasticdump` {#install-elasticdump}
+
+We recommend [`elasticdump`](https://github.com/elasticsearch-dump/elasticsearch-dump) for exporting data from Elasticsearch. This tool requires `node` and should be installed on a machine with network proximity to both Elasticsearch and ClickHouse. We recommend a dedicated server with at least 4 cores and 16GB of RAM for most exports.
+
+```shell
+npm install elasticdump -g
+```
+
+`elasticdump` offers several advantages for data migration:
+
+- It interacts directly with the Elasticsearch REST API, ensuring proper data export.
+- It maintains data consistency during the export process using the Point-in-Time (PIT) API - this creates a consistent snapshot of the data at a specific moment.
+- It exports data directly to JSON format, which can be streamed to the ClickHouse client for insertion.
+
+Where possible, we recommend running ClickHouse, Elasticsearch, and `elasticdump` in the same availability zone or data center to minimize network egress and maximize throughput.
+
+### Install ClickHouse client {#install-clickhouse-client}
+
+Ensure ClickHouse is [installed on the server](/install) on which `elasticdump` is located. **Do not start a ClickHouse server** - these steps only require the client.
+
+### Stream data {#stream-data}
+
+To stream data between Elasticsearch and ClickHouse, use the `elasticdump` command - piping the output directly to the ClickHouse client. The following inserts the data into our well-structured table `logs_system_syslog`.
+
+```shell
+# export url and credentials
+export ELASTICSEARCH_INDEX=.ds-logs-system.syslog-default-2025.06.03-000001
+export ELASTICSEARCH_URL=
+export ELASTICDUMP_INPUT_USERNAME=
+export ELASTICDUMP_INPUT_PASSWORD=
+export CLICKHOUSE_HOST=
+export CLICKHOUSE_PASSWORD=
+export CLICKHOUSE_USER=default
+
+# command to run - modify as required
+elasticdump --input=${ELASTICSEARCH_URL} --type=data --input-index ${ELASTICSEARCH_INDEX} --output=$ --sourceOnly --searchAfter --pit=true |
+clickhouse-client --host ${CLICKHOUSE_HOST} --secure --password ${CLICKHOUSE_PASSWORD} --user ${CLICKHOUSE_USER} --max_insert_block_size=1000 \
+--min_insert_block_size_bytes=0 --min_insert_block_size_rows=1000 --query="INSERT INTO test.logs_system_syslog FORMAT JSONEachRow"
+```
+
+Note the use of the following flags for `elasticdump`:
+
+- `type=data` - limits the response to only the document content in Elasticsearch.
+- `input-index` - our Elasticsearch input index.
+- `output=$` - redirects all results to stdout.
+- `sourceOnly` flag ensuring we omit metadata fields in our response.
+- `searchAfter` flag to use the [`searchAfter` API](https://www.elastic.co/docs/reference/elasticsearch/rest-apis/paginate-search-results#search-after) for efficient pagination of results. +- `pit=true` to ensure consistent results between queries using the [point in time API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-open-point-in-time). +
+Our ClickHouse client parameters here (aside from credentials):
+
+- `max_insert_block_size=1000` - ClickHouse client will send data once this number of rows is reached. Increasing it improves throughput at the expense of the time taken to form a block, thus increasing the time until data appears in ClickHouse.
+- `min_insert_block_size_bytes=0` - Turns off server-side block squashing by bytes.
+- `min_insert_block_size_rows=1000` - Squashes blocks from clients on the server side. In this case, we set it to match `max_insert_block_size` so rows appear immediately. Increase to improve throughput.
+- `query="INSERT INTO test.logs_system_syslog FORMAT JSONEachRow"` - Inserts the data using the [JSONEachRow format](/integrations/data-formats/json/other-formats). This is appropriate when sending to a well-defined schema such as `logs_system_syslog`.
+
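+Once an export has completed, it is worth a quick sanity check that the row count in ClickHouse matches the document count Elasticsearch reports for the index, for example:
+
+```sql
+SELECT count() FROM test.logs_system_syslog;
+```
+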
+**Users can expect throughput in the order of thousands of rows per second.**
+
+:::note Inserting into a single JSON column
+If inserting into a single JSON column (see the `syslog_json` schema above), the same insert command can be used. However, users must specify `JSONAsObject` as the format instead of `JSONEachRow`, e.g.
+
+```shell
+elasticdump --input=${ELASTICSEARCH_URL} --type=data --input-index ${ELASTICSEARCH_INDEX} --output=$ --sourceOnly --searchAfter --pit=true |
+clickhouse-client --host ${CLICKHOUSE_HOST} --secure --password ${CLICKHOUSE_PASSWORD} --user ${CLICKHOUSE_USER} --max_insert_block_size=1000 \
+--min_insert_block_size_bytes=0 --min_insert_block_size_rows=1000 --query="INSERT INTO test.logs_system_syslog FORMAT JSONAsObject"
+```
+
+See ["Reading JSON as an object"](/integrations/data-formats/json/other-formats#reading-json-as-an-object) for further details.
+:::
+
+### Transform data (optional) {#transform-data}
+
+The above commands assume a 1:1 mapping of Elasticsearch fields to ClickHouse columns. Users often need to filter and transform Elasticsearch data before insertion into ClickHouse.
+
+This can be achieved using the [`input`](/sql-reference/table-functions/input) table function, which allows us to execute an arbitrary `SELECT` query over the incoming data stream.
+
+Suppose we wish to store only the `timestamp` and `hostname` fields from our earlier data. The ClickHouse schema:
+
+```sql
+CREATE TABLE logs_system_syslog_v2
+(
+    `timestamp` DateTime,
+    `hostname` String
+)
+ENGINE = MergeTree
+ORDER BY (hostname, timestamp)
+```
+
+To insert from `elasticdump` into this table, we can simply use the `input` table function - using the JSON type to dynamically detect and select the required columns. Note that this `SELECT` query could easily contain a filter - a sketch is shown at the end of this section.
+
+```shell
+elasticdump --input=${ELASTICSEARCH_URL} --type=data --input-index ${ELASTICSEARCH_INDEX} --output=$ --sourceOnly --searchAfter --pit=true |
+clickhouse-client --host ${CLICKHOUSE_HOST} --secure --password ${CLICKHOUSE_PASSWORD} --user ${CLICKHOUSE_USER} --max_insert_block_size=1000 \
+--min_insert_block_size_bytes=0 --min_insert_block_size_rows=1000 --query="INSERT INTO test.logs_system_syslog_v2 SELECT json.\`@timestamp\` as timestamp, json.host.hostname as hostname FROM input('json JSON') FORMAT JSONAsObject"
+```
+
+Note the need to escape the `@timestamp` field name and use the `JSONAsObject` input format.
+
+
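+As an illustration of such a filter, below is the query passed via `--query`, formatted for readability and extended with a `WHERE` clause. This is only a sketch - the predicate on `json.host.hostname` is a hypothetical example and not part of the original command:
+
+```sql
+-- the INSERT passed to clickhouse-client, with an illustrative filter added
+INSERT INTO test.logs_system_syslog_v2
+SELECT
+    json.`@timestamp` AS timestamp,
+    json.host.hostname AS hostname
+FROM input('json JSON')
+WHERE json.host.hostname != ''
+FORMAT JSONAsObject
+```
+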
diff --git a/docs/use-cases/observability/clickstack/migration/elastic/migrating-sdks.md b/docs/use-cases/observability/clickstack/migration/elastic/migrating-sdks.md new file mode 100644 index 00000000000..9b3e8d5c166 --- /dev/null +++ b/docs/use-cases/observability/clickstack/migration/elastic/migrating-sdks.md @@ -0,0 +1,52 @@ +--- +slug: /use-cases/observability/clickstack/migration/elastic/migrating-sdks +title: 'Migrating SDKs from Elastic' +pagination_prev: null +pagination_next: null +sidebar_label: 'Migrating SDKs' +sidebar_position: 6 +description: 'Migrating SDKs from Elastic' +show_related_blogs: true +keywords: ['ClickStack'] +--- + +import Image from '@theme/IdealImage'; +import ingestion_key from '@site/static/images/use-cases/observability/ingestion-keys.png'; + +The Elastic Stack provides two types of language SDKs for instrumenting applications: + +1. **[Elastic Official APM agents](https://www.elastic.co/docs/reference/apm-agents/)** – These are built specifically for use with the Elastic Stack. There is currently no direct migration path for these SDKs. Applications using them will need to be re-instrumented using the corresponding [ClickStack SDKs](/use-cases/observability/clickstack/sdks). + +2. **[Elastic Distributions of OpenTelemetry (EDOT SDKs)](https://www.elastic.co/docs/reference/opentelemetry/edot-sdks/)** – These are Elastic's distributions of the standard OpenTelemetry SDKs, available for .NET, Java, Node.js, PHP, and Python. If your application is already using an EDOT SDK, you do not need to re-instrument your code. Instead, you can simply reconfigure the SDK to export telemetry data to the OTLP Collector included in ClickStack. See ["Migrating EDOT SDKs"](#migrating-edot-sdks) for further details. + +:::note Use ClickStack SDKs where possible +While standard OpenTelemetry SDKs are supported, we strongly recommend using the [**ClickStack-distributed SDKs**](/use-cases/observability/clickstack/sdks) for each language. These distributions include additional instrumentation, enhanced defaults, and custom extensions designed to work seamlessly with the ClickStack pipeline and HyperDX UI. By using the ClickStack SDKs, you can unlock advanced features such as exception stack traces that are not available with vanilla OpenTelemetry or EDOT SDKs. +::: + +## Migrating EDOT SDKs {#migrating-edot-sdks} + +Similar to the ClickStack OpenTelemetry-based SDKs, the Elastic Distributions of the OpenTelemetry SDKs (EDOT SDKs) are customized versions of the official OpenTelemetry SDKs. For example, the [EDOT Python SDK](https://www.elastic.co/docs/reference/opentelemetry/edot-sdks/python/) is a vendor-customized distribution of the [OpenTelemetry Python SDK](https://opentelemetry.io/docs/languages/python/) designed to work seamlessly with Elastic Observability. + +Because these SDKs are based on standard OpenTelemetry libraries, migration to ClickStack is straightforward - no re-instrumentation is required. You only need to adjust the configuration to direct telemetry data to the ClickStack OpenTelemetry Collector. + +Configuration follows the standard OpenTelemetry mechanisms. For Python, this is typically done via environment variables, as described in the [OpenTelemetry Zero-Code Instrumentation docs](https://opentelemetry.io/docs/zero-code/python/configuration/). 
+ +A typical EDOT SDK configuration might look like this: + +```shell +export OTEL_RESOURCE_ATTRIBUTES=service.name= +export OTEL_EXPORTER_OTLP_ENDPOINT=https://my-deployment.ingest.us-west1.gcp.cloud.es.io +export OTEL_EXPORTER_OTLP_HEADERS="Authorization=ApiKey P....l" +``` + +To migrate to ClickStack, update the endpoint to point to the local OTLP Collector and change the authorization header: + +```shell +export OTEL_RESOURCE_ATTRIBUTES=service.name= +export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 +export OTEL_EXPORTER_OTLP_HEADERS="authorization=" +``` + +Your ingestion API key is generated by the HyperDX application and can be found under Team Settings → API Keys. + +Ingestion keys diff --git a/docs/use-cases/observability/clickstack/migration/elastic/search.md b/docs/use-cases/observability/clickstack/migration/elastic/search.md new file mode 100644 index 00000000000..9a399f629c0 --- /dev/null +++ b/docs/use-cases/observability/clickstack/migration/elastic/search.md @@ -0,0 +1,72 @@ +--- +slug: /use-cases/observability/clickstack/migration/elastic/search +title: 'Searching in ClickStack and Elastic' +pagination_prev: null +pagination_next: null +sidebar_label: 'Search' +sidebar_position: 3 +description: 'Searching in ClickStack and Elastic' +--- + +import Image from '@theme/IdealImage'; +import hyperdx_search from '@site/static/images/use-cases/observability/hyperdx-search.png'; +import hyperdx_sql from '@site/static/images/use-cases/observability/hyperdx-sql.png'; + + +## Search in ClickStack and Elastic {#search-in-clickstack-and-elastic} + +ClickHouse is a SQL-native engine, designed from the ground up for high-performance analytical workloads. In contrast, Elasticsearch provides a SQL-like interface, transpiling SQL into the underlying Elasticsearch query DSL — meaning it is not a first-class citizen, and [feature parity](https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-limitations) is limited. + +ClickHouse not only supports full SQL but extends it with a range of observability-focused functions, such as [`argMax`](/sql-reference/aggregate-functions/reference/argmax), [`histogram`](/sql-reference/aggregate-functions/parametric-functions#histogram), and [`quantileTiming`](/sql-reference/aggregate-functions/reference/quantiletiming), that simplify querying structured logs, metrics, and traces. + +For simple log and trace exploration, HyperDX provides a [Lucene-style syntax](/use-cases/observability/clickstack/search) for intuitive, text-based filtering for field-value queries, ranges, wildcards, and more. This is comparable to the [Lucene syntax](https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-query-string-query#query-string-syntax) in Elasticsearch and elements of the [Kibana Query Language](https://www.elastic.co/docs/reference/query-languages/kql). + +Search + +HyperDX's search interface supports this familiar syntax but translates it behind the scenes into efficient SQL `WHERE` clauses, making the experience familiar for Kibana users while still allowing users to leverage the power of SQL when needed. This allows users to exploit the full range of [string search functions](/sql-reference/functions/string-search-functions), [similarity functions](/sql-reference/functions/string-functions#stringjaccardindex) and [date time functions](/sql-reference/functions/date-time-functions) in ClickHouse. + +SQL + +Below, we compare the Lucene query languages of ClickStack and Elasticsearch. 
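+
+To give a rough sense of this translation, a search such as `level:error "disk full"` might become a SQL query similar to the sketch below. This is illustrative only - it assumes the default `otel_logs` columns, with `level` mapped to `SeverityText` and free text matched against `Body`; the SQL HyperDX actually generates depends on your source configuration:
+
+```sql
+-- approximate SQL equivalent of the search: level:error "disk full"
+SELECT Timestamp, ServiceName, Body
+FROM otel_logs
+WHERE SeverityText = 'error'
+  AND Body ILIKE '%disk full%'
+ORDER BY Timestamp DESC
+LIMIT 100
+```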
+
+## ClickStack search syntax vs Elasticsearch query string {#hyperdx-vs-elasticsearch-query-string}
+
+Both HyperDX and Elasticsearch provide flexible query languages to enable intuitive log and trace filtering. While Elasticsearch's query string is tightly integrated with its DSL and indexing engine, HyperDX supports a Lucene-inspired syntax that translates to ClickHouse SQL under the hood. The table below outlines how common search patterns behave across both systems, highlighting similarities in syntax and differences in backend execution.
+
+| **Feature** | **HyperDX Syntax** | **Elasticsearch Syntax** | **Comments** |
+|-------------------------|----------------------------------------|----------------------------------------|--------------|
+| Free text search | `error` | `error` | Matches across all indexed fields; in ClickStack this is rewritten to a multi-field SQL `ILIKE`. |
+| Field match | `level:error` | `level:error` | Identical syntax. HyperDX matches exact field values in ClickHouse. |
+| Phrase search | `"disk full"` | `"disk full"` | Quoted text matches an exact sequence; ClickHouse uses string equality or `ILIKE`. |
+| Field phrase match | `message:"disk full"` | `message:"disk full"` | Translates to SQL `ILIKE` or exact match. |
+| OR conditions | `error OR warning` | `error OR warning` | Logical OR of terms; both systems support this natively. |
+| AND conditions | `error AND db` | `error AND db` | Both translate to intersection; no difference in user syntax. |
+| Negation | `NOT error` or `-error` | `NOT error` or `-error` | Supported identically; HyperDX converts to SQL `NOT ILIKE`. |
+| Grouping | `(error OR fail) AND db` | `(error OR fail) AND db` | Standard Boolean grouping in both. |
+| Wildcards | `error*` or `*fail*` | `error*`, `*fail*` | HyperDX supports leading/trailing wildcards; Elasticsearch disables leading wildcards by default for performance. Wildcards within terms are not supported, e.g., `f*ail`. Wildcards must be applied with a field match. |
+| Ranges (numeric/date) | `duration:[100 TO 200]` | `duration:[100 TO 200]` | HyperDX uses SQL `BETWEEN`; Elasticsearch expands to range queries. An unbounded `*` in ranges is not supported, e.g. `duration:[100 TO *]`. If needed, use unbounded ranges (see the next row). |
+| Unbounded ranges (numeric/date) | `duration:>10` or `duration:>=10` | `duration:>10` or `duration:>=10` | HyperDX uses standard SQL operators. |
+| Inclusive/exclusive | `duration:{100 TO 200}` (exclusive) | Same | Curly brackets denote exclusive bounds. `*` in ranges is not supported, e.g. `duration:[100 TO *]`. |
+| Exists check | N/A | `_exists_:user` or `field:*` | `_exists_` is not supported. Use `LogAttributes.log.file.path: *` for `Map` columns, e.g. `LogAttributes`. Root columns always exist and will have a default value if not included in the event. To search for default values or missing columns, use the same syntax as Elasticsearch: `ServiceName:*` or `ServiceName != ''`. |
+| Regex | `match` function | `name:/joh?n(ath[oa]n)/` | Not currently supported in Lucene syntax. Users can use SQL and the [`match`](/sql-reference/functions/string-search-functions#match) function or other [string search functions](/sql-reference/functions/string-search-functions). |
+| Fuzzy match | `editDistance('quikc', field) = 1` | `quikc~` | Not currently supported in Lucene syntax. Distance functions can be used in SQL, e.g. `editDistance('rror', SeverityText) = 1`, or [other similarity functions](/sql-reference/functions/string-functions#jarosimilarity). |
+| Proximity search | Not supported | `"fox quick"~5` | Not currently supported in Lucene syntax. |
+| Boosting | Not supported | `quick^2 fox` | Not supported in HyperDX at present. |
+| Field wildcard | Not supported | `service.*:error` | Not supported in HyperDX at present. |
+| Escaped special chars | Escape reserved characters with `\` | Same | Escaping required for reserved symbols. |
+
+
+## Exists/missing differences {#empty-value-differences}
+
+Unlike Elasticsearch, where a field can be entirely omitted from an event and therefore truly "not exist," ClickHouse requires all columns in a table schema to exist. If a field is not provided in an insert event:
+
+- For [`Nullable`](/sql-reference/data-types/nullable) fields, it will be set to `NULL`.
+- For non-nullable fields (the default), it will be populated with a default value (often an empty string, 0, or equivalent).
+
+In ClickStack, we use the latter, as [`Nullable`](/sql-reference/data-types/nullable) is [not recommended](/optimize/avoid-nullable-columns).
+
+This behavior means that checking whether a field "exists" in the Elasticsearch sense is not directly supported.
+
+Instead, users can use `field:*` or `field != ''` to check for the presence of a non-empty value. It is thus not possible to distinguish between truly missing and explicitly empty fields.
+
+In practice, this difference rarely causes issues for observability use cases, but it's important to keep in mind when translating queries between systems.
diff --git a/docs/use-cases/observability/clickstack/migration/elastic/types.md b/docs/use-cases/observability/clickstack/migration/elastic/types.md
new file mode 100644
index 00000000000..43fd32f145d
--- /dev/null
+++ b/docs/use-cases/observability/clickstack/migration/elastic/types.md
@@ -0,0 +1,73 @@
+---
+slug: /use-cases/observability/clickstack/migration/elastic/types
+title: 'Mapping types'
+pagination_prev: null
+pagination_next: null
+sidebar_label: 'Types'
+sidebar_position: 2
+description: 'Mapping types in ClickHouse and Elasticsearch'
+show_related_blogs: true
+keywords: ['JSON', 'Codecs']
+---
+
+Elasticsearch and ClickHouse support a wide variety of data types, but their underlying storage and query models are fundamentally different. This section maps commonly used Elasticsearch field types to their ClickHouse equivalents, where available, and provides context to help guide migrations. Where no equivalent exists, alternatives or notes are provided in the comments.
+
+
+| **Elasticsearch Type** | **ClickHouse Equivalent** | **Comments** |
+|-------------------------------|------------------------------|--------------|
+| `boolean` | [`UInt8`](/sql-reference/data-types/int-uint) or [`Bool`](/sql-reference/data-types/boolean) | ClickHouse supports `Boolean` as an alias for `UInt8` in newer versions. |
+| `keyword` | [`String`](/sql-reference/data-types/string) | Used for exact-match filtering, grouping, and sorting. |
+| `text` | [`String`](/sql-reference/data-types/string) | Full-text search is limited in ClickHouse; tokenization requires custom logic using functions such as `tokens` combined with array functions. |
+| `long` | [`Int64`](/sql-reference/data-types/int-uint) | 64-bit signed integer. |
+| `integer` | [`Int32`](/sql-reference/data-types/int-uint) | 32-bit signed integer. |
+| `short` | [`Int16`](/sql-reference/data-types/int-uint) | 16-bit signed integer. |
+| `byte` | [`Int8`](/sql-reference/data-types/int-uint) | 8-bit signed integer. |
+| `unsigned_long` | [`UInt64`](/sql-reference/data-types/int-uint) | Unsigned 64-bit integer. |
+| `double` | [`Float64`](/sql-reference/data-types/float) | 64-bit floating-point. |
+| `float` | [`Float32`](/sql-reference/data-types/float) | 32-bit floating-point. |
+| `half_float` | [`Float32`](/sql-reference/data-types/float) or [`BFloat16`](/sql-reference/data-types/float) | Closest equivalent. ClickHouse does not have a 16-bit float. ClickHouse has a `BFloat16` - this is different from IEEE 754 half-float: half-float offers higher precision with a smaller range, while bfloat16 sacrifices precision for a wider range, making it better suited for machine learning workloads. |
+| `scaled_float` | [`Decimal(x, y)`](/sql-reference/data-types/decimal) | Stores fixed-point numeric values. |
+| `date` | [`DateTime`](/sql-reference/data-types/datetime) | Equivalent date types with second precision. |
+| `date_nanos` | [`DateTime64`](/sql-reference/data-types/datetime64) | ClickHouse supports nanosecond precision with `DateTime64(9)`. |
+| `binary` | [`String`](/sql-reference/data-types/string), [`FixedString(N)`](/sql-reference/data-types/fixedstring) | Needs base64 decoding for binary fields. |
+| `ip` | [`IPv4`](/sql-reference/data-types/ipv4), [`IPv6`](/sql-reference/data-types/ipv6) | Native `IPv4` and `IPv6` types available. |
+| `object` | [`Nested`](/sql-reference/data-types/nested-data-structures/nested), [`Map`](/sql-reference/data-types/map), [`Tuple`](/sql-reference/data-types/tuple), [`JSON`](/sql-reference/data-types/newjson) | ClickHouse can model JSON-like objects using [`Nested`](/sql-reference/data-types/nested-data-structures/nested) or [`JSON`](/sql-reference/data-types/newjson). |
+| `flattened` | [`String`](/sql-reference/data-types/string) | The flattened type in Elasticsearch stores entire JSON objects as single fields, enabling flexible, schemaless access to nested keys without full mapping. In ClickHouse, similar functionality can be achieved using the String type, but requires processing to be done in materialized views. |
+| `nested` | [`Nested`](/sql-reference/data-types/nested-data-structures/nested) | ClickHouse `Nested` columns provide similar semantics for grouped sub-fields, assuming users use `flatten_nested=0`. |
+| `join` | NA | No direct concept of parent-child relationships. Not required in ClickHouse, as joins across tables are supported. |
+| `alias` | [`Alias`](/sql-reference/statements/create/table#alias) column modifier | Aliases [are supported](/sql-reference/statements/create/table#alias) through a field modifier. Functions can be applied to these aliases, e.g. `size String ALIAS formatReadableSize(size_bytes)`. |
+| `range` types (`*_range`) | [`Tuple(start, end)`](/sql-reference/data-types/tuple) or [`Array(T)`](/sql-reference/data-types/array) | ClickHouse has no native range type, but numerical and date ranges can be represented using [`Tuple(start, end)`](/sql-reference/data-types/tuple) or [`Array`](/sql-reference/data-types/array) structures. For IP ranges (`ip_range`), store CIDR values as `String` and evaluate with functions like `isIPAddressInRange()`. Alternatively, consider `ip_trie` based lookup dictionaries for efficient filtering. |
+| `aggregate_metric_double` | [`AggregateFunction(...)`](/sql-reference/data-types/aggregatefunction) and [`SimpleAggregateFunction(...)`](/sql-reference/data-types/simpleaggregatefunction) | Use aggregate function states and materialized views to model pre-aggregated metrics.
All aggregation functions support aggregate states.|
+| `histogram` | [`Tuple(Array(Float64), Array(UInt64))`](/sql-reference/data-types/tuple) | Manually represent buckets and counts using arrays or custom schemas. |
+| `annotated-text` | [`String`](/sql-reference/data-types/string) | No built-in support for entity-aware search or annotations. |
+| `completion`, `search_as_you_type` | NA | No native autocomplete or suggester engine. Can be reproduced with `String` and [search functions](/sql-reference/functions/string-search-functions). |
+| `semantic_text` | NA | No native semantic search - generate embeddings and use vector search. |
+| `token_count` | [`Int32`](/sql-reference/data-types/int-uint) | Compute the token count manually during ingestion, e.g. using `length(tokens())`, typically via a materialized column. |
+| `dense_vector` | [`Array(Float32)`](/sql-reference/data-types/array) | Use arrays for embedding storage. |
+| `sparse_vector` | [`Map(UInt32, Float32)`](/sql-reference/data-types/map) | Simulate sparse vectors with maps. No native sparse vector support. |
+| `rank_feature` / `rank_features` | [`Float32`](/sql-reference/data-types/float), [`Array(Float32)`](/sql-reference/data-types/array) | No native query-time boosting, but can be modeled manually in scoring logic. |
+| `geo_point` | [`Tuple(Float64, Float64)`](/sql-reference/data-types/tuple) or [`Point`](/sql-reference/data-types/geo#point) | Use a tuple of (latitude, longitude). [`Point`](/sql-reference/data-types/geo#point) is available as a ClickHouse type. |
+| `geo_shape`, `shape` | [`Ring`](/sql-reference/data-types/geo#ring), [`LineString`](/sql-reference/data-types/geo#linestring), [`MultiLineString`](/sql-reference/data-types/geo#multilinestring), [`Polygon`](/sql-reference/data-types/geo#polygon), [`MultiPolygon`](/sql-reference/data-types/geo#multipolygon) | Native support for geo shapes and spatial indexing. |
+| `percolator` | NA | No concept of indexing queries. Use standard SQL + Incremental Materialized Views instead. |
+| `version` | [`String`](/sql-reference/data-types/string) | ClickHouse does not have a native version type. Store versions as strings and use custom UDFs to perform semantic comparisons if needed. Consider normalizing to numeric formats if range queries are required. |
+
+### Notes {#notes}
+
+- **Arrays**: In Elasticsearch, all fields support arrays natively. In ClickHouse, arrays must be explicitly defined (e.g., `Array(String)`), with the advantage that specific positions can be accessed and queried, e.g. `an_array[1]`.
+- **Multi-fields**: Elasticsearch allows indexing the [same field multiple ways](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/multi-fields#_multi_fields_with_multiple_analyzers) (e.g., both `text` and `keyword`). In ClickHouse, this pattern must be modeled using separate columns or views.
+- **Map and JSON types**: In ClickHouse, the [`Map`](/sql-reference/data-types/map) type is commonly used to model dynamic key-value structures such as `resourceAttributes` and `logAttributes`. This type enables flexible schema-less ingestion by allowing arbitrary keys to be added at runtime — similar in spirit to JSON objects in Elasticsearch. However, there are important limitations to consider:
+
+  - **Uniform value types**: ClickHouse [`Map`](/sql-reference/data-types/map) columns must have a consistent value type (e.g., `Map(String, String)`). Mixed-type values are not supported without coercion.
+  - **Performance cost**: accessing any key in a [`Map`](/sql-reference/data-types/map) requires loading the entire map into memory, which can be suboptimal for performance.
+  - **No subcolumns**: unlike JSON, keys in a [`Map`](/sql-reference/data-types/map) are not represented as true subcolumns, which limits ClickHouse's ability to index, compress, and query efficiently.
+
+  Because of these limitations, ClickStack is migrating away from [`Map`](/sql-reference/data-types/map) in favor of ClickHouse's enhanced [`JSON`](/sql-reference/data-types/newjson) type. The [`JSON`](/sql-reference/data-types/newjson) type addresses many of the shortcomings of `Map`:
+
+  - **True columnar storage**: each JSON path is stored as a subcolumn, allowing efficient compression, filtering, and vectorized query execution.
+  - **Mixed-type support**: different data types (e.g., integers, strings, arrays) can coexist under the same path without coercion or type unification.
+  - **File system scalability**: internal limits on dynamic keys (`max_dynamic_paths`) and types (`max_dynamic_types`) prevent an explosion of column files on disk, even with high cardinality key sets.
+  - **Dense storage**: nulls and missing values are stored sparsely to avoid unnecessary overhead.
+
+  The [`JSON`](/sql-reference/data-types/newjson) type is especially well-suited for observability workloads, offering the flexibility of schemaless ingestion with the performance and scalability of native ClickHouse types — making it an ideal replacement for [`Map`](/sql-reference/data-types/map) in dynamic attribute fields.
+
+  For further details on the JSON type, we recommend the [JSON guide](https://clickhouse.com/docs/integrations/data-formats/json/overview) and ["How we built a new powerful JSON data type for ClickHouse"](https://clickhouse.com/blog/a-new-powerful-json-data-type-for-clickhouse).
diff --git a/docs/use-cases/observability/clickstack/migration/index.md b/docs/use-cases/observability/clickstack/migration/index.md
new file mode 100644
index 00000000000..eea17c4238b
--- /dev/null
+++ b/docs/use-cases/observability/clickstack/migration/index.md
@@ -0,0 +1,14 @@
+---
+slug: /use-cases/observability/clickstack/migration
+title: 'Migrating to ClickStack from other Observability solutions'
+pagination_prev: null
+pagination_next: null
+sidebar_label: 'Migration guides'
+description: 'Migrating to ClickStack from other Observability solutions'
+---
+
+This section provides comprehensive guides for migrating from various observability solutions to ClickStack. Each guide includes detailed instructions for transitioning your data, agents, and workflows while maintaining operational continuity.
+
+| Technology | Description |
+|------------|-------------|
+| [Elastic Stack](/use-cases/observability/clickstack/migration/elastic) | Complete guide for migrating from Elastic Stack to ClickStack, covering data migration, agent transition, and search capabilities |
diff --git a/docs/use-cases/observability/clickstack/overview.md b/docs/use-cases/observability/clickstack/overview.md
index 4cc36bb5d2d..504f525e78b 100644
--- a/docs/use-cases/observability/clickstack/overview.md
+++ b/docs/use-cases/observability/clickstack/overview.md
@@ -26,7 +26,7 @@ The stack includes several key features designed for debugging and root cause an
 - Correlate/search logs, metrics, session replays, and traces all in one place
 - Schema agnostic, works on top of your existing ClickHouse schema
 - Blazing-fast searches & visualizations optimized for ClickHouse
-- Intuitive full-text search and property search syntax (ex. `level:err`), SQL optional!
+- Intuitive full-text search and property search syntax (ex. `level:err`), SQL optional.
 - Analyze trends in anomalies with event deltas
 - Set up alerts in just a few clicks
 - Dashboard high cardinality events without a complex query language
@@ -95,3 +95,5 @@ ClickStack consists of three core components:
 In addition to these three components, ClickStack uses a **MongoDB instance** to store application state such as dashboards, user accounts, and configuration settings.
 
 A full architectural diagram and deployment details can be found in the [Architecture section](/use-cases/observability/clickstack/architecture).
+
+For users interested in deploying ClickStack to production, we recommend reading the ["Production"](/use-cases/observability/clickstack/production) guide.
diff --git a/docs/use-cases/observability/clickstack/production.md b/docs/use-cases/observability/clickstack/production.md
index d81abc8991e..0b45abd130e 100644
--- a/docs/use-cases/observability/clickstack/production.md
+++ b/docs/use-cases/observability/clickstack/production.md
@@ -15,7 +15,7 @@ import hyperdx_login from '@site/static/images/use-cases/observability/hyperdx-l
 When deploying ClickStack in production, there are several additional considerations to ensure security, stability, and correct configuration.
 
-## Network and Port Security {#network-security}
+## Network and port security {#network-security}
 
 By default, Docker Compose exposes ports on the host, making them accessible from outside the container - even if tools like `ufw` (Uncomplicated Firewall) are enabled. This behavior is due to the Docker networking stack, which can bypass host-level firewall rules unless explicitly configured.
 
@@ -35,7 +35,7 @@ ports:
 
 Refer to the [Docker networking documentation](https://docs.docker.com/network/) for details on isolating containers and hardening access.
 
-## Session Secret Configuration {#session-secret}
+## Session secret configuration {#session-secret}
 
 In production, you must set a strong, random value for the `EXPRESS_SESSION_SECRET` environment variable to protect session data and prevent tampering.
@@ -69,7 +69,7 @@ Here's how to add it to your `docker-compose.yml` file for the app service: You can generate a strong secret using openssl: -```bash +```shell openssl rand -hex 32 ``` @@ -87,7 +87,7 @@ Additionally, we recommend enabling TLS for OTLP endpoints and creating a [dedic For production deployments, we recommend using [ClickHouse Cloud](https://clickhouse.com/cloud), which applies industry-standard [security practices](/cloud/security) by default - including [enhanced encryption](/cloud/security/cmek), [authentication and connectivity](/cloud/security/connectivity), and [managed access controls](/cloud/security/cloud-access-management). See ["ClickHouse Cloud"](#clickhouse-cloud-production) for a step-by-step guide of using ClickHouse Cloud with best practices. -### User Permissions {#user-permissions} +### User permissions {#user-permissions} #### HyperDX user {#hyperdx-user} @@ -123,7 +123,11 @@ ClickHouse OSS provides robust security features out of the box. However, these See also [external authenticators](/operations/external-authenticators) and [query complexity settings](/operations/settings/query-complexity) for managing users and ensuring query/resource limits. -## MongoDB Guidelines {#mongodb-guidelines} +### Configure Time To Live (TTL) {#configure-ttl} + +Ensure the [Time To Live (TTL)](/use-cases/observability/clickstack/ttl) has been [appropriately configured](/use-cases/observability/clickstack/ttl#modifying-ttl) for your ClickStack deployment. This controls how long data is retained for - the default of 3 days often needs to be modified. + +## MongoDB guidelines {#mongodb-guidelines} Follow the official [MongoDB security checklist](https://www.mongodb.com/docs/manual/administration/security-checklist/). diff --git a/docs/use-cases/observability/clickstack/search.md b/docs/use-cases/observability/clickstack/search.md index 8d896367312..4099ac8abd3 100644 --- a/docs/use-cases/observability/clickstack/search.md +++ b/docs/use-cases/observability/clickstack/search.md @@ -15,7 +15,7 @@ ClickStack allows you to do a full-text search on your events (logs and traces). This same search syntax is used for filtering events with Dashboards and Charts as well. -## Natural Language Search Syntax {#natural-language-syntax} +## Natural language search syntax {#natural-language-syntax} - Searches are not case sensitive - Searches match by whole word by default (ex. `Error` will match `Error here` @@ -31,7 +31,7 @@ as well. Search -### Column/Property Search {#column-search} +### Column/property search {#column-search} - You can search columns and JSON/map properties by using `column:value` (ex. `level:Error`, `service:app`) @@ -40,7 +40,7 @@ as well. - You can search for the existence of a property by using `property:*` (ex. `duration:*`) -## Time Input {#time-input} +## Time input {#time-input} - Time input accepts natural language inputs (ex. `1 hour ago`, `yesterday`, `last week`) @@ -50,13 +50,13 @@ as well. easy debugging of time queries. - You can highlight a histogram bar to zoom into a specific time range as well. -## SQL Search Syntax {#sql-syntax} +## SQL search syntax {#sql-syntax} You can optionally toggle search inputs to be in SQL mode. This will accept any valid SQL WHERE clause for searching. This is useful for complex queries that cannot be expressed in Lucene syntax. -## SELECT Statement {#select-statement} +## Select statement {#select-statement} To specify the columns to display in the search results, you can use the `SELECT` input. 
This is a SQL SELECT expression for the columns to select in the search page. diff --git a/docs/use-cases/observability/clickstack/ttl.md b/docs/use-cases/observability/clickstack/ttl.md new file mode 100644 index 00000000000..a11ba3eef2e --- /dev/null +++ b/docs/use-cases/observability/clickstack/ttl.md @@ -0,0 +1,114 @@ +--- +slug: /use-cases/observability/clickstack/ttl +title: 'Managing TTL' +sidebar_label: 'Managing TTL' +pagination_prev: null +pagination_next: null +description: 'Managing TTL with ClickStack' +--- + +import observability_14 from '@site/static/images/use-cases/observability/observability-14.png'; +import Image from '@theme/IdealImage'; + +## TTL in ClickStack {#ttl-clickstack} + +Time-to-Live (TTL) is a crucial feature in ClickStack for efficient data retention and management, especially given vast amounts of data are continuously generated. TTL allows for automatic expiration and deletion of older data, ensuring that the storage is optimally used and performance is maintained without manual intervention. This capability is essential for keeping the database lean, reducing storage costs, and ensuring that queries remain fast and efficient by focusing on the most relevant and recent data. Moreover, it helps in compliance with data retention policies by systematically managing data life cycles, thus enhancing the overall sustainability and scalability of the observability solution. + +**By default, ClickStack retains data for 3 days. To modify this, see ["Modifying TTL"](#modifying-ttl).** + +TTL is controlled at a table level in ClickHouse. For example, the schema for logs is shown below: + +```sql +CREATE TABLE default.otel_logs +( + `Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)), + `TimestampTime` DateTime DEFAULT toDateTime(Timestamp), + `TraceId` String CODEC(ZSTD(1)), + `SpanId` String CODEC(ZSTD(1)), + `TraceFlags` UInt8, + `SeverityText` LowCardinality(String) CODEC(ZSTD(1)), + `SeverityNumber` UInt8, + `ServiceName` LowCardinality(String) CODEC(ZSTD(1)), + `Body` String CODEC(ZSTD(1)), + `ResourceSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)), + `ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)), + `ScopeSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)), + `ScopeName` String CODEC(ZSTD(1)), + `ScopeVersion` LowCardinality(String) CODEC(ZSTD(1)), + `ScopeAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)), + `LogAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)), + INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1, + INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, + INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, + INDEX idx_scope_attr_key mapKeys(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, + INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, + INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, + INDEX idx_log_attr_value mapValues(LogAttributes) TYPE bloom_filter(0.01) GRANULARITY 1, + INDEX idx_body Body TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8 +) +ENGINE = MergeTree +PARTITION BY toDate(TimestampTime) +PRIMARY KEY (ServiceName, TimestampTime) +ORDER BY (ServiceName, TimestampTime, Timestamp) +TTL TimestampTime + toIntervalDay(3) +SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1 +``` + +Partitioning in ClickHouse allows data to be logically separated on disk according to a column or SQL expression. 
By separating data logically, each partition can be operated on independently, e.g. deleted when it expires according to a TTL policy.
+
+As shown in the above example, partitioning is specified on a table when it is initially defined via the `PARTITION BY` clause. This clause can contain an SQL expression on any column(s), the result of which defines which partition a row is sent to. This causes data to be logically associated (via a common folder name prefix) with each partition on the disk, which can then be queried in isolation. For the example above, the default `otel_logs` schema partitions by day using the expression `toDate(TimestampTime)`. As rows are inserted into ClickHouse, this expression is evaluated against each row, and the row is routed to the resulting partition if it exists (if the row is the first for a day, the partition is created). For further details on partitioning and its other applications, see ["Table Partitions"](/partitions).
+
+Partitions
+
+The table schema also includes a `TTL TimestampTime + toIntervalDay(3)` clause and the setting `ttl_only_drop_parts = 1`. The former ensures data is dropped once it is older than 3 days. The setting `ttl_only_drop_parts = 1` enforces only expiring data parts where all of the data has expired (vs. attempting to partially delete rows). With partitioning ensuring data from separate days is never "merged," data can thus be efficiently dropped.
+
+:::important `ttl_only_drop_parts`
+We recommend always using the setting [`ttl_only_drop_parts=1`](/operations/settings/merge-tree-settings#ttl_only_drop_parts). When this setting is enabled, ClickHouse drops a whole part when all rows in it are expired. Dropping whole parts instead of partially cleaning TTL-expired rows (achieved through resource-intensive mutations when `ttl_only_drop_parts=0`) allows having shorter `merge_with_ttl_timeout` times and lower impact on system performance. If data is partitioned by the same unit at which you perform TTL expiration, e.g. day, parts will naturally only contain data from the defined interval. This ensures `ttl_only_drop_parts=1` can be efficiently applied.
+:::
+
+By default, data with an expired TTL is removed when ClickHouse [merges data parts](/engines/table-engines/mergetree-family/mergetree#mergetree-data-storage). When ClickHouse detects that data is expired, it performs an off-schedule merge.
+
+:::note TTL schedule
+TTLs are not applied immediately but rather on a schedule, as noted above. The MergeTree table setting `merge_with_ttl_timeout` sets the minimum delay in seconds before repeating a merge with delete TTL. The default value is 14400 seconds (4 hours). But that is just the minimum delay; it can take longer until a TTL merge is triggered. If the value is too low, it will perform many off-schedule merges that may consume a lot of resources. A TTL expiration can be forced using the command `ALTER TABLE my_table MATERIALIZE TTL`.
+:::
+
+## Modifying TTL {#modifying-ttl}
+
+To modify the TTL, users can either:
+
+1. **Modify the table schema (recommended)**. This requires connecting to the ClickHouse instance, e.g. using the [clickhouse-client](/interfaces/cli) or [Cloud SQL Console](/cloud/get-started/sql-console). For example, we can modify the TTL for the `otel_logs` table using the following DDL:
+
+```sql
+ALTER TABLE default.otel_logs
+MODIFY TTL TimestampTime + toIntervalDay(7);
+```
+
+2. **Modify the OTel collector**. The ClickStack OpenTelemetry collector creates tables in ClickHouse if they do not exist.
This is achieved via the ClickHouse exporter, which itself exposes a `ttl` parameter used for controlling the default TTL expression e.g. + +```yaml +exporters: + clickhouse: + endpoint: tcp://localhost:9000?dial_timeout=10s&compress=lz4&async_insert=1 + ttl: 72h +``` + +### Column level TTL {#column-level-ttl} + +The above examples expire data at a table level. Users can also expire data at a column level. As data ages, this can be used to drop columns whose value in investigations does not justify their resource overhead to retain. For example, we recommend retaining the `Body` column in case new dynamic metadata is added that has not been extracted at insert time, e.g., a new Kubernetes label. After a period e.g. 1 month, it might be obvious that this additional metadata is not useful - thus limiting the value in retaining the `Body` column. + +Below, we show how the `Body` column can be dropped after 30 days. + +```sql +CREATE TABLE otel_logs_v2 +( + `Body` String TTL Timestamp + INTERVAL 30 DAY, + `Timestamp` DateTime, + ... +) +ENGINE = MergeTree +ORDER BY (ServiceName, Timestamp) +``` + +:::note +Specifying a column level TTL requires users to specify their own schema. This cannot be specified in the OTel collector. +::: diff --git a/docs/whats-new/changelog/index.md b/docs/whats-new/changelog/index.md index 964f5055f0c..e221f116fba 100644 --- a/docs/whats-new/changelog/index.md +++ b/docs/whats-new/changelog/index.md @@ -367,7 +367,7 @@ title: '2025 Changelog' * Don't fail silently if a user executing `SYSTEM DROP REPLICA` doesn't have enough permissions. [#75377](https://github.com/ClickHouse/ClickHouse/pull/75377) ([Bharat Nallan](https://github.com/bharatnc)). * Add a ProfileEvent about the number of times any of the system logs have failed to flush. [#75466](https://github.com/ClickHouse/ClickHouse/pull/75466) ([Alexey Milovidov](https://github.com/alexey-milovidov)). * Add a check and extra logging for decrypting and decompressing. [#75471](https://github.com/ClickHouse/ClickHouse/pull/75471) ([Vitaly Baranov](https://github.com/vitlibar)). -* Added support for the micro sign (U+00B5) in the `parseTimeDelta` function. Now both the micro sign (U+00B5) and the Greek letter mu (U+03BC) are recognized as valid representations for microseconds, aligning ClickHouse's behavior with Go’s implementation ([see time.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/time.go#L983C19-L983C20) and [time/format.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/format.go#L1608-L1609)). [#75472](https://github.com/ClickHouse/ClickHouse/pull/75472) ([Vitaly Orlov](https://github.com/orloffv)). +* Added support for the micro sign (U+00B5) in the `parseTimeDelta` function. Now both the micro sign (U+00B5) and the Greek letter mu (U+03BC) are recognized as valid representations for microseconds, aligning ClickHouse's behavior with Go's implementation ([see time.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/time.go#L983C19-L983C20) and [time/format.go](https://github.com/golang/go/blob/ad7b46ee4ac1cee5095d64b01e8cf7fcda8bee5e/src/time/format.go#L1608-L1609)). [#75472](https://github.com/ClickHouse/ClickHouse/pull/75472) ([Vitaly Orlov](https://github.com/orloffv)). 
* Replace server setting (`send_settings_to_client`) with client setting (`apply_settings_from_server`) that controls whether client-side code (e.g., parsing INSERT data and formatting query output) should use settings from server's `users.xml` and user profile. Otherwise, only settings from the client command line, session, and query are used. Note that this only applies to native client (not e.g. HTTP), and doesn't apply to most of query processing (which happens on the server). [#75478](https://github.com/ClickHouse/ClickHouse/pull/75478) ([Michael Kolupaev](https://github.com/al13n321)). * Better error messages for syntax errors. Previously, if the query was too large, and the token whose length exceeds the limit is a very large string literal, the message about the reason was lost in the middle of two examples of this very long token. Fix the issue when a query with UTF-8 was cut incorrectly in the error message. Fix excessive quoting of query fragments. This closes [#75473](https://github.com/ClickHouse/ClickHouse/issues/75473). [#75561](https://github.com/ClickHouse/ClickHouse/pull/75561) ([Alexey Milovidov](https://github.com/alexey-milovidov)). * Add profile events in storage `S3(Azure)Queue`. [#75618](https://github.com/ClickHouse/ClickHouse/pull/75618) ([Kseniia Sumarokova](https://github.com/kssenii)). diff --git a/scripts/aspell-dict-file.txt b/scripts/aspell-dict-file.txt index a735c6e2d70..1488034414f 100644 --- a/scripts/aspell-dict-file.txt +++ b/scripts/aspell-dict-file.txt @@ -1054,4 +1054,6 @@ allowlist --docs/integrations/data-ingestion/azure-data-factory/overview.md-- microsoft --docs/integrations/data-ingestion/azure-data-factory/index.md-- -microsoft \ No newline at end of file +microsoft +--docs/use-cases/observability/clickstack/migration/elastic/migrating-data.md-- +clickstack diff --git a/scripts/aspell-ignore/en/aspell-dict.txt b/scripts/aspell-ignore/en/aspell-dict.txt index 26c5af4a949..a925653caaa 100644 --- a/scripts/aspell-ignore/en/aspell-dict.txt +++ b/scripts/aspell-ignore/en/aspell-dict.txt @@ -319,6 +319,7 @@ Draxlr Dresseler Durre ECMA +EDOT EMQX ETag EachRow @@ -354,6 +355,7 @@ FOSDEM FQDN Failover FarmHash +Filebeat FileCluster FileLog FilesystemCacheBytes @@ -711,6 +713,7 @@ MessageBird MessagePack Metabase Metastore +Metricbeat MetroHash MiB Milli @@ -880,6 +883,7 @@ PROCESSLIST PROXYv PRQL PSUN +Packetbeat PagerDuty ParallelFormattingOutputFormatThreads ParallelFormattingOutputFormatThreadsActive @@ -1058,6 +1062,7 @@ SCIM SDKs SELECTs SERIALIZABLE +SIEM SIGTERM SIMD SLAs @@ -2450,6 +2455,7 @@ mlockall mmap mmapped modularization +modularity moduli moduloOrZero mongoc @@ -2692,6 +2698,7 @@ prebuilt precompiled precompute precomputed +precomputing preconfigured preemptable preferServerCiphers @@ -2902,6 +2909,8 @@ roadmap rocksdb rollout rollup +rollups +Rollups roundAge roundBankers roundDown @@ -3300,6 +3309,7 @@ transactionally translateUTF translocality transpilation +transpiling trie trimBoth trimLeft @@ -3346,6 +3356,7 @@ unbin uncomment undelete undeleting +underutilize undrop undropping unencoded @@ -3426,6 +3437,7 @@ varpopstable varsamp varsampstable vectorized +vectorization vectorscan vendoring verificationDepth @@ -3543,3 +3555,4 @@ TimescaleDB columnstore TiDB resync +resynchronization diff --git a/scripts/vale/check-prose.sh b/scripts/vale/check-prose.sh index 68c1c49cfa1..b11ddcdfef1 100755 --- a/scripts/vale/check-prose.sh +++ b/scripts/vale/check-prose.sh @@ -64,7 +64,7 @@ if $USE_CHANGED_FILES; then MERGE_BASE=$(git merge-base 
$BASE_BRANCH $CURRENT_BRANCH) # Get changed files between merge-base and current branch - CHANGED_FILES=$(git diff --name-only $MERGE_BASE $CURRENT_BRANCH | grep -E '^docs/.*\.(md|mdx)$' | tr '\n' ' ') + CHANGED_FILES=$(git diff --name-only --diff-filter=d $MERGE_BASE $CURRENT_BRANCH | grep -E '^docs/.*\.(md|mdx)$' | tr '\n' ' ') # Also check for uncommitted changes UNCOMMITTED_FILES=$(git diff --name-only HEAD | grep -E '^docs/.*\.(md|mdx)$' | tr '\n' ' ') diff --git a/sidebars.js b/sidebars.js index 1e8cd37aad9..25d1cf0c2e7 100644 --- a/sidebars.js +++ b/sidebars.js @@ -1622,9 +1622,32 @@ const sidebars = { ] }, "use-cases/observability/clickstack/config", + "use-cases/observability/clickstack/ttl", "use-cases/observability/clickstack/search", "use-cases/observability/clickstack/alerts", "use-cases/observability/clickstack/production", + { + type: "category", + label: "Migration guides", + link: { type: "doc", id: "use-cases/observability/clickstack/migration/index" }, + collapsed: true, + collapsible: true, + items: [ + { + type: "category", + label: "Migrating from Elastic", + link: { type: "doc", id: "use-cases/observability/clickstack/migration/elastic/index" }, + collapsed: true, + collapsible: true, + items: [ + { + type: "autogenerated", + dirName: "use-cases/observability/clickstack/migration/elastic", + } + ] + } + ] + } ] }, ], diff --git a/static/images/use-cases/observability/add-logstash-output.png b/static/images/use-cases/observability/add-logstash-output.png new file mode 100644 index 00000000000..39eaf6e533b Binary files /dev/null and b/static/images/use-cases/observability/add-logstash-output.png differ diff --git a/static/images/use-cases/observability/agent-output-settings.png b/static/images/use-cases/observability/agent-output-settings.png new file mode 100644 index 00000000000..e9b23471067 Binary files /dev/null and b/static/images/use-cases/observability/agent-output-settings.png differ diff --git a/static/images/use-cases/observability/ch-mvs.png b/static/images/use-cases/observability/ch-mvs.png new file mode 100644 index 00000000000..3d00066968b Binary files /dev/null and b/static/images/use-cases/observability/ch-mvs.png differ diff --git a/static/images/use-cases/observability/clickhouse-execution.png b/static/images/use-cases/observability/clickhouse-execution.png new file mode 100644 index 00000000000..137488fc020 Binary files /dev/null and b/static/images/use-cases/observability/clickhouse-execution.png differ diff --git a/static/images/use-cases/observability/clickhouse.png b/static/images/use-cases/observability/clickhouse.png new file mode 100644 index 00000000000..65fab7af3e0 Binary files /dev/null and b/static/images/use-cases/observability/clickhouse.png differ diff --git a/static/images/use-cases/observability/clickstack-migrating-agents.png b/static/images/use-cases/observability/clickstack-migrating-agents.png new file mode 100644 index 00000000000..92bc11ffe73 Binary files /dev/null and b/static/images/use-cases/observability/clickstack-migrating-agents.png differ diff --git a/static/images/use-cases/observability/elasticsearch-execution.png b/static/images/use-cases/observability/elasticsearch-execution.png new file mode 100644 index 00000000000..70df9d92883 Binary files /dev/null and b/static/images/use-cases/observability/elasticsearch-execution.png differ diff --git a/static/images/use-cases/observability/elasticsearch.png b/static/images/use-cases/observability/elasticsearch.png new file mode 100644 index 00000000000..4f98fc96e7b Binary 
files /dev/null and b/static/images/use-cases/observability/elasticsearch.png differ diff --git a/static/images/use-cases/observability/es-transforms.png b/static/images/use-cases/observability/es-transforms.png new file mode 100644 index 00000000000..c0998bb0252 Binary files /dev/null and b/static/images/use-cases/observability/es-transforms.png differ diff --git a/static/images/use-cases/observability/hyperdx-search.png b/static/images/use-cases/observability/hyperdx-search.png new file mode 100644 index 00000000000..84570f49d8f Binary files /dev/null and b/static/images/use-cases/observability/hyperdx-search.png differ diff --git a/static/images/use-cases/observability/hyperdx-sql.png b/static/images/use-cases/observability/hyperdx-sql.png new file mode 100644 index 00000000000..2726c66e78c Binary files /dev/null and b/static/images/use-cases/observability/hyperdx-sql.png differ diff --git a/styles/ClickHouse/Headings.yml b/styles/ClickHouse/Headings.yml index bb84fce9cc4..9bee1cb4225 100644 --- a/styles/ClickHouse/Headings.yml +++ b/styles/ClickHouse/Headings.yml @@ -49,3 +49,20 @@ exceptions: - AWS - Frequently Asked Questions - PostgreSQL + - TTL + - ClickStack + - OpenTelemetry + - Filebeat + - Elastic Agent + - HyperDX + - Helm + - EDOT + - SDK + - Docker + - Time To Live + - Docker Compose + - Kafka + - Google Cloud Run + - NPM + - OTel + - SQL \ No newline at end of file