
Commit 5cbdae2 ("add explicit headers")
1 parent 8a910bb

1 file changed (+13, -13 lines)

1 file changed

+13
-13
lines changed

docs/use-cases/observability/clickstack/migration/elastic/migrating-data.md

Lines changed: 13 additions & 13 deletions
```diff
@@ -1,5 +1,5 @@
 
-## Parallel operation strategy
+## Parallel operation strategy {#parallel-operation-strategy}
 
 When migrating from Elastic to ClickStack for observability use cases, we recommend a **parallel operation** approach rather than attempting to migrate historical data. This strategy offers several advantages:
 
```
```diff
@@ -11,7 +11,7 @@ When migrating from Elastic to ClickStack for observability use cases, we recomm
 We demonstrate an approach for migrating essential data from Elasticsearch to ClickHouse in the section ["Migrating data"](#migrating-data). This should not be used for larger datasets as it is rarely performant - limited by the ability of Elasticsearch to export efficiently, with only JSON format supported.
 :::
 
-### Implementation steps
+### Implementation steps {#implementation-steps}
 
 1. **Configure Dual Ingestion**
 <br/>
```
```diff
@@ -35,27 +35,27 @@ Configure Elastic's TTL settings to match your desired retention period. Set up
 - As data naturally expires from Elastic, users will increasingly rely on ClickStack
 - Once confidence in ClickStack is established, you can begin redirecting queries and dashboards
 
-### Long-term retention
+### Long-term retention {#long-term-retention}
 
 For organizations requiring longer retention periods:
 
 - Continue running both systems in parallel until all data has expired from Elastic
 - ClickStack [tiered storage](/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-multiple-volumes) capabilities can help manage long-term data efficiently.
 - Consider using [materialized views](/materialized-view/incremental-materialized-view) to maintain aggregated or filtered historical data while allowing raw data to expire.
 
```
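To make the materialized-view bullet above concrete, a rough sketch in ClickHouse SQL follows. Everything here is illustrative: the source table `logs_raw` and its columns are hypothetical names, not part of this commit or the underlying docs.

```sql
-- Hypothetical sketch: retain hourly error counts after raw rows expire.
-- `logs_raw`, `ServiceName`, `SeverityText` are illustrative names.
CREATE TABLE error_counts_hourly
(
    hour    DateTime,
    service LowCardinality(String),
    errors  UInt64
)
ENGINE = SummingMergeTree
ORDER BY (service, hour);

-- Populated at insert time; rows here survive after raw data is
-- dropped by the TTL on the source table.
CREATE MATERIALIZED VIEW error_counts_hourly_mv TO error_counts_hourly AS
SELECT
    toStartOfHour(Timestamp) AS hour,
    ServiceName              AS service,
    count()                  AS errors
FROM logs_raw
WHERE SeverityText = 'ERROR'
GROUP BY hour, service;
```

Because the target table carries no TTL of its own, the aggregates persist while the raw `logs_raw` rows expire.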
```diff
-### Migration timeline
+### Migration timeline {#migration-timeline}
 
 The migration timeline will depend on your data retention requirements:
 
 - **30-day retention**: Migration can be completed within a month.
 - **Longer retention**: Continue parallel operation until data expires from Elastic.
 - **Historical data**: If absolutely necessary, consider using [Migrating data](#migrating-data) to import specific historical data.
 
-## Migrating settings
+## Migrating settings {#migration-settings}
 
 When migrating from Elastic to ClickStack, your indexing and storage settings will need to be adapted to fit ClickHouse's architecture. While Elasticsearch relies on horizontal scaling and sharding for performance and fault tolerance, and thus has multiple shards by default, ClickHouse is optimized for vertical scaling and typically performs best with fewer shards.
 
-### Recommended settings
+### Recommended settings {#recommended-settings}
 
 We recommend starting with a **single shard** and scaling vertically. This configuration is suitable for most observability workloads and simplifies both management and query performance tuning.
 
```
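As one hedged illustration of the single-shard recommendation, a log table might be declared as follows. The table name, columns, and replication path are hypothetical; the `ReplicatedMergeTree` engine is only needed once a replica is added for fault tolerance.

```sql
-- Hypothetical single-shard log table; all names and the ZooKeeper/Keeper
-- path are illustrative, not from the docs.
CREATE TABLE logs
(
    Timestamp    DateTime64(9),
    ServiceName  LowCardinality(String),
    SeverityText LowCardinality(String),
    Body         String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/logs', '{replica}')
ORDER BY (ServiceName, Timestamp);
```

With a single shard, the `{shard}` macro resolves to the same value on every node, so the one shard is mirrored across replicas rather than partitioned.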
```diff
@@ -67,7 +67,7 @@ We recommend starting with a **single shard** and scaling vertically. This confi
 - Using [`ReplicatedMergeTree`](/engines/table-engines/mergetree-family/replication) if high availability is required
 - For fault tolerance, [1 replica of your shard](/engines/table-engines/mergetree-family/replication) is typically sufficient in Observability workloads.
 
-### When to shard
+### When to shard {#when-to-shard}
 
 Sharding may be necessary if:
 
```
```diff
@@ -77,7 +77,7 @@
 
 If you do need to shard, refer to [Horizontal scaling](/architecture/horizontal-scaling) for guidance on shard keys and distributed table setup.
 
-### Retention and TTL
+### Retention and TTL {#retention-and-ttl}
 
 ClickHouse uses [TTL clauses](/use-cases/observability/clickstack/production#configure-ttl) on MergeTree tables to manage data expiration. TTL policies can:
 
```
```diff
@@ -87,7 +87,7 @@
 
 We recommend aligning your ClickHouse TTL configuration with your existing Elastic retention policies to maintain a consistent data lifecycle during the migration. For examples, see [ClickStack production TTL setup](/use-cases/observability/clickstack/production#configure-ttl).
 
```
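For instance, matching a 30-day Elastic retention policy on a hypothetical `logs` table might look like the following sketch; the `cold` volume assumes a storage policy that defines one.

```sql
-- Hypothetical sketch: move parts to a `cold` volume after 7 days and
-- expire rows after 30 days. Table name and volume are illustrative,
-- and the `cold` volume requires a matching storage policy.
ALTER TABLE logs
    MODIFY TTL toDateTime(Timestamp) + INTERVAL 7 DAY TO VOLUME 'cold',
               toDateTime(Timestamp) + INTERVAL 30 DAY DELETE;
```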
```diff
-## Migrating data
+## Migrating data {#migrating-data}
 
 While we recommend parallel operation for most observability data, there are specific cases where direct data migration from Elasticsearch to ClickHouse may be necessary:
 
```
```diff
@@ -101,7 +101,7 @@ The following steps allow the migration of a single Elasticsearch index from Cli
 
 <VerticalStepper headerLevel="h3">
 
-### Migrate schema
+### Migrate schema {#migrate-scheme}
 
 Create a table in ClickHouse for the index being migrated from Elasticsearch. Users can map [Elasticsearch types to their ClickHouse](/use-cases/observability/clickstack/migration/elastic/types) equivalent. Alternatively, users can simply rely on the JSON data type in ClickHouse, which will dynamically create columns of the appropriate type as data is inserted.
 
```
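The two options described above might be sketched as follows. The table name `logs_system_syslog` appears later in this guide, but the column lists here are a hypothetical mapping, not the documented schema.

```sql
-- Option 1 (illustrative): explicit columns mapped from the
-- Elasticsearch mapping; the column set is hypothetical.
CREATE TABLE logs_system_syslog
(
    `@timestamp` DateTime64(3),
    `host.name`  String,
    `message`    String
)
ENGINE = MergeTree
ORDER BY (`host.name`, `@timestamp`);

-- Option 2 (illustrative): a single JSON column that creates typed
-- subcolumns dynamically as documents are inserted.
CREATE TABLE syslog_json
(
    doc JSON
)
ENGINE = MergeTree
ORDER BY tuple();
```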
```diff
@@ -556,11 +556,11 @@ npm install elasticdump -g
 
 Where possible, we recommend running ClickHouse, Elasticsearch, and `elasticdump` in the same availability zone or data center to minimize network egress and maximize throughput.
 
-### Install ClickHouse client
+### Install ClickHouse client {#install-clickhouse-client}
 
 Ensure ClickHouse is [installed on the server](/install) on which `elasticdump` is located. **Do not start a ClickHouse server** - these steps only require the client.
 
-### Stream data
+### Stream data {#stream-data}
 
 To stream data between Elasticsearch and ClickHouse, use the `elasticdump` command - piping the output directly to the ClickHouse client. The following inserts the data into our well-structured table `logs_system_syslog`.
 
```
```diff
@@ -610,7 +610,7 @@ clickhouse-client --host ${CLICKHOUSE_HOST} --secure --password ${CLICKHOUSE_PAS
 See ["Reading JSON as an object"](/integrations/data-formats/json/other-formats#reading-json-as-an-object) for further details.
 :::
 
-### Transform data (optional)
+### Transform data (optional) {#transform-data}
 
 The above commands assume a 1:1 mapping of Elasticsearch fields to ClickHouse columns. Users often need to filter and transform Elasticsearch data before insertion into ClickHouse.
 
```
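One way such a transformation could be expressed, sketched under assumptions: ClickHouse's `input()` table function can filter and reshape rows inside the `INSERT` statement that terminates the `elasticdump` pipe. The Elasticsearch field names below are hypothetical.

```sql
-- Hypothetical sketch: run as the --query of clickhouse-client at the
-- end of the elasticdump pipe; field names are illustrative.
INSERT INTO logs_system_syslog
SELECT
    parseDateTime64BestEffort(`@timestamp`) AS Timestamp,
    hostname AS Host,
    message  AS Body
FROM input('`@timestamp` String, hostname String, message String')
WHERE message != ''
FORMAT JSONEachRow
```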
