Saint-Hadi/cdc-mysql-clickhouse

cdc-mysql-clickhouse

In this setup, we build a real-time data pipeline using Docker Compose to orchestrate four services: MariaDB (our source database), Redpanda (a Kafka-compatible event streaming platform), Debezium (for capturing database changes), and ClickHouse (as the analytics engine). After launching the services with docker-compose up -d, we register a Debezium connector via a curl command so that it monitors changes in the ourdb.ourtable_message table and streams them to a Redpanda topic in Avro format. On the ClickHouse side, we define a Kafka engine table that consumes messages from this topic and create a materialized view that transforms the data and stores it in a MergeTree table. Redpanda manages the topic streams, and with the rpk CLI we can list topics and scale them, e.g., using rpk topic add-partitions to increase parallelism for high-throughput processing. This pipeline enables low-latency, scalable CDC-based analytics.
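As a rough illustration of the connector registration step, the sketch below builds the kind of JSON payload that could be sent to the Kafka Connect REST API (POST /connectors) with curl. It is not the configuration used in this repository. Only the table name ourdb.ourtable_message comes from the text above; the connector name, hostnames, ports, credentials, topic prefix, and schema registry URL are assumptions, and the property names follow Debezium 2.x (older releases use database.server.name and database.history.* instead).

```python
import json

# Everything below is an illustrative assumption except the table
# ourdb.ourtable_message, which is named in this README.
CONNECTOR_NAME = "mariadb-ourdb-connector"    # hypothetical connector name
TOPIC_PREFIX = "cdc"                          # hypothetical topic prefix
SCHEMA_REGISTRY_URL = "http://redpanda:8081"  # assumed schema registry address


def build_connector_payload(database="ourdb", table="ourtable_message"):
    """Return a Kafka Connect registration payload for a Debezium MySQL connector."""
    config = {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mariadb",      # assumed Compose service name
        "database.port": "3306",
        "database.user": "debezium",         # assumed credentials
        "database.password": "change-me",
        "database.server.id": "184054",      # any ID unique among replicas
        "topic.prefix": TOPIC_PREFIX,
        "database.include.list": database,
        "table.include.list": f"{database}.{table}",
        "schema.history.internal.kafka.bootstrap.servers": "redpanda:9092",
        "schema.history.internal.kafka.topic": f"schema-history.{database}",
        # Avro serialization for both keys and values
        "key.converter": "io.confluent.connect.avro.AvroConverter",
        "key.converter.schema.registry.url": SCHEMA_REGISTRY_URL,
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": SCHEMA_REGISTRY_URL,
    }
    return {"name": CONNECTOR_NAME, "config": config}


def change_topic(prefix, database, table):
    """Debezium names change topics <topic.prefix>.<database>.<table> by default."""
    return f"{prefix}.{database}.{table}"


payload = build_connector_payload()
print(json.dumps(payload, indent=2))
print(change_topic(TOPIC_PREFIX, "ourdb", "ourtable_message"))
```

The printed JSON can be saved to a file and passed to curl as the request body. The topic name printed on the last line is the one the ClickHouse Kafka engine table would subscribe to under these assumptions.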

Read the full write-up at https://medium.com/@Saint-Hadi/cdc-mysql-to-clickhouse-with-debezium-and-avro-connector-a-real-time-replication-75ea20554967