Some simple explorations, providing easy-to-understand, reusable examples to get to grips with Kafka when starting out.
Examples include:
- A getting-started stack to have a simple example running locally
- A TFL stream of live tube data from their public API
- A `consumer/` folder with example consumers, using the confluent_kafka library or Quix's library
Running `docker compose up -d` will start the local Kafka cluster, a data generator, and the Conduktor UI for interacting with Kafka on localhost:8080.
The data generator is a live stream of data about the next tube approaching Stockwell underground station, pulled from the Transport for London API. It can be used as a Kafka producer for the local stack when an internet connection is available.
Steps involved in its creation included:
- Navigating the TFL documentation site to find which endpoints held the data I wanted, how frequently it's updated, and how to authenticate to the API
- Querying the TFL API to grab the station codes, depending on which station I wanted to watch
- Querying another part of the API to get the tube data, in this case incoming Victoria line tubes to Stockwell station
- Improving reliability by building in back-offs and handling empty responses
- Capturing and transforming the API response into a usable JSON message
- Producing the JSON message into Kafka (a condensed sketch of these steps follows the list)
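The sketch below condenses those steps. It is illustrative, not the actual implementation in `tfl/main.py`: the topic name, the Naptan stop-point ID for Stockwell, the broker address, and the response field names are my assumptions based on TfL's public API docs.

```python
import json
import os
import time

import requests
from confluent_kafka import Producer

# Hypothetical values for illustration; the real ones live in tfl/main.py.
STOP_POINT_ID = "940GZZLUSKW"  # assumed Naptan ID for Stockwell
ARRIVALS_URL = f"https://api.tfl.gov.uk/StopPoint/{STOP_POINT_ID}/Arrivals"
TOPIC = "tube-arrivals"

producer = Producer({"bootstrap.servers": "localhost:9092"})


def poll_arrivals(max_retries: int = 5) -> list[dict]:
    """Query the TfL arrivals endpoint, backing off exponentially on failure."""
    for attempt in range(max_retries):
        try:
            resp = requests.get(
                ARRIVALS_URL,
                params={"app_key": os.environ["TFL_API_KEY"]},
                timeout=10,
            )
            resp.raise_for_status()
            # An empty list is a valid response: no trains currently due.
            return resp.json()
        except requests.RequestException:
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
    return []


while True:
    for arrival in poll_arrivals():
        # Keep only the fields we care about before producing JSON to Kafka.
        message = {
            "station": arrival.get("stationName"),
            "line": arrival.get("lineName"),
            "destination": arrival.get("destinationName"),
            "time_to_station": arrival.get("timeToStation"),
        }
        producer.produce(TOPIC, value=json.dumps(message).encode("utf-8"))
    producer.flush()
    time.sleep(30)  # polling interval; the real app's cadence may differ
```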
Screenshot of my app polling the TFL API.
Screenshot from Conduktor's UI of the message in Kafka.
This uses an adjusted version of Conduktor's Docker compose stack for local development, containing a Kafka cluster and tooling.
To run this yourself:
- Install `requirements.txt`
- Get an API key from TFL by signing up for an account on their website and generating one
- Create a `.env` file within the `tfl` directory, setting your API key as `TFL_API_KEY`
- Run `tfl/main.py`
- View the results using Conduktor if you're running the Docker compose stack, or run the consumer script in `consumer/` (a minimal consumer sketch is shown below)
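A minimal consumer sketch using confluent_kafka, assuming the hypothetical `tube-arrivals` topic and default broker address from the producer sketch above; the real script in `consumer/` may differ:

```python
from confluent_kafka import Consumer

# Assumes the local stack's default broker port; group id is arbitrary.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "tube-arrivals-reader",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["tube-arrivals"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue  # no message within the timeout
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        print(msg.value().decode("utf-8"))
finally:
    consumer.close()
```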
The other examples don't include schemas, to keep the boilerplate minimal. Read through this folder for examples using local schemas and remote schemas on the Confluent Schema Registry, covering both Avro and JSON Schema types.
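As a taste of the remote-schema path, here is a sketch that produces an Avro-encoded message through confluent_kafka's Schema Registry integration. The schema, topic name, and registry URL are assumptions for illustration, not values from this repo:

```python
from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import MessageField, SerializationContext

# Hypothetical Avro schema for a tube arrival event.
SCHEMA_STR = """
{
  "type": "record",
  "name": "TubeArrival",
  "fields": [
    {"name": "station", "type": "string"},
    {"name": "line", "type": "string"},
    {"name": "time_to_station", "type": "int"}
  ]
}
"""

# Assumes a schema registry on its conventional default port.
registry = SchemaRegistryClient({"url": "http://localhost:8081"})
serializer = AvroSerializer(registry, SCHEMA_STR)
producer = Producer({"bootstrap.servers": "localhost:9092"})

event = {"station": "Stockwell", "line": "Victoria", "time_to_station": 120}
producer.produce(
    topic="tube-arrivals-avro",
    value=serializer(
        event, SerializationContext("tube-arrivals-avro", MessageField.VALUE)
    ),
)
producer.flush()
```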
Content from Aiven, Quix, and Confluent provided producer and consumer examples to model my own around.