A Flink-based streaming application for processing and transforming Cyberpartner state updates.
I hated touching this project. Just look at the Dockerfile and ask me why.
This project uses Apache Flink to process real-time data streams from Kafka, transform Cyberpartner state data, and store results in Redis. The application is containerized with Docker and deployable to Kubernetes.
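At a high level the job is a three-stage pipeline: consume Cyberpartner state events from Kafka, transform them, and write the result to Redis keyed by badge ID. Below is a minimal PyFlink sketch of that shape; the topic, broker address, message fields, and transform logic are illustrative assumptions, not the actual job (the real logic lives in src/cyberpartner_transform_state.py).

```python
# Minimal sketch of the Kafka -> transform -> Redis shape. Topic, broker,
# field names, and the transform itself are illustrative assumptions.
# Requires the flink-connector-kafka jar (checked by the verification
# commands below) and the redis package.
import json

import redis
from pyflink.common import WatermarkStrategy
from pyflink.common.serialization import SimpleStringSchema
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors.kafka import KafkaOffsetsInitializer, KafkaSource


def transform(raw: str) -> dict:
    # Placeholder transform: parse the event and derive the state to store.
    event = json.loads(raw)
    return {"badge_id": event["badge_id"], "state": event.get("state", {})}


def write_to_redis(record: dict) -> dict:
    # Per-record connection for brevity; a real job would open one connection
    # in a RichMapFunction's open() and reuse it.
    r = redis.Redis(host="redis", port=6379)
    r.set(record["badge_id"], json.dumps(record["state"]))
    return record


def main():
    env = StreamExecutionEnvironment.get_execution_environment()
    source = (
        KafkaSource.builder()
        .set_bootstrap_servers("kafka:9092")              # assumed broker address
        .set_topics("cyberpartner-state")                 # assumed topic name
        .set_starting_offsets(KafkaOffsetsInitializer.latest())
        .set_value_only_deserializer(SimpleStringSchema())
        .build()
    )
    stream = env.from_source(source, WatermarkStrategy.no_watermarks(), "kafka-source")
    stream.map(transform).map(write_to_redis).print()     # print() as a trivial sink
    env.execute("cyberpartner_transform_state")


if __name__ == "__main__":
    main()
```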
Prerequisites (Windows development environment):
- pyenv-win - Python version management for Windows
- Scoop - Package manager for Windows
- pipx - Installs Python CLI tools (like Poetry) in isolated environments
# Install Python 3.11
pyenv install 3.11
pyenv local 3.11
# Install Poetry
pipx install poetry
poetry --version
# Set up Poetry environment
poetry env info
poetry env use $(pyenv which python3)
# Install dependencies
poetry lock
poetry install --no-interaction --no-ansi --no-root
# Run locally with Doppler for secrets management
doppler run -- poetry run python src\run.py
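doppler run injects the configured secrets into the child process as environment variables, so the application can read its settings straight from the environment. A minimal sketch of that pattern; the variable names here are assumptions, not the project's actual secret names:

```python
# Sketch: read configuration injected by `doppler run` as environment
# variables. The names below are assumptions, not the actual secrets.
import os

KAFKA_BOOTSTRAP_SERVERS = os.environ["KAFKA_BOOTSTRAP_SERVERS"]  # required
REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")           # optional, with default
REDIS_PASSWORD = os.environ.get("REDIS_PASSWORD")                # may be unset in local dev
```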
# Format code
poetry run isort src
poetry run black src
... you need Docker to survive this one
# From your project root
docker build -t flink-mqtt:latest .
docker tag flink-mqtt:latest your-registry/flink-mqtt:latest
docker push your-registry/flink-mqtt:latest
# Spin up the local stack
docker-compose up -d
# Verify the image: Python, PyFlink, and the Kafka connector
docker run --rm flink-mqtt:latest python --version
docker run --rm flink-mqtt:latest bash -c "ls -la /opt/flink/opt/flink-python*"
docker run --rm flink-mqtt:latest bash -c "ls -la /opt/flink/lib/flink-connector-kafka*"
# Start a Flink job
docker-compose exec -it app bash
/opt/flink/bin/flink run -py /opt/app/src/count_message_len.py --jobmanager jobmanager:8081
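count_message_len.py is described in the project structure below as a simple example job. Purely as a guess at what a job that size looks like, here is a minimal PyFlink program that maps messages to their lengths; the real script may read from Kafka rather than a bounded collection:

```python
# Illustrative guess at a minimal "count message length" job; the actual
# count_message_len.py may differ.
from pyflink.common import Types
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
messages = env.from_collection(["hello", "cyberpartner"], type_info=Types.STRING())
messages.map(lambda s: len(s), output_type=Types.INT()).print()
env.execute("count_message_len")
```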
# Perform the startup steps as usual
docker-compose up -d
# Start the Cyberpartner Flink job
docker-compose exec -it app bash
./deploy_jobs.sh
# Start the consumer
docker-compose exec -it python-kafka bash
python consumer.py
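consumer.py is one of the test scripts in kafka_scripts/. For reference, a minimal consumer of the same flavor, sketched with the kafka-python package; the actual script's client library, topic, and options may differ:

```python
# Sketch of a minimal Kafka consumer (assumes the kafka-python package;
# topic and broker address are illustrative, not the script's actual values).
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "cyberpartner-state",                # assumed topic
    bootstrap_servers="kafka:9092",      # assumed broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    print(message.value)
```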
# Start the producer
docker-compose exec -it python-kafka bash
python producer.py --mode new --count 5
python producer.py --mode update --badge-id badge-d73bc7f5 --count 100 --interval .1
python producer.py --mode random --count 20 --interval 1
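The three modes above suggest the producer can create new badges, repeatedly update one badge, or emit random events. A sketch of what the update mode plausibly does, again with kafka-python and assumed topic and message fields:

```python
# Sketch of the producer's `update` mode (assumes kafka-python; the message
# shape, topic, and field names are illustrative, not the script's actual ones).
import json
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",      # assumed broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def send_updates(badge_id: str, count: int, interval: float) -> None:
    # Emit `count` state updates for one badge, `interval` seconds apart.
    for i in range(count):
        producer.send("cyberpartner-state", {"badge_id": badge_id, "seq": i})
        time.sleep(interval)
    producer.flush()

send_updates("badge-d73bc7f5", count=100, interval=0.1)
```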
# Check Redis
docker exec -it redis redis-cli
KEYS *
GET <badge_id>
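Since results are stored under the badge ID, the same check can be scripted with redis-py; the key format is assumed to be the bare badge ID, matching the GET example above:

```python
# Sketch: read one badge's stored state with redis-py (key format assumed
# to be the bare badge id, matching the `GET <badge_id>` example above).
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
print(r.get("badge-d73bc7f5"))
```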
# Build, tag, and push to ECR
docker build -t ckc-flink:latest .
docker tag ckc-flink:latest 059039070213.dkr.ecr.us-east-1.amazonaws.com/ckc-flink:latest
docker push 059039070213.dkr.ecr.us-east-1.amazonaws.com/ckc-flink:latest
# Deploy to Kubernetes
kubectl apply -f manifests/flink.yaml
This project uses CircleCI for continuous integration and deployment:
- Automatically builds Docker images on code changes
- Pushes the images to AWS ECR
- Updates Kubernetes manifests with the new image tags
- src/ - Python source code for Flink jobs
  - cyberpartner_transform_state.py - Main Cyberpartner state transformation logic
  - count_message_len.py - Simple example Flink job
- manifests/ - Kubernetes deployment manifests
- kafka_scripts/ - Test scripts for producing/consuming Kafka messages
- deploy_jobs.sh - Script to deploy Flink jobs
TODO:
- Add a Taskfile for common operations
- Improve the local Docker commands