Spark Cluster with Docker π³ This repository contains a containerized setup for deploying an Apache Spark cluster using Docker and Docker Compose. It simplifies the process of running a distributed Spark environment for local development, testing, or deployment in production.
π Features Pre-configured Spark Master and Worker Nodes Horizontal Scaling: Easily scale worker nodes with Docker Compose. Custom Spark Configurations: Update Spark settings via configuration files. Multi-host Deployment: Suitable for deployment on Hetzner servers or similar platforms. Monitoring Tools: Access Spark Web UI to monitor tasks and jobs. π Quick Start Guide Prerequisites Ensure you have the following installed:
Docker Docker Compose Steps to Get Started Clone the repository:
bash Copy code docker-compose build Start the Spark Cluster:
bash Copy code docker-compose up -d Access the Spark Master Web UI:
arduino Copy code http://localhost:8080 Scale the worker nodes as needed:
bash Copy code docker-compose up --scale spark-worker=3 π οΈ Configuration Modify Spark configurations in the config/ folder. Update docker-compose.yml to adjust the number of workers or ports. π Repository Structure plaintext Copy code spark-cluster-docker/ βββ config/ β βββ spark/ # Spark configuration files βββ workspace/ # Workspace for Spark jobs and scripts βββ docker-compose.yml # Docker Compose configuration file βββ Dockerfile.spark-cluster # Dockerfile for Spark cluster setup βββ entrypoint.sh # Entrypoint script for containers βββ README.md # Project documentation π Access Points Service URL Default Port Spark Master UI http://localhost 8080 Spark Worker UI http://localhost 8081+ π¦ Scaling the Cluster To scale worker nodes, use Docker Compose's --scale option:
bash Copy code docker-compose up --scale spark-worker=<number_of_workers> β¨ Customization Custom Docker Images: Edit the Dockerfile.spark-cluster to include custom libraries or dependencies. Custom Configurations: Update Spark-specific configuration files in the config/spark/ directory. π€ Contributing Contributions are welcome! Please follow these steps:
Fork the repository. Create a feature branch: git checkout -b feature-branch-name. Commit changes: git commit -m "Add new feature". Push to the branch: git push origin feature-branch-name. Create a Pull Request. π License This project is licensed under the MIT License. See the LICENSE file for details.
π§ Support For any questions, feel free to reach out to me via email or create an issue in this repository.
This README.md is ready to use. Replace placeholders (like email or license details) with your specifics, and you're all set!