




# Spark Cluster with Docker 🐳

This repository contains a containerized setup for deploying an Apache Spark cluster using Docker and Docker Compose. It simplifies the process of running a distributed Spark environment for local development, testing, or deployment in production.

## 📚 Features

- **Pre-configured Spark Master and Worker Nodes**
- **Horizontal Scaling:** Easily scale worker nodes with Docker Compose.
- **Custom Spark Configurations:** Update Spark settings via configuration files.
- **Multi-host Deployment:** Suitable for deployment on Hetzner servers or similar platforms.
- **Monitoring Tools:** Access the Spark Web UI to monitor tasks and jobs.

## 🚀 Quick Start Guide

### Prerequisites

Ensure you have the following installed:

- Docker
- Docker Compose

### Steps to Get Started

1. Clone the repository:

   ```bash
   git clone https://github.com/piyush-047/spark-cluster-docker-.git
   cd spark-cluster-docker-
   ```

2. Build the images:

   ```bash
   docker-compose build
   ```

3. Start the Spark cluster:

   ```bash
   docker-compose up -d
   ```

4. Access the Spark Master Web UI:

   http://localhost:8080

5. Scale the worker nodes as needed (a sketch for verifying the cluster follows this list):

   ```bash
   docker-compose up --scale spark-worker=3
   ```
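Once the cluster is up, you can sanity-check it with standard Docker Compose commands. The service names `spark-master` and `spark-worker` are assumptions based on the scale command above; adjust them to match the services defined in `docker-compose.yml`.

```bash
# List the running services and their published ports
docker-compose ps

# Tail the master's logs to confirm workers have registered
# ("Registering worker ..." lines appear as each worker connects)
docker-compose logs -f spark-master
```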

## 🛠️ Configuration

- Modify Spark configurations in the `config/` folder.
- Update `docker-compose.yml` to adjust the number of workers or ports.

## 📂 Repository Structure

```plaintext
spark-cluster-docker/
├── config/
│   ├── spark/                 # Spark configuration files
├── workspace/                 # Workspace for Spark jobs and scripts
├── docker-compose.yml         # Docker Compose configuration file
├── Dockerfile.spark-cluster   # Dockerfile for Spark cluster setup
├── entrypoint.sh              # Entrypoint script for containers
└── README.md                  # Project documentation
```

## 🌐 Access Points

| Service         | URL              | Default Port |
|-----------------|------------------|--------------|
| Spark Master UI | http://localhost | 8080         |
| Spark Worker UI | http://localhost | 8081+        |
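With the cluster reachable on these ports, you can try submitting a job to it. The sketch below runs Spark's bundled SparkPi example from inside the master container; the `spark-master` service name, the `spark://spark-master:7077` master URL (7077 is Spark's default), and the examples-jar path are assumptions that may differ in your image.

```bash
# Run the bundled SparkPi example against the cluster from inside the
# master container. bash -c lets $SPARK_HOME and the jar glob expand
# inside the container rather than on the host.
docker-compose exec spark-master bash -c \
  'spark-submit --master spark://spark-master:7077 \
     --class org.apache.spark.examples.SparkPi \
     "$SPARK_HOME"/examples/jars/spark-examples_*.jar 100'
```

Your own scripts placed in `workspace/` can be submitted the same way, provided that directory is mounted into the containers.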

## 📦 Scaling the Cluster

To scale worker nodes, use Docker Compose's `--scale` option:

```bash
docker-compose up --scale spark-worker=<number_of_workers>
```

## ✨ Customization

- **Custom Docker Images:** Edit the `Dockerfile.spark-cluster` to include custom libraries or dependencies.
- **Custom Configurations:** Update Spark-specific configuration files in the `config/spark/` directory (see the sketch below).
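As one example of a custom configuration, Spark daemons read per-node settings from a `spark-env.sh` file. A minimal sketch, assuming such a file lives under `config/spark/` and is made available to the containers (the values shown are illustrative only):

```bash
# config/spark/spark-env.sh — resource settings for each worker.
# SPARK_WORKER_CORES, SPARK_WORKER_MEMORY, and SPARK_MASTER_WEBUI_PORT
# are standard Spark environment variables.
SPARK_WORKER_CORES=2          # CPU cores each worker offers to executors
SPARK_WORKER_MEMORY=2g        # memory each worker offers to executors
SPARK_MASTER_WEBUI_PORT=8080  # master UI port (matches the table above)
```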

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository.
2. Create a feature branch: `git checkout -b feature-branch-name`.
3. Commit changes: `git commit -m "Add new feature"`.
4. Push to the branch: `git push origin feature-branch-name`.
5. Create a Pull Request.

## 📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

## 📧 Support

For any questions, feel free to reach out to me via email or create an issue in this repository.

