An AWS Glue Docker Container with PySpark for Local Development and Testing.
Start by building the Docker image:
docker build -t awsglue .
This may take a while, as the Glue libraries are set up during the build. Next, start the container:
docker run --name awsglue -v ~/.aws:/home/app/.aws:ro -t -d awsglue
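The run command mounts your local ~/.aws directory read-only into the container, so it can help to confirm that a credentials file exists there first. The following is a minimal sketch, assuming credentials live in the default ~/.aws/credentials file; the helper name check_aws_credentials is hypothetical and not part of the image.

```shell
# Hypothetical pre-flight helper (not part of the image): reports whether
# an AWS credentials file exists in the directory that will be mounted.
check_aws_credentials() {
  aws_dir="${1:-$HOME/.aws}"   # defaults to the directory mounted into the container
  if [ -f "$aws_dir/credentials" ]; then
    echo "found"
  else
    echo "missing"
    return 1
  fi
}
```

Calling check_aws_credentials before the docker run command prints "found" or "missing" and returns a non-zero status when the file is absent.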
Inside the container, you start in the aws-glue-libs working directory, where Glue is available to you.
Run your scripts using:
spark-submit main.py
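Because the container is started detached (-d), one way to run a script from the host is through docker exec. The sketch below assumes that spark-submit is on the PATH inside the container and that main.py is present in its working directory; the wrapper name glue_submit and the DOCKER dry-run variable are hypothetical conveniences, not part of this project.

```shell
# Name given to the container with --name in the docker run command.
CONTAINER=awsglue

# Hypothetical wrapper: runs spark-submit inside the running container.
# Set DOCKER="echo docker" to print the command instead of executing it.
glue_submit() {
  ${DOCKER:-docker} exec "$CONTAINER" spark-submit "$@"
}
```

With this defined, glue_submit main.py runs the script in the container, and any additional arguments are passed through to spark-submit unchanged.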
Once complete, remember to remove the container:
docker stop awsglue
docker rm awsglue