This project fetches articles from the Guardian API and publishes them to an AWS SQS queue. It includes a Lambda function that handles both the fetching and the publishing, along with Terraform scripts that set up the necessary AWS infrastructure.
- Fetch articles from the Guardian API
- Publish articles to an AWS SQS queue
- Terraform scripts to set up AWS infrastructure
- Unit tests for the Lambda function and message broker
- Rate limiting to comply with API usage policies
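
To make the fetching and rate-limiting features above more concrete, here is a minimal sketch of how a request to the Guardian search endpoint could be parameterised and throttled. It is not the project's actual code: `build_query` and `MinIntervalLimiter` are hypothetical names introduced for illustration, and the one-request-per-interval policy is an assumption. The parameter names `q`, `from-date`, and `api-key` are those of the Guardian Content API.

```python
import time
from typing import Callable, Dict, Optional


def build_query(search_term: str, date_from: Optional[str], api_key: str) -> Dict[str, str]:
    """Build query parameters for the Guardian search endpoint (hypothetical helper)."""
    params = {"q": search_term, "api-key": api_key}
    if date_from:  # an empty string or None means "no lower date bound"
        params["from-date"] = date_from
    return params


class MinIntervalLimiter:
    """Allow at most one call per `min_interval` seconds (illustrative policy)."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        """Block until at least `min_interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A caller would invoke `limiter.wait()` before each HTTP request. The clock and sleep functions are injectable so that the limiter can be unit-tested without real delays.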
- Python 3.8 or higher
- AWS account with necessary permissions (IAM, SQS, Lambda, and CloudWatch)
- AWS CLI configured with your credentials
- A Guardian API key and API URL: register at https://open-platform.theguardian.com/access/ and click "Register Developer Key"
1. Open a terminal and run the following commands to clone the repository and move into it:
   git clone https://github.com/elly-uk/streaming-data-project
   cd streaming-data-project
2. Set up Terraform variables:
   - This project requires a terraform.tfvars file to configure sensitive variables such as API credentials.
   - Create a terraform.tfvars file in the terraform directory with the following structure:
     guardian_api_key = "your-api-key"
     guardian_api_url = "https://content.guardianapis.com/search"
   - Important: Alternatively, you can skip this step. Step 5 (Deploy the application) handles these settings automatically.
   - Important: Do not include this file in version control, for security reasons.
3. Set up the virtual environment and install requirements:
   make requirements
4. Run the application:
   make run
5. Deploy the application:
   make deploy
6. Test the Lambda function on AWS:
   - Go to AWS > Lambda > Functions > guardian_api
   - Click "Test" and paste the following event:
     { "search_term": "machine learning", "date_from": "2023-01-01" }
     or
     { "search_term": "machine learning", "date_from": "" }
   - Click the "Test" button
   - Go to AWS > SQS > Queues > guardian_content > "Send and receive messages"
   - Click "Poll for messages"
Relevant Link: View Tutorial Video --> https://drive.google.com/file/d/1ZZgfpFZj_Q4KOUuj1UQUeb-wx3FJwijj/view?usp=sharing
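
The console test described above could also be scripted. The sketch below is an assumption-laden illustration rather than part of the project: `invoke_and_poll` is a hypothetical helper, and it expects boto3-style Lambda and SQS clients to be passed in. The function and queue names (`guardian_api`, `guardian_content`) are the ones named in the console steps.

```python
import json


def invoke_and_poll(lambda_client, sqs_client, event,
                    function_name="guardian_api",
                    queue_name="guardian_content",
                    wait_seconds=5):
    """Invoke the Lambda synchronously, then read one batch of messages from the queue.

    Hypothetical helper: the clients are expected to behave like
    boto3.client("lambda") and boto3.client("sqs").
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps(event).encode("utf-8"),
    )
    queue_url = sqs_client.get_queue_url(QueueName=queue_name)["QueueUrl"]
    received = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,         # SQS returns at most 10 messages per call
        WaitTimeSeconds=wait_seconds,   # long polling
    )
    # "Messages" is absent from the response when the queue returned nothing.
    bodies = [message["Body"] for message in received.get("Messages", [])]
    return response["StatusCode"], bodies
```

With real credentials configured, a call might look like `invoke_and_poll(boto3.client("lambda"), boto3.client("sqs"), {"search_term": "machine learning", "date_from": ""})`. Receiving messages this way does not delete them, so they become visible in the queue again once the visibility timeout expires.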
Run the tests at any point after setting up the environment (that is, after running "make requirements").
- To run tests:
make test
- To run tests with coverage:
make coverage
- To check code style (PEP-8 compliance):
make lint
- To run security checks:
make security
- To destroy the application on AWS:
make destroy
- To clean up the environment and remove generated files:
make clean
- To display available commands:
make help