Analytics Hub

Analytics Hub is a data analytics platform that automatically preprocesses datasets and identifies the best-performing machine learning model for making predictions. It combines a Next.js frontend with a Python backend and is deployed on AWS EC2 in Docker containers, so data scientists and analysts can go from a raw dataset to a trained model through a web interface.

Features

Automated Data Preprocessing: Leverages EvalML's powerful preprocessing capabilities to clean and prepare your data
Intelligent Model Selection: Automatically evaluates multiple machine learning models and recommends the best performer
Interactive Web Interface: Modern, responsive frontend built with Next.js for intuitive data exploration
Scalable Cloud Infrastructure: Deployed on AWS EC2 with Docker containerization for reliable performance
Notebook Integration: Uses Papermill for parameterized notebook execution and reporting

Technology Stack

Frontend

Next.js: React-based framework for server-side rendering and optimal performance
JavaScript: Modern ES6+ for dynamic user interactions

Backend & ML

Python: Core machine learning and data processing logic
EvalML: AutoML library for automated model selection and evaluation
Papermill: Notebook parameterization and execution engine
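
As a rough illustration of how Papermill parameterization works, the sketch below runs a template notebook once per dataset. The notebook name, the parameter names (`dataset_path`, `target`), and the `reports` output directory are assumptions made for this example and are not taken from this repository; `papermill.execute_notebook` is the library's standard entry point.

```python
from pathlib import Path


def build_run(dataset_path, target, out_dir="reports"):
    """Return (output_path, parameters) for one notebook run.

    The parameter names are hypothetical; they must match the variables
    defined in the template notebook's "parameters" cell.
    """
    stem = Path(dataset_path).stem
    output_path = str(Path(out_dir) / f"{stem}_report.ipynb")
    parameters = {"dataset_path": str(dataset_path), "target": target}
    return output_path, parameters


def run_report(template, dataset_path, target, out_dir="reports"):
    """Execute the template notebook with injected parameters."""
    # Imported lazily so build_run works without Papermill installed.
    import papermill as pm

    output_path, parameters = build_run(dataset_path, target, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    pm.execute_notebook(template, output_path, parameters=parameters)
    return output_path
```

A call such as `run_report("analysis_template.ipynb", "data/sales.csv", "revenue")` would write an executed copy of the notebook to `reports/sales_report.ipynb`, leaving the template itself unchanged.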

Infrastructure

AWS EC2: Virtual servers that host the deployed application
Docker: Containerization for consistent environments across development and production

Getting Started

Prerequisites

Node.js (v16 or higher)
Python 3.8+
Docker
AWS CLI (for deployment)

Local Development

Clone the repository

```
git clone <repository-url>
cd analytics-hub
```

Install frontend dependencies
```
npm install
```

Set up Python environment

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Start the development servers

```
# Frontend (Next.js)
npm run dev

# Backend (Python API), in a second terminal
python app.py
```

Access the application
- Frontend: http://localhost:3000
- API: http://localhost:8000

Docker Deployment

Build the Docker image
```
docker build -t analytics-hub .
```

Run the container

```
docker run -p 3000:3000 -p 8000:8000 analytics-hub
```

AWS EC2 Deployment

Launch EC2 instance
- Use Amazon Linux 2 or Ubuntu AMI
- Configure security groups for ports 3000, 8000, and 22

Deploy using Docker

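```
# SSH into your EC2 instance (ec2-user is the default user on Amazon Linux 2; on Ubuntu it is ubuntu)
ssh -i your-key.pem ec2-user@your-instance-ip

# Install Docker (these yum commands apply to Amazon Linux 2; use apt on Ubuntu)
sudo yum update -y
sudo yum install docker -y
sudo service docker start

# Pull and run your containerized application
sudo docker pull your-registry/analytics-hub
sudo docker run -d -p 3000:3000 -p 8000:8000 your-registry/analytics-hub
```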

Usage

Upload Dataset: Use the web interface to upload your CSV or structured data file
Data Preprocessing: The system automatically preprocesses your data using EvalML's built-in capabilities
Model Training: Multiple ML models are trained and evaluated automatically
Results: View model performance metrics and select the best model for your use case
Predictions: Make predictions on new data using the selected model
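
The steps above can be sketched with EvalML directly. This is a minimal sketch and not the project's actual backend code: the function names, the default `problem_type="binary"`, and the use of a holdout split are assumptions for illustration. `AutoMLSearch`, `split_data`, `rankings`, and `best_pipeline` are part of EvalML's public API; in recent EvalML releases `best_pipeline` is returned already trained.

```python
def feature_columns(columns, target):
    """Return every column except the target; fail early if it is missing."""
    if target not in columns:
        raise ValueError(f"target column {target!r} not found")
    return [c for c in columns if c != target]


def run_automl(csv_path, target, problem_type="binary", max_iterations=10):
    """Load a CSV, search for the best pipeline, and predict on a holdout set."""
    # Imported lazily so feature_columns works without the ML stack installed.
    import pandas as pd
    from evalml.automl import AutoMLSearch
    from evalml.preprocessing import split_data

    # Upload Dataset: read the CSV and separate features from the target.
    df = pd.read_csv(csv_path)
    X = df[feature_columns(list(df.columns), target)]
    y = df[target]
    X_train, X_holdout, y_train, y_holdout = split_data(
        X, y, problem_type=problem_type
    )

    # Data Preprocessing and Model Training: AutoMLSearch builds pipelines
    # that include preprocessing components, then trains and scores them.
    automl = AutoMLSearch(
        X_train=X_train,
        y_train=y_train,
        problem_type=problem_type,
        max_iterations=max_iterations,
    )
    automl.search()

    # Results: one row per evaluated pipeline, best first.
    print(automl.rankings)

    # Predictions: use the best pipeline on data it has not seen.
    best = automl.best_pipeline
    return best, best.predict(X_holdout)
```

Calling `run_automl("customers.csv", target="churned")` would correspond to uploading a file through the web interface and accepting the recommended model; the file and column names here are placeholders.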

Configuration

Environment Variables

Create a .env.local file in the root directory:

```
NEXT_PUBLIC_API_URL=http://localhost:8000
AWS_REGION=us-east-1
DOCKER_REGISTRY=your-registry-url
```
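
Next.js loads .env.local on its own and exposes only variables prefixed with NEXT_PUBLIC_ to the browser. If a Python-side script (for example, a deployment helper) needs the same values, one option is to parse the file directly. The sketch below is an assumption about how that could look, not code from this repository; `parse_env` is a hypothetical helper that handles only plain KEY=VALUE lines.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


sample = """\
NEXT_PUBLIC_API_URL=http://localhost:8000
AWS_REGION=us-east-1
DOCKER_REGISTRY=your-registry-url
"""

settings = parse_env(sample)
print(settings["AWS_REGION"])  # us-east-1
```

In practice the text would come from reading the .env.local file; quoting and variable expansion are deliberately left out of this sketch.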

EvalML Configuration

The EvalML pipeline can be customized in config/evalml_config.py:

```
EVALML_CONFIG = {
    "problem_type": "auto",
    "max_iterations": 10,
    "patience": 5,
    "tolerance": 0.01
}
```
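
One way these settings might be passed to EvalML is sketched below. `max_iterations`, `patience`, and `tolerance` are real `AutoMLSearch` parameters; treating `"auto"` as "leave the problem type for the caller to infer" is an assumption about this project's behavior, and `automl_kwargs` is a hypothetical helper name.

```python
EVALML_CONFIG = {
    "problem_type": "auto",
    "max_iterations": 10,
    "patience": 5,
    "tolerance": 0.01,
}


def automl_kwargs(config):
    """Translate the config dict into keyword arguments for AutoMLSearch."""
    # Copy only the search-budget settings that are present in the config.
    kwargs = {
        key: config[key]
        for key in ("max_iterations", "patience", "tolerance")
        if key in config
    }
    # Assumption: "auto" means the caller infers the problem type from the
    # target column, so it is omitted here rather than passed through.
    problem_type = config.get("problem_type", "auto")
    if problem_type != "auto":
        kwargs["problem_type"] = problem_type
    return kwargs


print(automl_kwargs(EVALML_CONFIG))
# {'max_iterations': 10, 'patience': 5, 'tolerance': 0.01}
```

With these values the search stops after 10 pipelines, or earlier if 5 consecutive pipelines fail to improve the best score by more than the tolerance. When the problem type is left as "auto", EvalML's `detect_problem_type` utility can infer it from the target values before the search is constructed.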

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/new-feature)
Commit your changes (git commit -am 'Add new feature')
Push to the branch (git push origin feature/new-feature)
Create a Pull Request

Performance Optimization

Caching: Redis integration for model caching
Load Balancing: AWS Application Load Balancer support
Auto Scaling: EC2 Auto Scaling Group configuration
Database: PostgreSQL for persistent storage

License

This project is licensed under the MIT License - see the LICENSE file for details.
