
Food101 End-to-End Image Classifier (PyTorch + SageMaker + Gradio)

A reproducible, end-to-end PyTorch pipeline for Food101 image classification. It supports local development, SageMaker training, flexible dataset preparation, and Weights & Biases experiment tracking, and ships with a Gradio app for deployment. A hosted demo is available on Hugging Face Spaces: Food101 End-to-End Classifier.

Features

  • EfficientNet-B2 transfer learning (feature extraction + fine-tuning)
  • Simple, reproducible training with local or SageMaker workflows
  • Dataset preparation: full or small sample subsets
  • Logging and experiment tracking (Weights & Biases)
  • Model checkpointing and flexible configuration
  • Ready for deployment (Gradio web app)
  • Gradient clipping, a OneCycle LR policy, torch.compile, and mixed-precision training (autocast + GradScaler) for improved stability and GPU memory efficiency (see the sketch after this list)
  • Tests and developer tools such as ruff and black
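
A minimal sketch of how these optimizations fit together in one training step, assuming a CUDA device; model, loader, and optimizer below are illustrative stand-ins, not the actual objects built by src/train.py:

import torch
from torch import nn

# Illustrative stand-ins for the real model/data pipeline.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 101)).cuda()
loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, 101, (8,))) for _ in range(4)]
optimizer = torch.optim.SGD(model.parameters(), lr=5e-4, momentum=0.9)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=5e-4, total_steps=len(loader))
scaler = torch.amp.GradScaler("cuda")
model = torch.compile(model)  # JIT-compile the forward pass

for images, labels in loader:
    images, labels = images.cuda(), labels.cuda()
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.float16):  # mixed precision
        loss = nn.functional.cross_entropy(model(images), labels)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)                                  # so clipping sees fp32 grads
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
    scaler.step(optimizer)
    scaler.update()
    scheduler.step()                                            # OneCycleLR steps per batch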

Pre-trained Weights

Skip training and download the latest checkpoint from the GitHub Releases page.

File        Architecture                      Test Acc.   Dataset     Typical Acc.†
model.pth   EfficientNet-B2 (IMAGENET1K_V1)   80.0 %      Food-101    78–82 %

† Published EfficientNet-B2 runs on Food-101 usually score 78–82 % top-1.

Training hyper-parameters
  • seed = 42  •  batch size = 128  •  img size = 224
  • phase 1 epochs = 8  •  phase 2 epochs = 10
  • lr-head = 4e-3  •  lr-backbone = 5e-4  •  patience = 3
  • workers = 2
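
The lr-head/lr-backbone split maps onto the two training phases. A hypothetical sketch of that recipe (the AdamW choice is an assumption, not confirmed from src/train.py):

import torch
from torchvision import models

# Load ImageNet-pretrained EfficientNet-B2 and swap in a 101-class head.
model = models.efficientnet_b2(weights=models.EfficientNet_B2_Weights.IMAGENET1K_V1)
model.classifier[1] = torch.nn.Linear(model.classifier[1].in_features, 101)

# Phase 1 (8 epochs): feature extraction -- freeze the backbone, train the head at lr-head.
for p in model.features.parameters():
    p.requires_grad = False
opt_phase1 = torch.optim.AdamW(model.classifier.parameters(), lr=4e-3)

# Phase 2 (10 epochs): fine-tuning -- unfreeze the backbone at the lower lr-backbone.
for p in model.features.parameters():
    p.requires_grad = True
opt_phase2 = torch.optim.AdamW([
    {"params": model.features.parameters(), "lr": 5e-4},
    {"params": model.classifier.parameters(), "lr": 4e-3},
])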

Usage: move the file to output/model.pth (or update the path) before running Gradio inference.

Project Structure

├── assets/                    # Images for README
├── config/
│   └── prod.yaml              # Config for SageMaker
├── data/                      # Datasets (generated by scripts)
│   └── full/                  # Entire dataset
│   └── sample/                # Sample dataset for local testing
├── notebooks/
│   └── 00_explore.ipynb       # EDA, prototyping
├── scripts/
│   ├── download_full.py       # Download full Food101 as ImageFolder
│   ├── download_sample.py     # Create small per-class sample for local testing
│   └── remote_train.py        # Launch SageMaker training job
├── src/
│   └── train.py               # Main training script
├── tests/                     # Tests
├── app.py                     # Gradio app interface
├── class_names.txt            # Food101 class names for Gradio
├── .env.example               # Example for API keys/secrets
├── requirements.txt           # Pip dependencies
├── requirements-dev.txt       # Pip dependencies for developer tools
├── .pre-commit-config.yaml    # Pre-commit settings for ruff and black
├── README.md
├── LICENSE
└── .gitignore

Quick Start

1. Clone & Install

git clone https://github.com/codinglabsong/food101-end2end-classifier-sagemaker-gradio.git
cd food101-end2end-classifier-sagemaker-gradio
pip install -r requirements.txt

For CUDA users, see PyTorch's install guide.

2. Prepare the Dataset

  • Quick sample Food101 dataset for local development:
    python scripts/download_sample.py --out data/sample --train-per-class 20 --test-per-class 4
  • Full Food101 dataset (may take time/disk space):
    python scripts/download_full.py --out data/full
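
Both scripts write a standard torchvision ImageFolder layout (one subfolder per class). A quick, illustrative sanity check on the sample split:

from torchvision import datasets, transforms

# Confirms the download produced the expected ImageFolder layout.
ds = datasets.ImageFolder("data/sample/train", transform=transforms.ToTensor())
print(f"{len(ds)} images across {len(ds.classes)} classes")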

3. Set Environment Variables

Copy .env.example to .env and fill in your AWS and wandb keys.
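
To confirm the keys are picked up, a small illustrative check (the variable names here are the standard wandb/AWS ones and may differ from .env.example):

import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads .env from the project root
for key in ("WANDB_API_KEY", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
    print(key, "set" if os.getenv(key) else "MISSING")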

4. Train Locally

python src/train.py \
  --batch-size 32 \
  --num-epochs-phase1 3 \
  --num-epochs-phase2 2 \
  --lr-head 1e-3 \
  --lr-backbone 1e-4 \
  --patience 3 \
  --num-workers 4 \
  --img-size 224

Use --help for all options.

Results

My model (EfficientNet-B2) achieved:

Metric                 Value
Training accuracy      69.0%
Validation accuracy    75.0%
Test accuracy          80.0%

[Charts: Training and Validation Accuracy · Training and Validation Loss]

The charts above show accuracy and loss over 18 epochs (8 epochs for the head, 10 for fine‑tuning). Solid lines represent training metrics, while dotted lines indicate validation metrics.

SageMaker Training

  1. Upload tarred datasets to S3:

    tar czf food101-train.tar.gz -C data/full/train .
    tar czf food101-test.tar.gz -C data/full/test .
    aws s3 cp food101-train.tar.gz s3://<your-bucket>/full/
    aws s3 cp food101-test.tar.gz s3://<your-bucket>/full/
  2. Launch SageMaker training:

    python scripts/remote_train.py
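
scripts/remote_train.py drives the job through the SageMaker Python SDK. A hypothetical sketch of the core call (the role ARN, bucket, framework/Python versions, and instance type are placeholders to adapt to your account, not the repo's actual values):

from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",
    source_dir="src",
    role="arn:aws:iam::<account-id>:role/<your-sagemaker-role>",  # placeholder
    framework_version="2.3",       # placeholder: match your PyTorch version
    py_version="py311",            # placeholder
    instance_type="ml.g5.xlarge",  # placeholder: any GPU training instance
    instance_count=1,
    hyperparameters={"batch-size": 128, "num-epochs-phase1": 8, "num-epochs-phase2": 10},
)
estimator.fit({
    "train": "s3://<your-bucket>/full/food101-train.tar.gz",
    "test": "s3://<your-bucket>/full/food101-test.tar.gz",
})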

Running the Gradio Inference App

This project includes an interactive Gradio app for making predictions with the trained model.

  1. Obtain the Trained Model:
    • Make sure you have the trained model file (model.pth).
    • If you trained the model yourself, it is saved automatically to the output/ directory.
    • If you received a pre-trained model, download it and place it in the output/ directory at the project root.
  2. Run the App Locally:
    python app.py
    The app starts locally and prints a link (e.g., http://127.0.0.1:7860) to open the web UI in your browser.
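
For reference, a minimal sketch of what such an app can look like; the actual app.py may differ (it reads its parameters from config/prod.yaml), and the ImageNet mean/std constants here are assumptions:

import gradio as gr
import torch
from PIL import Image
from torchvision import models, transforms

class_names = [line.strip() for line in open("class_names.txt")]

# Rebuild the architecture and load the checkpoint (assuming model.pth is a state_dict).
model = models.efficientnet_b2()
model.classifier[1] = torch.nn.Linear(model.classifier[1].in_features, len(class_names))
model.load_state_dict(torch.load("output/model.pth", map_location="cpu"))
model.eval()

# Must mirror the training transforms; mean/std are the standard ImageNet values.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def predict(img: Image.Image) -> dict:
    x = preprocess(img).unsqueeze(0)
    with torch.inference_mode():
        probs = torch.softmax(model(x), dim=1)[0]
    return {c: float(p) for c, p in zip(class_names, probs)}

gr.Interface(fn=predict, inputs=gr.Image(type="pil"), outputs=gr.Label(num_top_classes=5)).launch()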

Deploying on Hugging Face Spaces

  1. Create a new Gradio Space on Hugging Face.
  2. Upload the following files from this repo:
    • app.py
    • requirements.txt
    • class_names.txt
    • config/prod.yaml
    • output/model.pth
    • (optional) an examples/ folder with sample images for the Gradio UI
  3. Commit and push to the Space. Hugging Face will build and launch the app.
  4. View the hosted demo: Food101 End-to-End Classifier

Preprocessing Consistency & Image Size Limit

Important:
The preprocessing pipeline (image resizing, cropping, normalization) must be identical between training and inference (including Gradio app or deployment).

  • All transforms should read their parameters from config/prod.yaml (or your own config file).
  • img_size must be ≤ 256 for both training and inference, because images are first resized so their short edge is 256 before center cropping; a larger value causes errors or ineffective crops at inference time.

Best practice:
Update only your config file (not hardcoded values) when changing image size or normalization, and always reload configs in both training and inference code.
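
An illustrative way to build the evaluation transforms from the config so training and inference stay in sync (the exact config keys shown are assumptions):

import yaml
from torchvision import transforms

cfg = yaml.safe_load(open("config/prod.yaml"))  # assumed keys: img_size, mean, std

eval_tfms = transforms.Compose([
    transforms.Resize(256),                  # short edge -> 256
    transforms.CenterCrop(cfg["img_size"]),  # must be <= 256
    transforms.ToTensor(),
    transforms.Normalize(cfg["mean"], cfg["std"]),
])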

Requirements

  • See requirements.txt
  • Python >= 3.9
  • PyTorch >= 2.6

Development

Install the developer tools and set up pre-commit hooks:

pip install -r requirements-dev.txt
pre-commit install

Run formatting, linting, and tests with:

pre-commit run --all-files
pytest

Contributing

Open to issues and pull requests!

License

This project is licensed under the MIT License.

Tips:

  • .env.example helps keep secrets out of git.
  • .gitignore: Don't track datasets, outputs, wandb, or .env.
