This project demonstrates a Computer Vision model that classifies hand gestures as Rock, Paper, or Scissors using image classification. The model is trained on the official TensorFlow Rock-Paper-Scissors dataset, which provides:
- Training Set: 840 images per class (rock, paper, scissors)
- Testing Set: 124 images per class
!mkdir -p ./tmp
!wget --no-check-certificate \
    https://storage.googleapis.com/download.tensorflow.org/data/rps.zip \
    -O ./tmp/rps.zip
!wget --no-check-certificate \
    https://storage.googleapis.com/download.tensorflow.org/data/rps-test-set.zip \
    -O ./tmp/rps-test-set.zip
import zipfile

def extract_file(src, dest):
    # Use 'zf' rather than 'zip' to avoid shadowing the built-in zip()
    with zipfile.ZipFile(src, 'r') as zf:
        print(f'Extracting all the files from {src}...')
        zf.extractall(dest)
        print('Done!')

extract_file('./tmp/rps.zip', './data')
extract_file('./tmp/rps-test-set.zip', './data')
import os

def get_image_counts(parent_folder, dataset_name):
    # Count the files in each class subfolder
    for label in ['rock', 'paper', 'scissors']:
        count = len(os.listdir(os.path.join(parent_folder, label)))
        print(f'total {dataset_name} {label} images: {count}')

get_image_counts('./data/rps', 'training')
get_image_counts('./data/rps-test-set', 'testing')
| Gesture | Training Images | Testing Images |
|---|---|---|
| Rock | 840 | 124 |
| Paper | 840 | 124 |
| Scissors | 840 | 124 |
- Implemented using TensorFlow & Keras
- Convolutional Neural Network (CNN)
- Categorical classification (3 classes)
- Trained using image data from the dataset
- Designed to predict whether a hand gesture is:
  - Rock 🪨
  - Paper 📄
  - Scissors ✂️
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

# Display the first two sample images from each class (sorted for deterministic order)
for label in ['rock', 'paper', 'scissors']:
    im_folder = f'./data/rps/{label}'
    for im_name in sorted(os.listdir(im_folder))[:2]:
        img = Image.open(os.path.join(im_folder, im_name)).convert('RGB')
        plt.imshow(np.asarray(img))
        plt.title(f'Label: {label}')
        plt.axis('off')
        plt.show()
from tensorflow import keras

# Three conv/pool blocks followed by a dense classifier head
model = keras.models.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    keras.layers.MaxPooling2D(2, 2),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D(2, 2),
    keras.layers.Conv2D(128, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D(2, 2),
    keras.layers.Flatten(),
    keras.layers.Dense(512, activation='relu'),
    keras.layers.Dense(3, activation='softmax')  # one output per gesture class
])
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
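The `fit` call below expects `train_generator` and `validation_generator`, which are not defined earlier in this section. A minimal sketch using Keras' `ImageDataGenerator` (the batch size, and using the test set for validation, are assumptions):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    './data/rps',
    target_size=(150, 150),    # match the model's input_shape
    class_mode='categorical',  # one-hot labels for categorical_crossentropy
    batch_size=32)             # batch size is an assumption

validation_generator = validation_datagen.flow_from_directory(
    './data/rps-test-set',
    target_size=(150, 150),
    class_mode='categorical',
    batch_size=32)

`class_mode='categorical'` produces one-hot labels, matching the `categorical_crossentropy` loss used above.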
history = model.fit(train_generator,
                    epochs=10,
                    validation_data=validation_generator)
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
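To make the "Evaluate performance" step concrete and keep the trained weights for the webcam demo below, the model can be evaluated and saved; the file name is an assumption:

# Evaluate on the held-out test set (reuses validation_generator from above)
test_loss, test_acc = model.evaluate(validation_generator)
print(f'Test accuracy: {test_acc:.3f}')

# Save the trained model for later reuse (file name is an assumption)
model.save('rps_model.h5')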
Use OpenCV for real-time hand gesture prediction via webcam:
import cv2

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Process frame here
    cv2.imshow('Rock-Paper-Scissors Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()
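The `# Process frame here` placeholder could run each frame through the trained model. A minimal sketch, assuming the saved `rps_model.h5` from above and alphabetical class order (`paper`, `rock`, `scissors`, which is what `flow_from_directory` assigns):

import cv2
import numpy as np
from tensorflow import keras

model = keras.models.load_model('rps_model.h5')  # file name is an assumption
class_names = ['paper', 'rock', 'scissors']      # alphabetical, matching flow_from_directory

def classify_frame(frame):
    # Convert BGR (OpenCV) to RGB, resize to the model's input size, and rescale
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(rgb, (150, 150)) / 255.0
    probs = model.predict(np.expand_dims(resized, axis=0), verbose=0)[0]
    return class_names[int(np.argmax(probs))], float(np.max(probs))

# Inside the capture loop, replace "# Process frame here" with:
# label, confidence = classify_frame(frame)
# cv2.putText(frame, f'{label} ({confidence:.0%})', (10, 30),
#             cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)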
- Download and extract Rock-Paper-Scissors dataset
- Display sample training images
- Count number of images per class
- Build and train a TensorFlow model
- Evaluate performance
- (Optional) Integrate with webcam
- Open the notebook in Google Colab or your Jupyter environment.
- Run all code cells step-by-step.
- Extend with webcam support or mobile deployment (a TensorFlow Lite sketch follows below).
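For the mobile-deployment route, the trained Keras model can be converted to TensorFlow Lite; a minimal sketch (the output file name is an assumption):

import tensorflow as tf

# Convert the trained Keras model to TensorFlow Lite for mobile use
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('rps_model.tflite', 'wb') as f:
    f.write(tflite_model)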
This was built as part of a Computer Vision assignment focused on gesture recognition using CNNs.