This project automates interactions with the Zomato food delivery app on Android devices. It uses optical character recognition (OCR) and vision language models to identify UI elements, allowing for automated testing and workflow execution without modifying the app's source code.
Features:
- Automatically opens the Zomato app
- Navigates to restaurant listings
- Selects food items from a restaurant menu
- Adds items to cart
- Checks for available coupons
- Applies suitable coupons when available
Requirements:
- Python 3.7+
- Android device or emulator with the Zomato app installed
- Android Debug Bridge (ADB) set up
- Tesseract OCR installed on the system
Installation:
- Clone the repository
git clone https://github.com/yourusername/zomato-automation.git
cd zomato-automation
- Install dependencies
pip3 install -r requirements.txt
- Install Tesseract OCR
- On macOS:
brew install tesseract
- On Ubuntu:
sudo apt install tesseract-ocr
- On Windows: Download and install from Tesseract GitHub
- Connect your Android device via USB and enable USB debugging
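Before running anything, it can help to confirm that ADB actually sees the device. The following is a minimal sketch, not part of this repository: the helper names `parse_adb_devices` and `connected_devices` are hypothetical, and it assumes `adb` is on your PATH. It parses the output of the standard `adb devices` command and keeps only devices in the `device` state (an `unauthorized` entry usually means the USB debugging prompt has not been accepted on the phone).

```python
import subprocess


def parse_adb_devices(output):
    """Return serials of devices that `adb devices` reports as ready.

    Each device line looks like "<serial>\t<state>". The header line and
    any daemon start-up messages have a different shape and are skipped.
    """
    serials = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def connected_devices():
    """Run `adb devices` and return the list of ready device serials."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


if __name__ == "__main__":
    devices = connected_devices()
    if devices:
        print("Ready devices:", ", ".join(devices))
    else:
        print("No ready device found; check the cable and USB debugging.")
```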
Run the main script to start the automation:
python3 main.py
Edit config.py to adjust:
- App package names
- Wait times
- OCR parameters
- Vision model settings
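As a rough illustration of what those four groups of settings might look like, here is a hypothetical config.py layout. Every name and value below is an assumption made for this example; check the actual config.py in the repository for the real variable names. The package ID shown is assumed as well and can be verified on your device with `adb shell pm list packages`.

```python
# Hypothetical config.py layout; names and values are illustrative only.

# App package names (assumed ID; verify with `adb shell pm list packages`)
APP_PACKAGE = "com.application.zomato"

# Wait times, in seconds, between UI actions
DEFAULT_WAIT_SECONDS = 2.0
APP_LAUNCH_WAIT_SECONDS = 5.0

# OCR parameters
OCR_LANGUAGE = "eng"          # Tesseract language code
OCR_MATCH_THRESHOLD = 0.8     # minimum similarity (0 to 1) for a fuzzy text match

# Vision model settings (placeholder, not a real model name)
VISION_MODEL_NAME = "your-vision-model"
```

Longer wait times tend to make runs slower but more tolerant of network delays in the app; a lower match threshold accepts noisier OCR output at the cost of more false matches.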
The automation uses a combination of:
- ADB to control the Android device
- OCR to identify text on the screen
- Vision models to interpret complex UI elements
- Sequence matching to locate text when perfect matches aren't possible
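The way these pieces fit together can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's actual code: `best_match` and `tap_command` are hypothetical helpers, the OCR results are assumed to arrive as `(text, left, top, width, height)` tuples (the kind of data Tesseract reports for each word), and sequence matching is done here with Python's standard `difflib.SequenceMatcher`. The result is an `adb shell input tap` command aimed at the center of the best-matching word.

```python
from difflib import SequenceMatcher


def best_match(target, ocr_words, threshold=0.8):
    """Find the OCR word most similar to `target`.

    `ocr_words` is an iterable of (text, left, top, width, height) tuples.
    Returns (text, center_x, center_y, score) for the best word whose
    similarity is at least `threshold`, or None if nothing qualifies.
    """
    best = None
    for text, left, top, width, height in ocr_words:
        score = SequenceMatcher(None, target.lower(), text.lower()).ratio()
        if score >= threshold and (best is None or score > best[3]):
            best = (text, left + width // 2, top + height // 2, score)
    return best


def tap_command(x, y):
    """Build the ADB command that taps the screen at pixel (x, y)."""
    return ["adb", "shell", "input", "tap", str(x), str(y)]


if __name__ == "__main__":
    # Made-up OCR output: "Coupns" stands in for a misread "Coupons".
    words = [("Add", 100, 200, 60, 40), ("Coupns", 300, 500, 120, 40)]
    match = best_match("Coupons", words)
    if match:
        # Pass this list to subprocess.run(...) to perform the tap.
        print(tap_command(match[1], match[2]))
```

With the sample data above, the misread word still matches and the tap lands at (360, 520), the center of its bounding box. A stricter threshold would reject the misread word instead of tapping it.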
License: MIT