ZomatoApp Test Automation with Vision Language Model

Overview

This project automates interactions with the Zomato food delivery app on Android devices. It uses optical character recognition (OCR) and a vision language model to identify UI elements on screen, so tests and workflows can run without modifying the app's source code.

Features

Automatically opens the Zomato app
Navigates to restaurant listings
Selects food items from a restaurant menu
Adds items to cart
Checks for available coupons
Applies suitable coupons when available

Requirements

Python 3.7+
Android device or emulator with the Zomato app installed
Android Debug Bridge (ADB) set up
Tesseract OCR installed on the system

Installation

Clone the repository

git clone https://github.com/yourusername/zomato-automation.git
cd zomato-automation

Install dependencies

pip3 install -r requirements.txt

Install Tesseract OCR

On macOS: brew install tesseract
On Ubuntu: sudo apt install tesseract-ocr
On Windows: download and run the installer linked from the Tesseract GitHub page

Connect your Android device via USB and enable USB debugging
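
Before running the automation, it can help to confirm that ADB actually sees the device. The sketch below is not part of this repository; the function names are hypothetical. It runs the standard adb devices command and keeps only serials reported in the device state (a state of unauthorized usually means the USB debugging prompt has not been accepted on the phone).

```python
import subprocess
from typing import List


def parse_adb_devices(output: str) -> List[str]:
    """Return the serials that `adb devices` output lists as ready."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Ready devices appear as "<serial>\tdevice". The header line and
        # entries marked "unauthorized" or "offline" do not match this shape.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def connected_devices() -> List[str]:
    """Run `adb devices` and return ready serials (needs adb on PATH)."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)
```

If connected_devices() returns an empty list, re-check the cable, the USB debugging setting, and the authorization prompt on the device.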

Usage

Run the main script to start the automation:

python3 main.py

Configuration

Edit config.py to adjust:

App package names
Wait times
OCR parameters
Vision model settings
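
As a rough illustration of the kinds of settings listed above, a config.py could look like the sketch below. Every name and value here is an assumption made for illustration and may not match the actual file in this repository; check config.py itself before editing.

```python
# config.py (illustrative sketch; all names and values are assumptions)

# Android package to launch. This is believed to be Zomato's package ID;
# confirm it on your device with: adb shell pm list packages
APP_PACKAGE = "com.application.zomato"

# Wait times, in seconds, between UI actions
WAIT_SHORT = 1.0   # after a tap
WAIT_LONG = 5.0    # after launching the app or loading a new screen

# OCR parameters: Tesseract engine mode and page segmentation mode
TESSERACT_CONFIG = "--oem 3 --psm 6"
# Minimum similarity (0 to 1) for accepting a fuzzy text match
MATCH_THRESHOLD = 0.75

# Vision model settings (placeholder values)
VISION_MODEL_NAME = "your-vision-model"
VISION_MAX_RETRIES = 2
```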

How It Works

The automation uses a combination of:

ADB to control the Android device
OCR to identify text on the screen
Vision models to interpret complex UI elements
Sequence matching to locate text when perfect matches aren't possible
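
The pieces above can be sketched as follows. This is a minimal illustration, not the code in steps.py or adb_controller.py: it assumes OCR results arrive as words with bounding boxes (the left/top/width/height fields mirror what pytesseract's image_to_data reports), uses Python's difflib for the sequence matching, and builds the standard adb shell input tap command. The OcrWord type and both function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, NamedTuple, Optional


class OcrWord(NamedTuple):
    """One recognized word and its bounding box in screen pixels."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_best_match(
    target: str, words: List[OcrWord], threshold: float = 0.75
) -> Optional[OcrWord]:
    """Return the OCR word most similar to target, or None if none is close."""
    best, best_score = None, 0.0
    for word in words:
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they
        # diverge, so small OCR misreads ("Appiy" for "Apply") still score high.
        score = SequenceMatcher(None, target.lower(), word.text.lower()).ratio()
        if score >= threshold and score > best_score:
            best, best_score = word, score
    return best


def tap_command(word: OcrWord) -> List[str]:
    """Build the adb command that taps the center of the word's box."""
    x = word.left + word.width // 2
    y = word.top + word.height // 2
    return ["adb", "shell", "input", "tap", str(x), str(y)]
```

For example, given an OCR pass that misread the coupon button as "Appiy", find_best_match("Apply", words) still returns that word, and the resulting list can be passed to subprocess.run to perform the tap. The threshold trades off tolerance for OCR noise against the risk of tapping the wrong element.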

License

MIT
