
Commit 63812d6

Merge pull request #775 from Rahuls66/zomato
Zomato Restaurant Scraper
2 parents 973396e + 164bea6 commit 63812d6


4 files changed: +77 -0 lines changed

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# Zomato Dine-in Restaurant Scraper

This Python script scrapes the Name, Cuisine, Area, and Rate for Two details of dine-in restaurants from Zomato.
The script contains a user-defined function which scrapes the restaurant details for a given city and returns a DataFrame of the fetched details.

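For orientation, a minimal sketch of what that function returns (the column names are taken from `zomato_scraper.py`; `soup` is the BeautifulSoup object the script builds from the loaded page):

```python
df = zomato(soup)
print(df.columns.tolist())
# ['Name', 'Cuisine', 'Area', 'Rate for Two']
```
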
## Setup instructions

Steps to run the script:

1. Clone this repository.
2. Download the [Chrome Webdriver](https://chromedriver.chromium.org/downloads) for your current Google Chrome version. Save the downloaded file to the cloned `zomato_dinein_restaurant_scraper` folder (see the note after these steps if Selenium does not pick up the driver automatically).
3. Install the required dependencies by running `pip install -r requirements.txt`.
4. Run the `zomato_scraper.py` file.

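If `webdriver.Chrome()` cannot find the driver saved in the cloned folder (for example, because that folder is not on your `PATH`), Selenium 3.x also accepts an explicit driver path. A minimal sketch, assuming the chromedriver binary sits next to `zomato_scraper.py`; the exact filename (`chromedriver` vs `chromedriver.exe`) depends on your platform:

```python
import os
from selenium import webdriver

# Assumed location: the chromedriver binary saved in the cloned folder.
driver_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "chromedriver")

# Selenium 3.141.0 accepts the driver location via executable_path.
browser = webdriver.Chrome(executable_path=driver_path)
```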

## Additional Information about the script

In the Python script, we scrape the restaurants for `Indore`. You can scrape restaurants of other cities by changing the `url` variable.
For example, if you want to scrape the restaurants for `Mumbai`, change the url to `https://www.zomato.com/mumbai/dine-out`.
Over time, the number of restaurants may grow, so try increasing the number of iterations in the `for` loop if you think not all the restaurants are being fetched. Both changes are sketched below.

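For reference, a minimal sketch of those two edits (the scroll count of 50 is only an illustrative value; tune it as needed):

```python
# Point the scraper at Mumbai instead of Indore.
url = 'https://www.zomato.com/mumbai/dine-out'

# Scroll more times so that more restaurants are loaded before the
# page source is extracted (50 is an arbitrary example).
for i in range(0, 50):
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight*0.81);")
    time.sleep(5)
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight*0.86);")
    time.sleep(1)
```
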
## Output
![Sample](https://user-images.githubusercontent.com/43356237/137814799-c9180b73-0163-4f93-a230-b7fdb0a2b00a.png)
## Author(s)
Rahul Shah
## Disclaimers, if any
The author shall not be held responsible for any misuse of this script.
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
pandas==1.3.2
selenium==3.141.0
beautifulsoup4==4.10.0
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# -- IMPORTING LIBRARIES --
import pandas as pd
from selenium import webdriver
from bs4 import BeautifulSoup
import time


# -- STARTING CHROME WITH WEBDRIVER --
browser = webdriver.Chrome()


# -- OPENING URL IN BROWSER --
url = 'https://www.zomato.com/indore/dine-out'
browser.get(url)


# -- ITERATING THROUGH THE PAGE TO GET ALL THE RESTAURANTS --
for i in range(0, 25):
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight*0.81);")
    time.sleep(5)
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight*0.86);")
    time.sleep(1)

# -- EXTRACTING PAGE SOURCE --
html = browser.page_source

# -- CREATING BeautifulSoup OBJECT --
soup = BeautifulSoup(html, 'html.parser')

# -- CLOSING THE BROWSER --
browser.quit()


# -- DEFINING FUNCTION FOR EXTRACTING RESTAURANT DETAILS --
def zomato(soup):
    name = [i.text.strip() for i in soup.find_all('h4', class_='sc-1hp8d8a-0 sc-dpiBDp iFpvOr')]
    cuisine = [i.text.strip() for i in soup.find_all('p', class_='sc-1hez2tp-0 sc-hENMEE ffqcCI')]
    area = [i.text.strip() for i in soup.find_all('p', class_='sc-1hez2tp-0 sc-dCaJBF jughZz')]
    rate = [i.text.strip() for i in soup.find_all('p', class_='sc-1hez2tp-0 sc-hENMEE crfqyB')]
    return pd.DataFrame({'Name': name, 'Cuisine': cuisine, 'Area': area, 'Rate for Two': rate})


# -- DISPLAYING AND EXPORTING RESULTS --
df = zomato(soup)
print(df.head())
df.to_csv('Zomato Restaurants.csv')
