Real Estate Web Scraper

This Python script scrapes real estate listings from a specified website, extracting key information about properties such as house type, address, listing date, price, and description.

Features

Scrapes real estate listings from a given URL
Extracts detailed information for each property listing
Handles different date formats (including "Today" and "Yesterday")
Normalizes Unicode characters in addresses
Compiles the scraped data into a pandas DataFrame

Prerequisites

Before running this script, make sure you have the following libraries installed:

requests
beautifulsoup4
pandas
lxml

You can install these dependencies using pip:

pip install requests beautifulsoup4 pandas lxml

Usage

Import the necessary modules and the scrape_real_estate function into your Python script.
Call the function with the required parameters:

today = "2024-07-16"  # Replace with the current date
yesterday = "2024-07-15"  # Replace with yesterday's date
url = "https://example-real-estate-website.com/listings"  # Replace with the target website URL

scraped_data = scrape_real_estate(today, yesterday, url)
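Rather than editing the two date strings by hand on every run, they can be derived from the system clock. The following is a minimal sketch; the helper name date_strings is hypothetical and not part of the script, and it assumes the function expects dates in the YYYY-MM-DD form shown above.

```python
from datetime import date, timedelta


def date_strings(reference=None):
    """Return (today, yesterday) as 'YYYY-MM-DD' strings.

    `reference` defaults to the current local date; pass a date
    explicitly to reproduce an earlier run.
    """
    reference = reference or date.today()
    previous = reference - timedelta(days=1)
    return reference.isoformat(), previous.isoformat()


# These two values can then be passed to scrape_real_estate in place of the literals.
today, yesterday = date_strings()
```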

The function returns a dictionary containing lists of scraped data. You can then convert this to a pandas DataFrame:

import pandas as pd

# Assumes the return value matches the description above:
# a dictionary of equal-length lists, one list per column.
df = pd.DataFrame(scraped_data)

Function Details

The scrape_real_estate function does the following:

Sends a GET request to the specified URL with custom headers to mimic a browser request.
Parses the HTML content using BeautifulSoup.
Finds all property listings on the page.
For each listing, extracts:

House type
Address
Date listed
Price
Description

Handles special cases like recently added listings (Today/Yesterday).
Normalizes Unicode characters in addresses.
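The last two steps can be illustrated with a pair of small helpers. This is a sketch under stated assumptions, not the code from the notebook: the names resolve_listing_date and normalize_address are hypothetical, the site is assumed to label recent listings with text containing "today" or "yesterday", and NFKC is an assumed choice of normalization form.

```python
import unicodedata


def resolve_listing_date(raw, today, yesterday):
    """Replace relative labels with the date strings supplied by the caller.

    Any text that mentions neither "today" nor "yesterday" is returned
    unchanged apart from surrounding whitespace.
    """
    text = raw.strip().lower()
    if "today" in text:
        return today
    if "yesterday" in text:
        return yesterday
    return raw.strip()


def normalize_address(raw):
    """Fold compatibility characters and collapse runs of whitespace."""
    # NFKC turns characters such as the non-breaking space (U+00A0)
    # into their plain equivalents while keeping accented letters composed.
    normalized = unicodedata.normalize("NFKC", raw)
    return " ".join(normalized.split())
```

Both helpers work on plain strings, so they can be applied to whatever text BeautifulSoup extracts for a listing before the values are stored.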

Note

This script is designed for educational purposes and should be used responsibly. Always respect the website's terms of service and robots.txt file. Consider implementing rate limiting and error handling for production use.
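One way to approach the rate limiting and error handling mentioned above is sketched below. The function fetch_politely and its parameters are hypothetical and not part of the script. It takes the actual download step as a callable so that the pacing and retry logic stay independent of the HTTP library. Catching OSError is a deliberate choice here: requests.RequestException derives from it, so failures raised by requests are covered.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=2.0, max_retries=2, sleep=time.sleep):
    """Yield (url, result) pairs, pausing between requests.

    `fetch` is any callable that takes a URL and returns a response.
    A URL that still fails after all retries yields None as its result.
    """
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # fixed pause between consecutive URLs
        result = None
        for attempt in range(max_retries + 1):
            try:
                result = fetch(url)
                break
            except OSError:
                if attempt < max_retries:
                    # wait a little longer before each further retry
                    sleep(delay_seconds * (attempt + 1))
        yield url, result


# Possible use with requests (not executed here):
#   import requests
#   pages = fetch_politely(urls, lambda u: requests.get(u, timeout=10))
```

The delay and retry counts are placeholders; suitable values depend on the target site and its terms of service.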
