Sethumathavan2001/Amazon-Best-Sellers

🛍️ Amazon Best Sellers Scraper (India)

This Python script scrapes product data from various Amazon India Best Sellers categories like Grocery, Electronics, Beauty, Health & Personal Care, and Baby Products. The script fetches product details such as rank, name, rating, price, and more, and outputs the data as a CSV file with the current date in the filename.


📌 Features

✅ Scrapes from multiple Best Seller categories
✅ Navigates nested subcategories recursively
✅ Collects:

  • Product Rank
  • Product ID
  • Product Name
  • Rating
  • Number of Ratings
  • Price
  • Product Image URL
  • Full Category Hierarchy

✅ Exports to CSV (dated)


📁 Output Example

A CSV file with the current date in its filename, with columns:

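| Category | Sub_Category_1 | Rank | Name | Rating | People | Prize | Image_Link |
|----------|----------------|------|------|--------|--------|-------|------------|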
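
As a rough illustration of the dated export, the sketch below writes rows with the columns listed above using only the standard library. The `amazon_best_sellers` prefix, the ISO date format, and the helper names `dated_filename` and `write_rows` are assumptions made for this example; the script's actual filename pattern may differ.

```python
import csv
from datetime import date

# Column names as listed in this README ("Prize" is kept as spelled there).
COLUMNS = ["Category", "Sub_Category_1", "Rank", "Name",
           "Rating", "People", "Prize", "Image_Link"]


def dated_filename(prefix="amazon_best_sellers", day=None):
    """Build a filename containing the date (hypothetical naming scheme)."""
    day = day or date.today()
    return f"{prefix}_{day.isoformat()}.csv"


def write_rows(rows, stream):
    """Write dict rows to an open text stream; missing fields become empty cells."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    sample = [{"Category": "Grocery", "Rank": 1, "Name": "Tea"}]
    # newline="" lets the csv module control line endings, as its docs recommend.
    with open(dated_filename(), "w", newline="", encoding="utf-8") as fh:
        write_rows(sample, fh)
```

Passing an open stream rather than a path keeps the writer easy to exercise with an in-memory buffer.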

🧠 How It Works

  1. Initial URLs: Starts from hardcoded Best Sellers URLs for five Amazon India categories.
  2. HTML Parsing: Uses BeautifulSoup to parse the HTML of each page.
  3. AJAX Handling: Combines GET and POST requests to retrieve dynamically loaded content via ACP path logic.
  4. Category Tree: Recursively walks the category tree by following elements marked with the role="group" and role="treeitem" attributes.
  5. Data Cleaning: Handles edge cases, fills missing ranks, removes duplicates.
  6. Output: Final dataset is concatenated and exported as a dated CSV.
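
The recursive category walk in step 4 could look roughly like the sketch below. It is not the script's own code: to stay dependency-free it swaps BeautifulSoup for the standard-library `html.parser`, and it assumes that each page lists only its own subcategories as `<a>` links inside role="treeitem" nodes within a role="group" container. The names `TreeItemParser`, `walk`, and `fetch` are hypothetical.

```python
from html.parser import HTMLParser


class TreeItemParser(HTMLParser):
    """Collect (name, href) for links in role="treeitem" nodes inside a role="group"."""

    def __init__(self):
        super().__init__()
        self.stack = []    # (tag, role) for every element currently open
        self.items = []    # (name, href) pairs found so far
        self._href = None  # href of the link being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.stack.append((tag, attrs.get("role")))
        roles = {role for _, role in self.stack}
        if tag == "a" and {"group", "treeitem"} <= roles:
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append(("".join(self._text).strip(), self._href))
            self._href = None
        # Pop back to the matching open tag; tolerates unclosed tags such as <img>.
        if any(open_tag == tag for open_tag, _ in self.stack):
            while self.stack.pop()[0] != tag:
                pass


def walk(url, fetch, path=(), seen=None):
    """Yield (category_path, url) for every leaf category reachable from url."""
    seen = set() if seen is None else seen
    if url in seen:
        return
    seen.add(url)
    parser = TreeItemParser()
    parser.feed(fetch(url))  # fetch: any callable mapping a URL to HTML text
    children = [(name, href) for name, href in parser.items if href not in seen]
    if not children:
        yield path, url      # no subcategories: this page is a leaf
        return
    for name, href in children:
        yield from walk(href, fetch, path + (name,), seen)
```

Here `fetch` stands in for whatever performs the HTTP request, and the accumulated `path` tuple corresponds to the full category hierarchy recorded for each product.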
