🌐 WebClonePro — Dynamic Webpage Scraper & Recreator

🚀 Overview

WebClonePro takes any public website URL and generates a clone-like HTML preview of the page, scraped with a headless browser.

This project showcases dynamic content extraction using Playwright, fast API development with FastAPI, and a sleek user interface powered by Next.js & TypeScript.

🎥 Demo Video

📺 Watch the walkthrough here:

🔗 Click the image or [watch on YouTube](https://youtu.be/jfMwgjjgFoE)

🛠️ Features

🔗 Clone any public website URL
🎭 Powerful dynamic scraping using Playwright
⚡ Fast & asynchronous backend with FastAPI
📄 Metadata, styles, and HTML extraction
🖥️ Frontend preview of cloned content
🧠 Intelligent error handling and feedback
📦 Optional support for downloadable static sites (future)
⚙️ Planned caching and performance boosts
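
To make the "metadata, styles, and HTML extraction" feature concrete, here is a minimal sketch of pulling the title, meta tags, and stylesheet links out of an HTML string. It uses only the Python standard library. The names `MetadataParser` and `extract_metadata` are illustrative assumptions and are not taken from the WebClonePro backend, which may extract this data differently (for example, inside the browser via Playwright).

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects <title>, <meta> and stylesheet <link> data (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.meta: dict[str, str] = {}
        self.stylesheets: list[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Standard tags use name=..., Open Graph tags use property=...
            key = attributes.get("name") or attributes.get("property")
            if key and attributes.get("content") is not None:
                self.meta[key] = attributes["content"]
        elif tag == "link":
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel and attributes.get("href"):
                self.stylesheets.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html: str) -> dict:
    """Return the title, meta tags and stylesheet URLs found in `html`."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta": parser.meta,
        "stylesheets": parser.stylesheets,
    }
```

In a setup like this one, the HTML string would typically be the rendered page content returned by the headless browser, so tags injected by JavaScript are included.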

📂 Project Structure

WebClonePro/
├── backend/               # FastAPI backend for scraping logic
│   ├── hello.py           # Main API app
│   ├── requirements.txt   # Python dependencies
│   └── ...                # Utilities, models
│
├── frontend/              # Next.js frontend for UI
│   ├── pages/             # Main input and result pages
│   ├── components/        # Reusable React components
│   └── ...                # Static assets, config
│
├── README.md              # Project documentation
└── LICENSE                # MIT License

🖼️ Screenshots

📸 Real views of WebClonePro in action:

URL Input Page
Preview of Cloned Site
Metadata Extracted
Navigation Sample
Example Cloned
Responsive View

📋 Getting Started

🔧 Prerequisites
🐍 Python 3.9+
🔧 Node.js 16+
🎭 Playwright browsers, installed once the playwright Python package is available (the same command is repeated in Backend Setup):

python -m playwright install

📥 Installation & Setup

1️⃣ Clone Repository

git clone https://github.com/Shristirajpoot/WebClonePro.git
cd WebClonePro

2️⃣ Backend Setup

cd backend
python -m venv venv

# Activate (Windows)
.\venv\Scripts\activate

# Activate (macOS/Linux)
source venv/bin/activate

pip install --upgrade pip
pip install -r requirements.txt
python -m playwright install

uvicorn hello:app --reload

Server will run at http://localhost:8000

3️⃣ Frontend Setup

cd ../frontend
npm install
npm run dev

Frontend available at http://localhost:3000

🔍 Usage

Open your browser and go to http://localhost:3000
Enter any public website URL
Submit the form and wait for the scraping to finish
View the cloned preview and metadata
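
Since the tool is meant for public website URLs, a pre-flight check before scraping can be useful. The sketch below is an assumption about how such a check might look and is not taken from the WebClonePro code. The function name `is_public_http_url` is hypothetical. It accepts only http(s) URLs and rejects hosts that are obviously local or private. It does not resolve DNS, so a domain name pointing at a private address would still pass.

```python
import ipaddress
from urllib.parse import urlparse


def is_public_http_url(url: str) -> bool:
    """Return True for http(s) URLs whose host is not obviously local or private."""
    try:
        parsed = urlparse(url.strip())
        host = parsed.hostname
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 literal.
        return False
    if parsed.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a regular domain name.
        return True
    # Loopback, private and link-local addresses are not globally reachable.
    return ip.is_global
```

A check like this could run on the frontend form, in the backend before the browser is launched, or both.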

🚧 Challenges & Solutions

⚙️ Managed Playwright async subprocesses on Windows
🕸️ Solved dynamic loading via network idle wait strategy
🛡️ Implemented robust error handling and fallback messages
🚀 Used async def for high-performance scraping
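
The points above can be sketched together in one small module. This is an illustration under stated assumptions, not the code in `hello.py`: the names `scrape` and `friendly_error`, the message wording, and the 30 second timeout are all hypothetical. The Playwright calls used (`async_playwright`, `chromium.launch`, `new_page`, `goto` with `wait_until="networkidle"`, `title`, `content`) are part of Playwright's async Python API. Setting the Proactor event loop policy is one commonly used way to make asyncio subprocesses work on Windows. Whether it is needed depends on how the server is started.

```python
import asyncio
import sys

# Playwright starts its browser driver as a subprocess. On Windows, asyncio
# subprocesses require the Proactor event loop (assumed workaround).
if sys.platform == "win32":
    asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())


def friendly_error(exc: Exception) -> str:
    """Map an exception to a fallback message for the UI (hypothetical wording)."""
    name = type(exc).__name__
    if name == "TimeoutError":
        return "The page took too long to load. Please try again."
    return f"Could not clone this page ({name})."


async def scrape(url: str, timeout_ms: int = 30_000) -> dict:
    """Load `url` in headless Chromium and return its rendered HTML."""
    # Imported lazily so this module can be loaded without Playwright installed.
    from playwright.async_api import async_playwright

    try:
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            try:
                page = await browser.new_page()
                # "networkidle" waits until network activity has settled, which
                # gives JavaScript-rendered content a chance to appear.
                await page.goto(url, wait_until="networkidle", timeout=timeout_ms)
                return {
                    "ok": True,
                    "title": await page.title(),
                    "html": await page.content(),
                }
            finally:
                await browser.close()
    except Exception as exc:
        # Return a structured failure instead of letting the request crash.
        return {"ok": False, "error": friendly_error(exc)}
```

Because `scrape` is a coroutine, an `async def` FastAPI route can await it directly without blocking other requests while the page loads.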

🧭 Future Enhancements

📄 Clone multi-page sites with link traversal
📦 Enable download of static HTML/CSS packages
🎨 Add history and clone logs to the UI
⚡ Caching for repeated URLs
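
As one possible shape for the planned caching of repeated URLs, the sketch below keeps scraped HTML in memory for a fixed time. It is a hedged illustration only: the class name `TTLCache`, the 300 second default, and the in-memory dictionary are assumptions, and the project may choose a different design (for example, an external cache).

```python
import time
from typing import Callable, Optional


class TTLCache:
    """In-memory cache that forgets entries after `ttl_seconds` (hypothetical)."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, url: str) -> Optional[str]:
        """Return cached HTML for `url`, or None if missing or expired."""
        entry = self._store.get(url)
        if entry is None:
            return None
        stored_at, html = entry
        if self._clock() - stored_at > self._ttl:
            # Entry is too old: drop it so the caller scrapes again.
            del self._store[url]
            return None
        return html

    def put(self, url: str, html: str) -> None:
        """Store `html` for `url` with the current timestamp."""
        self._store[url] = (self._clock(), html)
```

A scraping endpoint would call `get` first and only launch the browser on a miss, then `put` the fresh result.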

📞 Contact

Shristi Rajpoot

🔗 LinkedIn: www.linkedin.com/in/shristi-rajpoot-36774b281
📧 Email: shristirajpoot369@gmail.com
🔗 GitHub: @Shristirajpoot

📄 License

MIT License. See LICENSE for details.

🌟 If you found this useful, please ⭐ star the repo and share it!
