A web scraping application built with Java, Spring Boot, and the JSoup library. It extracts data from websites and processes it into a structured format for analysis or for use by other applications.
- Efficient Scraping: Extract text, images, and links from web pages.
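- Dynamic Content Handling: Parse HTML and navigate nested or complex page structures. JSoup itself does not execute JavaScript, so content rendered client-side needs an additional tool.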
- Customizable: Easily modify scraping logic to suit specific use cases.
- Spring Boot Integration: Leverages Spring Boot for scalability, easy configuration, and RESTful API exposure.
- Data Export: Outputs data in formats such as JSON or CSV.
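As an illustration of the Data Export feature, here is a minimal sketch of writing scraped rows as CSV using only the Java standard library. The class `CsvExportSketch` and its methods are hypothetical names chosen for this example and are not taken from the project; the actual export code may be structured differently.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not from the project): turns scraped rows into CSV text.
class CsvExportSketch {

    // Quote a field when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled, following the usual CSV convention.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Join the escaped fields of one record with commas.
    static String toCsvRow(List<String> fields) {
        return fields.stream()
                .map(CsvExportSketch::escape)
                .collect(Collectors.joining(","));
    }

    // Build the full CSV document: header line first, then one line per row.
    static String toCsv(List<String> header, List<List<String>> rows) {
        StringBuilder out = new StringBuilder(toCsvRow(header)).append("\n");
        for (List<String> row : rows) {
            out.append(toCsvRow(row)).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample data standing in for scraped results.
        List<String> header = List.of("title", "url");
        List<List<String>> rows = List.of(
                List.of("Example Domain", "https://example.com/"),
                List.of("Say \"hi\", world", "https://example.com/hi"));
        System.out.print(toCsv(header, rows));
    }
}
```

For JSON output, a Spring Boot application would more typically return plain Java objects from a controller and let the framework serialize them, so hand-written escaping like this is mainly relevant to CSV.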
To run this project, ensure the following prerequisites are met:
- Java: Version 17 or higher.
- Maven: To manage dependencies and build the project.
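- Spring Boot: Included as a project dependency; Maven downloads it during the build, so no separate installation is needed.
- JSoup Library: Used for parsing and scraping web pages; also resolved by Maven as a project dependency.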
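To give a sense of what JSoup is used for, the sketch below extracts the title, text, links, and image URLs from an HTML string. It assumes JSoup is on the classpath. The class `JsoupExtractSketch`, the sample HTML, and the `https://example.com/` base URL are hypothetical and do not come from the project.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical example (not from the project): basic extraction with JSoup.
class JsoupExtractSketch {

    // Collect the absolute URL of every anchor that has an href attribute.
    static List<String> links(Document doc) {
        List<String> urls = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            urls.add(a.absUrl("href"));
        }
        return urls;
    }

    // Collect the absolute URL of every image that has a src attribute.
    static List<String> images(Document doc) {
        List<String> urls = new ArrayList<>();
        for (Element img : doc.select("img[src]")) {
            urls.add(img.absUrl("src"));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Sample markup standing in for a downloaded page.
        String html = "<html><head><title>Demo</title></head><body>"
                + "<p>Hello, <b>world</b>.</p>"
                + "<a href=\"/about\">About</a>"
                + "<img src=\"logo.png\" alt=\"Logo\">"
                + "</body></html>";

        // The base URI lets JSoup resolve relative links to absolute ones.
        Document doc = Jsoup.parse(html, "https://example.com/");

        System.out.println("Title:  " + doc.title());
        System.out.println("Text:   " + doc.body().text());
        System.out.println("Links:  " + links(doc));
        System.out.println("Images: " + images(doc));
    }
}
```

To scrape a live page instead of a string, `Jsoup.connect(url).get()` returns a `Document` that can be passed to the same methods.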
1) Clone the repository:
git clone https://github.com/Shivarora22/WebScapper
2) Change into the cloned directory and build the project using Maven:
cd WebScapper
mvn clean install
3) Run the application:
mvn spring-boot:run