# CrawlX

CrawlX is a simple web crawler and search engine that retrieves web page content based on user queries. It follows a Breadth-First Search (BFS) approach to crawl Wikipedia pages and other websites. Additionally, it supports web search and image search functionalities.
## Features

- Web Crawling: Extracts content and metadata from Wikipedia and other websites.
- Search Engine: Users can search for keywords, and the crawler fetches relevant results.
- BFS-Based Crawling: Follows links level by level from the seed page; unbounded depth can trigger request timeouts, so crawl depth is limited.
- Image Search: Fetches images related to search queries.
- Backlink Analysis: Retrieves backlinks via the Google Search API.
## How It Works

- The user enters a search keyword.
- The crawler starts from Wikipedia and follows links using a BFS approach.
- The results are stored in a database and displayed to the user.
- Users can also perform an image search related to the keyword.
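The BFS crawl loop described above can be sketched as follows. The function name `bfsCrawl` and the `$fetchLinks` callable are illustrative, not the project's actual code; the real crawler would fetch each page's links with cURL and DOMDocument.

```php
<?php
// Sketch of a BFS crawl, assuming a $fetchLinks callable that returns
// the outgoing links of a page. A FIFO queue gives breadth-first order.
function bfsCrawl(string $startUrl, callable $fetchLinks, int $maxPages = 50): array
{
    $queue   = [$startUrl];
    $visited = [$startUrl => true];
    $order   = [];

    while ($queue && count($order) < $maxPages) {
        $url     = array_shift($queue);   // dequeue the oldest page first
        $order[] = $url;

        foreach ($fetchLinks($url) as $link) {
            if (!isset($visited[$link])) {
                $visited[$link] = true;   // mark before enqueueing to avoid duplicates
                $queue[]        = $link;
            }
        }
    }
    return $order;
}
```

With a toy link graph such as `['A' => ['B', 'C'], 'B' => ['C', 'D']]`, `bfsCrawl('A', ...)` visits pages level by level rather than diving depth-first, which is what keeps the crawl close to the seed page.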
## Tech Stack

- PHP - Backend logic and data processing
- MySQL - Database to store crawled data
- cURL & DOMDocument - Fetching and parsing HTML content
- JavaScript & jQuery - Frontend interactivity
- Bootstrap - Responsive UI design
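A minimal sketch of how cURL and DOMDocument can work together in the fetch-and-parse step. The function names, timeout, and user-agent string are illustrative assumptions, not code from the repository.

```php
<?php
// Fetch a page body with cURL (illustrative options).
function fetchPage(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow redirects
        CURLOPT_TIMEOUT        => 10,     // avoid hanging on slow pages
        CURLOPT_USERAGENT      => 'CrawlX/1.0',
    ]);
    $html = curl_exec($ch);
    curl_close($ch);
    return $html === false ? '' : $html;
}

// Extract absolute links from the HTML with DOMDocument.
function extractLinks(string $html, string $baseUrl): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html);               // suppress warnings from malformed HTML
    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if (str_starts_with($href, 'http')) {
            $links[] = $href;                            // already absolute
        } elseif (str_starts_with($href, '/')) {
            $links[] = rtrim($baseUrl, '/') . $href;     // resolve root-relative links
        }
    }
    return array_values(array_unique($links));
}
```

`extractLinks(fetchPage($url), $base)` would then feed the BFS queue; error handling and politeness (robots.txt, rate limiting) are left out for brevity.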
## Installation

- Clone the repository:

  ```shell
  git clone https://github.com/yourusername/crawlx.git
  ```

- Set up a local or remote server with PHP and MySQL.
- Import the database schema from `database.sql`.
- Update the database connection details in `config.php`.
- Run the project on your server.
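The connection details in `config.php` typically look something like the sketch below. The constant names and credentials here are hypothetical; match them to the actual file in the repository and to your environment.

```php
<?php
// Hypothetical shape of config.php -- adjust names and values
// to match the real file and your MySQL setup.
define('DB_HOST', 'localhost');
define('DB_USER', 'root');
define('DB_PASS', '');
define('DB_NAME', 'crawlx');

$conn = new mysqli(DB_HOST, DB_USER, DB_PASS, DB_NAME);
if ($conn->connect_error) {
    die('Database connection failed: ' . $conn->connect_error);
}
```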
## Contributing

We welcome contributions! If you’d like to improve CrawlX, feel free to:
- Report issues
- Submit pull requests
- Enhance crawling efficiency and search results
- Improve UI/UX
Fork the project and start contributing!
## License

This project is open-source and available under the MIT License.
🌟 Star the repository if you find it useful!