Automates the search for company websites starting from a CSV file. It reads each entry, uses Selenium with ChromeDriver to search the web, and selects the most likely official site for each company. Ideal for lead generation, sourcing, and business research.

CODEX-cpp/Scraping-Web

This is a smart Python-based tool that automates the process of finding company websites and extracting key business information, starting from a simple CSV file.

🔍 How it works

Just feed the program a CSV file containing company names or product categories. It will:

Parse the file row-by-row to read company names.

Launch a browser session using Selenium and ChromeDriver.

Automatically perform web searches for each company.

Match and identify the most relevant official website using a custom-built filtering and matching system.

(Optional) Extract useful data from the found websites.
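The matching step above can be sketched roughly as follows. This is not the repository's actual filtering and matching system, whose details are not described here; it is a minimal illustration that assumes candidate URLs have already been collected from the search results. The names `normalize`, `domain_label`, `best_match`, `LEGAL_SUFFIXES`, `BLOCKED_DOMAINS`, the 0.6 threshold, and the candidate URLs are all hypothetical.

```python
import re
from difflib import SequenceMatcher
from urllib.parse import urlparse

# Hypothetical lists: legal-form tokens to ignore and hosts that are never
# a company's own website. Both would need tuning for real data.
LEGAL_SUFFIXES = {"srl", "spa", "snc", "sas", "societa"}
BLOCKED_DOMAINS = {"facebook.com", "linkedin.com", "wikipedia.org"}


def normalize(name):
    """Lowercase the name, drop punctuation and legal-form tokens."""
    cleaned = re.sub(r"[.']", "", name.lower())  # "S.R.L." -> "srl"
    tokens = re.findall(r"[a-z0-9]+", cleaned)
    return "".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def domain_label(url):
    """Return (host, label), where label is the second-to-last host part.

    Simplification: this mishandles multi-part suffixes such as ".co.uk".
    URLs are assumed to include a scheme (e.g. "https://").
    """
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    label = parts[-2] if len(parts) >= 2 else host
    return host, re.sub(r"[^a-z0-9]", "", label)


def best_match(company, urls, threshold=0.6):
    """Pick the URL whose domain label is most similar to the company name."""
    target = normalize(company)
    if not target:
        return None
    best_url, best_score = None, threshold
    for url in urls:
        host, label = domain_label(url)
        if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
            continue  # skip social networks and directories
        score = SequenceMatcher(None, target, label).ratio()
        if score >= best_score:
            best_url, best_score = url, score
    return best_url


# Made-up candidate URLs, for illustration only.
candidates = [
    "https://www.facebook.com/agricolaamc",
    "https://www.agricola-amc.it/",
    "https://www.paginegialle.it/amc",
]
print(best_match("SOCIETA' AGRICOLA AMC S.R.L.", candidates))
```

Comparing against the domain label alone is a deliberately simple heuristic; a real matcher would likely also look at page titles or addresses to reduce false positives.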

🤖 Key Features

Fully automated search and data gathering

Intelligent matching algorithm to reduce false positives

Easy CSV input/output

Modular and customizable architecture

🚀 Ideal for:

Market research

Lead generation

Business intelligence

Sourcing and supplier discovery

The input CSV must follow a strict format, with ; as the delimiter. Each line represents one company and should look like:

76;SEDE;SOCIETA' AGRICOLA AMC S.R.L.;VIA GIUSEPPE GARIBALDI 12; ;33070;CANEVA - PN; ;

With the following header:

N:;SEDE;Nome legale;Via;CAP;Comune;Frazione;;Sito web

If you're using an official registry file (e.g. legally purchased from a chamber of commerce), just specify the correct CSV file using the optional input parameter; the tool will handle the rest.
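For reference, a file in this format can be read and written with Python's standard `csv` module. The sketch below is not taken from the tool's source. It uses the header and sample row shown above, and it assumes that a found site is written into the `Sito web` column; the helper names `read_companies` and `write_companies` and the example URL are hypothetical.

```python
import csv
import io

# Header and sample row as documented above.
SAMPLE = (
    "N:;SEDE;Nome legale;Via;CAP;Comune;Frazione;;Sito web\n"
    "76;SEDE;SOCIETA' AGRICOLA AMC S.R.L.;VIA GIUSEPPE GARIBALDI 12; ;33070;CANEVA - PN; ;\n"
)


def read_companies(stream):
    """Parse a ';'-delimited file into (fieldnames, list of row dicts)."""
    reader = csv.DictReader(stream, delimiter=";")
    rows = [
        # Strip padding such as the single-space placeholder fields.
        {key: (value or "").strip() for key, value in row.items() if key is not None}
        for row in reader
    ]
    return reader.fieldnames, rows


def write_companies(fieldnames, rows, stream):
    """Write the rows back out using the same ';'-delimited layout."""
    writer = csv.DictWriter(
        stream, fieldnames=fieldnames, delimiter=";", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


fieldnames, rows = read_companies(io.StringIO(SAMPLE))
print(rows[0]["Nome legale"])

# Assumption: the result of the search goes into the "Sito web" column.
rows[0]["Sito web"] = "https://example.com"  # placeholder URL
out = io.StringIO()
write_companies(fieldnames, rows, out)
print(out.getvalue())
```

To work with a real file instead of the in-memory sample, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to the same functions; the encoding is an assumption and may differ for registry exports.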
