This tool uses AI to extract specific information about companies or other entities from the web. It combines web search with AI processing to gather and analyze information about each entity you specify.
- Multiple data source support (CSV, Google Sheets)
- Configurable search queries
- AI-powered information extraction
- Batch processing capability
- Export results to CSV or Google Sheets
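As a rough illustration of the CSV input and export features listed above, the sketch below reads entity names from one column and writes extracted results back out as CSV. It uses only the standard library. The function names `read_entities` and `write_results`, and the `company` and `email` columns, are hypothetical and are not taken from this project's code.

```python
import csv
import io


def read_entities(csv_text, column):
    """Return the non-empty values of one column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    entities = []
    for row in reader:
        # Short rows yield None for missing fields, so fall back to "".
        value = (row.get(column) or "").strip()
        if value:
            entities.append(value)
    return entities


def write_results(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    names = read_entities("company\nAcme\nGlobex\n", "company")
    results = [{"company": name, "email": ""} for name in names]
    print(write_results(results, ["company", "email"]))
```

Working on text rather than file paths keeps the sketch easy to adapt to an uploaded file in a web UI, but the actual application may handle input differently.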
- Clone the repository
- Install dependencies:
pip install -r requirements.txt
- Create a .env file with your API keys:
  SERPAPI_KEY=your_serp_api_key
  GROQ_API_KEY=your_groq_api_key
- For Google Sheets integration:
  - Place your credentials.json file in the project root
  - Run the application to authenticate and generate token.json
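To check that the .env file from the setup steps is complete before launching the app, something like the following could be used. This is a minimal standard-library sketch, not the project's own loader (which may well rely on a package such as python-dotenv). The helper names `parse_env_file` and `missing_keys` are hypothetical; only the two key names come from the setup instructions.

```python
from pathlib import Path

# Key names taken from the setup instructions above.
REQUIRED_KEYS = ("SERPAPI_KEY", "GROQ_API_KEY")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found in the current directory")
    else:
        missing = missing_keys(parse_env_file(env_path))
        print("Missing keys:", ", ".join(missing) if missing else "none")
```

Running it from the project root reports which of the two keys still need to be filled in.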
Run the application:
streamlit run src/app.py
Follow the UI prompts to:
- Select your data source
- Configure your queries
- Start the extraction process
- Download or export results
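The query configuration and batch processing steps above can be pictured roughly as follows. This is a sketch under stated assumptions: the `{entity}` placeholder syntax, the function names `build_queries` and `batches`, and the batch size are all hypothetical, and the real application may template and schedule its queries differently.

```python
def build_queries(template, entities, placeholder="{entity}"):
    """Fill the placeholder in a query template once per entity."""
    return [template.replace(placeholder, entity) for entity in entities]


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    queries = build_queries(
        "Find the email address of {entity}", ["Acme", "Globex", "Initech"]
    )
    for group in batches(queries, 2):
        # Each group would be sent to the search and extraction step together.
        print(group)
```

Processing in small groups is one way to stay within API rate limits, since each query costs at least one search call and one model call.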
project_root/
├── requirements.txt   # Project dependencies
├── README.md          # Project documentation
└── src/
    ├── app.py         # Main Streamlit application
    ├── config.py      # Configuration and settings
    └── ...            # Other modules