An AI-powered tool that finds and validates LinkedIn profiles for a given persona, using a hybrid scoring approach.
- Generate a persona from social media profiles (GitHub, Twitter)
- Enrich persona data with People Data Labs API for comprehensive professional details
- Further enhance persona with AI using Google's Gemini model
- Generate optimized LinkedIn search queries based on persona details
- Find and score potential LinkedIn profile matches
- Validate matches with image similarity using CLIP
- Complete profile scoring with weighted matching (name, semantic, location, etc.)
- Streamlit UI for easy interaction
- Persona Creation: Start with basic information about the person you're looking for (name, social media profiles, professional details)
- Profile Enrichment: Scrape social profiles to gather more details about the person
- Professional Data Enrichment: Use People Data Labs API to fill in professional details, skills, and background
- AI Enhancement: Use Gemini AI to infer additional professional details, skills, and other attributes
- Search Query Generation: Create optimized search queries for LinkedIn
- LinkedIn Search: Find potential matching profiles using SerpAPI
- Profile Scoring: Score each candidate with a hybrid approach:
  - Name similarity using fuzzy matching
  - Semantic similarity of professional descriptions using BERT
  - Industry and location matching
  - Social profile validation
  - Image similarity using CLIP (optional)
- Final Ranking: Combine all scores for a final confidence score
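
The name-similarity step of the scoring pipeline could look roughly like the sketch below. It uses the standard-library `difflib.SequenceMatcher` as a stand-in for whatever fuzzy matcher the project actually uses; the helper names `normalize` and `name_similarity` and the normalization rules are assumptions for illustration, not the code in `core/name_utils.py` or `core/profile_scoring.py`.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and sort tokens so word order does not matter."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two names (hypothetical helper)."""
    na, nb = normalize(a), normalize(b)
    if not na or not nb:
        return 0.0  # An empty name cannot be matched against anything
    return SequenceMatcher(None, na, nb).ratio()
```

With this normalization, `name_similarity("Jane Doe", "doe, jane")` treats the two spellings as identical, while unrelated names score well below 1. A dedicated fuzzy-matching library may handle nicknames and transliterations better than this character-level ratio.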
- Python 3.8+
- Required API keys:
  - SERPAPI_API_KEY: For LinkedIn searches
  - PEOPLE_API_KEY: For professional data enrichment (People Data Labs)
  - GEMINI_API_KEY: For AI enrichment (Google Gemini)
  - SCRAPINGDOG_API_KEY: For LinkedIn profile image extraction (optional)
  - TWITTER_BEARER_TOKEN: For Twitter profile scraping (optional)
- Clone the repository
git clone https://github.com/yourusername/linkedin-profile-finder.git
cd linkedin-profile-finder
- Install dependencies
pip install -r requirements.txt
- Create a .env file with your API keys
SERPAPI_API_KEY=your_serpapi_api_key_here
PEOPLE_API_KEY=your_people_data_labs_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
SCRAPINGDOG_API_KEY=your_scrapingdog_api_key_here
TWITTER_BEARER_TOKEN=your_twitter_bearer_token_here
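
Before running the tool, it can help to confirm that the required keys are actually set. The sketch below is an illustration only: the `missing_keys` helper is hypothetical, and it assumes the values from the .env file have already been loaded into the process environment (for example by a loader such as python-dotenv). The split into required and optional keys follows the list of API keys given earlier.

```python
import os

# Keys listed as required; the optional ones are not checked here.
REQUIRED_KEYS = ["SERPAPI_API_KEY", "PEOPLE_API_KEY", "GEMINI_API_KEY"]
OPTIONAL_KEYS = ["SCRAPINGDOG_API_KEY", "TWITTER_BEARER_TOKEN"]


def missing_keys(env=None):
    """Return the required keys that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing required API keys: " + ", ".join(missing))
    else:
        print("All required API keys are set.")
```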
Launch the Streamlit UI:
streamlit run app.py
Or run the command-line demo:
python main.py
The profile scoring system uses a hybrid approach that considers:
Score Type | Weight | Description |
---|---|---|
Name Score | 35% | Fuzzy matching of names, accounting for variations |
Semantic Score | 25% | BERT-based similarity of professional introductions |
Industry Score | 10% | Matching of industry and professional domain |
Location Score | 15% | Geographic proximity and timezone alignment |
Social Score | 10% | Validation through social media profiles |
Image Score | 5% | Visual similarity of profile photos (using CLIP) |
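
The weights in the table can be combined into a single confidence score as a weighted average. The following is a minimal sketch, not the implementation in `core/profile_scoring.py`: the `combine_scores` function and the dictionary keys are hypothetical, each component score is assumed to lie in [0, 1], and the choice to renormalize the weights when a component (such as the optional image score) is missing is an assumption.

```python
# Weights taken from the table above; they sum to 1.0.
WEIGHTS = {
    "name": 0.35,
    "semantic": 0.25,
    "industry": 0.10,
    "location": 0.15,
    "social": 0.10,
    "image": 0.05,
}


def combine_scores(scores):
    """Weighted average of the available component scores, each in [0, 1].

    Components that are absent or None are skipped and the remaining
    weights are renormalized (an assumption made for this sketch).
    """
    available = {k: v for k, v in scores.items() if k in WEIGHTS and v is not None}
    total_weight = sum(WEIGHTS[k] for k in available)
    if total_weight == 0:
        return 0.0
    return sum(WEIGHTS[k] * v for k, v in available.items()) / total_weight
```

For example, a candidate with a perfect name match and no semantic overlap, and no other signals available, would score 0.35 / 0.60 under this scheme. Without renormalization the same candidate would score 0.35, so which behavior is preferable depends on how missing signals should be penalized.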
- Recruiting: Find potential candidates matching a specific profile
- Business Development: Locate decision-makers at target companies
- Research: Find professionals in specific domains
- Networking: Locate colleagues or contacts with limited information
- main.py: Entry-point for command-line usage
- app.py: Streamlit web app
- core/profile_scraper.py: Social media profile scraping
- api/people_api.py: Professional data enrichment with People Data Labs
- api/gemini_api.py: AI enrichment with Gemini
- core/image_similarity.py: Handles lightweight perceptual hash-based image comparison
- core/name_utils.py: Name parsing and expansion
- core/search.py: LinkedIn search query generation
- core/profile_scoring.py: Candidate matching and scoring functions
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational and research purposes only. Always respect LinkedIn's terms of service and privacy policies when using this tool. The creators are not responsible for any misuse of this tool.