AI Agentic-RAG Job Matching System

An AI-powered job matching system that combines Retrieval-Augmented Generation (RAG) with an agentic workflow to match candidate resumes with suitable job positions. The system uses vector embeddings to find relevant jobs and explains why each position is a good fit.

📋 Features

- Smart Skill Extraction: Automatically identifies key skills from candidate resumes
- Hybrid Vector Search: Combines dense and sparse vector search for more accurate job matching
- Match Explanation: Explains why specific jobs match a candidate's profile
- User-Friendly Interface: Simple web UI for uploading resumes and viewing results
- PDF and Text Support: Accepts resumes as PDF or plain-text files
- Persistence: Saves matching results as Markdown files for future reference

🏗️ System Architecture

The system is built on a multi-agent architecture using the CrewAI framework. Think of it as a team of specialized experts working together:

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │
│  Skill Extractor├────►│  Job Searcher   ├────►│ Match Explainer │
│     Agent       │     │     Agent       │     │     Agent       │
│                 │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
        │                       │                       │
        ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │
│  Extract skills │     │ Search Milvus   │     │ Explain matches │
│    from resume  │     │ vector database │     │ with reasoning  │
│                 │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                              │
                              ▼
                        ┌─────────────────┐
                        │                 │
                        │  Milvus Vector  │
                        │    Database     │
                        │                 │
                        └─────────────────┘
```

How It Works

Resume Processing:
- The system extracts text from a resume (PDF or text)
- A specialized LLM agent identifies 5-10 key skills
Vector Search:
- The extracted skills drive a hybrid search in Milvus
- The search combines dense semantic embeddings (E5-large) with sparse BM25 matching
- The TOP_K most similar job positions are returned
Match Explanation:
- Another LLM agent analyzes the matches and resume
- Generates human-readable explanations of why each job is a good fit
- Highlights specific skills and experiences that match job requirements

This architecture follows a domain-driven design approach where each agent has a clearly defined role and responsibility, communicating through well-defined interfaces.
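
The hand-off between the three agents can be sketched in plain Python. This is only an illustration of the data flow, not the CrewAI code in main.py: `run_pipeline`, `MatchResult`, and the three `fake_*` stages are hypothetical names, and the stand-in stages use simple string matching instead of an LLM or Milvus.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MatchResult:
    job_title: str
    explanation: str


def run_pipeline(
    resume_text: str,
    extract_skills: Callable[[str], list[str]],
    search_jobs: Callable[[list[str]], list[dict]],
    explain_match: Callable[[str, dict], str],
) -> list[MatchResult]:
    """Run the three stages in order, passing each output to the next stage."""
    skills = extract_skills(resume_text)
    jobs = search_jobs(skills)
    return [MatchResult(job["title"], explain_match(resume_text, job)) for job in jobs]


# Stand-in stages (assumptions) so the sketch runs without an LLM or Milvus.
def fake_extract(text: str) -> list[str]:
    known_skills = ["python", "sql", "docker"]
    return [skill for skill in known_skills if skill in text.lower()]


def fake_search(skills: list[str]) -> list[dict]:
    jobs = [
        {"title": "Data Engineer", "required_skills": "python, sql"},
        {"title": "Designer", "required_skills": "figma"},
    ]
    return [job for job in jobs if any(s in job["required_skills"] for s in skills)]


def fake_explain(resume_text: str, job: dict) -> str:
    return f"Resume mentions skills listed for {job['title']}."


results = run_pipeline("Experienced in Python and SQL.", fake_extract, fake_search, fake_explain)
for result in results:
    print(result.job_title, "-", result.explanation)
```

Because each stage is passed in as a function, any one of them can be swapped (for example, a different search backend) without touching the other two.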

🚀 Installation

Prerequisites

- Python 3.11 or higher
- Docker (for running Milvus)
- OpenAI API key

Setup

Clone the repository:

```
git clone https://github.com/yourusername/agentic-rag.git
cd agentic-rag
```

Create a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies:
```
pip install -r requirements.txt
```

Start Milvus with Docker:

```
docker run -d --name milvus-standalone -p 19530:19530 -p 9091:9091 milvusdb/milvus:v2.3.3 standalone
```

If the container exits right after starting, the image likely needs an explicit start command and storage settings; in that case, follow the standalone Docker instructions in the Milvus documentation for v2.3.3.

Configure environment variables: Create a .env file in the project root with:

```
OPENAI_API_KEY=your_openai_api_key
MILVUS_COLLECTION=job_positions
TOP_K=3
LOG_MISC=DEBUG
```
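
As a rough sketch of how these variables might be read and validated at startup, the snippet below parses them from a mapping. `Settings` and `load_settings` are hypothetical names, the defaults simply mirror the example values above, and the project itself may load the .env file differently (for instance through a dotenv library).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    milvus_collection: str
    top_k: int


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is not set")
    # Assumed defaults, taken from the example .env values.
    top_k = int(env.get("TOP_K", "3"))
    if top_k < 1:
        raise ValueError("TOP_K must be a positive integer")
    return Settings(api_key, env.get("MILVUS_COLLECTION", "job_positions"), top_k)


settings = load_settings({"OPENAI_API_KEY": "sk-example", "TOP_K": "5"})
print(settings.milvus_collection, settings.top_k)
```

Failing early on a missing key or a non-positive TOP_K tends to give a clearer error than a failure deep inside the agent workflow.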

Initialize the job database:
```
python job_positions_ingestion.py
```

📊 Usage

Running the Web Interface

Start the Gradio web application:

```
python gradio_app.py
```

Then open your browser to http://localhost:7860 (or the URL shown in the terminal).

Using the Web Interface

- Upload a resume: Either paste the text or upload a PDF/text file
- Start the matching process: Click "Find Matching Jobs"
- View results: The system will display matching jobs with explanations
- Save results: Click "Save Results to Markdown" to persist the results
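
To illustrate what saving results to Markdown could involve, here is a minimal sketch. The function name `save_results_markdown`, the file naming scheme, and the `title`/`explanation` keys are assumptions for illustration; the actual output format of gradio_app.py may differ.

```python
from datetime import datetime
from pathlib import Path


def save_results_markdown(matches, output_dir="results", now=None):
    """Write one Markdown section per match and return the file path."""
    now = now or datetime.now()
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: a timestamp keeps earlier runs from being overwritten.
    path = out_dir / f"job_matches_{now:%Y%m%d_%H%M%S}.md"
    lines = ["# Job Matches", ""]
    for index, match in enumerate(matches, start=1):
        lines += [f"## {index}. {match['title']}", "", match["explanation"], ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

Calling `save_results_markdown([{"title": "Data Engineer", "explanation": "Strong Python and SQL overlap."}])` would create a timestamped file under results/.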

Command-Line Usage

For batch processing or testing, you can use the command-line interface:

```
python main.py sample_resumes/Alex_Chen_Resume.md
```
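
Since resumes may arrive as PDF or text files, a loader for the path given on the command line might look like the sketch below. `load_resume_text` is a hypothetical helper, not necessarily what main.py does; it assumes PyMuPDF (imported as `fitz`) is installed for the PDF branch and that text files are UTF-8.

```python
from pathlib import Path


def load_resume_text(path: str) -> str:
    """Return the plain text of a resume stored as PDF or as a text/Markdown file."""
    file = Path(path)
    if file.suffix.lower() == ".pdf":
        # PyMuPDF is imported lazily so text resumes work without it installed.
        import fitz

        with fitz.open(str(file)) as doc:
            return "\n".join(page.get_text() for page in doc)
    # Anything else is treated as UTF-8 text (an assumption).
    return file.read_text(encoding="utf-8")
```

For example, `load_resume_text("sample_resumes/Alex_Chen_Resume.md")` would return the Markdown source of the sample resume unchanged.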

📁 Project Structure

```
agentic-rag/
├── media/                    # Screenshot and demo files
├── results/                  # Output files from job matching
├── sample_resumes/           # Example resumes for testing
├── .gitignore                # Git ignore file
├── gradio_app.py             # Web interface using Gradio
├── job_positions_ingestion.py # Script to load job data into Milvus
├── job_positions.json        # Sample job position data
├── Logger.py                 # Logging utility
├── main.py                   # Core job matching logic with CrewAI
├── MilvusClient.py           # Interface to Milvus vector database
├── sandbox.ipynb             # Jupyter notebook for experimentation
└── Singleton.py              # Singleton pattern implementation
```

🔧 Technologies Used

- CrewAI: Framework for creating and orchestrating AI agents
- Milvus: Vector database for efficient similarity search
- Gradio: Web interface for user interaction
- OpenAI API: Powers the language understanding capabilities
- PyMuPDF: PDF processing library
- E5-large: Text embedding model for dense vector search

💡 How to Extend the System

Adding New Job Positions

Add your job positions to job_positions.json following the existing format:

```
{
  "job_id": "YOUR-ID-001",
  "title": "Your Job Title",
  "company_name": "Your Company",
  "location": "Location (Remote/Hybrid/On-site)",
  "job_type": "Full-time/Part-time",
  "salary_range": "$XX,XXX - $YY,YYY",
  "required_skills": "Comma-separated list of skills",
  "text": "Detailed job description..."
}
```

Run the ingestion script to update the database:
```
python job_positions_ingestion.py
```
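
Before running the ingestion script, it can help to check new records against the format shown above. The sketch below is one way to do that; `validate_jobs` is a hypothetical helper, and it assumes that job_positions.json holds a JSON array of such objects, that all eight fields are required, and that `job_id` values should be unique.

```python
import json

REQUIRED_FIELDS = (
    "job_id", "title", "company_name", "location",
    "job_type", "salary_range", "required_skills", "text",
)


def validate_jobs(jobs):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    seen_ids = set()
    for index, job in enumerate(jobs):
        missing = [f for f in REQUIRED_FIELDS if not str(job.get(f, "")).strip()]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
        job_id = job.get("job_id")
        if job_id:
            if job_id in seen_ids:
                problems.append(f"record {index}: duplicate job_id {job_id!r}")
            seen_ids.add(job_id)
    return problems


sample = json.loads("""
[
  {"job_id": "DE-001", "title": "Data Engineer", "company_name": "Acme",
   "location": "Remote", "job_type": "Full-time", "salary_range": "$1 - $2",
   "required_skills": "Python, SQL", "text": "Build pipelines."},
  {"job_id": "DE-001", "title": "Analyst"}
]
""")
for problem in validate_jobs(sample):
    print(problem)
```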

Customizing the Matching Logic

The main agent workflow is defined in main.py. You can modify:

- The number of results returned by changing TOP_K in the .env file
- The weighting between dense and sparse search by updating the weights in the MilvusJobSearchTool
- The agent prompts and task descriptions to emphasize different aspects of matching
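
To see what changing the dense/sparse weighting does to a ranking, consider this small sketch of weighted score fusion. It is an illustration only: `weighted_fusion` is a hypothetical function, the scores are made up, and min-max normalization is one possible choice that may not match how Milvus or MilvusJobSearchTool combines scores internally.

```python
def weighted_fusion(dense, sparse, dense_weight=0.7, sparse_weight=0.3, top_k=3):
    """Combine two {job_id: score} maps into one ranked list of (job_id, score)."""

    def normalize(scores):
        # Min-max scale to [0, 1] so dense and sparse scores are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        if hi == lo:
            return {job_id: 1.0 for job_id in scores}
        return {job_id: (s - lo) / (hi - lo) for job_id, s in scores.items()}

    d, s = normalize(dense), normalize(sparse)
    combined = {
        job_id: dense_weight * d.get(job_id, 0.0) + sparse_weight * s.get(job_id, 0.0)
        for job_id in d.keys() | s.keys()
    }
    # Highest score first; ties broken by job_id for a stable order.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Made-up scores: cosine-like dense scores and BM25-like sparse scores.
dense_scores = {"A": 0.9, "B": 0.8, "C": 0.2}
sparse_scores = {"B": 12.0, "C": 9.0, "D": 3.0}

print([job_id for job_id, _ in weighted_fusion(dense_scores, sparse_scores)])
print([job_id for job_id, _ in weighted_fusion(dense_scores, sparse_scores, 0.2, 0.8)])
```

With the default weights the semantically close job A outranks C, while shifting weight toward the sparse scores moves the keyword-heavy job C ahead of A.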

Working with Custom Vector Embeddings

By default, the system uses the E5-large embedding model. To use a different model:

- Update the embeddings initialization in MilvusClient.py
- Adjust the vector dimensions in the schema creation if the new model's output size differs
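
When swapping models, two details are easy to miss: E5-family models are generally used with "query: " and "passage: " text prefixes, and the vector size must match the collection schema (E5-large produces 1024-dimensional vectors). The sketch below shows a guard for both; `embed_checked` and `toy_embed` are hypothetical, and the toy embedder stands in for a real model so the example runs on its own.

```python
from typing import Callable, Sequence

# Any function that maps a batch of texts to a batch of vectors.
Embedder = Callable[[Sequence[str]], list[list[float]]]


def embed_checked(texts, embed: Embedder, expected_dim: int, prefix: str = ""):
    """Embed texts with an optional prefix and verify the vector size."""
    vectors = embed([prefix + text for text in texts])
    for vector in vectors:
        if len(vector) != expected_dim:
            raise ValueError(
                f"model returned {len(vector)}-dimensional vectors, "
                f"but the collection schema expects {expected_dim}"
            )
    return vectors


def toy_embed(texts):
    # Stand-in 4-dimensional "embedding" (an assumption, not a real model).
    return [[float(len(t)), float(t.count(" ")), 1.0, 0.0] for t in texts]


vectors = embed_checked(["python developer"], toy_embed, expected_dim=4, prefix="query: ")
print(len(vectors[0]))
```

A dimension mismatch caught here is usually easier to diagnose than an insert error raised later by the vector database.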

🔮 Future Improvements

- Add user authentication for saving personalized results
- Implement feedback mechanisms to improve matching quality
- Support scraping job postings from popular job sites
- Add advanced filtering options (location, salary range, etc.)
- Provide a candidate skills gap analysis with suggestions for improvement

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🧩 Conceptual Foundations

To understand this system, consider these key analogies:

- Vector Embeddings: Think of skills as points in a high-dimensional space. Similar skills cluster together, while different skills are far apart. This allows us to find job positions "nearby" in this conceptual space.
- Hybrid Search: Like using both a map (sparse/keyword matching) and a compass (dense/semantic direction) to navigate. The combination gives more accurate results than either method alone.
- Agent Workflow: Similar to an assembly line in a factory, where each specialized worker performs their specific task and passes the result to the next worker.

By combining these powerful concepts, we create a system that understands both the explicit and implicit connections between candidate skills and job requirements.
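
The "nearby points" idea behind vector embeddings can be made concrete with cosine similarity. The three-dimensional vectors below are invented for illustration (real E5-large embeddings have 1024 dimensions), and `cosine_similarity` is a plain-Python sketch rather than the similarity routine Milvus uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Invented toy embeddings: two technical skills and one unrelated skill.
skills = {
    "python": [0.9, 0.1, 0.0],
    "pandas": [0.8, 0.3, 0.1],
    "negotiation": [0.0, 0.2, 0.9],
}

related = cosine_similarity(skills["python"], skills["pandas"])
unrelated = cosine_similarity(skills["python"], skills["negotiation"])
print(f"python vs pandas: {related:.2f}")
print(f"python vs negotiation: {unrelated:.2f}")
```

The related pair scores much closer to 1.0 than the unrelated pair, which is the property dense search relies on when ranking job positions.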
