This project demonstrates a simple Retrieval-Augmented Generation (RAG) pipeline using LangChain, FAISS, and a Groq-hosted LLM. It matches a candidate's resume against a job description to answer questions such as whether the candidate is suitable for the role.
- Loads and processes both a resume (PDF) and a job description (TXT)
- Splits documents into manageable chunks for efficient retrieval
- Embeds documents using HuggingFace sentence transformers
- Stores embeddings in a FAISS vector database (the indexing steps are sketched after this list)
- Uses Groq's Llama3-70B model for question answering over the retrieved context
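As a rough illustration of the indexing steps above, here is a minimal sketch, not the actual app.py: the loader classes, chunk sizes, and embedding model name are assumptions based on common LangChain usage.

```python
# Minimal indexing sketch — assumed APIs from langchain-community /
# langchain-huggingface; chunk sizes and model name are illustrative.
from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the resume (PDF) and the job description (TXT)
docs = PyPDFLoader("docs/franco_resume_en.pdf").load()
docs += TextLoader("JD/01_falabella_product_owner.txt").load()

# Split into overlapping chunks so retrieval can return focused passages
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# Embed each chunk and index the vectors in FAISS
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embeddings)
```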
Project structure:

```
.
├── app.py
├── .env
├── requirements.in
├── docs/
│   └── franco_resume_en.pdf
├── JD/
│   └── 01_falabella_product_owner.txt
└── README.md
```
Install the dependencies:

```bash
uv pip install -r requirements.in
```
Create a `.env` file with your API keys (e.g., for Groq).
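For example, a loading sketch assuming python-dotenv is available and that the app reads `GROQ_API_KEY` (the environment variable the `langchain-groq` client looks for by default; check app.py for the exact names it expects):

```python
# Sketch: load .env before creating the Groq client.
# GROQ_API_KEY is an assumption — verify which variable names app.py reads.
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the environment
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is missing from .env")
```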
Then run the app:

```bash
python app.py
```
- The script loads both the resume and job description.
- Documents are split into overlapping chunks.
- Chunks are embedded and indexed in FAISS.
- A question is asked (e.g., "¿puede Franco Cedillo postular al puesto de Product Owner en Falabella?" — "can Franco Cedillo apply for the Product Owner position at Falabella?").
- The system retrieves the relevant chunks and uses the LLM to generate an answer (sketched below).
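Continuing from the indexing sketch above, the retrieval-and-answer step could look like the following; the `k` value, model ID, and single-prompt chain style are illustrative assumptions, not code copied from app.py.

```python
# Sketch: retrieve relevant chunks, then answer with Groq's Llama3-70B.
# `vectorstore` is the FAISS index built in the indexing sketch above.
from langchain_groq import ChatGroq

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
question = "¿puede Franco Cedillo postular al puesto de Product Owner en Falabella?"
context = "\n\n".join(doc.page_content for doc in retriever.invoke(question))

llm = ChatGroq(model="llama3-70b-8192")
answer = llm.invoke(
    f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```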
- Replace `docs/franco_resume_en.pdf` with your own resume.
- Replace `JD/01_falabella_product_owner.txt` with any job description.
- Modify the query in `app.py` to ask different questions (see the sketch after this list).
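These customization points typically reduce to a few constants near the top of the script; the names below are hypothetical, since the actual variables in app.py may differ:

```python
# Hypothetical configuration constants — the real names in app.py may differ.
RESUME_PATH = "docs/franco_resume_en.pdf"      # point at your own resume
JD_PATH = "JD/01_falabella_product_owner.txt"  # point at any job description
QUERY = "Does the candidate meet the role's must-have requirements?"
```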
See `requirements.in` for the full list of dependencies.
MIT License
Inspired by modern RAG pipelines for HR and recruitment use cases.