Automated AI-Powered Resume Ranking & Analysis Tool
- Overview
- Features
- Project Structure
- Getting Started
- Project Roadmap
- Contributing
- License
- Acknowledgments
ResumeAnalyser is an automated resume ranking and analysis application designed for recruiters and hiring managers. It accepts a ZIP file containing multiple PDF resumes and an optional job description text file, then processes the resumes with NLP techniques and scoring algorithms to rank candidates on three criteria:
- Technical & Managerial Skills: Evaluated using years of experience, education level, and keyword-based skills extraction.
- Resume Quality: Assessed via spell-check ratios, section identification, and overall brevity.
- Job Matching: Uses TF-IDF similarity measures to compare resumes against the provided job description.
The final output is a ranked list of resumes with scores and downloadable results for further review.
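To make the job-matching step concrete, here is a minimal sketch of TF-IDF cosine similarity between a job description and a set of resume texts. It is not the code in model.py: the project lists scikit-learn among its dependencies, while this sketch uses only the standard library so the arithmetic stays visible. The function names (`tokenize`, `tfidf_vectors`, `job_match_scores`) and the sample texts are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document using a smoothed IDF."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def job_match_scores(job_description, resumes):
    """Score each resume text against the job description (0.0 to 1.0)."""
    vectors = tfidf_vectors([job_description] + list(resumes))
    job_vec, resume_vecs = vectors[0], vectors[1:]
    return [cosine_similarity(job_vec, vec) for vec in resume_vecs]


# Illustrative inputs only.
job = "Python developer with NLP and machine learning experience"
resumes = [
    "Experienced Python engineer, built NLP pipelines and machine learning models",
    "Sales manager with ten years of retail experience",
]
scores = job_match_scores(job, resumes)
```

With these sample texts the first resume shares more weighted terms with the job description, so it receives the higher score. A real pipeline would typically lemmatize and remove stopwords before vectorizing.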
- Interactive UI with Streamlit: Upload resumes and job descriptions, view processing status, and download results.
- PDF Text Extraction: Converts PDF resumes into text using libraries such as PyPDF2 and PyMuPDF.
- NLP-Powered Analysis:
  - Preprocessing: Tokenization and lemmatization using spaCy.
  - Skill Extraction: Identifies both general and technical skills.
  - Experience & Education: Extracts years of experience and detects education level.
  - Resume Quality Metrics: Spell-check, section identification, and brevity evaluation.
- Scoring and Ranking: Combines technical, managerial, and overall quality scores with job match scores (via TF-IDF similarity) to generate final rankings.
- Downloadable Results:
  - CSV file containing all calculated scores.
  - Top 3 ranked resumes available as downloadable PDFs with an embedded viewer.
- Default Job Skills Management: Automatically loads or creates a default job skills file (job_skills.json) for reference during analysis.
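The load-or-create behaviour for job_skills.json and the keyword-based skill extraction listed above can be sketched as follows. The schema (a mapping from category to a list of skill phrases), the default skills, and the function names are all assumptions; the actual file written by the project may look different.

```python
import json
import re
import tempfile
from pathlib import Path

# Hypothetical defaults; the real job_skills.json may use a different schema.
DEFAULT_JOB_SKILLS = {
    "technical": ["python", "sql", "machine learning"],
    "managerial": ["leadership", "budgeting", "stakeholder management"],
}


def load_or_create_job_skills(path):
    """Load the skills file, writing the defaults first if it is missing."""
    path = Path(path)
    if not path.exists():
        path.write_text(json.dumps(DEFAULT_JOB_SKILLS, indent=2), encoding="utf-8")
    return json.loads(path.read_text(encoding="utf-8"))


def extract_skills(resume_text, job_skills):
    """Return, per category, the listed skills that appear in the text."""
    text = resume_text.lower()
    found = {}
    for category, skills in job_skills.items():
        found[category] = [
            skill for skill in skills
            # Boundary checks stop "sql" from matching inside "mysqldump".
            if re.search(r"(?<!\w)" + re.escape(skill.lower()) + r"(?!\w)", text)
        ]
    return found


# Demonstrate against a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    skills_path = Path(tmp_dir) / "job_skills.json"
    job_skills = load_or_create_job_skills(skills_path)  # creates the file
    file_was_created = skills_path.exists()

matches = extract_skills(
    "Led a team (leadership) building Python and SQL data pipelines.",
    job_skills,
)
```

Plain keyword matching misses synonyms and inflected forms, which is one reason a lemmatizing NLP step is useful before this lookup.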
Below is a sample layout of the project's folder structure:
└── ResumeAnalyser/
    ├── README.md
    ├── model.py
    ├── app.py
    ├── output
    │   └── .gitkeep
    ├── .streamlit
    │   └── config.toml
    └── requirements.txt
- requirements.txt: Lists all project dependencies.
- model.py: Main logic for resume processing, scoring, and ranking.
- app.py: Streamlit application for interactive resume analysis.
- .streamlit: Configuration for the Streamlit environment (config.toml).
- output: Placeholder directory for generated output, kept in the repository via .gitkeep.
- Programming Language: Python 3.7 or higher
- Package Manager: pip
- Required Libraries: Streamlit, pandas, scikit-learn, spaCy, PyMuPDF, PyPDF2, language_tool_python, TextBlob, and others listed in requirements.txt.
- Clone the Repository:
  git clone https://github.com/yourusername/ResumeAnalyser.git
- Navigate to the Project Directory:
  cd ResumeAnalyser
- Install Dependencies:
  pip install -r requirements.txt
  python -m spacy download en_core_web_sm
To launch the Resume Ranking Application, use the following command:
streamlit run app.py
This will start a local Streamlit server where you can:
- Upload a ZIP file containing PDF resumes.
- Optionally upload a job description (TXT file).
- View the analysis results, download the final ranked CSV, and retrieve the top ranked resume PDFs.
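As a rough illustration of how the ranked CSV and the top-ranked list could be produced, the sketch below combines per-resume component scores into a weighted final score, sorts the results, and serialises them with the standard csv module. The weights, column names, and sample scores are hypothetical; the project's actual weighting is not documented in this README.

```python
import csv
import io

# Hypothetical weights; the project's real weighting is not documented here.
WEIGHTS = {"technical": 0.35, "managerial": 0.20, "quality": 0.15, "job_match": 0.30}


def final_score(scores, weights=WEIGHTS):
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    return sum(weights[name] * scores[name] for name in weights)


def rank_resumes(candidates):
    """Return rows sorted by final score, highest first, with a rank column."""
    rows = [
        {"filename": name, **scores, "final": round(final_score(scores), 4)}
        for name, scores in candidates.items()
    ]
    rows.sort(key=lambda row: row["final"], reverse=True)
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
    return rows


def to_csv(rows):
    """Serialise ranked rows to CSV text; returns an empty string for no rows."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up component scores for two resumes.
candidates = {
    "alice.pdf": {"technical": 0.9, "managerial": 0.4, "quality": 0.8, "job_match": 0.7},
    "bob.pdf": {"technical": 0.5, "managerial": 0.9, "quality": 0.6, "job_match": 0.4},
}
ranked = rank_resumes(candidates)
csv_text = to_csv(ranked)
top_three = [row["filename"] for row in ranked[:3]]
```

In a Streamlit app, text like `csv_text` could be passed to `st.download_button` to offer the file for download. Keeping the weights in one dictionary makes it straightforward to experiment with different weightings later.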
- Task 1: Implement resume extraction and text conversion from PDF files.
- Task 2: Enhance scoring algorithms with additional metrics and dynamic weighting.
- Task 3: Integrate more robust job description parsing and candidate matching features.
- Task 4: Improve UI/UX and add more visualization options in the Streamlit app.
- 💬 Join the Discussions: Share insights, provide feedback, or ask questions.
- 🐛 Report Issues: Submit bugs or request features.
- 💡 Submit Pull Requests: Fork the repository, create a feature branch, and submit a PR.
Contributing Guidelines
- Fork the Repository: Fork the project to your account.
- Clone Locally: Clone your forked repository.
  git clone https://github.com/yourusername/ResumeAnalyser.git
- Create a New Branch:
  git checkout -b new-feature-x
- Make Your Changes: Develop and test your changes locally.
- Commit Your Changes:
  git commit -m "Implemented feature x."
- Push to Your Fork:
  git push origin new-feature-x
- Submit a Pull Request: Create a PR against the original repository with a clear description of your changes.
- Review: Once reviewed and approved, your changes will be merged.
- Improve the regex and pattern matching for extraction
- Enhance the NLP pipeline
- Host on AWS/Google Cloud
- Suggest improvements to resumes
- Generate cover letters
- Weight personal recommendations based on the content of recommendation letters
- Identifying appropriate parameters for resume ranking and assigning meaningful weightage to different sections
  Conducted thorough research on industry standards and consulted domain experts to determine relevant parameters such as skills, experience, and education.
- Integrating the frontend, particularly handling the import of archive files
  Utilized Python's zipfile module to extract and parse multiple resumes from archive files (ZIP format) and ensured seamless data transfer between the frontend and backend.
- Handling inconsistencies in resume structures and maintaining uniform data processing
  Implemented a robust text processing pipeline that included tokenization, lemmatization, and stopword removal to clean and standardize extracted text. Utilized libraries like nltk and spaCy to ensure uniform processing across different resume formats.
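The archive handling described above, using Python's zipfile module, might look roughly like the sketch below. It reads PDF entries from an uploaded ZIP entirely in memory, which avoids writing untrusted paths to disk. The function name `read_pdfs_from_zip` and the filtering rules are assumptions, not the project's actual code.

```python
import io
import zipfile


def read_pdfs_from_zip(zip_bytes):
    """Return {filename: raw PDF bytes} for every PDF entry in the archive."""
    pdfs = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            name = info.filename
            # Skip directories and macOS metadata entries.
            if info.is_dir() or name.startswith("__MACOSX/"):
                continue
            # Keep only files with a .pdf extension (case-insensitive).
            if not name.lower().endswith(".pdf"):
                continue
            pdfs[name] = archive.read(info)
    return pdfs


# Build a small in-memory archive to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("resumes/alice.pdf", b"%PDF-1.4 placeholder")
    archive.writestr("resumes/notes.txt", b"not a resume")
    archive.writestr("__MACOSX/resumes/._alice.pdf", b"metadata")

extracted = read_pdfs_from_zip(buffer.getvalue())
```

The bytes returned for each entry can then be handed to a PDF text extractor. In Streamlit, the uploaded file object exposes its contents as bytes via `getvalue()`, which would serve as `zip_bytes` here.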
This project is licensed under the MIT License. For more details, please refer to the LICENSE file.
- Developers:
  - Samudraneel Sarkar
    LinkedIn | GitHub | 📧 samudraneel05@gmail.com
  - Guransh Goyal
    LinkedIn | GitHub | 📧 guransh31goyal@gmail.com
- Inspiration & Contributions: Thanks to the open-source community for providing robust libraries (such as spaCy, scikit-learn, and Streamlit) that made this project possible.
- Other Resources: Special thanks to language processing libraries like language_tool_python and TextBlob, which help enhance the resume quality evaluation.
© 2025 P-125, Batch of 2027. All rights reserved.