The aim of this project is to extract and cluster information from free-format resumes so that it can later be used to match similar CVs.
The dataset is available on Kaggle.
- Repository: CV-data-extraction
- Duration: 10 days
- Deadline: 10.09.2021
- Information to extract:
  - Personal information: name, surname, email address, postal address, phone number
  - Education: year, institution name, study name
  - Previous job experience: year, title, organization name
  - Skills
- Python 3
- spaCy - NLP processing library
- Flair - NLP library
- Apache Tika - text extractor
- pandas - DataFrames and CSV reading
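
To make the "personal information" target above more concrete, here is a minimal sketch of how an email address and phone number might be pulled from resume text that has already been converted to plain text (for example by Apache Tika). It is not taken from the repository: the names `EMAIL_RE`, `PHONE_RE`, `ContactInfo` and `extract_contact_info` are hypothetical, and the patterns are simple assumptions that will miss some formats and can mistake other digit runs (such as year ranges) for phone numbers. Names, institutions and organizations would more plausibly be handled by the NER models in spaCy or Flair, which this sketch does not cover.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only; not the repository's own rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional "+", a digit, at least 7 digits/separators on the same line, a final digit.
PHONE_RE = re.compile(r"\+?\d[\d ()./-]{7,}\d")


@dataclass
class ContactInfo:
    """Container for the contact fields found in one resume."""
    emails: list = field(default_factory=list)
    phones: list = field(default_factory=list)


def extract_contact_info(text: str) -> ContactInfo:
    """Return every email-like and phone-like string found in the text."""
    emails = EMAIL_RE.findall(text)
    phones = [match.strip() for match in PHONE_RE.findall(text)]
    return ContactInfo(emails=emails, phones=phones)


# Made-up sample text standing in for the output of a text extractor.
sample = """Jane Doe
Email: jane.doe@example.com
Phone: +32 470 12 34 56
"""

info = extract_contact_info(sample)
print(info.emails)
print(info.phones)
```

Keeping the rule-based fields (email, phone) separate from the model-based ones (names, organizations) makes it easier to test each part on its own.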
git clone https://github.com/jejobueno/CV-data-extraction
cd CV-data-extraction
pip install -r requirements.txt
streamlit run app.py