Welcome to the ATS (Applicant Tracking System) NER competition! Participants will form teams of two and follow the four phases below.
- Form Teams: Two participants per team.
- Dataset: Download the resume dataset from Kaggle: https://www.kaggle.com/datasets/gauravduttakiit/resume-dataset (a command-line alternative is sketched after this list).
- Doccano Project:
  - Create a new project in Doccano with the Sequence Labeling project type.
  - Import the provided labels.json (8 entity tags) via Labels → Import.
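For the Dataset step above, the Kaggle CLI is a quick alternative to downloading through the browser. A minimal sketch, assuming you have a Kaggle API token saved at ~/.kaggle/kaggle.json:

```bash
pip install kaggle
kaggle datasets download -d gauravduttakiit/resume-dataset --unzip
```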

Phase 1: Annotation

- Objective: Split the dataset, label as many resumes as possible, and export your annotations.
- Steps:
  - In your Doccano project, upload your team's half of the dataset.
  - Annotate the resumes using these entity tags:
    [ {"text":"PERSON_NAME"}, {"text":"EMAIL"}, {"text":"PHONE"}, {"text":"DEGREE"}, {"text":"UNIVERSITY"}, {"text":"JOB_TITLE"}, {"text":"COMPANY"}, {"text":"SKILL"} ]
  - Export your annotated data in JSONL format when time is up (a sample record follows this phase).
- ⏱ Time limit: 30 minutes
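For reference, a Doccano sequence-labeling export contains one JSON object per line; depending on your Doccano version the span field may be named label (as below) or entities, so check your own export. The record here is an illustrative sketch, not real data:

```json
{"id": 1, "text": "Jane Doe\njane.doe@example.com\nData Scientist at Acme Corp", "label": [[0, 8, "PERSON_NAME"], [9, 29, "EMAIL"], [30, 44, "JOB_TITLE"], [48, 57, "COMPANY"]]}
```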

Phase 2: Model Training

- Objective: Fine-tune a spaCy NER model on your labeled data and publish it.
- Steps:
  - Open Google Colab and install spaCy:
    !pip install spacy
  - Convert your JSONL annotations into spaCy's binary training format (see the conversion sketch after this phase).
  - Train a basic NER pipeline on your exported data.
  - Save the trained model as a package and upload it to the Hugging Face Hub under your team name (example commands follow this phase).
- ⏱ Time limit: 30 minutes
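A rough sketch of the JSONL-to-spaCy conversion. It assumes the export file is named annotations.jsonl and uses the label field shown in the sample record above; adjust both to match your actual export:

```python
import json

import spacy
from spacy.tokens import DocBin
from spacy.util import filter_spans

nlp = spacy.blank("en")  # tokenizer-only pipeline; the NER component is added during training
db = DocBin()

with open("annotations.jsonl", encoding="utf-8") as f:  # hypothetical filename
    for line in f:
        record = json.loads(line)
        doc = nlp.make_doc(record["text"])
        spans = []
        for start, end, label in record["label"]:
            # char_span returns None when offsets fall inside a token;
            # "contract" snaps the span to token boundaries instead of dropping it
            span = doc.char_span(start, end, label=label, alignment_mode="contract")
            if span is not None:
                spans.append(span)
        doc.ents = filter_spans(spans)  # remove overlapping spans before assignment
        db.add(doc)

db.to_disk("train.spacy")
```

Training, packaging, and publishing can then go through the spaCy CLI plus the spacy-huggingface-hub extension. The package name ats_ner and the wheel path below are assumptions; use whatever spacy package actually writes out, and keep a separate dev split if your annotation volume allows it:

```bash
# Create a minimal English NER config and train on the converted data
# (train.spacy is reused as the dev set here only to keep the demo short).
!python -m spacy init config config.cfg --lang en --pipeline ner
!python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./train.spacy

# Package the best checkpoint as a wheel and push it to the Hugging Face Hub.
!pip install spacy-huggingface-hub
!huggingface-cli login
!python -m spacy package ./output/model-best ./packages --name ats_ner --build wheel
!python -m spacy huggingface-hub push ./packages/en_ats_ner-0.0.0/dist/en_ats_ner-0.0.0-py3-none-any.whl
```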

Phase 3: Deployment

- Objective: Build a simple web UI for your NER service, containerize it, and deploy it.
- Steps:
  - Create a minimal web interface with Streamlit that accepts resume text and highlights named entities (see the sketch after this phase).
  - Write a Dockerfile to containerize your app and your spaCy model (an example follows this phase).
  - Deploy your Docker container to Hamravesh.
- ⏱ Time limit: 45 minutes
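A minimal Streamlit sketch for the UI step. The model package name en_ats_ner is an assumption; load whichever package you published in Phase 2:

```python
# app.py - minimal NER demo UI (sketch)
import spacy
import streamlit as st
from spacy import displacy


@st.cache_resource  # load the model once per container, not on every rerun
def load_model():
    # Assumes the packaged Phase 2 model is pip-installed in this environment.
    return spacy.load("en_ats_ner")


st.title("ATS Resume NER")
text = st.text_area("Paste resume text here", height=300)

if st.button("Extract entities") and text.strip():
    doc = load_model()(text)
    # displacy returns the entity-highlight HTML as a string when jupyter=False
    html = displacy.render(doc, style="ent", jupyter=False)
    st.markdown(html, unsafe_allow_html=True)
    st.table([{"entity": ent.text, "label": ent.label_} for ent in doc.ents])
```

And one possible Dockerfile, assuming app.py sits next to a requirements.txt that lists streamlit, spacy, and your model wheel (or its Hugging Face download URL):

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .

# Streamlit's default port; bind to 0.0.0.0 so the hosting platform can route traffic to it
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```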

Phase 4: Showcase & Feedback

- Objective: Showcase your application and give feedback to peers.
- Steps:
  - Post a brief demo of your deployed application (screenshots or link) in the competition forum.
  - Comment on at least two other teams’ posts with constructive feedback.
- ⏱ Time limit: 15 minutes
Good luck to all teams—may the best ATS NER demo win!