Skill-Extraction Refactor (ESCO + KSA) #102
phanindra-max announced in Announcements
Release v0.2.2 · Skill-Extraction Refactor (“ESCO + KSA”)
⚡PR⚡#101
✨ Highlights
Taxonomy-aware skill extraction
• Integrates a FAISS index of the ESCO skills taxonomy.
• `Skill_Extractor.get_top_esco_skills()` now returns `{Skill, index, score}`, enabling deterministic Skill Tag values (`ESCO.<index>`).
KSA enrichment with vLLM
• New helper `get_ksa_details()` generates Knowledge Required and Task Abilities lists for each skill.
• Automatically invoked when a GPU/vLLM backend is available.
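The taxonomy lookup described under "Taxonomy-aware skill extraction" can be illustrated with a minimal sketch. It is not the package's implementation: a pure-Python cosine search stands in for the FAISS index, and the toy taxonomy, the embeddings, and the helper names `top_esco_skills` and `skill_tag` are all hypothetical. Only the `{Skill, index, score}` record shape and the `ESCO.<index>` tag format come from the notes above.

```python
import math

# Hypothetical toy taxonomy: (ESCO row index, skill label, embedding).
# Real ESCO embeddings and the FAISS index are not reproduced here.
TOY_ESCO = [
    (0, "data analysis", [1.0, 0.0, 0.0]),
    (1, "project management", [0.0, 1.0, 0.0]),
    (2, "statistical modelling", [0.8, 0.0, 0.6]),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_esco_skills(query_vec, taxonomy=TOY_ESCO, top_k=2):
    """Return the top_k taxonomy entries as {Skill, index, score} records."""
    scored = [
        {"Skill": label, "index": idx, "score": cosine(query_vec, vec)}
        for idx, label, vec in taxonomy
    ]
    # Highest similarity first; the sort is stable, so ties keep taxonomy order.
    scored.sort(key=lambda record: record["score"], reverse=True)
    return scored[:top_k]


def skill_tag(record):
    """Derive the tag from the taxonomy index alone, so it never depends on the score."""
    return f"ESCO.{record['index']}"


matches = top_esco_skills([0.9, 0.1, 0.1])
tags = [skill_tag(m) for m in matches]
print(tags)
```

Because the tag is built from the taxonomy index rather than from free text, the same matched skill always yields the same Skill Tag, which is what makes the values deterministic.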
Unified output schema
The extractor returns a tidy DataFrame with seven columns:
Research ID, Description, Raw Skill, Knowledge Required, Task Abilities, Skill Tag, Correlation Coefficient
🔧 Detailed Changes
• `get_top_esco_skills()` enhanced to include ESCO index and similarity score.
• Added `get_ksa_details()` plus supporting imports.
• `self.index` is always defined.
• `build_faiss_index_esco()` / `load_faiss_index_esco()` are now instance methods storing the index under `laiser/input`.
• New taxonomy-first pipeline inserted at the top of `extractor()`; legacy alignment kept for fallback.
• `align_skills()` and `align_KSAs()` will be dropped in v0.3 once consumers migrate to the new output format.
🚧 Known Issues / Roadmap
• `get_ksa_details()` needs additional resilience checks.
• `import json` lines remain in `llm_methods.py`.
⬆️ Upgrade Notes
No changes to input parameters are required, but downstream code should read the new seven-column schema.
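For downstream code, one way to guard against schema drift is to compare the columns it receives with the seven names listed in this release. The sketch below is illustrative only: `EXPECTED_COLUMNS` is copied from the notes above, while `check_schema` and the sample column list are hypothetical and not part of the package.

```python
# Column names as listed in the v0.2.2 release notes.
EXPECTED_COLUMNS = (
    "Research ID",
    "Description",
    "Raw Skill",
    "Knowledge Required",
    "Task Abilities",
    "Skill Tag",
    "Correlation Coefficient",
)


def check_schema(columns):
    """Return (missing, unexpected) column names relative to EXPECTED_COLUMNS."""
    present = list(columns)
    missing = [name for name in EXPECTED_COLUMNS if name not in present]
    unexpected = [name for name in present if name not in EXPECTED_COLUMNS]
    return missing, unexpected


# Hypothetical input: a result that lacks the two KSA columns.
partial_columns = [
    "Research ID",
    "Description",
    "Raw Skill",
    "Skill Tag",
    "Correlation Coefficient",
]
missing, unexpected = check_schema(partial_columns)
print(missing, unexpected)
```

The function accepts any iterable of column names, so with a pandas DataFrame it can be called as `check_schema(df.columns)`.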
Next up
0.2 → 0.3
This discussion was created from the release Skill-Extraction Refactor (ESCO + KSA).