Commit ce79baf

Merge branch 'main' into 98-implement-a-script-for-llm-fine-tuning
2 parents 8d8ebb4 + 040e381 commit ce79baf

32 files changed: +1427 -2 lines changed
263 KB
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
Title URL Assignees Status Labels
2+
Run Stackoverflow extraction to get more data https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/78 YashodharPansuriya Feature Archive Actual SP 01, SP 01, User Story
3+
Refactor the ETL and Data Transformation Scripts https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/94 dnsch, YashodharPansuriya Feature Archive Actual SP 05, SP 03, User Story
4+
Run Q&A Generator with the Latest Collected Data https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/93 christianwielenberg Feature Archive Actual SP 02, SP 03, User Story
5+
Prepare a small test set for evaluation https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/82 anosh-ar, christianwielenberg, dnsch Feature Archive Actual SP 02, SP 02, User Story
6+
Collect previous research work in our Wiki https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/95 YashodharPansuriya Feature Archive Actual SP 02, SP 03, User Story
7+
Study and Research on LLM Hyper-parameters https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/92 anosh-ar, julioc-p Feature Archive Actual SP 05, SP 05, User Story
8+
Select Suitable Benchmark Metrics https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/80 christianwielenberg, dnsch Feature Archive Actual SP 05, SP 05, User Story
9+
Extract And Store Text Data From CNCF Project Webpages https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/56 anosh-ar Feature Archive Actual SP 08, SP 08, User Story
10+
Train LLM on FAUs HPC https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/64 christianwielenberg, dnsch Feature Archive Actual SP 03, SP 03, User Story
11+
Deploy the trained model to Huggingface https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/66 julioc-p Feature Archive Actual SP 05, SP 05, User Story
12+
Utilise HPC for Data Generation Scripts https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/71 christianwielenberg Feature Archive Actual SP 05, SP 05, User Story
13+
Preprocessing of the data of the Q&A dataset https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/61 anosh-ar Feature Archive Actual SP 05, SP 05, User Story
14+
Implementation of the Retraining to get a CNCF LLM https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/63 christianwielenberg, dnsch Feature Archive Actual SP 05, SP 05, User Story
15+
Create build/deploy documentation https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/52 YashodharPansuriya Feature Archive Actual SP 02, amos-homework, SP 03, User Story
16+
Create design documentation https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/51 julioc-p Feature Archive Actual SP 03, amos-homework, SP 03, User Story
17+
Create User documentation https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/50 julioc-p Feature Archive Actual SP 02, amos-homework, SP 03, User Story
18+
Learn How to Utilise the HPC https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/32 anosh-ar, christianwielenberg, dnsch, julioc-p, YashodharPansuriya Feature Archive Actual SP 03, SP 05, User Story
19+
Research and Select a Suitable NER Tool or Library https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/44 dnsch Feature Archive Actual SP 03, SP 05, User Story
20+
Include Stackoverflow Q&As related to CNCF into our Q&A dataset https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/34 anosh-ar, YashodharPansuriya Feature Archive Actual SP 05, SP 08, User Story
21+
Improve the Performance of Data Unifying Script https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/35 christianwielenberg Feature Archive Actual SP 03, SP 03, User Story
22+
Make whole project able to be built with a build tool https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/49 julioc-p Feature Archive Actual SP 08, amos-homework, SP 08, User Story
23+
Improve the Q&A Pairs Transformation Pipeline https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/42 christianwielenberg, dnsch Feature Archive Actual SP 05, SP 05, User Story
24+
Implement an Initial Generator for enriching Q&A Pairs https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/31 anosh-ar, dnsch Feature Archive Actual SP 05, SP 05, User Story
25+
Exclude Non-English Content from Raw Dataset https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/38 anosh-ar Feature Archive Actual SP 02, SP 02, User Story
26+
Implement an Initial Answer Matching Function https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/30 christianwielenberg, julioc-p Feature Archive Actual SP 02, SP 05, User Story
27+
Implement an Initial Question Extraction Function https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/29 christianwielenberg, julioc-p Feature Archive Actual SP 05, SP 05, User Story
28+
Extract the dataset from Hugging Face https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/16 YashodharPansuriya Feature Archive Actual SP 03, SP 03, User Story
29+
Store Q&As in Hugging Face https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/33 YashodharPansuriya Feature Archive Actual SP 02, SP 05, User Story
30+
Identify Webpages for the Kubernetes LLM dataset https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/6 julioc-p Feature Archive Actual SP 03, SP 05, User Story
31+
Research Q&A Pairs Pipeline https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/27 dnsch Feature Archive Actual SP 05, SP 03, User Story
32+
Enhance Software Architecture Draft - Detail Machine Learning Pipeline https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/17 dnsch Feature Archive Actual SP 05, SP 05, User Story
33+
Create a Report on LLM Research Findings https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/21 christianwielenberg Feature Archive Actual SP 03, SP 03, User Story
34+
Extract Content from CNCF sources https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/7 anosh-ar Feature Archive Actual SP 05, SP 01, User Story
35+
Conduct Large Language Base Model Selection Process https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/19 anosh-ar Feature Archive Actual SP 05, SP 05, User Story
36+
Research Open-Source LLMs https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/18 julioc-p Feature Archive Actual SP 03, SP 03, User Story
37+
Set Up Team Logo and request Team T-shirt https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/4 dnsch Feature Archive amos-homework
38+
Identify Repositories for the Kubernetes LLM dataset https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/11 julioc-p Feature Archive Actual SP 05, SP 05, User Story
39+
CI/CD pipeline for project https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/23 julioc-p Feature Archive Actual SP 03, SP 03
40+
Create Software architecture draft https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/12 dnsch Feature Archive Actual SP 05, amos-homework, SP 05, User Story
41+
Transform extracted data into a unified Format https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/8 YashodharPansuriya Feature Archive Actual SP 05, SP 05, User Story
42+
Load data to Hugging Face https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/9 christianwielenberg Feature Archive Actual SP 02, SP 02, User Story
43+
Monitoring of the ETL process by Logging https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/10 julioc-p Feature Archive Actual SP 02, SP 03, User Story
44+
Create a Chatbot on HuggingFace https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/79 julioc-p Feature Archive Actual SP 03, SP 05, User Story
45+
Implement a Script for LLM fine-tuning https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/98 julioc-p Awaiting review SP 05, User Story
46+
Utilise Hetzner for LLM Fine-tuning https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/97 anosh-ar Awaiting review SP 05, User Story
47+
Combine latest changes of the Q&A dataset and Stackoverflow dataset to the Merged Q&A dataset https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/101 anosh-ar, YashodharPansuriya Awaiting review SP 03, User Story
48+
Implement Quantitative Evaluation Script https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/81 christianwielenberg, dnsch Awaiting review SP 05, User Story
49+
Fine-tune LLM on Hetzner https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/99 YashodharPansuriya In progress SP 05, User Story
50+
Document our Datasets https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/102 YashodharPansuriya In progress SP 03, User Story
51+
Preliminary investigation of deployment with localai https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/103 christianwielenberg, dnsch In progress SP 03, User Story
52+
Deploy the Fine-tuned Model https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/108 Product Backlog User Story
53+
Benchmark the trained model against the base models and a competitor https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/91 Product Backlog User Story
54+
Implement the Chat-bot User Interface https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/110 Product Backlog User Story
55+
Create Demo video https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/107 Product Backlog amos-homework
56+
Create the Demo-day Slide https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/109 Product Backlog amos-homework
57+
[Epic] Named Entity Recognition (NER) on Training Data Product Backlog
58+
Annotate a Dataset for NER Training https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/67 Product Backlog User Story
59+
Develop and Train the NER Model https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/62 Product Backlog User Story
60+
Integrate NER Module with the Q&A Transformation Pipeline https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/46 Product Backlog User Story
61+
[Epic] Fine-Tune our LLM on Hetzner Environment Product Backlog
62+
[Epic] Implement Data Collection Automation Pipeline Feature Archive
63+
[Epic] Research and Select an Open LLM for CNCF-related knowledge Feature Archive
64+
[Epic] Develop Q&A Pairs Transforming Pipeline Feature Archive
65+
[Epic] Implement the Quantitative Evaluation Feature Archive
66+
[Epic] Tidy up our datasets Feature Archive
67+
[Epic] Implement Named Entity Recognition (NER) on training data Feature Archive
68+
Define CNCF-related Entities and their Relationships https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/45 Feature Archive User Story
69+
Contact Kubernetes to verify the training data https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/43 Product Backlog User Story
70+
-- FOR LATER IN THE FUTURE -- Product Backlog
320 KB
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
Title URL Assignees Status Priority Estimate Size Iteration
2+
Use only open source libraries. Done
3+
Choose release manager Done
4+
Show the result not the code during sprint review https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/86 Done
5+
Use Planning Poker Tool https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/85 Done
6+
Check out if we get access to IP hardware https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/104 Todo
7+
Check local ai https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/111 Todo
8+
Find good way to deploy our model https://github.com/amosproj/amos2024ss08-cloud-native-llm/issues/112 Todo
360 KB
Binary file not shown.
571 KB
Binary file not shown.
4.48 MB
Binary file not shown.
813 KB
