- **Goal:** In this project, we aim to improve LLM-based code completion for functional programming languages, specifically Haskell.
- **Models:** We consider two pre-trained decoder-only models: UniXcoder and CodeGPT.
- **Data:** `github-code-haskell-function` (HuggingFace)
- **Benchmarking:** A customized version of HumanEval for Haskell; see `/humaneval-hs`.
Clone the repository:

```bash
git clone https://github.com/<repository author/repository name>.git
```
Install the dependencies:

- Either using `pip` and the provided `requirements.txt` file within the repository: `pip install -r requirements.txt`
- Or using Poetry:

  ```bash
  curl -sSL https://install.python-poetry.org | python3 -
  poetry env use python3.8
  poetry install
  ```
  Note: for details, please refer to `pyproject.toml`. In case of missing dependencies, use `poetry add <name>`.
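As an optional sanity check, you can confirm that Poetry resolved the expected interpreter; this assumes you ran `poetry env use python3.8` as shown above:

```bash
# Should print Python 3.8.x
poetry run python --version
```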
Any code surrounding the considered models (i.e. UniXcoder and CodeGPT) can be found in `/models`. Within this folder, there are subfolders for:

- `/finetuning`: contains the code and scripts to fine-tune both models (see `blue-finetune.sh` for our final fine-tuning script)
- `/inference`: contains the code to run inference on both models
- `/evaluation`: contains the code for the evaluation of both models

Furthermore, this folder contains `create_model_inputs.py` for splitting the data for the models into train and test sets. A sketch of the typical workflow is shown below.
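As a rough orientation, an end-to-end run might look like the following. The paths assume `blue-finetune.sh` lives in `/models/finetuning`, and the bare invocations are illustrative only; the arguments each script accepts are documented in its comments:

```bash
# 1. Split the dataset into train and test sets for the models
#    (see the comments in the script for its arguments)
python models/create_model_inputs.py

# 2. Fine-tune UniXcoder and CodeGPT with the final fine-tuning script
bash models/finetuning/blue-finetune.sh

# 3. Run inference and evaluation using the code in
#    models/inference and models/evaluation, respectively
```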
The HumanEval code can be found in `/humaneval-hs`. Each problem has been manually annotated with splits for the manual evaluation. The annotation of the results can be found in `/models/evaluation/annotated`.
If you want to manually evaluate your own inference results, you can generate an Excel sheet for annotation using `excelify.py` in `/models/evaluation` and plot the results of your annotations using `plotify.py` in the same folder. If you are repeating the experiment and are curious about overlapping results, you can use `overlapify.py` in `/overlap-check`; place the old and new files in the `/old` and `/new` folders respectively, with the same file names. A sketch of these invocations follows below.
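For concreteness, a hypothetical sequence of calls (the scripts' actual arguments are documented in their comments; the bare invocations below are a sketch only):

```bash
# Generate an Excel sheet for manually annotating your inference results
python models/evaluation/excelify.py

# Plot the results of your annotations
python models/evaluation/plotify.py

# Check overlap between a repeated run and the original: place the old
# and new result files (same file names) in overlap-check/old and
# overlap-check/new, then run
python overlap-check/overlapify.py
```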
Each file is documented with comments that explain the code and, where applicable, the arguments that can be passed to the script.
For the main results, please refer to our submitted paper. Additional details can be found in `/appendix/appendix.pdf`. Note: for large documents, GitHub limits how many PDF pages are displayed at once; click the **More Pages** button at the bottom of the scroll view repeatedly until you reach the end (page 23).