LLM Project

This project is based on the paper LongRoPE (Ding et al. 2024). The goal of this project is to explore a method to consistenly increase the context length of an LLM. The main idea start from an empirical fact: if we increase the context length of an LLM, the performances tends to deteriorate. Various methods has been explored to help the model to break through the limits of context size.

This project is in the context of the cours LLM @ MVA ( M. Fijalkow & M. Lelarge)

Repo

The repo has the following architecture:

Directory structure:
└── paulcauch-llm-project-mva-2025/
    ├── README.md
    ├── setup.py
    ├── data/
    │   ├── __init__.py
    │   └── input.txt
    ├── ntbk/
    │   ├── __init__.py
    │   └── simple_use.ipynb
    ├── report/
    │   └── __init__.py
    ├── src/
    │   ├── __init__.py
    │   ├── dataset.py
    │   ├── extension.py
    │   ├── finetune.py
    │   ├── longrope.py
    │   ├── longrope_utils.py
    │   ├── rope.py
    │   ├── search.py
    │   ├── utils_data.py
    │   └── utils_general.py
    └── test/
        ├── __init__.py
        └── test.ipynb

src contains all the function used to perform LongRoPE
report contains the slides of the presentation
ntbk contains an example of the run of the code for LongRoPE, and some visual representations
data contains a simple dataset, it's actually just a poem (the same one used by Karpathy in his videos)

Please note that the github was mainly based on the two following repo: official LongRoPE non-official LongRoPE. The first one is developped to handle big llm (mistral, llama etc...) while the second one is adapted to a much smaller size but contains several mistakes in the code (some of them are just details, purely dev details).

Since the computations are really heavy (For the search function, we take about 1 hour for 1 iteration with non trivial parameters with a GPU), the code here is to give a clear insight on how LongRoPE works, how can it be implemented, but we can not reach results since they are experimentals and were showed on large LLM with (really) large datasets and by definition of the problem, we need to perform it on huge context lengths to have results

Get Start

To run the code, please first run

pip install -e .

Contributors

Paul CAUCHETEUX
Sacha HAKIM
Tom WELCH
Quentin MOAYEDPOUR

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

LLM Project

Repo

Get Start

Contributors

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
data		data
ntbk		ntbk
report		report
src		src
test		test
README.md		README.md
setup.py		setup.py

Paulcauch/LLM-Project-MVA-2025

Folders and files

Latest commit

History

Repository files navigation

LLM Project

Repo

Get Start

Contributors

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages