Popular repositories
- Mamba-Megatron-DeepSpeed (Python), forked from deepspeedai/Megatron-DeepSpeed
  Ongoing research training transformer language models at scale, including BERT & GPT-2
- LongRoPE (Python), forked from microsoft/LongRoPE
  LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.
- transformers (Python), forked from huggingface/transformers
  🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
- RULER (Python), forked from NVIDIA/RULER
  This repo contains the source code for RULER: What's the Real Context Size of Your Long-Context Language Models?
- vllm (Python), forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs