Popular repositories
- Mamba-Megatron-DeepSpeed (Python), forked from deepspeedai/Megatron-DeepSpeed
  Ongoing research training transformer language models at scale, including BERT & GPT-2
- LongRoPE (Python), forked from microsoft/LongRoPE
  LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.
- transformers (Python), forked from huggingface/transformers
  🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
- RULER (Python), forked from NVIDIA/RULER
  This repo contains the source code for RULER: What's the Real Context Size of Your Long-Context Language Models?
- vllm (Python), forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs