Deep learning materials

This is a curated repository of materials for deep learning.

Courses

Tutorials

Build a Large Language Model (From Scratch)
GPT in 60 lines of NumPy
Implementing Mixtral with Neural Circuit Diagrams
Hello Evo: From DNA generation to protein folding
University of Amsterdam Deep Learning Tutorials web Github
Zero to LitGPT: Getting Started with Pretraining, Finetuning, and Using LLMs
Annotated Deep Learning Research Papers Implementations web Github
llamafile is the new best way to run a LLM on your own computer
Bash One-Liners for LLMs
A primer on algorithmic differentiation
Creating a Transformer From Scratch - Part One: The Attention Mechanism
Creating a Transformer From Scratch - Part Two: The Rest of the Transformer
The Annotated Diffusion Model
Defusing Diffusion Models
The Illustrated Stable Diffusion
build nanoGPT by Andrej Karpathy Github Youtube
Let's build GPT from scratch by Andrej Karpathy Youtube
AI by hand Substack Twitter
The Matrix Calculus You Need For Deep Learning
CNN from Scratch with pure Mathematical Intuition
Machine Learning from Scratch

Curated lists

Papers

Role play with large language models (2023) Nature.com arXiv
Attention is All You Need (2017) arXiv v7 (2023)
OLMo: Accelerating the Science of Language Models (2024) arXiv
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis (2024) blog paper
Datasets for Large Language Models: A Comprehensive Survey (2024) arXiv
Large Language Models (LLMs) on Tabular Data (2024) arXiv
Chain-of-Thought Reasoning Without Prompting (2024) arXiv
Holistic Evaluation of Language Models (2022) arXiv
Understanding LLMs: A Comprehensive Overview from Training to Inference (2024) arXiv
Large Language Models: A Survey (2024) arXiv
Grandmaster-Level Chess Without Search (2024) arXiv
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection (2024) arXiv Github
MacGyver: Are Large Language Models Creative Problem Solvers? (2024) arXiv Github
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era (2017) paper
Understanding Deep Learning Requires Rethinking Generalization (2017) arXiv
Understanding Deep Learning (Still) Requires Rethinking Generalization (2021) paper
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets (2022) arXiv
Deep Double Descent: Where Bigger Models and More Data Hurt (2019) arXiv
Unifying Grokking and Double Descent (2023) arXiv
Textbooks Are All You Need (2023) arXiv
LoRA: Low-Rank Adaptation of Large Language Models (2021) arXiv
Fundamental Components of Deep Learning: A category-theoretic approach (2023) arXiv
Compression Represents Intelligence Linearly arXiv HuggingFace
Rules of Machine Learning - Best Practices for ML Engineering web
General Intelligence Requires Rethinking Exploration (2024) arXiv
Testing theory of mind in large language models and humans (2024) paper
Diffusion Models Are Real-Time Game Engines web
Compact Language Models via Pruning and Knowledge Distillation paper

Articles

Books

Reports

Models

Moondream - tiny vision language model web Github
BioMistral - pretrained LLMs for the biomedical domain arXiv HuggingFace
StructLM - Generalist Model for Structured Knowledge Grounding web arXiv Github HuggingFace
Gemma - Open Models Based on Gemini Research and Technology HuggingFace paper
Large World Model web Github HuggingFace arXiv
ChemLLM: A Chemical Large Language Model arXiv HuggingFace
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model arXiv HuggingFace web
Llama 3 web Github

Benchmarks

Nicholas Carlini's benchmark of 100 tests for LLMs

Datasets

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research arXiv HuggingFace
Internet Archive Public Domain English Books HuggingFace
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models web arXiv
Cosmopedia - dataset of synthetic textbooks, blog posts, stories, posts and WikiHow articles web HuggingFace
FineWeb - 15T tokens of deduplicated English text from CommonCrawl HuggingFace

Tools

Ollama - run Llama 2, Mistral and Gemma locally web Github
llama.cpp - standalone LLM inference in C/C++ for Llama models Github
gemma.cpp - standalone LLM inference in C/C++ for Gemma models Github
DSPy - framework for programming foundation models arXiv Github
LangChain - framework for building LLM applications web Github
trl - Transformer Reinforcement Learning Github
Transformers - State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX Github HuggingFace
Transformers.js - State-of-the-art Machine Learning for the web HuggingFace
Tokenizer Playground
Neural Networks Playground
MeloTTS - multi-lingual text-to-speech library HuggingFace demo Github
Finetune Mistral, Gemma, Llama 2-5x faster with 70% less memory! Github
llm.c - LLM (GPT-2) training in simple, pure C/CUDA Github
LLM transparency tool Github
CoreNet - Apple's library for deep learning Github
Cohere toolkit - quickly deploy RAG applications Github
wllama - WebAssembly bindings for llama.cpp Github
elia - LLM in terminal Github
MiniTorch Github

APIs

Initiatives

Occiglot - research collective for open-source European LLMs web HuggingFace
Open Sora Github

Lectures

A little guide to building Large Language Models in 2024 Youtube slides
Intro to LLMs by Andrej Karpathy Youtube slides
The spelled-out intro to neural networks and backpropagation: building micrograd by Andrej Karpathy Youtube
Attention in transformers, visually explained (Chapter 6, Deep Learning) Youtube
Intuition Behind Self-Attention Mechanism in Transformer Networks Youtube

Implementations

PyTorch implementations of various papers Github

Table of Contents

Courses

Tutorials

Curated lists

Papers

Articles

Books

Reports

Models

Benchmarks

Datasets

Tools

APIs

Initiatives

Lectures

Implementations