
Deep learning materials

This is a curated repository of materials for deep learning.

Table of Contents

Courses

Tutorials

Curated lists

Papers

  • Role play with large language models (2023) Nature.com arXiv
  • Attention is All You Need (2017) arXiv v7 (2023)
  • OLMo: Accelerating the Science of Language Models (2024) arXiv
  • Scaling Rectified Flow Transformers for High-Resolution Image Synthesis (2024) blog paper
  • Datasets for Large Language Models: A Comprehensive Survey (2024) arXiv
  • Large Language Models (LLMs) on Tabular Data (2024) arXiv
  • Chain-of-Thought Reasoning Without Prompting (2024) arXiv
  • Holistic Evaluation of Language Models (2022) arXiv
  • Understanding LLMs: A Comprehensive Overview from Training to Inference (2024) arXiv
  • Large Language Models: A Survey (2024) arXiv
  • Grandmaster-Level Chess Without Search (2024) arXiv
  • GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection (2024) arXiv Github
  • MacGyver: Are Large Language Models Creative Problem Solvers? (2024) arXiv Github
  • Revisiting Unreasonable Effectiveness of Data in Deep Learning Era (2017) paper
  • Understanding Deep Learning Requires Rethinking Generalization (2017) arXiv
  • Understanding Deep Learning (Still) Requires Rethinking Generalization (2021) paper
  • Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets (2022) arXiv
  • Deep Double Descent: Where Bigger Models and More Data Hurt (2019) arXiv
  • Unifying Grokking and Double Descent (2023) arXiv
  • Textbooks Are All You Need (2023) arXiv
  • LoRA: Low-Rank Adaptation of Large Language Models (2021) arXiv
  • Fundamental Components of Deep Learning: A category-theoretic approach (2023) arXiv
  • Compression Represents Intelligence Linearly (2024) arXiv HuggingFace
  • Rules of Machine Learning - Best Practices for ML Engineering web
  • General Intelligence Requires Rethinking Exploration (2024) arXiv
  • Testing theory of mind in large language models and humans (2024) paper
  • Diffusion Models Are Real-Time Game Engines (2024) web
  • Compact Language Models via Pruning and Knowledge Distillation (2024) paper
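The LoRA paper listed above replaces full fine-tuning of a weight matrix W with a trainable low-rank update B·A. As an illustrative NumPy sketch (hypothetical layer sizes, not taken from the paper's experiments):

```python
import numpy as np

# LoRA idea: keep the pretrained W (d x k) frozen and train only
# B (d x r) and A (r x k) with rank r << min(d, k).
# The adapted weight is W' = W + B @ A.

rng = np.random.default_rng(0)
d, k, r = 512, 512, 8              # hypothetical dimensions and rank

W = rng.standard_normal((d, k))    # frozen pretrained weights
A = rng.standard_normal((r, k)) * 0.01
B = np.zeros((d, r))               # B starts at zero, so W' == W initially

W_adapted = W + B @ A

# Trainable-parameter comparison: d*k for full fine-tuning
# vs. d*r + r*k for the low-rank update.
full_params = d * k                # 262144
lora_params = d * r + r * k        # 8192, a 32x reduction here
print(full_params, lora_params)
```

Because B is initialized to zero, the adapted model starts out identical to the pretrained one; only the small A and B matrices receive gradient updates during fine-tuning.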

Articles

Books

Reports

Models

Benchmarks

Datasets

  • Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research arXiv HuggingFace
  • Internet Archive Public Domain English Books HuggingFace
  • Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models web arXiv
  • Cosmopedia - dataset of synthetic textbooks, blogposts, stories, posts and WikiHow articles web HuggingFace
  • FineWeb - 15T tokens of deduplicated English text from CommonCrawl HuggingFace

Tools

  • Ollama - run Llama 2, Mistral and Gemma locally web Github
  • llama.cpp - standalone LLM inference in C/C++ for Llama models Github
  • gemma.cpp - standalone LLM inference in C/C++ for Gemma models Github
  • DSPy - framework for programming foundation models arXiv Github
  • LangChain - framework for building LLM applications web Github
  • trl - Transformer Reinforcement Learning Github
  • Transformers - State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX Github HuggingFace
  • Transformers.js - State-of-the-art Machine Learning for the web HuggingFace
  • Tokenizer Playground
  • Neural Networks Playground
  • MeloTTS - multi-lingual text-to-speech library HuggingFace demo Github
  • Unsloth - finetune Mistral, Gemma and Llama 2-5x faster with 70% less memory Github
  • llm.c - LLM (GPT-2) training in simple, pure C/CUDA Github
  • LLM transparency tool Github
  • CoreNet - Apple's library for deep learning Github
  • Cohere toolkit - quickly deploy RAG applications Github
  • wllama - WebAssembly bindings for llama.cpp Github
  • elia - LLM in terminal Github
  • MiniTorch - DIY teaching reimplementation of PyTorch Github

APIs

Initiatives

Lectures

  • A little guide to building Large Language Models in 2024 Youtube slides
  • Intro to LLMs by Andrej Karpathy Youtube slides
  • The spelled-out intro to neural networks and backpropagation: building micrograd by Andrej Karpathy Youtube
  • Attention in transformers, visually explained (Chapter 6, Deep Learning) Youtube
  • Intuition Behind Self-Attention Mechanism in Transformer Networks Youtube
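The self-attention lectures above boil down to one formula: a single head computes softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch (hypothetical sizes, single head, no masking):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (no masking)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns similarity scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8           # hypothetical sequence length and width
X = rng.standard_normal((seq_len, d_model))
Wq = rng.standard_normal((d_model, d_model))
Wk = rng.standard_normal((d_model, d_model))
Wv = rng.standard_normal((d_model, d_model))

out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                  # one output vector per input position
```

Each output row is a weighted average of the value vectors, with weights determined by query-key similarity, which is the mechanism the visual explanations above build intuition for.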

Implementations

  • PyTorch implementations of various papers Github
