ifLabX/Awesome-LLM-Learning

Awesome LLM Resources - A Curated Learning Path & Selection of Free Resources for Large Language Models

A curated learning roadmap and selection of free resources for developers, researchers, and enthusiasts of Large Language Models (LLMs). It aims to give you a clear, structured path for mastering the core concepts, current techniques, and practical tools of LLMs, starting from scratch.

🧠 Core Concepts

Before diving into code, it's crucial to understand the core ideas that drive LLMs. These are the cornerstones of the field.

Must-Read Papers

In-depth Guides & Blogs

🎓 Structured Courses

Build a solid knowledge base with systematic, university-level courses.

  • Stanford CS224n: NLP with Deep Learning - Stanford University's flagship NLP course. It comprehensively covers everything from traditional NLP to the latest Transformers and LLMs.
  • LLMs from Scratch by Sebastian Raschka - A course/book on building a large language model from scratch. It offers in-depth theoretical explanations and detailed code implementations, making it an excellent resource for understanding the underlying principles.
  • LLM Course by Maxime Labonne - A GitHub-based, hands-on LLM course that provides a clear roadmap and numerous Colab notebooks.

🏒 Corporate Learning Hubs

Get authoritative learning resources and best practices directly from the companies building these technologies.

DeepLearning.AI Short Courses

A collection of free, concise courses from Andrew Ng's team and industry experts (OpenAI, LangChain, etc.), focused on quickly mastering a specific skill.

Microsoft AI & LLM Resources

Microsoft provides a wealth of structured courses, official documentation, and practical code samples through its Learn platform and GitHub organizations.

  • Official Learning Paths & Documentation:

    • Microsoft Learn AI Hub - The central hub for all of Microsoft's AI learning resources, including documentation, tutorials, and certification paths for Azure AI and OpenAI services.
    • Azure AI Fundamentals - The official learning path for the AI-900 certification, covering core AI and machine learning concepts on Azure.
  • GitHub Repositories & Courses:

    • generative-ai-for-beginners - A comprehensive, 18-lesson course on Generative AI, created by Microsoft developers.
    • azure-search-openai-demo - A premier reference implementation for the RAG pattern using Azure AI Search and Azure OpenAI.
    • Semantic Kernel - Microsoft's open-source LLM orchestration SDK, an alternative to LangChain, for building agents and planners.
    • Microsoft/AI - A central index repository that categorizes and links to a large number of Microsoft's open-source AI samples and best practices.

Other Corporate Hubs

  • Google - Generative AI Learning Path - Official Google Cloud learning path. It offers a series of courses on Generative AI fundamentals, large language models, and the Google Cloud AI platform.
  • AWS - Generative AI Learning Plan for Developers - AWS developer learning plan. It includes 10 courses from beginner to advanced, designed to help developers learn to build and deploy generative AI applications on AWS.
  • OpenAI Cookbook - Official OpenAI practice guide. It provides numerous runnable code examples that demonstrate best practices for completing common tasks with the OpenAI API.
  • Anthropic Cookbook - Official hands-on guide. Contains extensive code examples, best practices, and tutorials for building with Claude and Anthropic APIs.
  • Meta Llama Cookbook - Official Meta Llama hands-on guide. This repo contains various code examples for inference, fine-tuning, and building RAG applications with Llama models.

🛠️ Emerging AI Tools & Platforms

These companies and tools are defining the development paradigm for modern AI applications. Understanding them is key to building advanced solutions.

Foundation Model Providers

  • Anthropic (Claude) - Official Docs. The authoritative starting point for learning the Claude model family, its API, safety features, and prompt engineering. Their GitHub Cookbook provides extensive hands-on code.
  • Mistral AI - Official Docs. Known for its high-performance open-source and commercial models, excelling in efficiency and performance. Their GitHub contains model implementations and usage examples.
  • Cohere - Official Docs. An AI platform focused on enterprise applications. Their Cohere University and GitHub offer rich tutorials, especially for RAG and semantic search.

Application Development Frameworks

  • LangChain - Official Docs. One of the most widely used frameworks for LLM application development, providing modular components and extensive integrations. Its GitHub includes many cookbooks and templates.
  • LlamaIndex - Official Docs. A data framework focused on RAG, offering powerful capabilities for data ingestion, indexing, and querying. Its GitHub contains a wealth of examples.

Critical Infrastructure: Vector Databases

  • Pinecone - Official Docs. A leading managed vector database that provides core support for large-scale, low-latency semantic search and RAG applications. Their GitHub provides clients and examples.
  • Weaviate - Official Docs. A powerful open-source vector database that supports hybrid search and has an active community. Its GitHub provides the core code and clients.
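
As a rough illustration of what a vector database does at its core, the sketch below ranks stored embeddings by cosine similarity to a query vector. The three-dimensional vectors, the document names, and the `top_k` helper are invented for this example. Real systems such as Pinecone or Weaviate work with learned embeddings of much higher dimension and typically rely on approximate nearest-neighbor indexes rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy embeddings; a real pipeline would get these from an embedding model.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}

results = top_k([1.0, 0.2, 0.0], index, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG application, the text behind the top-ranked documents would then be passed to the LLM as context.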

Production & Evaluation Tools

  • Weights & Biases - W&B Prompts Docs. A tool for tracking, visualizing, and evaluating LLM applications (especially Prompt Chains), which is a key part of moving from development to production.

💻 General Tools & Frameworks

  • Hugging Face Transformers - The de facto standard library for working with pre-trained models, and the entry point to an ecosystem of models, datasets, and tools.
  • PyTorch / TensorFlow - Mainstream deep learning frameworks, fundamental for understanding model fine-tuning and underlying research.

🎥 Video Tutorials & Lectures

Visual learning materials can significantly accelerate understanding.

Coding & Implementation

Concepts & Theory

Conferences & Lectures

📖 Open Source Models & Datasets

The open-source community is the core driving force behind the democratization of LLM technology.

Models

  • Hugging Face Open LLM Leaderboard - A dynamically updated leaderboard for open-source LLMs. The best starting point for finding and evaluating the most powerful open models available.
  • Meta Llama 3 - The flagship open-source model family from Meta, a cornerstone of the current open-source ecosystem.
  • Mixtral of Experts - Developed by Mistral AI, this model uses an innovative "Mixture of Experts" (MoE) architecture, representing a major step in model efficiency.
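
To make the "Mixture of Experts" idea more concrete, here is a toy sketch of sparse top-k routing: a router scores every expert, only the k best are run, and their outputs are combined using softmax-normalized weights. This is an illustration of the general technique under simplifying assumptions, not Mixtral's actual implementation. The scalar "experts", the router logits, and the function names are all made up for the example.

```python
import math

def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical router scores for one token; a real router is a learned layer.
logits = [0.1, 2.0, 0.3, 2.0]

selected = route_top_k(logits, k=2)
y = moe_layer(1.0, experts, logits, k=2)
print(selected, y)
```

Because only k experts run per token, the compute per token stays well below what the total parameter count would suggest, which is the source of the efficiency gain.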

Datasets

  • The Pile - A large-scale, diverse, open-source text dataset that has been used to train many powerful open LLMs.
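  • RedPajama-Data-v2 - A large open web-text dataset built from CommonCrawl snapshots and shipped with quality annotations for filtering. Its predecessor, RedPajama v1, was an open reproduction of the training data used for the original LLaMA models.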
  • Awesome Datasets for LLM - A curated list of high-quality datasets specifically for Instruction Tuning and Preference Alignment.

🚀 Keep Learning

The field of large models is evolving daily. It's vital to maintain a habit of continuous learning.

  • Twitter / X: Follow top researchers in the field, such as Yann LeCun, Andrej Karpathy, Jim Fan, and Lilian Weng.
  • Papers with Code: https://paperswithcode.com/ - Track the latest papers and their open-source implementations.
  • AI Newsletters: Subscribe to newsletters like The Batch (from DeepLearning.AI) and Import AI to get industry updates and curated paper selections.

If you find any valuable resources or wish to contribute, please feel free to submit a Pull Request!
