Master the essential steps of pretraining large language models (LLMs). Learn to create high-quality datasets, configure model architectures, execute training runs, and assess model performance for efficient and effective LLM pretraining.

Welcome to the "Pretraining LLMs" course! 🧑‍🏫 The course dives into the essential steps of pretraining large language models (LLMs).

📘 Course Summary

In this course, you’ll explore pretraining, the foundational step in training LLMs, which involves teaching an LLM to predict the next token using vast text datasets.
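As a quick illustration of that objective, here is a minimal sketch that scores a sentence with the standard next-token cross-entropy loss using the Hugging Face transformers library. The TinyLlama checkpoint is only an example stand-in; any small causal language model behaves the same way.

```python
# Minimal sketch of the next-token prediction objective (illustrative model choice).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # any small causal LM works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

text = "Pretraining teaches a model to predict the next token."
inputs = tokenizer(text, return_tensors="pt")

# Passing the input ids as labels makes the model compute the usual shifted
# cross-entropy loss: each position is scored on predicting the token after it.
with torch.no_grad():
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"cross-entropy loss: {outputs.loss.item():.3f}")
```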

🧠 You'll learn the essential steps to pretrain an LLM, understand the associated costs, and discover cost-effective methods by leveraging smaller, existing open-source models.

Detailed Learning Outcomes:

  1. 🧠 Pretraining Basics: Understand the scenarios where pretraining is the optimal choice for model performance. Compare text generation across different versions of the same model to grasp the performance differences between base, fine-tuned, and specialized pretrained models.
  2. 🗃️ Creating High-Quality Datasets: Learn how to create and clean a high-quality training dataset using web text and existing datasets, and how to package this data for use with the Hugging Face library (a minimal packaging sketch follows this list).
  3. 🔧 Model Configuration: Explore ways to configure and initialize a model for training, including modifying Meta’s Llama models and initializing weights either randomly or from other models (see the configuration and training sketch after this list).
  4. 🚀 Executing Training Runs: Learn how to configure and execute a training run to train your own model effectively.
  5. 📊 Performance Assessment: Assess your trained model’s performance and explore common evaluation strategies for LLMs, including benchmark tasks used to compare different models’ performance (a simple perplexity sketch follows this list).
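To make the dataset step concrete, here is a minimal sketch of packaging cleaned text with the Hugging Face datasets library. The example texts, the output path, and the C4 slice are illustrative assumptions, not the course's exact data.

```python
# Minimal sketch: package cleaned web text as a Hugging Face dataset.
# The texts, the save path, and the C4 example are illustrative assumptions.
from datasets import Dataset, load_dataset

cleaned_texts = [
    "First cleaned and deduplicated document ...",
    "Second cleaned and deduplicated document ...",
]

ds = Dataset.from_dict({"text": cleaned_texts})
ds = ds.shuffle(seed=42)
ds.save_to_disk("./my_pretraining_dataset")  # or ds.push_to_hub("your-username/my-dataset")

# Existing open corpora can be loaded in the same format and mixed in,
# e.g. a streamed slice of the C4 web corpus:
c4 = load_dataset("allenai/c4", "en", split="train", streaming=True)
```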
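For the configuration and training steps, the sketch below defines a small Llama-style architecture with randomly initialized weights and launches a short run with the transformers Trainer. All sizes and hyperparameters are illustrative choices, not the course's settings.

```python
# Rough sketch: configure a small Llama-style model from scratch and train it
# briefly on the dataset from the previous sketch. Sizes and hyperparameters are illustrative.
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    LlamaConfig,
    LlamaForCausalLM,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for batch padding

config = LlamaConfig(
    vocab_size=tokenizer.vocab_size,
    hidden_size=1024,
    intermediate_size=2752,
    num_hidden_layers=12,
    num_attention_heads=16,
)
model = LlamaForCausalLM(config)  # random initialization = pretraining from scratch
print(f"{model.num_parameters() / 1e6:.0f}M parameters")

# `ds` is the text dataset built in the previous sketch.
tokenized = ds.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="./pretraining-run",
    per_device_train_batch_size=8,
    learning_rate=3e-4,
    num_train_epochs=1,
    logging_steps=10,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Initializing weights from an existing model instead of randomly simply means replacing the random construction above with `LlamaForCausalLM.from_pretrained(...)`.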
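For performance assessment, one common signal is perplexity on held-out text, sketched below with the model and tokenizer from the previous sketch; standardized benchmark comparisons are usually run with a dedicated suite such as the EleutherAI lm-evaluation-harness, which this snippet does not reproduce.

```python
# Minimal sketch of one evaluation signal: perplexity on held-out text.
# Reuses `model` and `tokenizer` from the training sketch above.
import math
import torch

model.eval()
held_out = "A held-out paragraph the model never saw during training ..."
enc = tokenizer(held_out, return_tensors="pt")

with torch.no_grad():
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"held-out perplexity: {math.exp(loss.item()):.1f}")
```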

🔑 Key Points

  • 🧩 Pretraining Process: Gain in-depth knowledge of the steps to pretrain an LLM, from data preparation to model configuration and performance assessment.
  • 🏗️ Model Architecture Configuration: Explore various options for configuring your model’s architecture, including modifying Meta’s Llama models and innovative pretraining techniques like Depth Upscaling, which can reduce training costs by up to 70% (a rough sketch of the idea follows this list).
  • 🛠️ Practical Implementation: Learn how to pretrain a model from scratch and continue the pretraining process on your own data using existing pre-trained models.
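As a rough illustration of the depth-upscaling idea (in the spirit of Upstage's SOLAR recipe, not its exact implementation), the sketch below duplicates the decoder layers of a small pretrained Llama-style model to build a deeper one that can then be continually pretrained on your own data. The checkpoint name and the four-layer trim are assumptions made for illustration.

```python
# Rough sketch of depth upscaling: copy the pretrained decoder stack twice,
# trim a few layers from each copy so they overlap, and concatenate them into
# a deeper model that is then continually pretrained (no random re-initialization).
# The checkpoint and the 4-layer trim are illustrative assumptions.
import copy
import torch.nn as nn
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
layers = base.model.layers                 # nn.ModuleList of decoder layers
n = len(layers)

top = [copy.deepcopy(layer) for layer in layers[: n - 4]]   # layers 0 .. n-5
bottom = [copy.deepcopy(layer) for layer in layers[4:]]     # layers 4 .. n-1

base.model.layers = nn.ModuleList(top + bottom)             # 2n - 8 layers, all pretrained
base.config.num_hidden_layers = len(base.model.layers)
print(f"upscaled from {n} to {len(base.model.layers)} layers")

# Note: this only covers the training-time weight layout; internals such as
# per-layer cache indices may also need adjusting before using the model for
# generation. The payoff is that continued pretraining of the upscaled model
# is far cheaper than training a model of this depth from scratch.
```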

👩‍🏫 About the Instructors

  • 👨‍🏫 Sung Kim: CEO of Upstage, bringing extensive expertise in LLM pretraining and optimization.
  • 👩‍🔬 Lucy Park: Chief Scientific Officer of Upstage, with a deep background in scientific research and LLM development.

🔗 To enroll in the course or for further information, visit 📚 deeplearning.ai.
