(under construction)
- What is a Large Language Model (LLM)?
- What is the building block of an LLM?
- LLMs, self-attention mechanism: The self-attention mechanism is the core concept of transformer-based LLMs. Here, we review the formulae of this mechanism and implement self-attention from scratch in Python (a minimal sketch appears after this list).
- LLMs, the softmax in self-attention: We review the softmax function, which is widely used in neural networks, deep learning, and machine learning. The softmax function is implemented in Python with an example (see the sketch after this list).
- LLMs, layer normalization: Layer normalization is a critical component of Transformers and LLMs, ensuring stable and efficient training by normalizing activations across the feature dimension. It is particularly well suited to sequence-based tasks and deep architectures. Here, we implement layer normalization with NumPy and also provide the equivalent PyTorch code so that you can compare the results (see the sketch after this list).
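
As a quick illustration of the self-attention topic above, here is a minimal NumPy sketch of scaled dot-product self-attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k)·V. The projection matrices `W_q`, `W_k`, `W_v` and the toy dimensions are illustrative assumptions, not the repository's own implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row-wise maximum for numerical stability before exponentiating
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention for a single sequence.

    X              : (seq_len, d_model) input embeddings
    W_q, W_k, W_v  : (d_model, d_k) projection matrices
    """
    Q = X @ W_q                          # queries
    K = X @ W_k                          # keys
    V = X @ W_v                          # values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # scaled dot-product scores
    weights = softmax(scores, axis=-1)   # attention weights, each row sums to 1
    return weights @ V                   # weighted sum of value vectors

# Tiny example: 4 tokens, model dimension 8, head dimension 4
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```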
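For the softmax topic, the function maps a score vector x to probabilities via softmax(x)_i = exp(x_i) / Σ_j exp(x_j). A small, hedged Python sketch with the usual max-shift trick for numerical stability:

```python
import numpy as np

def softmax(x):
    # Shift by the maximum so exp() never overflows; the output is unchanged
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)        # roughly [0.659 0.242 0.099]
print(probs.sum())  # 1.0
```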
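For the layer normalization topic, the sketch below normalizes over the last (feature) dimension and then applies a learnable scale and shift. It assumes PyTorch is installed so the NumPy result can be checked against `torch.nn.LayerNorm`; the toy shapes are illustrative.

```python
import numpy as np
import torch
import torch.nn as nn

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize each token's feature vector to zero mean and unit variance,
    # then apply the learnable scale (gamma) and shift (beta)
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Toy input: batch of 2 sequences, 3 tokens each, feature dimension 4
x = np.random.default_rng(0).normal(size=(2, 3, 4)).astype(np.float32)
gamma = np.ones(4, dtype=np.float32)   # scale, initialised to 1
beta = np.zeros(4, dtype=np.float32)   # shift, initialised to 0

out_np = layer_norm(x, gamma, beta)

# Same operation with PyTorch's built-in LayerNorm for comparison
ln = nn.LayerNorm(4)
out_pt = ln(torch.from_numpy(x)).detach().numpy()

print(np.allclose(out_np, out_pt, atol=1e-5))  # True
```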