-
Visualizing Attention: See What an LLM Sees
Learn how attention mechanisms work in transformers by visualizing what LLMs see when processing text. Discover how attention connects semantically related tokens (like Paris → French), understand the Query-Key-Value framework, and explore how different attention heads specialize in syntax, semantics, and coreference.
-
Supervised Fine-Tuning in the LLM Training Workflow
Learn how supervised fine-tuning (SFT) fits into the LLM training pipeline. This post explains the three-step process (pretraining → SFT → alignment), demonstrates SFT implementation with a practical example, and shows how fine-tuning transforms a base model into a task-specific assistant.
-
From Words to Meaning: Implementing Word2Vec from Scratch
Word embeddings are one of the most transformative developments in Natural Language Processing (NLP). They solve a fundamental problem: how can we rep...
-
Primer on Large Language Model (LLM) Inference Optimizations: 3. Model Architecture Optimizations
Exploring model architecture optimizations for Large Language Model (LLM) inference, focusing on Group Query Attention (GQA) and Mixture of Experts (MoE) techniques.
-
Scaling Laws in Large Language Models
Scaling laws offer a quantitative framework for understanding the relationship between model size, training data, and compute. Learn about power laws, the Chinchilla scaling law, and what they imply for the future of large models.
-
Primer on Large Language Model (LLM) Inference Optimizations: 2. Introduction to Artificial Intelligence (AI) Accelerators
Exploration of AI accelerators and their impact on deploying Large Language Models (LLMs) at scale.
-
Primer on Large Language Model (LLM) Inference Optimizations: 1. Background and Problem Formulation
Overview of Large Language Model (LLM) inference, its importance, challenges, and key problem formulations.