Deeply Learning
Notes on Deep Learning Systems and AI Research
-
Scaling Laws in Large Language Models
Scaling laws in AI offer a quantitative framework for understanding the relationship between model size, dataset size, and compute. Learn about the Chinchilla scaling law, power laws, and the future of large models.
10 min read · November 07, 2024
2024 · LLM · Scaling Laws · Emergent Capabilities · Transformers · Natural Language Processing (NLP) · Large Language Models (LLMs)
-
Primer on Large Language Model (LLM) Inference Optimizations: 2. Introduction to Artificial Intelligence (AI) Accelerators
Exploration of AI accelerators and their impact on deploying Large Language Models (LLMs) at scale.
12 min read · November 06, 2024
2024 · LLM Inference Optimization · Transformer · Attention Mechanism · Multi-Head Attention · AI Accelerators · GPUs · TPUs · FPGAs · ASICs · Parallel Processing · Data Parallelism · Model Parallelism · Task Parallelism · Co-Processing Mode · Intelligent Processing Units · Reconfigurable Dataflow Units · Neural Processing Units · Large Language Models (LLMs) · Transformers · Natural Language Processing (NLP)
-
Primer on Large Language Model (LLM) Inference Optimizations: 1. Background and Problem Formulation
Overview of Large Language Model (LLM) inference, its importance, challenges, and key problem formulations.
15 min read · October 31, 2024
2024 · LLM Inference Optimization · Transformer · Attention Mechanism · Multi-Head Attention · K-V Caching · Memory Calculation · Optimization Metrics · Optimization Techniques · Large Language Models (LLMs) · Transformers · Natural Language Processing (NLP)