AI moves fast, and nowhere faster than in language models. New terms, frameworks, and buzzwords appear almost daily, making it easy to feel lost in translation. That’s why we’ve created the LLM Terms Glossary: a clear, comprehensive guide to 50 essential concepts every modern developer, product manager, and AI enthusiast should know. From tokens and embeddings to transformers and context windows, this glossary turns complex ideas into simple, actionable knowledge. Whether you’re building with ChatGPT, Gemini, or Claude, it will help you stay confident, current, and ahead of the curve.
A–Z Glossary of LLM Terms You Should Know
A
1. API (Application Programming Interface)
A way to interact with LLMs programmatically, for example, sending text to OpenAI’s GPT API to get a response.
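For instance, here is a minimal sketch using OpenAI’s official Python SDK (the model name and prompt are illustrative, not recommendations):

```python
# Minimal sketch of an LLM API call using the OpenAI Python SDK.
# Model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a token is in one sentence."},
    ],
)
print(response.choices[0].message.content)
```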
2. Alignment
Ensuring that an AI model’s behaviour matches human intentions and ethical values.
3. Attention Mechanism
A key part of transformer models that allows the AI to “focus” on relevant words or tokens while processing text.
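At its core this is scaled dot-product attention. A minimal NumPy sketch, ignoring multi-head projections and masking:

```python
# Scaled dot-product attention: softmax(Q·Kᵀ / sqrt(d)) · V
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # how strongly each query "attends" to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # weighted sum of value vectors

Q = np.random.randn(4, 8)  # 4 tokens, 8-dimensional vectors
K = np.random.randn(4, 8)
V = np.random.randn(4, 8)
print(attention(Q, K, V).shape)  # (4, 8)
```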
4. Auto-Regressive Model
A model that generates one token at a time, predicting the next based on previous ones (like GPT).
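Conceptually, generation is just a loop that keeps appending the predicted next token; `predict_next_token` below is a hypothetical stand-in for a real model call:

```python
# Conceptual sketch of auto-regressive decoding.
# predict_next_token is a hypothetical stand-in for a real model.
def generate(prompt_tokens, predict_next_token, max_new_tokens=20, eos_token=None):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = predict_next_token(tokens)  # model sees everything produced so far
        tokens.append(next_token)
        if next_token == eos_token:
            break
    return tokens
```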
B
5. Bias
Systematic skew in model outputs due to imbalanced training data or societal factors.
6. Base Model
An LLM before fine-tuning, trained on large-scale general data but not optimised for specific tasks.
7. Benchmarking
Evaluating model performance using standard datasets or metrics (e.g., MMLU, HELM).
8. Bloom Filter
A memory-efficient data structure sometimes used in AI retrieval systems to test membership of an element in a set.
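A toy version in pure Python shows the idea; production systems use tuned sizes and faster hash functions:

```python
# Toy Bloom filter: several hash functions set bits in a fixed-size bit array.
# It can return false positives but never false negatives.
import hashlib

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("hello")
print(bf.might_contain("hello"))  # True
print(bf.might_contain("world"))  # almost certainly False
```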
C
9. Chain of Thought (CoT)
A prompting technique that encourages models to explain their reasoning step-by-step before answering.
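An illustrative example of the wording such a prompt might use:

```python
# Illustrative chain-of-thought prompt: ask for the reasoning before the answer.
prompt = (
    "A shop sells pens at 3 for $2. How much do 12 pens cost?\n"
    "Think through the problem step by step, then give the final answer on its own line."
)
```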
10. Context Window
The maximum number of tokens an LLM can consider at once, e.g., GPT-4 Turbo supports a 128k-token context window.
11. Completion
The text generated by an LLM in response to a prompt.
12. Compression (Tokenisation)
The process of breaking text into smaller units (tokens) for model processing; tokenisation also compresses common character sequences into single tokens.
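As an example, OpenAI’s open-source tiktoken library can show how a sentence splits into tokens (the exact splits depend on the tokeniser used):

```python
# Sketch using the tiktoken library; token IDs and splits vary by tokeniser.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode("Tokenisation turns text into integers.")
print(ids)                             # a list of integer token IDs
print([enc.decode([i]) for i in ids])  # the text piece behind each token
```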
13. Contrastive Learning
A training method that teaches models by comparing similar and dissimilar examples (used in CLIP and embedding models).
D
14. Dataset Curation
The process of collecting and cleaning training data for LLMs.
15. Decoder
The part of a transformer that generates text outputs from learned representations.
16. Diffusion Models
A class of generative models (mainly for images, like Stable Diffusion) that gradually “denoise” random data into coherent output.
17. Domain Adaptation
Fine-tuning a base model for a specific industry or dataset, e.g., a medical LLM.
E
18. Embeddings
Numerical representations of text that capture meaning, enabling semantic search and RAG.
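A minimal sketch of comparing embeddings with cosine similarity (the vectors below are toy values, not real model output):

```python
# Cosine similarity between embedding vectors (toy vectors for illustration).
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat = np.array([0.9, 0.1, 0.3])
kitten = np.array([0.85, 0.15, 0.35])
car = np.array([0.1, 0.9, 0.2])

print(cosine_similarity(cat, kitten))  # close to 1.0 → similar meaning
print(cosine_similarity(cat, car))     # noticeably lower → less related
```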
19. Encoder
The component that transforms input data into vector representations for further processing.
20. Ethics in AI
Guidelines ensuring responsible development, transparency, and fairness in LLMs.
F
21. Fine-Tuning
Training a pre-trained model on a smaller, specialised dataset to improve task-specific performance.
22. Foundation Model
A large-scale, pre-trained model that serves as the base for downstream tasks (like GPT, LLaMA, Claude).
23. Few-Shot Learning
When a model learns from only a few examples provided in the prompt.
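For example, a few-shot prompt packs labelled examples straight into the input (the examples are illustrative):

```python
# Illustrative few-shot prompt: the model infers the task from the examples.
prompt = """Classify the sentiment as Positive or Negative.

Review: "Absolutely loved it!"           Sentiment: Positive
Review: "Waste of money."                Sentiment: Negative
Review: "Best purchase this year."       Sentiment: Positive
Review: "The battery died after a day."  Sentiment:"""
```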
G
24. Generative AI
AI capable of creating new content (text, images, music, or code) rather than just analysing data.
25. Gradient Descent
A core optimisation technique for minimising error during model training.
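A worked one-variable example: minimising f(x) = (x − 3)², whose gradient is 2(x − 3):

```python
# Gradient descent on f(x) = (x - 3)^2; the gradient is 2 * (x - 3).
x = 0.0
learning_rate = 0.1
for step in range(50):
    grad = 2 * (x - 3)
    x -= learning_rate * grad  # step in the opposite direction of the gradient
print(round(x, 4))  # approaches 3.0, the minimum of f
```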
26. Guardrails
Safety systems that restrict or monitor what an LLM can output.
H
27. Hallucination
When an AI model confidently produces incorrect or fabricated information.
28. Hyperparameters
Settings that control model training, like learning rate, layers, or batch size.
I
29. Inference
The process of generating outputs from a trained model based on input prompts.
30. Instruction Tuning
Training an LLM to better follow human instructions rather than just predict text patterns.
31. In-Context Learning
The model’s ability to learn patterns directly from examples within a single prompt session.
J–L
32. JSON Mode
A structured output format where LLMs produce responses in machine-readable JSON, essential for API integrations.
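With OpenAI’s Python SDK, for instance, JSON mode is requested through the `response_format` parameter (model name and prompt are illustrative):

```python
# Sketch of JSON mode with the OpenAI SDK; model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Reply in JSON with the keys 'city' and 'country'."},
        {"role": "user", "content": "Where is the Eiffel Tower?"},
    ],
)
print(response.choices[0].message.content)  # e.g. {"city": "Paris", "country": "France"}
```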
33. Knowledge Cutoff
The latest point in time covered by an LLM’s training data (e.g., GPT-4o’s cutoff is October 2023).
34. Latent Space
A compressed, mathematical space where concepts are represented by vectors.
35. LoRA (Low-Rank Adaptation)
A lightweight fine-tuning technique allowing customisation without retraining the full model.
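A sketch using Hugging Face’s peft library to attach LoRA adapters; the base model and hyperparameters here are illustrative, not recommendations:

```python
# Sketch: attaching LoRA adapters with Hugging Face's peft library.
# Base model and hyperparameters are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a small fraction of weights will be trained
```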
M
36. Model Weights
Parameters that define what the model has “learned.”
37. Multimodal Model
A model that processes text, images, and audio together for richer understanding (e.g., GPT-4V, Gemini).
38. Mistral AI
A French AI company whose open-weight LLM family (e.g., Mistral 7B, Mixtral) is optimised for efficiency and performance.
N–P
39. Natural Language Processing (NLP)
A field of AI focused on enabling machines to understand and generate human language.
40. Parameters
The adjustable values that determine the behaviour of an LLM (GPT-4 reportedly has over a trillion).
41. Prompt Engineering
The art of crafting inputs that guide LLMs toward better, more accurate outputs.
42. Pre-Training
Initial model training on large, diverse datasets before fine-tuning.
Q–R
43. Quantisation
Reducing model size and computational needs by lowering numerical precision; useful for edge devices.
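A toy illustration of the idea: mapping 32-bit floats to 8-bit integers with a single scale factor (real schemes such as GPTQ or AWQ are far more sophisticated):

```python
# Toy symmetric int8 quantisation of a weight matrix.
import numpy as np

weights = np.random.randn(4, 4).astype(np.float32)
scale = np.abs(weights).max() / 127.0           # map the largest weight to 127
q = np.round(weights / scale).astype(np.int8)   # 8-bit storage, 4x smaller than float32
dequantised = q.astype(np.float32) * scale      # approximate reconstruction at runtime
print(np.abs(weights - dequantised).max())      # small rounding error
```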
44. RAG (Retrieval-Augmented Generation)
Combines retrieval systems with LLMs to ground answers in factual data sources.
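In outline, a basic RAG pipeline retrieves relevant passages and prepends them to the prompt; `embed`, `vector_store`, and `llm` below are hypothetical stand-ins for a real embedding model, vector database, and LLM client:

```python
# Conceptual RAG sketch. embed(), vector_store and llm() are hypothetical
# stand-ins for a real embedding model, vector database, and LLM client.
def answer_with_rag(question, embed, vector_store, llm, top_k=3):
    query_vector = embed(question)
    passages = vector_store.search(query_vector, top_k=top_k)  # most similar documents
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```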
45. Reinforcement Learning from Human Feedback (RLHF)
Training method where humans rate outputs, helping align models with human preferences.
S
46. Semantic Search
Finding information based on meaning, not keywords, powered by embeddings.
47. Synthetic Data
Artificially generated data used to augment or replace real-world data during training.
48. System Prompt
The hidden instruction that defines an AI’s base behaviour (e.g., “You are a helpful assistant.”)
T–Z
49. Temperature
A parameter controlling randomness in model output; lower values make answers more deterministic, higher values more varied.
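The effect is easy to see by applying temperature to a softmax over candidate-token scores: low temperature sharpens the distribution, high temperature flattens it.

```python
# Temperature scaling of token probabilities.
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.array(logits) / temperature
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5]                      # raw scores for three candidate tokens
print(softmax_with_temperature(logits, 0.2))  # nearly all probability on the top token
print(softmax_with_temperature(logits, 1.5))  # probabilities spread more evenly
```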
50. Token
A basic unit of text used by LLMs (can be a word, subword, or character).
Bonus: Emerging Terms to Watch in 2026
- Contextual Memory Networks – enable long-term memory beyond context windows.
- Mixture of Experts (MoE) – models that use specialised subnetworks for different tasks.
- Prompt Chaining – linking multiple prompts for complex workflows.
- Retrieval Agents – AI systems capable of querying external databases autonomously.
Why This Glossary Matters
Understanding LLM terminology is no longer optional. It’s essential for:
- Building AI products confidently
- Collaborating with cross-functional teams
- Communicating AI insights to non-engineers
- Keeping up with an evolving ecosystem
If your business or product involves AI-driven features, learning these concepts will help you stay ahead of the curve.

Parvesh Sandila is a passionate web and mobile app developer from Jalandhar, Punjab, with over six years of experience. Holding a Master’s degree in Computer Applications (2017), he has also mentored over 100 students in coding. In 2019, Parvesh founded Owlbuddy.com, a platform that provides free, high-quality programming tutorials in languages like Java, Python, Kotlin, PHP, and Android. His mission is to make tech education accessible to all aspiring developers.