AI glossary · 32 terms

Speak the language.

From "attention" to "zero-shot" — clear, plain-English definitions for every term you'll meet across modern AI.

Agent An AI system that takes actions in the world — calling tools, browsing, executing code — rather than just emitting text.
Alignment The umbrella term for ensuring an AI's goals and behavior match what humans actually want.
Attention The mechanism inside transformers that lets each token weigh how much every other token matters.
Backpropagation The algorithm that trains neural networks by propagating the output error backward through the layers to compute a gradient for every weight.
Batch size The number of training examples processed before the model's weights are updated once.
Benchmark A standardized test (MMLU, SWE-bench, GSM8K) used to compare models on a fixed task.
Chain-of-Thought Prompting a model to think step-by-step in writing before producing a final answer.
Context window The maximum number of tokens a model can consider at once. Larger windows let it work with longer documents and conversations.
Diffusion A generative approach that learns to reverse noise — the foundation of most modern image models.
Embedding A dense vector representation of text, images, or other data — the lingua franca of semantic search and RAG.
Eval Short for evaluation — a test, often automated, that measures whether a model's output meets a quality bar.
Fine-tuning Further training a pretrained model on a specific dataset to specialize its behavior.
Hallucination When a model generates confident output that isn't grounded in any real source.
Inference Running a trained model to get predictions or outputs — what you pay for at runtime.
KV cache The stored key/value tensors that let a transformer avoid recomputing keys and values for tokens it has already processed during generation.
LoRA Low-Rank Adaptation — a parameter-efficient way to fine-tune large models by training small additive matrices.
MoE Mixture of Experts — an architecture where each token routes through only a few specialized sub-networks.
Multimodal A model that can natively process more than one input type — text, images, audio, video.
Parameter A learned weight inside a neural network. Modern frontier models have hundreds of billions to trillions of them.
Pretraining The expensive first phase where a model learns from a vast corpus before any task-specific training.
Prompt injection An attack where untrusted input convinces the model to ignore its original instructions.
Quantization Compressing a model's weights to lower precision (8-bit, 4-bit) to fit smaller hardware.
RAG Retrieval-Augmented Generation — fetching relevant documents and adding them to the prompt before generation.
Reasoning Multi-step problem-solving inside the model, usually via chain-of-thought or test-time compute.
RLHF Reinforcement Learning from Human Feedback — using human preferences to shape model behavior.
Sampling How the model picks the next token. Temperature, top-k, and top-p change the trade-off between diversity and predictability.
System prompt The instruction that sets a model's persona, rules, and tools — invisible to the end user but always present.
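Temperature A sampling knob that rescales token probabilities: 0 means always picking the most likely token (effectively deterministic), while higher values make output more random and creative.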
Token The atomic unit a model reads and writes: a short word, a piece of a longer word, or a punctuation mark.
Tool use A model's ability to call external functions (search, code execution, APIs) during a conversation.
Transformer The neural-network architecture, introduced in 2017, that underlies essentially every modern LLM.
Zero-shot Asking a model to do a task it was never explicitly trained on, with no examples in the prompt.
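
The Sampling and Temperature entries above can be made concrete with a small sketch. It assumes the model has already produced a raw score (a "logit") for each candidate token. The function names and the three example scores are hypothetical and chosen only for illustration; real inference libraries implement this differently and usually combine it with top-k or top-p filtering.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; sample_token handles 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Pick a token index; temperature 0 is treated as greedy (argmax)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.5)  # concentrates on the top token
warm = softmax_with_temperature(logits, 2.0)  # spreads probability more evenly
```

With these example scores, `cold` puts more probability on the first token than `warm` does, and `sample_token(logits, 0)` always returns index 0, which is the sense in which a temperature of 0 is deterministic.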