AI glossary · 32 terms
Speak the language.
From "agent" to "zero-shot": clear, plain-English definitions for the terms you'll meet across modern AI.
- Agent An AI system that takes actions in the world — calling tools, browsing, executing code — rather than just emitting text.
- Alignment The umbrella term for ensuring an AI's goals and behavior match what humans actually want.
- Attention The mechanism inside transformers that lets each token weigh how much every other token matters.
- Backpropagation The algorithm that trains neural networks by propagating the output error backward through the layers to compute a gradient for every weight.
- Batch size The number of training examples processed before the model's weights are updated once.
- Benchmark A standardized test (MMLU, SWE-bench, GSM8K) used to compare models on a fixed task.
- Chain-of-Thought Prompting a model to think step-by-step in writing before producing a final answer.
- Context window The maximum number of tokens a model can consider at once. Bigger windows let the model work with longer documents and conversations.
- Diffusion A generative approach that learns to reverse a gradual noising process, turning random noise into data step by step; the foundation of most modern image models.
- Embedding A dense vector representation of text, images, or other data — the lingua franca of semantic search and RAG.
- Eval Short for evaluation — a test, often automated, that measures whether a model's output meets a quality bar.
- Fine-tuning Further training a pretrained model on a specific dataset to specialize its behavior.
- Hallucination When a model generates confident output that isn't grounded in any real source.
- Inference Running a trained model to get predictions or outputs — what you pay for at runtime.
- KV cache The stored key/value tensors that let transformers reuse the keys and values of tokens already processed instead of recomputing them for every newly generated token.
- LoRA Low-Rank Adaptation — a parameter-efficient way to fine-tune large models by training small additive matrices.
- MoE Mixture of Experts — an architecture where each token routes through only a few specialized sub-networks.
- Multimodal A model that can natively process more than one input type — text, images, audio, video.
- Parameter A learned weight inside a neural network. Modern frontier models have hundreds of billions to trillions of parameters.
- Pretraining The expensive first phase where a model learns from a vast corpus before any task-specific training.
- Prompt injection An attack where untrusted input convinces the model to ignore its original instructions.
- Quantization Compressing a model's weights to lower precision (8-bit, 4-bit) to fit smaller hardware.
- RAG Retrieval-Augmented Generation — fetching relevant documents and adding them to the prompt before generation.
- Reasoning Multi-step problem-solving inside the model, usually via chain-of-thought or test-time compute.
- RLHF Reinforcement Learning from Human Feedback — using human preferences to shape model behavior.
- Sampling How the model picks the next token. Temperature, top-k, and top-p change the trade-off between diversity and predictability.
- System prompt The instruction that sets a model's persona, rules, and tools; typically hidden from the end user but present in the model's context throughout the conversation.
- Temperature A sampling knob: 0 always picks the most likely token (effectively deterministic), while higher values make output more random and creative.
- Token The atomic unit a model reads and writes — roughly a syllable or a short word.
- Tool use A model's ability to call external functions (search, code execution, APIs) during a conversation.
- Transformer The neural-network architecture, introduced in 2017, that underlies essentially every modern LLM.
- Zero-shot Asking a model to do a task it was never explicitly trained on, with no examples in the prompt.
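To make the Sampling and Temperature entries concrete, here is a minimal sketch in plain Python of how temperature, top-k, and top-p might reshape a next-token distribution. The toy `logits`, the four-word `vocab`, and all function names are illustrative assumptions, not taken from any particular model or library; real inference stacks do this over vocabularies of tens of thousands of tokens.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        # Assumption: treat temperature 0 as greedy decoding (all mass on the top score).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep only the k most likely tokens, then renormalize."""
    keep = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical toy vocabulary and scores, for illustration only.
vocab = ["the", "a", "cat", "dog"]
logits = [2.0, 1.0, 0.5, 0.1]

rng = random.Random(0)
probs = top_p_filter(softmax_with_temperature(logits, 0.8), 0.9)
next_token = vocab[sample(probs, rng)]
```

Lowering the temperature or tightening `k` or `p` concentrates probability on the likeliest tokens (more predictable); raising them spreads it out (more diverse).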
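The Attention entry can also be illustrated in a few lines. The following is a simplified sketch of scaled dot-product attention in plain Python, assuming the query, key, and value vectors are already given; it leaves out the learned projections, multiple heads, and masking that real transformers use, and the function names are illustrative.

```python
import math


def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Score every key against this query, scaled by sqrt of the key dimension.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors according to the attention weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs


# Hypothetical toy input: one query attending over two tokens with identical keys,
# so both tokens receive equal weight and the output is the mean of the values.
result = attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [4.0, 6.0]])
```

A KV cache, in these terms, would store `keys` and `values` for tokens already processed so that only the newest token's query needs fresh computation.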