70+ AI terms explained in plain language. From foundational concepts like LLMs and transformers to emerging topics like vibe coding and context engineering.
AI systems that can autonomously plan, reason, and take actions to accomplish goals with minimal human intervention. Agentic AI often combines LLMs with tool use, memory, and decision-making capabilities.
A software system powered by AI that can perceive its environment, make decisions, and take actions to achieve specific objectives. AI agents can use tools, browse the web, write code, and interact with external services.
The research field focused on ensuring AI systems behave in ways that are consistent with human values and intentions. Alignment aims to make AI helpful, harmless, and honest.
The interdisciplinary field dedicated to ensuring AI systems do not cause unintended harm. AI safety encompasses technical research, policy, and practices to mitigate risks from both current and future AI systems.
Application Programming Interface. A set of protocols and tools that allows software applications to communicate with each other. AI APIs let developers send prompts and receive model responses programmatically.
A neural network technique that allows models to focus on the most relevant parts of the input when producing output. Attention is the core building block of the Transformer architecture and enables models to capture long-range dependencies in text.
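The core computation is easy to sketch. This is a toy, single-query version of scaled dot-product attention in plain Python (real implementations are batched, multi-headed, and run on GPUs):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.
    keys and values are lists of vectors of equal length."""
    d = len(query)
    # Score each key by its dot product with the query, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is the attention-weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

# The query matches the first key, so the output leans toward the first value.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

Because the query aligns with the first key, most of the attention weight (and hence most of the output) comes from the first value vector.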
The number of training examples processed together in one forward and backward pass during model training. Larger batch sizes can speed up training but require more memory, while smaller batch sizes may improve generalization.
A standardized test or dataset used to evaluate and compare the performance of AI models on specific tasks. Benchmarks provide objective metrics that help researchers and users understand model capabilities.
Systematic errors or unfair tendencies in AI outputs that reflect prejudices present in training data or model design. Bias can lead to discriminatory outcomes across dimensions such as race, gender, and socioeconomic status.
A prompting technique that encourages AI models to break down complex problems into intermediate reasoning steps before arriving at a final answer. This approach significantly improves performance on math, logic, and multi-step tasks.
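In practice, chain-of-thought is just a matter of how the prompt is phrased. A minimal sketch (the question and wording are illustrative, not a specific provider's format):

```python
question = "A train travels 60 miles in 1.5 hours. What is its average speed?"

# A chain-of-thought prompt asks the model to show intermediate
# reasoning steps before committing to a final answer.
cot_prompt = (
    f"Question: {question}\n"
    "Think through the problem step by step, showing each intermediate "
    "calculation, then give the final answer on its own line as 'Answer: ...'."
)
```

The same question asked directly often fails on multi-step arithmetic; asking for the steps first is what improves accuracy.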
A software application that simulates human conversation through text or voice interactions. Modern AI chatbots are typically powered by large language models and can handle a wide range of questions and tasks.
The use of AI models to automatically write, complete, or translate programming code based on natural language descriptions or partial code. Modern LLMs can generate code in dozens of programming languages.
A training approach developed by Anthropic where AI models are guided by a set of principles (a constitution) to self-critique and revise their outputs. This method reduces harmful responses without requiring extensive human feedback on every example.
Automated systems that screen AI inputs and outputs to block harmful, inappropriate, or policy-violating content. Content filters are a key safety layer applied on top of model behavior.
The practice of carefully designing and structuring the information provided to an AI model within its context window to maximize output quality. Context engineering goes beyond prompt engineering to include managing system prompts, retrieved documents, and conversation history.
The maximum amount of text (measured in tokens) that an AI model can process in a single interaction, including both the input prompt and the generated output. Larger context windows allow models to handle longer documents and conversations.
An AI assistant designed to work alongside humans, augmenting their capabilities rather than replacing them. Copilots are common in coding, writing, and productivity tools, offering suggestions while keeping the human in control.
Techniques for artificially expanding training datasets by creating modified versions of existing data. In NLP, this can include paraphrasing, back-translation, or synonym replacement to improve model robustness.
A technique where a smaller student model is trained to replicate the behavior of a larger teacher model. Distillation produces compact models that retain much of the larger model's capability while being faster and cheaper to run.
Running AI models directly on local devices such as phones, laptops, or embedded systems rather than in the cloud. Edge AI reduces latency, improves privacy, and enables offline operation.
A numerical representation of text, images, or other data as a vector of numbers that captures semantic meaning. Similar concepts have embeddings that are close together in vector space, enabling search, clustering, and recommendation systems.
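"Close together in vector space" is usually measured with cosine similarity. A minimal sketch with toy 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings: related concepts point the same way.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.9]

# "cat" is closer to "kitten" than to "car" in this space.
assert cosine_similarity(cat, kitten) > cosine_similarity(cat, car)
```

Semantic search works by embedding the query the same way and returning the documents whose vectors score highest against it.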
A prompting technique where a small number of example input-output pairs are included in the prompt to guide the model's behavior. Few-shot learning helps models understand the desired format and style without any fine-tuning.
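A few-shot prompt is simply the examples and the new input concatenated in a consistent format. A sketch for sentiment classification (the labels and layout are illustrative):

```python
examples = [
    ("The movie was fantastic!", "positive"),
    ("Terrible service, never again.", "negative"),
]

def few_shot_prompt(examples, new_input):
    """Assemble labelled example pairs plus the new input into one prompt."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    # End mid-pattern so the model completes the missing label.
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n".join(lines)

prompt = few_shot_prompt(examples, "Pretty good overall.")
```

The trailing "Sentiment:" is the trick: the model continues the pattern the examples established, so it answers in the same format without any fine-tuning.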
The process of further training a pre-trained model on a specific dataset to specialize it for a particular task or domain. Fine-tuning adjusts the model's weights to improve performance on targeted use cases while leveraging knowledge from pre-training.
A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. Foundation models like GPT, Claude, and Gemini serve as the base for chatbots, coding assistants, and many other applications.
A capability that allows AI models to generate structured requests to invoke external functions or APIs based on user input. Function calling enables models to perform actions like searching databases, making calculations, or interacting with services.
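The application's side of function calling is parsing the model's structured request and dispatching it. A sketch with a stand-in `get_weather` tool; the JSON shape shown is illustrative, as the exact format varies by provider:

```python
import json

def get_weather(city):
    """Stand-in for a real weather lookup (hypothetical tool)."""
    return {"city": city, "temp_c": 21}

# Registry mapping tool names the model may request to real functions.
TOOLS = {"get_weather": get_weather}

# A model with function calling emits a structured request like this
# instead of free-form text.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
```

The result is then usually fed back to the model in a follow-up turn so it can answer the user in natural language.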
Generative Pre-trained Transformer. A family of large language models developed by OpenAI that generate text by predicting the next token. GPT models, including GPT-4 and GPT-4o, are among the most widely used AI models.
Graphics Processing Unit. A specialized processor originally designed for rendering graphics that is now essential for training and running AI models. GPUs excel at the parallel matrix computations that neural networks require.
Safety mechanisms and constraints built into AI systems to prevent harmful, biased, or undesirable outputs. Guardrails can include input validation, output filtering, topic restrictions, and behavioral guidelines.
When an AI model generates information that sounds plausible but is factually incorrect or entirely fabricated. Hallucinations are a major challenge for LLMs and can include invented citations, false statistics, or fictional events.
A benchmark developed by OpenAI for evaluating AI models on code generation tasks. It consists of programming problems that test a model's ability to write correct Python functions from docstrings.
The process of using a trained AI model to generate predictions or outputs from new inputs. Inference is what happens when you send a prompt to an AI model and receive a response.
An adversarial technique that attempts to bypass an AI model's safety guidelines and restrictions to produce prohibited or harmful content. Jailbreaks exploit vulnerabilities in model training or prompt processing.
The time delay between sending a request to an AI model and receiving the first token of a response. Lower latency is critical for real-time applications like chatbots and interactive coding assistants.
A hyperparameter that controls how much a model's weights are adjusted during each step of training. A learning rate that is too high can cause unstable training, while one that is too low results in slow convergence.
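Both failure modes are visible even on a one-variable toy problem. Minimising f(x) = x² with plain gradient descent:

```python
def gradient_descent(lr, steps=50, x=5.0):
    """Minimise f(x) = x**2; the learning rate lr scales each update."""
    for _ in range(steps):
        grad = 2 * x       # derivative of x**2
        x = x - lr * grad
    return x

good = gradient_descent(lr=0.1)      # shrinks x toward the minimum at 0
too_high = gradient_descent(lr=1.1)  # each update overshoots; x blows up
```

With lr=0.1 each step multiplies x by 0.8, so it converges; with lr=1.1 each step multiplies x by -1.2, so the iterates oscillate with growing magnitude, which is exactly the unstable-training behavior described above.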
Large Language Model. A type of AI model trained on vast amounts of text data that can understand and generate human language. LLMs power chatbots, coding assistants, and many other AI applications.
Massive Multitask Language Understanding. A widely used benchmark that tests AI models across 57 academic subjects including math, history, law, and science. MMLU scores are a common way to compare model knowledge and reasoning.
The infrastructure and processes involved in deploying a trained AI model so it can accept requests and return predictions in production. Model serving includes load balancing, scaling, and optimizing for latency and cost.
An architecture where a model contains multiple specialized sub-networks (experts) and a gating mechanism that routes each input to only a subset of experts. MoE allows models to have more total parameters while keeping inference costs manageable.
AI models or systems that can process and generate multiple types of data, such as text, images, audio, and video. Multimodal models like GPT-4o and Gemini can understand images and produce text descriptions, or generate images from text.
An NLP task that identifies and classifies named entities in text into predefined categories such as person names, organizations, locations, and dates. NER is widely used in information extraction and document processing.
An AI model whose trained weights are publicly released, allowing anyone to download, use, and modify it. Open-weight models like Llama and Mistral enable local deployment and customization but may not include training data or code.
When a model learns the training data too well, including its noise and quirks, and fails to generalize to new unseen data. Overfitting results in high training accuracy but poor real-world performance.
A learnable value within a neural network that is adjusted during training to improve model performance. The number of parameters in a model, often measured in billions, is a rough indicator of its capacity and complexity.
Additional training steps applied after pre-training to refine model behavior, including instruction tuning, RLHF, and safety training. Post-training transforms a raw language model into a helpful, safe assistant.
The initial phase of training an AI model on a large, diverse dataset to learn general language patterns and knowledge. Pre-training typically involves predicting the next token across billions of text documents.
A technique where the output of one AI prompt is used as input for another, creating a sequence of steps to accomplish complex tasks. Prompt chaining breaks down difficult problems into manageable sub-tasks.
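A two-step chain can be sketched as two calls where the first output becomes part of the second prompt. `call_model` here is a hypothetical stub standing in for any real model API:

```python
def call_model(prompt):
    """Hypothetical stand-in for a real model API call; it just echoes
    a canned response so the chaining structure is runnable."""
    return f"[model response to: {prompt[:40]}...]"

article = "Long article text..."

# Step 1: summarise the article.
summary = call_model(f"Summarise this article in two sentences:\n{article}")

# Step 2: feed the summary (step 1's output) into the next prompt.
questions = call_model(f"Write three quiz questions about this summary:\n{summary}")
```

Each step gets a small, focused prompt, which tends to be more reliable than asking for the whole pipeline in one shot.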
The practice of crafting effective instructions and inputs to get the best possible outputs from AI models. Good prompt engineering involves clear instructions, relevant context, examples, and structured formatting.
A technique that reduces the precision of a model's numerical weights, for example from 32-bit to 8-bit or 4-bit numbers, to decrease memory usage and speed up inference. Quantization enables large models to run on consumer hardware with minimal quality loss.
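The basic idea of symmetric 8-bit quantization fits in a few lines: pick a scale so the largest weight maps to 127, round everything to integers, and multiply back by the scale to use the weights. A toy sketch over a plain list of floats:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integers."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.01]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each weight now takes 1 byte instead of 4, and the round-trip error is bounded by half the scale, which is why the quality loss is usually small.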
Retrieval-Augmented Generation. A technique that enhances AI model responses by first retrieving relevant information from external knowledge sources, then including that information in the prompt. RAG reduces hallucinations and keeps responses grounded in up-to-date facts.
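The retrieve-then-prompt loop can be sketched end to end. This toy retriever ranks documents by word overlap with the query; production RAG systems use embedding similarity over a vector database instead:

```python
docs = [
    "The Eiffel Tower is 330 metres tall.",
    "Python was created by Guido van Rossum.",
    "The Great Wall of China is over 21,000 km long.",
]

def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query; keep the top k."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

query = "How tall is the Eiffel Tower?"
context = "\n".join(retrieve(query, docs))

# The retrieved passage is placed in the prompt so the answer is
# grounded in it rather than in the model's parametric memory.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Because the answer must come from the supplied context, the model cites the retrieved fact instead of guessing, which is how RAG reduces hallucinations.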
A restriction on the number of API requests a user or application can make within a given time period. Rate limits protect AI services from overload and ensure fair usage across customers.
The practice of systematically testing AI systems by attempting to find vulnerabilities, biases, and failure modes through adversarial prompting. Red teaming helps identify safety issues before models are deployed to the public.
Reinforcement Learning from Human Feedback. A training technique where human evaluators rank model outputs to create a reward signal that guides the model toward more helpful and harmless behavior. RLHF is a key step in making LLMs safe and useful.
The use of AI to identify and classify the emotional tone or opinion expressed in text, such as positive, negative, or neutral. Sentiment analysis is widely used in social media monitoring, customer feedback, and market research.
AI model responses that follow a specific data format such as JSON, XML, or a defined schema rather than free-form text. Structured output makes it easier to integrate AI responses into software applications programmatically.
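A common pattern is to pin down an exact JSON shape in the prompt, then parse and validate whatever comes back. The model reply below is illustrative, not a real API response:

```python
import json

# The prompt specifies the exact JSON shape expected back.
prompt = (
    "Extract the person and city from the sentence below. "
    'Respond with only JSON: {"person": string, "city": string}\n'
    "Sentence: Ada Lovelace lived in London."
)

# A well-behaved model reply (hypothetical):
reply = '{"person": "Ada Lovelace", "city": "London"}'

data = json.loads(reply)              # fails loudly if the format broke
assert set(data) == {"person", "city"}  # validate the expected keys
```

Parsing plus validation is what makes the response safe to hand to downstream code; many APIs now also offer schema-enforced output modes that guarantee well-formed JSON.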
The task of condensing longer text into a shorter version while preserving the key information and meaning. AI-powered summarization can handle documents, articles, meeting transcripts, and conversations.
A benchmark that evaluates AI models on real-world software engineering tasks drawn from GitHub issues and pull requests. SWE-bench measures a model's ability to understand codebases and produce working fixes for actual bugs.
Artificially generated data created by AI models or algorithms rather than collected from real-world sources. Synthetic data is increasingly used to train and fine-tune AI models when real data is scarce, expensive, or privacy-sensitive.
A special instruction provided to an AI model that sets its behavior, persona, and constraints for an entire conversation. System prompts are typically hidden from the end user and define how the model should respond.
A parameter that controls the randomness of an AI model's output. Lower temperatures produce more focused and deterministic responses, while higher temperatures increase creativity and variation but may reduce accuracy.
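Mechanically, temperature divides the model's logits before the softmax that turns them into token probabilities. A small sketch showing the effect:

```python
import math

def token_distribution(logits, temperature):
    """Softmax over logits scaled by 1/temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = token_distribution(logits, temperature=0.2)  # sharply peaked
hot = token_distribution(logits, temperature=2.0)   # much flatter
```

At low temperature nearly all the probability mass sits on the top token (near-deterministic output); at high temperature the distribution flattens, so less likely tokens get sampled more often.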
AI models that generate images from natural language text descriptions. Popular text-to-image models include DALL-E, Midjourney, and Stable Diffusion, which can create realistic photos, illustrations, and artwork from prompts.
AI models that generate video content from natural language descriptions. Text-to-video represents a frontier in generative AI, with models like Sora producing increasingly realistic video clips from text prompts.
The number of requests or tokens an AI system can process per unit of time. High throughput is essential for serving many users simultaneously and is a key metric when evaluating AI infrastructure.
The basic unit of text that AI models process, typically representing a word, subword, or character. A single word may be split into multiple tokens; as a rough rule of thumb, one token is about 3-4 characters of English text, or roughly three-quarters of a word.
The process of breaking text into tokens that an AI model can process. Different models use different tokenization schemes, which affect how efficiently they handle various languages and special characters.
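A toy greedy longest-match tokenizer shows the mechanics. The vocabulary here is hand-picked for illustration; real schemes like BPE or WordPiece learn tens of thousands of subword pieces from data:

```python
# Hand-picked toy vocabulary (real tokenizers learn theirs from a corpus).
VOCAB = {"un", "break", "able", "unbreak", "ing", "read"}

def tokenize(word, vocab):
    """Greedy longest-match subword tokenization."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

tokens = tokenize("unbreakable", VOCAB)  # splits into subword pieces
```

The same word can split very differently under different vocabularies, which is why tokenizer choice affects how efficiently a model handles rare words and non-English text.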
The ability of AI models to interact with external tools and services such as web browsers, code interpreters, calculators, and APIs to accomplish tasks. Tool use is a key capability that enables agentic AI behavior.
Decoding strategies that control which tokens the model considers when generating text. Top-p (nucleus sampling) selects from the smallest set of tokens whose cumulative probability exceeds p, while top-k limits selection to the k most likely tokens.
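Both filters are short to implement over a token probability distribution. A sketch using plain Python lists (indices stand in for token IDs):

```python
def top_k_filter(probs, k):
    """Keep only the k highest-probability tokens, then renormalise."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Nucleus sampling: keep the smallest set of top tokens whose
    cumulative probability reaches p, then renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = [], 0.0
    for i in order:
        keep.append(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

probs = [0.5, 0.3, 0.15, 0.05]
```

With these probabilities, top-k with k=2 keeps tokens 0 and 1, while top-p with p=0.9 keeps tokens 0, 1, and 2; unlike top-k, the nucleus adapts its size to how peaked the distribution is.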
Tensor Processing Unit. A custom AI accelerator chip designed by Google specifically for machine learning workloads. TPUs are optimized for the matrix operations used in training and running neural networks.
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequences of data in parallel. Transformers are the foundation of virtually all modern large language models including GPT, Claude, Gemini, and Llama.
A style of programming where developers describe what they want in natural language and let AI generate the code, focusing on the creative direction rather than writing code manually. Vibe coding lowers the barrier to building software and is popular for prototyping.
A technique where an AI model performs a task without being given any examples in the prompt, relying entirely on its pre-trained knowledge. Zero-shot performance indicates how well a model generalizes to new tasks out of the box.