The industry throws around a lot of jargon. Here’s the vocabulary that actually matters — each term with its real definition and a plain-English translation, so you can talk models, agents, and governance without a computer-science degree.
A model trained on vast text to predict and generate language; the general-purpose engine behind most AI today.
In plain terms The text-prediction brain — it finishes your sentence, but at the scale of writing whole documents.
A base model further trained to follow instructions and hold a conversation, rather than just continue text.
In plain terms An assistant that does what you ask — versus raw autocomplete that just keeps typing.
The raw pretrained model before it's tuned to follow instructions or behave safely.
In plain terms The engine straight off the line, before it learned any manners.
A model tuned to “think” through intermediate steps before answering — better at math, code, and planning, but slower and pricier.
In plain terms The one that shows its working instead of blurting the first answer.
A compact model that's fast and cheap and can run on modest hardware or on-device.
In plain terms A pocket calculator to the LLM's mainframe — perfect for simple, high-volume jobs.
A model that handles more than text — images, PDFs, audio, sometimes video — and reasons across them.
In plain terms One that can see and hear, not just read.
A model that turns text (or images) into numerical vectors capturing meaning, so things can be compared and searched by similarity.
In plain terms It turns meaning into map coordinates, so “close” ideas sit near each other.
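For readers who want to see the "map coordinates" idea in code, here is a minimal sketch. The three-number vectors are hand-made for illustration (an assumption; real embeddings come from an embedding model and are far longer), and cosine similarity is one common way to measure closeness.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy "embeddings" for illustration only; real ones are produced
# by an embedding model and have hundreds or thousands of dimensions.
toy_vectors = {
    "donation": [0.9, 0.1, 0.0],
    "gift":     [0.8, 0.2, 0.1],
    "invoice":  [0.1, 0.9, 0.2],
}

query = toy_vectors["donation"]
scores = {
    word: cosine_similarity(query, vector)
    for word, vector in toy_vectors.items()
    if word != "donation"
}
closest = max(scores, key=scores.get)  # the word whose vector points the same way
```

With these toy numbers, "gift" scores far closer to "donation" than "invoice" does, which is the behavior a similarity search relies on.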
The billions of learned values that encode what a model “knows.” Downloadable weights are what make a model “open.”
In plain terms The billions of tiny knobs the training process dialed in.
How much text a model can consider at once — prompt plus conversation plus retrieved data.
In plain terms Its desk space, or short-term memory span: only so much fits at one time.
The unit of text a model reads and writes — roughly a word-piece. Usage and cost are measured in tokens.
In plain terms Syllable-sized chunks; also the meter your usage runs on.
A setting that controls randomness in a model's output — low is focused and repeatable, high is varied and creative.
In plain terms The creativity dial.
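A rough sketch of how this dial works in the common softmax formulation: the model's raw scores for candidate tokens are divided by the temperature before being turned into probabilities. The scores below are made up for illustration, and the function name is hypothetical.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

raw_scores = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

focused = softmax_with_temperature(raw_scores, 0.2)  # low temperature
varied = softmax_with_temperature(raw_scores, 2.0)   # high temperature
```

At the low setting nearly all of the probability lands on the top-scoring token; at the high setting it is spread more evenly, so sampled output varies more.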
Extra training on your examples to specialize a model's behavior, tone, or format.
In plain terms Sending the model to a short course to learn your house style.
Compressing a model's weights to lower precision so it runs faster and on cheaper hardware, with a small quality trade-off.
In plain terms Shrinking the file so it fits on a smaller machine.
A model wrapped with tools and a loop so it can pursue a goal by taking actions — not just answer a question.
In plain terms An AI worker that pushes buttons and gets things done, rather than a chatbot that only talks.
An agent that holds a goal across many steps — hours or days — sequencing work and recovering from failures.
In plain terms An AI that works a whole project, not just one reply, and remembers where it was.
The repeating cycle an agent runs: plan, act, observe the result, reflect, and go again until the goal is met.
In plain terms Its work rhythm — try, check, adjust, repeat.
A defined capability you give a model — query a database, send an email, run a search — that it can choose to invoke.
In plain terms A button you let the AI press; “tool calling” is it deciding which to press.
A specialized agent that handles one slice of a larger job, often in parallel with others.
In plain terms A team member the lead delegates a task to.
The agent that breaks a job into steps, routes them to the right workers or models, and merges the results.
In plain terms The project manager directing the team.
A plan of tasks and their dependencies with no loops — used to fan work out in parallel and recombine it.
In plain terms A flowchart of who-does-what, with no going in circles.
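As a minimal sketch of such a plan, the snippet below uses Python's standard `graphlib` module to group tasks into waves that could run in parallel. The task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical names).
plan = {
    "gather_documents": set(),
    "summarize_finance": {"gather_documents"},
    "summarize_legal": {"gather_documents"},
    "write_report": {"summarize_finance", "summarize_legal"},
}

sorter = TopologicalSorter(plan)
sorter.prepare()  # raises graphlib.CycleError if the plan contains a loop

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
    waves.append(ready)
    sorter.done(*ready)  # mark the wave finished so dependents unlock
```

Here the two summaries land in the same wave because neither depends on the other, while the report waits for both.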
The scaffolding around a model that gives it tools, memory, a loop, and an environment to act in. The model thinks; the harness lets it do.
In plain terms The cockpit the AI drives from — engine versus the whole vehicle around it.
Several agents working together, often a coordinator plus specialized workers.
In plain terms A team of AIs instead of a lone one.
Passing a task and its context from one agent (or a human) to another.
In plain terms A baton pass in a relay.
Describes AI that can act autonomously over multiple steps toward a goal, rather than responding once.
In plain terms Self-directed, not just reactive.
An agent pattern that interleaves reasoning and tool use — think, act, observe, think again.
In plain terms Think a bit, do a bit, repeat — instead of planning everything blind.
An orchestration pattern that plans all the steps up front, then executes them, to cut the number of model calls. Many newer systems use DAG-based coordination for this instead.
In plain terms Write the whole plan first, then run it — versus deciding each step as you go.
An agent approach that drafts a full plan, then carries it out step by step.
In plain terms Make the to-do list, then work it top to bottom.
Splitting a job across parallel workers (fan-out) and merging their results (fan-in).
In plain terms Divide the work among the team, then collect and combine the pieces.
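A small sketch of the split-and-merge shape, using a thread pool from Python's standard library. The `count_words` worker is a stand-in (an assumption); a real worker might call a model or a tool.

```python
from concurrent.futures import ThreadPoolExecutor

def count_words(text):
    """Stand-in for one worker's job (a real worker might call a model)."""
    return len(text.split())

chunks = [
    "first part of the report",
    "second part",
    "the third and final part",
]

# Fan-out: each chunk goes to its own worker.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_counts = list(pool.map(count_words, chunks))

# Fan-in: merge the partial results into one answer.
total_words = sum(partial_counts)
```

`pool.map` returns results in the same order as the inputs, which keeps the merge step simple.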
Saving an agent's progress so it can resume after a failure instead of starting over.
In plain terms Save points in a game — a crash doesn't cost you the whole run.
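One simple way to picture a save point, sketched under the assumption that progress is a list of finished step names written to a JSON file. The step names and the `fail_at` switch are hypothetical and exist only to simulate a crash.

```python
import json
import os
import tempfile

def run_steps(steps, checkpoint_path, fail_at=None):
    """Run steps in order, recording each finished step so a rerun can skip it."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)  # pick up where the last run stopped
    executed = []
    for step in steps:
        if step in done:
            continue  # already finished in an earlier run
        if step == fail_at:
            raise RuntimeError(f"simulated crash at {step}")
        executed.append(step)
        done.append(step)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)  # save progress after every step
    return executed

steps = ["fetch", "analyze", "draft", "send"]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_steps(steps, path, fail_at="draft")  # first run crashes partway
except RuntimeError:
    pass

resumed = run_steps(steps, path)  # second run skips the finished steps
```

The second run only executes "draft" and "send", because "fetch" and "analyze" were already recorded.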
The input you give a model — your instructions, question, and any context.
In plain terms What you tell it to do.
The standing instructions that shape how a model behaves across a whole session.
In plain terms The job description it's working under.
Information an agent keeps across steps or sessions, beyond the current context window.
In plain terms What it remembers between conversations, not just within one.
Everything currently in front of the model — prompt, history, and retrieved data.
In plain terms What's on its desk right now.
Fetching relevant facts from your documents or data at query time and grounding the answer in them.
In plain terms An open-book exam — it looks things up instead of answering from memory.
A store of embeddings that lets you search by meaning rather than exact keywords.
In plain terms A search index organized by what things mean, not just the words.
Tying a model's answer to real, citable sources so it isn't invented.
In plain terms Cite your sources — show where the answer came from.
An isolated, controlled space where an agent runs and executes actions, so mistakes can't touch production systems.
In plain terms A walled playground — it can experiment without breaking anything real.
Constraints that keep a model's behavior within safe, intended bounds.
In plain terms Bumpers in the bowling lane.
A design where a person reviews or approves the AI's work at key points.
In plain terms A person signs off before it counts.
A checkpoint where a consequential action pauses for human sign-off before proceeding.
In plain terms A “wait for a yes” stop on the assembly line.
An attack where malicious instructions hidden in content trick an agent into doing something it shouldn't.
In plain terms A con slipped into a document the AI reads and naively obeys.
When a model states something false with full confidence because it filled a gap with plausible fiction.
In plain terms Making things up — fluently, which is what makes it dangerous.
Systematically testing model or agent output against known-good criteria.
In plain terms Graded tests for the AI, so you can prove it's right often enough.
Stripping personally identifiable information out of data before it reaches a model.
In plain terms Blacking out names and numbers before anything leaves the building.
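A deliberately simple sketch of pattern-based redaction. The two patterns are illustrative assumptions that catch only email addresses and one US-style phone format; production systems typically combine many more patterns with dedicated detection tools.

```python
import re

# Illustrative patterns only; they will miss many real-world formats.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
US_PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def redact(text):
    """Replace matched emails and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return US_PHONE.sub("[PHONE]", text)

sample = "Reach Dana at dana@example.org or 555-010-1234."
redacted = redact(sample)
```

Note that the name "Dana" survives: fixed patterns cannot recognize names, which is one reason pattern matching alone is rarely enough.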
The policies, access controls, and audit trails that make AI use accountable and verifiable.
In plain terms The rules plus the paper trail.
A complete log of what an agent did, when, and on whose approval.
In plain terms The receipts.
Controls that detect and block sensitive data from leaving the organization.
In plain terms The guard that stops the donor list or a contract from being emailed or pasted out.
Crafting input that tricks a model into bypassing its own safety rules.
In plain terms Sweet-talking the AI into breaking the rules it was given.
The unauthorized transfer of data out of a system.
In plain terms Data being smuggled out the back door.
Corrupting training or source data so a model learns the wrong thing or behaves maliciously.
In plain terms Spiking the AI's diet so it picks up bad habits on purpose.
Compromising a trusted third-party model, library, or tool in order to reach you through it.
In plain terms Getting at you through a vendor you already trust.
Probing a model's outputs to steal its weights or reconstruct the data it was trained on.
In plain terms Reverse-engineering the AI to copy it or pull secrets back out.
A security model that verifies every request and trusts nothing by default — inside or outside the network.
In plain terms Check ID at every door, not just the front gate.
Granting the minimum access needed to do a job, and no more.
In plain terms Hand out only the keys someone actually needs.
Permissions assigned by role rather than per individual.
In plain terms What you can touch is set by your job, not by asking nicely.
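A minimal sketch of a role-based check. The role and action names are hypothetical; the point is that the lookup goes through the role, never the individual, and that an unknown role gets nothing by default.

```python
# Hypothetical roles and the actions each one may perform.
ROLE_PERMISSIONS = {
    "viewer": {"read_report"},
    "analyst": {"read_report", "run_query"},
    "admin": {"read_report", "run_query", "delete_record"},
}

def is_allowed(role, action):
    """Allow an action only if the role explicitly grants it."""
    return action in ROLE_PERMISSIONS.get(role, set())

can_analyst_query = is_allowed("analyst", "run_query")
can_viewer_delete = is_allowed("viewer", "delete_record")
can_stranger_read = is_allowed("contractor", "read_report")
```

Denying unknown roles by default keeps the same table in line with the minimum-access idea defined earlier.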
Scrambling data in storage and as it moves so only authorized parties can read it.
In plain terms A lock on the filing cabinet and on the armored truck.
Systematically mapping how a system could be attacked, and the defense for each path.
In plain terms Thinking like the attacker before they show up.
Deliberately attacking your own system to find weaknesses before an adversary does.
In plain terms Hiring friendly burglars to test the locks.
Employees using unsanctioned AI tools outside of IT's oversight.
In plain terms Staff quietly pasting work into consumer chatbots.
The overarching practice of continuously assessing and improving the security posture of your AI systems and the data and tools they touch.
In plain terms An always-on health check for how secure your whole AI setup is.
Inbound protection for AI workloads — screening every incoming request for jailbreaks, prompt injection, and data-leakage attempts before it reaches the model.
In plain terms The bouncer at the door, checking each request for tricks before it gets in.
Outbound data protection — scanning what the AI produces, and what its tools return, for PII or sensitive-data exfiltration before it leaves.
In plain terms The guard at the exit, making sure nothing sensitive walks out in the AI's answer.
Governing what an AI agent is entitled to access — the permissions and policies (RBAC / ABAC) that bound its reach.
In plain terms The access-control office deciding which doors the AI is allowed to open.
Continuously checking cloud configurations and connected systems for misconfigurations and risk.
In plain terms A standing audit of your cloud setup for accidentally-open doors.
A control point between users and cloud services that enforces security policy on the traffic between them.
In plain terms A checkpoint between your people and the cloud apps they use.
The first certifiable AI management system standard; it treats AI governance as an ongoing Plan-Do-Check-Act discipline.
In plain terms The certificate that says “we govern our AI responsibly, and can prove it.”
The US government's voluntary AI risk framework, organized as Govern, Map, Measure, Manage.
In plain terms A practical playbook for spotting and handling AI risk.
European regulation that classifies AI uses by risk level: it bans those deemed unacceptable and places strict obligations on high-risk ones.
In plain terms The law that controls AI based on how risky its use is — with real penalties.
An independent audit report on how a service provider handles security and customer data.
In plain terms The trust report buyers ask for before they'll share data with you.
The international standard for an information security management system.
In plain terms The gold standard for “we keep data secure.”
The NIST Cybersecurity Framework. Version 2.0 organizes it into six functions: Govern, Identify, Protect, Detect, Respond, Recover.
In plain terms The standard checklist for defending an organization against cyber risk.
The security community's list of the top risks in LLM applications, with mitigations.
In plain terms The top-ten ways AI apps get hacked — and how to stop each.
A knowledge base of real-world adversary tactics and techniques used against AI systems.
In plain terms A field guide to how attackers actually go after AI.
A way to diagram software at four zoom levels: context, container, component, code.
In plain terms A method for drawing systems that everyone reads the same way.
Cloud providers' pillars for sound systems: security, reliability, performance, cost, and operations.
In plain terms A checklist for building cloud systems that hold up in production.
Web Content Accessibility Guidelines — the standard for accessible digital products.
In plain terms The rules that make sites usable for people with disabilities.
A scaled, human-centered design framework built on a continuous loop, Hills, Playbacks, and Sponsor Users.
In plain terms A structured way to keep designing around real users at scale.
A delivery method with fixed time, variable scope, and “bets” instead of an endless backlog.
In plain terms Budget the time, shape the work to fit it, and bet on shipping that.
A design-process model with four phases: discover, define, develop, deliver.
In plain terms The classic shape of a design project — open up, then narrow down, twice.
A governed entry point that sits between your data and the models — authenticating, redacting, routing, and logging every request.
In plain terms The guarded front door, or customs: nothing crosses unchecked.
An open standard for connecting models to tools and data sources in a consistent way.
In plain terms A universal adapter — USB-C for AI tools.
A way for agents to communicate and delegate to one another directly.
In plain terms AIs talking to AIs to get a job done together.
A defined way for two pieces of software to talk to each other.
In plain terms A waiter that takes your request to the kitchen and brings back the result.
The act of running a trained model to produce an output.
In plain terms The AI actually “thinking” to answer — and what you pay for per use.
The delay between a request and the response.
In plain terms How long you wait for the answer.
Purpose-built code that connects an agent to a specific system or workflow when a standard connector won't do.
In plain terms A made-to-measure adapter for a system that doesn't come off the shelf.
Repeatedly asking a system, on a set interval, whether anything new has happened.
In plain terms Refreshing your inbox every minute to see if mail arrived.
The reverse of polling — a system calls you the moment something happens, so you don't have to keep asking.
In plain terms Getting a text when your package ships, instead of checking the site all day.
Sending a response piece by piece as it's produced, rather than all at once at the end.
In plain terms Watching the answer type out live instead of staring at a spinner.
A cap on how many requests you can make in a given window of time.
In plain terms The “please slow down” a service enforces when you ask too fast.
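One of several ways a service might enforce such a cap is a fixed-window counter, sketched here. The class name and the numbers are assumptions, and the clock is passed in as a plain number of seconds so the behavior is easy to follow.

```python
class FixedWindowLimiter:
    """Allow at most max_requests per window of window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_start = None
        self.count = 0

    def allow(self, now):
        # Start a fresh window on the first call or once the old one expires.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False  # over the cap: the caller should wait and retry

limiter = FixedWindowLimiter(max_requests=2, window_seconds=60)
decisions = [limiter.allow(t) for t in (0, 1, 2, 61)]
```

The third request is refused because two were already used in that minute; the fourth goes through because a new window has started.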
Designing an action so that repeating it has no extra effect.
In plain terms Pressing the elevator button twice doesn't summon two elevators.
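A common way to get this property is an idempotency key: the caller attaches the same key to every retry, and the service returns the stored result instead of acting again. The `PaymentService` class below is a hypothetical sketch, not any real provider's API.

```python
class PaymentService:
    """Toy service that charges at most once per idempotency key."""

    def __init__(self):
        self.results_by_key = {}
        self.charges = []

    def charge(self, idempotency_key, amount):
        # A repeated key means a retry: return the original receipt.
        if idempotency_key in self.results_by_key:
            return self.results_by_key[idempotency_key]
        self.charges.append(amount)
        receipt = {"charge_number": len(self.charges), "amount": amount}
        self.results_by_key[idempotency_key] = receipt
        return receipt

service = PaymentService()
first = service.charge("order-42", 25)
retry = service.charge("order-42", 25)  # e.g. the agent retried after a timeout
```

Both calls return the same receipt and only one charge is recorded, which matters when an agent retries an action after a failure.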
Processing a whole pile on a schedule versus handling each item the instant it arrives.
In plain terms Doing the laundry once a week versus washing each sock as it's worn.
A secret credential that authenticates your software to a service.
In plain terms The password your app logs in with — if it leaks, anyone can run up your bill or reach your data.
A specific address a service exposes for one function.
In plain terms The particular door you knock on for a particular thing.
A ready-made toolkit for building on top of a service.
In plain terms The parts kit, so you don't machine every screw yourself.
A line where tasks wait to be processed in order.
In plain terms The deli-counter ticket line.
Stored results kept so you don't recompute the same thing twice.
In plain terms Keeping a copy handy instead of redoing the work.
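A short sketch using `functools.lru_cache` from Python's standard library. The `slow_lookup` function is a stand-in (an assumption) for expensive work such as a model call.

```python
from functools import lru_cache

calls = 0  # counts how often the expensive work actually runs

@lru_cache(maxsize=128)
def slow_lookup(question):
    """Stand-in for an expensive operation, such as a model call."""
    global calls
    calls += 1
    return question.upper()

first = slow_lookup("what is a token?")
second = slow_lookup("what is a token?")  # served from the cache
```

The second call returns the stored answer without running the function body again, so `calls` stays at 1.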
A chain of automated processing steps, each feeding the next.
In plain terms An assembly line for data or work.
Doing several things at the same time rather than one after another.
In plain terms Opening more checkout lanes so the line moves faster.
A written promise on availability and performance (SLA), and the share of time a service is actually running (uptime).
In plain terms The guarantee, in writing, that the lights stay on.
Crafting instructions, context, and examples so a model reliably produces what you need.
In plain terms Asking well — phrasing and framing the request so you get a good answer.
Giving the model a few worked examples in the prompt (few-shot) or none at all (zero-shot).
In plain terms Showing it samples versus just asking cold.
Prompting or training a model to reason step by step before answering.
In plain terms Telling it to show its working.
Forcing a model to return data in a strict, machine-readable format like JSON.
In plain terms Making it fill out a form exactly, instead of writing a paragraph.
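On the receiving side, software usually checks that the reply really matches the expected form before using it. A minimal sketch, assuming a hypothetical two-field form with a name and an amount:

```python
import json

# Hypothetical form: the fields we expect and their types.
REQUIRED_FIELDS = {"name": str, "amount": float}

def parse_reply(reply_text):
    """Parse a model reply and check it against the expected fields."""
    data = json.loads(reply_text)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} is missing or not {expected_type.__name__}")
    return data

def is_valid(reply_text):
    """Return True if the reply passes the checks, False otherwise."""
    try:
        parse_reply(reply_text)
        return True
    except ValueError:
        return False

good = parse_reply('{"name": "Dana", "amount": 25.0}')
```

A chatty reply such as "Sure, here you go" fails the check, which lets the calling code retry or flag the problem rather than pass bad data along.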
Training a smaller, cheaper model to mimic a larger one's behavior.
In plain terms A star pupil learning to do the expert's job at a fraction of the cost.
Models, agents, memory, governance — we turn the vocabulary into working systems for organizations without an AI team.