Async Governance Glossary
The vocabulary of declared-state coordination for distributed engineering teams. These are not aspirational terms. They are operational ones. Each definition describes something a team either has or does not have.
Most teams running across time zones have the tools. What they lack is the language — a shared vocabulary for the structural layer that sits between communication and work. This glossary defines the terms that make that layer legible.
Async governance
The discipline of defining who knows what, who owns what, and what happened since the last person was online — without requiring anyone to ask, infer, or speculate.
Async governance is not async communication. Communication tools transfer information. Governance infrastructure transfers state and accountability. A team can have excellent async communication and still have no governance layer. The difference shows up at the handoff.
The term distinguishes between two things most organizations conflate: the ability to send messages without being in the same room, and the ability to maintain operational continuity without being in the same timezone. The first is async communication. The second is async governance.
Foundation concept. See also: async governance infrastructure.
Async governance infrastructure
The system-level implementation of async governance. The set of protocols, tools, and structures that allow a distributed team to function with continuity — without depending on any individual's availability.
Async governance infrastructure governs three things: what is declared when someone goes offline, who holds authority when the primary owner is unavailable, and how the next person picks up work without needing to ask the previous person a question.
It is not project management. Project management tracks artifacts. Governance infrastructure tracks human context and declared intent.
Async standup
A structured asynchronous check-in format in which team members post written updates on what they worked on, what they plan to work on, and what is blocking them — without a scheduled meeting.
Async standups replicate the format of a standup meeting without its main functional advantage: real-time clarification. They collect information but do not transfer state. An async standup tells you what someone did. A declared handoff tells you what you need to do next.
Often confused with: declared handoff. The distinction matters — see that entry.
Continuity layer
The structural mechanism that ensures work advances across timezone boundaries without loss of context, ownership clarity, or decision authority.
A continuity layer is not documentation. Documentation captures the past for future reference. A continuity layer propagates the present — the current state of active work — so that the next person online can act without waiting.
Teams without a continuity layer experience a predictable set of symptoms: blocked PRs waiting for a reviewer who is asleep, decisions delayed until the primary decision-maker comes online, and context that exists only in the memory of the engineer who just went offline.
Declared handoff
A structured transfer of working state from one engineer — or one shift — to the next, made explicit before the outgoing person goes offline.
A declared handoff contains: what shipped, what is in progress, what is blocked and why, who owns the next action, and when the originating engineer will return. It is not a summary of what happened. It is an instruction set for what happens next.
The word "declared" is load-bearing. Declarations are explicit, timestamped, and attributable. They do not require the recipient to infer, ask, or guess.
Contrast with: async standup, which collects but does not transfer state.
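As a rough illustration, the five parts of a declared handoff listed above can be modeled as a small record. This is a minimal sketch, not StandIn's actual schema: the class name, field names, the gaps() helper, and the sample values are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DeclaredHandoff:
    """One declared handoff. All field names are illustrative."""
    author: str                   # attributable: who declared it
    shipped: list[str]            # what shipped
    in_progress: list[str]        # what is in progress
    blocked: dict[str, str]       # blocked item -> why it is blocked
    next_actions: dict[str, str]  # next action -> who owns it
    returns_at: datetime          # when the originating engineer will return
    declared_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )                             # timestamped when the declaration is made

    def gaps(self) -> list[str]:
        """List what the recipient would still have to ask about."""
        problems = [f"no reason given for blocker: {item}"
                    for item, why in self.blocked.items() if not why.strip()]
        problems += [f"no owner for next action: {action}"
                     for action, owner in self.next_actions.items()
                     if not owner.strip()]
        return problems


# Hypothetical sample data.
handoff = DeclaredHandoff(
    author="sarah",
    shipped=["rate limiter"],
    in_progress=["token refresh"],
    blocked={"staging deploy": ""},  # blocker declared without a reason
    next_actions={"review token refresh PR": "deepak"},
    returns_at=datetime(2024, 1, 9, 9, 0, tzinfo=timezone.utc),
)
print(handoff.gaps())  # ['no reason given for blocker: staging deploy']
```

A non-empty gaps() result marks the places where the recipient would still have to infer, ask, or guess.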
Declared state
The explicit, published record of an engineer's or team's current working context — available to anyone who needs it, without requiring the originator to be online.
Declared state includes: what the person is working on, what is blocked, what decisions have been made, and what the next person needs to know before touching the work. It is the difference between a team whose continuity depends on who is available and a team whose continuity is built into the system.
The opposite of declared state is ambient awareness — the informal, unstructured, often-invisible knowledge that exists only in the heads of people who happen to be online at the same time.
Foundation concept for: declared handoff, StandIn's wrap protocol.
Engineering wrap
A structured end-of-shift declaration made by an engineer before going offline. The wrap declares current state, outstanding blockers, decisions made during the shift, next actions and their owners, and the engineer's expected return.
The wrap is the atomic unit of async governance infrastructure. It is not a status update. A status update describes what happened. A wrap transfers responsibility for what happens next.
StandIn's core protocol.
Follow-the-sun development
A software development model in which engineering work is passed continuously across time zones — each shift picking up where the previous one left off — theoretically enabling 24-hour development cycles.
Follow-the-sun development fails in practice when teams have communication infrastructure but no governance infrastructure. Passing work across time zones requires more than a ticket in Jira and a message in Slack. It requires a declared transfer of state: current progress, open questions, blockers, and decision authority. Without that transfer, each shift loses 30 to 90 minutes reconstructing context before it can move.
Governance layer
The structural layer between communication and coordination in a distributed engineering organization. The governance layer defines what must be declared, when it must be declared, and who holds authority when the primary owner is unavailable.
Most distributed teams have a communication layer (Slack, email, Loom) and a project tracking layer (Jira, Linear, GitHub). Very few have a governance layer. The absence of the governance layer is the structural cause behind most distributed team coordination failures — missed handoffs, duplicated work, blocked decisions, and context that exists only in memory.
Distinct from: communication tools, project management tools, AI assistants.
Handoff context
The body of information that must be transferred from one engineer or shift to the next for work to continue without interruption. Handoff context includes: current task state, open decisions, blockers, relevant reasoning, and next actions.
Handoff context is not the same as documentation. Documentation captures what was done. Handoff context captures what needs to happen next and what the next person must know before touching the work. Most distributed team failures trace back to handoff context that was lost, assumed, or never declared.
Silence over speculation
The operating principle that when a declared state does not exist, the correct response is to acknowledge the absence of information — not to infer, synthesize, or guess.
Silence over speculation is the constraint that makes async governance infrastructure trustworthy. AI systems that synthesize answers from partial signals introduce accountability risk: a team might act on an inferred answer that turns out to be wrong. A governance system that refuses to guess — and explicitly says so — forces teams to build the habit of declaration.
The constraint that looks like a limitation is the source of the trust.
StandIn's core product philosophy.
State transfer
The act of passing the current working state of a piece of work from one person to another — explicitly and completely — so that the recipient can act without requiring the originator.
State transfer is the technical term for what a declared handoff achieves. It is distinct from information transfer, which is what async communication achieves. Information transfer tells you what happened. State transfer tells you what to do next, who owns it, and what authority you have to act.
Time-bounded representation
A temporary, explicitly declared form of delegated presence. A time-bounded representative holds decision-making authority for a defined scope during a defined period — typically while the primary owner is offline or in deep work.
Time-bounded representation differs from permanent delegation in that it has an explicit expiry. When the primary owner returns or the declared period ends, authority reverts automatically. It is not inferred from inactivity. It is not triggered by a calendar event. It is declared, scoped, and time-limited.
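The expiry rule can be sketched as a single check: authority sits with the representative only inside the declared scope and the declared window, and only while the owner has not returned. The sketch below is an assumption-laden illustration. The Representation class, its fields, the holder() method, and the sample names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Representation:
    owner: str            # primary owner who declared the delegation
    representative: str   # who holds authority in the meantime
    scope: str            # what the authority covers
    starts_at: datetime
    expires_at: datetime  # explicit expiry, declared up front

    def holder(self, scope: str, now: datetime,
               owner_returned: bool = False) -> str:
        """Return who holds decision authority for `scope` at `now`."""
        active = (
            scope == self.scope           # never wider than declared
            and not owner_returned        # reverts when the owner is back
            and self.starts_at <= now < self.expires_at
        )
        return self.representative if active else self.owner


start = datetime(2024, 1, 8, 18, 0, tzinfo=timezone.utc)
grant = Representation("sarah", "deepak", "auth-service",
                       start, start + timedelta(hours=12))

print(grant.holder("auth-service", start + timedelta(hours=1)))   # deepak
print(grant.holder("auth-service", start + timedelta(hours=13)))  # sarah
print(grant.holder("billing", start + timedelta(hours=1)))        # sarah
```

Nothing in the check looks at inactivity or calendar data. The only inputs are what was declared and whether the owner has returned.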
Representative
A Representative is the queryable version of a published wrap. When an engineer publishes their end-of-day handoff, that record becomes a Representative that teammates can query. The Representative answers from what was written, cites the source, and refuses when the answer isn't in the record.
Representatives are not bots. They don't infer, summarize Slack, or read between the lines. They don't read DMs, track activity, or analyze sentiment. Every answer traces back to a specific person, a specific wrap, and a specific timestamp. Every refusal means the record doesn't contain the answer.
Three types exist: Personal Representatives (one person's published work), Team Representatives (a team's combined wraps), and Project Representatives (an initiative spanning teams and timelines). All three follow the same rules: sourced answers or silence.
Core product concept. See also: Personal Representative, Team Representative, Project Representative.
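The "sourced answers or silence" rule can be sketched as a lookup that either returns the written text with a citation or returns nothing. This is a toy model under stated assumptions: the Wrap and Reply types and the ask() function are hypothetical, and an exact topic lookup stands in for whatever question matching a real system performs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Wrap:
    author: str
    published_at: datetime
    entries: dict[str, str]   # topic -> what the author wrote about it


@dataclass(frozen=True)
class Reply:
    text: Optional[str]       # None means the Representative refused
    source: Optional[str]     # citation: author plus wrap timestamp


def ask(wrap: Wrap, topic: str) -> Reply:
    """Answer only from the wrap; refuse when the topic was not declared."""
    if topic not in wrap.entries:
        return Reply(text=None, source=None)   # silence, not a guess
    citation = f"{wrap.author}, wrap published {wrap.published_at.isoformat()}"
    return Reply(text=wrap.entries[topic], source=citation)


# Hypothetical sample data.
wrap = Wrap(
    author="sarah",
    published_at=datetime(2024, 1, 8, 18, 0, tzinfo=timezone.utc),
    entries={"token refresh": "PR open, waiting on review"},
)
print(ask(wrap, "token refresh"))
print(ask(wrap, "billing migration"))  # Reply(text=None, source=None)
```

The refusal branch carries the design: an undeclared topic produces no text at all, so there is nothing for a teammate to mistake for an answer.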
Personal Representative
A Personal Representative is the queryable version of one person's published wrap. When Sarah publishes her end-of-day handoff, her Personal Representative goes live. Teammates can ask it questions about her work, and it answers from what she wrote.
The Personal Representative knows only what its author published. It cannot read their DMs, infer their mood, or guess at information they didn't include. When it answers, it cites the specific wrap and timestamp. When it refuses, it tells you the information wasn't declared.
Team Representative
A Team Representative combines published wraps from every member of a team into a single queryable surface. Instead of reading six individual wraps, you ask the Team Representative a question and get a sourced answer that spans the entire team's output.
Team Representatives turn status meetings into queries. 'What shipped overnight?' returns a sourced list from every engineer who published a wrap. 'Who owns the auth service?' returns a name and a timestamp. Every answer cites the specific engineer and wrap it came from.
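A query like "What shipped overnight?" can be pictured as a filter over published wraps that keeps the author and timestamp attached to every item. The sketch below assumes a simplified Wrap record and a hypothetical what_shipped() helper. It illustrates the sourcing rule rather than describing the product's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Wrap:
    author: str
    published_at: datetime
    shipped: list[str]


def what_shipped(wraps: list[Wrap],
                 since: datetime) -> list[tuple[str, str, datetime]]:
    """Return (item, author, wrap timestamp) for wraps published since `since`."""
    return [
        (item, w.author, w.published_at)
        for w in sorted(wraps, key=lambda w: w.published_at)
        if w.published_at >= since
        for item in w.shipped
    ]


# Hypothetical sample data.
cutoff = datetime(2024, 1, 8, 18, 0, tzinfo=timezone.utc)
wraps = [
    Wrap("sarah", datetime(2024, 1, 8, 17, 0, tzinfo=timezone.utc), ["old fix"]),
    Wrap("deepak", datetime(2024, 1, 9, 2, 0, tzinfo=timezone.utc), ["rate limiter"]),
]
for item, author, ts in what_shipped(wraps, cutoff):
    print(f"{item} (source: {author}, {ts.isoformat()})")
# rate limiter (source: deepak, 2024-01-09T02:00:00+00:00)
```

An empty result means no wrap in the window declared anything shipped. It does not mean nothing shipped.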
Project Representative
A Project Representative spans teams and timelines for a specific initiative. It pulls context from every team contributing to a project and makes it queryable through a single surface.
Project Representatives answer questions like 'What's the status of the payments migration?' by drawing from wraps across three teams in two timezones. Every answer traces back to a specific engineer's wrap. When the answer isn't in any published record, the Representative refuses.
Async work
Async work is a mode of operating in which progress does not depend on people being online at the same time. Tasks move forward through written declarations, structured records, and explicit ownership rather than through real-time conversation. The unit of work is the artifact, not the meeting.
It is distinct from remote work. Remote work describes where people are. Async work describes how decisions and handoffs travel. A team can be fully remote and still operate synchronously — interrupting each other in Slack, waiting on calls, and blocking on people in other timezones.
Async work only functions when state is declared in advance. Without that declaration, async devolves into delay: each shift waits for the previous one to wake up and explain what happened.
DORA metrics
DORA metrics are four measurements of software delivery performance defined by the DevOps Research and Assessment group: deployment frequency, lead time for changes, change failure rate, and mean time to recovery. They originated as research findings and were later adopted as a benchmarking framework for engineering organizations.
DORA metrics measure throughput and stability. They do not measure coordination cost, handoff quality, or decision latency. A team can hit elite DORA targets while still losing two hours every morning reconstructing what happened overnight.
The limit of DORA is that it treats engineering as a delivery pipeline. It rewards teams that ship often and recover fast. It is silent on whether the team can answer a question without paging someone, or whether decisions stall when an owner is offline.
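For reference, the four metrics can be computed from a list of deployment records. The sketch below uses simplified definitions that are assumptions for this example: median commit-to-deploy time for lead time, failed deployments over total deployments for change failure rate, and mean deploy-to-restore time for recovery. The Deployment record and the dora() function are hypothetical, and real tooling may define each metric differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set only for failed deployments


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora(deploys: list[Deployment], window_days: int) -> dict[str, float]:
    """Compute the four metrics over a window, using simplified definitions."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deploys if d.failed]
    recoveries = [hours(d.restored_at - d.deployed_at)
                  for d in failures if d.restored_at is not None]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(
            hours(d.deployed_at - d.committed_at) for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_recovery_hours": mean(recoveries) if recoveries else 0.0,
    }


# Hypothetical sample data: four deployments over a two-day window.
t0 = datetime(2024, 1, 8, 9, 0)
deploys = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=6), failed=True,
               restored_at=t0 + timedelta(hours=7)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
print(dora(deploys, window_days=2))
```

None of the inputs record how long anyone waited for context or a decision, which is the gap the entry describes.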
Engineering velocity
Engineering velocity is the rate at which a team converts decisions and code into shipped outcomes. It is the compound result of individual throughput, decision speed, review latency, and handoff quality. No single one of those produces velocity alone.
Velocity is often confused with output. Output measures activity — commits, tickets closed, lines written. Velocity measures the speed of the full cycle from intent to production. Two teams with identical output can have very different velocity if one waits 24 hours for every review.
In distributed teams, velocity is dominated by coordination cost. The team with declared state, decision authority maps, and structured handoffs ships faster than the team with brilliant individuals and broken handoffs.
Standup meeting
A standup is a short, recurring meeting — typically 15 minutes — in which each team member shares what they completed since the last meeting, what they intend to work on next, and what is blocking them. The format originated in agile software development and was designed for co-located teams.
The standup serves two functions in co-located work: surfacing blockers in real time and creating peer awareness of the team's shape. Both depend on people being in the same room at the same time.
In distributed teams, the standup often breaks down. It forces someone to attend at an inconvenient hour, transmits information that gets forgotten before it can be acted on, and replaces declared state with verbal summaries that do not survive the meeting.
Postmortem
A postmortem is a structured review conducted after an incident or completed project. It captures what happened, why it happened, what was done in response, and what the team intends to change. Well-run postmortems are blameless — focused on systems and processes rather than individuals.
A postmortem is distinct from a status update. Status updates describe ongoing work. Postmortems analyze closed events. The value of a postmortem comes from its commitment to record-level honesty about decisions and failures, including ones that look bad in hindsight.
Postmortems become useful institutional memory only when they are stored where future engineers can find them. A postmortem written in a Google Doc that no one ever opens again is not a postmortem — it is therapy.
Decision log
A decision log is a chronological record of decisions made by a team. Each entry captures the question that was decided, the options considered, the chosen path, the rationale, and the people accountable. The log is append-only — entries are not edited as circumstances change.
Decision logs are distinct from project plans and from postmortems. Plans describe intent. Postmortems analyze events. Decision logs capture the moment of choice and the reasoning at that moment, with no benefit of hindsight.
The value of a decision log compounds over time. Six months after a decision, the team can answer "why did we choose this?" without depending on anyone's memory. Two years later, the log explains the codebase to engineers who joined after the choice was made.
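The append-only property can be sketched in a few lines: entries are immutable once created, and the log exposes no way to edit or delete them. The Decision and DecisionLog names, their fields, and the sample entry are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)   # frozen: an entry cannot be changed after creation
class Decision:
    question: str
    options: tuple[str, ...]
    chosen: str
    rationale: str
    accountable: tuple[str, ...]
    decided_at: datetime


class DecisionLog:
    """Chronological, append-only record of decisions."""

    def __init__(self) -> None:
        self._entries: list[Decision] = []

    def append(self, entry: Decision) -> None:
        if self._entries and entry.decided_at < self._entries[-1].decided_at:
            raise ValueError("entries must be appended in chronological order")
        self._entries.append(entry)

    def entries(self) -> tuple[Decision, ...]:
        return tuple(self._entries)   # read-only copy; no edit or delete API

    def why(self, question: str) -> list[Decision]:
        """Answer 'why did we choose this?' from the record alone."""
        return [e for e in self._entries if e.question == question]


# Hypothetical sample entry.
log = DecisionLog()
log.append(Decision(
    question="Which queue for background jobs?",
    options=("Postgres table", "managed queue"),
    chosen="Postgres table",
    rationale="One fewer system to operate at current volume.",
    accountable=("sarah",),
    decided_at=datetime(2024, 1, 8, 15, 0, tzinfo=timezone.utc),
))
print(log.why("Which queue for background jobs?")[0].rationale)
```

When circumstances change, the team appends a new entry that supersedes the old one, so the reasoning at the original moment of choice stays on the record.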
RFC (request for comments)
An RFC, or request for comments, is a written proposal circulated to a defined audience for structured feedback before a decision is made. The author lays out the problem, the proposed solution, alternatives considered, and tradeoffs. Reviewers comment in writing, and the author either revises or proceeds.
RFCs are distinct from design docs in intent: a design doc explains a chosen approach, while an RFC asks whether the approach should be chosen. They are distinct from decision logs in time: the RFC is the deliberation, the decision log is the outcome.
RFCs work especially well for distributed teams because the entire decision process happens in writing. There is no privileged information from a meeting that didn't get recorded.
Engineering management
Engineering management is the discipline of coordinating engineers, decisions, and outcomes inside a software organization. It covers hiring, performance, prioritization, cross-team alignment, and the operational health of the team. It is a leadership role, not an individual contributor role.
Engineering management is distinct from technical leadership. A tech lead owns architectural decisions and technical direction. An engineering manager owns people, priorities, and the operating system of the team. The two roles overlap but are not the same.
In distributed organizations, engineering management is increasingly about designing systems that do not depend on the manager being online. The manager who is the bottleneck for every decision is the manager whose team stalls every weekend.
Tech lead
A tech lead is an engineer responsible for the technical direction and architectural decisions of a team or project. They define how the team builds things, review critical design choices, and resolve technical disagreements. The role typically does not include direct management authority over the engineers.
Tech lead is distinct from engineering manager. A manager owns people, priorities, and performance. A tech lead owns the technical bar and the architectural shape of the work. Some organizations combine the two into a single tech lead manager (TLM) role; most keep them separate.
In distributed teams, the tech lead carries an additional burden: the technical context that lives in their head must be made explicit. Otherwise, decisions stall whenever they are offline.
Return to office (RTO)
Return to office, commonly abbreviated RTO, refers to a corporate policy that requires employees to work from the office on specified days per week. RTO policies expanded broadly after 2022 as many organizations attempted to reverse the fully remote arrangements adopted during 2020-2021.
RTO is often framed as a culture or productivity intervention, but in practice it is frequently a response to coordination failures: handoffs that did not survive remote work, decisions that stalled overnight, and onboarding that broke down without in-person presence.
RTO and async work are not the same conversation. RTO addresses where people work. Async work addresses how work travels. A team that solves the second often makes the first irrelevant.
Hybrid work
Hybrid work is an arrangement that combines in-office and remote workdays. The mix varies — some organizations require three days in office, others two, some flex by team. The defining feature is that on any given day, part of the team is co-located and part is remote.
Hybrid work is structurally harder than either fully remote or fully co-located arrangements. In a fully co-located team, everyone benefits from in-person coordination. In a fully remote team, everyone is forced to use async governance. In hybrid, the team alternates between two operating models with different rules, and information is frequently lost in the gap between them.
The most common failure mode of hybrid: decisions made in the office that never reach the remote half of the team.
Distributed team
A distributed team is a team whose members work from multiple locations and, typically, multiple timezones. The defining property is that no single location holds a majority of the team at any given time. Some distributed teams are fully remote; others have multiple offices.
Distributed team is sometimes used interchangeably with remote team, but the terms are not equivalent. A remote team describes work-from-anywhere arrangements. A distributed team specifically implies operational spread across boundaries — geographic, timezone, or organizational.
The challenge of a distributed team is not where people work. It is whether decisions, handoffs, and context can travel across those boundaries without loss.
Remote work
Remote work is an arrangement in which employees work from locations outside a central office — from home, from co-working spaces, or from any other location with connectivity. The term describes geography, not method.
Remote work is often conflated with async work, but they are different. Remote describes where the work happens. Async describes how decisions and handoffs travel. A team can be fully remote and still operate synchronously, with constant calls and real-time interruptions.
Most remote teams do not benefit fully from their remote arrangement until they also adopt async governance. Otherwise, they recreate the in-office coordination model over video, with all of its costs and none of its incidental benefits.
Async communication
Async communication is the practice of exchanging messages without requiring a real-time response. Email, Slack threads, recorded videos, and written documents are async communication. The recipient reads on their own schedule and replies when ready.
Async communication is necessary for distributed teams. It is not sufficient. Communication tools transfer information; they do not transfer state or accountability. A team can have excellent async communication and still have no governance layer.
The mistake most teams make is assuming that adopting async tools — Slack, Loom, Notion — produces async operation. The tools enable async communication. They do not automatically produce declared state, handoff protocols, or decision authority. Those require deliberate design.
Engineering operations
Engineering operations, sometimes shortened to EngOps, is the discipline of running the systems that allow engineers to do their work. It covers onboarding, tooling, internal processes, on-call rotations, planning rituals, and the operational infrastructure of the team.
EngOps is distinct from platform engineering, which focuses on internal developer platforms and shared infrastructure. EngOps is broader — it includes the human and process layer in addition to the technical one.
In distributed teams, engineering operations is increasingly about coordination infrastructure: handoff protocols, declared state, decision authority maps, and the governance layer that lets the team operate continuously.
Developer experience (DevEx)
Developer experience, often abbreviated DevEx or DX, is the quality of the day-to-day experience of building software inside a given organization. It covers tooling, friction, flow time, feedback loops, and the cognitive load of doing the job. Good DevEx removes friction so engineers can focus on the work itself.
DevEx is distinct from developer productivity. Productivity measures output. DevEx measures the conditions that make that output possible. Investing in DevEx is how organizations make productivity sustainable.
For distributed teams, DevEx includes coordination friction: how long it takes to get a review, how often work is blocked waiting for someone in another timezone, how much time is spent reconstructing context. Those are first-order DevEx concerns.
Platform engineering
Platform engineering is the practice of building internal developer platforms that abstract infrastructure and recurring concerns so that product engineering teams can ship faster. The platform team treats developers as customers of an internal product.
Platform engineering is distinct from DevOps and from SRE. DevOps is a cultural movement; SRE is a reliability discipline; platform engineering is a product approach to internal infrastructure. The deliverable is a platform with users, not just a pipeline.
Good platform engineering produces leverage. One small platform team can multiply the velocity of every product team in the organization. Bad platform engineering produces an internal product no one wants to use, and the product teams build around it.
Engineering productivity
Engineering productivity is the rate at which a team converts engineering effort into shipped value. It is the broadest performance question an engineering organization can ask, and it is widely misunderstood because it is widely conflated with activity.
Activity metrics — commits, tickets closed, PRs merged — measure motion, not productivity. A team can be highly active and unproductive: shipping work no one needs, fixing bugs that recur, or rebuilding things that already existed. Activity measures what people did; productivity measures whether it mattered.
True engineering productivity depends on coordination as much as on individual throughput. The team that ships less but ships the right thing is more productive than the team that ships more of the wrong thing.
Engineering coordination
Engineering coordination is the work of aligning engineers, decisions, dependencies, and outcomes across the boundaries of a team or organization. It includes handoffs between shifts, alignment across teams, and the propagation of decisions from where they are made to where they are needed.
Coordination is distinct from communication. Communication moves information. Coordination moves outcomes. Two teams can communicate constantly and still fail to coordinate — they exchange messages but do not align action.
In distributed teams, coordination cost is often the dominant cost. The team is paying for every cross-timezone handoff, every delayed decision, every duplicated effort caused by missing context.
Engineering handoff
An engineering handoff is the transfer of in-progress work from one engineer to another — across shifts, timezones, or team boundaries. The handoff is successful when the receiving engineer can continue the work without needing to ask the previous engineer questions.
Handoffs are the most common failure point in distributed engineering. The work that travels between people is often half-finished, mid-decision, and dependent on context that lives only in the previous engineer's head. Without an explicit transfer of state, that context is lost.
A well-run handoff is a small artifact, not a conversation. It captures current progress, open decisions, blockers, ownership, and next actions in a form the next engineer can read in two minutes and act on immediately.
Engineering onboarding
Engineering onboarding is the process of bringing a newly hired engineer from day one to productive contribution. It typically spans the first 30 to 90 days and covers environment setup, codebase familiarization, team context, and the unwritten norms of how the team operates.
Onboarding in distributed teams is harder than in co-located ones, for a specific reason: a new engineer cannot pick up tribal knowledge by sitting near the right people. Everything they need must be written down or made queryable.
Teams with strong governance infrastructure — decision logs, postmortems, declared state, RFCs — onboard new engineers significantly faster. The new engineer reads the record instead of interviewing every senior engineer in the org.
Engineering mentorship
Engineering mentorship is the deliberate transfer of judgment, context, and craft from a more experienced engineer to a less experienced one. It is distinct from management — the mentor does not typically have authority over the mentee's performance — and it is distinct from training, which transfers explicit knowledge rather than judgment.
Good mentorship operates on questions the mentee does not yet know to ask. The mentor surfaces the considerations that come from experience: which decisions matter, which tradeoffs are real, which patterns are traps. Documentation cannot replace this transfer, though documentation makes it more efficient.
In distributed teams, mentorship requires extra effort because the incidental learning of overhearing senior engineers think out loud is largely absent.
Engineering ladder
An engineering ladder is a structured progression of role levels within an engineering organization, with defined expectations at each step. Ladders typically describe technical scope, impact, leadership, and autonomy at each level — from junior engineer through staff, principal, and beyond.
Ladders serve two functions. They make promotion criteria explicit so engineers know what to grow toward, and they create comparable role definitions across teams so that "senior engineer" means the same thing in different parts of the organization.
Good ladders separate the management track from the individual contributor track. Both can grow indefinitely. Bad ladders force every senior engineer into management to keep advancing.
Engineering scaling
Engineering scaling is the process of growing an engineering organization without losing throughput per engineer. It covers the structural questions that arise as a team moves from five engineers to fifty to five hundred: team boundaries, communication patterns, platform investment, and the governance layer that holds it all together.
Scaling is not the same as hiring. A team can hire aggressively and still not scale — adding people often reduces per-engineer throughput because coordination cost grows faster than capacity. Scaling is what happens when an org grows and the per-engineer output holds or increases.
The hardest part of scaling is structural, not personal. The patterns that worked at ten engineers break at thirty. The patterns that worked at thirty break at a hundred. Each transition requires deliberate redesign of how decisions, handoffs, and ownership flow.
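One common way to picture why coordination cost can outgrow headcount is to count the pairwise communication paths in a group, n(n-1)/2. Using that count as a proxy for coordination cost is an assumption of this sketch, not a claim made by the glossary, and the function name is hypothetical.

```python
def pairwise_paths(engineers: int) -> int:
    """Number of distinct pairs in a group of `engineers` people."""
    return engineers * (engineers - 1) // 2


for n in (5, 50, 500):
    paths = pairwise_paths(n)
    print(f"{n} engineers: {paths} paths, {paths / n} per engineer")
# 5 engineers: 10 paths, 2.0 per engineer
# 50 engineers: 1225 paths, 24.5 per engineer
# 500 engineers: 124750 paths, 249.5 per engineer
```

Headcount grows a hundredfold between the first and last row while the path count grows more than ten-thousandfold. That gap is what team boundaries and a governance layer are meant to contain.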
Cross-functional collaboration
Cross-functional collaboration is coordinated work across the functional boundaries of an organization — engineering, product, design, marketing, sales — toward a shared outcome. It is the operating model that most modern product organizations use to ship anything user-facing.
Cross-functional collaboration succeeds when each function brings its specific expertise without re-litigating decisions that belong to other functions. Engineering owns how. Product owns what. Design owns the experience. When those boundaries blur in real time, decisions slow and trust erodes.
In distributed organizations, cross-functional collaboration is harder because the boundary-blurring conversations that happen in offices — at lunch, in hallways — do not happen. They must be replaced with explicit structures: shared documents, written decisions, and declared ownership.
Sprint review
A sprint review is a meeting held at the end of a sprint in which the team demonstrates completed work to stakeholders and gathers feedback. The focus is the product — what was built, what was delivered, what was deferred — rather than the team's process.
Sprint review is distinct from sprint retrospective. The review looks outward at the product and stakeholders. The retrospective looks inward at how the team worked together. Conflating the two is one of the most common failures of poorly run agile practices.
In distributed teams, sprint reviews often work better when the demonstration is recorded and shared async, with a synchronous window reserved for live questions and discussion.
Sprint planning
Sprint planning is the meeting held at the start of a sprint in which the team commits to a body of work for the upcoming cycle. The meeting typically covers prioritization of the backlog, breakdown of work into tasks, estimation, and final commitment to a sprint goal.
Sprint planning differs from broader product planning in scope and time horizon. Product planning covers quarters and roadmaps. Sprint planning covers one to four weeks of execution-level commitments.
In distributed teams, sprint planning often benefits from async preparation. The team reviews the proposed backlog in writing before the meeting, surfaces questions in comments, and uses the synchronous time only for the final commitment.
Engineering culture
Engineering culture is the set of norms, values, and habits that define how an engineering team actually operates. It is observable in the small decisions: how code is reviewed, how disagreements are resolved, how mistakes are handled, how new engineers are treated.
Engineering culture is distinct from stated values. A team can have a values document that emphasizes psychological safety and a real culture that punishes mistakes. The document is aspirational; the culture is what people experience.
In distributed teams, culture is harder to transmit because so much of it is normally absorbed through proximity. Distributed teams that develop strong culture do it deliberately — through written norms, explicit examples, and the visible behavior of senior engineers in public channels.
AI agent
An AI agent is a software entity that takes actions toward a goal with some degree of autonomy. Unlike a passive AI assistant that responds to direct prompts, an agent maintains state across steps, plans sequences of actions, and operates without continuous human input.
The defining property of an agent is action, not generation. A chatbot generates text. An agent does things — calls APIs, edits files, makes commitments, sends messages. That difference creates a different governance question: not what did the AI say, but what is the AI allowed to do.
In organizational contexts, the central question for agents is authority. An agent that acts on behalf of a person or team must have explicit, scoped, and revocable authority — declared in advance, not inferred from convenience.
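One way to make "explicit, scoped, and revocable" concrete is a grant record that is checked before every action. The sketch below is illustrative only: `AuthorityGrant`, its fields, and the action names are hypothetical, not part of any real agent framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AuthorityGrant:
    """Hypothetical record of what an agent may do, for whom, and until when."""
    agent_id: str
    granted_by: str              # the named human owner
    allowed_actions: frozenset   # explicit scope, declared in advance
    expires_at: datetime
    revoked: bool = False

    def revoke(self) -> None:
        """Withdraw the grant; every later check fails."""
        self.revoked = True

    def permits(self, action: str, now: datetime) -> bool:
        """Allow an action only if it is in scope, unexpired, and not revoked."""
        if self.revoked or now >= self.expires_at:
            return False
        return action in self.allowed_actions


# Example values are assumptions made for illustration.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
grant = AuthorityGrant(
    agent_id="triage-agent",
    granted_by="alice",
    allowed_actions=frozenset({"comment_on_ticket", "label_ticket"}),
    expires_at=now + timedelta(hours=8),
)
```

Anything not listed in `allowed_actions` is denied by default, which is the point: authority is declared, not inferred from what the agent happens to be able to reach.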
Agentic AI
Agentic AI describes AI systems that act with goal-directed autonomy across multiple steps — planning, taking action, observing results, and continuing without explicit prompting at each step. It is the class of system rather than any specific product.
The term is currently more marketing than reality. Many products described as agentic are agents in the narrow sense — they execute multi-step workflows — but operate within very constrained scopes. The hard problems of organizational agency, such as durable authority and accountability across action sequences, are largely unsolved.
For engineering organizations, the relevant question is not whether a system is "agentic" but whether the actions it takes are scoped, declared, and reviewable.
Autonomous AI
Autonomous AI describes AI systems that take actions without human review at each step. The level of autonomy is a spectrum — from systems that act independently within tightly scoped rules to systems that pursue open-ended goals with minimal supervision.
Autonomy is not the same as agency. An agent is a system that takes actions. An autonomous system is one that takes actions without immediate human approval. Most AI agents in production today are not fully autonomous — they include human-in-the-loop checkpoints at consequential moments.
The governance question for autonomous AI is the same question that applies to any delegate: what is the scope of authority, how long does it last, and who is accountable when it acts.
AI orchestration
AI orchestration is the coordination of multiple AI models, tools, and steps into a coherent workflow. It includes routing decisions between models, chaining outputs, calling external tools, managing state across steps, and handling failures.
Orchestration is distinct from the underlying models. The model is the engine; orchestration is the operating system that decides when to call it, what to feed it, and what to do with the result. Most production AI value comes from orchestration, not from raw model capability.
For engineering teams, AI orchestration is a software engineering discipline that happens to involve AI calls. It requires the same rigor as any production system: observability, error handling, version control, and clear ownership.
Human in the loop (HITL)
Human in the loop, often abbreviated HITL, is an AI design pattern that requires explicit human approval at consequential moments in an otherwise automated workflow. The system can act on its own for routine steps but pauses for human review when actions cross a threshold of impact or risk.
HITL is distinct from fully manual systems and from fully autonomous ones. The point of HITL is to capture the speed of automation for routine work while preserving human accountability for high-stakes decisions.
The hard part of HITL design is choosing where the human enters the loop. Too late and the human is just rubber-stamping. Too early and the automation provides no leverage. The right placement is at the moments where authority is actually being exercised.
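A minimal sketch of that placement, assuming each proposed action carries a numeric risk estimate. The `risk_score` field, the 0.5 threshold, and the `approve` callback are all hypothetical; a real system would define impact in its own terms.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    description: str
    risk_score: float  # hypothetical 0.0 to 1.0 estimate of impact


def run_with_checkpoint(
    action: ProposedAction,
    approve: Callable[[ProposedAction], bool],
    risk_threshold: float = 0.5,
) -> str:
    """Execute routine actions directly; pause for a human at or above the threshold."""
    if action.risk_score < risk_threshold:
        return "executed automatically"
    if approve(action):
        return "executed after human approval"
    return "rejected by human reviewer"
```

Tuning `risk_threshold` is the design decision the entry describes: set it too low and the human reviews everything, set it too high and the review arrives after authority has already been exercised.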
AI accountability
AI accountability is the structural assignment of responsibility for AI actions to specific humans or roles inside an organization. It answers the question: when this AI system acts, who is accountable for the consequences?
Accountability is distinct from blame. The point is not to find someone to punish when AI fails. The point is to ensure that every AI action has a named human owner who is empowered to approve, audit, and revoke the system's authority. Without that named owner, AI mistakes fall into a gap where no one is responsible for noticing or correcting them.
Building AI accountability requires three things: declared scope of what the AI can do, named owners for each scope, and an audit trail that makes every action attributable. None of these is the AI's responsibility — they are the organization's.
AI governance
AI governance is the set of policies, controls, and structures that determine how AI systems are deployed, monitored, and held accountable inside an organization. It covers what AI is allowed to do, who can authorize new uses, how outcomes are audited, and how systems are retired or revoked.
AI governance is distinct from AI safety and from AI alignment. Safety asks whether AI systems can be made reliable in principle. Alignment asks whether their goals match human intent. Governance asks how an organization actually exercises authority over AI in its day-to-day operation.
Good AI governance is not a document. It is operational infrastructure — visible in the gates around AI deployment, the audit logs of AI action, and the named owners of every AI scope.
AI alignment
AI alignment is the research and engineering discipline focused on making AI systems pursue intended human goals reliably. It covers both the technical problem of specifying goals correctly and the engineering problem of building systems that follow those goals under pressure, drift, and distributional shift.
Alignment is narrower than AI safety, though the two overlap. Safety covers the broader question of whether AI systems are trustworthy. Alignment focuses specifically on the goal-pursuit question.
For most engineering organizations, alignment is not directly actionable — the work happens at the model layer, in labs. What organizations can do is build the governance layer around AI so that alignment failures are caught and contained, not amplified.
Retrieval-augmented generation (RAG)
Retrieval-augmented generation, abbreviated RAG, is an AI architecture pattern that retrieves relevant documents from a knowledge source and includes them in the model's context window before generating an answer. The model's response is then grounded in the retrieved material rather than only its training data.
RAG is distinct from fine-tuning. Fine-tuning bakes knowledge into the model's weights through additional training. RAG keeps the model unchanged and supplies fresh, organization-specific knowledge at query time. Each has tradeoffs; for most organizational knowledge use cases, RAG is the more practical pattern.
The quality of a RAG system depends almost entirely on the quality of the underlying retrieval. A model with bad context produces bad answers, regardless of the model's capability.
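The retrieve-then-generate shape can be sketched in a few lines. This is a toy under stated assumptions: word overlap stands in for the embedding or vector search a real system would use, the model call itself is omitted, and the function names and sample documents are invented for illustration.

```python
def tokenize(text: str) -> set:
    """Lowercase words with trailing punctuation stripped."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, documents: list, top_k: int = 2) -> list:
    """Rank documents by word overlap with the query (a stand-in for vector search)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(query: str, documents: list) -> str:
    """Place retrieved passages in the context ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"


# Sample knowledge source, invented for this example.
docs = [
    "The deploy freeze starts Friday at 17:00 UTC.",
    "On-call handoff happens every Monday.",
    "Lunch is served at noon.",
]
prompt = build_prompt("When does the deploy freeze start?", docs)
```

Only the deploy-freeze sentence reaches the prompt here. If `retrieve` had returned the wrong passage, no amount of model capability downstream would recover the right answer.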
AI workflow
An AI workflow is a multi-step process that uses AI models for one or more of its steps. The workflow defines inputs, outputs, transitions between steps, and human checkpoints. It is the operational unit that turns a model call into a business process.
An AI workflow is distinct from an AI agent. A workflow is a structured pipeline with defined steps. An agent can dynamically plan its own steps. Most production AI deployments are workflows, not agents — explicit step-by-step pipelines are easier to test, monitor, and govern.
For engineering teams, AI workflows should be treated like any production system: version-controlled, observable, testable, and owned. The fact that one step calls a model does not change the engineering discipline required.
AI ops
AI ops is the operational discipline of running AI systems in production. It covers deployment, monitoring, evaluation, cost management, drift detection, model rotation, and the lifecycle management of prompts, models, and orchestration logic.
AI ops is to AI systems what DevOps is to software systems. The deliverable is reliability and observability — the team can answer what the system did, why it did it, and whether it is still performing within expected bounds.
The discipline is young. Tooling is fragmented. Most teams running AI in production are inventing significant parts of their own AI ops infrastructure.
AI infrastructure
AI infrastructure is the technical foundation that supports AI workloads inside an organization. It includes model serving infrastructure, vector stores, embedding pipelines, orchestration runtimes, evaluation tooling, and the observability layer required to operate any of it in production.
AI infrastructure is distinct from AI applications. The infrastructure is the substrate; the applications are what you build on top of it. Most organizations consume infrastructure from providers — OpenAI, Anthropic, cloud platforms — and build applications and orchestration on top.
The infrastructure choice has outsized downstream consequences. Vendor lock-in, cost trajectory, latency, and the ceiling on what the organization can build are all set at the infrastructure layer.
AI safety in business
AI safety in business is the practical discipline of deploying AI inside an organization without creating accountability gaps, security risks, regulatory exposure, or reputational damage. It is the operational counterpart to academic AI safety research.
Business AI safety is distinct from theoretical alignment work. Theoretical work asks whether superintelligent systems can be made safe in principle. Business AI safety asks whether the AI a company deploys this quarter will hallucinate medical advice, expose customer data, or take actions the organization cannot defend.
The practical work is unglamorous: declared scope, audit trails, evaluation pipelines, human-in-the-loop checkpoints, and a culture that treats AI mistakes as system failures rather than individual errors.
Large language model (LLM)
A large language model, abbreviated LLM, is a neural network trained on massive amounts of text and code to generate and reason over natural language. Modern LLMs include the GPT series, Claude, Gemini, and others. They are the underlying technology behind most current generative AI products.
LLMs are distinct from earlier NLP systems in two ways: scale and generality. They are trained on far more data than prior systems, and they perform across a wide range of tasks without task-specific fine-tuning. The same model that drafts an email can write code, summarize a document, or answer factual questions.
For engineering teams, the LLM is a component, not a product. The product is the orchestration, retrieval, prompt design, evaluation, and human-in-the-loop layer built around the model.
Prompt engineering
Prompt engineering is the practice of designing the inputs given to a language model to produce reliable, useful outputs. It includes the wording of instructions, the structure of context, the inclusion of examples, and the format of the expected response.
Prompt engineering is sometimes dismissed as the practice of finding clever phrasings. In production systems, it is closer to API design: the prompt is the contract between the application and the model, and small changes have large downstream consequences. Reliable systems treat prompts as versioned, tested, and owned.
As models improve, individual prompt tricks become less important. What persists is the discipline of structuring inputs so that model behavior is predictable and improvements in the model translate into improvements in the system.
AI hallucination
An AI hallucination is a confident, fluent, plausible-sounding answer from an AI system that is factually wrong. The system does not signal uncertainty; the output looks indistinguishable from a correct answer.
Hallucination is a structural property of how language models work — they generate the most likely continuation of a prompt, not the truest one. Truth and likelihood are correlated but not identical, and the gap is where hallucinations live.
For organizational use of AI, hallucination is the central reliability problem. The fix is not at the model layer — it is in the surrounding architecture: grounding answers in retrieved sources, requiring citations, and accepting refusals when the system cannot answer reliably.
Decision making
Decision making is the process of choosing among options under uncertainty. In organizational contexts, it is rarely the act of a lone decider — it is the result of information flow, authority structure, deliberation, and the speed at which choices can be made and communicated.
Organizational decision making is distinct from individual judgment. An individual can have good judgment and still be unable to decide because the authority is not theirs. An organization can have good information and still be slow because the decision path is unclear.
Distributed teams face an additional decision-making constraint: synchronous decision-making does not scale across timezones. Either decisions move async, or they freeze whenever the decider is offline.
Decision documentation
Decision documentation is the written record of decisions made by a team, including the question, the options considered, the chosen path, the rationale, and the people accountable. It is the artifact that lets future engineers understand why the system looks the way it does.
Documentation is distinct from the decision itself. The decision happens once; the documentation is what survives. A team can make excellent decisions and have no documentation of them, and the value of those decisions decays the moment the people involved move on.
Good decision documentation is append-only and immutable. Later context is captured in new entries that reference the original record, not by editing the original.
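A minimal sketch of an append-only log, assuming an in-memory list as storage. `DecisionEntry`, `DecisionLog`, and the `amends` field are hypothetical names chosen for this example; a real system would persist entries somewhere durable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a recorded entry cannot be mutated
class DecisionEntry:
    entry_id: int
    question: str
    chosen: str
    rationale: str
    accountable: str
    amends: Optional[int] = None  # id of the earlier entry this one updates


class DecisionLog:
    """Append-only log: entries are added, never edited or removed."""

    def __init__(self) -> None:
        self._entries: list = []

    def append(self, question, chosen, rationale, accountable, amends=None) -> DecisionEntry:
        """Record a new entry, optionally pointing at the entry it amends."""
        if amends is not None and not 0 <= amends < len(self._entries):
            raise ValueError(f"cannot amend unknown entry {amends}")
        entry = DecisionEntry(len(self._entries), question, chosen, rationale, accountable, amends)
        self._entries.append(entry)
        return entry

    def history(self, entry_id: int) -> list:
        """Return the original entry followed by every entry that amends it."""
        return [e for e in self._entries if e.entry_id == entry_id or e.amends == entry_id]


# Example data is invented for illustration.
log = DecisionLog()
first = log.append("Which queue?", "SQS", "Already on AWS", "dana")
log.append("Which queue?", "SQS", "Load test confirmed throughput", "dana", amends=first.entry_id)
```

Reading `log.history(0)` returns the original decision and its later amendment side by side, so the reasoning at the time stays visible even after the context changes.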
Decision velocity
Decision velocity is the rate at which an organization moves from open question to committed answer. It is the time between when a decision is needed and when the decision is actually made — distinct from the time it takes to implement the resulting work.
Decision velocity is distinct from engineering velocity. A team can execute quickly and still ship slowly because every decision routes through a 48-hour deliberation cycle. Engineering velocity is bounded by decision velocity.
In distributed teams, decision velocity is dominated by authority and asynchrony. The team with clear async decision paths makes ten decisions while the team with synchronous-only paths makes two.
Decision quality
Decision quality is a measure of how sound a decision was, judged by the process and information available at the time it was made rather than by the outcome alone. A good decision made with the best available information can produce a bad outcome; a bad decision can occasionally produce a good outcome. Both should be evaluated by process, not luck.
Decision quality is distinct from decision velocity. The two often trade off — slowing down a decision can improve its quality, but only up to a point. Past that point, the additional deliberation produces no improvement and the cost of delay accumulates.
Measuring decision quality requires written records of the decision context, options, and reasoning at the time. Without those records, hindsight bias dominates the review.
Decision making framework
A decision making framework is a structured approach for making, documenting, and reviewing decisions consistently across an organization. Common frameworks include RACI, RAPID, DACI, and Amazon's one-way-door/two-way-door distinction. Each defines roles, sequence, and the form of the decision record.
Frameworks are not the decision itself. A framework provides scaffolding for the deliberation; the substance of the decision still depends on judgment, information, and authority. The framework's value is in making the process repeatable and the records consistent.
Distributed teams benefit more from explicit frameworks than co-located teams, because the implicit conventions that build up around a conference table do not build up around a Slack channel.
RACI
RACI is a decision-making and responsibility framework that assigns four roles to each task or decision: Responsible (does the work), Accountable (owns the outcome and approves), Consulted (provides input before the decision), and Informed (told after the decision). Each role can be a single person or a defined group.
RACI's strength is forcing the team to make implicit roles explicit. In its absence, accountability and responsibility blur, and everyone assumes someone else is on it. A RACI matrix names who holds each role.
RACI's weakness is overhead. For small teams and routine decisions, a full RACI matrix is more bureaucracy than the decision warrants. The framework works best for cross-functional initiatives where roles legitimately need to be declared.
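A RACI matrix is small enough to represent as plain data and check mechanically. The sketch below assumes a common convention that is not stated above: each task has at least one Responsible party and exactly one Accountable party (a person or a named group). The task names and role holders are invented.

```python
def validate_raci(matrix: dict) -> list:
    """Return a list of problems found in a task -> {role: [parties]} mapping."""
    problems = []
    for task, roles in matrix.items():
        if len(roles.get("R", [])) == 0:
            problems.append(f"{task}: no one is Responsible")
        # Assumed convention: exactly one Accountable party per task.
        if len(roles.get("A", [])) != 1:
            problems.append(f"{task}: expected exactly one Accountable")
    return problems


raci = {
    "Migrate billing database": {"R": ["eng-team"], "A": ["priya"], "C": ["finance"], "I": ["support"]},
    "Update pricing page": {"R": ["design"], "A": [], "C": ["product"], "I": []},
}
issues = validate_raci(raci)
```

Here `issues` flags the pricing page task, which has people doing the work but no one who owns the outcome.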
Consensus decision making
Consensus decision making is an approach in which a group seeks broad agreement among its participants before committing to a decision. It is distinct from majority voting and from authority-based decisions in that the goal is not a winning side but a position the whole group can support, even if not enthusiastically.
Consensus is not unanimity. Most consensus processes allow participants to "stand aside" — to disagree without blocking — so that a small minority does not have veto power. The threshold varies by group culture and decision stakes.
Consensus has clear benefits for buy-in and clear costs for speed. It works well for high-stakes decisions affecting a small group and works poorly for high-frequency decisions affecting a large one.
Asynchronous decision making
Asynchronous decision making is the practice of reaching decisions without requiring all participants to be online at the same time. The decision document or proposal circulates in writing, participants comment on their own schedule, and the decision is committed after a defined window or threshold.
Async decision making is distinct from delayed synchronous decision making. The latter is still a meeting — just one that took two weeks to schedule. The former replaces the meeting entirely with a written process.
For distributed teams, async decision making is not optional. Synchronous-only decisions across multiple timezones can add 24 to 48 hours of delay per decision, and when decisions depend on one another those delays compound into multi-week lag.
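The "defined window or threshold" rule can be written down explicitly so nobody has to ask whether a proposal is decided yet. This sketch assumes a particular policy that teams would need to agree on: any objection sends the proposal back to discussion, and silence until the window closes counts as consent. The 48-hour window and three-approval threshold are placeholder values.

```python
from datetime import datetime, timedelta, timezone


def decision_status(
    opened_at: datetime,
    now: datetime,
    approvals: set,
    objections: set,
    window: timedelta = timedelta(hours=48),
    approval_threshold: int = 3,
) -> str:
    """Commit when enough approvals arrive, or when the window closes without objection."""
    if objections:
        return "needs discussion"
    if len(approvals) >= approval_threshold:
        return "committed (threshold reached)"
    if now - opened_at >= window:
        return "committed (window closed)"
    return "open"


# Example timestamp, invented for illustration.
opened = datetime(2025, 3, 3, 9, 0, tzinfo=timezone.utc)
```

Because the rule is a pure function of the proposal's state, anyone coming online in any timezone can compute the same answer without waiting for the proposer.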
Committee decision making
Committee decision making is an approach in which decisions of a given type are routed through a standing group rather than made by a single individual. Common examples include architecture review committees, hiring committees, and design review groups.
Committees increase representation and reduce single-point-of-failure risk in decision making. They also reduce decision velocity, sometimes dramatically. The right use of committees is for decisions whose impact justifies the slower process — typically high-stakes, long-horizon, or cross-cutting choices.
Misused, committees become accountability sinks: decisions that no one made because everyone made them, and that no one is responsible for if they fail.
Institutional knowledge
Institutional knowledge is the accumulated context, history, judgment, and lore that lives inside an organization. It includes why decisions were made, why some approaches were tried and abandoned, who knows what, and the unwritten conventions that govern how things actually work.
Institutional knowledge has two forms: explicit (written, queryable, durable) and tacit (held in people's heads). The explicit form survives turnover; the tacit form leaves with the person. Most organizations underinvest in converting the second form into the first.
In distributed teams, the question is not whether tacit knowledge exists but whether enough of it gets externalized that the team can survive a senior departure without losing major context.
Team coordination
Team coordination is the work of aligning the actions of team members so that the team produces coherent output. It covers handoffs, shared planning, decision propagation, and the operational mechanics of multiple people working on related things.
Coordination is distinct from communication. Communication moves information between team members. Coordination ensures their actions actually line up. Two team members can communicate constantly and remain uncoordinated.
For distributed teams, coordination is the dominant operational cost. The team that invests in coordination infrastructure outperforms the team that relies on goodwill.
Team alignment
Team alignment is the shared understanding within a team of priorities, goals, and the path the team intends to take toward them. Aligned teams agree on what they are doing and roughly why; misaligned teams may execute brilliantly on different things and not converge.
Alignment is distinct from agreement. A team can be aligned on the goal while disagreeing about tactics. Alignment that requires unanimous agreement is rarely achievable and not actually necessary — what matters is shared direction.
For distributed teams, alignment requires deliberate effort. The incidental conversations that produce alignment in co-located teams — the lunch table, the whiteboard session — happen rarely or not at all in distributed ones, and must be replaced with explicit alignment artifacts: written goals, shared plans, declared priorities.
Team continuity
Team continuity is the property that work and knowledge survive turnover, absence, shift changes, and the unavailability of any individual team member. A team with strong continuity does not collapse when its lead goes on vacation or when an engineer leaves.
Continuity is distinct from redundancy. Redundancy means multiple people can do the same thing. Continuity means the team's state and reasoning survive transitions through declared records, documented decisions, and structured handoffs.
Most teams do not test their continuity until they need it, and by then the test is the failure. Strong continuity is built deliberately by externalizing knowledge before it is missed.
Knowledge transfer
Knowledge transfer is the deliberate movement of information, context, and judgment from one person or team to another. Typical occasions include onboarding, handoffs between teams, project transitions, and departures of key personnel.
Knowledge transfer is harder than it looks because most operational knowledge is tacit — held in the head of the person who has it, with no explicit form. A one-week handoff with the departing engineer often transfers a small fraction of what they actually know. The rest is lost.
Strong knowledge transfer practices externalize tacit knowledge continuously, not just at transition points. Decision logs, postmortems, and structured handoffs convert tacit knowledge into explicit form before it needs to be transferred.
Tribal knowledge
Tribal knowledge is the unwritten know-how that circulates among long-tenured team members — how things actually work, who to ask for what, which warnings in the codebase to ignore, which to take seriously. It is transmitted through proximity, mentorship, and informal conversation.
Tribal knowledge works at small scale, when the whole team has been around long enough to share it. It fails at scale: new hires cannot absorb it fast enough, distributed members cannot pick it up by proximity, and a single departure can take significant operational knowledge with it.
The shift from tribal knowledge to institutional knowledge is one of the major transitions every growing team has to make. It is uncomfortable because it requires writing down what experienced engineers consider obvious.
Bus factor
Bus factor is the minimum number of team members whose sudden absence — hit by a bus, in the dark joke that gives the term its name — would seriously disrupt the team's ability to operate. A team with a bus factor of one has a single person whose disappearance would cripple the work.
Bus factor measures concentration of knowledge, ownership, and authority. It is distinct from team size. A team of ten people can have a bus factor of one if a single individual holds disproportionate context. A team of three can have a bus factor of three if knowledge is well distributed.
Raising bus factor is a structural exercise: pair on critical systems, write down decisions, distribute authority, and avoid hero-engineer dynamics that concentrate context in one place.
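Bus factor can be estimated from a simple map of who knows which system. The sketch assumes one specific reading of "seriously disrupt": the team is disrupted as soon as any system has no one left who knows it. Under that assumption the bus factor is the size of the smallest group of knowers. The team data is invented.

```python
def bus_factor(knowledge_map: dict) -> int:
    """Smallest number of people whose absence leaves some system with no one who knows it."""
    if not knowledge_map:
        raise ValueError("knowledge_map must list at least one system")
    return min(len(people) for people in knowledge_map.values())


def single_points_of_failure(knowledge_map: dict) -> dict:
    """Map each person to the systems that only they know."""
    result: dict = {}
    for system, people in knowledge_map.items():
        if len(people) == 1:
            (person,) = people
            result.setdefault(person, []).append(system)
    return result


team = {
    "payments": {"ana"},
    "auth": {"ana", "ben"},
    "search": {"ben", "chen", "dev"},
}
factor = bus_factor(team)
risks = single_points_of_failure(team)
```

This four-person team has a bus factor of one because only one person knows payments. Pairing a second engineer on that system raises the factor to two without changing the team's size.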
Shadow IT
Shadow IT is technology — software, services, or infrastructure — adopted and used by employees or teams without explicit sanction from the central IT or engineering organization. Common examples include unofficial SaaS subscriptions, personal cloud accounts used for work, or AI tools introduced informally.
Shadow IT is sometimes treated as a security failure to be eliminated. It is more usefully understood as a signal: when employees route around official tooling, the official tooling is failing to meet a real need.
The right organizational response is rarely just blocking. It is investigating what the shadow tool does well, building or sanctioning a controlled alternative, and improving the governance layer so users do not have to go shadow to get their work done.
Shadow knowledge
Shadow knowledge is operational knowledge held informally inside an organization — in private notes, DMs, individual heads, and ad-hoc spreadsheets — outside the organization's official documentation. It is the knowledge that exists but is not findable through the systems the organization considers authoritative.
Shadow knowledge is a close cousin of tribal knowledge but broader. Tribal knowledge tends to be cultural — how things are done. Shadow knowledge includes operational data: which customer needs what, which deployment failed how, which third-party API has the quirky behavior.
Almost every organization runs primarily on shadow knowledge. The official documentation is a thin shell over the real, undocumented working memory of its people.
Psychological safety in engineering teams
Psychological safety is the shared belief inside a team that members can take interpersonal risks — admit mistakes, ask basic questions, disagree with senior people, surface bad news — without fearing punishment or status loss. The concept comes from Amy Edmondson's research and was identified by Google's Project Aristotle as the most predictive factor in effective teams.
Psychological safety is not the absence of disagreement or accountability. Highly safe teams disagree intensely and hold high standards. What they do not do is punish the act of disagreeing or the act of being wrong in good faith.
In engineering specifically, psychological safety determines whether engineers report problems early or hide them, whether they ask for help on hard problems or struggle silently, and whether they push back on bad designs or comply quietly.
Working agreement
A working agreement is an explicit set of norms a team adopts about how it will operate together — how decisions are made, how meetings are run, what response times are expected, how disagreements are handled, how on-call works. The agreement is written down, agreed to, and revisited periodically.
Working agreements are distinct from policies. Policies come from above and apply across the organization. Working agreements are team-level and team-owned. They reflect the operational reality of how this particular team chooses to work, often in ways that diverge from organizational defaults.
For distributed teams, working agreements are especially valuable because they replace the implicit conventions that build up in co-located teams. Things that go unsaid in offices have to be said in distributed teams, or they go unobserved.
Wrap
See: Engineering wrap.
This glossary is maintained by StandIn. Terms reflect the vocabulary of async governance infrastructure — the layer distributed engineering teams need but rarely name.