
How a Series B Fintech Deployed AI Agents Safely (Composite)


This post describes a hypothetical scenario based on common patterns we observe in distributed engineering teams. It is not a specific customer. Details have been generalized, and the outcomes are framed in directional terms rather than as precise measurements.

The company in this composite is a Series B fintech with a 45-engineer distributed team. The CTO had been under pressure for two quarters to demonstrate AI adoption inside engineering — both from the board, who wanted to see the company "leverage AI," and from engineers, who were already using third-party coding assistants and wanted institutional support. The CTO's concern was specific: in a fintech context, an AI agent that confidently answers a question about regulatory-adjacent code with a wrong answer is a real liability, not a productivity nuisance.

The structural problem

Most off-the-shelf coding assistants have no notion of declared state. They will happily summarize a service based on the code in front of them, without any awareness of what the team most recently decided about that service or what the current deployment posture is. In a fintech context, that creates a particular failure mode: the assistant gives a confident answer about something the team has explicitly decided to handle differently, and the engineer either trusts the wrong answer or has to verify it manually, in which case the assistant has saved no time.

The CTO's position was that any AI used inside engineering had to be willing to refuse, to say "this is not declared, I do not know," before it could be trusted in a regulated context.

The intervention

The team rolled out wraps, the short declarations of working state that engineers post before going offline, as the layer of declared state. It then scoped Representatives as the layer of grounded AI answering. The Representative answered engineering questions only from declared state, with citations, and refused when the answer was not in the record. The team continued to use third-party coding assistants for code completion and refactoring, which is a different shape of task, but routed factual questions about team state and decisions to the Representative.
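The answer-or-refuse behavior described above can be sketched in a few lines. This is a minimal illustration, not StandIn's implementation: the `Wrap` and `Answer` types, the `answer_from_record` function, the exact-match lookup by topic, and the sample wrap IDs are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wrap:
    """Hypothetical record of declared state posted by an engineer."""
    wrap_id: str
    topic: str
    statement: str


@dataclass(frozen=True)
class Answer:
    text: str
    citations: tuple[str, ...]
    refused: bool


REFUSAL_TEXT = "This is not declared, I do not know."


def answer_from_record(topic: str, wraps: list[Wrap]) -> Answer:
    """Answer only from declared wraps; refuse when nothing matches."""
    # Exact topic match is a stand-in for whatever retrieval a real system uses.
    matches = [w for w in wraps if w.topic == topic]
    if not matches:
        return Answer(text=REFUSAL_TEXT, citations=(), refused=True)
    # Assumes wraps are ordered oldest to newest, so the last match is current.
    latest = matches[-1]
    return Answer(text=latest.statement, citations=(latest.wrap_id,), refused=False)


# Illustrative data only.
wraps = [
    Wrap("wrap-101", "ledger-service", "Retries are disabled pending review."),
    Wrap("wrap-107", "ledger-service", "Retries re-enabled with a cap of 3."),
]
declared = answer_from_record("ledger-service", wraps)
undeclared = answer_from_record("kyc-service", wraps)
```

The design point is that the refusal is an ordinary return value rather than an error: a caller can inspect `refused` and treat it as a pointer to a gap in the declared state.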

The CTO instituted a guideline: any AI-generated summary that was going to be referenced in a compliance-adjacent decision had to come with a citation back to the underlying wrap. No citation, no use.
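A "no citation, no use" rule like this one is simple enough to express as an automated check. The sketch below is one hedged way to do it; the `Summary` type and `usable_for_compliance` function are hypothetical, and the extra requirement that every cited wrap ID actually resolves to a known wrap is an assumption added here, not something the guideline states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Hypothetical AI-generated summary with the wrap IDs it cites."""
    text: str
    wrap_ids: tuple[str, ...] = ()


def usable_for_compliance(summary: Summary, known_wrap_ids: set[str]) -> bool:
    """Return True only if the summary cites at least one wrap and every citation resolves."""
    if not summary.wrap_ids:
        # No citation, no use.
        return False
    # Assumption: a citation to a wrap that does not exist is treated as no citation.
    return all(wrap_id in known_wrap_ids for wrap_id in summary.wrap_ids)


# Illustrative data only.
known = {"wrap-101", "wrap-107"}
cited = Summary("Retries are capped at 3.", ("wrap-107",))
uncited = Summary("Retries are probably capped.")
dangling = Summary("Retries are capped at 5.", ("wrap-999",))

cited_ok = usable_for_compliance(cited, known)
uncited_ok = usable_for_compliance(uncited, known)
dangling_ok = usable_for_compliance(dangling, known)
```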

Governance, not a status channel

StandIn is async governance infrastructure. Engineers declare working state before they go offline. Representatives answer from the record, cite the source, and refuse when the answer is not there.


The directional results

After about five months, the team reported three directional changes. First, the engineers used AI inside engineering more, not less — the scoped Representative was something they trusted, where ungrounded assistants had been treated with suspicion. Second, the number of confidently-wrong AI summaries about team decisions dropped to effectively zero, because the Representative refused rather than hallucinated. Third, the compliance team — initially skeptical of any AI inside engineering — became more comfortable with the rollout once they saw that the AI was producing citable references rather than free-form summaries.

The friction the team did not anticipate was the gap between what engineers wanted AI to do and what the Representative would actually do. Engineers occasionally asked the Representative questions that required reasoning over code that was not in any wrap. The Representative refused. The team had to coach engineers on which kinds of questions belonged in which AI surface, which took time.

What the team would do differently

The retrospective surfaced three lessons. First, frame the refusal behavior as a feature from the start; engineers who expect a "smart" assistant interpret a refusal as a failure unless they have been told otherwise. Second, integrate the compliance team into the rollout early — they will want to know how this surface works before they will trust it. Third, pair the Representative with a separate coding-assistant surface; the two solve different problems and should not be presented as the same thing.

Frequently asked questions

Is this a real fintech?

No. This is a composite based on common patterns we observe in fintechs deploying AI inside engineering under regulatory pressure. The specifics have been generalized. The structural pattern — grounded AI, refusal as a feature, compliance team involvement — is what we see consistently.

Why not just use a normal coding assistant?

Normal coding assistants answer based on the code in front of them. They have no model of what the team most recently decided. In a fintech context, the team's decision is often the load-bearing fact — the code may be old, but the decision about how to interpret it is current. A grounded Representative respects that; a code-only assistant does not.

Is refusal really a feature?

In a regulated context, yes. A refusal points to a gap in the declared state; a hallucinated answer creates a downstream liability. The teams that adopt this framing earliest get the most value from it. The teams that fight the refusal behavior end up with the same problem the ungrounded assistants had.
