Module 023 · Agent ROI · Bots of Today

Section 01

Hello

Opens the module·The ROI trap

Most agent ROI numbers are fiction. Here's how to produce the real ones.

Your CFO asked: "what's the ROI on the AI agents we're paying for?" You don't have a clean answer. Neither does anyone. "Time saved" estimates are soft. "Tickets deflected" is gameable. "Productivity lift" is a vibe.

The problem isn't that agents don't produce value. It's that most companies never instrumented their agents to measure it. By the time the CFO asks, there's no way to answer honestly.

This module is 90 minutes of building the measurement you needed six months ago. By the end, you'll have:

Four categories of agent value, each measurable.
An ROI dashboard for one specific agent in your stack.
A quarterly review ritual so the CFO's next ask has an answer.

Section 02

Thinker

Reasoning·Four kinds of agent value

Agents produce value in four categories. Each category takes a different kind of measurement work.

Cost reduction. Hours not spent, headcount avoided, vendor fees not paid. Directly measurable in dollars.
Revenue produced. Qualified leads created, deals influenced, churn prevented. Measurable with attribution work.
Quality lift. Lower defect rates, faster cycle times, higher customer satisfaction. Measurable with before/after.
Capability unlock. Things the team can now do that they couldn't before. Hardest to measure but often the biggest real value.

The ROI trap

Reporting only the first category (cost reduction) makes agents look boring and leaves out capability unlock, which is often the biggest real value. A good ROI report covers all four.

The counterfactual problem

ROI is always "what happened vs. what would have happened without the agent." You never see the counterfactual directly. You estimate it. Be honest about the estimate's noise.
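
One way to keep that noise visible is to carry the counterfactual as a low/high pair instead of a single number. A minimal sketch, assuming made-up ticket counts and a hypothetical helper named `value_range`:

```python
def value_range(actual, counterfactual_low, counterfactual_high):
    """Return the (low, high) outcome attributable to the agent.

    actual: observed outcome with the agent (e.g. tickets resolved).
    counterfactual_low/high: estimated outcome without the agent.
    """
    if counterfactual_low > counterfactual_high:
        raise ValueError("counterfactual_low must not exceed counterfactual_high")
    # A high counterfactual leaves less credit for the agent,
    # so it produces the low end of the attributable range.
    return actual - counterfactual_high, actual - counterfactual_low


# Hypothetical numbers: 5,000 tickets resolved with the agent; without it,
# the team estimates it would have resolved between 3,800 and 4,500.
low, high = value_range(5000, 3800, 4500)
print(f"Attributable tickets: {low} to {high}")
```

The width of the result is the honesty: if the two counterfactual estimates are far apart, the attributable range is wide, and the brief should say so.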

Section 03

Talker

Prompts·The measurement prompt

The measurement prompt

Run this once per agent, quarterly, to produce a real ROI brief.

You are a finance analyst. I'm producing a quarterly ROI
brief for an AI agent. I'll give you:

- The agent's name and what it does.
- Volume data: number of calls and actions, last 90 days.
- Cost data: LLM spend, engineer hours spent on maintenance.
- Outcome data: [relevant metric: tickets resolved, leads
  qualified, drafts produced, etc.].

Produce a brief with:
1. Cost incurred (hard dollars).
2. Estimated cost avoided (hours saved × loaded rate).
3. Estimated revenue influence (if applicable, with
   assumption stated).
4. Quality signal (before/after, if available).
5. Capability unlock (what can we now do? 1 sentence).
6. Net assessment: positive, break-even, negative. One
   sentence of reasoning.

Be conservative on estimates. Flag assumptions. Do not
manufacture precision.

The "do not manufacture precision" line matters. CFOs smell false precision. Ranges beat fake decimals.
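
Items 1, 2, and 6 of the brief are arithmetic you can check by hand before trusting the model's output. A minimal sketch with hypothetical inputs. The rule for the net assessment is an assumption, since the prompt does not define thresholds: positive if even the low estimate exceeds cost, negative if even the high estimate falls short, break-even otherwise.

```python
def cost_avoided_range(hours_low, hours_high, loaded_rate):
    """Estimated cost avoided (hours saved x loaded rate), as a (low, high) range."""
    return hours_low * loaded_rate, hours_high * loaded_rate


def net_assessment(cost_incurred, avoided_low, avoided_high):
    """Classify the quarter from a range rather than a point estimate (assumed rule)."""
    if avoided_low > cost_incurred:
        return "positive"
    if avoided_high < cost_incurred:
        return "negative"
    return "break-even"  # cost sits inside the estimated range


# Hypothetical quarter: $12,000 total cost, 200 to 500 support hours saved at $50/hr.
low, high = cost_avoided_range(200, 500, 50)
print(low, high, net_assessment(12_000, low, high))
```

With these numbers the cost falls inside the $10,000 to $25,000 range, so the sketch reports break-even rather than picking a midpoint and calling it a win.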

Section 04

Rememberer

Memory·Metrics as memory

ROI data needs a home. Without one, you rebuild the dataset every quarter.

[company-repo]/roi/
  agents/
    support-agent/
      usage-Q1-2026.csv
      outcomes-Q1-2026.csv
      brief-Q1-2026.md
    sales-agent/
      ...
  loaded-rates.md         (fully-loaded cost per role)
  attribution-rules.md    (how to count influenced revenue)
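
If you want to create that layout in one step, a short script can do it. A minimal sketch using only the standard library; the function name `scaffold_roi_dir` and its arguments are illustrative, and the files it creates are empty placeholders for you to fill in.

```python
from pathlib import Path


def scaffold_roi_dir(root, agents, quarter):
    """Create the roi/ layout under `root`. Existing files are left untouched."""
    roi = Path(root) / "roi"
    for agent in agents:
        agent_dir = roi / "agents" / agent
        agent_dir.mkdir(parents=True, exist_ok=True)
        for stem in ("usage", "outcomes"):
            (agent_dir / f"{stem}-{quarter}.csv").touch()
        (agent_dir / f"brief-{quarter}.md").touch()
    roi.mkdir(parents=True, exist_ok=True)
    (roi / "loaded-rates.md").touch()
    (roi / "attribution-rules.md").touch()
    return roi


# Example call (run from the repository root):
# scaffold_roi_dir(".", ["support-agent", "sales-agent"], "Q1-2026")
```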

The loaded-rates file

Ask finance for the fully loaded hourly cost per role. For example: SDR, $80/hr; support rep, $50/hr; engineer, $150/hr. Write it down once so everyone references the same number.
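
In code, the same idea is a single lookup that every calculation goes through. A minimal sketch using the example rates from the text; the dictionary keys and the helper name `hours_to_dollars` are hypothetical.

```python
# Example loaded rates in dollars per hour. The keys are illustrative.
LOADED_RATES = {"sdr": 80, "support_rep": 50, "engineer": 150}


def hours_to_dollars(role, hours):
    """Convert hours to dollars using the shared loaded rate for a role."""
    if role not in LOADED_RATES:
        # Fail loudly instead of letting someone guess a rate.
        raise KeyError(f"No loaded rate recorded for role: {role}")
    return LOADED_RATES[role] * hours


# 40 engineer hours of maintenance at $150/hr.
print(hours_to_dollars("engineer", 40))
```

Raising on an unknown role is deliberate: a missing rate should send someone back to finance, not to a guess.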

The attribution rules

Revenue influence is a policy question. Does a lead that the agent qualified and a human closed count 100% for the agent? 50%? 0%? Write the rule. Apply consistently.
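
Once the rule is written, applying it consistently is mechanical. A minimal sketch, assuming a hypothetical policy table; the shares below are placeholders, not a recommendation, and the involvement labels are invented for illustration.

```python
# Hypothetical policy: share of deal revenue credited to the agent,
# keyed by how the agent was involved. Your attribution-rules.md decides these.
ATTRIBUTION_SHARE = {
    "qualified_only": 0.5,   # agent qualified the lead, a human closed it
    "closed_by_agent": 1.0,
    "no_agent_touch": 0.0,
}


def influenced_revenue(deals):
    """Sum revenue credited to the agent.

    deals: iterable of (amount, involvement) pairs.
    """
    return sum(amount * ATTRIBUTION_SHARE[involvement] for amount, involvement in deals)


deals = [
    (20_000, "qualified_only"),
    (8_000, "closed_by_agent"),
    (15_000, "no_agent_touch"),
]
print(influenced_revenue(deals))
```

Changing a share in the table changes every brief the same way, which is the point of writing the rule down.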

Section 05

Doer

Actions·Build your first ROI dashboard

Twelve minutes. Produce a real ROI brief for one agent in your stack.

Build block · 12 minutes

Build an ROI brief

Step 1. Pick the agent (1 min)

Whichever you're least sure is paying off.

Step 2. Pull the data (4 min)

Three numbers, minimum:

LLM spend, last 90 days.
Engineer hours spent on maintenance, last 90 days.
One outcome metric (tickets resolved, leads created, drafts shipped).
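
Those three numbers already give you a total cost and a cost per outcome. A minimal sketch with hypothetical figures; `cost_per_outcome` is an illustrative name, and the engineer rate should come from your loaded-rates file.

```python
def cost_per_outcome(llm_spend, engineer_hours, engineer_rate, outcomes):
    """Return (total 90-day cost, cost per unit of outcome)."""
    if outcomes <= 0:
        raise ValueError("Need at least one measured outcome")
    total = llm_spend + engineer_hours * engineer_rate
    return total, total / outcomes


# Hypothetical: $6,000 LLM spend, 40 engineer hours at $150/hr,
# 4,000 tickets resolved in the last 90 days.
total, per_ticket = cost_per_outcome(6_000, 40, 150, 4_000)
print(f"Total cost: ${total:,}  Cost per ticket: ${per_ticket:.2f}")
```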

Step 3. Run the measurement prompt (2 min)

Paste the prompt from Talker with your data. Save the output as brief-Q[N]-[YYYY].md, matching the layout in Rememberer.

Step 4. Stress-test the assumptions (3 min)

Read every assumption the brief makes. Ask: is this number a vibe or a fact? Adjust. Conservative beats overstated.

Step 5. Share (2 min)

Send to your CFO or whoever asked. Make the assumptions visible. Offer to redo with better data next quarter.

Expected

One honest ROI brief. Not a defense, not a pitch. An analysis. That's the baseline for the rest of the year.

If something's wrong

You have no outcome data: instrument now. Every agent needs at least one outcome metric. If it doesn't have one, you can't measure it.
The brief looks too good: you're being generous. Cut the estimate by 30% and see if it still holds.
The brief looks bad: that's useful information. Maybe this agent should be retired, not reported.
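
The 30% cut is easy to script so you apply it the same way every time. A minimal sketch with hypothetical dollar figures; the function name `survives_haircut` is illustrative.

```python
def survives_haircut(estimated_value, cost_incurred, haircut=0.30):
    """True if the estimate still exceeds cost after cutting it by `haircut`."""
    return estimated_value * (1 - haircut) > cost_incurred


# Hypothetical: $12,000 cost. A $20,000 estimate holds after the cut;
# a $16,000 estimate does not.
print(survives_haircut(20_000, 12_000))
print(survives_haircut(16_000, 12_000))
```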

Section 06

Rookie

Pitfalls·Three ROI failures

Failure 1. Reporting activity as outcome

"Our agent handled 5,000 tickets this quarter." That's activity. The outcome question is: were those tickets resolved? At what customer satisfaction? The activity metric hides the outcome.

Fix: every agent dashboard shows activity AND outcome. If you can't measure outcome, start there before claiming value.
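
A dashboard row that cannot be built without the outcome numbers enforces this. A minimal sketch, assuming hypothetical field names and a CSAT score on a 1 to 5 scale:

```python
def dashboard_row(handled, resolved, csat_scores):
    """Pair the activity count with outcome metrics for one agent."""
    if handled <= 0:
        raise ValueError("handled must be positive")
    return {
        "tickets_handled": handled,             # activity
        "resolution_rate": resolved / handled,  # outcome
        "avg_csat": sum(csat_scores) / len(csat_scores) if csat_scores else None,
    }


# Hypothetical quarter: 5,000 tickets handled, 3,500 actually resolved.
row = dashboard_row(5000, 3500, [4, 5, 3, 4])
print(row)
```

Here "5,000 tickets handled" turns into a 70% resolution rate, which is the number the CFO actually needs.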

Failure 2. Fake precision

"This agent saved $127,453.22 this quarter." No it didn't. It saved somewhere between $40k and $200k, with a lot of assumptions. Writing "$127k" hides the range.

Fix: report ranges. "Cost savings: $50k-$150k, assuming X." Executives respect honesty about uncertainty more than false precision.
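
To keep fake decimals out of a brief, widen the range to round numbers before it reaches the page. A minimal sketch; rounding outward to the nearest $10k is an assumption, so pick a step that matches the size of your numbers.

```python
import math


def format_range(low, high, step=10_000):
    """Widen a dollar range outward to multiples of `step` and format it in $k."""
    low_rounded = math.floor(low / step) * step
    high_rounded = math.ceil(high / step) * step
    return f"${low_rounded // 1000}k-${high_rounded // 1000}k"


print(format_range(52_300, 148_900))  # prints $50k-$150k
```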

Failure 3. Ignoring capability unlock

The agent let you respond to customers 24/7 in 10 languages. That's not in the hours-saved column. It's a new capability.

Fix: report capability unlocks as qualitative wins with quantifiable downstream effects (e.g., "launched in 3 new markets").

Section 07

Manager

Team process·ROI on a team

One owner per agent's ROI

The person who owns the agent also owns its ROI brief. Not finance. Not the data team. The owner. Because the owner knows the assumptions.

The quarterly ritual

Every quarter, every agent owner produces a brief using the prompt from Talker. Budget 30 minutes per agent. File each brief at roi/agents/[agent]/brief-Q[N]-[YYYY].md.

The team lead compiles them into a one-page summary for leadership. Nothing more ornate needed.

The retirement conversation

If an agent's brief is negative two quarters in a row, schedule a retirement conversation. Maybe the agent should be tuned. Maybe it should be killed. Don't let dead agents linger.
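
The trigger is simple enough to check automatically when briefs are filed. A minimal sketch, assuming each brief's net assessment is recorded as one of the strings "positive", "break-even", or "negative", and reading "two quarters in a row" as the two most recent quarters:

```python
def needs_retirement_conversation(assessments):
    """True if the two most recent quarterly assessments are both negative.

    assessments: net assessments in chronological order, oldest first.
    """
    return len(assessments) >= 2 and all(a == "negative" for a in assessments[-2:])


print(needs_retirement_conversation(["positive", "negative", "negative"]))
print(needs_retirement_conversation(["negative", "negative", "break-even"]))
```

The first history triggers the conversation; the second does not, because the most recent quarter recovered.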

Section 08

Chief

Governance·ROI governance

Risk 1. ROI as optics

Boards love AI ROI numbers. That creates pressure to produce them, even when the data is thin. Resist.

Governance: if an agent doesn't have clean outcome instrumentation, report "instrumentation in progress, preliminary estimates only." Don't let optics drive fiction.

Risk 2. Underinvestment in measurement

Measurement is boring. Building new agents is exciting. Teams keep shipping new agents without measuring the old ones. Two years later, you have 15 agents and 2 real ROI numbers.

Governance: budget measurement as a percentage of build. 10-20% of agent engineering time goes to measurement infrastructure. Not optional.

Risk 3. The capability-unlock blind spot

The biggest value agents create (doing things you literally couldn't do before) is the hardest to measure. If you only report cost savings, you systematically undervalue the most strategic agents.

Governance: every ROI brief has a "capability unlock" section, even if qualitative. Train leadership to read that section as equally important.

Section 09

Founder

Synthesis·The solo ROI loop

Solo founder: your ROI conversations are with yourself and your investors.

The one-page ROI doc

Once a quarter, 30 minutes, update one markdown file:

# Agent ROI, Q[N]
- [agent 1]: what it costs, what it produced, what it unlocked
- [agent 2]: ...
Honest assessment: which I'd keep, which I'd kill, which
I'd double down on.

The gut check

For each agent: if I had to pay its cost out of my own pocket this month, would I? If yes, it's earning its keep. If no, it isn't. The gut answer is usually right.

The one thing to remember

Measure outcomes, not activity.

Agents are easy to operate and hard to evaluate. The discipline that separates useful AI programs from expensive ones is: ship an outcome metric with every agent. Review it quarterly. Kill the losers. Double down on the winners. No shame in either move.