Recoupe — Autonomous Subrogation
Seven agents that read a closed claim, assign fault by jurisdiction, compute what is recoverable, and pursue it — every decision citation-grounded and auditable.
The problem
US property & casualty insurers leave an estimated $15–25B in subrogation recovery on the table every year — not because they cannot recover it, but because human adjusters can only work the biggest files. The long tail of small claims gets dropped at intake.
Subrogation is an ideal agentic testbed: the ground truth is codifiable (negligence law is published, carrier behaviour is observable, recoverable amounts are derivable), and the decisions repeat with the same shape every time.
Architecture
- 1 · Intake: Reads the claim file and extracts parties, losses, and fault facts (LLM extraction or deterministic heuristics).
- 2 · Liability: Assigns the fault percentage under the correct state's negligence regime (pure comparative, modified comparative, or contributory).
- 3 · Quantum: Computes the recoverable dollar amount given fault, damages, and policy limits.
- 4 · Strategy: Decides pursue or drop, with the threshold tunable per carrier.
- 5 · Demand: Drafts the demand letter with grounded statutory citations.
- 6 · Negotiation: Works counter-offers against carrier-specific settlement behaviour.
- 7 · Litigation: Escalates only when the expected value of suit beats settlement.
- 8 · Audit trail: Every decision is appended with model, confidence, and evidence, and streamed live to the UI over SSE.
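The Liability, Quantum, Strategy, and Litigation steps are the deterministic core, so they lend themselves to a short sketch. The version below is a simplification under stated assumptions, not the project's actual code: the names `Regime`, `recoverable_share`, `recoverable_amount`, `should_pursue`, and `should_litigate` are hypothetical, the modified-comparative bars are reduced to the common 50% and 51% variants, and the expected value of suit is taken as win probability times expected judgment minus litigation cost.

```python
from enum import Enum


class Regime(Enum):
    PURE_COMPARATIVE = "pure_comparative"  # recovery reduced by the insured's own fault
    MODIFIED_50 = "modified_50"            # barred when the insured is 50% or more at fault
    MODIFIED_51 = "modified_51"            # barred when the insured is more than 50% at fault
    CONTRIBUTORY = "contributory"          # barred when the insured bears any fault


def recoverable_share(regime: Regime, insured_fault: float) -> float:
    """Fraction of damages recoverable, given the insured's own fault in [0, 1]."""
    if not 0.0 <= insured_fault <= 1.0:
        raise ValueError("insured_fault must be between 0 and 1")
    if regime is Regime.CONTRIBUTORY:
        return 1.0 if insured_fault == 0.0 else 0.0
    if regime is Regime.MODIFIED_50 and insured_fault >= 0.50:
        return 0.0
    if regime is Regime.MODIFIED_51 and insured_fault > 0.50:
        return 0.0
    return 1.0 - insured_fault


def recoverable_amount(regime: Regime, insured_fault: float,
                       damages: float, policy_limit: float) -> float:
    """Quantum step: fault-adjusted damages, capped at the policy limit."""
    gross = damages * recoverable_share(regime, insured_fault)
    return round(min(gross, policy_limit), 2)


def should_pursue(amount: float, carrier_threshold: float) -> bool:
    """Strategy step: pursue only when the amount clears a per-carrier threshold."""
    return amount >= carrier_threshold


def should_litigate(p_win: float, expected_judgment: float,
                    litigation_cost: float, settlement_offer: float) -> bool:
    """Litigation step: escalate only if the expected value of suit beats the offer."""
    expected_value = p_win * expected_judgment - litigation_cost
    return expected_value > settlement_offer
```

Because every function here is pure and takes no model input, the same claim facts always produce the same numbers, which is the property the deterministic skeleton relies on.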
Key tradeoffs
Deterministic skeleton, LLM polish — the math is codified Python; the model extracts and narrates.
Why · Insurance is regulated. A system that produces different fault percentages from run to run is not deployable; without an API key, the pipeline falls back to its deterministic path and the output is bit-identical across runs.
A citation-integrity guardrail rejects unsourced authorities before they reach the audit trail.
Why · Lawyers do not hire researchers who cite cases that do not exist; AI generating legal arguments should meet the same bar.
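One way such a guardrail could work is a set-membership check: a citation in a draft passes only if it matches an authority that the retrieval step actually returned. The sketch below assumes that design; `Authority`, `normalize`, and `check_citations` are hypothetical names, and the citation strings in the usage example are placeholders rather than real statutes.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Authority:
    citation: str   # citation string as stored in the knowledge base
    source_id: str  # id of the retrieved document that backs it


def normalize(citation: str) -> str:
    """Collapse whitespace and case so trivial formatting differences still match."""
    return re.sub(r"\s+", " ", citation).strip().lower()


def check_citations(cited: List[str],
                    retrieved: List[Authority]) -> Tuple[List[str], List[str]]:
    """Split cited strings into (grounded, rejected) against the retrieved set."""
    known = {normalize(a.citation) for a in retrieved}
    grounded = [c for c in cited if normalize(c) in known]
    rejected = [c for c in cited if normalize(c) not in known]
    return grounded, rejected


# Usage with placeholder citations (not real authorities):
retrieved = [Authority("Example Stat. § 1.01", "doc-001")]
grounded, rejected = check_citations(
    ["Example  Stat. § 1.01", "Example v. Placeholder (unsourced)"], retrieved
)
# Only `grounded` would go forward; anything in `rejected` blocks the draft.
```

Exact matching after normalization is deliberately strict: a near-miss citation is rejected rather than guessed at, which matches the stated bar of never citing an authority that was not retrieved.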
The codified knowledge base (per-jurisdiction negligence rules + carrier graph) is the moat, not the agent chain.
Why · Anyone can wire seven LLM calls together; almost nobody builds the per-jurisdiction map underneath.
Audit trail as a first-class product feature.
Why · The trail — model, confidence, evidence, approver — is what makes a compliance officer say yes.
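As an illustration of what one trail entry might look like, the sketch below models a record carrying the four fields named above and frames it as a Server-Sent Events message. The field layout and the names `AuditEntry` and `to_sse` are assumptions; only the `event:` / `data:` framing with a blank-line terminator comes from the SSE format itself.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AuditEntry:
    agent: str                      # which step made the decision
    decision: str                   # human-readable decision summary
    model: str                      # model id, or "deterministic" when no LLM was used
    confidence: float               # confidence in [0, 1]
    evidence: Tuple[str, ...]       # ids of the documents the decision rests on
    approver: Optional[str] = None  # filled in once a human signs off


def to_sse(entry: AuditEntry, event: str = "audit") -> str:
    """Serialize one entry as a single SSE message."""
    # json.dumps emits no raw newlines by default, so the payload fits one data line.
    payload = json.dumps(asdict(entry), sort_keys=True)
    return f"event: {event}\ndata: {payload}\n\n"


entry = AuditEntry(
    agent="liability",
    decision="insured 20% at fault",
    model="deterministic",
    confidence=0.9,
    evidence=("doc-001",),
)
message = to_sse(entry)
```

Making the record frozen means an entry cannot be edited after it is written; a correction would be a new entry, which keeps the trail append-only.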
Eval results
Actual recovered ÷ truly recoverable, on synthetic claims with known-true values.
Mean absolute error in the fault percentage vs known truth.
Mean error on the recoverable dollar amount.
Share of cited authorities that were genuinely retrieved.
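The four metrics above can be computed directly once each synthetic claim carries both its known-true values and the pipeline's outputs. The following is a minimal sketch under assumptions: `ScoredClaim` and `evaluate` are hypothetical names, the dollar-amount error is taken as a mean absolute error, and the recovery rate is pooled across claims rather than averaged per claim.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ScoredClaim:
    true_fault: float         # known-true fault fraction
    pred_fault: float         # fault fraction assigned by the pipeline
    true_recoverable: float   # known-true recoverable dollars
    pred_recoverable: float   # recoverable dollars computed by the pipeline
    recovered: float          # dollars actually recovered in the simulation
    citations_total: int      # authorities cited in the output
    citations_retrieved: int  # of those, how many were genuinely retrieved


def evaluate(claims: List[ScoredClaim]) -> Dict[str, float]:
    if not claims:
        raise ValueError("need at least one scored claim")
    n = len(claims)
    total_recoverable = sum(c.true_recoverable for c in claims)
    total_citations = sum(c.citations_total for c in claims)
    return {
        # actual recovered / truly recoverable, pooled over all claims
        "recovery_rate": (sum(c.recovered for c in claims) / total_recoverable
                          if total_recoverable else 0.0),
        # mean absolute error in the fault fraction
        "fault_mae": sum(abs(c.pred_fault - c.true_fault) for c in claims) / n,
        # mean absolute error in recoverable dollars
        "recoverable_mae": sum(abs(c.pred_recoverable - c.true_recoverable)
                               for c in claims) / n,
        # share of cited authorities that were genuinely retrieved
        "citation_grounded_share": (sum(c.citations_retrieved for c in claims)
                                    / total_citations if total_citations else 1.0),
    }
```

The numbers fed into such a function would come from the synthetic claim set; none are shown here because the sketch is about the metric definitions, not the results.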
Production proof
The artifact that keeps the numbers honest — the eval harness / monitoring gates that run in CI, not a one-off notebook result.
Self-grading on ground truth
CI · passing
Synthetic claims carry known-true fault and recoverable values, so every agent decision is scored. Most agentic demos have no quantitative answer to "how right is it?"
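A CI gate over those scores can be as small as a table of bounds and a function that lists violations. The sketch below is one possible shape, not the project's harness: `check_gates` is a hypothetical name, and the metric names and threshold values in the usage example are illustrative placeholders rather than the project's real gates or results.

```python
from typing import Dict, List, Tuple


def check_gates(metrics: Dict[str, float],
                gates: Dict[str, Tuple[str, float]]) -> List[str]:
    """Return one message per violated gate; an empty list means the build passes.

    Each gate is (direction, bound): "min" requires value >= bound,
    "max" requires value <= bound.
    """
    failures = []
    for name, (direction, bound) in gates.items():
        value = metrics[name]
        ok = value >= bound if direction == "min" else value <= bound
        if not ok:
            failures.append(f"{name}={value} violates {direction} bound {bound}")
    return failures


# Illustrative placeholder values only:
example_gates = {
    "citation_grounded_share": ("min", 1.0),
    "fault_mae": ("max", 0.05),
}
example_metrics = {"citation_grounded_share": 1.0, "fault_mae": 0.08}
failures = check_gates(example_metrics, example_gates)
```

In a CI job, a non-empty `failures` list would fail the build, so a regression in any scored metric surfaces on the commit that caused it.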
Let's talk
I'm focused on finance AI — credit risk, RegTech, AML, and agentic investment research. Open to roles, mentorship, and collaborators in fintech, quant, and bank AI.