PRAETOR — Agentic M&A Due-Diligence Engine
Pick any company on a 3D globe and watch 9 specialist agents run parallel due diligence on real public data, cross-reference each other, and render a cited go/no-go verdict — with a human on the final gate.
The problem
M&A due diligence is the textbook case for multi-agent AI: thousands of documents, a hard exclusivity clock, and value that lives in the connections across workstreams — a lawsuit Legal finds is a provision Finance must book and a disclosure the seller must make. Humans lose those threads across siloed teams.
The three things that kill "AI for diligence": no provenance (so nothing is trustable), no cross-workstream linking (so it is just N chatbots), and full automation of a decision that legally must have a human on the hook. PRAETOR is built around fixing exactly those three.
Architecture
1. Engagement builder
Resolves a company from the global GLEIF registry or a US ticker (SEC EDGAR) and geolocates its HQ (GeoNames) for the 3D globe, or ingests an uploaded report (PDF/DOCX/TXT).
2. Knowledge layer
SQLite medallion lakehouse (bronze/silver/gold), a networkx deal knowledge graph (entities + OWNS/DIRECTOR_OF edges, UBO unwind), and a BM25 + hashing-embedding hybrid index, all in-process and keyless.
3. 9 specialist agents (parallel)
INTEGRITAS (sanctions/PEP/UBO), FISCUS (QoE from XBRL), LEX (contracts + litigation), AGORA, CENSUS, PEOPLE, CYBER, ESG, and PULSE (live web pulse via Groq compound web search).
4. Synthesis
De-duplicates and CROSS-REFERENCES findings across workstreams into one ranked red-flag register, then derives the valuation-impact layer (price adjustments, escrow, indemnities, conditions precedent).
5. TRIBUNAL
Deterministic decision rule → GO / NO-GO / CONDITIONAL with ranked rationale, deal-breakers and protections; the LLM only writes the prose.
6. Human-in-the-loop + audit
High-severity findings raise sign-off gates; every step is written to a replayable audit log. Live agent activity streams to the UI over SSE.
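To make the TRIBUNAL step concrete, here is a minimal sketch of what a deterministic GO / NO-GO / CONDITIONAL rule could look like. It is not PRAETOR's actual rule: the severity scale, the `mitigable` flag, the thresholds, and the name `tribunal_verdict` are all assumptions made for illustration.

```python
# Hypothetical severity scale; the real register may use different levels.
SEVERITY_RANK = {"CRITICAL": 3, "HIGH": 2, "MEDIUM": 1, "LOW": 0}


def tribunal_verdict(findings):
    """Map a red-flag register to a verdict with no model in the loop.

    Each finding is a dict with 'claim', 'severity', 'confidence' and an
    optional 'mitigable' flag (an assumed field, not from the source).
    """
    # Rank by severity first, then by confidence, highest first.
    ranked = sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], f["confidence"]),
        reverse=True,
    )
    # Assumed rule: a critical finding that cannot be mitigated kills the deal.
    deal_breakers = [
        f for f in ranked
        if f["severity"] == "CRITICAL" and not f.get("mitigable", False)
    ]
    # Assumed rule: any other high or critical finding needs a protection
    # (escrow, indemnity, condition precedent) before closing.
    protections = [
        f for f in ranked
        if SEVERITY_RANK[f["severity"]] >= SEVERITY_RANK["HIGH"]
        and f not in deal_breakers
    ]
    if deal_breakers:
        verdict = "NO-GO"
    elif protections:
        verdict = "CONDITIONAL"
    else:
        verdict = "GO"
    return {
        "verdict": verdict,
        "rationale": [f["claim"] for f in ranked],
        "deal_breakers": [f["claim"] for f in deal_breakers],
        "protections": [f["claim"] for f in protections],
    }
```

Because the function is pure, the same register always yields the same verdict, so a language model can be handed the returned dict to narrate without being able to change the outcome.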
Key tradeoffs
Agents EXTRACT findings deterministically from real sources; the LLM only narrates and ranks.
Why · In diligence a confident-but-wrong claim is a liability. Facts must come from the source, not the model.
Provenance invariant enforced centrally — no finding without a source_ref.
Why · "Please cite your sources" is a prompt, not a control. The Finding constructor itself guarantees every claim is traceable to a filing, registry record, or web article.
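A constructor-level invariant of this kind can be sketched with a frozen dataclass. The field names other than `source_ref`, the severity levels, and the `ProvenanceError` class are assumptions for illustration; only the idea that construction fails without a source reference comes from the text above.

```python
from dataclasses import dataclass

# Assumed severity levels; the source only mentions critical and high.
SEVERITIES = ("LOW", "MEDIUM", "HIGH", "CRITICAL")


class ProvenanceError(ValueError):
    """Raised when a finding is built without a usable source reference."""


@dataclass(frozen=True)
class Finding:
    agent: str
    claim: str
    severity: str
    confidence: float
    source_ref: str  # e.g. a filing, registry record, or article identifier

    def __post_init__(self):
        # The invariant lives here, so no code path can emit an uncited finding.
        if not isinstance(self.source_ref, str) or not self.source_ref.strip():
            raise ProvenanceError(f"{self.agent}: finding has no source_ref")
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
```

With this shape, `Finding(agent="LEX", claim="...", severity="HIGH", confidence=0.8, source_ref="")` raises before the object exists, which is the difference between a control and a prompt instruction.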
Real data only — SEC EDGAR, OFAC SDN, GLEIF, GDELT, live web — no synthetic data room.
Why · A capability you can verify on a real public company beats a demo on invented data.
Keyless & offline-capable; one Groq key unlocks live narration + web search (compound model).
Why · The whole pipeline runs with zero keys (deterministic mock + GDELT baseline); the key only enriches it — no single point of failure.
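One way to get this "key only enriches" behaviour is to select the narrator at startup and fall back to a deterministic template on any failure. The sketch below assumes the key is read from a `GROQ_API_KEY` environment variable and that the live narrator is an injected callable; both are illustrative guesses, and no real Groq API call is shown.

```python
import os


def deterministic_narration(finding):
    """Keyless baseline: a fixed template, so output is reproducible offline."""
    return f"[{finding['severity']}] {finding['claim']} (source: {finding['source_ref']})"


def pick_narrator(env=None, live_narrator=None):
    """Return a narrate(finding) function.

    `live_narrator(finding, key)` is a hypothetical callable wrapping the
    hosted model; it is only used when a key is present.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()  # assumed variable name
    if not key or live_narrator is None:
        return deterministic_narration

    def narrate(finding):
        try:
            return live_narrator(finding, key)
        except Exception:
            # Network or quota failure degrades to the baseline instead of
            # stopping the pipeline.
            return deterministic_narration(finding)

    return narrate
```

The facts in the finding are identical on both paths; only the prose around them changes, which is why losing the key does not change what the pipeline reports.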
Eval results
Every emitted finding carries a usable source_ref — enforced in the Finding constructor, not by prompt. Verified across full runs.
All 9 specialist agents are dispatched concurrently per engagement; each writes severity- and confidence-scored, cited findings into the shared register.
On the First Solar case, an adversarial 2-verifier audit led to two corrections: an offensive IP suit was reclassified as an asset (not a liability), and boilerplate risk-factor language stopped tripping HIGH findings.
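Concurrent dispatch into a shared register could be sketched with `asyncio.gather`. Whether PRAETOR uses asyncio, threads, or something else is not stated, so treat the concurrency model, the function names, and the stub findings as assumptions; only the agent names come from the architecture list.

```python
import asyncio

AGENTS = ["INTEGRITAS", "FISCUS", "LEX", "AGORA", "CENSUS",
          "PEOPLE", "CYBER", "ESG", "PULSE"]


async def run_agent(name, company, register):
    """Stand-in for one specialist agent; real agents would do I/O here."""
    await asyncio.sleep(0)  # yield control, as a network call would
    # Placeholder finding; the source_ref is a stub, not a real citation.
    register.append({
        "agent": name,
        "claim": f"{name} reviewed {company}",
        "severity": "LOW",
        "confidence": 0.5,
        "source_ref": f"stub:{name.lower()}",
    })


async def run_engagement(company, agent_names=AGENTS):
    register = []  # shared register; list.append is safe on one event loop
    results = await asyncio.gather(
        *(run_agent(name, company, register) for name in agent_names),
        return_exceptions=True,  # one failing agent must not cancel the rest
    )
    failures = {
        name: result
        for name, result in zip(agent_names, results)
        if isinstance(result, Exception)
    }
    return register, failures
```

Calling `asyncio.run(run_engagement("ExampleCo"))` returns the register plus a map of any agents that raised, so a partial run stays visible instead of being silently dropped.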
A multi-agent adversarial audit (each finding double-verified) surfaced 25 issues — path traversal, regex precision, a missed 10-K Item-15 note section, geocoder collisions — all fixed and re-verified.
Production proof
The artifact that keeps the numbers honest — the eval harness / monitoring gates that run in CI, not a one-off notebook result.
Provenance + human gates by construction
CI · passing
PRAETOR drafts; humans decide. Critical/high findings force escalation gates, nothing goes final without sign-off, and every line of the verdict traces back to a source.
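A gate of this kind can be expressed as a finalization step that refuses to run until every required sign-off is present, logging each attempt. This is a sketch under assumptions: the `id` field, the `"FINAL"` gate name, the JSON-lines log format, and the `SignOffRequired` exception are all hypothetical.

```python
import json

GATED_SEVERITIES = {"CRITICAL", "HIGH"}


class SignOffRequired(RuntimeError):
    """Raised when a verdict is finalized with sign-off gates still open."""


def required_gates(findings):
    # One gate per critical/high finding, plus an assumed final human gate.
    gates = [f["id"] for f in findings if f["severity"] in GATED_SEVERITIES]
    return gates + ["FINAL"]


def finalize(verdict, findings, signoffs, audit_log):
    """signoffs maps gate id -> reviewer; audit_log is an append-only list."""
    pending = [g for g in required_gates(findings) if g not in signoffs]
    # Every attempt is recorded, including the ones that are refused.
    audit_log.append(json.dumps(
        {"event": "finalize_attempt", "verdict": verdict, "pending": pending},
        sort_keys=True,
    ))
    if pending:
        raise SignOffRequired(f"open gates: {', '.join(pending)}")
    audit_log.append(json.dumps(
        {"event": "finalized", "verdict": verdict,
         "approved_by": sorted(set(signoffs.values()))},
        sort_keys=True,
    ))
    return verdict
```

Since the log entries are plain serialized events in order, replaying a run amounts to reading them back and re-applying each step.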
Let's talk
I'm focused on finance AI — credit risk, RegTech, AML, and agentic investment research. Open to roles, mentorship, and collaborators in fintech, quant, and bank AI.