How we build · The Mathematical Autopsy

AI is built backwards.
So we inverted it.

Those six failures aren't six problems — they're one. Software is written first and tested after, so correctness is something you observe and hope for, never something you guarantee. With a probabilistic model underneath, that is fatal. So we flipped the order: prove the math first, then compile it into the code.


How software is built today

write code → test → hope

Inverted ↻

The Mathematical Autopsy

intent · what must be true · sha256:e3b0…
→ calculus · stated as math · sha256:4d8b…
→ proof · checked in Lean 4 · sha256:7f2a…
→ compiled · into the code · sha256:c41a…

each part hashed, then chained & signed · ed25519, your key → Receipt of Truth

guaranteed · the system cannot violate it

Correctness stops being something you test for afterward. It becomes a property of how the software is built.

The root cause

Every other approach trusts code it only tested.

For fifty years software has been built the same way: write the code, then write tests, then ship and watch. Tests can only check the cases you thought of — correctness is something you sample after the fact, never something the system guarantees. For ordinary software that's a calculated risk. For a probabilistic model that can act on the world, it's the whole problem: you are hoping, at scale, forever. The six failures all grow from this one root.

The inversion

Most software hopes it's right. Ours proves it first.

You cannot make a guess trustworthy by stacking another guess on top to check it. The only way out is to change how the software gets built — so the guarantees are part of the construction, not a layer bolted on at the end.

How software is built today

write code
→ test
→ hope

Correctness is something you observe after the fact, with tests that go stale and a confidence score that hides the 1%.

becomes

The Mathematical Autopsy

prove the math
→ compile the rules in
→ enforce every step

Correctness is a structural fact of the build. The proven rule is enforced before any action fires — by construction, not by promise.

The method

Four moves, math first. Code last.

The AI still does the flexible part — proposing the math and writing the code. What changes is the order, and what gets trusted: the math, checked by a public proof system, before a line of code runs in production.

Define the intent as math

Start from what must always be true — the invariants the system can never violate — and write them as formal mathematical statements, not prose policy. A refund policy, a suitability rule, a dosing limit becomes a theorem to be proven, not a paragraph to be interpreted.

# intent, stated as an invariant
∀ order. dose(order) ≤ max_safe(patient(order))

Draft the proof, check it in Lean 4

An AI proof-drafting assistant proposes the math and a candidate proof. The Lean 4 kernel, a public, independently auditable proof system, checks it. Nothing is taken on the model's word: a proof either passes the kernel or it does not exist. The AI does the flexible part; the math is what we trust.

checking dose_within_ceiling … ✓ accepted by kernel
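
As a rough illustration of what such a kernel-checked statement could look like, here is a minimal Lean 4 sketch. Only the theorem name `dose_within_ceiling` comes from the line above; the `DoseOrder` structure, the `gate` function, and the choice to model doses as natural numbers (say, tenths of mg/kg) are assumptions made for this example, not the actual formalization.

```lean
-- Hypothetical model: doses as natural numbers (e.g. tenths of mg/kg).
structure DoseOrder where
  dose : Nat      -- dose proposed for the patient
  ceiling : Nat   -- maximum safe dose for that patient

-- The gate returns the dose only when it respects the ceiling.
def gate (o : DoseOrder) : Option Nat :=
  if o.dose ≤ o.ceiling then some o.dose else none

-- Any dose the gate lets through is within the ceiling.
theorem dose_within_ceiling (o : DoseOrder) (d : Nat)
    (h : gate o = some d) : d ≤ o.ceiling := by
  unfold gate at h
  split at h
  · cases h
    assumption
  · cases h
```

The point of the sketch is the shape of the claim: the theorem quantifies over every order, so it covers all inputs rather than a sample of test cases.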

Compile the proven rules into the app

The proven rules become constraints compiled directly into the running code — not a guardrail bolted on top, but part of how the software is built. The model is never exposed to the caller except through the constraints it has been proven to satisfy.

Gate the action, sign the decision

At runtime the gate evaluates the action before it fires. If a rule would break, the action never happens. Either way the decision writes a signed Decision Receipt — inputs, invariants checked, verdict, signature — replayable by your auditor on a clean machine, without us in the room.

# before the action, every time
gate(order) → ADMITTED · receipt 0x4e11 · signed
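
One way to picture the gate-then-sign step is the Python sketch below. It is an illustration under stated assumptions, not the product's implementation: the field names, the `place_order` helper, and the 5 mg/kg ceiling are hypothetical, and HMAC-SHA256 from the standard library stands in for the ed25519 signature described in the text.

```python
import hashlib
import hmac
import json

# Hypothetical values for illustration only.
CEILING_MG_PER_KG = 5.0                # assumed ceiling for this example
DEMO_KEY = b"customer-held-demo-key"   # stand-in for a customer-held signing key


def gate(order: dict) -> dict:
    """Evaluate the invariant before the action and return a signed receipt."""
    admitted = order["dose_mg_per_kg"] <= CEILING_MG_PER_KG
    body = {
        "inputs": order,
        "invariant": "dose_mg_per_kg <= ceiling_mg_per_kg",
        "ceiling_mg_per_kg": CEILING_MG_PER_KG,
        "verdict": "ADMITTED" if admitted else "REFUSED",
    }
    # Canonical JSON, so the same decision always serializes to the same bytes.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    # HMAC-SHA256 is an assumption here; the text describes ed25519 signatures.
    signature = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def place_order(order: dict) -> dict:
    """Run the action only if the gate admits it; a receipt is written either way."""
    receipt = gate(order)
    if receipt["verdict"] == "ADMITTED":
        pass  # the real side effect (sending the order) would happen here
    return receipt
```

Calling `place_order({"patient_kg": 71, "dose_mg_per_kg": 8.0})` returns a receipt with a `REFUSED` verdict and never reaches the side effect; the same call with `4.5` is admitted.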

Watch it run

From a locked intent to a sealed proof.

This is MAE, the engine that runs the method. A plain-language rule goes in. It gets proven in math, stress-tested, compiled into governed code, and sealed into a signed receipt your auditor can re-verify offline.

MAE / mathematical autopsy

Runtime ready · sealed

Forge

Locked intent

Reading your intent · Pinning down exactly what must always hold

Proposing the calculus · Turning the rule into math we can prove

Proving it · Checking it holds for every input, not the easy ones

Stress-testing the edges · Throwing the nasty cases at it

Writing the governed code · Code that cannot drift from the proof

Sealing the receipt · Signed, reproducible, verifiable offline

Packaging for export · Proof, code, and receipt in one signed bundle

The math

Proof · Lean 4 kernel

Edge cases

Empty input · Boundary value · Hostile input · Concurrent retry

Governed code

Packaged for export

Guarantee bundle

proof + governed code + signed receipt

.maw

Proven guarantee

Proven

The Receipt of Truth

Everything that proves the rule, sealed into one artifact.

When the build finishes, every part that establishes that the rule is sound is hashed, chained together, and signed into the runtime. Change one byte of any of them and the signature breaks, so the binary carries cryptographic proof of its own origin.

Intent

The locked rule

Lean 4 proof

Kernel-checked

Invariants

What the runtime enforces

Notebook

Reproducible verification

Extracted code

The proven code itself

Kernel receipt

The kernel's accept record

Scorecard

Determinism · Totality · Soundness · Coverage

→ each part hashed with SHA-256 ⋯ hash-chained ⋯ signed with ed25519 →

Receipt of Truth · Sealed

Build-time · sealed into the runtime

chain: 7 parts · each sha256

digest: sha256:9f2c…a1

signature: ed25519 · your key

named by: every Decision Receipt

Sealed: append-only · verifiable offline

At runtime, every Decision Receipt points back to this seal — so a single decision can be traced all the way to the math that proved its rule.
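
The hash, chain, and sign sequence described above can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the part contents are placeholders, the chaining rule (each link hashes the previous link plus the part's name and hash) is one common construction rather than the product's documented format, and HMAC-SHA256 stands in for an ed25519 signature so the example needs only the standard library.

```python
import hashlib
import hmac

DEMO_KEY = b"customer-held-demo-key"  # stand-in for a customer-held signing key

# The seven part names come from the list above; the contents are placeholders.
PARTS = [
    ("intent", b"dose <= ceiling"),
    ("lean4_proof", b"placeholder proof text"),
    ("invariants", b"placeholder invariants"),
    ("notebook", b"placeholder notebook"),
    ("extracted_code", b"placeholder code"),
    ("kernel_receipt", b"placeholder accept record"),
    ("scorecard", b"placeholder scorecard"),
]


def seal(parts):
    """Hash each part, chain the hashes in order, and sign the final digest."""
    link = b"\x00" * 32  # fixed starting value for the chain
    part_hashes = {}
    for name, content in parts:
        part_hash = hashlib.sha256(content).digest()
        part_hashes[name] = part_hash.hex()
        # Each link commits to the previous link plus this part's name and hash.
        link = hashlib.sha256(link + name.encode() + part_hash).digest()
    # Assumption: HMAC in place of the ed25519 signature named in the text.
    signature = hmac.new(DEMO_KEY, link, hashlib.sha256).hexdigest()
    return {"parts": part_hashes, "digest": link.hex(), "signature": signature}


def verify(parts, sealed):
    """Recompute the chain from the parts and compare it with the seal."""
    fresh = seal(parts)
    return (hmac.compare_digest(fresh["digest"], sealed["digest"])
            and hmac.compare_digest(fresh["signature"], sealed["signature"]))
```

Because every link depends on all earlier links, altering a single byte in any part changes the final digest, and `verify` returns `False`. With a real asymmetric scheme such as ed25519, the verifier would need only the public key, which HMAC cannot model.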

What comes out the other side

Deterministic software, provable by construction.

Same input, same output, every time. And because the guarantees are built in rather than observed after the fact, the software carries seven properties by construction — each one a structural fact, not a promise.

The ordering — math first, code last — is the whole bet. It is what turns "we think it's safe" into "here is the proof, re-run it yourself."

Reproducible

The same inputs produce the same result, anywhere, any time.

Traceable

Every decision names the rule, the inputs, and the path it took.

Explainable

The reason is the proven invariant, not a post-hoc rationalization.

Auditable

A signed receipt per decision, readable by a third party.

Replayable

Re-run the exact decision on a clean machine and get the same answer.

Falsifiable

If a guarantee is wrong, the proof fails — visibly, not silently.

Verifiable

Signed with a key we don't hold — checkable off-platform, without us.

⎯

Deterministic

The root of all seven: no drift, no dice, the same governed output every call.

The proof object

One receipt. Five people satisfied.

Every governed decision ships a Decision Receipt — the single artifact that answers everyone who can pull the thread, without a meeting and without us in the loop.

Auditor: Show me what it checked, and prove it ran.
Regulator: Prove the control applied before the action.
Board: Can we defend this if it goes wrong?
Customer: Was I treated the way the policy says?
Court: Produce the record, verifiable without you.

Decision Receipt #RX-3318

01 · Captured the order. · patient 71kg · renal-adjusted · creatinine 1.8 · complete

02 · Refused the AI's first dose. · 8 mg/kg > renal-adjusted ceiling 5 mg/kg (Lean 4), blocked before the order · refused

03 · Admitted 4.5 mg/kg. · inside the proven ceiling, so it fired · admitted

# same order in → same decision out
run 2026-05-02 → refuse 8mg/kg · admit 4.5mg/kg
run 2026-06-18 → refuse 8mg/kg · admit 4.5mg/kg
✓ byte-identical
# deterministic: no drift, no exceptions

# the hospital's auditor, their machine, without us
$ smarthaus verify RX-3318
signature ✓ valid (customer-held key)
invariant ✓ re-proven (Lean 4)
VERIFIED

Admitted · decided in 31ms · signed
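
To make the replay idea concrete, the sketch below shows how an auditor-side check might work in Python: verify the signature on a stored receipt, re-run the decision from the recorded inputs, and require byte-identical output. All names and the receipt layout are hypothetical, and HMAC-SHA256 is an assumption standing in for the customer-held ed25519 key; this is not the behavior of the `smarthaus verify` command itself.

```python
import hashlib
import hmac
import json

DEMO_KEY = b"customer-held-demo-key"  # stand-in for the customer-held key


def decide(inputs: dict) -> bytes:
    """Pure function of its inputs: no clock, no randomness, no network."""
    within = inputs["dose_mg_per_kg"] <= inputs["ceiling_mg_per_kg"]
    record = {"inputs": inputs, "verdict": "admit" if within else "refuse"}
    # Canonical JSON keeps the serialized decision stable across runs.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def sign(payload: bytes) -> str:
    """Assumption: HMAC-SHA256 in place of an ed25519 signature."""
    return hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()


def replay(stored_payload: bytes, stored_signature: str) -> bool:
    """Check the signature, then re-run the decision and compare bytes."""
    if not hmac.compare_digest(sign(stored_payload), stored_signature):
        return False
    inputs = json.loads(stored_payload)["inputs"]
    return decide(inputs) == stored_payload
```

The comparison is on raw bytes rather than on parsed fields, which is what lets a replay claim "byte-identical" instead of merely "equivalent".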

Write the code, then hope it's right. Or prove it first, and never have to hope.

Why now

Formal proof just stopped being a lab exercise. The timing is the opening.

Four shifts landed at once. Each one was a wall until recently; together they make math-first AI buildable for the first time.

Proof got cheap

Machine-checkable proofs in minutes, not months.

An AI proof-drafting assistant can now produce the math and the proof, with a public kernel checking it. Formal verification used to be a research-lab cost; it is finally in reach of people building real software.

Agents act

AI stopped suggesting and started doing.

Agents now take real actions — they move money, send the disclosure, change the record. The moment AI acts, watching the output after the fact stops being enough.

A standard place to enforce

The industry agreed where the gate goes.

Tool-calling standards put a consistent seam between the model and the action it wants to take — exactly where a pre-action gate belongs.

The deadlines are real

Regulators arrived with dates, not opinions.

EU AI Act obligations on high-risk systems phase in through 2026; Colorado's AI Act lands Jan 2027. Buyers can no longer ship on a confidence score.

The substance underneath

120+ theorems machine-proven · 500+ invariant rules · 1,000+ verification tests · Three patents filed

See it run on your hardest use case.

“You cannot govern a guess with another guess.”


The clock is running — Colorado SB 26-189 · Jan 2027 · EU AI Act high-risk · 2027
