Trust stack

What we can guarantee, and where the guarantee stops.

C Proof rules out categories of bugs. It does not certify that a model is the right model. The trust stack has three levels. Each one names what it removes from the table and the failure it cannot touch. Levels 1 and 2 ship today. Level 3 is future work, and we say so here so a buyer does not plan around something that is not there yet.

Trust stack diagram. Level 1 language guarantees and Level 2 spec verification are drawn as solid filled boxes marked shipped. Level 3 mechanized proof is drawn as a dashed outline marked future.
The three levels at a glance. Levels 1 and 2 are solid and shipped; Level 3 is a dashed outline because it is future work.

Level 1: the language makes whole bug classes impossible

Shipped

Chelis is a statically typed functional language. The type system tracks named tensor dimensions, numeric precision, differentiability, and ownership, and it checks them at compile time. A matrix multiply with mismatched dimensions does not compile. A function that drops a linear resource does not compile. A precision error between an f32 and an f64 path is caught before the binary exists.

Those are not run-time assertions that fire under load. They are categories of bug that the language refuses to express. The same checks apply to code a model generated as to code a person wrote.

The limit at Level 1. A program can be fully type-correct and still compute the wrong thing. The types say the shapes line up and the precision is consistent. They do not say the formula is the one your team meant to compute. A clean compile is necessary, not sufficient.

Level 2: the implementation matches the specification

Shipped

Level 2 closes the gap between what the code is required to do and what it actually does. Requirements written in structured English become executable properties, and the compiler proves the implementation satisfies them. Where the math admits a formal proof, it carries one. Where it does not, the property is exercised by seeded randomized testing instead. Either way, the check is part of the build, not a separate review someone has to remember to run.

This is demonstrated across regulated industries: pharmaceutical dosing, aerospace flight control, automotive safety systems, manufacturing process control, and power grid protection. Every property carries a certificate that identifies which proof tier was used and what evidence supports it, so an auditor can confirm both that a requirement was checked and how it was checked.

The limit at Level 2. Verification proves the implementation satisfies the requirements. It does not verify the requirements describe what your business actually needs. A correct implementation of a wrong specification is still wrong code.

Level 3: mechanized proof under the language

Future

Level 3 grounds the guarantees in a mechanized proof of the language semantics rather than in the compiler implementation alone. The proof establishes that a well-typed program steps without getting stuck and keeps its type as it runs, that tensor dimensions are safe, that effects and linearity behave, and that automatic differentiation is correct, all checked by a proof assistant. This is future work. We name it so the roadmap is visible, not so it can be sold as present.

The formal foundation page covers the five theorems and why a mechanized proof matters when a binary moves money.

The limit at Level 3. A proof about the language is a proof about the language. It bounds what the runtime can do to a program. It cannot reach up and judge the program's purpose. The deepest level of the stack still leaves the same question open: is this the right model for this business, priced under the right assumptions. That question belongs to your risk function. The stack is built to make every other question stop being a question, so that one gets the attention.