Add a mathematical constraint system by kripken · Pull Request #8816 · WebAssembly/binaryen

kripken · 2026-06-08T23:12:29Z

This allows defining constraints like { x >= 0 && x <= 100 } and then checking whether
they imply that something else is true or false, like { x >= 0 && x <= 100 } => { x < 9999 }
(an example of a valid inference).

This is the minimal first part of such a system, focusing on ==, !=, and very simple
solving. Putting up for design feedback before I work in depth on the rest.
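
To make the ==/!= starting point concrete, here is a minimal sketch of a three-valued check over a conjunction of constraints on one variable. Every name in it (Op, Constraint, checkOne, eq, ne) is hypothetical and chosen for illustration; it is not the PR's actual API, and it assumes the constraint set is consistent.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not the PR's real interface.
enum class Op { Eq, Ne };
enum class Result { True, False, Unknown };

// A constraint of the form `x op value` on a single implicit variable x.
struct Constraint {
  Op op;
  int64_t value;
};

inline Constraint eq(int64_t v) { return Constraint{Op::Eq, v}; }
inline Constraint ne(int64_t v) { return Constraint{Op::Ne, v}; }

// Does one known constraint decide `condition`?
inline Result checkOne(const Constraint& known, const Constraint& condition) {
  if (known.op == Op::Eq) {
    // x is fully determined, so the condition can be evaluated directly.
    bool equal = known.value == condition.value;
    bool holds = condition.op == Op::Eq ? equal : !equal;
    return holds ? Result::True : Result::False;
  }
  // known is `x != c`: it only decides conditions about the same constant c.
  if (known.value == condition.value) {
    return condition.op == Op::Ne ? Result::True : Result::False;
  }
  return Result::Unknown;
}

// Brute-force check of { known } => { condition }. The set is a conjunction,
// so the first constraint that decides the condition settles the answer
// (assuming the set is not self-contradictory).
inline Result check(const std::vector<Constraint>& known,
                    const Constraint& condition) {
  for (const auto& k : known) {
    Result r = checkOne(k, condition);
    if (r != Result::Unknown) {
      return r;
    }
  }
  return Result::Unknown;
}
```

For instance, under these assumptions a set containing only `x != 0` refutes `x == 0` but says nothing about `x == 3`.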

Next steps are to add >=, < etc., and to add a pass that uses this in a control-flow
aware way, that is, the goal is to optimize things like

if (x > 10) {
   assert(x > 0); // this can be removed
}

This is important to remove userspace bounds checks for Kotlin (and likely Java).
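
The inference in the snippet above (x > 10 proves x > 0) can be sketched with plain closed intervals. This is only an illustration under assumptions: Range, between, greaterThan, lessThan and implies are hypothetical names, ranges are assumed non-empty, and the helpers assume the constant is not at the extreme of int64_t.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical closed interval [lo, hi] of values x may take (assumed non-empty).
struct Range {
  int64_t lo;
  int64_t hi;
};

enum class Result { True, False, Unknown };

inline Range between(int64_t lo, int64_t hi) { return Range{lo, hi}; }

// x > c, assuming c is below the maximum int64_t.
inline Range greaterThan(int64_t c) {
  return Range{c + 1, std::numeric_limits<int64_t>::max()};
}

// x < c, assuming c is above the minimum int64_t.
inline Range lessThan(int64_t c) {
  return Range{std::numeric_limits<int64_t>::min(), c - 1};
}

// { known } => { cond }: True if every value allowed by `known` satisfies
// `cond`, False if none does, Unknown otherwise.
inline Result implies(const Range& known, const Range& cond) {
  if (cond.lo <= known.lo && known.hi <= cond.hi) {
    return Result::True; // known is contained in cond
  }
  if (known.hi < cond.lo || cond.hi < known.lo) {
    return Result::False; // the two ranges are disjoint
  }
  return Result::Unknown; // partial overlap
}
```

With this sketch, `implies(greaterThan(10), greaterThan(0))` is the check a pass would make before removing the assert.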

inplace_vector part here is from #8814 (will rebase once it lands).

tlively

I highly recommend explicitly framing the constraint space as a lattice:

Both and_ and fuzzyOr are effectively merging constraints. You want both (but especially fuzzyOr) to have all the properties of a lattice join operator: monotonicity, associativity, commutativity, idempotency, etc. You also want fuzzyOr to be as precise as possible; it has to lose some precision sometimes, but you only want it to lose as much precision as necessary given the representation of constraints. So you want it to be a least upper bound, i.e. a join.
Making the constraint space a lattice will give you all the nice properties you want for using it in a program analysis: order-independence, guaranteed convergence, etc. It also reduces all the novelty and complexity to just generating the constraints in the first place; getting to the fixed point after that is just the classic worklist + graph traversal pattern.
Making the constraint space a lattice will let you test it in the lattice fuzzer, which can do a better job than just unit tests alone of making sure it has all the properties we want, including that we do not unnecessarily lose precision in the merge operation.
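
As a small illustration of the join properties listed above, here is a sketch of an interval lattice whose join is the convex hull. Interval, join and lessEq are hypothetical names, not Binaryen's lattice API; the point is only that the join is commutative, associative and idempotent, and loses no more precision than the representation forces.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical interval [lo, hi]; any interval with lo > hi is empty (bottom).
struct Interval {
  int64_t lo;
  int64_t hi;
  Interval(int64_t lo, int64_t hi) : lo(lo), hi(hi) {}
  bool isEmpty() const { return lo > hi; }
  bool operator==(const Interval& other) const {
    if (isEmpty() || other.isEmpty()) {
      return isEmpty() && other.isEmpty();
    }
    return lo == other.lo && hi == other.hi;
  }
};

// Join: the smallest interval containing both inputs. Values in a gap between
// the inputs are included, which is the precision the representation gives up.
inline Interval join(const Interval& a, const Interval& b) {
  if (a.isEmpty()) {
    return b;
  }
  if (b.isEmpty()) {
    return a;
  }
  return Interval(std::min(a.lo, b.lo), std::max(a.hi, b.hi));
}

// The partial order induced by the join: a <= b iff join(a, b) == b.
inline bool lessEq(const Interval& a, const Interval& b) {
  return join(a, b) == b;
}
```

A property-based test or fuzzer can check exactly these laws on random inputs rather than on a handful of hand-written cases.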

tlively · 2026-06-08T23:28:49Z

+
+// We limit constraints to a low number to ensure good performance even with
+// simple brute-force solving.
+// TODO: use a generic constraint solver..?


I did have that POC for pulling in Z3. In the limit I guess that's what we'd want. 5c2bbb7

tlively · 2026-06-08T23:32:15Z

+  //   { this } => { condition }
+  //
+  // https://en.wikipedia.org/wiki/Material_conditional#Truth_table
+  Result check(const Constraint& condition) const;


Perhaps proves or implies?

Hmm, yeah. Another option is eval as @MaxGraey suggests?

MaxGraey · 2026-06-09T07:41:13Z

That's awesome!

Have you considered more academic and conventional naming for lattice-like stuff?

Value -> Term
Result -> KnownTruth
ConstraintSet -> Conjunction

check(conj) -> eval(conj)
and_(conj) -> meet(conj) / meetWith(conj)
fuzzyOr(conj) -> join(conj) / joinWith(conj)

or something like this?

kripken · 2026-06-09T16:36:29Z

@tlively Definitely making this a lattice would have benefits, but it would add overhead and complexity, I worry. Specifically, having a limited capacity (number of constraints in a set), as in the current design, is really nice for efficiency, but makes it not a lattice. Here is a concrete example. For a lattice we need this absorption law: (a ^ b) v b == b. Take

a = { x >= 10 && x <= 20 }  ;; span of numbers: 10, 11, .., to 20
b = { x & 1 }               ;; all odd numbers

a ^ b should be the set of odd numbers in that range, i.e., 11, 13, .., 19. However, that can't be written if the capacity is 2. So a ^ b loses something. That doesn't mean it isn't useful! We can define a ^ b to contain any 2 of the 3 constraints being combined (this can prove fewer things, but more than nothing). E.g. a ^ b = a (just ignore b). But then

(a ^ b) v b == a v b != b

which breaks the absorption rule.

(This is sort of parallel to the issue with multiple constants in possible-constants - we only support one constant, not an arbitrary number. An arbitrary number is necessary for all the nice mathematical properties we want, but the overhead isn't worth it in GUFA.)
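
The absorption counterexample above can be checked mechanically by modelling each constraint as the set of values in 0..31 that satisfy it. This is a sketch under assumptions: ValueSet, range, odd, exactMeet, lossyMeet and join are hypothetical names, and join here is exact set union, used only to state the law, not something a bounded representation could always express.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: bit i is set when the value i satisfies the constraint.
using ValueSet = uint32_t;

// { x >= lo && x <= hi }, restricted to the modelled domain 0..31.
inline ValueSet range(unsigned lo, unsigned hi) {
  ValueSet s = 0;
  for (unsigned i = lo; i <= hi && i < 32; i++) {
    s |= ValueSet(1) << i;
  }
  return s;
}

// { x & 1 }: all odd values in 0..31.
inline ValueSet odd() { return 0xAAAAAAAAu; }

// Full-precision meet (logical AND) and join (logical OR).
inline ValueSet exactMeet(ValueSet a, ValueSet b) { return a & b; }
inline ValueSet join(ValueSet a, ValueSet b) { return a | b; }

// A capacity-limited meet that keeps only the left operand, i.e. the
// "a ^ b = a (just ignore b)" choice described above.
inline ValueSet lossyMeet(ValueSet a, ValueSet /*b*/) { return a; }
```

With a = range(10, 20) and b = odd(), the exact meet satisfies absorption, while the lossy meet leaves even values such as 10 in the result, so (a ^ b) v b is no longer b.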

kripken · 2026-06-09T16:39:45Z

@MaxGraey

> Value -> Term

Good idea, I think that makes sense.

> Result -> KnownTruth

I think this is clear enough already, and shorter?

> ConstraintSet -> Conjunction

I left this intentionally vague as this may expand in the future. A set of constraints is, atm, a conjunction, but if we find a nice way to allow OR and not just AND, we should add it. The idea is, conceptually, a set of constraints that can prove things.

MaxGraey · 2026-06-09T16:58:49Z

Btw binaryen already has some basic semi and full lattices: https://github.com/WebAssembly/binaryen/blob/main/src/analysis/lattice.h and https://github.com/WebAssembly/binaryen/blob/main/src/analysis/lattices/abstraction.h infra. So how about this?

class LowerBound : Lattice { ... }
class UpperBound : FullLattice { ... }
class RangeBound : FullLattice { ... }

tlively · 2026-06-09T17:27:35Z

> Specifically, having a limited capacity (number of constraints in a set), as in the current design, is really nice for efficiency, but makes it not a lattice.

Certainly you cannot have all three of these properties:

1. The constraint system is a lattice.
2. The constraints have bounded representation.
3. The join and meet operators are logical or and logical and with full precision.

We both agree that we must give up on (3). I'm just saying that we should design the system such that we can still have (1) and (2) instead of just (2). Designing the system to be a lattice will require some care and perhaps some additional compromises on precision, but I strongly believe the benefits would be worth it.

One option would be to come up with all the constraints we're interested in, then figure out how to structure them properly to make sure they form a lattice.

Another way to keep things simple would be to use the product of multiple simpler lattices for the analysis. For example, we could simultaneously do a range analysis, sign analysis, and bit analysis, each of which is very simple to understand. Each individual component of the larger lattice could be developed independently.

Otherwise we should just pull in Z3 rather than reinventing the wheel for an arbitrary constraint solver.
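
The product-of-simple-lattices idea mentioned above can be sketched as follows. Sign, Parity, joinFlat and Product are hypothetical names for illustration (this is not Binaryen's lattice infrastructure); each component is a flat lattice, and the product joins componentwise, so each component can be written and tested on its own.

```cpp
#include <cassert>

// Hypothetical flat lattices: Bottom < every concrete value < Top.
enum class Sign { Bottom, Negative, Zero, Positive, Top };
enum class Parity { Bottom, Even, Odd, Top };

// Join for any flat lattice enum that has Bottom and Top members.
template<typename E> E joinFlat(E a, E b) {
  if (a == E::Bottom) {
    return b;
  }
  if (b == E::Bottom) {
    return a;
  }
  return a == b ? a : E::Top;
}

// The product lattice: one element from each component lattice.
struct Product {
  Sign sign;
  Parity parity;
  Product(Sign sign, Parity parity) : sign(sign), parity(parity) {}
  bool operator==(const Product& other) const {
    return sign == other.sign && parity == other.parity;
  }
};

// Componentwise join: each component loses precision independently.
inline Product join(const Product& a, const Product& b) {
  return Product(joinFlat(a.sign, b.sign), joinFlat(a.parity, b.parity));
}
```

Joining "positive and odd" with "positive and even" keeps the sign and only sends the parity to Top, which is the kind of independent degradation a product gives.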

kripken · 2026-06-09T20:33:58Z

One option would be to come up with all the constraints we're interested in, then figure out how to structure them properly to make sure they form a lattice.

Here are the ones we know we want, from direct user feedback:

Ranges, things like x >= 10 && x <= 20, for bounds check removal
Equality/inequality, like x != null for null check removal
Subtyping checks, like isSubType(x, T) for cast check removal

And possibly other things that are common in branch conditions (the above three are all derived from that). E.g. if `if (x & 1) { .. }` is common then we want `x & 1` - we could gather data here to be more precise.

The product of lattices for each of those things grows large quickly, so I worry about overhead there.

Z3 is definitely an option in the long term, but (1) we need only a tiny subset of it, and (2) we really want a wasm constraint solver, and implementing wasm-isms in Z3 (like subtyping etc.) may be difficult/inefficient.

To be clear about the scope: This rather small PR adds !=, ==. Later PRs can add <=, >, isSubType and will not grow this code by much. The final step is a generic pass that uses this in a control-flow-aware way - that is code we need even if we use Z3.

So all this will remain quite small and focused: this is not a big project!

tlively · 2026-06-09T22:23:55Z

The most trivial way to turn the current constraint system into a lattice is to define a total order on Constraint:

join (i.e. fuzzyOr) would do what the current code does, where it sees if the new constraints are subsumed by or subsume the current constraints. If not, the result would be top (which is similar to what the current code does, except that the empty constraint set does double duty as top and bottom in the current code).
meet (i.e. and_) would similarly check whether the new constraints are subsumed by or subsume the current constraints. Any extra new constraints (i.e. those not related to existing constraints) would be added if there is space for them, and otherwise only the top 3 constraints according to the arbitrary total order on constraints would be kept.

This logic can be abstracted out into a generic BoundedConjunction<L, N> lattice, then Constraint can be its own lattice where checks (or proves or whatever) is the < operator for the lattice, then the analysis here could just use BoundedConjunction<Constraint, 3>.

(Note that BoundedConjunction would be a join semilattice, not a full lattice. But that's sufficient for our purposes. We probably should choose a name other than meet to avoid confusion.)
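
A simplified sketch of the bounded-conjunction idea follows. It is not the design in this PR or in any follow-up: it assumes atoms are independent (no atom subsumes another), so the join is plain set intersection rather than the subsumption check described above, and all names (BoundedConjunction's members, of, implies, boundedMeet, Conj1, Conj2) are hypothetical. The total order on atoms is whatever `operator<` provides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <set>

// Hypothetical sketch: a conjunction of at most N atoms. The empty set is top
// (nothing is known). Atoms are assumed independent of one another.
template<typename Atom, std::size_t N> struct BoundedConjunction {
  std::set<Atom> atoms;

  static BoundedConjunction of(std::initializer_list<Atom> list) {
    BoundedConjunction out;
    out.atoms.insert(list.begin(), list.end());
    assert(out.atoms.size() <= N);
    return out;
  }

  bool operator==(const BoundedConjunction& other) const {
    return atoms == other.atoms;
  }

  // *this proves everything `other` does when it contains all of its atoms.
  bool implies(const BoundedConjunction& other) const {
    return std::includes(atoms.begin(), atoms.end(),
                         other.atoms.begin(), other.atoms.end());
  }

  // Join: keep only what both sides know. Never exceeds N atoms.
  static BoundedConjunction join(const BoundedConjunction& a,
                                 const BoundedConjunction& b) {
    BoundedConjunction out;
    std::set_intersection(a.atoms.begin(), a.atoms.end(),
                          b.atoms.begin(), b.atoms.end(),
                          std::inserter(out.atoms, out.atoms.end()));
    return out;
  }

  // Bounded "meet": union the atoms, then drop the largest ones (by the total
  // order) until at most N remain. The result may fail to imply an input.
  static BoundedConjunction boundedMeet(const BoundedConjunction& a,
                                        const BoundedConjunction& b) {
    BoundedConjunction out = a;
    out.atoms.insert(b.atoms.begin(), b.atoms.end());
    while (out.atoms.size() > N) {
      out.atoms.erase(std::prev(out.atoms.end()));
    }
    return out;
  }
};

// Aliases with integer atoms standing in for real constraints.
using Conj1 = BoundedConjunction<int, 1>;
using Conj2 = BoundedConjunction<int, 2>;
```

With capacity 1, `Conj1::boundedMeet(Conj1::of({5}), Conj1::of({10}))` keeps only atom 5 and therefore no longer implies the second input, which is why the operation should not be called a meet.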

kripken · 2026-06-09T22:54:54Z

I don't quite see how this would work - again, the problem is the capacity, which is bounded - but I figured that maybe I'm just not seeing it? So I asked Gemini, giving it this PR's code + an overview + your last comment. It wrote a bunch of code, ending up with a semilattice - not a lattice. Here is Gemini's explanation when I asked why:

> You wrote `BoundedConjunction is a join-semilattice, but not a full lattice.` why is it not a full lattice?

  BoundedConjunction<L, N> is not a full lattice (specifically, not a meet-semilattice) because the meet (greatest lower bound, corresponding to logical AND) does not always exist due to the size limit N.

  In a full lattice, for any two elements A and B, there must exist a unique greatest lower bound A ∧ B (meet) such that:

  1. (A ∧ B) ≤ A
  2. (A ∧ B) ≤ B
  3. For any C where C ≤ A and C ≤ B, then C ≤ (A ∧ B).

  For a conjunction of constraints, the natural meet is the union of the constraints (e.g., {x == 5} ∧ {y == 10} = {x == 5, y == 10}). This represents a state that is more specific (lower in the lattice) than both.

  However, in BoundedConjunction<L, N>, if the union of constraints exceeds the size N, we are forced to discard some constraints to stay within the limit.

  Discarding a constraint makes the set less restrictive (larger in the lattice, closer to Top). As a result, the bounded approximation of the meet is no longer guaranteed to be a lower bound of the inputs.

  ### Example

  Let N = 1, and we want to find the meet of:

  • A = {x == 5}
  • B = {y == 10}

  1. The true mathematical meet is {x == 5, y == 10}.
  2. Since N = 1, we cannot represent this. We must discard one constraint, say we keep only {x == 5}.
  3. Let our approximated meet be M = {x == 5}.
  4. For M to be a valid meet, it must be a lower bound of both A and B (M ≤ A and M ≤ B).
      • M ≤ A holds because {x == 5} ≤ {x == 5}.
      • M ≤ B does not hold because {x == 5} ≰ {y == 10} (knowing x == 5 does not imply y == 10).

  Since we cannot construct a valid lower bound for all pairs of elements, the meet operation is not closed under this bounded representation, making it a join-semilattice (where the join/OR always exists, defaulting to the empty set/Top if no common constraints are found) but not a full lattice.

Am I and Gemini wrong?

tlively · 2026-06-10T00:14:11Z

No, that's exactly correct. See the parenthetical note I added to my previous comment in an edit. And it's mostly fine that it's a semilattice because the generic worklist algorithm that propagates information to find a fixed point only does joins. The only catch is that the transfer function will use boundedMeet, which will not be monotonic :( AFAICT, this means that we might not get order-independence after :(((( But the factoring of the code will still be much nicer IMO :)
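
For reference, the generic worklist propagation mentioned above can be sketched like this. It is a minimal illustration under assumptions, not Binaryen's analysis framework: the CFG is a plain successor list with block 0 as the entry, `analyze` and `runDiamondExample` are hypothetical names, and the example lattice is a bitmask of facts proven on all paths (join is bitwise AND, bottom is all ones for unreached blocks).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical forward analysis: succs[i] lists the successors of block i and
// block 0 is the entry. Only joins are used to merge state at block inputs.
template<typename L, typename Join, typename Transfer>
std::vector<L> analyze(const std::vector<std::vector<std::size_t>>& succs,
                       const L& bottom,
                       const L& entryState,
                       Join join,
                       Transfer transfer) {
  std::vector<L> in(succs.size(), bottom);
  if (succs.empty()) {
    return in;
  }
  in[0] = entryState;
  std::deque<std::size_t> work{0};
  while (!work.empty()) {
    std::size_t block = work.front();
    work.pop_front();
    L out = transfer(block, in[block]);
    for (std::size_t succ : succs[block]) {
      L merged = join(in[succ], out);
      if (!(merged == in[succ])) {
        // The input of the successor changed, so it must be revisited.
        in[succ] = merged;
        work.push_back(succ);
      }
    }
  }
  return in;
}

// Example: a diamond CFG 0 -> {1, 2} -> 3. Each block i proves fact i (bit i).
// A fact survives a merge only if it holds on every incoming path.
inline std::vector<unsigned> runDiamondExample() {
  std::vector<std::vector<std::size_t>> succs = {{1, 2}, {3}, {3}, {}};
  auto join = [](unsigned a, unsigned b) { return a & b; };
  auto transfer = [](std::size_t block, unsigned in) {
    return in | (1u << block);
  };
  return analyze<unsigned>(succs, ~0u, 0u, join, transfer);
}
```

In the diamond example only the fact established before the branch reaches the merge block, since the facts from blocks 1 and 2 each hold on just one path. Termination relies on the join being monotonic over a finite-height lattice, which is where a non-monotonic transfer function becomes a concern.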

kripken · 2026-06-10T00:27:10Z

Ok, good, then we are on the same page - this is not a lattice, so we lose all the nice properties that a lattice normally has.

That leaves the code factoring as a possible benefit. But when I ran Gemini on this, I didn't see a code benefit either - mostly a bunch of new boilerplate to fit into the Lattice framework. Unless you have a way to do this without boilerplate that actually reduces code rather than adds?

tlively · 2026-06-10T00:37:27Z

#8821 and #8824 show the generic lattices we could add. Obviously the code is more complex if you count the heavily-templated lattice implementations, but I don't think that's the right way to look at it. Even supposing that we never reuse the lattices for anything else (although we could!), factoring the constraint system into composed lattices makes it much easier to focus on the interesting things and abstract away all the complexity around managing our knowledge of independent constraints. It also makes the code much more unit-testable and fuzzable.

kripken added 17 commits June 8, 2026 14:07

go

0fc9e55

go

5ad1b75

go

8181363

go

d0ad2f4

go

c5b7d1d

go

0e35b2b

go

60d50b4

feedback

626b5d7

feedback

5896406

fix

cf29b58

clean

cdaff6b

Merge remote-tracking branch 'myself/inplace' into constraint.by.itself

c02ba2e

clean

94d2161

clean

735d7ea

const

6e80fde

undo CFP change

cf7fcc6

Merge branch 'inplace' into constraint.by.itself

ac02454

kripken requested a review from a team as a code owner June 8, 2026 23:12

kripken requested review from stevenfontanella and removed request for a team June 8, 2026 23:12

kripken added 4 commits June 8, 2026 16:14

tidy

0930461

add.assert

7b7d2ac

fix.comment

679bd24

Merge remote-tracking branch 'origin/main' into constraint.by.itself

920e7a9

tlively reviewed Jun 8, 2026


value => term

44ad794

kripken mentioned this pull request Jun 9, 2026

Add a OneOf lattice #8821

Draft
