Add static checks via Liquid Haskell annotations#1
Open
ninioArtillero wants to merge 5 commits into
Open
Conversation
744bbe7 to
7b679ef
Compare
Member
|
👋 @ninioArtillero, maybe check the instructions in the README once more. I think eliminating the cabal.project file is for the better, as no rebuild of LH happens now. Moreover, |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR adds static checks using Liquid Haskell (LH). The PR does not modify the existing functions, except for one small change with no semantic effect, explained below.
The Liquid Haskell annotations are ignored in regular builds of Diff, but they can be checked by building a new package
Diff-liquidhaskellthat re-uses the source tree of the originalDiffpackage, so that the actual code ofDiffis checked. Usually, using a cabal flag is possible to enable Liquid Haskell instead of creating a separate package, but here that would create a circular dependency sinceDiffis a transitive dependency ofliquidhaskell.Liquid Haskell checks the Haskell modules by using an external SMT solver, which needs to be installed separately. A README is provided with a note about it.
The static checks were developed during a Tweag's project, described in this blog post; the main goal being to strenghten the codebase with static checks of previously informal invariants and expectations. The post explains the full scope of the project, a fraction of which is contributed in this PR, with other PRs to follow if desired.
In the following sections (statically cheked invariants, termination metrics and omitted checks) I explain the meaning of the LH annotations contained in this PR. The last section addresses some possible concerns from the maintenance perspective.
Statically checked invariants
The
DLdata type has the following specification:{-@ data DL = DL { poi :: Nat , poj :: Nat , path :: { p : [DI] | len p <= poi + poj } } @-}This checks that
DLcoordinate values are always natural numbers (LH definesNatasIntvalues that are>= 0), and that their edit trace lenght is bounded by their sum. This specification is in accordance with the algorithm definition and an invariant of the implementation that is expected from anyDLvalue.The
dstepfunction has this specification:{-@ dstep :: (Nat -> Nat -> Bool) -> {nodes : [DL] | len nodes > 0} -> {v : [DL] | len v = len nodes + 1} @-}Which requires its function argument (the parameterized version of
canDiag, a.k.a. the diagonal predicate) to only have natural number arguments (pre-condition), the node list argument to be non-empty (pre-condition) and the resulting node list to increase the input length by one (post-condition). Note how we bind thenodesvariable to the input value andvto the output so that we can relate them in the post-condition. This specification provides a detail on the shape of the result, which relates directly to the original algorithm inner loop.The last provided specification is
{-@ addsnake :: (Nat -> Nat -> Bool) -> x : DL -> {v : DL | path v == path x } @-}Which again requires the function argument (the diagonal predicate) to have only natural number arguments and specifies that input and output nodes must have the same edit trace, which is the fundamental requirement for
addsnaketo do what it ought.The pre- and post-conditions forming the function specifidcations are checked at each call of this functions (unless instructed otherwise).
Termination metrics
Termination metrics are introduced using the
/ [metric1, metric2,...]syntax at the end of a function specification.They point LH to a reducing quantity across recursive calls in a recursive function definition, which it uses to prove that such function terminates. Sometimes
metric1decreases, but when it stays constant, thenmetric2should decrease, and so on to define a lexicographic order.All the introduced metrics are appended to specifications matching the existing function signatures, so no additional checks have been introduced:
{-@ go :: Hunk c -> xs : [Diff [c]] -> [Hunk c] / [len xs] @-} {-@ toLineRange :: Int -> Int -> diffs : [Diff[String]] -> [DiffOperation LineRange] / [len diffs, 0] @-} {-@ toChange :: Int -> Int -> [String] -> [String] -> diffs : [Diff [String]] -> [DiffOperation LineRange] / [len diffs, 1] @-} {-@ doParse :: [DiffOperation LineRange] -> diffs : [String] -> [DiffOperation LineRange] / [len diffs] @-}The relevant example among these are the mutually recursive
doPreffixanddoSuffix:{-@ doPrefix :: h : Hunk c -> [Diff [c]] / [len h, 0]@-} {-@ doSuffix :: h : Hunk c -> [Diff [c]] / [len h, 1] @-}Which uses the fact that, in most equations, the (mutually) recursive call is performed on the tail of the list, so the list length is reduced. The second metric (just constants) is a fallback for the case of
doSuffixsecond guard of its third equation wheredoPreffixis called on a list of equal length as the input; in this case the second metric decreases from 1 to 0.The only source change in this PR is a small refactoring introduced for this termination check to work:
This does not change the semantics, because when we reach this equation
dis known to not be aBothvalue and this takes us to the lastdoSuffixequation that does the same thing.Omitted checks
The following annotations
{-@ lazy addsnake @-} {-@ ignore ses @-}Prevent LH from trying to prove termination for
addsnakeand omit all checks insidesesdefinition body. This is so because both are significantly more complex to check and are left for another PR.Other changes
READMEfile explains how to build and test Diff, and how to run Liquid Haskell..gitignorefile was updated to also ignore nested build outputs.Notes on maintenance
Here are some arguments that support the integration of these static checks:
Diff-liquidhaskell.Diffmodule), which could help identify deviations from the expected space-time complexity and possible optimizations.A question that might naturally arise is: What is a maintainer supposed to make of a failing LH check whose errors she doesn't understand? e.g. an error turns out to be uninformative. In relation to the previous points, these checks can't break the package build or CI (unless added to the pipeline), they are currently intended to be run locally and can be safely ignored.
On the other hand, if the decision is not to ignore the check failures, the failing check readily signals that either the changes violate a specification or introduce new ambiguity that LH cannot solve from existing specifications. The latter case might arise e.g. from a new function definition using other annotated functions or lacking a termination metric if it's recursive; this can be solved simply by adding a
lazyorignoreannotation to the offending function. Cases of the former kind can be trickier to diagnose and fix, but are probably worth a detailed look, e.g. the refactoring of a annotated function might be violating a property we care about. Alternatively the changes could be making a specification obsolete, so removing it altogether is a way forward if writing a new one is not viable given effort constraints.Both Haskell and Liquid Haskell are technologies we endorse at Tweag, so we have vested interest in their improvement and adoption, and are generally available to help with whatever problem comes up. We are currently working on improving error messages, documentation and developer experience. Helping maintain this annotations and fix surfacing bugs align with our current agenda.