Switch to more efficient sexp datatype #1365
Open
myreen wants to merge 17 commits into
Rewrite compiler/parsing/fromSexpScript.sml to use mlsexp (Atom/Expr) from basis/pure instead of simpleSexp from HOL4's context-free examples. This is Phase 1 of eliminating simpleSexp from CakeML.

Key changes:
- Ancestor: mlsexp replaces simpleSexpParse
- Holmakefile: INCLUDES basis/pure instead of HOL4 context-free
- Encoding: SX_SYM/SX_NUM/SX_STR/SX_CONS replaced by Atom/Expr
- listsexp xs = Expr xs (trivial, lists are native)
- dstrip_sexp extracts tag + args from Expr (Atom tag :: args)
- All roundtrip proofs (encoder/decoder bijection) updated
- dstrip_sexp_SOME uses strlit nm form for efficient gvs resolution

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
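To make the encoding change above concrete, here is a minimal Python model of the Atom/Expr shape. It is only a sketch, not the HOL definitions: the names `Atom`, `Expr`, `listsexp` and `dstrip_sexp` come from the commit message, while the field names `text` and `items` and the exact failure behaviour (returning `None`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple, Union


@dataclass(frozen=True)
class Atom:
    text: str  # assumed field name; the HOL datatype carries an mlstring


@dataclass(frozen=True)
class Expr:
    items: Tuple["Sexp", ...]  # assumed field name; children are held natively


Sexp = Union[Atom, Expr]


def listsexp(xs: Iterable[Sexp]) -> Expr:
    """listsexp xs = Expr xs: a list needs no cons-cell encoding."""
    return Expr(tuple(xs))


def dstrip_sexp(s: Sexp) -> Optional[Tuple[str, Tuple[Sexp, ...]]]:
    """Split Expr (Atom tag :: args) into (tag, args); None for any other shape."""
    if isinstance(s, Expr) and s.items and isinstance(s.items[0], Atom):
        return s.items[0].text, s.items[1:]
    return None
```

With a hypothetical tag `"App"`, `dstrip_sexp(listsexp([Atom("App"), Atom("f"), Atom("x")]))` yields the tag together with the two argument atoms, whereas a bare `Atom` or an empty `Expr` yields `None`.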
Use mlsexp$fromString instead of parse_sexp for sexp input parsing, and sexp_to_string instead of print_sexp for sexp output. Remove simpleSexpParse ancestor and formal-languages/context-free includes from compiler, scheme, and dafny Holmakefiles.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n files

SexpProg (in basis/) already translates the mlsexp parser/printer to CakeML, so the translation files no longer need to translate simpleSexp's PEG parser, printer, or destructor functions. Remove ~300 lines of now-unnecessary code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor
In addition to the proof failure in
Contributor
The sexp_parser failure is known; I just didn't want to fix it manually yet...
Resolve conflicts in compiler/compilerScript.sml and compiler/parsing/fromSexpScript.sml by keeping the sexp-switch representation (Atom/Expr + mlsexp$fromString); master's conflicting blocks were in the old SX_CONS/SX_SYM/parse_sexp representation that this branch replaces.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The rewritten proofs referenced mlstringTheory.implode_def, which no longer exists now that mlstring's constructor is `implode` (with `strlit` an inferior overload of it). Those references caused static errors that aborted the theory. Since `strlit = implode` is definitional, the implode_def rewrite was a no-op; the affected proofs close with the existing implode_explode/explode_implode lemmas.

Also fixes the Char witness in litsexp_sexplit: `str c` (a string) -> `implode [c]` (the correct mlstring).

fromSexpTheory now builds with all proofs complete.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SXNUM is now an ordinary smart-constructor definition (SXNUM n = Atom (toString (&n))) rather than a simpleSexp datatype constructor, so the translator no longer handles it automatically. Translate fromSexpTheory.SXNUM_def before locnsexp_def (its first use) so the encoder translations close.

to_sexpProgTheory now builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
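The smart-constructor point can be modelled in a few lines of Python. This is a sketch under assumptions rather than the HOL definition: `sxnum` mirrors SXNUM n = Atom (toString (&n)) for natural numbers, and the inverse `dest_sxnum` is a hypothetical helper (not a name from this PR) added only to show that a number no longer has its own constructor and must be recovered from the atom's text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Atom:
    text: str  # assumed field name


def sxnum(n: int) -> Atom:
    """Model of SXNUM: an ordinary function that builds a decimal Atom."""
    if n < 0:
        raise ValueError("SXNUM takes a natural number")
    return Atom(str(n))


def dest_sxnum(s: object) -> Optional[int]:
    """Hypothetical inverse: accept only atoms made of ASCII decimal digits."""
    if isinstance(s, Atom) and s.text.isascii() and s.text.isdigit():
        return int(s.text)
    return None
```

Because `sxnum` is a function rather than a constructor, any tool that walks definitions in dependency order has to process it before its first user, which matches the ordering constraint described in the commit message.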
- Translate fromSexpTheory.SXNUM_def before locnsexp_def: SXNUM is now a smart-constructor function, not a datatype constructor, so the translator no longer handles it automatically.
- Remove the obsolete litsexp_side_thm: litsexp now translates without a precondition, so litsexp_side no longer exists.
- Restore HOL term/type quotation marks on the main_function sanity check and the program-assembly antiquotation, which an earlier commit had turned into ASCII string literals.

dafny_compilerProgTheory now builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Assisted-by: Claude:claude-opus-4-8[1m]
Contributor
Claude did some really bad merging...
The switch from simpleSexpParse to mlsexp dropped simpleSexpParse from the Ancestors, removing print_sexp. The --print_sexp path in compile_def still referenced it. Use mlsexp$sexp_to_string (flat output) instead; it returns an mlstring directly, so the old implode/"++" wrapping is no longer needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
astToSexprLib is a hand-written SML mirror of the fromSexpScript.sml encoders that serialises a CakeML AST term to s-expression text. It still produced the old simpleSexp format, which no longer matches the migrated mlsexp-based encoders/decoders, so its output would not re-parse.

Rewrite it to emit exactly what mlsexp$sexp_to_string would print for the encoder results (decsexp/expsexp/...), keeping the public API (write_ast, write_ast_to_file) unchanged so all consumers are unaffected.

Notable format changes (mlsexp sexp = Atom | Expr):
- no dotted pairs/tuples; pairs become 2-element lists, empty list "()"
- atoms quoted only when unsafe (faithful encode_control + make_str_safe + escape_str ports), using an ordinal isPrint (32..126) test
- IntLit now tagged with a "~" sign; Char via SEXSTR; StrLit bare; words/Float64 via decimal SXNUM
- locations as nested s-expressions, incl. EOFpt
- ThunkOp operators handled; explicit op->tag table that keeps Vsub_unsafe's underscore while stripping the other unsafe ops

Verified in HOL: byte-exact match against sexp_to_string (decsexp d) for every literal/op/pattern/type/declaration form (incl. tricky strings), and a full write_ast -> fromString -> sexplist sexpdec round-trip recovering the original program.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
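The first two format changes listed above can be illustrated with a small Python printer. This is a sketch under stated assumptions, not a port of mlsexp: only the 32..126 printable range, the "()" empty list and the pairs-as-2-element-lists rule come from the commit message. The set of characters that makes an atom unsafe and the backslash escape syntax are assumptions for illustration, and `atom_is_safe`, `quote_atom`, `to_string` and `pair` are hypothetical names that do not reproduce encode_control, make_str_safe or escape_str.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    text: str


@dataclass(frozen=True)
class Expr:
    items: Tuple["Sexp", ...]


Sexp = Union[Atom, Expr]


def is_print(c: str) -> bool:
    """Ordinal printable test from the commit message: codes 32..126."""
    return 32 <= ord(c) <= 126


def atom_is_safe(a: str) -> bool:
    """ASSUMED rule: non-empty, printable, and free of delimiter characters."""
    return a != "" and all(is_print(c) and c not in ' ()"\\' for c in a)


def quote_atom(a: str) -> str:
    """ASSUMED escape syntax: backslash escapes, hex form for non-printables."""
    out = []
    for c in a:
        if c in '"\\':
            out.append("\\" + c)
        elif is_print(c):
            out.append(c)
        else:
            out.append("\\x%02x" % ord(c))
    return '"' + "".join(out) + '"'


def to_string(s: Sexp) -> str:
    """Flat printer: atoms are quoted only when unsafe, lists are space-separated."""
    if isinstance(s, Atom):
        return s.text if atom_is_safe(s.text) else quote_atom(s.text)
    return "(" + " ".join(to_string(x) for x in s.items) + ")"


def pair(a: Sexp, b: Sexp) -> Expr:
    """A pair is a 2-element list; there are no dotted pairs."""
    return Expr((a, b))
```

A byte-exact comparison like the one described in the commit message would check a hand-written printer of this kind against the reference printer for every encoder output, which is why the real quoting and escaping rules have to be ported faithfully rather than approximated as they are here.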
Contributor
Seems like it builds now. Before this gets merged, the changes should be documented (including checking for regressions). One example of a silent regression: