Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persistent structure-based term identifiers #1830

Merged
merged 18 commits into from
Mar 30, 2023
Merged

Conversation

samcowger
Copy link
Contributor

This adds a field to STApp-constructed Terms to store the hash of the TermF they encompass, giving shared terms a readily-available, (relatively) unique identifier based solely on the shape/structure of their subterms. The goal of this is to provide terms with a persistent, concise reference that does not change between SAW invocations.

In doing this, I changed the Hashable and Eq instances to leverage this field. Before, Hashable was implemented solely in terms of stAppIndex, which was basically guaranteed unique for any single SAW invocation. Now, Hashable is implemented solely in terms of this stored hash. This slightly increases the possibility of term hash collisions, but I believe the likelihood of collisions is no worse than that of automatically-derived instances. RFC: should we now handwrite a Hashable instance for TermF to mitigate this risk somewhat?

One use case for this (which this PR does not implement, but which a future PR will) is to allow users to print terms with memoization based on these references, rather than based on mercurially-unique integer tags. This would help users to maintain context in larger, heavily-memoized terms (e.g. a proof goal) as users enter and exit SAW REPLs and proof subshells in the process of proof development.

I originally made these changes to avoid computing `TermF` hashes twice, but in
doing so assumed a total absence of term hash collisions. The cost of that
assumption far outweighs the cost of hash recomputation which, at least for
terms with cached subterm hashes (which I expect to encompass most terms), is
actually quite cheap.
@samcowger
Copy link
Contributor Author

One alternative to the approach implemented here, if there is substantial concern about changing the Hashable instance, would be to define a parallel class, say Hashable', whose only purpose would be to provide a structure-based hash in tandem with the current index-based hash, and implement it on the Term and its subordinate types. We would still need to modify the representation of STApp Terms to cache this hash, but could otherwise leave the Hashable and Eq instances unchanged.

Copy link
Contributor

@eddywestbrook eddywestbrook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you are changing the Term datatype, please add a haddock description to it to explain the design concepts associated with it, including: that the STApp constructor implements hash-consing (maybe put a pointer to Wikipedia or something to explain the concept?) using the stAppIndex field; that the stAppIndex field is always meant to be unique; that the stAppTermF field "ties the knot" for the TermF data-type (it would be great if you could dig up a reference for this style of datatype functor, but it's fine if you can't); and that the stAppHash and stAppFreeVars fields cache the free variables and hash values of the term.

@samcowger
Copy link
Contributor Author

While you are changing the Term datatype, please add a haddock description to it to explain the design concepts associated with it, including: that the STApp constructor implements hash-consing (maybe put a pointer to Wikipedia or something to explain the concept?) using the stAppIndex field; that the stAppIndex field is always meant to be unique; that the stAppTermF field "ties the knot" for the TermF data-type (it would be great if you could dig up a reference for this style of datatype functor, but it's fine if you can't); and that the stAppHash and stAppFreeVars fields cache the free variables and hash values of the term.

Can do!

Copy link
Contributor

@eddywestbrook eddywestbrook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks for adding those haddocks!

@samcowger samcowger added the PR: ready to merge Magic flag for pull requests to ask Mergify to merge given an approval and a successful CI run label Mar 30, 2023
@mergify mergify bot merged commit ab51c9b into master Mar 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: ready to merge Magic flag for pull requests to ask Mergify to merge given an approval and a successful CI run
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants