-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Persistent structure-based term identifiers #1830
Conversation
I originally made these changes to avoid computing `TermF` hashes twice, but in doing so assumed a total absence of term hash collisions. The cost of that assumption far outweighs the cost of hash recomputation which, at least for terms with cached subterm hashes (which I expect to encompass most terms), is actually quite cheap.
One alternative to the approach implemented here, if there is substantial concern about changing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you are changing the Term
datatype, please add a haddock description to it to explain the design concepts associated with it, including: that the STApp
constructor implements hash-consing (maybe put a pointer to Wikipedia or something to explain the concept?) using the stAppIndex
field; that the stAppIndex
field is always meant to be unique; that the stAppTermF
field "ties the knot" for the TermF
data-type (it would be great if you could dig up a reference for this style of datatype functor, but it's fine if you can't); and that the stAppHash
and stAppFreeVars
fields cache the free variables and hash values of the term.
Can do! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, thanks for adding those haddocks!
This adds a field to
STApp
-constructedTerm
s to store the hash of theTermF
they encompass, giving shared terms a readily-available, (relatively) unique identifier based solely on the shape/structure of their subterms. The goal of this is to provide terms with a persistent, concise reference that does not change between SAW invocations.In doing this, I changed the
Hashable
andEq
instances to leverage this field. Before,Hashable
was implemented solely in terms ofstAppIndex
, which was basically guaranteed unique for any single SAW invocation. Now,Hashable
is implemented solely in terms of this stored hash. This slightly increases the possibility of term hash collisions, but I believe the likelihood of collisions is no worse than that of automatically-derived instances. RFC: should we now handwrite aHashable
instance forTermF
to mitigate this risk somewhat?One use case for this (which this PR does not implement, but which a future PR will) is to allow users to print terms with memoization based on these references, rather than based on mercurially-unique integer tags. This would help users to maintain context in larger, heavily-memoized terms (e.g. a proof goal) as users enter and exit SAW REPLs and proof subshells in the process of proof development.