Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add UniqueSet, remove direct name comparisons #5815

Merged
merged 6 commits into from
Mar 6, 2024

Conversation

ana-pantilie
Copy link
Contributor

@ana-pantilie ana-pantilie commented Mar 4, 2024

  • adds UniqueSet module
  • splits PlutusCore.Name into 3 modules: PlutusCore.Name.Unique (containing the main Name -> Unique logic), PlutusCore.Name.UniqueMap and PlutusCore.Name.UniqueSet which depend on Unique
  • reasoning for ^: to split common namespace, similar to the standard Map and Set; UniqueMap and UniqueSet should be imported qualified
  • replaces the standard Map and Set containers for names with UniqueMap and UniqueSet such that all comparisons are now done using Unique instead of Ord name

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Changelog fragments have been written (if appropriate)
    • Relevant tickets are mentioned in commit messages
    • Formatting, PNG optimization, etc. are updated
  • PR
    • (For external contributions) Corresponding issue exists and is linked in the description
    • Targeting master unless this is a cherry-pick backport
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

@ana-pantilie ana-pantilie marked this pull request as ready for review March 5, 2024 11:41
@ana-pantilie ana-pantilie changed the title Add UniqueSet Add UniqueSet, remove Ord name constraints Mar 5, 2024
@ana-pantilie ana-pantilie changed the title Add UniqueSet, remove Ord name constraints Add UniqueSet, remove direct name comparisons Mar 5, 2024
@ana-pantilie ana-pantilie requested review from effectfully and a team March 5, 2024 11:54
Copy link
Contributor

@michaelpj michaelpj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few doc thoughts, and some musings about making the interfaces more generic by making more use of typeclasses/lenses.

import Data.IntSet.Lens qualified as IS
import PlutusCore.Name.Unique (HasUnique (..), Unique (Unique))

-- | A set containing uniques.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this good? I have honestly forgotten, and it's not in the PR description, or more importantly, in the doc for this type. How will we remember when to use it or not? What were the considerations that led to us defining this? I'm sure you know - write it down!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same applies to UniqueMap

}
deriving newtype (Show, Eq, Semigroup, Monoid)

-- | Insert a value by a unique.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments read as more "map-y" (indeed, I think they're just copied from the map version). I think it could just be "insert a unique". And similarly for the other comments.


-- | Insert a value by the unique of a name.
insertByName :: (HasUnique name unique) => name -> UniqueSet unique -> UniqueSet unique
insertByName = insertByUnique . view unique
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh, I slightly question whether these functions pass the Fairbarn threshold?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise most of them existed already, I'm just questioning whether they should!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other functions are defined based on insertByUnique and insertByName to avoid redoing all back and forth between the ints and the uniques and the names

unique ->
UniqueSet unique ->
UniqueSet unique
insertByUnique = coerce . IS.insert . coerce
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last coerce is unnecessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's necessary because it coerces the unique into the Key type used by IntSet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was very amibguous! I meant the one that is applied last i.e. the one that is applied first 😅 I think we get back a UniqueSet unique from insert, so no need to coerce it again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's the IntSet insert, its result needs to be wrapped into a UniqueSet.

memberByName = memberByUnique . view unique

notMemberByName :: (HasUnique name unique) => name -> UniqueSet unique -> Bool
notMemberByName n = not . memberByName n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surely this is not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard Set API has it https://hackage.haskell.org/package/containers-0.7/docs/Data-Set.html#v:notMember.

I guess the reason for defining this is that you might want to use it as an operator: x notMember s. I think not $ member x s is a tiny bit harder to read. 😄 That's why I chose to add notMemberByName, but I can remove it if you prefer.

restrictKeys (UniqueMap m) (UniqueSet s) =
UniqueMap $ IM.restrictKeys m s

foldr :: (a -> b -> b) -> b -> UniqueMap unique a -> b
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't get this to work quickly, not sure what's wrong. I'll revisit it the next time I need to do something with UniqueMap/Set (which is very soon).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, how do we not get that with simple deriving Foldable?

singletonByName :: (HasUnique name unique) => name -> a -> UniqueMap unique a
singletonByName n a = insertByName n a mempty

-- | Insert a named value by the index of the unique of the name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly unsure of your use of "index" here? I think you just mean by the unique? Applies throughout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I forgot to fix the docs! Thanks for pointing it out.


-- * Functions
insertByUnique,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -221,35 +223,45 @@ programMapNames f g (Program a v term) = Program a v (termMapNames f g term)
-- Free variables

-- | Get all the free term variables in a term.
fvTerm :: Ord name => Traversal' (Term tyname name uni fun ann) name
fvTerm :: HasUnique name unique => Traversal' (Term tyname name uni fun ann) name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a better signature full stop. Ord name suggests it might work for e.g. de bruijn names, but it really won't.


lookupVarInfo ::
(HasUnique name TermUnique) =>
name ->
S name uni fun a ->
Maybe (VarInfo name uni fun a)
lookupVarInfo n s = lookupName n $ s ^. vars
lookupVarInfo n s = UMap.lookupName n $ s ^. vars
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as #5815 (comment)

Signed-off-by: Ana Pantilie <[email protected]>
@ana-pantilie ana-pantilie added the No Changelog Required Add this to skip the Changelog Check label Mar 6, 2024
@ana-pantilie ana-pantilie merged commit b4f045f into master Mar 6, 2024
5 of 6 checks passed
@ana-pantilie ana-pantilie deleted the ana/plt-8647-unique-set branch March 6, 2024 17:33
Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for such a late review, I'll try to do better next time.

All looks great to me, comments are nitpicks and thinking out loud.

{- | Defines the 'Name' type used for identifiers in Plutus Core together with a technique
to minimise the cost of 'Name' comparisons.

A 'Name' is a piece of text used to identify a variable inside the Plutus Core languages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, it's the Unique that identified a Name, Text is there just for human readability.

therefore, compare the integers instead, which is obviously much more cost-effective.

We distinguish between the names of term variables and type variables by defining wrappers
over 'Name': 'TermName' and 'TyName'. Since the code we usually write is polymorphic in the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no TermName type as far as I'm aware. TermName appears as a constructor in a few places, but that's irrelevant.

UniqueMap (..),
insertByUnique,
insertByName,
singletonByName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: that's some weird formatting.

restrictKeys (UniqueMap m) (UniqueSet s) =
UniqueMap $ IM.restrictKeys m s

foldr :: (a -> b -> b) -> b -> UniqueMap unique a -> b
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, how do we not get that with simple deriving Foldable?


{-# INLINE isEmpty #-}
isEmpty :: UniqueMap unique a -> Bool
isEmpty (UniqueMap m) = IM.null m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and that as well, surely it's just null assuming we derive a Foldable instance for UniqueMap unique.

newtype UniqueMap unique a = UniqueMap
{ unUniqueMap :: IM.IntMap a
}
deriving newtype (Show, Eq, Semigroup, Monoid, Functor)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably originally added those Semigroup and Monoid instances just to be able to use mempty. Semigroup is kinda weird for IntMap: instead of doing <> pointwise it simply drops elements from the second map when their keys exist in the first one, which can backfire. Would be good to check if we can remove those instances, although I guess since it was me who introduced that tech debt, I should probably do that myself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to remove the instances, doesn't seem trivial unfortunately. I'll open an issue.

newtype UniqueSet unique = UniqueSet
{ unUniqueSet :: IS.IntSet
}
deriving newtype (Show, Eq, Semigroup, Monoid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand these instances here are completely fine and we probably should have consistent behavior between UniqueMap and UniqueSet. I wonder if this is why containers developers chose a somewhat weird instance for Semigroup (UniqueMap a).

singletonName n = insertByName n mempty

-- | Convert a 'Foldable' into a 'UniqueSet' using the given insertion function.
fromFoldable ::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just do something like foldMap (singleton . coerce) or something

foldMap is usually lazy, so that one can derive foldr in terms of it for a right-nested structure and just generally define lazy functions, as well as strict ones. foldl' in terms of foldr is easy, foldr in terms of foldl' is literally one of my challenges, so you know it's evil.

foldl' here is gonna be more efficient for a right-nested structure than foldMap (singleton . coerce) and isn't any more hard to define. In the ideal world we'd generalize this to left-nested structures (if that's possible, I don't remember ever doing that) and use tree-like folding, but who cares, foldl' should be enough for most practical use cases and we can always improve it if it ever becomes a problem.

Getting (UniqueSet unique) s name ->
s ->
UniqueSet unique
setOfByName l = views l singletonName
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at Data.Set.Lens, got confused and gonna state that I trust that you got this one right :)

@effectfully
Copy link
Contributor

Oh and thanks a lot for documenting existing code, that's extremely helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
No Changelog Required Add this to skip the Changelog Check
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants