Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Lift instances #343

Merged
merged 3 commits into from
Dec 26, 2021
Merged

Conversation

treeowl
Copy link
Collaborator

@treeowl treeowl commented Dec 21, 2021

Starts work on #342

@treeowl treeowl self-assigned this Dec 21, 2021
@treeowl treeowl requested review from phadej and sjakobi and removed request for phadej December 21, 2021 02:55
unordered-containers.cabal Outdated Show resolved Hide resolved

instance TH.Lift a => TH.Lift (Array a) where
#if MIN_VERSION_template_haskell(2,16,0)
liftTyped ar = [|| fromList' arlen arlist ||]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's disappointing that SmallArray# (and other *Array#) don't have literals, but I don't think we can do better atm.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fortunately, I don't that any overhead converting the arrays from lists will really be noticeable, especially since I expect all this spliced stuff gets floated out as CAFs. Literals would be nice.

Copy link
Contributor

@phadej phadej Dec 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they will be CAFs, but they could been off-heap structures, which would been even better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean just in the program code? Do we have things with pointers that work like that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. Map and IntSet, yes.

Copy link
Member

@sjakobi sjakobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fully fix #342, we should also expose the Array constructor in Data.HashMap.Internal.Array, and for better ergonomics also expose the HashMap constructors in Data.HashMap.Internal.Strict.

unordered-containers.cabal Outdated Show resolved Hide resolved
Comment on lines +481 to +490
fromList' :: Int -> [a] -> Array a
fromList' n xs0 =
CHECK_EQ("fromList'", n, Prelude.length xs0)
run $ do
mary <- new_ n
go xs0 mary 0
where
go [] !mary !_ = return mary
go (!x:xs) mary i = do write mary i x
go xs mary (i+1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for adding this in addition to the existing fromList which looks identical at first glance?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strict in the elements, so the resulting HashMap won't be full of thunks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Can you add haddocks that point this out?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but the leaves may still be thunks .... Hmm.... Maybe it's better to be lazy after all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's better to be lazy after all?

@madgen, what do you think / prefer?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjakobi in my particular application laziness doesn't play much role, so I prefer it there weren't full of thunks, but I don't think that is a superior choice in general. Do what you think would be most generally applicable please.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjakobi, The problem is that it's not a "full of thunks" vs. "not full of thunks". If we're lazy in the leaves, as we really should be, then we should be lazy in the spines as well for best results. But users may want to be strict in the leaves, so we should have a strict lift in the Strict module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. What you mean by leaves. Elements in the HashMap? Isn't the assumption that they will be floated out and made static data if they can?

(That's why I'd like SmallArray# literals, you'll just know that lifted expressions won't have any thunks as it perfectly would be just constructors or literals all the way).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phadej, I guess we want to force construction of the values? Will the derived instance do that? I got a bit confused again.

@treeowl
Copy link
Collaborator Author

treeowl commented Dec 21, 2021

To fully fix #342, we should also expose the Array constructor in Data.HashMap.Internal.Array, and for better ergonomics also expose the HashMap constructors in Data.HashMap.Internal.Strict.

You're right; this isn't a full fix.

@treeowl
Copy link
Collaborator Author

treeowl commented Dec 22, 2021

@phadej, and @madgen, could you confirm that my commit building the values eagerly is the reasonable thing? If so, I'll squash this all and merge. We should open a ticket to use SmallArray literals if those ever become available.

@treeowl treeowl force-pushed the lift branch 4 times, most recently from 76f9f33 to 8ae3071 Compare December 22, 2021 23:25
@treeowl
Copy link
Collaborator Author

treeowl commented Dec 22, 2021

Note: StandaloneDeriving is needed for the @since annotations under GHC 8.0.2. Those weren't yet allowed in deriving lists.

@madgen
Copy link

madgen commented Dec 23, 2021

@phadej, and @madgen, could you confirm that my commit building the values eagerly is the reasonable thing? If so, I'll squash this all and merge. We should open a ticket to use SmallArray literals if those ever become available.

I'll test it against my code tomorrow and let you know. Sorry, I just couldn't get my head off of work in the past two days.

@madgen
Copy link

madgen commented Dec 26, 2021

@treeowl I tried it on my project. It works like a charm I still see 100x perf benefit.

However, as with my original hacky implementation, I can only get it work using monomorphisation. I think this diff still moves the needle in the right direction. So I suggest we go with it now.

I'll produce a minimal repro and inquire about why we are losing expressivity and maybe some TTH gurus can suggest a better way.

@madgen
Copy link

madgen commented Dec 26, 2021

Also I checked the performance against containers, both the staged and unstaged versions are ~4x faster for unordered-containers than their containers counterparts.

@treeowl
Copy link
Collaborator Author

treeowl commented Dec 26, 2021

Thanks for checking, @madgen. Could you explain what you mean about monomorphisation?

@madgen
Copy link

madgen commented Dec 26, 2021

@treeowl

The function I want to stage has the type and it uses both HashSet and HashMap.

recognise :: forall a. (Eq a, Hashable a) => NFA a -> String -> Bool

So I'd love to stage it as

sRecognise :: forall a. (Eq a, Hashable a, Lift a) => NFA a -> Q (TExp (String -> Bool))

However, if I do this, then I get a type error that says a in Eq a is ambiguous and the instance cannot be resolved because of the call to HashMap.lookup. The error does not happen where sRecognise is defined but it happens where it is used and expanded by GHC.

This problem does not happen with the Lift instances for containers. That is why I hoped deriving Lift rather than handwriting the instances (at least partially) might resolve the problem.

The only way around I could find was to instead give the staged function the following signature where a is specialised to Int. This shouldn't be necessary and it is not ideal, but it works for now.

sRecognise :: NFA Int -> Q (TExp (String -> Bool))

Here's the implementation for completeness:

sRecognise :: forall a. (Eq a, Hashable a, Lift a) => NFA a -> Q (TExp (String -> Bool))
sRecognise NFA{..} = [|| \str ->
  case foldl $$(step) (Just _startState) str of
    Just st -> S.member st _acceptingStates
    Nothing -> False
  ||]
  where
  trFx = transitionFunction _moves

  step :: Q (TExp (Maybe a -> Char -> Maybe a))
  step =
    [||
      \mSt c -> do
        st <- mSt
        (st, c) `M.lookup` trFx
    ||]

@treeowl
Copy link
Collaborator Author

treeowl commented Dec 26, 2021

I'm guessing you might want ExtendedDefaultRules?

@treeowl treeowl merged commit 6910660 into haskell-unordered-containers:master Dec 26, 2021
@phadej
Copy link
Contributor

phadej commented Dec 26, 2021

That is an old (T)TH problem. The types are not preserved, and the spliced AST is re-typechecked. Sometimes you get ambiguity errors. Very easy to get them with integral types, as literals like 5 are easily ambiguous.

You can workaround that by using https://hackage.haskell.org/package/lift-type-0.1.0.1/docs/LiftType.html, though it doesn't add a helpful utility:

annotate :: Typeable a => TExpQ a -> TExpQ a
annotate ...

but you can write it with a bit of unsafeTExpCoerce.

@madgen
Copy link

madgen commented Dec 26, 2021

@treeowl ExtendedDefaultRules did the trick! Thank you.

@phadej, do you know why the containers instances provided in th-lift-instances doesn't suffer from the same problem?

Thanks for the reading material. I'll get on that now.

@phadej
Copy link
Contributor

phadej commented Dec 26, 2021

@madgen, hard to say. -ddump-splices and look how they are different.

@treeowl
Copy link
Collaborator Author

treeowl commented Dec 26, 2021

@madgen , Hashable isn't a standard class specified by the Report, so variables with a Hashable constraint won't be defaulted without ExtendedDefaultRules.

@madgen
Copy link

madgen commented Dec 26, 2021

@phadej @treeowl Both of you have been awesome. Thanks for all the help and for the Lift instances. Let me know if I can return the favour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants