Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

haskell/nucleotide-count: Add mentoring notes #654

Merged
merged 7 commits into from
Jan 4, 2019
Merged

haskell/nucleotide-count: Add mentoring notes #654

merged 7 commits into from
Jan 4, 2019

Conversation

sshine
Copy link
Contributor

@sshine sshine commented Dec 7, 2018

No description provided.

frerich
frerich previously approved these changes Dec 8, 2018
Copy link
Contributor

@frerich frerich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Another solution I'd like to propose is based on Map.fromListWith (+):

nucleotideCounts :: String -> Either String (Map Char Int)
nucleotideCounts xs
    | all (`elem` "ACGT") xs = Right (counts `Map.union` defaults)
    | otherwise              = Left xs
  where
    counts   = Map.fromListWith (+) (zip xs [1,1..])
    defaults = Map.fromList (zip "ACGT" [0,0..])

The summing part is hididen in Map.fromListWith (+) here. Since it's possible that a certain nucleotide does not occur even once, we define a map of 'default' counts (i.e. all zero) which is then used to form the final map -- the order of arguments to Map.union matters here!

It could be argued that doing two passes over the input list (one by the all call in the guard, the other by evaluating the counts definition) is inefficient, but I like it for being quite clear about when this function yields a Right value and when Left.

@sshine
Copy link
Contributor Author

sshine commented Dec 10, 2018

Very nice, @frerich. Thanks! Since exercism/haskell#714 we're using a custom data type

data Nucleotide = A | C | G | T deriving (Eq, Ord, Show)

I'll try and adapt your solution to this data type and add this.

@frerich
Copy link
Contributor

frerich commented Dec 10, 2018

I imagine replacing "ACGT" with [A,C,G,T] (or maybe deriving Enum and then using [A .. T] if you fancy) ought to be enough. :-)

@sshine
Copy link
Contributor Author

sshine commented Dec 10, 2018

@frerich: I like the use of M.fromListWith (+), so I merged it with my previous favorite solution and added the use of readEither and <> for error conversion, so it now relies entirely on the Read instance (and there are no separate validation functions that can desync from the data type). I also added Enum and Bounded so that the zero map can be generated as

M.fromListWith (+) [minBound..] [0,0..]

@sshine
Copy link
Contributor Author

sshine commented Dec 10, 2018

OK, I think I've golfed this enough.

I had another idea of

{-# LANGUAGE TypeApplications #-}

isNucleotide :: Char -> Bool
isNucleotide = isJust . readMaybe @Nucleotide . return

but I don't think there are any of the current reasonable solutions that benefit from this, since the ones that use isNucleotide subsequently use read, so they might as well have used readMaybe or readEither to begin with.

@sshine
Copy link
Contributor Author

sshine commented Dec 14, 2018

@frerich: Could you re-approve my changes?

Copy link
Contributor

@frerich frerich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, except for one single sentence which seems somewhat dubious to me.

This solution also handles errors in an initial guard.

Because GHC can fuse list combinators, four separate `length . filter (== c)`
do not necessarily perform bad. Zeroes are not a special case here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence seems a little fishy to me -- four separate length . filter (== c) calls "do not necessarily" perform bad?

It's true that GHC can fuse list combinators, but to the best of my knowledge this only works if you compose certain functions (at least catamorphisms). I'm not sure this claim is accurate when applied to four independent invocations of count.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to run a benchmark then.

Sounds like you're right.

I recall this one not running nearly like four times for large input.

sshine added a commit to exercism/haskell that referenced this pull request Dec 17, 2018
Nucleotide Count has several reasonable implementations and learning
goals. Mentor notes are made available in exercism/website-copy#654 and
the discussion to replace some core track exercises is advanced in #761.

Until either its difficulty has been adjusted (#793) or another order
has been picked (#790), continue to order config.json by difficulty.
@rpottsoh rpottsoh merged commit 9147b81 into exercism:master Jan 4, 2019
@sshine
Copy link
Contributor Author

sshine commented Jan 5, 2019

This was still in my queue of things to address once I got home from vacation (yesterday), but you've only bumped it up the queue, since the mentor notes are now somehow incorrect. :-D

I'll fix this Monday.

PTrottier pushed a commit to PTrottier/website-copy that referenced this pull request Jan 25, 2019
* haskell/nucleotide-count: Add mentoring notes

* haskell/nucleotide-count: Add fourth reasonable solution with M.fromListWith (+)

* haskell/nucleotide-count: Merge two reasonable solutions into best-of-breed

* haskell/nucleotide-count: Clarify best-of-breed example

* haskell/nucleotide-count: Reorder solutions, add more comments.

* haskell/nucleotide-count: Add foldM-based solution with separated error handling

* haskell/nucleotide-count: Add missing imports to second foldM solution
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants