-
Notifications
You must be signed in to change notification settings - Fork 416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented CountBy (Issue 104) #207
Closed
Closed
Changes from 14 commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
3588216
countBy implemented
leandromoh 987ef8c
Tests implemented
leandromoh 0526568
adjusts CountBy
leandromoh 2da65a8
adjusts CountByTest
leandromoh db337ac
added List Add() extensions
leandromoh 4fadb74
added CountByHasKeysOrderedLikeGroupBy
leandromoh 6e218c8
changed List to IList
leandromoh 9bc8c95
adjust for become lazy
leandromoh 3389416
added CountByIsLazy
leandromoh 74964a7
Optimization of lookups in CountBy
atifaziz ae13491
Clear the keys dictionary in CountBy before yielding
atifaziz 0162049
Optimize CountBy for adjacent keys
atifaziz 5e6808a
Simpler CountBy iterator locals lifetime scoping
atifaziz 1e6b47a
adjust on typeparamref
leandromoh 203c5db
updated project.json and README.md
leandromoh File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
#region License and Terms | ||
// MoreLINQ - Extensions to LINQ to Objects | ||
// Copyright (c) 2016 Leandro F. Vieira (leandromoh). All rights reserved. | ||
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); | ||
// you may not use this file except in compliance with the License. | ||
// You may obtain a copy of the License at | ||
// | ||
// http://www.apache.org/licenses/LICENSE-2.0 | ||
// | ||
// Unless required by applicable law or agreed to in writing, software | ||
// distributed under the License is distributed on an "AS IS" BASIS, | ||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
#endregion | ||
|
||
using NUnit.Framework; | ||
using System; | ||
using System.Collections.Generic; | ||
using System.Linq; | ||
using LinqEnumerable = System.Linq.Enumerable; | ||
|
||
namespace MoreLinq.Test | ||
{ | ||
[TestFixture] | ||
public class CountByTest | ||
{ | ||
[Test] | ||
[ExpectedException(typeof(ArgumentNullException))] | ||
public void CountByWithNullSequence() | ||
{ | ||
IEnumerable<int> sequence = null; | ||
sequence.CountBy(x => x % 2 == 0); | ||
} | ||
|
||
[Test] | ||
[ExpectedException(typeof(ArgumentNullException))] | ||
public void CountByWithNullProjection() | ||
{ | ||
Func<int, bool> projection = null; | ||
Enumerable.Range(1, 10).CountBy(projection); | ||
} | ||
|
||
[Test] | ||
public void CountBySimpleTest() | ||
{ | ||
var result = new[] { 1, 2, 3, 4, 5, 6, 1, 2, 3, 1, 1, 2 }.CountBy(c => c); | ||
|
||
var expectations = new List<KeyValuePair<int, int>>() | ||
{ | ||
{ 1, 4 }, | ||
{ 2, 3 }, | ||
{ 3, 2 }, | ||
{ 4, 1 }, | ||
{ 5, 1 }, | ||
{ 6, 1 }, | ||
}; | ||
|
||
result.AssertSequenceEqual(expectations); | ||
} | ||
|
||
[Test] | ||
public void CountByEvenOddTest() | ||
{ | ||
var result = Enumerable.Range(1, 100).CountBy(c => c % 2); | ||
|
||
var expectations = new List<KeyValuePair<int, int>>() | ||
{ | ||
{ 1, 50 }, | ||
{ 0, 50 }, | ||
}; | ||
|
||
result.AssertSequenceEqual(expectations); | ||
} | ||
|
||
[Test] | ||
public void CountByWithEqualityComparer() | ||
{ | ||
var result = new[] { "a", "B", "c", "A", "b", "A" }.CountBy(c => c, StringComparer.OrdinalIgnoreCase); | ||
|
||
var expectations = new List<KeyValuePair<string, int>>() | ||
{ | ||
{ "a", 3 }, | ||
{ "B", 2 }, | ||
{ "c", 1 }, | ||
}; | ||
|
||
result.AssertSequenceEqual(expectations); | ||
} | ||
|
||
[Test] | ||
public void CountByHasKeysOrderedLikeGroupBy() | ||
{ | ||
var randomSequence = MoreEnumerable.Random(0, 100).Take(100).ToArray(); | ||
|
||
var countByKeys = randomSequence.CountBy(x => x).Select(x => x.Key); | ||
var groupByKeys = randomSequence.GroupBy(x => x).Select(x => x.Key); | ||
|
||
countByKeys.AssertSequenceEqual(groupByKeys); | ||
} | ||
|
||
[Test] | ||
public void CountByIsLazy() | ||
{ | ||
new BreakingSequence<string>().CountBy(x => x.Length); | ||
} | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
#region License and Terms | ||
// MoreLINQ - Extensions to LINQ to Objects | ||
// Copyright (c) 2016 Leandro F. Vieira (leandromoh). All rights reserved. | ||
// | ||
// Licensed under the Apache License, Version 2.0 (the "License"); | ||
// you may not use this file except in compliance with the License. | ||
// You may obtain a copy of the License at | ||
// | ||
// http://www.apache.org/licenses/LICENSE-2.0 | ||
// | ||
// Unless required by applicable law or agreed to in writing, software | ||
// distributed under the License is distributed on an "AS IS" BASIS, | ||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
#endregion | ||
|
||
namespace MoreLinq | ||
{ | ||
using System; | ||
using System.Collections.Generic; | ||
using System.Linq; | ||
using System.Runtime.CompilerServices; | ||
|
||
static partial class MoreEnumerable | ||
{ | ||
/// <summary> | ||
/// Applies a key-generating function to each element of a sequence and returns a sequence of | ||
/// unique keys and their number of occurrences in the original sequence. | ||
/// </summary> | ||
/// <typeparam name="TSource">Type of the elements of the source sequence.</typeparam> | ||
/// <typeparam name="TKey">Type of the projected element.</typeparam> | ||
/// <param name="source">Source sequence.</param> | ||
/// <param name="keySelector">Function that transforms each item of source sequence into a key to be compared against the others.</param> | ||
/// <returns>A sequence of unique keys and their number of occurrences in the original sequence.</returns> | ||
public static IEnumerable<KeyValuePair<TKey, int>> CountBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector) | ||
{ | ||
return source.CountBy(keySelector, null); | ||
} | ||
|
||
/// <summary> | ||
/// Applies a key-generating function to each element of a sequence and returns a sequence of | ||
/// unique keys and their number of occurrences in the original sequence. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This does not mention how this overload is any different that the other & therefore could look odd on a page of summaries. It should mention the additional argument. Considering adding:
|
||
/// An additional argument specifies a comparer to use for testing equivalence of keys. | ||
/// </summary> | ||
/// <typeparam name="TSource">Type of the elements of the source sequence.</typeparam> | ||
/// <typeparam name="TKey">Type of the projected element.</typeparam> | ||
/// <param name="source">Source sequence.</param> | ||
/// <param name="keySelector">Function that transforms each item of source sequence into a key to be compared against the others.</param> | ||
/// <param name="comparer">The equality comparer to use to determine whether or not keys are equal. | ||
/// If null, the default equality comparer for <typeparamref name="TSource"/> is used.</param> | ||
/// <returns>A sequence of unique keys and their number of occurrences in the original sequence.</returns> | ||
public static IEnumerable<KeyValuePair<TKey, int>> CountBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer) | ||
{ | ||
if (source == null) throw new ArgumentNullException("source"); | ||
if (keySelector == null) throw new ArgumentNullException("keySelector"); | ||
|
||
return CountByImpl(source, keySelector, comparer); | ||
} | ||
|
||
static IEnumerable<KeyValuePair<TKey, int>> CountByImpl<TSource, TKey>( | ||
IEnumerable<TSource> source, Func<TSource, TKey> keySelector, | ||
IEqualityComparer<TKey> comparer) | ||
{ | ||
List<TKey> keys; | ||
List<int> counts; | ||
|
||
// Avoid the temptation to inline the CountByLoop method, which | ||
// exists solely to separate the scope & lifetimes of the locals | ||
// needed for the actual looping of the source & production of the | ||
// results (that happens once at the start of iteration) from | ||
// those needed to simply yield the results. It is harder to reason | ||
// about the lifetimes (if the code is inlined) with respect to how | ||
// the compiler will rewrite the iterator code as a state machine. | ||
// For background, see: | ||
// http://blog.stephencleary.com/2010/02/q-should-i-set-variables-to-null-to.html | ||
|
||
CountByLoop(source, keySelector, comparer ?? EqualityComparer<TKey>.Default, | ||
out keys, out counts); | ||
|
||
for (var i = 0; i < keys.Count; i++) | ||
yield return new KeyValuePair<TKey, int>(keys[i], counts[i]); | ||
} | ||
|
||
static void CountByLoop<TSource, TKey>(IEnumerable<TSource> source, | ||
Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer, | ||
out List<TKey> keys, out List<int> counts) | ||
{ | ||
var dic = new Dictionary<TKey, int>(comparer); | ||
keys = new List<TKey>(); | ||
counts = new List<int>(); | ||
var havePrevKey = false; | ||
var prevKey = default(TKey); | ||
var index = 0; | ||
|
||
foreach (var item in source) | ||
{ | ||
var key = keySelector(item); | ||
|
||
if (// key same as the previous? then re-use the index | ||
(havePrevKey && comparer.GetHashCode(prevKey) == comparer.GetHashCode(key) | ||
&& comparer.Equals(prevKey, key)) | ||
// otherwise try & find index of the key | ||
|| dic.TryGetValue(key, out index)) | ||
{ | ||
counts[index]++; | ||
} | ||
else | ||
{ | ||
dic[key] = keys.Count; | ||
keys.Add(key); | ||
counts.Add(1); | ||
} | ||
|
||
prevKey = key; | ||
havePrevKey = true; | ||
} | ||
} | ||
} | ||
} |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I change the last line of
CountByImpl
:to simply return
dic
instead:then this test (
CountByHasKeysOrderedLikeGroupBy
) doesn't fail.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In pratice, yes, it doesnt fail (because, currently, dictionary brings the results in order that keys were inserted), but as you once said yourself
So order in enumeration of dictionary can change anyday with a new implementation. To ensure CountBy always return in the order that keys were found the current last line is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What
CountByIsLazy
test should exactly do? If I understood well, the implementation isn't lazy since I must iterate over all the IEnumerable before return any KeyValuePair.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CountBy
isn't lazy but it can and should be (likeGroupBy
). The iterator doesn't need to run until someone iterates the results! You can make it lazy very easily by usingyield
to return the results. So instead of the following as the last line ofCountByImpl
:Do instead:
Now the compiler will re-write the code to run during iteration, rendering it lazy!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But It isn't the same thing since Select is lazy (do exactly the
foreach yield
) ?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how could I write a test that check if the return of CountBy is lazy (for CountByIsLazy) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See some of the existing tests that test for laziness for inspiration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes,
Select
is lazy except you compute the result imperatively before you send them back viaSelect
. That computation is happening whenCountBy
is called when it should happen when the enumerable returned by your method is enumerated by the caller (and which could be some time after).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the explanation.
CountByIsLazy
was implemented.