Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented CountBy (Issue 104) #207

Closed
wants to merge 15 commits into from
92 changes: 92 additions & 0 deletions MoreLinq.Test/CountByTest.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
#region License and Terms
// MoreLINQ - Extensions to LINQ to Objects
// Copyright (c) 2016 Leandro F. Vieira (leandromoh). All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#endregion

using NUnit.Framework;
using System;
using System.Collections.Generic;
using System.Linq;
using LinqEnumerable = System.Linq.Enumerable;

namespace MoreLinq.Test
{
[TestFixture]
public class CountByTest
{
[Test]
[ExpectedException(typeof(ArgumentNullException))]
public void CountByWithNullSequence()
{
IEnumerable<int> sequence = null;
sequence.CountBy(x => x % 2 == 0);
}

[Test]
[ExpectedException(typeof(ArgumentNullException))]
public void CountByWithNullProjection()
{
Func<int, bool> projection = null;
Enumerable.Range(1, 10).CountBy(projection);
}

[Test]
public void CountBySimpleTest()
{
var result = new[] { 1, 2, 3, 4, 5, 6, 1, 2, 3, 1, 1, 2 }.CountBy(c => c);

var expectations = new List<KeyValuePair<int, int>>()
{
{ 1, 4 },
{ 2, 3 },
{ 3, 2 },
{ 4, 1 },
{ 5, 1 },
{ 6, 1 },
};

result.AssertSequenceEqual(expectations);
}

[Test]
public void CountByEvenOddTest()
{
var result = Enumerable.Range(1, 100).CountBy(c => c % 2);

var expectations = new List<KeyValuePair<int, int>>()
{
{ 1, 50 },
{ 0, 50 },
};

result.AssertSequenceEqual(expectations);
}

[Test]
public void CountByWithEqualityComparer()
{
var result = new[] { "a", "B", "c", "A", "b", "A" }.CountBy(c => c, StringComparer.OrdinalIgnoreCase);

var expectations = new List<KeyValuePair<string, int>>()
{
{ "a", 3 },
{ "B", 2 },
{ "c", 1 },
};

result.AssertSequenceEqual(expectations);
}
}
}
4 changes: 4 additions & 0 deletions MoreLinq.Test/TestExtensions.cs
Original file line number Diff line number Diff line change
Expand Up @@ -73,5 +73,9 @@ internal static IEnumerable<string> GenerateSplits(this string str, params char[
yield return split;
}

internal static void Add<TKey, TValue>(this List<KeyValuePair<TKey, TValue>> list, TKey key, TValue value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to add this, might as well extend IList<KeyValuePair<TKey, TValue>> (interface) as opposed to List<KeyValuePair<TKey, TValue>> (implementation).

{
list.Add(new KeyValuePair<TKey, TValue>(key, value));
}
}
}
83 changes: 83 additions & 0 deletions MoreLinq/CountBy.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
#region License and Terms
// MoreLINQ - Extensions to LINQ to Objects
// Copyright (c) 2016 Leandro F. Vieira (leandromoh). All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#endregion

namespace MoreLinq
{
using System;
using System.Collections.Generic;
using System.Linq;

static partial class MoreEnumerable
{
/// <summary>
/// Applies a key-generating function to each element of a sequence and returns a sequence of
/// unique keys and their number of occurrences in the original sequence.
/// </summary>
/// <typeparam name="TSource">Type of the elements of the source sequence.</typeparam>
/// <typeparam name="TKey">Type of the projected element.</typeparam>
/// <param name="source">Source sequence.</param>
/// <param name="keySelector">Function that transforms each item of source sequence into a key to be compared against the others.</param>
/// <returns>A sequence of unique keys and their number of occurrences in the original sequence.</returns>
public static IEnumerable<KeyValuePair<TKey, int>> CountBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
return source.CountBy(keySelector, null);
}

/// <summary>
/// Applies a key-generating function to each element of a sequence and returns a sequence of
/// unique keys and their number of occurrences in the original sequence.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not mention how this overload is any different that the other & therefore could look odd on a page of summaries. It should mention the additional argument. Considering adding:

An additional argument specifies a comparer to use for testing equivalence of keys.

/// An additional argument specifies a comparer to use for testing equivalence of keys.
/// </summary>
/// <typeparam name="TSource">Type of the elements of the source sequence.</typeparam>
/// <typeparam name="TKey">Type of the projected element.</typeparam>
/// <param name="source">Source sequence.</param>
/// <param name="keySelector">Function that transforms each item of source sequence into a key to be compared against the others.</param>
/// <param name="comparer">The equality comparer to use to determine whether or not keys are equal.
/// If null, the default equality comparer for <c>TSource</c> is used.</param>
atifaziz marked this conversation as resolved.
Show resolved Hide resolved
/// <returns>A sequence of unique keys and their number of occurrences in the original sequence.</returns>
public static IEnumerable<KeyValuePair<TKey, int>> CountBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
{
if (source == null) throw new ArgumentNullException("source");
if (keySelector == null) throw new ArgumentNullException("keySelector");

return CountByImpl(source, keySelector, comparer);
}

private static IEnumerable<KeyValuePair<TKey, int>> CountByImpl<TSource, TKey>(IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have all the tests in place, it would be good to work in some optimisations. The current implementation does too many lookups:

  1. Key existence check: dic.ContainsKey(key)
  2. Counter increment: dic[key]++
  3. At the end when yielding back the results

Another optimisation could be is that as long as the current key is the same as the last, just increase the counter. This way, adjacent keys don't require containment tests on the dictionary.

{
var dic = new Dictionary<TKey, int>(comparer);
var keys = new List<TKey>();

foreach (var item in source)
{
var key = keySelector(item);

if (dic.ContainsKey(key))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider collapsing this if into the simpler form:

dic[key] = dic.Contains(key) ? dic[key] + 1 : 1;

{
dic[key]++;
}
else
{
dic[key] = 1;
keys.Add(key);
}
}

return keys.Select(k => new KeyValuePair<TKey, int>(k, dic[k]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test case to specifically test that keys are indeed returned in the order they were seen in the source? If we return dic directly, we should be able to see the test fail.

Copy link
Collaborator Author

@leandromoh leandromoh Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! check the CountByHasKeysOrderedLikeGroupBy test.
I replaced List by IList in the extension method too.

}
}
}