5

I have the following problem. Given a string array:

string[] names = { "John", "Sam", "Harry", "Sam", "John" };

I need to find the most common elements in the array. I tried using:

string MostCommon = names.GroupBy(v => v)
    .OrderByDescending(g => g.Count())
    .First()
    .Key;

Unfortunately it only finds one element, e.g. MostCommon = John, but in this case I need not only John but Sam too, since both occur twice. How can I do that? Maybe LINQ is not necessary in this case?

3
  • 1
    .First() is your problem. Take off the .First(), and you'll have more than one result, but you won't know what the specific counts are with that single LINQ statement. Commented Sep 20, 2016 at 15:47
  • Possible duplicate of Return max repeated item in list Commented Sep 20, 2016 at 16:14
  • 3
    No, that duplicate does exactly what the OP wants to avoid: it only selects the very first max. It doesn't handle the OP's case, where more than one member may have the max count. Commented Sep 20, 2016 at 16:15

4 Answers

12

First will only select the very first element of your sequence. However, you need all groups that share the highest count. So select the name and count per group and order them afterwards. Finally, select all groups having the same count as the very first group.

var groups = names.GroupBy(x => x)
    .Select(x => new { x.Key, Count = x.Count() })
    .OrderByDescending(x => x.Count);
int max = groups.First().Count;
var mostCommons = groups.Where(x => x.Count == max);

EDIT: You could also use TakeWhile instead of Where in the last statement. Because the groups are ordered by descending count, this avoids unnecessary comparisons for the remaining elements and stops as soon as it reaches the first group with fewer elements than the first one:

var mostCommons = groups.TakeWhile(x => x.Count == max);
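As a rough sketch of the point raised in the comments about repeated execution: `groups` above is a deferred query, so each enumeration re-runs the grouping and sorting. One possible way around that, assuming it is acceptable to hold the grouped results in memory, is to materialize them once with ToList. The class and method names here (`MostCommonSketch`, `MostCommonOrdered`) are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MostCommonSketch
{
    // Hypothetical helper: returns every value tied for the highest count.
    public static List<string> MostCommonOrdered(IEnumerable<string> names)
    {
        // ToList() runs the grouping and sorting once and caches the result.
        var groups = names.GroupBy(x => x)
            .Select(g => new { g.Key, Count = g.Count() })
            .OrderByDescending(g => g.Count)
            .ToList();

        // An empty input has no "most common" element.
        if (groups.Count == 0)
            return new List<string>();

        // The list is sorted by descending count, so the first entry holds the maximum.
        int max = groups[0].Count;

        // TakeWhile stops at the first group whose count drops below the maximum.
        return groups.TakeWhile(g => g.Count == max)
            .Select(g => g.Key)
            .ToList();
    }
}
```

Called with the array from the question, `MostCommonSketch.MostCommonOrdered(names)` should yield John and Sam.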

5 Comments

This is a highly inefficient approach that executes the groups query (involving grouping and ordering) many times. At least put groups.First().Count into a variable outside the last query, so that the groups query executes "only" twice. It will still be worse than the Amit Hasan approach (not counting possible non-LINQ solutions), but at least it can be considered not so bad.
That's a very good point about the advantage of TakeWhile over Where. For the benefit of future readers I think you should consider adding a snippet using TakeWhile and possibly even remove the Where code entirely because TakeWhile is the better choice.
@BACON Done so.
@IvanStoev Done as suggested. However, I doubt there is an approach that avoids executing the group statement twice; even in your cited answer the nameGroup is iterated twice. We could force immediate evaluation using ToList, but so far both solutions should perform similarly. I tested this against a non-LINQ approach, which also needs two iterations of the original sequence.
@HimBromBeere Correct. The point was to not execute it N times :) The approach with Max + Where is better because it's O(N) even with 2 passes. While OrderByDescending + First is O(N * lg(N)) as you know. GroupBy operation is O(N) in both cases, so I'm not counting it.
5

This could be done as follows:

 var nameGroup = names.GroupBy(x => x);
 var maxCount = nameGroup.Max(g => g.Count());
 var mostCommons = nameGroup.Where(x => x.Count() == maxCount).Select(x => x.Key).ToArray();
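One caveat with the snippet above: `Max` over an empty sequence of `int` throws an `InvalidOperationException`, so an empty `names` array would fail. The following is a minimal sketch of the same Max + Where idea with a guard for that case. It also counts each group only once by caching the counts. The names `MaxWhereSketch` and `MostCommon` are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaxWhereSketch
{
    // Hypothetical helper: Max + Where, with a guard for empty input.
    public static string[] MostCommon(IEnumerable<string> names)
    {
        // Count each group once and cache the (name, count) pairs.
        var nameCounts = names.GroupBy(x => x)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .ToList();

        // Max would throw on an empty sequence, so return early.
        if (nameCounts.Count == 0)
            return new string[0];

        int maxCount = nameCounts.Max(p => p.Value);

        // Keep every name whose count equals the maximum.
        return nameCounts.Where(p => p.Value == maxCount)
            .Select(p => p.Key)
            .ToArray();
    }
}
```

This keeps the two linear passes over the groups described in the comments on the previous answer and avoids sorting altogether.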


4

Combine your first LINQ query with another, similar LINQ query based on the count of the most common name you found.

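// First() returns a group, so .Key is needed to get the name as a string.
string MostCommon = names.GroupBy(v => v)
    .OrderByDescending(g => g.Count())
    .First()
    .Key;

// Number of times the most common name occurs.
int count = names.Count(x => x == MostCommon);

// Every group with that same count; project each group to its name.
var mostCommonList = names.GroupBy(v => v)
    .Where(g => g.Count() == count)
    .Select(g => g.Key);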

2 Comments

This does not compile due to the non-existent .Key property on the very last line (plus the additional = in assigning mostCommonList). It cannot be assumed that there is only one "most common name", so the result needs to be IEnumerable<string> and not string. Therefore, .Key needs to be removed (or, less usefully, replaced with .Select(v => v)).
@BACON too many bad copy/pastas, thanks for calling me out. I fixed the = and removed both .Key statements. I think the answer should be fine without the .Select() as the .Where will return an enumerable.
0
// With a Dictionary
// This is more useful if you are preparing for interviews at big companies; otherwise use the
// LINQ option, which is short and handy.

public static int MaxOccurrenceOfWord(string[] words)
{
    var counts = new Dictionary<string, int>();
    int occurrences = 0;
    foreach (var word in words)
    {
        int count;
        // TryGetValue leaves count at 0 when the word has not been seen yet.
        counts.TryGetValue(word, out count);
        count++;
        // The indexer adds the entry, or replaces it if it exists;
        // no need to use 'Contains'.
        counts[word] = count;
    }

    // Note: this keeps only the first word that reaches the highest count.
    string mostCommonWord = null;
    foreach (var pair in counts)
    {
        if (pair.Value > occurrences)
        {
            occurrences = pair.Value;
            mostCommonWord = pair.Key;
        }
    }
    Console.WriteLine("The most common word is {0} and it appears {1} times",
        mostCommonWord, occurrences);

    return occurrences;
}
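The method above reports a single word, while the question asks for every word tied for the highest count. A possible variation on the same dictionary approach, sketched here under the assumption that the order of the returned words does not matter, tracks the maximum while counting and then collects all words that reach it. `WordCounter` and `MostCommonWords` are hypothetical names, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

public static class WordCounter
{
    // Hypothetical helper: returns all words tied for the highest count.
    public static List<string> MostCommonWords(string[] words)
    {
        var counts = new Dictionary<string, int>();
        int max = 0;

        // First pass: count each word and track the highest count seen so far.
        foreach (var word in words)
        {
            counts.TryGetValue(word, out int count); // count is 0 if the word is new
            count++;
            counts[word] = count;
            if (count > max)
                max = count;
        }

        // Second pass: collect every word whose count equals the maximum.
        var result = new List<string>();
        foreach (var pair in counts)
        {
            if (pair.Value == max)
                result.Add(pair.Key);
        }
        return result;
    }
}
```

The enumeration order of a `Dictionary` is not guaranteed, so callers that need a stable order should sort the result.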
