Finding Distinct Elements In A List(T)

August 4th, 2011

LINQ provides a Distinct() method, but by default it compares elements with the type's default equality comparer, which for a plain class means reference equality. To find distinct elements in a list of some target class by value, we must either override Equals() and GetHashCode() in that class (or implement IEquatable<T>), or write a separate comparer class that implements IEqualityComparer<T> and pass it to Distinct(). That’s what Distinct() uses to decide whether one element is the same as another. Neither route is entirely straightforward, because Equals() and GetHashCode() must be kept consistent with each other. Most of the time, if all we need is to find the non-duplicated elements in a given list, the IEqualityComparer<T> route seems like overkill.
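For comparison, here is a rough sketch of what the comparer route might look like. The name CustomerFirstNameComparer and the helper DistinctByFirstName are hypothetical and only for illustration; the Customer class is repeated so the snippet stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical comparer: two customers count as equal when their first names match.
public class CustomerFirstNameComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.FirstName, y.FirstName);
    }

    public int GetHashCode(Customer obj)
    {
        // Must agree with Equals: equal first names produce equal hash codes.
        if (obj == null || obj.FirstName == null) return 0;
        return obj.FirstName.GetHashCode();
    }
}

public static class ComparerDemo
{
    // Hypothetical helper: passes the comparer to LINQ's Distinct().
    public static List<Customer> DistinctByFirstName(IEnumerable<Customer> customers)
    {
        return customers.Distinct(new CustomerFirstNameComparer()).ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { FirstName = "John", LastName = "Doe" },
            new Customer { FirstName = "Jane", LastName = "Doe" },
            new Customer { FirstName = "John", LastName = "Doe" }
        };

        foreach (var customer in DistinctByFirstName(customers))
            Console.WriteLine("{0} {1}", customer.FirstName, customer.LastName);
    }
}
```

That is a whole extra class for a single comparison rule, which is the overhead the rest of this post tries to avoid.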

Other developers have come up with some slick ways to solve this problem like creating a generic EqualityComparer<T> and using HashSet<T>, but I find both of those methods to be more complex solutions to a simpler problem.
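To give an idea of the HashSet<T> flavor of those solutions, a minimal sketch could look like the following. This is my own approximation, not any particular developer's code, and the names HashSetDemo and DistinctByKey are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Keeps the first item seen for each key; later items with the same key are skipped.
    public static List<T> DistinctByKey<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        var result = new List<T>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (seenKeys.Add(keySelector(item)))
                result.Add(item);
        }
        return result;
    }

    public static void Main()
    {
        var names = new List<string> { "John", "Jane", "John", "Jay", "Jay" };

        // Prints John, Jane, and Jay
        foreach (var name in DistinctByKey(names, n => n))
            Console.WriteLine(name);
    }
}
```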

I usually use a much simpler way to accomplish the same task which I will share with you below. Consider the following rather simple Customer class:

public class Customer
{   
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

We initialize this class with some customers as follows:

List<Customer> customers = new List<Customer>
{
    new Customer {FirstName = "John", LastName = "Doe"},
    new Customer {FirstName = "Jane", LastName = "Doe"},
    new Customer {FirstName = "John", LastName = "Doe"},
    new Customer {FirstName = "Jay", LastName = null},
    new Customer {FirstName = "Jay", LastName = "Doe"}
};

So far so good! Now all we need is a list of distinct elements by, let’s say, FirstName (find all elements that have different first names). We can do that by first grouping our list by FirstName, and then selecting only the first element out of each group as follows:

var distinctCustomers = customers.GroupBy(s => s.FirstName)
    .Select(s => s.First());

// Will print John Doe, Jane Doe, and Jay
foreach(var customer in distinctCustomers)
    Console.WriteLine("{0} {1}", customer.FirstName, customer.LastName);
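If distinctness should depend on more than one property, the same grouping idea can probably be stretched by using an anonymous type as the group key, since anonymous types compare their properties by value. The sketch below assumes we want customers that are distinct by both first and last name; CompositeKeyDemo and DistinctByFullName are illustrative names, and the Customer class is repeated so the snippet stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class CompositeKeyDemo
{
    // Groups by an anonymous type holding both names, then keeps the first of each group.
    public static List<Customer> DistinctByFullName(IEnumerable<Customer> customers)
    {
        return customers
            .GroupBy(c => new { c.FirstName, c.LastName })
            .Select(g => g.First())
            .ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { FirstName = "John", LastName = "Doe" },
            new Customer { FirstName = "Jane", LastName = "Doe" },
            new Customer { FirstName = "John", LastName = "Doe" },
            new Customer { FirstName = "Jay", LastName = null },
            new Customer { FirstName = "Jay", LastName = "Doe" }
        };

        // The second John Doe is dropped; both Jay entries stay because their last names differ.
        foreach (var customer in DistinctByFullName(customers))
            Console.WriteLine("{0} {1}", customer.FirstName, customer.LastName);
    }
}
```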

Easy enough? Now let’s say we need to find out whether or not the list actually contains any duplicates to begin with. We can do that just as easily by grouping again and then checking whether any of the groups contains more than one element, as follows:

var hasDuplicates = customers.GroupBy(s => s.FirstName)
    .Any(s => s.Count() > 1);

// Will print True, since John and Jay each appear twice
Console.WriteLine(hasDuplicates);
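A small variation on the same grouping can also report which first names are duplicated, rather than only whether any are. This is a sketch under the same assumptions as above; DuplicateDemo and DuplicatedFirstNames are hypothetical names, and the Customer class is repeated so the snippet stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class DuplicateDemo
{
    // Returns every first name that occurs more than once in the list.
    public static List<string> DuplicatedFirstNames(IEnumerable<Customer> customers)
    {
        return customers
            .GroupBy(c => c.FirstName)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { FirstName = "John", LastName = "Doe" },
            new Customer { FirstName = "Jane", LastName = "Doe" },
            new Customer { FirstName = "John", LastName = "Doe" },
            new Customer { FirstName = "Jay", LastName = null },
            new Customer { FirstName = "Jay", LastName = "Doe" }
        };

        // Prints John and Jay
        foreach (var name in DuplicatedFirstNames(customers))
            Console.WriteLine(name);
    }
}
```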

I hope that this simplifies your code.

  1. Jimmy
    August 21st, 2011 at 13:31 | #1

    Thank you, very useful!

  2. sophi
    June 8th, 2012 at 05:51 | #2

    Hi, Thanks a lot, it worked out for me
