1

I know this has been asked many times, but I cannot find anything that works. I am reading a CSV file and then have to remove duplicate lines based on one of the columns, "CustomerID". Basically, the CSV file can have multiple lines with the same CustomerID.

I need to remove the duplicates.

    // DOES NOT WORK
    var finalCustomerList = csvCustomerList.Distinct().ToList();

    // I have also tried this extension method. DOES NOT WORK
    public static IEnumerable<T> RemoveDuplicates<T>(this IEnumerable<T> items)
    {
        return new HashSet<T>(items);
    }

What works for me is

  • I read the CSV file into csvCustomerList.
  • I loop through csvCustomerList and check whether the customer already exists. If it doesn't, I add it.

    var finalCustomerList = new List<Customer>();

    foreach (var csvCustomer in csvCustomerList)
    {
        var customer = new Customer();
        customer.CustomerID = csvCustomer.CustomerID;
        customer.Name = csvCustomer.Name;
        // etc.

        // Only add the customer if no entry with the same ID has been added yet.
        var exists = finalCustomerList.Exists(x => x.CustomerID == csvCustomer.CustomerID);
        if (!exists)
        {
            finalCustomerList.Add(customer);
        }
    }
    

Is there a better way of doing this?

2 Answers

4

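For Distinct to work with a non-standard equality check, your customer class needs to implement IEquatable<T>. In the Equals method, compare the customer IDs and nothing else. Override GetHashCode so that it is also based only on the customer ID, because Distinct hashes items before it compares them.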
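As a rough sketch of the IEquatable<T> approach, the class could look something like the following. The Customer class here is a hypothetical stand-in for the one in the question, and the int type of CustomerID and the sample data are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's customer class (CustomerID assumed to be an int).
public class Customer : IEquatable<Customer>
{
    public int CustomerID { get; set; }
    public string Name { get; set; }

    // Two customers count as equal when their IDs match; other fields are ignored.
    public bool Equals(Customer other)
    {
        return other != null && CustomerID == other.CustomerID;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Customer);
    }

    // Must agree with Equals: equal customers have to produce equal hash codes.
    public override int GetHashCode()
    {
        return CustomerID.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample rows; the third row repeats CustomerID 1.
        var csvCustomerList = new List<Customer>
        {
            new Customer { CustomerID = 1, Name = "Ann" },
            new Customer { CustomerID = 2, Name = "Bob" },
            new Customer { CustomerID = 1, Name = "Ann (duplicate row)" },
        };

        // Distinct() picks up the ID-based equality defined on Customer.
        var finalCustomerList = csvCustomerList.Distinct().ToList();

        Console.WriteLine(finalCustomerList.Count);
    }
}
```

Note that this changes what equality means for Customer everywhere in the program, not just in this one Distinct call, which may or may not be what you want.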
As an alternative, you can use the overload of Distinct that takes an IEqualityComparer<T> and create a class that implements that interface for customer. That way, you don't need to change the customer class.
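A minimal sketch of the comparer approach might look like this. CustomerIdComparer is a hypothetical name, and the Customer class (with an int CustomerID) is an assumed stand-in for the existing class, which stays untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed stand-in for the existing class; it is not modified by this approach.
public class Customer
{
    public int CustomerID { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer that treats customers with the same ID as equal.
public sealed class CustomerIdComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.CustomerID == y.CustomerID;
    }

    // Hash only the ID so that equal customers land in the same bucket.
    public int GetHashCode(Customer obj)
    {
        return obj.CustomerID.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample rows; the third row repeats CustomerID 1.
        var csvCustomerList = new List<Customer>
        {
            new Customer { CustomerID = 1, Name = "Ann" },
            new Customer { CustomerID = 2, Name = "Bob" },
            new Customer { CustomerID = 1, Name = "Ann (duplicate row)" },
        };

        // Pass the comparer to the Distinct overload.
        var finalCustomerList = csvCustomerList
            .Distinct(new CustomerIdComparer())
            .ToList();

        Console.WriteLine(finalCustomerList.Count);
    }
}
```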
Or you can use MoreLINQ, as suggested in another answer.

1 Comment

Thanks, that makes sense. However, I cannot modify the class.
3

For a simple solution, check out MoreLINQ by Jon Skeet and others.

It has a DistinctBy operator where you can perform a distinct operation by any field. So you could do something like:

    var finalCustomerList = csvCustomerList.DistinctBy(c => c.CustomerID).ToList();
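If adding a third-party library is not an option, a similar operator is short enough to write by hand. The sketch below is not MoreLINQ's code; DistinctByKey and EnumerableExtensions are hypothetical names, and Customer is an assumed stand-in for the class in the question. It tracks the keys seen so far in a HashSet and yields only the first item for each key. On .NET 6 and later, the built-in Enumerable.DistinctBy does the same job without any extra code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed stand-in for the existing class; it is not modified by this approach.
public class Customer
{
    public int CustomerID { get; set; }
    public string Name { get; set; }
}

public static class EnumerableExtensions
{
    // Hypothetical helper: yields the first item seen for each distinct key.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (seenKeys.Add(keySelector(item)))
            {
                yield return item;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample rows; the third row repeats CustomerID 1.
        var csvCustomerList = new List<Customer>
        {
            new Customer { CustomerID = 1, Name = "Ann" },
            new Customer { CustomerID = 2, Name = "Bob" },
            new Customer { CustomerID = 1, Name = "Ann (duplicate row)" },
        };

        var finalCustomerList = csvCustomerList
            .DistinctByKey(c => c.CustomerID)
            .ToList();

        // Keeps "Ann" and "Bob"; the later row with CustomerID 1 is dropped.
        foreach (var c in finalCustomerList)
        {
            Console.WriteLine(c.CustomerID + " " + c.Name);
        }
    }
}
```

Compared with the loop that calls Exists on the result list for every row, the HashSet lookup avoids rescanning the list each time, which matters mostly for large files.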

2 Comments

Thanks, that would work just fine. I'm not sure I am allowed to reference another third-party library, though.
@user231465 - then look at the source and pretend you wrote it yourself. code.google.com/p/morelinq/source/browse/trunk/MoreLinq/… ;-) (only joking of course...)
