220

I have two lists List that I need to combine in third list and remove duplicate values from that lists

A bit hard to explain, so let me show an example of what the code looks like and what I want as a result, in sample I use int type not ResultAnalysisFileSql class.

first_list = [1, 12, 12, 5]

second_list = [12, 5, 7, 9, 1]

The result of combining the two lists should result in this list: resulting_list = [1, 12, 5, 7, 9]

You'll notice that the result has the first list, including its two "12" values, and in second_list has an additional 12, 1 and 5 value.

ResultAnalysisFileSql class

[Serializable]
    public partial class ResultAnalysisFileSql
    {
        public string FileSql { get; set; }

        public string PathFileSql { get; set; }

        public List<ErrorAnalysisSql> Errors { get; set; }

        public List<WarningAnalysisSql> Warnings{ get; set; }

        public ResultAnalysisFileSql()
        {

        }

        public ResultAnalysisFileSql(string fileSql)
        {
            if (string.IsNullOrEmpty(fileSql)
                || fileSql.Trim().Length == 0)
            {
                throw new ArgumentNullException("fileSql", "fileSql is null");
            }

            if (!fileSql.EndsWith(Utility.ExtensionFicherosErrorYWarning))
            {
                throw new ArgumentOutOfRangeException("fileSql", "Ruta de fichero Sql no tiene extensión " + Utility.ExtensionFicherosErrorYWarning);
            }

            PathFileSql = fileSql;
            FileSql = ObtenerNombreFicheroSql(fileSql);
            Errors = new List<ErrorAnalysisSql>();
            Warnings= new List<WarningAnalysisSql>();
        }

        private string ObtenerNombreFicheroSql(string fileSql)
        {
            var f = Path.GetFileName(fileSql);
            return f.Substring(0, f.IndexOf(Utility.ExtensionFicherosErrorYWarning));
        }


        public override bool Equals(object obj)
        {
            if (obj == null)
                return false;
            if (!(obj is ResultAnalysisFileSql))
                return false;

            var t = obj as ResultAnalysisFileSql;
            return t.FileSql== this.FileSql
                && t.PathFileSql == this.PathFileSql
                && t.Errors.Count == this.Errors.Count
                && t.Warnings.Count == this.Warnings.Count;
        }


    }

Any sample code for combine and removing duplicates ?

0

6 Answers 6

423

Have you had a look at Enumerable.Union

This method excludes duplicates from the return set. This is different behavior to the Concat method, which returns all the elements in the input sequences including duplicates.

List<int> list1 = new List<int> { 1, 12, 12, 5};
List<int> list2 = new List<int> { 12, 5, 7, 9, 1 };
List<int> ulist = list1.Union(list2).ToList();

// ulist output : 1, 12, 5, 7, 9
Sign up to request clarification or add additional context in comments.

2 Comments

@Dr TJ: Do your person Class implements IEqualityComparer<T>? If so, you'll need to check your GetHashCode and Equals methods. See the Remarks section of msdn.microsoft.com/en-us/library/bb341731.aspx.
Important to note because I ran into issues using this on 2 different collections: "You can't union two different types, unless one inherits from the other" from stackoverflow.com/a/6884940/410937 which yielded a cannot be inferred from the usage error.
38

why not simply eg

var newList = list1.Union(list2)/*.Distinct()*//*.ToList()*/;

oh ... according to the documentation you can leave out the .Distinct()

This method excludes duplicates from the return set

Comments

34

Union has not good performance : this article describe about compare them with together

var dict = list2.ToDictionary(p => p.Number);
foreach (var person in list1)
{
        dict[person.Number] = person;
}
var merged = dict.Values.ToList();

Lists and LINQ merge: 4820ms
Dictionary merge: 16ms
HashSet and IEqualityComparer: 20ms
LINQ Union and IEqualityComparer: 24ms

4 Comments

Also another benefit of using a dictionary merge -> I have two list coming back from DB data. And my data has a timestamp field, which is different in the two lists of data. With the union I get duplicates due to the timestamp being different. But with the merge I can decide which unique field I want consider in the dictionary. +1
Can vary by processor speed, depends what kind of CPU you have.
And at the end of the article it says, "I prefer LINQ Union because it communicates intent very clearly." ;) (also, there was only a 8 ms difference)
For small lists where the difference is negligible, Union results in cleaner and more readable code. Spending time to hyper-optimise code when it's not slow may incur a maintenance penalty down the road.
18

Use Linq's Union:

using System.Linq;
var l1 = new List<int>() { 1,2,3,4,5 };
var l2 = new List<int>() { 3,5,6,7,8 };
var l3 = l1.Union(l2).ToList();

Comments

13
    List<int> first_list = new List<int>() {
        1,
        12,
        12,
        5
    };

    List<int> second_list = new List<int>() {
        12,
        5,
        7,
        9,
        1
    };

    var result = first_list.Union(second_list);

Comments

0

To elaborate on fateme maddahi's answer.

If list2 contains two or more items with duplicate Number properties you will get an exception with ToDictionary:

[System.ArgumentException: An item with the same key has already been added.]

To avoid this, use this instead:

var dict = list2
    .GroupBy(p => p.Number, StringComparer.OrdinalIgnoreCase)
    .ToDictionary(g => g.Key, g => g.First(), StringComparer.OrdinalIgnoreCase);

foreach (var person in list1)
{
        dict[person.Number] = person;
}
var merged = dict.Values.ToList();

Source

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.