Use PowerShell to compare two text files and remove duplicate lines

Question

I have two text files that contain many duplicate lines. I would like to run a PowerShell statement that outputs a new file containing only the lines from the second file that are NOT already in the first file. Below is an example of the two files.

File1.txt
-----------
Alpha
Bravo
Charlie


File2.txt
-----------
Alpha
Echo
Foxtrot

In this case, only Echo and Foxtrot are not in the first file. So these would be the desired results.

OutputFile.txt
------------
Echo
Foxtrot

I reviewed the link below, which is similar to what I want, but it does not write the results to an output file.

Remove lines from file1 that exist in file2 in Powershell

This script simply creates an outputfile.txt that is an exact copy of file2.txt. Do you get different results?
– JadonR, commented Dec 30, 2019 at 0:00

Glenn · Accepted Answer · 2019-12-30 15:13:10Z

3

Here's one way to do it:

# Get unique values from first file
$uniqueFile1 = (Get-Content -Path .\File1.txt) | Sort-Object -Unique

# Get lines in second file that aren't in first and save to a file
Get-Content -Path .\File2.txt | Where-Object { $uniqueFile1 -notcontains $_ } | Out-File .\OutputFile.txt
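
To try this end to end, here is a minimal sketch that recreates the sample files from the question in a scratch folder and applies the same filter. The folder name and the variables $workDir and $result are illustrative assumptions, not part of the original answer.

```powershell
# Work in a scratch folder so no real files are touched (folder name is illustrative)
$workDir = Join-Path ([System.IO.Path]::GetTempPath()) 'compare-demo'
New-Item -ItemType Directory -Path $workDir -Force | Out-Null

# Recreate the sample files from the question
Set-Content -Path (Join-Path $workDir 'File1.txt') -Value 'Alpha', 'Bravo', 'Charlie'
Set-Content -Path (Join-Path $workDir 'File2.txt') -Value 'Alpha', 'Echo', 'Foxtrot'

# Unique lines of the first file
$uniqueFile1 = Get-Content -Path (Join-Path $workDir 'File1.txt') | Sort-Object -Unique

# Keep only the lines of the second file that are not in the first, then save them
Get-Content -Path (Join-Path $workDir 'File2.txt') |
    Where-Object { $uniqueFile1 -notcontains $_ } |
    Out-File -FilePath (Join-Path $workDir 'OutputFile.txt')

# Read the result back to check it
$result = Get-Content -Path (Join-Path $workDir 'OutputFile.txt')
$result
```

With the sample data, $result should hold Echo and Foxtrot. Note that -notcontains compares strings case-insensitively; -cnotcontains is the case-sensitive variant if that distinction matters for your files.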

edited Dec 30, 2019 at 15:13

answered Dec 29, 2019 at 23:27


2 Comments

JadonR Over a year ago

Nice! That creates the results I'm looking for. I added " | Out-File .\OutputFile.txt " to the end of your script so that it would create an output file as I need. Thanks!

Glenn Over a year ago

Thanks, added that to the answer.

nabrond · Accepted Answer · 2019-12-31 00:20:43Z

2

The approach in the referenced link will work; however, for every line in the original file, it triggers another read of the second file from disk. This could be slow, depending on the size of your files. I think the following approach would meet your needs.

$file1 = Get-Content .\File1.txt
$file2 = Get-Content .\File2.txt

$compareParams = @{
    ReferenceObject = $file1
    DifferenceObject = $file2
}

Compare-Object @compareParams | 
    Where-Object -Property SideIndicator -eq '=>' |
    Select-Object -ExpandProperty InputObject |
    Out-File -FilePath .\OutputFile.txt

This code does the following:

Reads each file into a separate variable
Creates a hashtable for the parameters of Compare-Object (see about_Splatting for more information)
Compares the two files in memory, keeps only the lines that appear solely in the second file (SideIndicator '=>'), and passes those lines to Out-File
Writes the contents of the pipeline to "OutputFile.txt"
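
As a rough illustration of the comparison step, the sketch below runs Compare-Object on the sample values from the question, using in-memory arrays instead of files (an assumption made only to keep it self-contained; $diff and $onlyInFile2 are illustrative names). Values found only in the difference object are marked '=>', values found only in the reference object are marked '<=', and matching values are omitted unless -IncludeEqual is specified.

```powershell
# Sample values from the question, held in arrays rather than read from disk
$file1 = 'Alpha', 'Bravo', 'Charlie'
$file2 = 'Alpha', 'Echo', 'Foxtrot'

# Raw comparison: each result object has InputObject and SideIndicator properties
$diff = Compare-Object -ReferenceObject $file1 -DifferenceObject $file2
$diff | Format-Table -AutoSize

# Keep only the values that appear solely in the second list
$onlyInFile2 = $diff |
    Where-Object -Property SideIndicator -eq '=>' |
    Select-Object -ExpandProperty InputObject

$onlyInFile2
```

Here $onlyInFile2 should contain Echo and Foxtrot, while Bravo and Charlie carry the '<=' indicator and are filtered out.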

If you are comfortable with the overall flow of this, and are only using this in one-off situations, the whole thing can be compressed into a one-liner.

(Compare-Object (gc .\File1.txt) (gc .\File2.txt) | ? SideIndicator -eq '=>').InputObject | Out-File .\OutputFile.txt

edited Dec 31, 2019 at 0:20

answered Dec 29, 2019 at 19:19


2 Comments

JadonR Over a year ago

Thanks for the quick reply. I tried using your script, however it returned the opposite of what I was really looking for. So using my example above, it created an OutputFile.txt with only the line "Alpha". So it seems to be grabbing everything that is the same, rather than only the unique values from the second text file.

nabrond Over a year ago

Apologies, I misread your intent as gathering what was the same, not just the differences in the second file. I updated my code to better suit your use case.
