192

I want to read a file line by line in PowerShell. Specifically, I want to loop through the file, store each line in a variable in the loop, and do some processing on the line.

I know the Bash equivalent:

while read line; do
    if [[ $line =~ $regex ]]; then
          # work here
    fi
done < file.txt

Not much documentation on PowerShell loops.

4
  • 2
    The selected answer from Mathias is not a great solution. Get-Content loads the entire file into memory at once, which will fail or freeze on large files. Commented Jun 7, 2019 at 18:23
  • 1
    @KolobCanyon that is completely untrue. By default Get-Content loads each line as one object in the pipeline. If you're piping to a function that doesn't specify a process block, and spits out another object per line into the pipeline, then that function is the problem. Any problems with loading the full content into memory are not the fault of Get-Content. Commented Jul 4, 2019 at 12:39
  • 1
    @TheFish foreach($line in Get-Content .\file.txt) It will load the entire file into memory before it begins iterating. If you don't believe me, go get a 1GB log file and try it. Commented Jul 5, 2019 at 15:00
  • 5
    @KolobCanyon That's not what you said. You said that Get-Content loads it all into memory which is not true. Your changed example of foreach would, yes; foreach is not pipeline aware. Get-Content .\file.txt | ForEach-Object -Process {} is pipeline aware, and will not load the entire file into memory. By default Get-Content will pass one line at a time through the pipeline. Commented Jul 8, 2019 at 10:46

5 Answers

317

Not much documentation on PowerShell loops.

Documentation on loops in PowerShell is plentiful, and you might want to check out the following help topics: about_For, about_ForEach, about_Do, about_While.

foreach($line in Get-Content .\file.txt) {
    if($line -match $regex){
        # Work here
    }
}

Another idiomatic PowerShell solution to your problem is to pipe the lines of the text file to the ForEach-Object cmdlet, which processes each line as Get-Content emits it instead of collecting the whole file first:

Get-Content .\file.txt | ForEach-Object {
    if($_ -match $regex){
        # Work here
    }
}

Instead of regex matching inside the loop, you could pipe the lines through Where-Object to filter just those you're interested in:

Get-Content .\file.txt | Where-Object {$_ -match $regex} | ForEach-Object {
    # Work here
}
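
To make the pipeline version concrete, here is a minimal, self-contained sketch. The sample file, its contents, and the pattern are made up for illustration; only Get-Content, Where-Object, and ForEach-Object come from the answer itself.

```powershell
# Hypothetical sample data: a throwaway file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-lines.txt'
Set-Content -Path $path -Value 'error: disk full', 'info: started', 'error: timeout'

# Hypothetical pattern: keep only lines that start with "error".
$regex = '^error'

# Each line flows through the pipeline one at a time; $_ is the current line.
$hits = Get-Content -Path $path |
    Where-Object { $_ -match $regex } |
    ForEach-Object { $_.ToUpper() }

$hits   # the two "error" lines, upper-cased

# Clean up the demo file.
Remove-Item -Path $path
```

Because nothing here wraps Get-Content in parentheses or assigns its output to a variable first, the lines are handed downstream as they are read.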

12 Comments

the last one is the most idiomatic for powershell, and can be even more succinctly written with gc 'file.txt' | ?{ $_ -match $regex } | %{ <#stuff#> }
Yes, but 'succinct' and 'lucid' are two different things. If you need anyone to read this script ever then I beg you - don't do this to us.
Get-Content reads the whole file into memory. For large files, this could be a bad approach.
@Jayanth Not before it starts pushing output downstream. Get-Content ... |ForEach-Object { ... } has very different performance characteristics than, say (Get-Content ...) |ForEach-Object { ... }, at least for very large files.
@Mathias, thanks for the info. Could you update your answer with this detail for the "Get-Content ... | ForEach-Object" part of the solution?
95

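Get-Content can perform poorly on large files: when it is used in a foreach statement, as in foreach($line in Get-Content ...), the whole file is read into memory before the loop starts.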

The .NET file reader ([System.IO.File]::ReadLines) reads lines one at a time.

Best Performance

foreach($line in [System.IO.File]::ReadLines("C:\path\to\file.txt"))
{
       $line
}

Or slightly less performant

[System.IO.File]::ReadLines("C:\path\to\file.txt") | ForEach-Object {
       $_
}

The foreach statement will likely be slightly faster than ForEach-Object (see comments below for more information).
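
One thing to watch when calling .NET file APIs from PowerShell: relative paths are resolved against the process working directory, which can differ from the current PowerShell location. The sketch below works around that by resolving the path with Convert-Path before calling ReadLines. The file name and contents are hypothetical and exist only for the demo.

```powershell
# Hypothetical sample file, created in the temp directory for the demo.
$dir  = [System.IO.Path]::GetTempPath()
$demo = Join-Path $dir 'readlines-demo.txt'
Set-Content -Path $demo -Value 'alpha', 'beta', 'gamma'

Push-Location $dir
try {
    # Convert-Path turns the PowerShell-relative path into a full file-system path,
    # so .NET does not resolve it against the process working directory instead.
    $fullPath = Convert-Path -LiteralPath (Join-Path '.' 'readlines-demo.txt')

    $count = 0
    foreach ($line in [System.IO.File]::ReadLines($fullPath)) {
        # ReadLines yields lines lazily, so the whole file is never held in memory.
        $count++
    }
    "Read $count lines"
}
finally {
    Pop-Location
    Remove-Item -LiteralPath $demo
}
```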

13 Comments

I would probably use [System.IO.File]::ReadLines("C:\path\to\file.txt") | ForEach-Object { ... }. The foreach statement will load the entire collection to an object. ForEach-Object uses a pipeline to stream with. Now the foreach statement will likely be slightly faster than the ForEach-Object command, but that's because loading the whole thing to memory usually is faster. Get-Content is still terrible, however.
That is a very common misconception. foreach is a statement, like if, for, or while. ForEach-Object is a command, like Get-ChildItem. There is also a default alias of foreach for ForEach-Object, but it is only used when there is a pipeline. See the long explanation in Get-Help about_Foreach, or click the link in my previous comment which goes to an entire article by Microsoft's The Scripting Guys about the differences between the statement and the command.
@BaconBits blogs.technet.microsoft.com/heyscriptingguy/2014/07/08/… Learned something new. Thanks. I assumed they were the same because Get-Alias foreach => Foreach-Object, but you are right, there are differences
That will work, but you'll want to change $line to $_ in the loop's script block.
@TheFish true, but this being a canonical question, I think people should know that using Get-Content is the devil.
14

Reading Large Files Line by Line

Original Comment (1/2021) I was able to read a 4GB log file in about 50 seconds with the following. You may be able to make it faster by loading it as a C# assembly dynamically using PowerShell.

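# $file should be a full path; OpenText returns a StreamReader
$sr = [System.IO.File]::OpenText($file)
try {
    while (-not $sr.EndOfStream) {
        $line = $sr.ReadLine()
        # Work here
    }
} finally {
    $sr.Dispose()   # release the file handle even if processing throws
}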
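
The comments on this answer note that the timing depended on filtering each line inline before calling any function. A rough sketch of that filter-first pattern follows, assuming PowerShell 5 or later for the ::new() syntax. The sample file and the search text are hypothetical.

```powershell
# Hypothetical sample log in the temp directory.
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'streamreader-demo.log'
Set-Content -Path $file -Value 'noise', 'relevant information: a', 'noise', 'relevant information: b'

$matched = [System.Collections.Generic.List[string]]::new()
$sr = [System.IO.StreamReader]::new($file)
try {
    # ReadLine returns $null at end of file, which ends the loop.
    while ($null -ne ($line = $sr.ReadLine())) {
        # Cheap inline test first; only matching lines reach the heavier work.
        if ($line.Contains('relevant information')) {
            $matched.Add($line)
        }
    }
}
finally {
    $sr.Dispose()
}

$matched.Count   # number of lines that passed the filter
Remove-Item -LiteralPath $file
```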

Addendum (3/2022) Processing the large file using C# embedded in PowerShell is even faster and has fewer "gotchas".

$code = @"
using System;
using System.IO;

namespace ProcessLargeFile
{
    public class Program
    {
        static void ProcessLine(string line)
        {
            return;
        }

        public static void ProcessLogFile(string path) {
            var start_time = DateTime.Now;
            StreamReader sr = new StreamReader(File.Open(path, FileMode.Open));
            try {
                while (!sr.EndOfStream){
                    string line = sr.ReadLine();
                    ProcessLine(line);
                }
            } finally {
                sr.Close();
            }
            var end_time = DateTime.Now;
            var run_time = end_time - start_time;
            string msg = "Completed in " + run_time.Minutes + ":" + run_time.Seconds + "." + run_time.Milliseconds;
            Console.WriteLine(msg);
        }

        static void Main(string[] args)
        {
            ProcessLogFile("c:\\users\\tasaif\\fake.log");
            Console.ReadLine();
        }
    }
}
"@
 
Add-Type -TypeDefinition $code -Language CSharp

PS C:\Users\tasaif> [ProcessLargeFile.Program]::ProcessLogFile("c:\\users\\tasaif\\fake.log")
Completed in 0:17.109

4 Comments

Tareq Saif -- 4 GB in 50 secs has not been true for me with this example. Am I missing something?
@ToC I tried it again today and I believe I filtered my dataset first before performing any function calls. For example if ($line.Contains("relevant information")){ Do something useful } If you try running a function on every line (including an empty function) it takes much longer. If you must run a function for each line and want it to run faster I would look into parallelizing the code maybe using threads.
Apparently, I can't go back to modify my comment. I tried embedding C# in the PowerShell and it doesn't suffer from that limitation. With an empty function and just reading the lines, it processed in 18 seconds. I'll add the code to my comment above.
Thank you, I'll try this and see how it plays out. Appreciate you taking the time to add more details!
10

The almighty switch works well here:

'one
two
three' > file

$regex = '^t'

switch -regex -file file { 
  $regex { "line is $_" } 
}

Output:

line is two
line is three
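
The same construct can route lines to several handlers in one pass. Below is a small sketch with more than one pattern and a default clause. The file name, contents, and patterns are hypothetical.

```powershell
# Hypothetical sample file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'switch-demo.txt'
Set-Content -Path $path -Value 'one', 'two', 'three', '42'

# switch reads the file line by line; $_ is the current line.
$result = switch -Regex -File $path {
    '^t'    { "starts with t: $_"; continue }   # continue moves on to the next line
    '^\d+$' { "number: $_"; continue }
    default { "other: $_" }                     # runs only when no pattern matched
}

$result
Remove-Item -LiteralPath $path
```

Without continue, every clause whose pattern matches would run for the same line, so it is used here to stop at the first match.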

Comments

5

Set-Location 'C:\files'
# -Filter is used here because -Include generally needs -Recurse or a wildcard path to match anything
$files = Get-ChildItem -Name -Filter *.txt
foreach ($file in $files) {
    Write-Host "Start Reading file: $file"
    foreach ($line in Get-Content $file) {
        Write-Host $line
    }
    Write-Host "End Reading file: $file"
}

