36

When reading lines from a text file in Python, the end-of-line character often needs to be stripped before processing the text, as in the following example:

f = open("myFile.txt", "r")
for line in f:
    line = line[:-1]
    # do something with line

Is there an elegant way or idiom for retrieving text lines without the end-line character?

6 Answers

54

The idiomatic way to do this in Python is to use rstrip('\n'):

for line in open('myfile.txt'):  # opened in text-mode; all EOLs are converted to '\n'
    line = line.rstrip('\n')
    process(line)

Each of the other alternatives has a gotcha:

  • open('...').read().splitlines() has to load the whole file into memory at once.
  • line = line[:-1] silently drops the last real character if the last line has no EOL.
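
To make the second gotcha concrete, here is a minimal sketch. It uses io.StringIO as a stand-in for a real file (an assumption made only so the snippet is self-contained), with a last line that has no trailing newline:

```python
import io

# Simulated file whose last line has no trailing newline.
fake_file = io.StringIO("alpha\nbeta\ngamma")

sliced = []
stripped = []
for line in fake_file:
    sliced.append(line[:-1])            # blindly drops the last character
    stripped.append(line.rstrip('\n'))  # drops only a trailing newline

print(sliced)    # ['alpha', 'beta', 'gamm']
print(stripped)  # ['alpha', 'beta', 'gamma']
```

The slice version loses the final "a" of the last line, while rstrip('\n') leaves it alone.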
2 Comments

HTTP and other protocols specify '\r\n' for line endings, so you should use line.rstrip('\r\n') for robustness.
Thanks for your help! I needed to open a text file and I was amazed to see that the \n - thing is even in Python as it is in Perl, C and so many other languages. I'll bookmark this and never forget it.
17

Simple. Use splitlines()

L = open("myFile.txt", "r").read().splitlines()
for line in L:
    process(line)  # this 'line' will not have a '\n' character at the end

1 Comment

But do note this loads the entire file into memory first, which may render it unsuitable for some situations.
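
Apart from memory use, there is one more difference that may matter: str.splitlines() also splits on a few less common boundaries (form feed, for example), whereas iterating over a file splits on newlines only. A small sketch, again using io.StringIO in place of a real file as an assumption for the demo:

```python
import io

text = "first\nsecond\x0cstill second\nthird"

# Approach from the answer: take the whole text, then split it.
eager = text.splitlines()

# Lazy alternative: iterate line by line and strip only '\n'.
lazy = [line.rstrip('\n') for line in io.StringIO(text)]

print(eager)  # ['first', 'second', 'still second', 'third']
print(lazy)   # ['first', 'second\x0cstill second', 'third']
```

For ordinary text files the two give the same result, so this only matters if the data can contain such control characters.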
6

What's wrong with your code? I find it to be quite elegant and simple. The only problem is that if the file doesn't end in a newline, the last line returned won't have a '\n' as the last character, and therefore doing line = line[:-1] would incorrectly strip off the last character of the line.

The most elegant way to solve this problem would be to define a generator which took the lines of the file and removed the last character from each line only if that character is a newline:

def strip_trailing_newlines(file):
    for line in file:
        if line[-1] == '\n':
            yield line[:-1]
        else:
            yield line

f = open("myFile.txt", "r")
for line in strip_trailing_newlines(f):
    pass  # do something with line

4 Comments

Mac files use '\r' and Windows uses '\r\n', so it starts to get clunky. Much better to use str.rstrip().
If the file is opened in text mode, the platform's native line endings are automatically converted to a single '\n' as they are read in. And only really old Mac OSs use plain '\r'. You can't use rstrip() if you want to retain trailing spaces and tabs.
Good idea, with the generator. Would be handy in a reusable library. I would combine your solution with efotinis' solution (to save the if:else:). Without the reusable library at hand, I would prefer efotinis' solution (using line.rstrip('\n')).
+1; that's what I use. Could you please replace your if/else with rstrip('\n')?
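
The point about text mode in the comments above can be checked with a short sketch. It assumes Python 3, where open() in text mode with the default newline=None translates '\r\n' and '\r' to '\n' on reading; the file name and contents are made up for the demo:

```python
import os
import tempfile

# Raw bytes mixing Unix, Windows and classic Mac OS line endings.
raw = b"unix\nwindows\r\nold mac\rtab kept\t\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mixed.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw)

    # Default text mode (newline=None) translates every ending to '\n'.
    with open(path, "r") as f:
        as_read = list(f)

# rstrip('\n') removes the newline but keeps trailing spaces and tabs.
cleaned = [line.rstrip('\n') for line in as_read]

print(as_read)  # ['unix\n', 'windows\n', 'old mac\n', 'tab kept\t\n']
print(cleaned)  # ['unix', 'windows', 'old mac', 'tab kept\t']
```

So with the default text mode, rstrip('\n') is enough, and the trailing tab on the last line survives.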
5

A long time ago, there was dear, clean, old BASIC code that could run on machines with 16 KB of core memory, like this:

if (not open(1,"file.txt")) error "Could not open 'file.txt' for reading"
while(not eof(1)) 
  line input #1 a$
  print a$
wend
close

Now, to read a file line by line, with far better hardware and software (Python), we must reinvent the wheel:

def line_input (file):
    for line in file:
        if line[-1] == '\n':
            yield line[:-1]
        else:
            yield line

f = open("myFile.txt", "r")
for line in line_input(f):
    pass  # do something with line

I am led to think that something has gone wrong somewhere...

1 Comment

While I agree with this comment, considering that Python is our best option for an entry-level interpreted language, it is worth noting that 16 KB BASICs with a WHILE statement were never common.
3

What do you think about this approach?

with open(filename) as data:
    datalines = (line.rstrip('\r\n') for line in data)
    for line in datalines:
        process(line)  # do something awesome

The generator expression avoids loading the whole file into memory, and the with statement ensures the file is closed.
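
If this pattern is needed in more than one place, it could be wrapped in a small generator function. The following is only a sketch: the name stripped_lines, the encoding parameter and the demo file are hypothetical, not something from the answer.

```python
import os
import tempfile

def stripped_lines(path, encoding="utf-8"):
    """Yield each line of a text file without its trailing line ending."""
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            yield line.rstrip('\r\n')

# Demo with a throwaway file whose last line has no newline.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("one\ntwo  \nthree")
    demo_lines = list(stripped_lines(demo_path))

print(demo_lines)  # ['one', 'two  ', 'three']
```

One thing to keep in mind with this design: the file stays open until the generator is exhausted or explicitly closed, so a loop that breaks out early leaves the cleanup to happen later.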

2

You may also consider using line.rstrip() to remove all trailing whitespace from the line.

2 Comments

I use rstrip() as well, but you have to keep in mind it also takes out trailing spaces and tabs
As efotinis has shown, if you specify the chars argument, you can specify what to strip. From the documentation: """rstrip([chars]) The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace."""
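
A quick sketch of how the chars argument changes the result. The sample strings are made up for illustration:

```python
raw = "value \t\r\n"

no_args = raw.rstrip()         # strips all trailing whitespace
only_lf = raw.rstrip('\n')     # strips trailing '\n' only
crlf_set = raw.rstrip('\r\n')  # strips any trailing mix of '\r' and '\n'

# chars is a set of characters, not a suffix, so repeats are removed too.
repeated = "abc\n\r\n\r".rstrip('\r\n')

print(repr(no_args))   # 'value'
print(repr(only_lf))   # 'value \t\r'
print(repr(crlf_set))  # 'value \t'
print(repr(repeated))  # 'abc'
```

Note that rstrip('\r\n') does not look for the exact sequence '\r\n'; it removes every trailing character that is either '\r' or '\n'.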
