35

I only need to read the first line of a huge file and change it.

Is there a trick to change only the first line of a file and save the result as another file using Python? All my code is in Python, so a Python solution would help me keep things consistent.

The idea is to not have to read and then write the whole file.

7
  • Is the new line going to be exactly the same length as the old one? Commented Feb 18, 2013 at 23:32
  • Can you have the first line as a variable, then change it based on an argument from another file? Commented Feb 18, 2013 at 23:32
  • @EmilVikström no the new line would be of different length. Commented Feb 18, 2013 at 23:36
  • @ChristopherMarshall i think i could Commented Feb 18, 2013 at 23:37
  • 2
    I suppose you know this isn't a Python limitation but rather that of file system operations. There were line-oriented filetypes in the dark, remote past which are now blissfully dead. Commented Feb 18, 2013 at 23:42

7 Answers

39

shutil.copyfileobj() should be much faster than running line-by-line. Note from the docs:

Note that if the current file position of the [from_file] object is not 0, only the contents from the current file position to the end of the file will be copied.

Thus:

from_file.readline() # and discard
to_file.write(replacement_line)
shutil.copyfileobj(from_file, to_file)
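
For completeness, here is a minimal self-contained sketch of the same idea. The function name `replace_first_line` and its parameters are made up for illustration, and it assumes the replacement line should be encoded as UTF-8 and terminated with `\n`. Opening both files in binary mode is one reasonable choice, since the rest of the file is then copied byte for byte without newline translation or decoding.

```python
import shutil


def replace_first_line(src_path, dst_path, replacement_line):
    """Copy src_path to dst_path, swapping in a new first line.

    Hypothetical helper: the name, the UTF-8 encoding and the '\n'
    terminator are assumptions, not part of the original answer.
    """
    with open(src_path, "rb") as from_file, open(dst_path, "wb") as to_file:
        from_file.readline()  # read and discard the old first line
        to_file.write(replacement_line.encode("utf-8") + b"\n")
        # Copies from the current file position to EOF in fixed-size chunks,
        # so the whole file is never held in memory at once.
        shutil.copyfileobj(from_file, to_file)
```

The rest of the file still has to be read and written once, but the copy happens in large chunks rather than line by line.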

1 Comment

what mode should i use when opening to_file? just w?
5

If you want to modify the top line of a file and save it under a new file name, it is not possible to simply modify the first line without iterating over the entire file. On the bright side, as long as you are not printing to the terminal, modifying the first line of a file is VERY, VERY fast even on very large files.

Assuming you are working with text-based files (not binary), this should fit your needs and perform well enough for most applications.

source_fp = open('source-filename', 'r')
target_fp = open('target-filename', 'w')
first_row = True
for row in source_fp:
    if first_row:
        # Each row read from the file already ends with its own newline,
        # so only the replacement needs one; text mode translates '\n' per OS.
        row = 'the first row now says this.\n'
        first_row = False
    target_fp.write(row)
source_fp.close()
target_fp.close()

4 Comments

thanks! this is indeed an alternative solution quite slow for a huge file but from other answers seems this may be the only way...
'\r\n' is almost never correct in any context. If you want to read/write text files with python use 'rt' or 'wt' mode.
i did some quick digging and unfortunately i have not been able to derive anything better.. if i do find something, i'll be sure to update!
@msw: wt and rt appear to be Python 3-specific implementations. generally, it's best to write code which is python 2.5+ and 3 compatible if at all possible. Python 2.7 Docs: docs.python.org/2.7/library/functions.html#open Python 3.3 Docs: docs.python.org/3.3/library/functions.html#open
4

An alternate solution that does not require iterating over the lines that are not of interest.

def replace_first_line(src_filename, target_filename, replacement_line):
    # Note: f.read() loads the rest of the file into memory in one go.
    with open(src_filename) as f:
        f.readline()  # discard the old first line
        remainder = f.read()
    with open(target_filename, "w") as t:
        t.write(replacement_line + "\n")
        t.write(remainder)

1 Comment

Doesn't this require reading the entire file into memory? Such an operation will not only iterate over the file in the background, but also be impossible on very large files.
2

Unless the new line is the same length as the old line, you cannot do this in place. If it is, you could solve this problem with mmap.
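
A rough sketch of the same-length case with `mmap`, for illustration only. The function name `overwrite_first_line_in_place` is made up, and it assumes a non-empty file with `\n` line endings and a replacement passed as bytes without a trailing newline. Note that this modifies the original file rather than writing a new one.

```python
import mmap


def overwrite_first_line_in_place(path, new_line):
    """Overwrite the first line of a file in place.

    Hypothetical helper: new_line is bytes (no trailing newline) and must
    be exactly as long as the existing first line. Assumes '\n' endings.
    """
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0) as mm:
            end = mm.find(b"\n")
            if end == -1:  # file has a single line with no newline
                end = len(mm)
            if len(new_line) != end:
                raise ValueError("replacement must be the same length as the old line")
            mm[:end] = new_line  # only these bytes are touched
            mm.flush()
```

Only the bytes of the first line are written, which is why the lengths have to match: there is no way to shift the rest of the file without rewriting it.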

1 Comment

thanks! i'd have thought this was easily done but i think you are right, good answer. In my case though the new line is of different length to the old line.
2

The sh module worked for me:

import sh

first = "new string"
sh.sed("-i", "1s/.*/" + first + "/", "file.x")
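
One caveat with building the sed expression by string concatenation: if the new text contains `/`, `&` or a backslash, sed will misread the replacement. A small sketch of an escaping helper follows; the name `escape_sed_replacement` is made up for illustration, and it assumes `/` is the delimiter, as in the command above.

```python
import re


def escape_sed_replacement(text):
    """Escape text for use as the replacement part of sed's s/.../.../.

    Hypothetical helper: assumes '/' is the delimiter. Backslash, '/'
    and '&' (which stands for the whole match) get a leading backslash.
    """
    if "\n" in text:
        raise ValueError("a first line cannot contain a newline")
    return re.sub(r"([\\/&])", r"\\\1", text)


first = "path/to/file & more"
expression = "1s/.*/" + escape_sed_replacement(first) + "/"
```

The resulting `expression` can then be passed to sed in place of the concatenated string.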

1 Comment

sh.sed("-i.bak", "1s/.*/" + first + "/", "file.x") works for me. BTW the shutil and fileinput based solutions both produce unexpected results for me.
1

The solution I would use is to first create a file that is missing the old first line

from_file.readline() # and discard
shutil.copyfileobj(from_file, tail_file)

then create a file with the new first line

then use the following to concatenate the newfirstline file and tail_file

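# wfd is the output file; 'output.txt' is a placeholder name
with open('output.txt', 'wb') as wfd:
    for f in ['newfirstline.txt', 'tail_file.txt']:
        with open(f, 'rb') as fd:
            shutil.copyfileobj(fd, wfd, 1024*1024*10)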


0

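Here is a working example of Nacho's answer:

import subprocess

# Placeholder text; '/', '&' and '\' in it would need escaping for sed.
new_line = 'new first line'
# Note: '-i' with no suffix is GNU sed syntax; BSD/macOS sed needs "-i ''".
cmd = ['sed', '-i', '-e', '1,1s/.*/' + new_line + '/g', 'filename.txt']

subprocess.call(cmd)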
