
I want to skip the first 17 lines while reading a text file.

Let's say the file looks like:

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
good stuff

I just want the good stuff. What I'm doing is a lot more complicated, but this is the part I'm having trouble with.

9 Answers

Use a slice, like below:

with open('yourfile.txt') as f:
    lines_after_17 = f.readlines()[17:]

If the file is too big to load in memory:

with open('yourfile.txt') as f:
    for _ in range(17):
        next(f)
    for line in f:
        pass  # do stuff with the line

5 Comments

I used the second solution to read ten lines at the end of a file with 8 million (8e6) lines, and it takes ~22 seconds. Is this still the preferred (= fastest) way for such long files (~250 MB)?
I would use tail for that.
@wim: I guess tail doesn't work on Windows. Furthermore, I don't always want to read the last 10 lines; I want to be able to read some lines in the middle. (E.g. if I read 10 lines after ~4e6 lines in the same file, it still takes about half that time, ~11 seconds.)
The thing is, you need to read the entire content before line number ~4e6 in order to know where the line separator bytes are located, otherwise you don't know how many lines you've passed. There's no way to magically jump to a line number. ~250 MB should be OK to read entire file to memory though, that's not particularly big data.
@riddleculous see stackoverflow.com/q/3346430/2491761 for getting last lines
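As a sketch of the seek-based idea behind tail: read fixed-size blocks backwards from the end of the file until enough newlines have been seen, so the cost depends on the tail size rather than on the millions of preceding lines. This is not from any answer here; `tail_lines` and the chunk size are made-up names for illustration:

```python
import os

def tail_lines(path, n=10, chunk=4096):
    """Return the last n lines of a file without reading it from the front."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        data = b""
        # Read backwards in blocks until we have seen more than n newlines
        # (or we reach the start of the file).
        while end > 0 and data.count(b"\n") <= n:
            start = max(0, end - chunk)
            f.seek(start)
            data = f.read(end - start) + data
            end = start
        return [line.decode() for line in data.splitlines()[-n:]]
```

Note this only works for reading from the end; reading lines from the middle still requires counting newlines from the front, as the comment above explains.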

Use itertools.islice, starting at index 17. It will automatically skip the first 17 lines.

import itertools
with open('file.txt') as f:
    for line in itertools.islice(f, 17, None):  # start=17, stop=None
        pass  # process the line

2 Comments

Is this feasible for large text files that may not fit in the memory? That is, does itertools.islice load the entire file into the memory? I couldn't find this in the documentation.
@AdityaHarikrish - all functions within itertools return iterators, which only consume memory as the object is read - i.e. the whole file is not read into memory, only one line at a time. For the example provided, the only memory that will be allocated is the data required to read in the line content. Saving that line content is another matter entirely.
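A quick way to convince yourself of this: feed islice a generator that would be far too big to ever materialize; islice only pulls items as they are needed.

```python
from itertools import islice

def numbers():
    # A stand-in "file" with a billion lines; never fully consumed.
    for i in range(10**9):
        yield i

# Lazily skip the first 17 items, then take three.
first_three_after_17 = list(islice(numbers(), 17, 20))
print(first_three_after_17)  # [17, 18, 19]
```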
for line in itertools.dropwhile(isBadLine, lines):
    pass  # process as you see fit

Full demo:

from itertools import dropwhile

def isBadLine(line):
    # strip() so the trailing newline doesn't defeat the comparison
    return line.strip() == '0'

with open(...) as f:
    for line in dropwhile(isBadLine, f):
        pass  # process as you see fit

Advantages: This is easily extensible to cases where your prefix lines are more complicated than "0" (but not interdependent).
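For example, a predicate for a hypothetical header format of blank lines and '#' comments (the `is_header` name and the sample data are made up for illustration):

```python
from itertools import dropwhile
import io

def is_header(line):
    # Hypothetical prefix format: blank lines or '#' comment lines.
    s = line.strip()
    return not s or s.startswith("#")

# io.StringIO stands in for an open file handle.
sample = io.StringIO("# header\n\n# more header\ngood stuff\n# kept: dropwhile already stopped\n")
body = list(dropwhile(is_header, sample))
# Once one line fails the predicate, everything after it is kept,
# even lines that would have matched the predicate.
```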

1 Comment

Nice idea. Keeps it clean.

If you don't want to read the whole file into memory at once, you can use a few tricks:

With next(iterator) you can advance to the next line:

with open("filename.txt") as f:
    next(f)
    next(f)
    next(f)
    for line in f:
        print(line)

Of course, this is slightly ugly, so itertools has a better way of doing this:

from itertools import islice

with open("filename.txt") as f:
    # start at line 17 and never stop (None), until the end
    for line in islice(f, 17, None):
        print(line)

Comments


Here are the timeit results for the top 2 answers. Note that "file.txt" is a text file containing 100,000+ lines of random strings with a file size of 1 MB+.

Using itertools:

from timeit import timeit

# the statement string runs in its own namespace,
# so itertools must be imported in setup
timeit("""with open("file.txt", "r") as fo:
    for line in itertools.islice(fo, 90000, None):
        line.strip()""", setup="import itertools", number=100)

>>> 1.604976346003241

Using two for loops:

from timeit import timeit

timeit("""with open("file.txt", "r") as fo:
    for i in range(90000):
        next(fo)
    for j in fo:
        j.strip()""", number=100)

>>> 2.427317383000627

Clearly, the itertools method is more efficient when dealing with large files.

Comments


This solution helped me skip the number of lines specified by the linetostart variable. You also get the index (int) and the line (string) if you want to keep track of those. In your case, substitute linetostart with 18, or assign 18 to the linetostart variable.

linetostart = 18
with open("file.txt") as f:
    for i, line in enumerate(f, linetostart):
        pass  # your code

1 Comment

This won’t actually skip lines, it will just offset the enumerate counter.
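A sketch of a variant that does skip, while keeping a real 1-based line number (io.StringIO stands in for the opened file; `linetostart` follows the answer's naming; the `if` guard is the fix):

```python
import io

linetostart = 18  # first line to keep, 1-based

# Stand-in for open("file.txt"): 24 numbered lines.
f = io.StringIO("".join(f"line {n}\n" for n in range(1, 25)))

kept = []
for i, line in enumerate(f, start=1):
    if i < linetostart:
        continue  # actually skip, instead of just offsetting the counter
    kept.append((i, line))  # i is the true line number
```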

You can use a list comprehension to make it a one-liner:

[f.readline() for i in range(17)]

More about list comprehension in PEP 202 and in the Python documentation.

4 Comments

doesn't make much sense to store those lines in a list which will just get garbage collected.
@wim: The memory overhead is trivial (and probably unavoidable no matter which way you do it, since you will need to do O(n) processing of those lines unless you skip to an arbitrary point in the file); I just don't think it's very readable.
I agree with @wim, if you are throwing away the result, use a loop. The whole point of a list comprehension is that you meant to store the list; you can just as easily fit a for loop on one line.
or use a generator in a 0-memory deque.
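The zero-memory deque mentioned here is the `consume` recipe from the itertools documentation; a minimal sketch:

```python
from collections import deque
from itertools import islice

def consume(iterator, n):
    # Advance the iterator n steps; a maxlen=0 deque discards
    # everything fed into it, so nothing is stored.
    deque(islice(iterator, n), maxlen=0)

it = iter(range(100))  # stand-in for a file object
consume(it, 17)
print(next(it))  # 17
```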

Here is a method to get lines between two line numbers in a file:

import sys

def file_line(name, start=1, end=sys.maxsize):
    lc = 0
    with open(name) as f:
        for line in f:
            lc += 1
            if start <= lc <= end:
                yield line
            elif lc > end:
                break


s = '/usr/share/dict/words'
l1 = list(file_line(s, 235880))
l2 = list(file_line(s, 1, 10))
print(l1)
print(l2)

Output:

['Zyrian\n', 'Zyryan\n', 'zythem\n', 'Zythia\n', 'zythum\n', 'Zyzomys\n', 'Zyzzogeton\n']
['A\n', 'a\n', 'aa\n', 'aal\n', 'aalii\n', 'aam\n', 'Aani\n', 'aardvark\n', 'aardwolf\n', 'Aaron\n']

Just call it with one parameter to get lines from line n to EOF.

Comments


If the file is a table, pandas can skip the rows for you:

import pandas as pd

pd.read_table("path/to/file", sep="\t", index_col=0, skiprows=17)

Comments
