Read line once without removing it in Python using .readline()

Question

I would like to count the occurrences of missing values (NA) in every line of a txt file.

foo.txt file:

1 1 1 1 1 NA    # so, Missings: 1
1 1 1 NA 1 1    # so, Missings: 1
1 1 NA 1 1 NA   # so, Missings: 2

But I would also like to obtain the number of elements in the first line (assuming this is the same for all lines).

miss = []
with open("foo.txt") as f:
    for line in f:
        miss.append(line.count("NA"))

>>> miss
[1, 1, 2]         # correct

The problem arises when I try to determine the number of elements. I did this with the following code:

miss = []
with open("foo.txt") as f:
    first_line = f.readline()
    elements = first_line.count(" ")  # given that values are separated by space
    for line in f:
        miss.append(line.count("NA"))

>>> (elements + 1)
6   # True, this is correct          
>>> miss 
[1, 2]  # misses the first item because readline() already consumed the first line

How can I read the first line once without consuming it, so that the subsequent loop still sees it?

Premature optimization is the root of all evil. Just calculate the length for each line inside the loop: for line in f: ... elements = len(line.split()).
– georg, Commented Jun 3, 2013 at 9:33

Thorsten Kranz · Accepted Answer · 2013-06-03 08:55:34Z

2

Try f.seek(0). This will reset the file handle to the beginning of the file.

A complete example would then be:

miss = []
with open("foo.txt") as f:
    first_line = f.readline()
    elements = first_line.count(" ")  # given that values are separated by space
    f.seek(0)
    for line in f:
        miss.append(line.count("NA"))
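
The effect of seek(0) can be checked in isolation. This is only a minimal sketch: io.StringIO stands in for open("foo.txt") (an assumption, so the snippet runs without a file on disk), and its contents mirror the sample data from the question.

```python
import io

# Stand-in for open("foo.txt") so the sketch runs without a file on disk.
f = io.StringIO("1 1 1 1 1 NA\n1 1 1 NA 1 1\n1 1 NA 1 1 NA\n")

first_line = f.readline()  # advances the position past the first line
f.seek(0)                  # move the position back to the start
lines = list(f)            # iteration starts from the first line again

print(lines[0] == first_line, len(lines))
```

After the seek, the first line is read a second time, so the loop sees all three lines.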

Even better would be to read every line, including the first, only once, and to check the number of elements only once:

miss = []
elements = None
with open("foo.txt") as f:
    for line in f:
        if elements is None:
            elements = line.count(" ")  # given that values are separated by space
        miss.append(line.count("NA"))

BTW: wouldn't the number of elements be line.count(" ") + 1?

I'd recommend using len(line.split()), as this also handles tabs, double spaces, leading/trailing spaces etc.
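
To illustrate the difference, here is a small sketch comparing the two counting approaches. The sample strings are made up for illustration and are not taken from foo.txt; each one holds six values.

```python
# Hypothetical sample lines; each holds six whitespace-separated values.
samples = [
    "1 1 1 1 1 NA\n",        # single spaces
    "1 1  1 NA 1 1\n",       # one double space
    "1\t1\tNA\t1\t1\tNA\n",  # tabs instead of spaces
    " 1 1 1 1 1 NA \n",      # leading and trailing space
]

# count(" ") + 1 is only right when exactly one space separates the values.
by_count = [line.count(" ") + 1 for line in samples]

# split() with no argument splits on any run of whitespace and ignores
# leading and trailing whitespace, including the newline.
by_split = [len(line.split()) for line in samples]

print(by_count)
print(by_split)
```

Only the first sample gives the same result with both approaches; len(line.split()) reports six elements for every sample.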

edited Jun 3, 2013 at 8:55

answered Jun 3, 2013 at 8:49


Mike Müller · Accepted Answer · 2013-06-03 09:41:06Z

2

Provided all lines have the same number of items, you can just count the items in the last line:

miss = []
with open("foo.txt") as f:
    for line in f:
        miss.append(line.count("NA"))
    elements = len(line.split())

A better way to count is probably:

elements = len(line.split())

because this also counts items separated with multiple spaces or tabs.

edited Jun 3, 2013 at 9:41

answered Jun 3, 2013 at 8:59


2 Comments

georg Over a year ago

Note that .count(" ") will be off by 1, so len(split) is the only correct one.

Mike Müller Over a year ago

Thanks. Yes. That is the way I would do it. In addition, there is often more than one space or tab between items. Deleted the OP version.

John La Rooy · Accepted Answer · 2013-06-03 08:55:42Z

0

You can also just treat the first line separately:

with open("foo.txt") as f:
    first_line = next(f)  # consumes the first line; the loop below starts at the second
    elements = first_line.count(" ")  # given that values are separated by space
    miss = [first_line.count("NA")]
    for line in f:
        miss.append(line.count("NA"))
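
For readers unfamiliar with next(): a file object is its own iterator, so next(f) returns the next line and advances the file, and a following for loop continues from the second line. Below is a minimal sketch of that behaviour. It uses io.StringIO as a stand-in for open("foo.txt") (an assumption, so the snippet runs without a file on disk) and passes a default to next() so an empty file does not raise StopIteration.

```python
import io

# Stand-in for open("foo.txt"); the contents mirror the question's sample data.
f = io.StringIO("1 1 1 1 1 NA\n1 1 1 NA 1 1\n1 1 NA 1 1 NA\n")

# next() pulls one line from the iterator; the default "" is returned
# instead of raising StopIteration when there is nothing to read.
first_line = next(f, "")
elements = len(first_line.split())

# The first line was already consumed, so count its NAs explicitly
# (or start with an empty list if the input was empty).
miss = [first_line.count("NA")] if first_line else []
for line in f:  # continues with the second line
    miss.append(line.count("NA"))

print(elements, miss)
```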

answered Jun 3, 2013 at 8:55


1 Comment

PascalVKooten Over a year ago

What exactly is next then?
