Skip lines until the next block in a text file if a condition is met in Python

Question

I have a huge text file consisting of "blocks" like this:

block
object pen
fruit apple
people mike
block
electronic laptop
city dallas
fruit banana
object stapler
vehicle car
block
people george
fruit orange
vehicle truck
city austin
object hammer

In each block there is only one fruit, at a random line, and each block has a different number of lines. I want to iterate over this file, print every line up to and including the fruit's name, and then skip to the next block. Once I find the fruit in a block, it is a waste of time to check whether the following lines are fruit. I just want to jump to the next block, but the problem is that I don't know how many lines ahead it is. So the output should look like:

block
object pen
the fruit is: apple
block
electronic laptop
city dallas
the fruit is: banana
block
people george
the fruit is: orange

I can produce this output in two ways, one:

flag = True
with open("sample.txt", "r") as f:
    for line in f.readlines():
        if line.split()[0] == 'fruit':
            print "the fruit is: " + line.split()[1]
            flag = False
        if line.split()[0] == 'block':
            flag = True
        if flag:
            print line

And two:

flag = False
with open("sample.txt", "r") as f:
    for line in f.readlines():
        if line.split()[0] == 'fruit':
            print "the fruit is: " + line.split()[1]
            flag = True
        if line.split()[0] == 'block':
            flag = False
        if flag:
            continue
        print line

But this is not what I want. My code still checks every line to see whether it is a fruit line. I want to skip the lines after the fruit until the next block, and continue from there. How can I do that jump?

if flag and line.split()[0] == 'fruit': just test the flag first to avoid the comparison
– Jean-François Fabre ♦, Commented Jan 17, 2018 at 20:18
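
A minimal sketch of the suggestion in the comment above, assuming Python 3 and the line layout from the question. The function name print_until_fruit and the variable printing (which plays the role of flag) are made up for illustration. Because `and` short-circuits, the fruit test only runs while a block is still being printed; lines after the fruit only pay for the 'block' comparison.

```python
def print_until_fruit(lines):
    """Return the output lines for any iterable of text lines (e.g. an open file)."""
    out = []
    printing = True  # True until the current block's fruit has been seen
    for line in lines:
        line = line.rstrip("\n")
        if line == "block":
            printing = True  # a new block starts: resume printing
        # `and` short-circuits, so startswith() is skipped once printing is False
        if printing and line.startswith("fruit "):
            out.append("the fruit is: " + line.split()[1])
            printing = False
        elif printing:
            out.append(line)
    return out


# Small in-memory sample standing in for sample.txt
sample = [
    "block\n", "object pen\n", "fruit apple\n", "people mike\n",
    "block\n", "city dallas\n", "fruit banana\n", "vehicle car\n",
]
result = print_until_fruit(sample)
for text in result:
    print(text)
```

This still visits every line; it only makes the per-line work after the fruit cheaper. To use it on the real file, pass the open file object instead of the sample list.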

Steven Rumbalski · Accepted Answer · 2018-01-17 21:24:59Z

3

from itertools import dropwhile

def not_block(line): return line != 'block\n'

with open("sample.txt", "r") as f:
    while True:
        # dropwhile discards lines until it reaches the next 'block' line,
        # then yields that line and everything after it.
        for line in dropwhile(not_block, f):
            if line.startswith('fruit '):
                print("the fruit is: " + line.split()[1])
                break  # done with this block; the next pass skips ahead
            print(line.rstrip())
        else:
            break  # no fruit line was left, so the file is exhausted

edited Jan 17, 2018 at 21:24

answered Jan 17, 2018 at 20:31

Steven Rumbalski

user8651755 · Answer · 2018-01-17 20:32:01Z

1

You can add an inner loop that you trigger after you find a block line. Note that this assumes your data is well-formed (i.e., every block has a fruit).

with open('data.txt') as f:
    for line in f:
        line = line.strip()
        if line == 'block':
            print(line)
            for line in f:
                line = line.strip()
                if line.startswith('fruit '):
                    print('the fruit is:', line.split(None, 1)[1])
                    break
                else:
                    print(line)

Another thing you can do that's a bit more involved but could be a lot faster if the data file is truly huge is use mmap with find().
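
A rough sketch of that mmap idea, under stated assumptions: Python 3, "\n" line endings, a trailing newline at the end of the file, and block lines that are exactly "block". The helper names next_block and fruits_via_mmap are invented for this example. After each fruit, find() locates the next block line so the lines in between are never split or compared in Python code. Whether this is actually faster depends on the file, so measure before relying on it.

```python
import mmap
import os
import tempfile


def next_block(mm, start):
    """Offset of the next line that is exactly 'block' at or after start, or -1."""
    if start == 0 and mm[:6] == b"block\n":
        return 0
    # Search from start - 1 so the newline ending the previous line can match.
    hit = mm.find(b"\nblock\n", max(start - 1, 0))
    return -1 if hit == -1 else hit + 1


def fruits_via_mmap(path):
    """Collect each block's lines up to its fruit, jumping between blocks with find()."""
    out = []
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = next_block(mm, 0)
            while pos != -1:
                mm.seek(pos)
                # readline() returns b"" at end of file, which stops iter()
                for raw in iter(mm.readline, b""):
                    line = raw.decode().rstrip("\n")
                    if line.startswith("fruit "):
                        out.append("the fruit is: " + line.split()[1])
                        break
                    out.append(line)
                pos = next_block(mm, mm.tell())
    return out


# Demo on a throwaway file standing in for the real data file
data = (b"block\nobject pen\nfruit apple\npeople mike\n"
        b"block\ncity dallas\nfruit banana\nvehicle car\n")
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(data)
try:
    result = fruits_via_mmap(tmp.name)
finally:
    os.remove(tmp.name)
for text in result:
    print(text)
```

As with the loop above, a block without a fruit line would keep printing into the following block.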

edited Jan 17, 2018 at 20:32

answered Jan 17, 2018 at 20:20

user8651755

r.ook · Answer · 2018-01-17 21:42:04Z

0

Call me crazy but I want to take a different dig at this with list comprehension and zip():

with open("sample.txt", "r") as file:
    lines = [line.strip('\n') for line in file.readlines()]

blocks = [i for i, j in enumerate(lines) if j == 'block']
fruits = [i for i, j in enumerate(lines) if 'fruit' in j]

for i, j in zip(blocks, fruits):
    print('\n'.join(lines[i:j+1]))

Output:

block
object pen
fruit apple
block
electronic laptop
city dallas
fruit banana
block
people george
fruit orange

But that'll only work if each block is always followed by a fruit before the next block.

It looks pretty, okay. Don't question my weapon of choice...

edited Jan 17, 2018 at 21:42

answered Jan 17, 2018 at 21:28

r.ook

2 Comments

user8651755 Over a year ago

This reads all the lines into memory up front, which for a large file might be too slow or use too much memory. It also iterates over the lines twice. It is pretty clever, though.

r.ook Over a year ago

Ah shoot forgot the big list part of the requirements. Ah well, all in good fun.
