0

I have a huge text file consisting of "blocks" like this:

block
object pen
fruit apple
people mike
block
electronic laptop
city dallas
fruit banana
object stapler
vehicle car
block
people george
fruit orange
vehicle truck
city austin
object hammer

In each block there is exactly one fruit, on a random line. Each block has a different number of lines. I want to iterate over this file, print everything up to and including the fruit's name, and then skip to the next block. Once I have found the fruit in a block, it is a waste of time to check whether the following lines are a fruit. I just want to jump to the next block, but the problem is that I don't know how many lines ahead it is. So the output should look like:

block
object pen
the fruit is: apple
block
electronic laptop
city dallas
the fruit is: banana
block
people george
the fruit is: orange

I can produce this output in two ways, one:

flag = True
with open("sample.txt", "r") as f:
    for line in f.readlines():
        if line.split()[0] == 'fruit':
            print "the fruit is: " + line.split()[1]
            flag = False
        if line.split()[0] == 'block':
            flag = True
        if flag:
            print line

And two:

flag = False
with open("sample.txt", "r") as f:
    for line in f.readlines():
        if line.split()[0] == 'fruit':
            print "the fruit is: " + line.split()[1]
            flag = True
        if line.split()[0] == 'block':
            flag = False
        if flag:
            continue
        print line 

But this is not what I want. My code still checks whether each line is a fruit. I want to skip the lines after the fruit until the next block, and continue from there. How can I do that jump?

2
  • if flag and line.split()[0] == 'fruit': just test for the flag to avoid comparison Commented Jan 17, 2018 at 20:18
  • Your second piece of code is already what you're asking...? Commented Jan 17, 2018 at 20:32

3 Answers

3
from itertools import dropwhile

def not_block(line): return line != 'block\n'

with open("sample.txt", "r") as f:
    while True:
        # dropwhile skips ahead to the next 'block' line, then passes lines through
        for line in dropwhile(not_block, f):
            if line.startswith('fruit '):
                # takewhile would swallow this line, so test for it directly
                print("the fruit is: " + line.split()[1])
                break
            print(line.rstrip())
        else:
            break  # reached end of file without finding another fruit

1

You can add an inner loop that you trigger after you find a block line. Note that this assumes your data is well-formed (i.e., every block has a fruit).

with open('data.txt') as f:
    for line in f:
        line = line.strip()
        if line == 'block':
            print(line)
            for line in f:
                line = line.strip()
                if line.startswith('fruit '):
                    print('the fruit is:', line.split(None, 1)[1])
                    break
                else:
                    print(line)
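
If the data might not be well-formed, one possible alternative is a single loop with a skip flag that every block line resets. This is only a sketch: the generator name lines_until_fruit is made up for illustration, and it still strips each line and compares it with 'block', so it saves the fruit test on skipped lines rather than the read itself.

```python
def lines_until_fruit(lines):
    """Yield each block's lines up to its fruit, skipping the rest of the block.

    `lines` is any iterable of text lines, for example an open file.
    """
    skipping = False
    for raw in lines:
        line = raw.strip()
        if line == "block":
            skipping = False  # a new block always re-enables output
        elif skipping:
            continue  # flag test only; the fruit check is not reached
        elif line.startswith("fruit "):
            yield "the fruit is: " + line.split(None, 1)[1]
            skipping = True
            continue
        yield line
```

It could be used as `for out in lines_until_fruit(f): print(out)` inside the same `with open('data.txt') as f:` block. A block without a fruit is simply printed in full.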

Another option, which is a bit more involved but could be a lot faster if the data file is truly huge, is to use mmap with find().
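
A rough sketch of what the mmap idea could look like follows. The function name fruit_lines is hypothetical, and the code assumes Python 3, a non-empty file (mmap raises ValueError on an empty one), "\n" line endings, and that every block contains a fruit. Whether it is actually faster depends on the data, so it is worth measuring.

```python
import mmap

def fruit_lines(path):
    """Yield each block's lines up to its fruit, jumping between blocks with find()."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # A block header sits either at offset 0 or right after a newline.
        if mm[:6] == b"block\n":
            start = 0
        else:
            hit = mm.find(b"\nblock\n")
            start = -1 if hit == -1 else hit + 1
        while start != -1:
            fruit = mm.find(b"\nfruit ", start)
            if fruit == -1:
                break  # assumption violated: no fruit left, so stop
            end = mm.find(b"\n", fruit + 1)
            if end == -1:
                end = len(mm)  # fruit is on the last line, no trailing newline
            # Lines of the block before the fruit line
            for line in mm[start:fruit].split(b"\n"):
                yield line.decode()
            # Skip the 7 bytes of b"\nfruit " to get the fruit's name
            yield "the fruit is: " + mm[fruit + 7:end].decode().strip()
            # Jump straight to the next header without looking at the lines between
            hit = mm.find(b"\nblock\n", end)
            start = -1 if hit == -1 else hit + 1
```

The search for the next header is done by find() on the mapped bytes, so the lines between a fruit and the next block are never split or compared in Python code.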

0

Call me crazy, but I want to take a different stab at this with list comprehensions and zip():

with open("sample.txt", "r") as file:
    lines = [line.strip('\n') for line in file.readlines()]

blocks = [i for i, j in enumerate(lines) if j == 'block']
fruits = [i for i, j in enumerate(lines) if j.startswith('fruit ')]

for i, j in zip(blocks, fruits):
    print('\n'.join(lines[i:j+1]))

Output:

block
object pen
fruit apple
block
electronic laptop
city dallas
fruit banana
block
people george
fruit orange

But that'll only work if each block is always followed by a fruit before the next block.

It looks pretty, okay. Don't question my weapon of choice...

2 Comments

This reads all the lines into memory up front, which for a large file might be too slow or use too much memory. It also iterates over the lines twice. It is pretty clever, though.
Ah shoot forgot the big list part of the requirements. Ah well, all in good fun.
