I have a script in which Python should take one line at a time and do a lot of processing (alignment and so on). I tried to use a counter, count, to iterate over every line in my input file.

However, when I run it, it only uses the last line of the input file and runs the rest of the script with that line. So the script itself is fine, but the iteration is not. As a test, I tried it with only 4 lines; this is the iterating part of the script:

for line in open(sys.argv[1]):
     count+=1
     if count < 4 :
         continue
     elif count > 4 :
         break

I wrote a test script to check whether it visits every line:

count = 0
file = open('mclOutput2', 'r') 
while True:
    count+=1
    if count < 4:
        print file.readlines()
    elif count > 4 :
        break

And this is the output I get:

['mono|comp78360_c0_seq1\tpoly|comp71317_c0_seq1\tturc|comp70178_c0_seq1\tturc|comp19023_c0_seq1\n', 'mono|comp78395_c0_seq1\trubr|comp23732_c0_seq1\trugi|comp32227_c0_seq1\tsulc|comp11641_c0_seq1\n', 'mono|comp80301_c0_seq1\tnegl|comp30782_c0_seq1\tphar|comp29363_c0_seq1\tpoly|comp53026_c0_seq2\n', 'mono|comp80554_c0_seq1\tnegl|comp27459_c0_seq1\tpoly|comp57863_c0_seq2\trugi|comp11691_c0_seq1\n']
[]
[]

I am not really sure how to fix it. Any ideas what I am doing wrong?

2 Answers

1

Better code:

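import sys
from itertools import islice

def skip_lines(inf, n):
    # Consume (and discard) the first n lines of the open file.
    list(islice(inf, n))

with open(sys.argv[1]) as inf:
    skip_lines(inf, 4)
    # Iterate over the remaining lines, numbering them from 4.
    for count,line in enumerate(inf, 4):
        print("do your stuff here")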

Edit: Looking at your data (quoted in your .readlines output), you want something like

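import sys

GET_LINES = 4
with open(sys.argv[1]) as inf:
    # zip stops once range() runs out, so at most GET_LINES lines are processed.
    for count,line in zip(range(1, GET_LINES+1), inf):
        # Split each tab-separated field into a [species, sequence_id] pair.
        data = [pairs.split('|') for pairs in line.strip().split('\t')]
        print("{:>3d}: {}".format(count, data))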

which gives

  1: [['mono', 'comp78360_c0_seq1'], ['poly', 'comp71317_c0_seq1'], ['turc', 'comp70178_c0_seq1'], ['turc', 'comp19023_c0_seq1']]
  2: [['mono', 'comp78395_c0_seq1'], ['rubr', 'comp23732_c0_seq1'], ['rugi', 'comp32227_c0_seq1'], ['sulc', 'comp11641_c0_seq1']]
  3: [['mono', 'comp80301_c0_seq1'], ['negl', 'comp30782_c0_seq1'], ['phar', 'comp29363_c0_seq1'], ['poly', 'comp53026_c0_seq2']]
  4: [['mono', 'comp80554_c0_seq1'], ['negl', 'comp27459_c0_seq1'], ['poly', 'comp57863_c0_seq2'], ['rugi', 'comp11691_c0_seq1']]
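
A possible variation on the snippet above, offered as a sketch rather than the answerer's exact code: islice can cap the number of lines directly, and enumerate supplies the counter. The names parse_line and first_groups are hypothetical, io.StringIO stands in for the real input file, and the sample is a shortened version of the data quoted in the question.

```python
import io
from itertools import islice

def parse_line(line):
    """Split one tab-separated line into [species, sequence_id] pairs."""
    return [field.split('|') for field in line.strip().split('\t')]

def first_groups(lines, limit):
    """Return (count, parsed_line) tuples for at most `limit` lines."""
    return [(count, parse_line(line))
            for count, line in enumerate(islice(lines, limit), 1)]

# io.StringIO stands in for open(sys.argv[1]); shortened sample data.
sample = io.StringIO(
    "mono|comp78360_c0_seq1\tpoly|comp71317_c0_seq1\n"
    "mono|comp78395_c0_seq1\trubr|comp23732_c0_seq1\n"
    "mono|comp80301_c0_seq1\tnegl|comp30782_c0_seq1\n"
)

groups = first_groups(sample, 2)
for count, data in groups:
    print("{:>3d}: {}".format(count, data))
```

Because islice stops after limit items, the third sample line is never parsed here.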
0

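Use file.readline() instead of file.readlines(). readlines() reads all remaining lines at once, so the first call returns the whole file as a list and every later call returns an empty list, which is exactly the output you see.

Note that file is a Python 2 builtin, so it is better not to use it as a variable name.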
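
To illustrate the difference, here is a small sketch written for Python 3. It uses io.StringIO with made-up text in place of the real mclOutput2 file, and the variable name handle instead of file.

```python
import io

text = "line 1\nline 2\nline 3\n"

# readlines() consumes everything in one call; later calls find nothing left.
handle = io.StringIO(text)
first_call = handle.readlines()
second_call = handle.readlines()
print(first_call)
print(second_call)

# readline() returns one line per call, and '' once the input is exhausted.
handle = io.StringIO(text)
one_at_a_time = []
while True:
    line = handle.readline()
    if not line:
        break
    one_at_a_time.append(line)
print(one_at_a_time)
```

In practice, iterating directly with a for loop over the open file gives the same one-line-at-a-time behaviour with less code.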

5 Comments

Thanks, yes, you are right! It fixes my test run. What about my first part? I don't understand why my script does not iterate over every line!
Yes, it goes to continue; I did not want to paste the entire code. I don't understand why the script only works for the last line of my file when I use the count counter.
@user3188922 I'm not sure what you mean by 'work'. What's the expected output of your first snippet?
The script returns some alignments and clusters; each line of my input file is one group. Ideally, the script would run the program for each group (so each line) of my input file and give me back the alignment. At the moment, it only runs it for the last line (so group number 4), which I don't understand. I think my iteration is not right.
@user3188922 I'm guessing you may want if count <= 4: # do your stuff, instead of if count < 4: continue?
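
To make the last suggestion concrete: in the question's loop, count < 4 triggers continue for lines 1 to 3, so any code placed after the if/elif inside the loop only runs when count == 4. The sketch below assumes the real work sits there (the question does not show it). process is a hypothetical stand-in for the alignment step, and a list of strings stands in for the open file.

```python
def process(line):
    """Hypothetical placeholder for the real per-group work (alignment etc.)."""
    return line.strip().upper()

def run_original(lines, limit=4):
    """Mirror the question's loop: lines before `limit` are skipped."""
    results = []
    count = 0
    for line in lines:
        count += 1
        if count < limit:
            continue      # lines 1..limit-1 never reach process()
        elif count > limit:
            break
        results.append(process(line))
    return results

def run_fixed(lines, limit=4):
    """Process every line up to and including `limit`, then stop."""
    results = []
    count = 0
    for line in lines:
        count += 1
        if count > limit:
            break
        results.append(process(line))
    return results

lines = ["group1\n", "group2\n", "group3\n", "group4\n", "group5\n"]
print(run_original(lines))
print(run_fixed(lines))
```

With the original logic only the fourth group is processed; dropping the continue branch lets groups 1 through 4 all reach process.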
