3

I have two lists

A = ["ATTTGTA", "ATTTGTA", "ATTTGTA", "ATTTGTA"]

A_modified = ["ATTGTA", "AAAT", "TTTA"]

I want a tab-separated text file as output, looking like this:

ATTTGTA ATTGTA
ATTTGTA AAAT
ATTTGTA TTTA

I tried the following piece of code, but it does not write the output in two columns; each value ends up on a new row:

with open ('processed_seq.txt','a') as proc_seqf:
          proc_seqf.write(A)
          proc_seqf.write("\t")
          proc_seqf.write(A_modified)

This is the output I get

ATTTGTA
    ATTGTA
ATTTGTA
    AAAT
ATTTGTA
    TTTA

  • I suggest using the csv module. Commented Oct 28, 2014 at 15:50
  • . write() adds a newline. Just omit it. Commented Oct 28, 2014 at 15:56
  • proc_seqf.write("%s\t%s" % (A, A_modified)) might also work as a replacement for all of your write() lines, but using zip is probably the best way to get it organized in a meaningful way first, then follow mihai's answer Commented Oct 28, 2014 at 16:19
  • I realised this is happening because each of the lists looks like A = ["ATTTGTA\n", "ATTTGTA\n",..], and that is why the new line gets added. Can you tell me how to get rid of the \n at the end? Thanks. Commented Oct 28, 2014 at 17:35
  • I just found that using str.strip on the strings after reading them from the text file, and before creating the string, solved all my problems. Thanks. Commented Oct 28, 2014 at 17:47
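
The csv and str.strip suggestions from the comments can be combined. The following is only a minimal sketch, assuming the sequences were read from a text file and still carry a trailing "\n"; the helper name write_pairs is hypothetical and not part of the original question.

```python
import csv

def write_pairs(path, originals, modified):
    """Write two lists side by side as tab-separated columns."""
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        # zip stops at the shorter list, so unmatched items are skipped.
        for a, am in zip(originals, modified):
            # strip() removes the trailing "\n" left over from reading a file.
            writer.writerow([a.strip(), am.strip()])

# Sample data as it would look straight after reading lines from a file.
A = ["ATTTGTA\n", "ATTTGTA\n", "ATTTGTA\n", "ATTTGTA\n"]
A_modified = ["ATTGTA\n", "AAAT\n", "TTTA\n"]
write_pairs("processed_seq.txt", A, A_modified)
```

Stripping inside the loop keeps the input lists untouched; stripping once, right after reading the file, would work just as well.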

4 Answers

10

You just need to pair the elements of the two lists. You can do that with the zip function:

with open('processed_seq.txt', 'a') as proc_seqf:
    for a, am in zip(A, A_modified):
        # write() does not add a line break, so end each row with "\n".
        proc_seqf.write("{}\t{}\n".format(a, am))

I have also used format (see specs) to format the string to get everything in a single line.
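
Note that zip stops at the shorter list, so the fourth element of A is dropped, which matches the requested output. The same pairing can be generalised to any number of lists. This is a sketch only, assuming every element is already a string; tab_rows is a hypothetical helper name.

```python
def tab_rows(*columns):
    """Yield one tab-separated line per row across the given lists."""
    for row in zip(*columns):
        # Each line needs an explicit "\n"; writelines() does not add one.
        yield "\t".join(row) + "\n"

A = ["ATTTGTA", "ATTTGTA", "ATTTGTA", "ATTTGTA"]
A_modified = ["ATTGTA", "AAAT", "TTTA"]

with open("processed_seq.txt", "w") as proc_seqf:
    proc_seqf.writelines(tab_rows(A, A_modified))
```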

2

What about something like this? It gives you some flexibility in both input and output, and pads the columns with spaces rather than tabs.

lines = [
    ['a', 'e', '7', '3'],
    ['b', 'f', '1', '5'],
    ['c', 'g', '2', '10'],
    ['d', 'h', '1', '14'],
    ]

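def my_print(lns, spacing=3):
    # Width of each column: its longest value plus the requested spacing.
    widths = [max(len(value) for value in column) + spacing
              for column in zip(*lns)]
    # The with block closes the file even if a write fails.
    with open('processed_seq.txt', 'a') as proc_seqf:
        for line in lns:
            # '%-*s' takes a (width, value) pair and left-justifies the value.
            pretty = ''.join('%-*s' % item for item in zip(widths, line))
            print(pretty)  # debugging print
            proc_seqf.write(pretty + '\n')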

my_print( lines )

The spacing parameter lets the caller decide how much padding follows each column.

To match with your example data:

A = ["ATTTGTA", "ATTTGTA", "ATTTGTA", "ATTTGTA"]

A_modified = ["ATTGTA", "AAAT", "TTTA"]

# Pair the lists row by row; [A, A_modified] would put each list on its own line.
lines = list(zip(A, A_modified))

1

As an alternative to the other great answers, the following uses try/except so that the remaining elements of the longer list are still written when the lists differ in length (as they do in your sample):

with open ('processed_seq.txt','w') as proc_seqf:
    for each in range(max(len(A), len(A_modified))):
        try:
            proc_seqf.write("{}\t{}\n".format(A[each], A_modified[each]))
        except IndexError:
            if len(A) > len(A_modified):
                proc_seqf.write("{}\t\n".format(A[each]))
            else:
                proc_seqf.write("\t{}\n".format(A_modified[each]))

cat processed_seq.txt
ATTTGTA ATTGTA
ATTTGTA AAAT
ATTTGTA TTTA
ATTTGTA 
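
The standard library's itertools.zip_longest gives the same padding behaviour without the try/except. This is a sketch under the assumption that an empty string is an acceptable filler for the missing column; padded_rows is a hypothetical helper name.

```python
from itertools import zip_longest

def padded_rows(first, second, fill=""):
    """Pair two lists, padding the shorter one with `fill`."""
    return ["{}\t{}\n".format(a, b)
            for a, b in zip_longest(first, second, fillvalue=fill)]

A = ["ATTTGTA", "ATTTGTA", "ATTTGTA", "ATTTGTA"]
A_modified = ["ATTGTA", "AAAT", "TTTA"]

with open("processed_seq.txt", "w") as proc_seqf:
    proc_seqf.writelines(padded_rows(A, A_modified))
```

With the sample lists, the fourth row contains the last element of A followed by a tab and an empty second column.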

1

If your lists are huge, I suggest using itertools.cycle():

import itertools

ac = itertools.cycle(A)
a_mc = itertools.cycle(A_modified)
with open('processed_seq.txt', 'a') as proc_seqf:
    for _ in A_modified:
        # The built-in next() works on Python 2 and 3; the .next() method is Python 2 only.
        proc_seqf.write("{}\t{}\n".format(next(ac), next(a_mc)))
