38

I'm trying to parse a tab-separated file in Python, where a number that appears k tabs from the beginning of a row should be placed into the k-th array.

Is there a built-in function to do this, or a better way than reading line by line and doing all the obvious processing a naive solution would perform?

Sometimes easy to forget, but it's customary to accept an answer to your question. Commented Jun 21, 2017 at 13:21

4 Answers

72

You can use the csv module to parse tab-separated value files easily.

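import csv

# newline="" is what the csv documentation recommends when opening a file for csv.reader.
with open("tab-separated-values", newline="") as tsv:
    # You can also pass delimiter="\t" rather than giving a dialect.
    for line in csv.reader(tsv, dialect="excel-tab"):
        ...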

Where line is a list of the values on the current row for each iteration.
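To illustrate what happens when a value is missing (two consecutive tabs), here is a minimal sketch. The sample data is made up, and io.StringIO stands in for a real file; the missing field should come back as an empty string, so the remaining values keep their positions.

```python
import csv
import io

# Hypothetical sample data: the second row has a missing middle value,
# which shows up in the file as two consecutive tabs.
sample = "1\t2\t3\n4\t\t6\n"

# io.StringIO stands in for an open file here.
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

for row in rows:
    print(row)
```

Because the empty field is kept as '', the value 6 stays in the third position of its row.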

Edit: As suggested below, if you want to read by column, and not by row, then the best thing to do is use the zip() builtin:

with open("tab-separated-values") as tsv:
    # zip(*rows) transposes the rows into columns; the reader can be unpacked directly.
    for column in zip(*csv.reader(tsv, dialect="excel-tab")):
        ...
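One caveat worth sketching: zip() stops at the shortest row, so a row with fewer fields silently drops the trailing columns. A possible workaround, assuming that padding short rows with an empty string is acceptable for your data, is itertools.zip_longest. The sample data below is made up.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical ragged input: the last row is one field short.
sample = "1\t2\t3\n4\t5\t6\n7\t8\n"

rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

# zip() stops at the shortest row, so the third column disappears.
truncated = list(zip(*rows))

# zip_longest() pads the short row with fillvalue instead.
padded = list(zip_longest(*rows, fillvalue=""))

print(truncated)
print(padded)
```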

5 Comments

whenever an element is missing there are two consecutive tabs. will that work?
@Bob Why don't you try it and see? (But yes, it will).
@Lattyware: Your use of "file" as a variable name is a no-no... ;)
@martineau: of all the default builtin names to rebind, file is the least problematic, esp. because it doesn't even exist in 3. Y'all can have "for file in files:" when you pry it from my cold, dead hands! ;^)
@martineau I'm a Python 3.x man, so I sometimes forget this is smashing file in 2.x. Good point, however. Edited.
15

I don't think any of the current answers really do what you said you want. (Correction: I now see that @Gareth Latty / @Lattyware has incorporated my answer into his own as an "Edit" near the end.)

Anyway, here's my take:

Say these are the tab-separated values in your input file:

1   2   3   4   5
6   7   8   9   10
11  12  13  14  15
16  17  18  19  20

then this:

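with open("tab-separated-values.txt") as inp:
    # rstrip('\n') removes only the line ending; strip() would also remove
    # leading/trailing tabs and shift the columns when a field is empty.
    print(list(zip(*(line.rstrip('\n').split('\t') for line in inp))))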

would produce the following:

[('1', '6', '11', '16'), 
 ('2', '7', '12', '17'), 
 ('3', '8', '13', '18'), 
 ('4', '9', '14', '19'), 
 ('5', '10', '15', '20')]

As you can see, it put the k-th element of each row into the k-th array.
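Since the question asks for numbers, the same transposition can also convert the values on the way through. This is only a sketch: it assumes every field is present and holds an integer, and io.StringIO stands in for the input file shown above.

```python
import io

# Stand-in for the input file shown above (same values, built in memory).
inp = io.StringIO(
    "1\t2\t3\t4\t5\n"
    "6\t7\t8\t9\t10\n"
    "11\t12\t13\t14\t15\n"
    "16\t17\t18\t19\t20\n"
)

# Split each line on tabs, transpose with zip(*...), then convert each
# column to a list of ints. rstrip("\n") removes only the line ending.
columns = [
    [int(value) for value in column]
    for column in zip(*(line.rstrip("\n").split("\t") for line in inp))
]

print(columns[0])
print(columns[4])
```

If some fields can be empty, int('') raises ValueError, so the conversion step would need a rule for missing values.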


7

Like this:

>>> s='1\t2\t3\t4\t5'
>>> [x for x in s.split('\t')]
['1', '2', '3', '4', '5']

For a file:

# create test file:
>>> with open('tabs.txt','w') as o:
...    s='\n'.join(['\t'.join(map(str,range(i,i+10))) for i in [0,10,20,30]])
...    print(s, file=o)

#read that file:
>>> with open('tabs.txt','r') as f:
...    LoL=[x.rstrip('\n').split('\t') for x in f]
... 
>>> LoL
[['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], 
 ['10', '11', '12', '13', '14', '15', '16', '17', '18', '19'], 
 ['20', '21', '22', '23', '24', '25', '26', '27', '28', '29'], 
 ['30', '31', '32', '33', '34', '35', '36', '37', '38', '39']]
>>> LoL[2][3]
'23'

If you want the input transposed:

>>> with open('tabs.txt','r') as f:
...    LoT=list(zip(*(line.rstrip('\n').split('\t') for line in f)))
... 
>>> LoT[2][3]
'32'

Or (better still) use the csv module from the standard library.
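A rough sketch of what the csv version of the test above might look like, writing and reading the same 4x10 grid. The in-memory buffer is a stand-in for tabs.txt, and lineterminator="\n" is a choice made here rather than the csv default.

```python
import csv
import io

# In-memory stand-in for tabs.txt.
buffer = io.StringIO()

# Write four rows of ten consecutive integers, tab-separated.
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(list(range(i, i + 10)) for i in [0, 10, 20, 30])

# Rewind and read the rows back as lists of strings.
buffer.seek(0)
LoL = list(csv.reader(buffer, delimiter="\t"))

# Transpose rows into columns.
LoT = list(zip(*LoL))

print(LoL[2][3])
print(LoT[2][3])
```

The values come back as strings, just as with split('\t'), so LoL[2][3] is '23' and LoT[2][3] is '32'.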

6 Comments

In Python, making an empty list and then appending values is an anti-pattern. That's what list comprehensions are for.
@Lattyware: I personally do not find the first form hard to read, but you are right -- a nested list comprehension is probably more Pythonic. Edited.
@drewk: [x.split('\t') for f.split('\n')] makes no sense. There's no x and files objects don't have a split() method.
@martineau: perfect example of why to use the csv module, no? typo fixed. I tested it
@drewk: Well, not so much...most likely the latter thing IMHO. ;)
1

You can also do this with pandas: after import pandas as pd, call pd.read_csv('file_name.tsv', sep='\t'). Pass header=None if the file has no header row.

[Note: pandas must be installed first, for example with pip install pandas.]

