Python - from file to data structure?

Question

I have a large file of ~100,000 lines. Each line corresponds to a cluster, and each entry within a line is a reference ID for another file (a protein structure in this case), e.g.

1hgn 1dju 3nmj 8kfn
9opu 7gfb 
4bui

I need to read in the file as a list of lists where each line is a sublist, thus preserving the integrity of the cluster, e.g.

nested_list = [['1hgn', '1dju', '3nmj', '8kfn'], ['9opu', '7gfb'], ['4bui']]

My current code creates a nested list, but each sublist holds the whole line as a single string rather than separate entries. As a result, I cannot easily slice the sublists by index.

Any help greatly appreciated.

Thanks, S :-)

Oli · Accepted Answer · 2010-05-28 12:30:37Z

13

Super simple:

with open('myfile', 'r') as f:
    data = [line.split() for line in f]
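A minimal sketch of the same idea applied to the sample data from the question, assuming blank lines may occur in the file and should not become empty clusters. The helper name `read_clusters` and the temporary file are illustrative only and are not part of the original answer.

```python
import os
import tempfile


def read_clusters(path):
    """Return one list of IDs per non-blank line of the file at `path`."""
    with open(path, "r") as f:
        # line.split() with no argument splits on any run of whitespace and
        # discards the trailing newline; blank lines yield [] and are skipped.
        return [ids for ids in (line.split() for line in f) if ids]


# Write the sample clusters from the question to a temporary file.
sample = "1hgn 1dju 3nmj 8kfn\n9opu 7gfb \n4bui\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

nested_list = read_clusters(tmp.name)
os.remove(tmp.name)

print(nested_list)
print(nested_list[0][1:3])  # each sublist can now be sliced by index
```

Because the file object is iterated line by line, only the resulting lists are held in memory, which should be comfortable for a file of roughly 100,000 lines.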


1 Comment

Wayne Werner Over a year ago

Nope - that will do exactly what the OP asked. Yay Python & batteries included.

SilentGhost · Accepted Answer · 2010-05-28 12:37:59Z

6

You'll want to investigate the str.split() method.

>>> '1hgn 1dju 3nmj 8kfn'.split()
['1hgn', '1dju', '3nmj', '8kfn']
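As a small illustration of why the no-argument form is the convenient one here, the sketch below compares it with an explicit single-space separator on a line that has a trailing space and newline, as lines read from a file typically do. The variable names are illustrative.

```python
# A line as it might come from the file, with a trailing space and newline.
line = "9opu 7gfb \n"

# No argument: split on runs of whitespace and ignore leading/trailing whitespace.
default_split = line.split()

# Explicit single-space separator: the newline survives as its own entry.
space_split = line.split(" ")

print(default_split)
print(space_split)
```

With the explicit separator the stray newline would need to be stripped first, so `split()` without arguments is usually the simpler choice for whitespace-separated IDs.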

answered May 28, 2010 at 12:28 by Peter Milley; edited May 28, 2010 at 12:37 by SilentGhost
