I am trying to create a Python list for each column of my data, which looks like this:

399.75833     561.572000000        399.75833     561.572000000  a_Fe I 399.73920 nm
399.78316     523.227000000        399.78316     523.227000000  
399.80799     455.923000000        399.80799     455.923000000  a_Fe I 401.45340 nm
399.83282     389.436000000        399.83282     389.436000000  
399.85765     289.804000000        399.85765     289.804000000  

The problem is that the rows of my data have different lengths. Is there any way to pad the missing fields of the shorter rows with a space so that every row has the same length?

I would like my data to be in the form:

list one= [399.75833, 399.78316, 399.80799, 399.83282, 399.85765]
list two= [561.572000000, 523.227000000, 455.923000000, 389.436000000, 289.804000000]
list three= [a_Fe, " ", a_Fe, " ", " "]

This is the code I used to import the data into Python:

fh  = open('help.bsp').read()
the_list = []
for line in fh.split('\n'):
    print line.strip()
    splits = line.split()
    if  len(splits) ==1 and splits[0]== line.strip():
        splits = line.strip().split(',')
    if splits:the_list.append(splits)

2 Answers

You need to use izip_longest (named zip_longest in Python 3) to build your column lists, because the standard zip stops at the shortest row and would drop the extra fields of the longer rows.

from itertools import izip_longest

# Read the file named in the question
with open('help.bsp', 'r') as f:
    fh = f.readlines()

# Process all the rows line by line, splitting on whitespace
rows = [line.strip().split() for line in fh]
# Use izip_longest to get all columns, with None's filled in blank spots
cols = list(izip_longest(*rows))
# Then run your type conversions for your final data lists
# (in the sample data, columns 2 and 3 repeat columns 0 and 1)
list_one = [float(i) for i in cols[0]]
list_two = [float(i) for i in cols[1]]
# Since you want " " instead of None for blanks
list_three = [i if i else " " for i in cols[4]]

Output:

>>> print list_one
[399.75833, 399.78316, 399.80799, 399.83282, 399.85765]
>>> print list_two
[561.572, 523.227, 455.923, 389.436, 289.804]
>>> print list_three
['a_Fe', ' ', 'a_Fe', ' ', ' ']
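
For readers on Python 3, here is a minimal sketch of the same idea using itertools.zip_longest with its fillvalue argument. The sample rows are inlined as a string so the sketch runs without help.bsp, and the helper name columns_from_text is made up for illustration; neither is part of the answer above.

```python
from itertools import zip_longest

# Sample rows copied from the question; inlined here (an assumption) so the
# sketch runs without the data file.
SAMPLE = """\
399.75833     561.572000000        399.75833     561.572000000  a_Fe I 399.73920 nm
399.78316     523.227000000        399.78316     523.227000000
399.80799     455.923000000        399.80799     455.923000000  a_Fe I 401.45340 nm
399.83282     389.436000000        399.83282     389.436000000
399.85765     289.804000000        399.85765     289.804000000
"""


def columns_from_text(text, fill=" "):
    """Split whitespace-delimited rows and transpose them, padding short rows.

    Hypothetical helper, named here only for illustration.
    """
    # Skip blank lines so they do not become empty rows
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # zip_longest pads the shorter rows with `fill` instead of truncating
    return [list(col) for col in zip_longest(*rows, fillvalue=fill)]


cols = columns_from_text(SAMPLE)
list_one = [float(v) for v in cols[0]]
list_two = [float(v) for v in cols[1]]
list_three = cols[4]
```

Passing fillvalue=" " removes the need to replace None afterwards, but it only suits columns that are kept as strings; a numeric column that contained a padded blank would fail in float().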

So, your lines are either whitespace delimited or comma delimited, and if comma delimited, the line contains no whitespace? (note that if len(splits)==1 is true, then splits[0]==line.strip() is also true). That's not the data you're showing, and not what you're describing.
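
The parenthetical can be checked with a small sketch. The helper name and the sample lines below are made up for illustration: whenever split() yields exactly one token, the stripped line contains no internal whitespace, so it must equal that token.

```python
def first_check_implies_second(line):
    """Return True when the second condition adds nothing to the first."""
    splits = line.split()
    if len(splits) != 1:
        # First condition is false, so the second is never evaluated
        return True
    return splits[0] == line.strip()


# Illustrative lines: single token, comma-delimited, two fields, empty, label
samples = ["399.75833", "  a,b,c \n", "399.75833     561.572000000", "", "a_Fe I"]
results = [first_check_implies_second(s) for s in samples]
```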

To get the lists you want from the data you show:

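with open('help.bsp') as h:
    # Split each non-blank line on whitespace
    the_list = [line.split() for line in h if line.strip()]
# The first two fields are numeric, so convert them from strings
list_one = [float(d[0]) for d in the_list]
list_two = [float(d[1]) for d in the_list]
# Short rows have no fifth field; substitute a single space
list_three = [d[4] if len(d) > 4 else ' ' for d in the_list]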

If you're reading comma separated (or similarly delimited) files, I always recommend using the csv module - it handles a lot of edge cases that you may not have considered.
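
As a rough sketch of what that could look like, the example below parses a hypothetical comma-separated version of the first three rows. The file shown in the question is whitespace-delimited, so the CSV text and the variable names here are assumptions made purely for illustration.

```python
import csv
import io

# Hypothetical comma-separated version of the first three rows (assumption:
# the real help.bsp is whitespace-delimited, so this is illustration only).
CSV_TEXT = (
    "399.75833,561.572000000,a_Fe I 399.73920 nm\n"
    "399.78316,523.227000000,\n"
    "399.80799,455.923000000,a_Fe I 401.45340 nm\n"
)

first, second, third = [], [], []
for row in csv.reader(io.StringIO(CSV_TEXT)):
    if not row:
        # csv.reader yields an empty list for a blank line
        continue
    first.append(float(row[0]))
    second.append(float(row[1]))
    # An empty trailing field arrives as ''; substitute the requested " "
    third.append(row[2].split()[0] if len(row) > 2 and row[2] else " ")
```

With a real file, the io.StringIO object would be replaced by a file opened with newline='', as the csv documentation recommends.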
