0

I have a column in a pandas DataFrame where every row holds a list of numbers, but the numbers are stored as strings.

The column name is allmz, the dataframe is exp_df

print exp_df.iloc[:3]['allmz']
>2129    [445.120788574, 355.067465091, 355.069260997, ...
>2130    [445.120758057, 355.06748396, 355.069279865, 3...
>2131    [445.120880127, 355.067417985, 355.06921389, 3...
>Name: allmz, dtype: object

I tried to convert each number with iteritems, but the type is still str, even though I assign mzz = float(mzz).

for ind, mzlist in exp_df['allmz'].iteritems():
    for mzz in mzlist:
        mzz = float(mzz)
print type(exp_df.iloc[0]['allmz'][0])
><type 'str'>

Each list comes from exp_df['allmz'] = exp_df['allmz'].apply(lambda x: x.split(" ")), so I tried converting during the split:

exp_df[each] = exp_df[each].apply(lambda x: float(y) for y in x.split(" "))

But I guess a lambda cannot be combined with a for loop this way. How can I access and convert the strings in the list in each row of a pandas DataFrame?

1
  • 1
    No, mzz = float(mzz) does not do anything to the series. It only changes the object that mzz references. Variables != Objects. Commented Jan 18, 2018 at 13:05
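To make the comment above concrete, here is a minimal sketch on a plain Python list. The values are made up for illustration, and `unchanged_type` and `changed_type` are just names chosen for this example.

```python
# Illustrative values only; any list of numeric strings behaves the same way.
mzlist = ["445.12", "355.06"]

# Rebinding the loop variable: mzz now names a new float object,
# but the list still holds the original strings.
for mzz in mzlist:
    mzz = float(mzz)
unchanged_type = type(mzlist[0])  # still str

# Writing back by index replaces the elements stored in the list itself.
for i, mzz in enumerate(mzlist):
    mzlist[i] = float(mzz)
changed_type = type(mzlist[0])  # now float
```

The second loop works because it assigns into the list, not to the loop variable.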

4 Answers 4

2

Use a list comprehension inside apply, for example:

Setup

import pandas as pd

m = pd.DataFrame(['1.2,2.3,3.4,4.5,6.5'], columns=['numbers'])
m['numbers'] = m['numbers'].str.split(',')
print(m['numbers'])
0    [1.2, 2.3, 3.4, 4.5, 6.5]
Name: numbers, dtype: object

Applying list comprehension

m['numbers'] = m['numbers'].apply(lambda x : [float(i) for i in x])

type(m.loc[0,'numbers'][0])
float

1

I think you need to add [] to make it a list comprehension. split(' ') can also be simplified to split(), because the default separator is whitespace:

exp_df[each] = exp_df[each].apply(lambda x: [float(y) for y in x.split()])

But it is much better to create separate columns if possible:

exp_df = exp_df.join(exp_df[each].str.split(expand=True).astype(float))
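As a rough sketch of what the expand=True version produces, the example below uses a small stand-in DataFrame. The name `df` and its values are assumptions for illustration, and every row is assumed to contain the same number of values.

```python
import pandas as pd

# Toy data standing in for exp_df; the values are made up.
df = pd.DataFrame({"allmz": ["445.12 355.06 355.07", "445.13 355.08 355.09"]})

# expand=True returns a DataFrame with one column per token,
# so astype(float) can convert every column at once.
wide = df["allmz"].str.split(expand=True).astype(float)

# join keeps the original string column next to the new numeric columns 0, 1, 2.
joined = df.join(wide)
```

Numeric columns like these can then be used with vectorized pandas operations, which is not possible while the values sit inside Python lists.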


0

I found one way, but it is not very nice: I convert the list produced by split to a NumPy array and then cast it to float.

exp_df['allmz'].apply(lambda x: np.asarray(x.split(" ")).astype(float))

I am still looking for a better-structured solution. I believe there should be a way to do this with a for loop.
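A for loop can do this if it builds new lists and assigns them back, instead of rebinding the loop variable. The sketch below assumes the column already holds lists of strings (as after the split). `df` and `converted` are illustrative names, and the values are made up.

```python
import pandas as pd

# Toy stand-in for exp_df after the split step.
df = pd.DataFrame({"allmz": ["445.12 355.06", "445.13 355.08"]})
df["allmz"] = df["allmz"].str.split()

# Build one converted list per row in an explicit loop.
converted = []
for mzlist in df["allmz"]:
    converted.append([float(mzz) for mzz in mzlist])

# Wrap the result in a Series that shares the DataFrame's index, then assign.
df["allmz"] = pd.Series(converted, index=df.index)
```

After this, type(df.loc[0, "allmz"][0]) is float.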


0

Jan,

you can use a simple list comprehension: exp_df[each] = exp_df[each].apply(lambda array: [float(y) for y in array])

You can also do the string parsing and the conversion in one step with numpy.fromstring (it returns a NumPy array instead of a list): exp_df['allmz'].apply(lambda s: np.fromstring(s, sep=" "))
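A small sketch of the numpy.fromstring approach, applied to the original space-separated strings before any split. `df`, `arrays`, and `first` are illustrative names, and the values are made up.

```python
import numpy as np
import pandas as pd

# Toy stand-in for exp_df with the raw space-separated strings.
df = pd.DataFrame({"allmz": ["445.12 355.06", "445.13 355.08"]})

# With a non-empty sep, fromstring parses text and returns a float array by default.
arrays = df["allmz"].apply(lambda s: np.fromstring(s, sep=" "))
first = arrays.iloc[0]
```

Each cell then holds a float64 NumPy array, which supports elementwise arithmetic directly.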
