Using pd.DataFrame won't convert columns of lists into a dataframe

Below is my code for reading in a .mat file and converting it into a dataframe, based on the many examples that I have seen.

However, when I read the data in the columns they still remain as lists with square brackets around the data. What I am trying to do is convert this into a proper dataframe so I can plot the last column (time data) along with the other columns as an x and y scatter plot. But the error I get is ValueError: scatter requires y column to be numeric. I have not shown the code for plotting the x and y data.

import mat4py as mp
import pandas as pd

data = mp.loadmat('test.mat')
df = pd.DataFrame(data)

When I type:

df.columns

I get the following:

Index(['col 1', 'col 2',
       'col 3', 'col 4', ... 'col 189'] with dtype='object', length=189)

If I type:

df['col 1']

I get the following:

Out[95]: 
0       [0.0]
1       [0.0]
2       [0.0]
3       [0.0]
4       [0.0]
5       [0.0]
6       [0.0]
7       [0.0]
8       [0.0]
9       [0.0] 
... 1622 rows in total.

I even tried calling .apply(pd.to_numeric, errors='coerce') on the columns, and that does not work either. What am I doing wrong?

Update: The solution presented in the comments below applied only to a single column, but I wanted it to apply to every cell in the dataframe. When .apply(lambda ..) is used on a whole dataframe, the lambda receives each column rather than each cell, so the columns become the index and the dataframe is messed up. I found a solution that properly applies the lambda to each cell and retains the dataframe. It is as follows:

mm = df.applymap(lambda x: x[0])

Many thanks to those who provided the original lambda solution.
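
For readers on a recent pandas, here is a minimal sketch of the same cell-by-cell unwrapping. The small frame and the column name "time" are hypothetical stand-ins for the data loaded from the .mat file. It assumes every cell is a one-element list, and it relies on DataFrame.map, the element-wise method in pandas 2.1 and later, falling back to applymap on older versions.

```python
import pandas as pd

# Hypothetical stand-in for the frame loaded from the .mat file:
# every cell is a one-element list, as in the question.
df = pd.DataFrame({
    "col 1": [[0.0], [0.5], [1.0]],
    "time": [[10.0], [20.0], [30.0]],
})

# DataFrame.map is the element-wise method in pandas 2.1 and later;
# older releases only provide DataFrame.applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# Unwrap each cell, then force a float dtype so plotting sees numbers.
mm = elementwise(lambda x: x[0]).astype(float)

print(mm.dtypes)
```

Once the columns are numeric, a call such as mm.plot.scatter(x="time", y="col 1") should no longer hit the "requires y column to be numeric" check.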

2 Answers

You can do

df=pd.DataFrame({'col 1':[[0.0],[0.0]]})
df
Out[49]: 
   col 1
0  [0.0]
1  [0.0]
df['col 1'].apply(lambda x : x[0])
Out[50]: 
0    0.0
1    0.0
Name: col 1, dtype: float64
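
To extend this to every column, one option is to repeat the same Series.apply call per column and rebuild the frame. This is only a sketch: the two-column frame below is made up for illustration, and it assumes each cell holds a one-element list.

```python
import pandas as pd

# Made-up example frame; each cell is a one-element list.
df = pd.DataFrame({
    "col 1": [[0.0], [1.0]],
    "col 2": [[2.0], [3.0]],
})

# Apply the single-column lambda to each column in turn.
# Series.apply calls the lambda once per cell, so x is a list here.
flat = pd.DataFrame({name: df[name].apply(lambda x: x[0]) for name in df.columns})

print(flat)
```

The per-column loop keeps the original index and column order, because each unwrapped Series carries the index of the column it came from.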

6 Comments

@RafaelC yep, that may happen; we may need to convert them using ast
nice +1 for list solution :)
I just tried the lambda and it does indeed work. How would I apply this to each and every column in the dataframe?
@GusG assign it back ? df=df.apply(lambda x : x[0])
Wen, yes, I forgot to assign it back. In case you missed my edited comment, how do I apply this to all the columns in the dataframe in one pass?

If [0.0] are strings,

import ast
df.c.transform(ast.literal_eval).str[0]
1    0.0
2    0.0
3    0.0
4    0.0
5    0.0
6    0.0
7    0.0
8    0.0
9    0.0
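
If the cells really are strings, the same idea can be applied to all columns and converted to float in one pass. The sketch below uses a hypothetical two-column frame and a hypothetical helper name, first_value, and it assumes every cell is the text of a one-element list such as "[0.0]".

```python
import ast

import pandas as pd

# Hypothetical frame where each cell is the string form of a list.
df = pd.DataFrame({
    "a": ["[0.0]", "[1.5]"],
    "b": ["[2.0]", "[3.25]"],
})

def first_value(cell):
    """Parse a string like '[0.0]' and return its first item as a float."""
    return float(ast.literal_eval(cell)[0])

# Parse every column cell by cell and rebuild the frame.
out = pd.DataFrame({name: df[name].apply(first_value) for name in df.columns})

print(out.dtypes)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for text that came from a file.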

1 Comment

Wen/Rafael, the data is from Matlab, so what I will be plotting are float values using a scatter plot. How can I apply this transform to all 189 columns? And how do I convert the result of what you just did to float values?
