Python: ValueError: scatter requires y column to be numeric?

Question

Using pd.DataFrame does not convert columns of lists into a proper dataframe:

Below is the code I use to read a .mat file and convert it into a dataframe, based on the many examples I have seen.

However, when I inspect the columns, the values are still lists, with square brackets around the data. I am trying to convert this into a proper dataframe so that I can plot the last column (time data) against the other columns as an x-y scatter plot. The error I get is ValueError: scatter requires y column to be numeric. I have not shown the code for plotting the x and y data.

import mat4py as mp
import pandas as pd

data = mp.loadmat('test.mat')
df = pd.DataFrame(data)

When I type:

df.columns

I get the following:

Index(['col 1', 'col 2',
       'col 3', 'col 4', ... 'col 189'] with dtype='object', length=189)

If I type:

df['col 1']

I get the following:

Out[95]: 
0       [0.0]
1       [0.0]
2       [0.0]
3       [0.0]
4       [0.0]
5       [0.0]
6       [0.0]
7       [0.0]
8       [0.0]
9       [0.0] 
... 1622 rows in total.

I even tried applying .apply(pd.to_numeric, errors='coerce') to the columns, and that does not work either. What am I doing wrong?

Update: The solution in the comments below applies to a single column only, but I wanted it applied to every cell in the dataframe. Calling .apply(lambda ...) on the whole dataframe passes each column to the lambda, so the result is indexed by the column names and the shape of the dataframe is lost. I found a solution that applies the lambda to each cell and keeps the dataframe intact:

mm = df.applymap(lambda x: x[0])
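
For readers on newer pandas, here is a minimal sketch of the same element-wise idea. It uses a small made-up frame in place of the real 189-column data. From pandas 2.1 onward, DataFrame.applymap is deprecated in favor of DataFrame.map, so the sketch picks whichever method is available.

```python
import pandas as pd

# Made-up stand-in for the frame loaded from the .mat file:
# every cell is a one-element list.
df = pd.DataFrame({
    'col 1': [[0.0], [0.5], [1.0]],
    'col 2': [[2.0], [3.0], [4.0]],
})

# DataFrame.map replaces applymap from pandas 2.1 onward;
# fall back to applymap on older versions.
elementwise = df.map if hasattr(df, 'map') else df.applymap

# Unwrap each cell. The pd.to_numeric pass is only a guard in case
# a column comes back with object dtype.
mm = elementwise(lambda x: x[0]).apply(pd.to_numeric)

print(mm.dtypes)
```

The column labels and row index are preserved, so mm can be passed straight to the plotting code.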

Many thanks to those who provided the original lambda solution.

BENY · Accepted Answer · 2018-07-12 02:14:31Z

3

You can do

df=pd.DataFrame({'col 1':[[0.0],[0.0]]})
df
Out[49]: 
   col 1
0  [0.0]
1  [0.0]
df['col 1'].apply(lambda x : x[0])
Out[50]: 
0    0.0
1    0.0
Name: col 1, dtype: float64
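
To see why the scatter plot failed, and to confirm the fix, it can help to check whether a column is numeric before and after unwrapping. The sketch below uses made-up data; the column name 'time' is only an assumption standing in for the asker's time column.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Made-up data: each cell is a one-element list, as in the question.
df = pd.DataFrame({'col 1': [[0.0], [1.5]], 'time': [[0.0], [0.1]]})

# A column of lists has object dtype, which scatter rejects as non-numeric.
before = is_numeric_dtype(df['col 1'])

# Unwrap one column at a time and assign the result back.
for name in df.columns:
    df[name] = df[name].apply(lambda x: x[0])

after = is_numeric_dtype(df['col 1'])
print(before, after)
```

Once every column is numeric, a call such as df.plot.scatter(x='time', y='col 1') should no longer raise the ValueError (this needs matplotlib installed).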


6 Comments

BENY Over a year ago

@RafaelC yep, that may happen; we may need to convert them using ast

rafaelc Over a year ago

nice +1 for list solution :)

GusG Over a year ago

I just tried the lambda and it does indeed work. How would I apply this to each and every column in the dataframe?

BENY Over a year ago

@GusG assign it back ? df=df.apply(lambda x : x[0])

GusG Over a year ago

Wen, yes, I forgot to assign it back. In case you missed my edited comment, how do I apply this to all the columns in the dataframe in one pass?


rafaelc · Accepted Answer · 2018-07-12 02:17:22Z

2

If [0.0] are strings,

import ast
df.c.transform(ast.literal_eval).str[0]
1    0.0
2    0.0
3    0.0
4    0.0
5    0.0
6    0.0
7    0.0
8    0.0
9    0.0
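
If the cells really are strings, one way to cover every column and end up with floats is sketched below. The frame and its column names are made up for illustration, and the sketch assumes each string holds a one-element list.

```python
import ast
import pandas as pd

# Made-up frame where each cell is the string "[0.0]" rather than a list.
df = pd.DataFrame({'a': ['[0.0]', '[1.5]'], 'b': ['[2.0]', '[3.0]']})

def first_as_float(text):
    # Parse the string into a list, then take its first element as a float.
    return float(ast.literal_eval(text)[0])

# Apply the parser to every cell, one column at a time.
out = df.apply(lambda col: col.map(first_as_float).astype(float))
print(out)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for data read from a file.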


1 Comment

GusG Over a year ago

Wen/Rafael, the data is from Matlab, so what I will be plotting are float values using a scatter plot. How can I apply this transform to all 189 columns? And how do I convert what you just did to float values?
