Pandas DataFrame double transpose changes numeric types to object

Question

I'm reading the header and the data for a DataFrame from two separate locations in Excel (they are aligned properly but not adjacent). The header may contain many blanks, so I need to discard those blank headers and the corresponding data columns, leaving a final frame with only non-null headers and their data. The logic below, which uses transposition, works, but I lose the data types after the double transposition (see the example below). Two questions: 1) Any suggestions on how I can achieve this without transposition? 2) Is this how transposition is supposed to work? Shouldn't it infer the dtypes again on the second transposition?

In [25]:

import numpy as np
import pandas as pd

hd = pd.DataFrame({0: ['num'],
                   1: np.nan,
                   2: ['ltr']})
hd
Out[25]:
     0    1    2
0  num  NaN  ltr

In [26]:

data = pd.DataFrame({0: np.arange(3),
                     1: ['a', 'b', 'c'],
                     2: ['d', 'e', 'f']})
data
Out[26]:
   0  1  2
0  0  a  d
1  1  b  e
2  2  c  f

In [27]:

df = data.T[hd.iloc[0].notnull()].T
df.columns = hd.iloc[0].dropna()
df
Out[27]:
  num ltr
0   0   d
1   1   e
2   2   f

In [28]:

df.dtypes
Out[28]:
0
num    object
ltr    object
dtype: object

This is as expected: dtypes are column-based. You can use df.convert_objects() if you want to re-infer them.
– Jeff, Commented Jul 10, 2014 at 17:11

Jeff · Accepted Answer · 2014-07-10 17:23:49Z

3

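Transposition converts the dtypes to object when the frame has mixed dtypes to begin with. This is as expected, because dtypes are column-based: after the first transpose each column holds values of different types, so it can only be stored as object, and the second transpose does not re-infer anything. You can use df.convert_objects() if you want to re-infer them.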

However, just do this:

In [10]: data.loc[:,hd.iloc[0].notnull()]
Out[10]: 
   0  2
0  0  d
1  1  e
2  2  f

In [11]: data.loc[:,hd.iloc[0].notnull()].dtypes
Out[11]: 
0     int64
2    object
dtype: object
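
For completeness, here is a minimal sketch of the whole task from the question (select the columns with a non-null header, then rename them) without any transpose. The frames `hd` and `data` are rebuilt from the question; converting the mask with `to_numpy()` is my own assumption that header and data should be matched by position rather than by column label.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question.
hd = pd.DataFrame({0: ['num'], 1: np.nan, 2: ['ltr']})
data = pd.DataFrame({0: np.arange(3),
                     1: ['a', 'b', 'c'],
                     2: ['d', 'e', 'f']})

# True where the header cell is present, False where it is blank.
mask = hd.iloc[0].notnull()

# Select columns by position (assumption: header and data line up
# positionally). No transpose is involved, so each column keeps its dtype.
df = data.loc[:, mask.to_numpy()].copy()

# Use the non-null header values as the new column names.
df.columns = hd.iloc[0].dropna().tolist()

print(df.dtypes)
```

Because the selection operates on whole columns, the numeric column stays numeric and no re-inference step is needed afterwards.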

edited Jul 10, 2014 at 17:23

answered Jul 10, 2014 at 17:15

Jeff


2 Comments

Asclepius Over a year ago

convert_objects has been deprecated.

Sighonide Over a year ago

As above, one can now use: pandas.DataFrame.infer_objects
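
A short sketch of what that looks like, assuming a pandas version that provides `DataFrame.infer_objects` (0.21 or later). The `data` frame is the one from the question; the names `round_trip` and `restored` are only illustrative.

```python
import numpy as np
import pandas as pd

# Frame rebuilt from the question: one integer column, two string columns.
data = pd.DataFrame({0: np.arange(3),
                     1: ['a', 'b', 'c'],
                     2: ['d', 'e', 'f']})

# The first transpose mixes ints and strings in each column, forcing object
# dtype; the second transpose keeps object dtype for every column.
round_trip = data.T.T
print(round_trip.dtypes)

# infer_objects() attempts to find a better dtype for each object column.
restored = round_trip.infer_objects()
print(restored.dtypes)
```

This only repairs the dtypes after the fact; selecting the columns directly, as in the answer, avoids the round trip altogether.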
