I'm reading a header and the data for a DataFrame from two separate locations in Excel (they are aligned properly but not adjacent). The header may contain many blanks, so I need to discard the blank headers and the corresponding columns in the data, leaving a final frame with only non-null headers and the data for those headers. The logic below, which uses transposition, works, but I lose the data types after the double transposition (see the specific example below). Two questions: 1) Any suggestions on how I can achieve this without transposition? 2) Is this how transposition is supposed to work? Shouldn't it infer the dtypes again on the second transposition?
In [25]:
hd=pd.DataFrame({0:['num'],
1:np.nan,
2:['ltr']})
hd
Out[25]:
0 1 2
0 num NaN ltr
In [26]:
data=pd.DataFrame({0:np.arange(3),
1:['a','b','c'],
2:['d','e','f']})
data
Out[26]:
0 1 2
0 0 a d
1 1 b e
2 2 c f
In [27]:
df=data.T[hd.iloc[0].notnull()].T
df.columns=hd.iloc[0].dropna()
df
Out[27]:
num ltr
0 0 d
1 1 e
2 2 f
In [28]:
df.dtypes
Out[28]:
0
num object
ltr object
dtype: object
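Use df.convert_objects() if you want to re-infer them. Note that convert_objects() was deprecated and later removed from pandas; in current versions, df.infer_objects() is the replacement for this purpose.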
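For question 1, here is a minimal sketch that avoids transposition altogether by selecting columns with a boolean mask built from the header row. Transposing a frame with mixed dtypes puts integers and strings into the same column, so every column becomes object and stays object after transposing back. The names `mask` and `roundtrip` are only illustrative, and `infer_objects()` assumes a pandas version that provides it (0.21 or later).

```python
import numpy as np
import pandas as pd

# Same example frames as in the question.
hd = pd.DataFrame({0: ['num'], 1: np.nan, 2: ['ltr']})
data = pd.DataFrame({0: np.arange(3),
                     1: ['a', 'b', 'c'],
                     2: ['d', 'e', 'f']})

# Boolean mask over the header row: True where a header is present.
mask = hd.iloc[0].notnull().to_numpy()

# Select columns positionally with the mask; no transpose, so each
# column keeps its original dtype.
df = data.loc[:, mask].copy()
df.columns = hd.iloc[0].dropna().to_numpy()

print(df.dtypes)

# If a frame has already been through a double transpose, the dtypes
# can be re-inferred afterwards.
roundtrip = data.T.T.infer_objects()
print(roundtrip.dtypes)
```

With this approach the `num` column should stay an integer column, because the data never passes through an all-object intermediate frame.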