It's because your dtypes are being changed after each assignment:
In [7]: df = pd.DataFrame(np.arange(15).reshape(5,3),columns=list('ABC'))
In [8]: df.dtypes
Out[8]:
A int32
B int32
C int32
dtype: object
In [9]: df.loc[5,:] = None
In [10]: df.dtypes
Out[10]:
A float64
B float64
C float64
dtype: object
In [11]: df.loc[:1, 'C'] = 'nan'
After that last assignment the C column has been implicitly converted to object (string) dtype:
In [12]: df.dtypes
Out[12]:
A float64
B float64
C object
dtype: object
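If you would rather not depend on these implicit conversions, you can do the casts yourself before assigning. This is only a minimal sketch: the explicit astype() calls are my suggestion, not something from the question, and as far as I know newer pandas releases are stricter about silent upcasts on assignment, so being explicit should be the more portable route.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(15).reshape(5, 3), columns=list('ABC'))

# Make the int -> float conversion explicit before storing a missing value.
df = df.astype('float64')
df.loc[4, 'A'] = np.nan

# Make the float -> object conversion explicit before storing a string.
df['C'] = df['C'].astype(object)
df.loc[0, 'C'] = 'nan'

print(df.dtypes)
```

Every dtype change is now visible in the code instead of happening as a side effect of the assignment.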
@ayhan wrote a very neat explanation in a comment:
I think the main reason is for numerical columns, when you insert None
or np.nan, it is converted to np.nan to have a Series of type float.
For objects, it takes whatever is passed (if None, it uses None; if
np.nan, it uses np.nan; see the docs)
(c) ayhan
Here is a corresponding demo:
In [39]: df = pd.DataFrame(np.arange(15).reshape(5,3),columns=list('ABC'))
In [40]: df.loc[4, 'A'] = None
In [41]: df.loc[4, 'C'] = np.nan
In [42]: df
Out[42]:
     A   B     C
0  0.0   1   2.0
1  3.0   4   5.0
2  6.0   7   8.0
3  9.0  10  11.0
4  NaN  13   NaN
In [43]: df.dtypes
Out[43]:
A float64
B int32
C float64
dtype: object
In [44]: df.loc[0, 'C'] = 'a string'
In [45]: df
Out[45]:
     A   B         C
0  0.0   1  a string
1  3.0   4         5
2  6.0   7         8
3  9.0  10        11
4  NaN  13       NaN
In [46]: df.dtypes
Out[46]:
A float64
B int32
C object
dtype: object
Now we can store both None and np.nan in the object-dtype column:
In [47]: df.loc[1, 'C'] = None
In [48]: df.loc[2, 'C'] = np.nan
In [49]: df
Out[49]:
     A   B         C
0  0.0   1  a string
1  3.0   4      None
2  6.0   7       NaN
3  9.0  10        11
4  NaN  13       NaN
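The same rule can be checked without any assignment at all, using the Series constructor. This is just a small sketch; the names num and obj are mine, and dtype=object is passed explicitly so the result does not depend on how a given pandas version infers string data.

```python
import numpy as np
import pandas as pd

# Numeric data: None is converted to NaN and the Series becomes float64.
num = pd.Series([1, 2, None])
print(num.dtype)                  # float64

# Object data: None and np.nan are both stored exactly as passed.
obj = pd.Series(['a string', None, np.nan], dtype=object)
print(obj[1] is None)             # True
print(isinstance(obj[2], float))  # True (this element is NaN)

# pd.isna() treats both as missing, so checks do not need to care which one is stored.
print(pd.isna(obj).tolist())      # [False, True, True]
```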
UPDATE: starting from Pandas 0.20.1 the .ix indexer is deprecated, in favor of the more strict .iloc and .loc indexers.
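For reference, here is a hedged sketch of how a mixed .ix selection such as rows up to label 1 in the third column can be written with the stricter indexers. It assumes the default integer index and the A/B/C columns used in this answer; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(15).reshape(5, 3), columns=list('ABC'))

# Label-based: rows labelled 0 through 1 (end label is inclusive), column 'C'.
by_label = df.loc[:1, 'C']

# Position-based: first two rows (end position is exclusive), third column.
by_position = df.iloc[:2, 2]

print(by_label.equals(by_position))  # True
```

Note the different slice ends: .loc includes the end label, while .iloc excludes the end position.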
np.nan df.ix[:1,2] = np.nan then, df.ix[5,:] = None will work as expected for you because C column will be a float so not sure what you mean. It seems like MaxU edited it in his accepted answer too...