First of all, I am new to Python, so I am far from an expert.
Here's my issue. I have this dataframe:
   CODE_IRIS  PDL_RESIDENTIEL  PDL_TOTAL  CONSO_RESIDENTIEL  CONSO_TOTALE
0   10040101              500        510              11264         26677
1   10040102              806        809              16234         17318
2   10040201              921        925              14451         17065
3   10040202              937        943              13036         19516
4   10049999               94         94               1287          1287
The thing is, CODE_IRIS has dtype object and is supposed to have 9 characters, like this:
       CODE_IRIS  PDL_RESIDENTIEL  PDL_TOTAL  CONSO_RESIDENTIEL  CONSO_TOTALE
17861  766810113              588        593               9344         14743
Therefore, I need to prepend a 0 whenever the value in CODE_IRIS has fewer than 9 characters, as I would in Excel with the formula =IF(LEN([@[Code IRIS]]) < 9; 0&[@[Code IRIS]]; [@[Code IRIS]]).
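For reference, the Excel formula corresponds roughly to the pandas sketch below. It assumes the codes can be converted to strings with astype(str) first, which I have not verified on my real data. The small frame `demo` is a hypothetical stand-in built from a few of the values shown above.

```python
import pandas as pd

# Hypothetical stand-in for the real frame; values copied from the rows shown above.
demo = pd.DataFrame({"CODE_IRIS": [10040101, 10040102, 766810113]})

# Assumption: convert every code to a string, then left-pad with zeros to 9 characters.
padded = demo["CODE_IRIS"].astype(str).str.zfill(9)

print(padded.tolist())
# ['010040101', '010040102', '766810113']
```

Values that already have 9 characters are left unchanged, which matches the "else" branch of the Excel formula.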
Now, when I try to locate the values with fewer than 9 characters using elec.loc[elec['CODE_IRIS'].str.len() < 9], the result I get is:
Out[393]:
Empty DataFrame
Columns: [CODE_IRIS, PDL_RESIDENTIEL, PDL_TOTAL, CONSO_RESIDENTIEL, CONSO_TOTALE]
Index: []
Then, when I check the length of each value with elec['CODE_IRIS'].str.len(), the result I get is:
Out[396]:
0 NaN
1 NaN
...
Name: CODE_IRIS, Length: 23905, dtype: float64
Yet the column CODE_IRIS definitely has dtype object, as you can see here:
elec.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 23905 entries, 0 to 23904
Data columns (total 5 columns):
CODE_IRIS 23905 non-null object
PDL_RESIDENTIEL 23905 non-null int64
PDL_TOTAL 23905 non-null int64
CONSO_RESIDENTIEL 23905 non-null int64
CONSO_TOTALE 23905 non-null int64
dtypes: int64(4), object(1)
memory usage: 1.1+ MB
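Since dtype object only says that the column holds arbitrary Python objects, not necessarily strings, a check like the one below might help narrow things down. This is only a sketch: `demo` is a hypothetical frame in which the object column holds integers, and I am merely assuming my real data could be in a similar state.

```python
import pandas as pd

# Hypothetical stand-in: an object-dtype column whose elements are ints, not strings.
demo = pd.DataFrame({"CODE_IRIS": pd.Series([10040101, 10040102], dtype=object)})

# info() would report this column as "object", just like in the output above.
print(demo["CODE_IRIS"].dtype)

# Count the actual Python type of each element stored in the column.
type_counts = demo["CODE_IRIS"].map(type).value_counts()
print(type_counts)
```

Running the same map(type).value_counts() on elec['CODE_IRIS'] would show whether the elements are really str.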
I don't understand. Can someone explain to me what's wrong?
(I hope I have made myself clear.) Thanks!