First of all, I am new to Python so I am not an expert...

Here's my issue. I have this dataframe:

   CODE_IRIS  PDL_RESIDENTIEL  PDL_TOTAL  CONSO_RESIDENTIEL  CONSO_TOTALE
0  10040101               500        510              11264         26677
1  10040102               806        809              16234         17318
2  10040201               921        925              14451         17065
3  10040202               937        943              13036         19516
4  10049999                94         94               1287          1287

The thing is, CODE_IRIS is an object column and is supposed to have 9 characters, like this:

       CODE_IRIS  PDL_RESIDENTIEL  PDL_TOTAL  CONSO_RESIDENTIEL  CONSO_TOTALE
17861  766810113              588        593               9344         14743

Therefore, I need to prepend a 0 whenever the value in CODE_IRIS has fewer than 9 characters, as I would in Excel with the formula =IF(LEN([@[Code IRIS]]) < 9; 0&[@[Code IRIS]]; [@[Code IRIS]]).

Now, when I try to find the values that have only 8 characters with elec.loc[elec['CODE_IRIS'].str.len() < 9], the result I get is:

Out[393]: 
Empty DataFrame
Columns: [CODE_IRIS, PDL_RESIDENTIEL, PDL_TOTAL, CONSO_RESIDENTIEL, CONSO_TOTALE]
Index: []

Then when I try to see how long each value is with elec['CODE_IRIS'].str.len(), the result I get is:

Out[396]: 
0       NaN
1       NaN
...
Name: CODE_IRIS, Length: 23905, dtype: float64

Yet the column CODE_IRIS is definitely an object, as you can see here:

elec.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 23905 entries, 0 to 23904
Data columns (total 5 columns):
CODE_IRIS            23905 non-null object
PDL_RESIDENTIEL      23905 non-null int64
PDL_TOTAL            23905 non-null int64
CONSO_RESIDENTIEL    23905 non-null int64
CONSO_TOTALE         23905 non-null int64
dtypes: int64(4), object(1)
memory usage: 1.1+ MB

I don't understand. Can someone explain to me what's wrong?

(I hope I have made myself as clear as possible.) Thanks!

1 Answer

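The column has dtype object, but its values appear to be integers rather than strings, which is why the .str methods return NaN instead of a length. Convert each value to a string and left-pad it with zfill: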

df['CODE_IRIS'] = df['CODE_IRIS'].map(lambda x: str(x).zfill(9))
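If you want to confirm the cause first, here is a minimal sketch, assuming the column really holds Python integers under an object dtype. The two-row frame is made-up sample data built from values in the question, and it uses the question's name elec instead of df. It checks the element types and then applies the same padding in vectorized form with astype(str).str.zfill(9), which should be equivalent to the map/lambda line above.

```python
import pandas as pd

# Hypothetical sample mirroring the question: integers stored in an object column.
elec = pd.DataFrame({"CODE_IRIS": pd.Series([10040101, 766810113], dtype=object)})

# Object dtype alone does not imply str; check the actual element types.
element_types = set(elec["CODE_IRIS"].map(type))
print(element_types)

# Convert to str first, then left-pad with zeros to a width of 9.
elec["CODE_IRIS"] = elec["CODE_IRIS"].astype(str).str.zfill(9)

# After the conversion, the .str accessor reports real lengths.
code_lengths = elec["CODE_IRIS"].str.len()
print(elec["CODE_IRIS"].tolist(), code_lengths.tolist())
```

Depending on the pandas version, calling a .str method on non-string values either returns NaN, as in the question, or raises an error, so the sketch converts to str before touching the accessor.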