I'm trying to validate whether ZIP codes for many different countries, stored in a table, are in the correct format. For example:
| ZIP | COUNTRY_CODE |
|---|---|
| 1033 SC | NL |
| 60593 | DE |
To do that, I have a separate DataFrame with the country code and a regular expression describing the valid ZIP code format for each country:
| REGEX | COUNTRY |
|---|---|
| \d{4}[ ]?[A-Z]{2} | NL |
| \d{5} | DE |
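For reproducibility, the two tables could be built roughly like this. It is only a minimal sketch: the column names are taken from the tables above, the variable names `zip_df` and `regex_df` are the ones used in the merge code in this post, and the third ZIP row is the invalid example from the expected output.

```python
import pandas as pd

# ZIP codes to validate, one row per entry (the last one is deliberately invalid for DE).
zip_df = pd.DataFrame({
    "ZIP": ["1033 SC", "60593", "6059TT"],
    "COUNTRY_CODE": ["NL", "DE", "DE"],
})

# One regular expression per country; raw strings keep the backslashes intact.
regex_df = pd.DataFrame({
    "REGEX": [r"\d{4}[ ]?[A-Z]{2}", r"\d{5}"],
    "COUNTRY": ["NL", "DE"],
})
```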
I'm trying to merge these two tables on the country code and then create a column that indicates, as True or False, whether the ZIP code matches the regex for its country.
Here is my current code:
df_merged = pd.merge(regex_df, zip_df, left_on='CODE', right_on='COUNTRY_CODE')
df_merged['zip_correct'] = df_merged.CODE_y.str.contains(df_merged.REGEX.str, regex=True, na=False)
However, I'm getting only False results, since pandas seems to check against the whole column of regex patterns rather than only the pattern in the same row. How can I make it test each ZIP code against the regex from its own row?
Expected output:
| ZIP | COUNTRY_CODE | ZIP_CORRECT |
|---|---|---|
| 1033 SC | NL | TRUE |
| 60593 | DE | TRUE |
| 6059TT | DE | FALSE |
Could you please help?
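
One possible approach, sketched here under assumptions rather than as a definitive solution: `Series.str.contains` takes a single pattern for the whole column, so a per-row check can instead be done with `DataFrame.apply(..., axis=1)` and Python's `re` module. The column names below follow the tables in this post, the helper `zip_matches` is a hypothetical name, and `re.fullmatch` is an assumed choice so that the entire ZIP code has to match the pattern, not just part of it.

```python
import re
import pandas as pd

zip_df = pd.DataFrame({
    "ZIP": ["1033 SC", "60593", "6059TT"],
    "COUNTRY_CODE": ["NL", "DE", "DE"],
})
regex_df = pd.DataFrame({
    "REGEX": [r"\d{4}[ ]?[A-Z]{2}", r"\d{5}"],
    "COUNTRY": ["NL", "DE"],
})

# A left merge keeps every ZIP row, even when its country has no pattern.
df_merged = zip_df.merge(
    regex_df, how="left", left_on="COUNTRY_CODE", right_on="COUNTRY"
)

def zip_matches(row) -> bool:
    """Return True if the row's ZIP fully matches the REGEX from the same row."""
    if pd.isna(row["REGEX"]) or pd.isna(row["ZIP"]):
        return False  # no pattern or no ZIP: treated as invalid (an assumption)
    return re.fullmatch(row["REGEX"], row["ZIP"]) is not None

# axis=1 calls the function once per row, so each ZIP sees only its own pattern.
df_merged["ZIP_CORRECT"] = df_merged.apply(zip_matches, axis=1)

print(df_merged[["ZIP", "COUNTRY_CODE", "ZIP_CORRECT"]])
```

A row-wise `apply` is slower than vectorized string methods. If that becomes a problem on large tables, grouping by `REGEX` and calling `str.fullmatch` once per distinct pattern may be a faster alternative.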