Sorry if the title is unclear; I wasn't sure how to word it. I have a dataframe with two columns, one for old IDs and one for new IDs.
import pandas as pd
df = pd.DataFrame({'old_id': ['111', '2222', '3333', '4444'], 'new_id': ['5555', '6666', '777', '8888']})
I'm trying to find a way to check the string length of every value in each column and return any IDs that don't match the required length of 4 in a new dataframe. This will eventually be turned into a dictionary of incorrect IDs.
This is the approach I'm currently taking:
incorrect_id_df = df[df.applymap(lambda x: len(x) != 4)]
and the current output:
  old_id new_id
0    111    NaN
1    NaN    NaN
2    NaN    777
3    NaN    NaN
I'm not sure where to go from here, and I'm sure there's a much better approach. This is the output I'm looking for: a single-column dataframe, with the column named id, containing only the IDs that don't match the required string length:
id
111
777
.stack() that dataframe and use its .values attribute as an invalid list... but then do you still have a reference to which column it was found in?
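For what it's worth, here is a rough sketch of the .stack() idea from the comment above, not necessarily the best approach. It assumes every ID is a string, as in the example dataframe. The names bad_mask, stacked and incorrect_by_column are made up for illustration. The explicit dropna() is there because whether stack() drops NaN on its own depends on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({'old_id': ['111', '2222', '3333', '4444'],
                   'new_id': ['5555', '6666', '777', '8888']})

# Boolean mask: True where the ID is not exactly 4 characters long.
# (Assumes all values are strings.)
bad_mask = df.apply(lambda col: col.str.len() != 4)

# Masking keeps the offending IDs and turns everything else into NaN;
# stack() then flattens the frame into a Series indexed by
# (row label, column name). dropna() removes any NaN that is left.
stacked = df[bad_mask].stack().dropna()

# Single-column dataframe named "id", matching the desired output.
incorrect_id_df = stacked.reset_index(drop=True).to_frame('id')

# The source column is still available in the second index level,
# so a dictionary keyed by column name can be built from it.
incorrect_by_column = {
    col: ids.tolist() for col, ids in stacked.groupby(level=1)
}

print(incorrect_id_df)
print(incorrect_by_column)
```

So the column reference is only lost once you call .values (or reset the index); as long as you keep the stacked Series around, the index still tells you where each ID came from.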