I have a DataFrame df:
I want: if Date1 > Date2, then id1, else id2
Output:
How can I do this without using loops? Any hints, please?
You can use numpy.where:
import numpy as np
df['output'] = np.where(df['Date1'].gt(df['Date2']), df['Id1'], df['Id2'])
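As a minimal, self-contained sketch of the line above: the sample values here are invented for illustration (the question's actual data is not shown), and the column names Date1, Date2, Id1 and Id2 are assumed to match the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df from the question is not shown.
df = pd.DataFrame({
    "Id1": [1, 2, 3],
    "Id2": [10, 20, 30],
    "Date1": pd.to_datetime(["2020-01-05", "2020-02-01", "2020-03-10"]),
    "Date2": pd.to_datetime(["2020-01-01", "2020-02-15", "2020-03-10"]),
})

# Pick Id1 where Date1 is strictly later than Date2, otherwise Id2.
# Rows with equal dates fall through to Id2.
df["output"] = np.where(df["Date1"].gt(df["Date2"]), df["Id1"], df["Id2"])
print(df["output"].tolist())  # [1, 20, 30]
```

If the date columns are stored as strings, it is probably safer to convert them with pd.to_datetime first so the comparison is chronological rather than textual.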
Steps -
Convert date1 and date2 to datetime objects using pd.to_datetime():
import pandas as pd
df['date1'] = pd.to_datetime(df.date1)
df['date2'] = pd.to_datetime(df.date2)
import numpy as np
n = np.empty(len(df))
# enumerate gives the row position, so this also works with a non-default index
for pos, (idx, row) in enumerate(df.iterrows()):
    # row has the row data
    # idx has the index label
    if row['date1'] > row['date2']:
        n[pos] = 1
    else:
        n[pos] = 0
df['output'] = n
There are shorthands for this code as well.
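One possible shorthand, sketched here with made-up sample dates, is to compare the two columns directly and cast the boolean result to integers. This produces the same 1/0 flag as the loop, assuming both columns are already datetimes.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "date1": pd.to_datetime(["2020-01-05", "2020-02-01"]),
    "date2": pd.to_datetime(["2020-01-01", "2020-02-15"]),
})

# The comparison yields a boolean Series; astype(int) turns True/False into 1/0.
df["output"] = (df["date1"] > df["date2"]).astype(int)
print(df["output"].tolist())  # [1, 0]
```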
id1 or id2? You can use df.apply
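A rough sketch of what the df.apply version could look like, assuming lowercase column names id1, id2, date1 and date2 and using invented sample values:

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "id1": [1, 2],
    "id2": [10, 20],
    "date1": pd.to_datetime(["2020-01-05", "2020-02-01"]),
    "date2": pd.to_datetime(["2020-01-01", "2020-02-15"]),
})

# axis=1 passes each row to the function as a Series.
df["output"] = df.apply(
    lambda row: row["id1"] if row["date1"] > row["date2"] else row["id2"],
    axis=1,
)
print(df["output"].tolist())  # [1, 20]
```

Note that apply with axis=1 still calls a Python function once per row, so on large frames it will usually be slower than the numpy.where approach.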