I would really appreciate some help with this. I have two pandas DataFrames.

`df1` holds the patients' hospital admission records:

| patient_key | admission_dtm | admission_key |
|---|---|---|
| P001 | 41765 | P001-001 |
| P001 | 42223 | P001-002 |
| P001 | 42681 | P001-003 |
| P001 | 43139 | P001-004 |
| P001 | 43597 | P001-005 |
| P001 | 44055 | P001-006 |

`df2` holds the patients' outpatient appointment records:

| patient_key | appointment_dtm |
|---|---|
| P001 | 41645 |
| P001 | 41687 |
| P001 | 41717 |
| P001 | 42162 |
| P001 | 42193 |
| P001 | 42497 |

What I want to do is assign each outpatient appointment to the next admission that follows it for the same patient. For example, patient P001 has 3 outpatient appointments before admission P001-001.

The expected outcome in `df2` would look like this:

| patient_key | appointment_dtm | admission_key |
|---|---|---|
| P001 | 41645 | P001-001 |
| P001 | 41687 | P001-001 |
| P001 | 41717 | P001-001 |
| P001 | 42162 | P001-002 |
| P001 | 42193 | P001-002 |
| P001 | 42497 | P001-003 |

I have tried a naive nested loop like this:

    df2['admission_key'] = ''
    for i in df2.index:
        for j in df1.index:
            # df1 is sorted by admission_dtm, so the first later admission is the match
            if (df2.loc[i, 'patient_key'] == df1.loc[j, 'patient_key']
                    and df2.loc[i, 'appointment_dtm'] < df1.loc[j, 'admission_dtm']):
                df2.loc[i, 'admission_key'] = df1.loc[j, 'admission_key']
                break

However, the real data is very large, so this takes a very long time to run. Is there a smarter way to do this? Thank you so much.
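
One candidate that might avoid the loops is `pandas.merge_asof` with `direction="forward"`, which pairs each left row with the nearest later row on the right within each `by` group. Below is a minimal sketch on the sample data from this question. It assumes the two `*_dtm` columns share the same numeric (or datetime) dtype, and that an appointment with the same timestamp as an admission should not count as "before" it (hence `allow_exact_matches=False`, which is my assumption about the intended rule).

```python
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({
    "patient_key": ["P001"] * 6,
    "admission_dtm": [41765, 42223, 42681, 43139, 43597, 44055],
    "admission_key": [f"P001-00{n}" for n in range(1, 7)],
})
df2 = pd.DataFrame({
    "patient_key": ["P001"] * 6,
    "appointment_dtm": [41645, 41687, 41717, 42162, 42193, 42497],
})

# merge_asof requires both frames to be sorted by their "on" keys.
result = pd.merge_asof(
    df2.sort_values("appointment_dtm"),
    df1.sort_values("admission_dtm"),
    left_on="appointment_dtm",
    right_on="admission_dtm",
    by="patient_key",            # only match rows of the same patient
    direction="forward",         # take the first admission after the appointment
    allow_exact_matches=False,   # assumption: strictly before the admission
)

# Keep only the columns shown in the expected outcome.
result = result.drop(columns="admission_dtm")
print(result)
```

Note that the result comes back sorted by `appointment_dtm` with a fresh index, and appointments with no later admission for that patient get `NaN` in `admission_key`.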