I was trying to merge two dataframes on a less-than condition, but I ended up using pandasql instead.

Is it possible to run the query below using only pandas functions? (Records may be duplicated, but that is fine, as I plan to compute something like a cumulative total later.)

from pandasql import sqldf

sql = '''select A.Name, A.Code, B.edate
         from df1 A
         inner join df2 B
             on A.Name = B.Name and A.Code = B.Code
         where A.edate < B.edate'''

df4 = sqldf(sql)

The suggested answer seems similar, but I couldn't get the expected result from it. Also, the answer below looks very crisp.

1 Answer

Use:

df = df1.merge(df2, on=['Name','Code']).query('edate_x < edate_y')[['Name','Code','edate_y']]
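To see why the one-liner works, here is a minimal sketch on made-up data. The contents of df1 and df2 are assumptions, since the question does not show them; the final rename back to edate is also my addition, to mirror the B.edate column of the SQL output. The merge is an ordinary equi-join on Name and Code, and because both frames have an edate column, pandas suffixes them as edate_x (from df1) and edate_y (from df2). The query call then applies the less-than condition as a filter on the joined rows.

```python
import pandas as pd

# Hypothetical sample data; the original post does not show df1 or df2.
df1 = pd.DataFrame({
    "Name": ["a", "a", "b"],
    "Code": [1, 1, 2],
    "edate": pd.to_datetime(["2020-01-01", "2020-03-01", "2020-02-01"]),
})
df2 = pd.DataFrame({
    "Name": ["a", "a", "b"],
    "Code": [1, 1, 2],
    "edate": pd.to_datetime(["2020-02-01", "2020-04-01", "2020-01-15"]),
})

# Step 1: inner equi-join on the shared keys. The overlapping "edate"
# column gets the default suffixes: edate_x (left) and edate_y (right).
merged = df1.merge(df2, on=["Name", "Code"])

# Step 2: keep only the pairs where the left date precedes the right date,
# then select the columns the SQL query returns.
result = (
    merged.query("edate_x < edate_y")[["Name", "Code", "edate_y"]]
    .rename(columns={"edate_y": "edate"})  # optional: match the SQL column name
    .reset_index(drop=True)
)

print(result)
```

Note that the merge materializes every matching pair for each (Name, Code) group before the filter runs, so the intermediate frame can be much larger than the final result when groups are big.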

7 Comments

jezrael, could you add this method to How to do/workaround a conditional join in python Pandas?
@smci - I think it is something else, so not dupe, only similar.
df.query is very useful but under-used. +1
@jezrael, perhaps, but that other question could sorely use your mention of df.query(), so I recommend you add it there too.
@sjd - Unfortunately, pandas does not handle this as nicely as SQL libraries do :(