I am referring to this article:

https://kanoki.org/2019/07/04/pandas-difference-between-two-dataframes/

I don't understand this particular syntax for loc, where a lambda does the row filtering:

df = df1.merge(df2, how='outer', indicator=True).loc[lambda x: x['_merge'] == 'left_only']

What is this lambda doing? I know the end result; I'm just trying to understand the use of lambdas in the "loc" syntax.

It would make more sense to use .query('_merge == "left_only"') here; it's also more readable. I don't like using a lambda for filtering. Commented Jul 27, 2020 at 14:42

1 Answer

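loc accepts (among other things) a one-argument callable. The callable is called once, with the object being indexed as its argument (here, the merged DataFrame), and is expected to return something that is valid for indexing (in this case, a boolean Series with one entry per row).

Effectively, this syntax means "call the lambda on the merged DataFrame x, build the boolean mask x['_merge'] == 'left_only', and keep the rows where the mask is True". The lambda is needed because the merged DataFrame has no name inside the method chain, so there is nothing else to build the mask from.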
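A minimal sketch of what the chained call does, next to an equivalent two-step form. The contents of df1 and df2 are made up for illustration and are not from the question; the intent is only to show that loc passes the whole merged DataFrame to the lambda and uses the boolean Series it returns as a mask.

```python
import pandas as pd

# Hypothetical sample data, assumed for illustration only.
df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "name": ["b", "c", "d"]})

# indicator=True adds a "_merge" column with the values
# "left_only", "right_only", or "both".
merged = df1.merge(df2, how="outer", indicator=True)

# Chained form from the question: loc calls the lambda once,
# with the merged DataFrame as x, and indexes with the returned mask.
chained = merged.loc[lambda x: x["_merge"] == "left_only"]

# Equivalent two-step form with an explicit boolean mask.
mask = merged["_merge"] == "left_only"
explicit = merged.loc[mask]

print(chained)
```

With this sample data, both forms should select the single row that exists only in df1.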


2 Comments

Is it possible to filter the result of the merge without a lambda, while still chaining methods as in the example?
Yes. As @Erfan pointed out in the comment to the question, you can use .query('_merge == "left_only"') to obtain the same result.
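For comparison, here is a short sketch of the query-based chain alongside the lambda-based one. The sample frames are again hypothetical, assumed only for illustration.

```python
import pandas as pd

# Hypothetical sample data, assumed for illustration only.
df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "name": ["b", "c", "d"]})

# Filter inside the chain with query instead of a lambda.
via_query = (
    df1.merge(df2, how="outer", indicator=True)
       .query('_merge == "left_only"')
)

# The lambda-based chain from the question, for comparison.
via_lambda = (
    df1.merge(df2, how="outer", indicator=True)
       .loc[lambda x: x["_merge"] == "left_only"]
)

print(via_query)
```

Both chains should return the same rows; which one reads better is largely a matter of taste.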
