Selecting rows based on criteria from another dataframe

Question

I have two DataFrames with different numbers of rows and columns that share at least one column of common information. Specifically, every StationCode is also a LocationCode:

dataframe1.head()

   DistanceToPrev LineCode  SeqNum StationCode           StationName  RailTime
0               0       RD       1         A15           Shady Grove         0
1           14151       RD       2         A14             Rockville         4
2           10586       RD       3         A13             Twinbrook         3
3            5895       RD       4         A12           White Flint         3
4            7309       RD       5         A11  Grosvenor-Strathmore         3
5           11821       RD       6         A10        Medical Center         3
6            5530       RD       7         A09              Bethesda         2
7            9095       RD       8         A08    Friendship Heights         3
8            4135       RD       9         A07         Tenleytown-AU         2
9            5841       RD      10         A06          Van Ness-UDC         2

dataframe2.head()

     Car Destination DestinationCode DestinationName Group Line LocationCode                LocationName  Min
0     8    Glenmont             B11        Glenmont     1   RD          A01                Metro Center  BRD
28    8    Glenmont             B11        Glenmont     1   RD          B01        Gallery Pl-Chinatown  ARR
35    6    Glenmont             B11        Glenmont     1   RD          A14                   Rockville    1
45    8    Glenmont             B11        Glenmont     1   RD          B02            Judiciary Square    2
62    6    Glenmont             B11        Glenmont     1   RD          B07                      Takoma    3
80    6    Glenmont             B11        Glenmont     1   RD          A13                   Twinbrook    4
82    8    Glenmont             B11        Glenmont     1   RD          B03               Union Station    4
95    6    Glenmont             B11        Glenmont     1   RD          B08               Silver Spring    5
114   8    Glenmont             B11        Glenmont     1   RD          B35              NoMa-Gallaudet    6
129   6    Glenmont             B11        Glenmont     1   RD          A12                 White Flint    7
143   8    Glenmont             B11        Glenmont     1   RD          B04  Rhode Island Ave-Brentwood    8

I want to keep only the rows in dataframe2 whose Min value is less than the RailTime value in dataframe1 for the station whose StationCode matches that row's LocationCode.

For example, the row labeled 80 in dataframe2 has LocationCode A13 and Min 4. In dataframe1, StationCode A13 has RailTime 3, and 4 is not less than 3, so that row should be excluded from dataframe2.

Conversely, the row labeled 35 in dataframe2 has LocationCode A14 and a Min value of 1, which is less than the RailTime value for A14 in dataframe1 (4), so it should be included.

Stefan · Accepted Answer · 2016-05-30 20:41:19Z

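A simple solution (with df1 and df2 standing for dataframe1 and dataframe2) would be:

# Attach each row's RailTime by matching LocationCode to StationCode
df2 = df2.merge(df1[['StationCode', 'RailTime']], left_on='LocationCode', right_on='StationCode')
# Keep only the rows whose Min is below the station's RailTime
df2 = df2[df2.Min < df2.RailTime]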
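
To try the merge-then-filter approach in isolation, here is a minimal sketch on toy frames built from a few of the sample rows in the question. It rests on one assumption not confirmed in the question: that Min is stored as text (because of values such as BRD and ARR), so it is converted with pd.to_numeric before comparing. The names merged, minutes, and result are illustrative only.

```python
import pandas as pd

# Toy frames using a few of the sample rows shown in the question.
dataframe1 = pd.DataFrame({
    "StationCode": ["A15", "A14", "A13", "A12"],
    "RailTime": [0, 4, 3, 3],
})
dataframe2 = pd.DataFrame(
    {
        "LocationCode": ["A01", "A14", "A13", "A12"],
        "Min": ["BRD", "1", "4", "7"],
    },
    index=[0, 35, 80, 129],
)

# Inner merge (the default): rows whose LocationCode has no matching
# StationCode, such as A01 here, are dropped.
merged = dataframe2.merge(
    dataframe1[["StationCode", "RailTime"]],
    left_on="LocationCode",
    right_on="StationCode",
)

# Assumption: Min holds strings, so coerce to numbers; non-numeric
# entries like "BRD" or "ARR" become NaN and never pass the comparison.
minutes = pd.to_numeric(merged["Min"], errors="coerce")

# Keep only rows whose Min is strictly less than the station's RailTime.
result = merged[minutes < merged["RailTime"]]
print(result)
```

Note that merge builds a new default index, so the original row labels from dataframe2 (35, 80, ...) are not kept in the result. If those labels matter, calling reset_index() on dataframe2 before merging preserves them as an ordinary column.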


1 Comment

Keith Over a year ago

Simple, yet effective, Stefan. It works perfectly. I'd vote, but not enough rep yet. Accepted.
