I have two DataFrames with different numbers of rows and columns that share at least one column of common information. Specifically, every StationCode is also a LocationCode:

dataframe1.head()

   DistanceToPrev LineCode  SeqNum StationCode           StationName  RailTime
0               0       RD       1         A15           Shady Grove         0
1           14151       RD       2         A14             Rockville         4
2           10586       RD       3         A13             Twinbrook         3
3            5895       RD       4         A12           White Flint         3
4            7309       RD       5         A11  Grosvenor-Strathmore         3
5           11821       RD       6         A10        Medical Center         3
6            5530       RD       7         A09              Bethesda         2
7            9095       RD       8         A08    Friendship Heights         3
8            4135       RD       9         A07         Tenleytown-AU         2
9            5841       RD      10         A06          Van Ness-UDC         2

dataframe2.head()

     Car Destination DestinationCode DestinationName Group Line LocationCode                LocationName  Min
0     8    Glenmont             B11        Glenmont     1   RD          A01                Metro Center  BRD
28    8    Glenmont             B11        Glenmont     1   RD          B01        Gallery Pl-Chinatown  ARR
35    6    Glenmont             B11        Glenmont     1   RD          A14                   Rockville    1
45    8    Glenmont             B11        Glenmont     1   RD          B02            Judiciary Square    2
62    6    Glenmont             B11        Glenmont     1   RD          B07                      Takoma    3
80    6    Glenmont             B11        Glenmont     1   RD          A13                   Twinbrook    4
82    8    Glenmont             B11        Glenmont     1   RD          B03               Union Station    4
95    6    Glenmont             B11        Glenmont     1   RD          B08               Silver Spring    5
114   8    Glenmont             B11        Glenmont     1   RD          B35              NoMa-Gallaudet    6
129   6    Glenmont             B11        Glenmont     1   RD          A12                 White Flint    7
143   8    Glenmont             B11        Glenmont     1   RD          B04  Rhode Island Ave-Brentwood    8

I want to keep only the rows in dataframe2 whose Min value is less than the RailTime value in dataframe1 for the StationCode that matches the row's LocationCode.

For example, the row labeled 80 in dataframe2 has LocationCode A13 and Min 4. In dataframe1, StationCode A13 has RailTime 3, so Min is not less than RailTime and that row should be excluded from dataframe2.

By contrast, the row labeled 35 in dataframe2 has LocationCode A14 and a Min value of 1, which is less than the RailTime value of 4 for A14 in dataframe1, so it should be included.

1 Answer

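A simple solution would be to merge RailTime into the second frame and then filter (df1 and df2 stand for dataframe1 and dataframe2):

# Attach each station's RailTime to the rows of df2 with the matching LocationCode
df2 = df2.merge(df1[['StationCode', 'RailTime']], left_on='LocationCode', right_on='StationCode')
# Keep only the rows whose Min is below that RailTime
df2 = df2[df2.Min < df2.RailTime]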
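
One caveat with the merge-and-filter approach: the sample output shows Min holding BRD and ARR alongside numbers, so the column is probably stored as strings, and comparing strings with integers can raise a TypeError in Python 3. The merge also resets the row labels and adds a StationCode column. The sketch below is one possible variant, not the answerer's method. It assumes Min is a string column, that StationCode values are unique in dataframe1, and that BRD/ARR rows should be dropped (the question does not say). The small frames are stand-ins built from a few rows of the question's data.

```python
import pandas as pd

# Stand-in frames with only the relevant columns (values copied from the question).
dataframe1 = pd.DataFrame({
    "StationCode": ["A15", "A14", "A13", "A12"],
    "RailTime": [0, 4, 3, 3],
})
dataframe2 = pd.DataFrame(
    {
        "LocationCode": ["A01", "B01", "A14", "A13", "A12"],
        "Min": ["BRD", "ARR", "1", "4", "7"],  # assumed to be stored as strings
    },
    index=[0, 28, 35, 80, 129],
)

# Look up each row's RailTime by LocationCode; codes missing from dataframe1 give NaN.
# This lookup requires StationCode to be unique in dataframe1.
rail_time = dataframe2["LocationCode"].map(
    dataframe1.set_index("StationCode")["RailTime"]
)

# Convert Min to numbers; non-numeric entries such as "BRD" and "ARR" become NaN.
minutes = pd.to_numeric(dataframe2["Min"], errors="coerce")

# Any comparison involving NaN is False, so unmatched or non-numeric rows drop out.
result = dataframe2[minutes < rail_time]
print(result)
```

Because the lookup uses map rather than merge, the original row labels (35, 80, ...) are preserved and no extra columns are added to the result.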

1 Comment

Simple, yet effective, Stefan. It works perfectly. I'd vote, but not enough rep yet. Accepted.
