2

I've tried to figure this one out, but all methods end up with errors.

Quick background info: I have a dataframe with US states and info when a state has adopted a new law. In addition, the dataframe tells which are the neighbouring states. Here's a quick mockup:

(df: treated_states)

| year     | state |BC_law |n_ipo |Neighbor1 |Neighbor2 |Neighbor3 |Treated|
| -------- | ----- |-------|------|----------|----------|----------|-------|
| 1980     | AZ    |1999   |100   |CA        |AK        |WV        |1      |
| 1999     | AZ    |1999   |50    |CA        |AK        |WV        |1      |

(df: not_treated)

| year     | state |BC_law |n_ipo |Neighbor1 |Neighbor2 |Neighbor3 |Treated|
| -------- | ----- |-------|------|----------|----------|----------|-------|
| 1980     | WV    |0      |100   |CA        |AK        |WV        |0      |
| 1999     | AK    |0      |80    |CA        |AK        |WV        |0      |

In reality the dataframe includes all US states and all their neighbouring states in separate columns. There are data points before and and after the change in legislation.

I have filtered states that have adopted the new BC_law (df: treated_states) and those that haven't (df: not_treated). Now I should find out if the adoption of the new law affects the neighbouring states.

My problem: I should filter and store into a new dataframe the treated_states dataframes' Neighbor1, Neighbor2, Neighbor3 that exist in the dataframe not_treated

In case you wonder why I need to filter those: I'll do a difference-in-difference regression to determine if adopting the new law affects the neighboring states that don't adopt the BC_law.

2 Answers 2

1

Here is a more concise approach:

  • Filter the Neighbour like columns
  • Create boolean mask with DataFrame.isin to check whether each element in dataframe is contained in state column of non_treated
  • Reduce the boolean mask along the columns axis with any
cols = treated_states.filter(like='Neigh')
mask = cols.isin(not_treated.state).any(axis=1)

>>> treated_states[mask]

   year state  BC_law  n_ipo Neighbor1 Neighbor2 Neighbor3  Treated
0  1980    AZ    1999    100        CA        AK        WV        1
1  1999    AZ    1999     50        CA        AK        WV        1
Sign up to request clarification or add additional context in comments.

Comments

0

I managed to write this and it seems to work... Maybe it helps someone and a better coder can suggest a more efficient and elegant way to solve this

#creating a list of states that have not adopted the new legislation
not_treated_list = not_treated.STATE

#df with states that have not adopted the new legislation, that are also neighbouring states that have adopted the new legislation 
neighbour = treated_states[(treated_states.Neigh1.isin(not_treated_list)) | (treated_states.Neigh2.isin(not_treated_list)) |(treated_states.Neigh3.isin(not_treated_list)) | (treated_states.Neigh4.isin(not_treated_list)) | (treated_states.Neigh5.isin(not_treated_list)) | (treated_states.Neigh6.isin(not_treated_list)) | (treated_states.Neigh7.isin(not_treated_list)) | (treated_states.Neigh8.isin(not_treated_list)) ]

neighbour

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.