I have a data frame named 'plans_to_csv' looking like this:

[screenshot of the data frame]

I need to run the following analysis to determine the actual mode, but it takes a very long time to run. Is there a faster way to write this code? Thanks in advance for your help.

for i in range(len(plans_to_csv) - 2):
    if (plans_to_csv['mode'][i+1] == 'walk'
            and plans_to_csv['type'][i+2] == 'car interaction'
            and plans_to_csv['person_id'][i] == plans_to_csv['person_id'][i+2]):
        plans_to_csv.loc[i, 'actual_mode_car'] = 1

1 Answer

You can shift the columns and compare them directly. This makes use of vectorization and should be faster than looping over the rows one at a time.

selection = ((plans_to_csv['mode'].shift(-1) == 'walk')
             & (plans_to_csv['type'].shift(-2) == 'car interaction')
             & (plans_to_csv['person_id'] == plans_to_csv['person_id'].shift(-2)))
plans_to_csv['actual_mode_car'] = selection.astype(int)
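
To see why the shifted comparison matches the loop, here is a minimal sketch on made-up data. The frame `demo` and its values are hypothetical (the real data is not shown in the question), and it assumes the default 0..n-1 index, so that position `i+1` in the loop is the same row that `shift(-1)` lines up with row `i`.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
demo = pd.DataFrame({
    'person_id': [1, 1, 1, 1, 2, 2, 2],
    'type': ['home', 'leg', 'car interaction', 'leg',
             'leg', 'car interaction', 'leg'],
    'mode': ['', 'walk', '', 'car', 'walk', '', 'car'],
})

# Row-by-row reference, same condition as the loop in the question.
loop_result = [0] * len(demo)
for i in range(len(demo) - 2):
    if (demo.at[i + 1, 'mode'] == 'walk'
            and demo.at[i + 2, 'type'] == 'car interaction'
            and demo.at[i, 'person_id'] == demo.at[i + 2, 'person_id']):
        loop_result[i] = 1

# Vectorized version: shift(-k) moves the value from row i+k up to row i.
# The last k rows get missing values, which compare as False.
selection = ((demo['mode'].shift(-1) == 'walk')
             & (demo['type'].shift(-2) == 'car interaction')
             & (demo['person_id'] == demo['person_id'].shift(-2)))
demo['actual_mode_car'] = selection.astype(int)

print(demo['actual_mode_car'].tolist())
print(loop_result)
```

Row 3 is worth a look: the next row is a walk and the one after is a car interaction, but they belong to a different person, so the `person_id` comparison correctly rules it out in both versions.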

Note that this sets every entry that doesn't match the condition to 0. If that is not wanted, you can set only the matching rows instead: plans_to_csv.loc[selection, 'actual_mode_car'] = 1
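
As a rough illustration of the difference, the sketch below assigns only to the matching rows through `.loc` with a boolean mask, which writes to the frame directly rather than through chained indexing. The frame `demo` and its pre-existing `actual_mode_car` values are hypothetical, chosen so that the rows left untouched are visible.

```python
import pandas as pd

# Hypothetical data: 'actual_mode_car' already holds flags from an earlier step.
demo = pd.DataFrame({
    'person_id': [1, 1, 1, 1],
    'type': ['home', 'leg', 'car interaction', 'leg'],
    'mode': ['', 'walk', '', 'car'],
    'actual_mode_car': [0, 1, 0, 1],
})

selection = ((demo['mode'].shift(-1) == 'walk')
             & (demo['type'].shift(-2) == 'car interaction')
             & (demo['person_id'] == demo['person_id'].shift(-2)))

# Overwrite only the rows where the condition holds; other rows keep their values.
demo.loc[selection, 'actual_mode_car'] = 1
kept = demo['actual_mode_car'].tolist()

# For comparison: converting the mask itself would reset non-matching rows to 0.
overwritten = selection.astype(int).tolist()

print(kept)
print(overwritten)
```

Here only row 0 matches, so the `.loc` version leaves the existing 1s in rows 1 and 3 alone, while the `astype(int)` version would replace them with 0.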
