
I'm trying to shift values across columns in a dataframe based on a condition, but I'm not sure what I'm doing wrong. There are 1000+ rows in this dataframe, but here's a sample.

Original dataframe

Price Range    str_num     str_dir        str          str_sfx      city              zip
200 - 300k     123         Fake           St           Boulder      80304            None
300 - 400k     456         Main           St           Erie         80123            None
300 - 400k     789         E              Lolly        St           Boulder          80302
300 - 400k     999         N              Home         Ave          Lafayette        80027

What I want to do is this: if the value in str_dir isn't one of N, E, W, S, shift that row's values one column to the right, starting at str_dir, and fill the gap with NaN. Here's my code so far.

mylist = ['N','E','W','S']
a=df[~df['str_dir'].isin(mylist)].shift(periods=-1, axis='columns', fill_value=np.NaN)
out_df=a.combine_first(df)

However, when I run this code I get this dataframe.

Price Range    str_num     str_dir        str           city              zip
123            Fake        St             Boulder       80304            None
456            Main        St             Erie          80123            None
300 - 400k     789         E              Lolly         Boulder          80302
300 - 400k     999         N              Home          Lafayette        80027

What I'm looking for is this

Price Range    str_num     str_dir        str           str_sfx              city              zip
200 - 300k     123         NaN            Fake          St                  Boulder          80304
300 - 400k     456         NaN            Main          St                  Erie             80123
300 - 400k     789         E              Lolly         St                  Boulder          80302
300 - 400k     999         N              Home          Ave                 Lafayette        80027
  • You generally don't want to do this. It's an antipattern: it usually means your read_csv() call mishandled the separators (or that the CSV separators are not correct). Can you go back to the read_csv() that malfunctioned so we can fix it instead? Commented Aug 6, 2020 at 15:36
  • I essentially split one column that contained the whole address on whitespace. Commented Aug 6, 2020 at 16:27
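
Following up on the comments: if the columns come from splitting a single address string on whitespace, one way to avoid the misalignment in the first place is to parse the string with a regular expression that treats the direction as optional. The sketch below is only an illustration under assumptions that are not in the question: the raw column is called address (a hypothetical name), every address has the form "number [direction] street suffix city zip", the street name is a single word, and the zip is five digits.

```python
import pandas as pd

# Hypothetical raw data: the column name "address" and the exact
# address layout are assumptions made for this sketch.
raw = pd.DataFrame({
    "address": [
        "123 Fake St Boulder 80304",
        "789 E Lolly St Boulder 80302",
    ]
})

# The direction group is optional, so rows without N/E/W/S
# get NaN in str_dir instead of pushing every later field left.
pattern = (
    r"^(?P<str_num>\d+)\s+"
    r"(?:(?P<str_dir>[NSEW])\s+)?"
    r"(?P<str>\S+)\s+"
    r"(?P<str_sfx>\S+)\s+"
    r"(?P<city>.+?)\s+"
    r"(?P<zip>\d{5})$"
)

# Named groups become column names in the resulting DataFrame.
parts = raw["address"].str.extract(pattern)
print(parts)
```

Real addresses are messier than this pattern allows (multi-word street names, unit numbers, and so on), so treat it as a starting point rather than a complete parser.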

1 Answer


Use Series.isin to create a boolean mask m, then use DataFrame.loc with m to select the rows and columns that need to be shifted, and shift them with DataFrame.shift along axis=1:

m = ~df['str_dir'].isin(mylist)
df.loc[m, 'str_dir':] = df.loc[m, 'str_dir':].shift(axis=1)

Result:

  Price Range  str_num str_dir    str str_sfx       city    zip
0  200 - 300k      123     NaN   Fake      St    Boulder  80304
1  300 - 400k      456     NaN   Main      St       Erie  80123
2  300 - 400k      789       E  Lolly      St    Boulder  80302
3  300 - 400k      999       N   Home     Ave  Lafayette  80027
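
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question's table, with zip codes stored as strings (an assumption, so that every column from str_dir onward shares one dtype). Restricting the slice to 'str_dir': is what leaves Price Range and str_num untouched, whereas shifting the whole row also moves those two columns.

```python
import pandas as pd

# Sample data rebuilt from the question; zip codes are kept as strings
# (an assumption) so the shifted columns all hold the same kind of value.
df = pd.DataFrame({
    "Price Range": ["200 - 300k", "300 - 400k", "300 - 400k", "300 - 400k"],
    "str_num": [123, 456, 789, 999],
    "str_dir": ["Fake", "Main", "E", "N"],
    "str": ["St", "St", "Lolly", "Home"],
    "str_sfx": ["Boulder", "Erie", "St", "Ave"],
    "city": ["80304", "80123", "Boulder", "Lafayette"],
    "zip": [None, None, "80302", "80027"],
})

mylist = ["N", "E", "W", "S"]

# True for rows whose str_dir is not a compass direction.
m = ~df["str_dir"].isin(mylist)

# Shift only the masked rows, and only the columns from str_dir onward,
# one position to the right; the vacated str_dir cell becomes NaN.
df.loc[m, "str_dir":] = df.loc[m, "str_dir":].shift(axis=1)

print(df)
```

Note that shift drops whatever was in the last column of the shifted rows; here that is the None in zip, so nothing is lost.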