shift columns one step if column value is empty

Question

Suppose I have a dataframe looking something like this:

  col1 col2 col3 col4
0    A    B    F    O
1    A         G    Q
2    A    C    G    P
3    A         H     
4    A    D    I     
5    A    D    I     
6    A         J    U
7    A    E         J

How can I shift the columns if the column value is empty?

  col1 col2  col3  col4
0    A    B     F     O
1    A    G     Q 
2    A    C     G     P
3    A    H  
4    A    D     I 
5    A    D     I 
6    A    J     U 
7    A    E     J

I thought I could check the current column and, where it is empty, take the value from the next column and then blank out that next column.

for col in df.columns:    
    df[col] = np.where((df[col] == ''), df[f'col{int(col[-1])+1}'], df[col])
    df[f'col{int(col[-1])+1}'] = np.where((df[col] == ''), '', df[col])

But this fails somewhere. A sample df is below.

df = pd.DataFrame(
    {
        'col1': ['A','A','A','A','A','A','A','A'],
        'col2': ['B','','C','','D','D','','E'],
        'col3': ['F','G','G','H','I','I','J',''],
        'col4': ['O','Q','P','','','','U','J']
    }
)

Vishnudev Krishnadas · Accepted Answer · 2021-03-10 15:27:07Z

3

Replace empty strings with NaN (with numpy imported as np and pandas as pd):

df = df.replace('', np.nan)

Apply dropna row-wise, so the remaining values in each row move to the left:

odf = df.apply(lambda x: pd.Series(x.dropna().values), axis=1)

To retain the original column names:

odf.columns = df.columns

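NOTE: It is generally better to represent missing data with NaN than with empty strings, since pandas methods such as dropna and isna recognise NaN as missing.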

Output

  col1 col2 col3 col4
0    A    B    F    O
1    A    G    Q  NaN
2    A    C    G    P
3    A    H  NaN  NaN
4    A    D    I  NaN
5    A    D    I  NaN
6    A    J    U  NaN
7    A    E    J  NaN
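The three steps above can be put together as a self-contained sketch. The helper name `shift_left` is hypothetical, and the `reindex` call is an addition that is not in the answer: it is there on the assumption that, if every row contained at least one empty value, the row-wise result would have fewer columns than the original and assigning the column names would fail.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'col2': ['B', '', 'C', '', 'D', 'D', '', 'E'],
        'col3': ['F', 'G', 'G', 'H', 'I', 'I', 'J', ''],
        'col4': ['O', 'Q', 'P', '', '', '', 'U', 'J'],
    }
)


def shift_left(frame):
    """Hypothetical helper: move the non-empty values of each row to the left."""
    # Mark empty strings as missing so that dropna can see them.
    nan_frame = frame.replace('', np.nan)
    # Dropping NaN within each row compacts the remaining values to the left.
    shifted = nan_frame.apply(lambda row: pd.Series(row.dropna().values), axis=1)
    # Pad back to the original width (an addition, see the note above).
    shifted = shifted.reindex(columns=range(frame.shape[1]))
    # Restore the original column names.
    shifted.columns = frame.columns
    return shifted


odf = shift_left(df)
print(odf)
```

If empty strings are preferred over NaN in the result, `odf.fillna('')` converts them back.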

edited Mar 10, 2021 at 15:27

answered Mar 10, 2021 at 15:16

Vishnudev Krishnadas


1 Comment

M__ Over a year ago

This answer is sheer Pythonic elegance. What a fantastic line of code! There was a small issue with columns to columns, but it's trivial, and compared with, for example, stackoverflow.com/questions/72104599/… ... I think you can see the point. In summary, this post needs upvoting a lot. This coder is good.

Quang Hoang · Accepted Answer · 2021-03-10 15:10:12Z

2

One way is to use np.argsort:

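s = df.to_numpy()
# Stable sort on the boolean mask: non-empty cells (False) sort before empty ones (True).
orders = np.argsort(s == '', axis=1, kind='mergesort')

# Reorder each row by its sorted positions and write the values back in place.
df[:] = s[np.arange(len(s))[:, None], orders]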

Output:

  col1 col2 col3 col4
0    A    B    F    O
1    A    G    Q     
2    A    C    G    P
3    A    H          
4    A    D    I     
5    A    D    I     
6    A    J    U     
7    A    E    J

Note: A very similar approach can be found in this question.
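To see why this works, here is a minimal sketch of the same idea, assuming the sample df from the question. The sort has to be stable so that the non-empty values keep their original left-to-right order. The names `is_empty`, `order` and `result` are illustrative, and `np.take_along_axis` is swapped in for the fancy indexing above; it should give the same result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'col2': ['B', '', 'C', '', 'D', 'D', '', 'E'],
        'col3': ['F', 'G', 'G', 'H', 'I', 'I', 'J', ''],
        'col4': ['O', 'Q', 'P', '', '', '', 'U', 'J'],
    }
)

values = df.to_numpy()
# True where a cell is empty, False otherwise.
is_empty = values == ''
# False sorts before True, so non-empty positions come first in each row.
# kind='stable' keeps the original order among the non-empty cells.
order = np.argsort(is_empty, axis=1, kind='stable')
# Pick the values of each row in the sorted order.
shifted = np.take_along_axis(values, order, axis=1)

result = pd.DataFrame(shifted, index=df.index, columns=df.columns)
print(result)
```

Unlike the `df[:] = ...` assignment, this builds a new dataframe and leaves the original untouched.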

answered Mar 10, 2021 at 15:10

Quang Hoang


1 Comment

destinychoice Over a year ago

NumPy solutions are also very fast.

2 revs · Accepted Answer · 2021-03-10 15:17:41Z

0

You can count the number of empty values in each column, sort the columns by that count, and then reorder the dataframe so that the columns with the fewest empty values come first.

counts = {}
for col in df.columns.to_list():
    # Count the empty strings, based on the example you have provided.
    counts[col] = (df[col] == '').sum()
# Then sort the dictionary by count, fewest empty values first.
counts = dict(sorted(counts.items(), key=lambda item: item[1]))
# Assign the reordered columns back to the dataframe.
df = df[[*counts.keys()]]
df

  col1 col3 col2 col4
0    A    F    B    O
1    A    G         Q
2    A    G    C    P
3    A    H          
4    A    I    D     
5    A    I    D     
6    A    J         U
7    A         E    J
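The same column reordering can probably be written more compactly. This is a sketch assuming the sample df from the question, with `empty_counts`, `ordered_cols` and `reordered` as illustrative names. As the output above shows, this approach moves whole columns and does not shift values within a row.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'col2': ['B', '', 'C', '', 'D', 'D', '', 'E'],
        'col3': ['F', 'G', 'G', 'H', 'I', 'I', 'J', ''],
        'col4': ['O', 'Q', 'P', '', '', '', 'U', 'J'],
    }
)

# Number of empty strings per column, indexed by column name.
empty_counts = (df == '').sum()
# A stable sort keeps the original order of columns with equal counts.
ordered_cols = empty_counts.sort_values(kind='mergesort').index
# Select the columns in the new order.
reordered = df[ordered_cols]
print(reordered)
```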

edited Mar 10, 2021 at 15:17

community wiki

2 revs
Don'tAccept
