How to delete empty spaces from pandas DataFrame rows until first populated field?

Question

Let's say I imported some really messy data from a PDF and I'm cleaning it. I have something like this:

Name	Type	Date	other1	other2	other3
Name1	''	''	Type1	''	Date1
Name2	''	''	''	Type2	Date2
Name3	''	''	Type3	Date3	''
Name4	''	Type4	''	''	Date4
Name5	Type5	''	Date5	''	''

And so on. As you can see, Type always comes before Date in each row, but I basically need to delete all the '' values (currently empty strings in the DataFrame) while moving everything to the left so the values align with their respective Type and Date columns. Additionally, there are more columns to the right with the same problem, but for structural reasons I can't remove ALL '' values; the solution I'm looking for would just move 'everything to the left', so to speak (as happens with DataFrame.shift).

I appreciate your help.

When you shift the values in each row to the left, do the column names stay the same? It would also help if you could post your desired output.
– Anoushiravan R, Commented May 3, 2022 at 22:18

MoRe · Accepted Answer · 2022-05-03 21:55:49Z

2

data = df.values.flatten()
pd.DataFrame(data[data != ""].reshape(-1, 3), columns = ['Name','Type', 'Date'])

or:

pd.DataFrame(df.values[df.values != ""].reshape(-1, 3), columns = ['Name','Type', 'Date'])

output:

    Name    Type    Date
0   Name1   Type1   Date1
1   Name2   Type2   Date2
2   Name3   Type3   Date3
3   Name4   Type4   Date4
4   Name5   Type5   Date5
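
Note that the reshape only lines up when every row holds exactly three non-empty cells. The following is a minimal, self-contained sketch of that check, assuming the sample frame from the question; the variable names (`values`, `filled`, `compact`) are illustrative and not part of the original snippet.

```python
import pandas as pd

# Sample frame rebuilt from the question (empty cells are empty strings).
df = pd.DataFrame(
    [
        ["Name1", "", "", "Type1", "", "Date1"],
        ["Name2", "", "", "", "Type2", "Date2"],
        ["Name3", "", "", "Type3", "Date3", ""],
        ["Name4", "", "Type4", "", "", "Date4"],
        ["Name5", "Type5", "", "Date5", "", ""],
    ],
    columns=["Name", "Type", "Date", "other1", "other2", "other3"],
)

values = df.to_numpy()
filled = values != ""  # boolean mask with the same shape as the frame

# reshape(-1, 3) is only meaningful if each row has exactly three filled cells.
per_row = filled.sum(axis=1)
if not (per_row == 3).all():
    raise ValueError("some rows do not have exactly 3 non-empty cells")

# Boolean indexing flattens in row-major order, so values stay grouped by row.
compact = pd.DataFrame(
    values[filled].reshape(-1, 3), columns=["Name", "Type", "Date"]
)
print(compact)
```

If the check fails, the reshape would either raise or silently mix values from neighbouring rows, so it seems worth running before trusting the result.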

without reshape:

pd.DataFrame(df.apply(lambda x: (a:=np.array(x))[a != ""] , axis = 1).values.tolist())

or:

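s = df.iloc[:, 0].copy()  # first column by position; df[0] raises a KeyError when the columns are named
for col in df.columns[1:]:
    s += " " + df[col]
# str.split() with no argument splits on runs of whitespace, so the gaps left by empty cells disappear
# (this assumes the cell values themselves contain no spaces)
pd.DataFrame(s.str.split().values.tolist(), columns = ['Name','Type', 'Date'])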
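
If the frame should keep its original width and column names, one possible variation is to reorder each row so that the filled cells come first. This is only a sketch: it uses a stable NumPy argsort rather than any of the snippets above, the function name `push_empties_right` is made up for illustration, and it moves every empty string to the right end of the row, which may not be acceptable for the extra columns mentioned in the question.

```python
import numpy as np
import pandas as pd


def push_empties_right(df, empty=""):
    """Move non-empty cells left within each row, keeping their order."""
    values = df.to_numpy(dtype=object)
    is_empty = values == empty
    # A stable sort on the boolean mask puts False (filled) before True
    # (empty) and preserves the left-to-right order inside each group.
    order = np.argsort(is_empty, axis=1, kind="stable")
    compacted = np.take_along_axis(values, order, axis=1)
    return pd.DataFrame(compacted, index=df.index, columns=df.columns)


# One row from the question, with hypothetical column labels A..F.
demo = pd.DataFrame(
    [["Name4", "", "Type4", "", "", "Date4"]], columns=list("ABCDEF")
)
print(push_empties_right(demo))
```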

edited May 3, 2022 at 21:55

answered May 3, 2022 at 19:41

MoRe


2 Comments

C A Over a year ago

Thanks for your help. But in this case a reshape isn't easy to perform, since there are a lot more columns. I'm looking for something similar to df.shift(axis='columns'), but to the left and applied to each row individually.

MoRe Over a year ago

@CA I don't think there is a shift-like way to solve this problem, at least to my knowledge.

C A · Accepted Answer · 2022-05-06 15:31:37Z

1

What worked for me was:

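# Assumes the default RangeIndex, so the label i from iterrows() is also a valid position for iloc.
# Note: a row that is empty from 'Type' onward would keep the while loop running forever.
while '' in df['Type'].unique():
    for i,row in df.iterrows():
        if row['Type'] == '':
            df.iloc[i, 1:] = df.iloc[i, 1:].shift(-1, fill_value='')

Then repeat the same for the next column.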
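
The same idea can be sketched as a reusable function that handles one target column at a time. This is a rough sketch under some assumptions: it swaps the per-row `Series.shift` for NumPy slicing on the affected rows, the function name `shift_left_until_filled` is hypothetical, and the number of passes is capped so a row that is empty from the target column onward cannot cause an endless loop.

```python
import pandas as pd


def shift_left_until_filled(df, col, empty=""):
    """Shift rows left, starting at `col`, until `col` is no longer empty."""
    values = df.to_numpy(dtype=object).copy()  # work on a copy of the data
    start = df.columns.get_loc(col)
    # After (n_columns - start) shifts a row is empty from `col` onward,
    # so more passes than that can never help.
    for _ in range(values.shape[1] - start):
        mask = values[:, start] == empty
        if not mask.any():
            break
        # Move everything right of `col` one step left in the masked rows,
        # then pad the last column with the empty marker.
        values[mask, start:-1] = values[mask, start + 1:]
        values[mask, -1] = empty
    return pd.DataFrame(values, index=df.index, columns=df.columns)


# Sample frame rebuilt from the question.
df = pd.DataFrame(
    [
        ["Name1", "", "", "Type1", "", "Date1"],
        ["Name2", "", "", "", "Type2", "Date2"],
        ["Name3", "", "", "Type3", "Date3", ""],
        ["Name4", "", "Type4", "", "", "Date4"],
        ["Name5", "Type5", "", "Date5", "", ""],
    ],
    columns=["Name", "Type", "Date", "other1", "other2", "other3"],
)

# Fix 'Type' first, then 'Date', mirroring the two passes described above.
fixed = shift_left_until_filled(df, "Type")
fixed = shift_left_until_filled(fixed, "Date")
print(fixed[["Name", "Type", "Date"]])
```

Because each call only shifts from the named column onward, cells to the left of it stay where they are, and the frame keeps its shape and column names.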

edited May 6, 2022 at 15:31

answered May 6, 2022 at 15:14

C A
