1

I want to read a csv file with following structure:

1 file, 13 insertions, 23 deletions
1 file, 43 deletions
2 file, 7 insertions
4 file, 13 insertions, 45 deletions

I would like to get a Dataframe, which should look following:

file   insertions  deletions
1      13          23
1      0           43
1      7           0
4      13          45

Is there a good way to do this?

3
  • Is it csv file or a text file? Commented Jan 30, 2024 at 13:36
  • It is a string, but I use read_csv(StringIO(...)) Commented Jan 30, 2024 at 14:02
  • @Sirdrakeistda be careful with the unstack approach, this currently has an open bug (I added details in the answer) Commented Jan 30, 2024 at 14:29

1 Answer 1

0

You can read your file, split the values with extractall (or split), then pivot:

df = pd.read_csv('file.csv', sep=', *')
out = (df.stack().str.extract(r'(\d+) *(.*)')
         .droplevel(-1).pivot(columns=1, values=0)
         .fillna(0).rename_axis(index=None, columns=None)
        #.astype(int) # optional, to convert strings to integers
      )

NB. to use split, replace .str.extract(r'(\d+) *(.*)') with .str.split(' +', n=1, expand=True).

Output:

  deletions file insertions
0        23    1         13
1        43    1          0
2         0    2          7
3        45    4         13

Variant with unstack to keep the original order of columns:

out = (df.stack().str.extract(r'(\d+) *(.*)')
         .droplevel(-1)
         .set_index(1, append=True)[0]
         .unstack(fill_value=0, sort=False)
         .rename_axis(index=None, columns=None)
      )

Output:

  file insertions deletions
0    1         13        23
1    1         43         0
2    0          2         7
3    4         13        45

NB. I just realized that the output with unstack is incorrect when using sort=False which seems to be a bug.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.