How to write pandas dataframe to a csv with varying row length

Question

I've read a CSV file into pandas whose rows vary in length and which has some blank lines between the rows.

Example:

This is an example

CustomerID; 123;
Test ID; 144;
Seen_on_Tv; yes;

now_some_measurements_1;
test1; 333; 444; 555; 
test2; 344; 455; 566;
test3; 5544; 3424; 5456;

comment; this test sample is only for this Stackoverflow question, but
similar to my real data.

When reading in this file, I use this code:

import pandas as pd

pat = pd.read_csv(FileName, skip_blank_lines=False, header=None, sep=";", names=['a', 'b', 'c', 'd', 'e'])
pat.head(10)

output:

                         a      b       c       d    e
0       This is an example    NaN     NaN     NaN  NaN
1                      NaN    NaN     NaN     NaN  NaN
2               CustomerID    123     NaN     NaN  NaN
3                  Test ID    144     NaN     NaN  NaN
4               Seen_on_Tv    yes     NaN     NaN  NaN
5                      NaN    NaN     NaN     NaN  NaN
6  now_some_measurements_1    NaN     NaN     NaN  NaN
7                    test1    333   444.0   555.0     
8                    test2    344   455.0   566.0  NaN
9                    test3   5544  3424.0  5456.0  NaN

This works, which matters because I have to change some values such as the CustomerID, for example with this code:

newID = 'HASHED'
pat.loc[pat['a'] == 'CustomerID', 'b']=newID

However, when I save the changed dataframe to CSV, I get a lot of trailing separators (";"), because most of the columns are empty. This is especially noticeable on the blank lines.

pat.to_csv('out.csv', sep=";", index = False, header=False)

output (out.csv):

This is an example;;;;
;;;;
CustomerID; HASHED;;;
Test ID; 144;;;
Seen_on_Tv; yes;;;
;;;;
now_some_measurements_1;;;;
test1; 333;444.0;555.0; 
test2; 344;455.0;566.0;
test3; 5544;3424.0;5456.0;
;;;;
comment; this test sample is only for this Stackoverflow question, but similar to my real
data. 
;;;

I've searched almost everywhere for a solution but cannot find one. How can I write only the non-blank column values to the CSV file (except for the blank lines that separate the sections, which need to remain blank of course)?

Thank you in advance for your kind help.

You realise that by doing that you'll make it an invalid delimiter-separated file and it may not load back into some systems... (for instance... you couldn't reload your output back into pandas...)
– Jon Clements, Commented Jun 28, 2022 at 13:22
Hi Jon, thank you for your question and I do understand. This is however a specific request from one of my clients, who has built a tool to read these files in their current layout, and that tool cannot handle the additional ";" separators that my export adds.
– Renate van Kempen, Commented Jun 28, 2022 at 13:28

Jon Clements · Accepted Answer · 2022-06-28 13:48:28Z

1

A simple way would be to parse your out.csv and, for the non-blank lines (those that do not consist solely of ;'s), write a stripped version of that line, e.g.:

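# Requires Python 3.8+ for the := operator.
with open('out.csv') as fin, open('out2.csv', 'w') as fout:
    for line in fin:
        # Strip separators, spaces and the newline from both ends of the line.
        if stripped := line.strip(';\n '):
            fout.write(stripped + '\n')
        else:
            # Nothing but separators/whitespace: keep the line unchanged.
            fout.write(line)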

Will give you:

This is an example
;;;;
CustomerID; HASHED
Test ID; 144
Seen_on_Tv; yes
;;;;
now_some_measurements_1
test1; 333;444.0;555.0
test2; 344;455.0;566.0
test3; 5544;3424.0;5456.0
;;;;
comment; this test sample is only for this Stackoverflow question, but similar to my real
data.
;;;

You could also pass a io.StringIO object to to_csv (to save writing to disk and then re-reading) as the output destination, then parse that in a similar fashion to produce your desired output file.
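The io.StringIO variant could look roughly like the sketch below. The helper names `trim_trailing_separators` and `write_trimmed_csv` are made up for illustration, and `df` is assumed to be a pandas DataFrame such as `pat` from the question. The `to_csv` arguments are the same ones used in the question; only the destination changes from a file path to an in-memory buffer.

```python
import io


def trim_trailing_separators(csv_text, sep=";"):
    """Strip separators and spaces from both ends of every line that has content.

    Lines made up only of separators and spaces are kept unchanged,
    mirroring the file-based loop above.
    """
    out_lines = []
    for line in csv_text.splitlines():
        stripped = line.strip(sep + " ")
        out_lines.append(stripped if stripped else line)
    return "".join(line + "\n" for line in out_lines)


def write_trimmed_csv(df, path, sep=";"):
    """Write df (assumed to be a pandas DataFrame) to path without trailing separators."""
    buffer = io.StringIO()
    # Same to_csv call as in the question, but writing to memory instead of disk.
    df.to_csv(buffer, sep=sep, index=False, header=False)
    with open(path, "w") as fout:
        fout.write(trim_trailing_separators(buffer.getvalue(), sep))
```

With the question's dataframe this would be called as `write_trimmed_csv(pat, 'out2.csv')`. Like the loop above, it strips from both ends of a line, so a row whose first column is empty would also lose its leading separators. Splitting on line boundaries also assumes that no field contains an embedded newline.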

edited Jun 28, 2022 at 13:48

answered Jun 28, 2022 at 13:40


1 Comment

Renate van Kempen Over a year ago

Thank you for this prompt and great solution, it worked on my test file and on a real file. I've added an additional line to empty the 'blank-rows' as well. Again thank you.
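
The additional line mentioned in this comment is not shown in the thread. One way it might look, purely as an assumption, is to write the stripped text unconditionally, so that rows consisting only of separators turn into truly empty lines. The names `clean_line` and `clean_file` are hypothetical.

```python
def clean_line(line, sep=";"):
    """Return the line without surrounding separators and spaces.

    A line that holds only separators collapses to an empty line.
    """
    return line.strip(sep + "\r\n ") + "\n"


def clean_file(src_path, dst_path, sep=";"):
    """Copy src_path to dst_path, cleaning every line on the way."""
    with open(src_path) as fin, open(dst_path, "w") as fout:
        for line in fin:
            fout.write(clean_line(line, sep))
```

Calling `clean_file('out.csv', 'out2.csv')` would then turn each `;;;;` row into a blank separator line, matching the layout of the original input file.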
