0

Let's say I have CSV data as note below; let's call this original.csv:

name,value1,value2
firstname,34326408129478932874,553
secondname_a_very_long_one,65,123987
thirdname_medium,9686509933423,33

Basically, it's either single word text (no space separation, so no need for quoting) or numbers (here integers, but could be floats with decimals or scientific 1e-5 notation) - and there is no expectation that a comma could appear somewhere (outside of being a delimiter), so no need for special handling of the quoting of comma either ...

So, to ease the strain on my eyes when I view this .csv file in a text editor, I would like to format it with fixed width - space padded (left or right padding choice made per column, and separately for header row); note that the file is still comma delimited as a data format, the fixed width is just for viewing in editor - this is how I would like it to look like, let's call this tmpfw.csv:

name                      , value1              , value2
firstname                 , 34326408129478932874, 553
secondname_a_very_long_one,                   65, 123987
thirdname_medium          ,        9686509933423, 33

So here, the heading row is all left-aligned (right-padded with spaces); columns name and value2 are also left-aligned (right-padded with spaces); and column value1 is right-aligned (right-padded with spaces). The columns are sized (in characters) according to the largest string length of the data in that column; and there is an extra space as visual delimiter after the commas.

Of course, if I want to use this data in Python properly, I'd have to "strip" it first - but I don't mind, since as I mentioned, the data is such that I don't have to worry about quoting issues; here is a Python example of how I could use tmpfw.csv - let's call it test.py:

import sys
import csv
import pprint

with open('tmpfw.csv', newline='') as csvfile:
  my_csv = csv.reader(csvfile)
  my_csv_list = list(my_csv)

my_csv_list_stripped = [list(map(str.strip, irow)) for irow in my_csv_list]

print("\nmy_csv_list:\n")
pprint.pprint( my_csv_list )
print("\nmy_csv_list_stripped:\n")
pprint.pprint( my_csv_list_stripped )

#print("\nreprint stripped as csv:\n")
#csvwriter = csv.writer(sys.stdout) # just print out to terminal
#csvwriter.writerows(my_csv_list_stripped)

This is what I get printed:

$ python3 test.py

my_csv_list:

[['name                      ', ' value1              ', ' value2'],
 ['firstname                 ', ' 34326408129478932874', ' 553'],
 ['secondname_a_very_long_one', '                   65', ' 123987'],
 ['thirdname_medium          ', '        9686509933423', ' 33']]

my_csv_list_stripped:

[['name', 'value1', 'value2'],
 ['firstname', '34326408129478932874', '553'],
 ['secondname_a_very_long_one', '65', '123987'],
 ['thirdname_medium', '9686509933423', '33']]

I can use this as a base to convert the numbers to int later - so, I can use such a fixed-width csv fine, all is good ...

So, my question is: let's say I have the original.csv - what would be the easiest way in Python to obtain a "fixed-width formatted" tmpfw.csv? Do csv or pandas or other libraries have facilities for exporting a CSV format like this?

3
  • 1
    Do you need to also edit the CSV while you're looking at it? If not, I wouldn't mess with the CSV format itself; instead just make it look really pretty (and different), csvkit.readthedocs.io/en/latest/scripts/csvlook.html to make a table. If you do need to edit it by hand, I'd still recommend not changing the structure and get a plugin like marketplace.visualstudio.com/…. Commented Mar 22, 2023 at 13:31
  • 1
    If you find you must edit the CSV, add the spaces after the comma, before the next cell of data. That way you can use the skipinitialspace=True option from Python's csv.reader to get it "back to normal"—no special processing necessary. Commented Mar 22, 2023 at 13:37
  • Also, somewhat distantly related, but for reference: stackoverflow.com/questions/18911984/… Commented Mar 22, 2023 at 13:41

1 Answer 1

1

Sure – compute what the maximum length of each column is, then .ljust() them accordingly when printing:

import csv
import io

# pretend reading csv from file
csv_data = list(csv.reader(io.StringIO("""
name,value1,value2
firstname,34326408129478932874,553
secondname_a_very_long_one,65,123987
thirdname_medium,9686509933423,33
""".strip())))

n_cols = len(csv_data[0])
col_widths = [max(len(row[i]) for row in csv_data) for i in range(n_cols)]

for row in csv_data:
    print(', '.join(val.ljust(width) for val, width in zip(row, col_widths)))

This prints out

name                      , value1              , value2
firstname                 , 34326408129478932874, 553   
secondname_a_very_long_one, 65                  , 123987
thirdname_medium          , 9686509933423       , 33    

and naturally you could open a file and print(..., file=...) instead.

Sign up to request clarification or add additional context in comments.

1 Comment

Awesome, thanks @AKX - that works great! +1 for the "surprise" - I kind of thought "plain Python" solution would be a lot more complicated/verbose, but your solution is quite readable.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.