stripping the zeros in csv with python

Question

Hello, I have a CSV file and I need to remove the leading zeros with Python.

Column 6 (index 5 in Python) is padded to 7 digits by default, like this:

AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,0001550,2,0.03500000,US,30.0000

I need to remove the leading zeros, then pad the value with a zero or zeros so that it has 4 digits in total.

So I would need it to look like this:

AFI12001,01,C-,201405,P,0430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,1550,2,0.03500000,US,30.0000

This code adds the zeros:

import csv

new_rows = []
with open('csvpatpos.csv','r') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        new_row = ""
        col = 0
        print(row)
        for x in row:
            col = col + 1
            if col == 6:
                if len(x) == 3:
                    x = "0" + x
            new_row = new_row + x + ","
        print(new_row)

However, I'm having trouble removing the zeros in front.

are all the numbers at the same index?

– Padraic Cunningham, Commented Jul 25, 2014 at 16:33
BTW: correct the indentation to make your code readable.

– furas, Commented Jul 25, 2014 at 18:11

Colin Phipps · Accepted Answer · 2014-07-25 16:37:47Z

2

Convert the column to an int then back to a string in whatever format you want.

row[5] = "%04d" % int(row[5])
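As a quick check of this approach, the sketch below applies the same format expression to a few sample values taken from the question and its comment threads. The helper name `pad_to_four` is made up for illustration. Note that `int()` raises `ValueError` on an empty string, so blank fields would need separate handling.

```python
def pad_to_four(value):
    """Drop leading zeros via int(), then left-pad with zeros to at least 4 digits."""
    return "%04d" % int(value)

# Sample values from the thread; the last one has more than 4 significant digits.
samples = ["0000430", "0001550", "0013300"]
converted = [pad_to_four(s) for s in samples]
print(converted)  # ['0430', '1550', '13300']
```

Values that already have more than 4 significant digits are left at their natural length, because `%04d` sets a minimum width and never truncates.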

answered Jul 25, 2014 at 16:37

Colin Phipps


wflynny · Accepted Answer · 2014-07-31 23:33:41Z

1

You could probably do this in several steps: call .lstrip(), find the length of the resulting string, then add 4-len(s) zeros to the front. However, I think it's easier with a regex.
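For comparison, the multi-step version mentioned above might look like the following minimal sketch. The function name `strip_and_pad` and the `width` parameter are assumptions made for illustration.

```python
def strip_and_pad(s, width=4):
    """Strip leading zeros, then prepend enough zeros to reach `width` characters."""
    stripped = s.lstrip('0')
    # Number of zeros still needed; never negative for values longer than `width`.
    missing = max(width - len(stripped), 0)
    return '0' * missing + stripped

print(strip_and_pad('0000430'))  # 0430
print(strip_and_pad('0013300'))  # 13300
```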

import csv
import re

with open('infilename', 'r') as infile:
    reader = csv.reader(infile)
    for row in reader:
        stripped_value = re.sub(r'^0{3}', '', row[5])
        print(stripped_value)

Yields

0430
1550

In the regex, we are using the format sub(pattern, substitute, original). The pattern breakdown is:

'^' - match start of string
'0{3}' - match 3 zeros

You said all the strings in the 6th column have 7 digits, and you want 4, so replace the first 3 with an empty string.
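A small sketch of how the pattern behaves when that assumption does not hold. The value `'0013300'` has only two leading zeros, so `^0{3}` does not match and the string comes back unchanged, whereas `^0+` removes however many leading zeros there are.

```python
import re

values = ['0000430', '0001550', '0013300']

# Remove exactly three leading zeros, only if three are present.
exactly_three = [re.sub(r'^0{3}', '', v) for v in values]
# Remove every leading zero, however many there are.
all_leading = [re.sub(r'^0+', '', v) for v in values]

print(exactly_three)  # ['0430', '1550', '0013300']
print(all_leading)    # ['430', '1550', '13300']
```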

Edit: If you want to replace the rows, I would just write it out to a new file:

with open('infilename', 'r') as infile, open('outfilename', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        row[5] = re.sub(r'^0{3}', '', row[5])
        writer.writerow(row)

Edit2: In light of your newest requests, I would recommend doing the following:

with open('infilename', 'r') as infile, open('outfilename', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        # strip all 0's from the front
        stripped_value = re.sub(r'^0+', '', row[5])
        # pad smaller numbers with zeros on the left to make them 4 digits
        row[5] = '%04d' % int(stripped_value)
        writer.writerow(row)

Given the following numbers,

['0000430', '0001550', '0013300', '0012900', '0100000', '0001000']

this yields

['0430', '1550', '13300', '12900', '100000', '1000']

edited Jul 31, 2014 at 23:33

answered Jul 25, 2014 at 16:31

wflynny


9 Comments

zooted Over a year ago

Hi Bill, this actually might work. Do I need to write to the file as well to replace the data?

zooted Over a year ago

It doesn't look like it's actually working. It's keeping the zeros for some reason. But it looks like it's going down the right track.

wflynny Over a year ago

Nick are you still having problems with this?

zooted Over a year ago

Not working just yet. It seems to be adding zeros to numbers that are longer than 4 digits.
AFI12001,01,W-,201405,P,0560,2,0.01375000,US,55.0000
AFI12001,01,17,201404,C,0013300,2,0.15625000,US,21.0000
AFI12001,01,17,201404,C,0013400,2,0.06250000,US,30.0000
AFI12001,01,17,201404,C,0013500,2,0.03125000,US,10.0000
AFI12001,01,17,201404,C,0013700,2,0.01563000,US,25.0000
AFI12001,01,17,201404,P,0012800,2,0.04688000,US,38.0000
AFI12001,01,17,201404,P,0012900,2,0.10938000,US,5.0000

wflynny Over a year ago

Are you sure this is for my solution? This re.sub call will not add any zeros. It will only remove 3 leading zeros from strings. Do you want it to do more? Specifically, 1.) are all your numbers 7 digits? 2.) do you want to remove all leading zeros from numbers (i.e. 0013300 -> 13300)?

shaktimaan · Accepted Answer · 2014-07-25 16:29:29Z

1

You can use lstrip() and zfill() methods. Like this:

import csv

with open('input') as in_file:
    csv_reader = csv.reader(in_file)
    for row in csv_reader:
        stripped_data = row[5].lstrip('0')
        new_data = stripped_data.zfill(4)
        print(new_data)

This prints:

0430
1550

The line:

stripped_data = row[5].lstrip('0')

gets rid of all the zeros on the left. And the line:

new_data = stripped_data.zfill(4)

fills the front with zeros so that the total number of digits is 4.
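To see how the two calls combine on edge cases, here is a short sketch. The all-zero and empty inputs are added for illustration and do not come from the question's sample data.

```python
inputs = ['0000430', '0013300', '0000000', '']

# lstrip('0') removes every leading zero; zfill(4) pads back up to 4 characters.
results = {raw: raw.lstrip('0').zfill(4) for raw in inputs}

for raw in inputs:
    print(repr(raw), '->', repr(results[raw]))
```

`zfill` only pads and never truncates, so `'0013300'` becomes `'13300'`. An empty field becomes `'0000'`, which is why blank columns end up filled with zeros unless they are skipped explicitly.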

Hope this helps.

answered Jul 25, 2014 at 16:29

shaktimaan


1 Comment

zooted Over a year ago

How would I add all the data plus this to another CSV file? Would I just use the write command? In addition, this is adding zeros to blank fields. Is there a way around that?
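One possible way to handle both points raised in this comment is sketched below, assuming Python 3. The helper names `normalize_field` and `rewrite_rows` are hypothetical, and the second input row (with a blank sixth column) is invented to show the blank-field case. In-memory `io.StringIO` objects stand in for real files so the example is self-contained.

```python
import csv
import io

def normalize_field(value, width=4):
    """Strip leading zeros and re-pad to `width`; leave blank fields untouched."""
    if value.strip() == '':
        return value
    return value.lstrip('0').zfill(width)

def rewrite_rows(in_file, out_file, column=5):
    """Copy every CSV row from in_file to out_file, normalizing one column."""
    reader = csv.reader(in_file)
    writer = csv.writer(out_file, lineterminator='\n')
    for row in reader:
        if len(row) > column:  # skip short or empty rows safely
            row[column] = normalize_field(row[column])
        writer.writerow(row)

# Stand-ins for real files; the second row has an invented blank sixth column.
source = io.StringIO(
    "AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000\n"
    "AFI12001,01,S-,201404,C,,2,0.03500000,US,30.0000\n"
)
target = io.StringIO()
rewrite_rows(source, target)
print(target.getvalue())
```

With real files, the same function could be called inside `with open('infilename', newline='') as infile, open('outfilename', 'w', newline='') as outfile:`, where both file names are placeholders.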

furas · Accepted Answer · 2014-07-25 18:15:07Z

0

You can keep the last 4 characters:

columns[5] = columns[5][-4:]

example

data = '''AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,0001550,2,0.03500000,US,30.0000'''

for row in data.splitlines():
    columns = row.split(',')
    columns[5] = columns[5][-4:]
    print(','.join(columns))

result

AFI12001,01,C-,201405,P,0430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,1550,2,0.03500000,US,30.0000
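Slicing relies on the significant digits always fitting into the last four characters. The sketch below illustrates what happens otherwise, using `'0013300'`, a value that appears in the asker's comments elsewhere in this thread.

```python
values = ['0000430', '0001550', '0013300']

# Keep only the last four characters of each value.
last_four = [v[-4:] for v in values]
print(last_four)  # ['0430', '1550', '3300']
```

The third value loses its leading `1`, so this approach fits only when no value exceeds four significant digits.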

EDIT:

Code with the csv module, reading a real file instead of simulated data:

import csv

with open('csvpatpos.csv','r') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        row[5] = row[5][-4:]
        print(row[5]) # print one element
        #print(','.join(row)) # print full row
        print(row) # print full row

edited Jul 25, 2014 at 18:15

answered Jul 25, 2014 at 16:50

furas


3 Comments

zooted Over a year ago

I'm getting an error that columns is undefined, and there are tons of rows with different data after that.

furas Over a year ago

See the new example, which uses the csv module and no data simulating a file.

zooted Over a year ago

This actually worked. However, it's adding zeros to blank columns. Do you know how to get around that?
