
Hello, I have a CSV file and I need to remove the leading zeros from one column with Python.

Column 6 (index 5 in Python) is zero-padded to 7 digits, like this:

AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,0001550,2,0.03500000,US,30.0000

I need to remove the leading zeros, then add back a zero or zeros so that the value has 4 digits in total.

So I need it to look like this:

AFI12001,01,C-,201405,P,0430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,1550,2,0.03500000,US,30.0000

This code adds the zeros:

import csv

new_rows = []
with open('csvpatpos.csv','r') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        new_row = ""
        col = 0
        print(row)
        for x in row:
            col = col + 1
            if col == 6:
                if len(x) == 3:
                    x = "0" + x
            new_row = new_row + x + ","
        print(new_row)

However, I'm having trouble removing the zeros in front.

  • are all the numbers at the same index? Commented Jul 25, 2014 at 16:33
  • BTW: correct the indentation to make your code readable. Commented Jul 25, 2014 at 18:11

4 Answers

2

Convert the column to an int then back to a string in whatever format you want.

row[5] = "%04d" % int(row[5])
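As a quick check of this formatting step, here is a minimal sketch applied to the two sample values from the question plus one longer value that appears later in this thread. The helper name `pad4` is made up for illustration. Note that `%04d` pads to at least 4 digits and does not truncate longer numbers.

```python
def pad4(value):
    """Drop leading zeros via int(), then left-pad with zeros to at least 4 digits."""
    return "%04d" % int(value)

samples = ["0000430", "0001550", "0013300"]
padded = [pad4(s) for s in samples]
print(padded)  # ['0430', '1550', '13300']
```

Because `int()` raises `ValueError` on an empty string, blank fields would need to be skipped before calling this helper.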

1

You could probably do this in several steps with .lstrip(), then finding the resulting string length, then adding on 4-len(s) 0s to the front. However, I think it's easier with regex.

import csv
import re

with open('infilename', 'r') as infile:
    reader = csv.reader(infile)
    for row in reader:
        stripped_value = re.sub(r'^0{3}', '', row[5])
        print(stripped_value)

Yields

0430
1550

In the regex, we are using the format sub(pattern, substitute, original). The pattern breakdown is:

'^' - match start of string
'0{3}' - match 3 zeros

You said all the strings in the 6th column have 7 digits, and you want 4, so replace the first 3 with an empty string.
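To see how the fixed-count pattern behaves on other inputs, here is a small sketch. The helper name `strip_three` is hypothetical, and the third value is a longer number that shows up later in this thread. The pattern only matches when the string starts with at least three zeros, so a value with two leading zeros is left unchanged.

```python
import re

def strip_three(value):
    # Remove exactly three zeros, and only if they are at the start of the string.
    return re.sub(r'^0{3}', '', value)

values = ['0000430', '0001550', '0013300']
stripped = [strip_three(v) for v in values]
print(stripped)  # ['0430', '1550', '0013300']
```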


Edit: If you want to replace the rows, I would just write it out to a new file:

with open('infilename', 'r') as infile, open('outfilename', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        row[5] = re.sub(r'^0{3}', '', row[5])
        writer.writerow(row)

Edit2: In light of your newest requests, I would recommend doing the following:

with open('infilename', 'r') as infile, open('outfilename', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        # strip all 0's from the front
        stripped_value = re.sub(r'^0+', '', row[5])
        # pad zeros on the left to smaller numbers to make them 4 digits
        row[5] = '%04d'%int(stripped_value)
        writer.writerow(row)

Given the following numbers,

['0000430', '0001550', '0013300', '0012900', '0100000', '0001000']

this yields

['0430', '1550', '13300', '12900', '100000', '1000']
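For trying the read, fix, and write loop without touching real files, here is a minimal Python 3 sketch that uses in-memory text streams in place of `infilename` and `outfilename`. The helper `fix_column` and the third input row (with a blank column 6) are made up for illustration, and leaving non-numeric fields untouched is an assumption about the desired behavior. Since `int()` already ignores leading zeros, the sketch skips the separate regex step.

```python
import csv
import io

def fix_column(value):
    # Assumption: blank or non-numeric fields should be left as they are.
    if not value.isdigit():
        return value
    # int() drops the leading zeros; %04d pads back to at least 4 digits.
    return "%04d" % int(value)

# In-memory stand-in for the input file; the last row is a hypothetical blank case.
source = io.StringIO(
    "AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000\n"
    "AFI12001,01,17,201404,C,0013300,2,0.15625000,US,21.0000\n"
    "AFI12001,01,17,201404,C,,2,0.15625000,US,21.0000\n"
)
target = io.StringIO()
writer = csv.writer(target, lineterminator="\n")

for row in csv.reader(source):
    row[5] = fix_column(row[5])
    writer.writerow(row)

result = target.getvalue()
print(result)
```

With real files in Python 3, the csv documentation recommends opening them with `newline=''` so the writer controls line endings.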

9 Comments

Hi Bill, this actually might work. Do I need to write to the file as well to replace the data?
It doesn't look like it's actually working. It's keeping the zeros for some reason. But it looks like it's going down the right track.
Nick are you still having problems with this?
Not working just yet. It seems to be adding zeros to numbers that are longer than 4 digits.
AFI12001,01,W-,201405,P,0560,2,0.01375000,US,55.0000
AFI12001,01,17,201404,C,0013300,2,0.15625000,US,21.0000
AFI12001,01,17,201404,C,0013400,2,0.06250000,US,30.0000
AFI12001,01,17,201404,C,0013500,2,0.03125000,US,10.0000
AFI12001,01,17,201404,C,0013700,2,0.01563000,US,25.0000
AFI12001,01,17,201404,P,0012800,2,0.04688000,US,38.0000
AFI12001,01,17,201404,P,0012900,2,0.10938000,US,5.0000
Are you sure this is for my solution? This re.sub call will not add any zeros. It only removes 3 leading zeros from strings. Do you want it to do more? Specifically: 1) are all your numbers 7 digits? 2) do you want to remove all leading zeros from numbers (i.e. 0013300 -> 13300)?
1

You can use the lstrip() and zfill() methods, like this:

import csv

with open('input') as in_file:
    csv_reader = csv.reader(in_file)
    for row in csv_reader:
        stripped_data = row[5].lstrip('0')
        new_data = stripped_data.zfill(4)
        print(new_data)

This prints:

0430
1550

The line:

stripped_data = row[5].lstrip('0')

gets rid of all the zeros on the left. And the line:

new_data = stripped_data.zfill(4) 

pads the front with zeros so that the total number of digits is at least 4 (longer values are left as they are).
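One thing to watch for with this approach is that an empty string also gets padded, because `''.zfill(4)` returns `'0000'`. Here is a minimal sketch with a guard for that case. The helper name `strip_and_pad` is hypothetical, and passing blank fields through unchanged is an assumption about what is wanted.

```python
def strip_and_pad(value, width=4):
    # Assumption: blank (or whitespace-only) fields should pass through unchanged.
    if not value.strip():
        return value
    # Remove all leading zeros, then pad back to at least `width` digits.
    return value.lstrip('0').zfill(width)

unguarded_blank = ''.lstrip('0').zfill(4)
print(repr(unguarded_blank))          # '0000'
print(repr(strip_and_pad('')))        # ''
print(strip_and_pad('0000430'))       # 0430
print(strip_and_pad('0013300'))       # 13300
```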

Hope this helps.

1 Comment

How would I add all the data plus this to another CSV file? Would I just use the write command? In addition, this is adding zeros to blank fields. Is there a way around that?
0

You can keep just the last 4 characters:

columns[5] = columns[5][-4:]

Example:

data = '''AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,0001550,2,0.03500000,US,30.0000'''

for row in data.splitlines():
    columns = row.split(',')
    columns[5] = columns[5][-4:]
    print(','.join(columns))

Result:

AFI12001,01,C-,201405,P,0430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,1550,2,0.03500000,US,30.0000
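A caveat with slicing, sketched below: it keeps exactly the last 4 characters, so it truncates values with more than 4 significant digits and does not pad values shorter than 4 characters. The helper name `last_four` is made up for illustration, and the inputs other than the first are hypothetical test values.

```python
def last_four(value):
    # Keep at most the last 4 characters; no zero stripping or padding happens here.
    return value[-4:]

values = ['0000430', '0013300', '12', '']
sliced = [last_four(v) for v in values]
print(sliced)  # ['0430', '3300', '12', '']
```

So this shortcut fits only when every value is known to have at least 4 characters and at most 4 significant digits.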

EDIT:

The same code using the csv module, reading from the file instead of simulated data:

import csv

with open('csvpatpos.csv','r') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        row[5] = row[5][-4:]
        print(row[5])  # print one element
        #print(','.join(row))  # print full row as CSV text
        print(row)  # print full row as a list

3 Comments

I'm getting an error that columns is undefined, and there are tons of rows with different data after that.
See new example with module csv and without data simulating file.
This actually worked. However, it's adding zeros to blank columns. Do you know how to get around that?
