0

I have multiple .csv files, and I combined them into a single .csv file using Python.

Now I need to automate replacing the content of one column in that .csv file with Python. I could open the file in Notepad and replace the column's content by hand, but the file is very large and that takes a long time.

Name                          ID                                                class  Num
"kanika",""University ISD_po.log";" University     /projects/asd/new/high/sde"","MBA","12"
"Ambika",""University ISD_po.log";" University     /projects/asd/new/high/sde"","MS","13"

In the sample above, I need to replace the content of the ID column. The new content of the ID column should be "input".

The ID column is enclosed in two double quotes and also contains some extra spaces, whereas the other columns are enclosed in a single double quote.

Is there any way to do this in Python?

The code I used to combine the multiple .csv files is:

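# Append sh1.csv in full, then sh2.csv to sh20.csv without their header lines.
with open("out.csv", "a") as fout:
    with open("sh1.csv") as first:
        for line in first:
            fout.write(line)
    for num in range(2, 21):
        with open("sh" + str(num) + ".csv") as f:
            next(f)  # skip the header line
            for line in f:
                fout.write(line)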
7
  • Please read: docs.python.org/library/csv.html Commented Jan 16, 2012 at 20:36
  • That's not a csv file. Where are the commas? Commented Jan 16, 2012 at 20:40
  • @Wooble tab delimited csv perhaps? Commented Jan 16, 2012 at 20:52
  • Perhaps, but there are no tabs in what was posted, just spaces. Could be some kind of fixed-width format, I suppose. Commented Jan 16, 2012 at 20:53
  • No, I separated it with spaces so that you can understand it. It is separated by commas. Commented Jan 16, 2012 at 20:59

4 Answers

4

As other people have indicated, one normally uses the csv module to read and write a CSV file from Python.

However, if your file looks exactly like what you posted, it is not well formed, and Python's csv module won't be able to deal with it properly (the double quotes are misused in the column you want to change).

It is therefore worth treating your file as a plain text file and making the changes there:

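with open("myfile.csv") as input_file:
    with open("output.csv", "wt") as output:
        # Copy the header line unchanged.
        output.write(input_file.readline())
        for line in input_file:
            # The malformed column is delimited by doubled double quotes.
            parts = line.split('""')
            # Keep only the text after the last inner quote of that column.
            new_id = parts[1].split('"')[-1]
            output.write(parts[0] + new_id + parts[2])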
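
For the sample rows shown in the question, where the ID column should end up containing "input", the same text-based idea might be sketched as follows. The helper name replace_id is hypothetical, and the sketch assumes that each data line contains exactly one field wrapped in doubled double quotes, which holds for the two sample rows but may not hold for the whole file.

```python
def replace_id(line, new_value="input"):
    """Return the line with its doubled-quote field replaced by "new_value".

    Assumption (not from the original post): the malformed field starts at
    the first '""' on the line and ends at the last '""' on the line.
    """
    start = line.find('""')
    end = line.rfind('""')
    if start == -1 or start == end:
        # No doubled-quote field found: leave the line untouched.
        return line
    # Keep everything before the field, insert the new quoted value,
    # then keep everything after the closing '""'.
    return line[:start] + '"' + new_value + '"' + line[end + 2:]
```

Applied line by line inside the loop above (after copying the header), this would write "input" in place of the malformed column while leaving the other columns as they are.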

1 Comment

I tried the code, but got an error on the line output.write(input_file.readline()): IOError: File not open for writing
2

Try Python's csv module to read and write CSV files.

3 Comments

But how do I replace the content?
Simply read the original data from the input file and write the modified data to the output file, one row at a time.
-0. The OP probably would not be able to handle such generic instructions.
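
The row-at-a-time approach described in these comments could look roughly like the sketch below. It assumes the input is well-formed CSV (another answer on this page argues that the posted sample is not), and the function name replace_column, the column index 1, and the value "input" are illustrative choices rather than anything prescribed by the csv module.

```python
import csv


def replace_column(src, dst, column=1, new_value="input"):
    """Copy CSV rows from src to dst, overwriting one column.

    src and dst are open text-mode file objects. The header row is copied
    unchanged; every later row gets new_value at the given column index.
    """
    reader = csv.reader(src)
    # QUOTE_ALL mirrors the fully quoted style of the sample data rows.
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to copy
    writer.writerow(header)
    for row in reader:
        # Rows that are too short (for example blank lines) pass through.
        if len(row) > column:
            row[column] = new_value
        writer.writerow(row)
```

Because rows are read and written one at a time, the whole file never has to fit in memory. In Python 3 the files would typically be opened with newline="" as the csv documentation recommends, for example open("out.csv", newline="") for reading and open("fixed.csv", "w", newline="") for writing, where fixed.csv is a placeholder name.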
0

You could use a regular expression to remove it:

In [3]: re.sub(r'""Uni-\s*"([0-9]+)""', r'\1', '""Uni-  "38447484""', flags=re.I)
Out[3]: '38447484'
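
For the sample rows currently shown in the question, a comparable regular-expression sketch might replace the whole doubled-quote field with "input". The pattern and the helper name replace_id_field are assumptions made for illustration: the pattern expects the field to open and close with doubled double quotes and to contain no comma.

```python
import re

# Hypothetical pattern: a field that opens and closes with doubled double
# quotes and has no comma or newline in between.
ID_FIELD = re.compile(r'""[^,\n]*""')


def replace_id_field(line, new_value="input"):
    """Replace the first doubled-quote field on the line with "new_value"."""
    # A function replacement avoids backslash processing of new_value.
    return ID_FIELD.sub(lambda match: '"%s"' % new_value, line, count=1)
```

Lines without such a field are returned unchanged, so the header line can pass through the same function.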

0

You just want to remove ""Uni- and " followed by a space.

Change your code to:

for line in f:
    line=line.replace('""Uni-','').replace('" ','')
    fout.write(line)

You get, for example:

kanika "38447484" MBA
