Replace column in csv with modified column

Question

I have a csv file with a couple of columns and a 4-row header. The first column contains the timestamp. Unfortunately it also includes milliseconds, but whenever those are 00 they are omitted from the file. It looks like this:

"TOA5","CR1000","CR1000","E9048"
"TIMESTAMP","RECORD","BattV_Avg","PTemp_C_Avg"
"TS","RN","Volts","Deg C"
"","","Avg","Avg"
"2015-08-28 12:40:23.51",1,12.91,32.13
"2015-08-28 12:50:43.23",2,12.9,32.34
"2015-08-28 13:12:22",3,12.91,32.54

As I don't need the milliseconds, I want to get rid of them, since they make further time calculations a bit complicated. My approach so far:

Extract the 19 characters after the opening quote in each row to get a format such as 2015-08-28 12:40:23:

timestamp = []
with open(filepath) as f:
    for _ in xrange(4): #skip 4 header rows
        next(f)
    for line in f:
        time = line[1:20] #Get values for the current line
        timestamp.append(time) #Add values to list

From here on I'm struggling with how to proceed. I want to replace the first column in the csv file with the newly created timestamp list.

I tried creating a dictionary, but I don't know how to use the header caption in row 2 as the key:

import csv

d = {}
with open(filepath, 'rb') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        pass  # use header info from row 2 as key here

This would import the whole csv file into a dict, and I'd then replace the TIMESTAMP entry in the dict with the timestamp list above. Is this even possible?

Or is there an easier way to just replace the first column in the csv with my new list, so that the csv file ends up containing the timestamp without the millisecond information?

So the first column in my csv should look like this:

"TOA5"
"TIMESTAMP"
"TS"
""
2015-08-28 12:40:23
2015-08-28 12:50:43
2015-08-28 13:12:22

Can you show what the file looks like? Especially the header info from row 2 you are trying to get.
– Anand S Kumar, Commented Sep 12, 2015 at 16:45
Also, I would advise adding what you want the csv to look like after your operation.
– Anand S Kumar, Commented Sep 12, 2015 at 16:51
I edited the initial post, which now contains this information.
– GeoEki, Commented Sep 12, 2015 at 16:51
Is the 1 / 2 also present in the file? And are the headers all quoted like that in the file itself? I would advise you to copy and paste the first few lines of the csv here directly.
– Anand S Kumar, Commented Sep 12, 2015 at 16:55
Updated the initial post. No enumeration in the file, sorry. But yes, the quotes are there. This is how it actually looks now (there are 30 more columns, though).
– GeoEki, Commented Sep 12, 2015 at 17:02

martineau · Accepted Answer · 2017-03-09 16:19:08Z

2

This should do it and preserve the quoting:

import csv

# Python 2: the csv module expects files opened in binary mode
with open(filepath1, 'rb') as fin, open(filepath2, 'wb') as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC)
    for _ in xrange(4):  # copy first 4 header rows
        writer.writerow(next(reader))
    for row in reader:  # process data lines
        row[0] = row[0][:19]  # strip fractional seconds from first column
        writer.writerow([row[0], int(row[1])] + map(float, row[2:]))

Since a csv.reader returns the columns of each row as a list of strings, any columns that hold numeric values have to be converted to actual int or float values before they're written out. Otherwise QUOTE_NONNUMERIC would quote them like any other string.
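
The snippet above is written for Python 2 (xrange, binary-mode file handles, and map returning a list). On Python 3 the same idea might look roughly like the sketch below. The function name strip_fractional_seconds and its header_rows parameter are made up for illustration, and the sketch assumes, as in the sample data, that the second column is always an integer and every remaining column is a float.

```python
import csv


def strip_fractional_seconds(src_path, dst_path, header_rows=4):
    """Copy a CSV file, cutting the first column of each data row to 19 characters."""
    # newline='' lets the csv module handle line endings itself (Python 3).
    with open(src_path, newline='') as fin, \
            open(dst_path, 'w', newline='') as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC)
        for _ in range(header_rows):  # copy the header rows unchanged
            writer.writerow(next(reader))
        for row in reader:  # process data rows
            timestamp = row[0][:19]  # 'YYYY-MM-DD HH:MM:SS' is 19 characters
            # Assumption: column 2 is an int, all later columns are floats.
            # Converting them keeps QUOTE_NONNUMERIC from quoting them.
            writer.writerow([timestamp, int(row[1])] + [float(v) for v in row[2:]])
```

It would be called with two paths, for example strip_fractional_seconds('input.dat', 'output.dat'), where both file names are placeholders.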

edited Mar 9, 2017 at 16:19

answered Sep 12, 2015 at 17:32

Anand S Kumar · Accepted Answer · 2015-09-12 17:08:04Z

1

I believe you can easily create a new csv by iterating over the original csv and replacing the timestamp as you go.

Example:

import csv

with open(filepath, 'rb') as csv_file, open('<new file>', 'wb') as outfile:
    csv_reader = csv.reader(csv_file, delimiter=',')
    csv_writer = csv.writer(outfile, delimiter=',')
    # Enumerate so the 4 header rows (index 0 to 3) can be copied unchanged.
    for i, row in enumerate(csv_reader):
        if i <= 3:
            csv_writer.writerow(row)
        else:
            # csv.reader has already removed the quotes, so slice from index 0.
            csv_writer.writerow([row[0][:19]] + row[1:])

answered Sep 12, 2015 at 17:08

2 Comments

GeoEki Over a year ago

Wow, brilliant. I would have never thought about that! One more issue though: in my newly created file all quotation marks are erased. Any idea why? This could be a problem when replacing this file with the machine output that's accessing this file.

Anand S Kumar Over a year ago

Do you want all fields to be quoted? If so, when creating the csv.writer you can add an extra argument, quoting=csv.QUOTE_ALL.
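
To see what the quoting argument changes, here is a small Python 3 sketch that writes the same row under three of the csv module's quoting modes. render_row is a hypothetical helper, not part of the csv module. Note that QUOTE_NONNUMERIC only leaves a field unquoted if it is an actual int or float; strings read back from csv.reader are quoted even if they look numeric.

```python
import csv
import io


def render_row(row, quoting):
    """Return one row as CSV text, written with the given quoting mode."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=quoting, lineterminator='\n').writerow(row)
    return buffer.getvalue().rstrip('\n')


row = ['2015-08-28 12:40:23', 1, 12.91, 32.13]

# Default mode: quotes only fields that need them.
print(render_row(row, csv.QUOTE_MINIMAL))     # 2015-08-28 12:40:23,1,12.91,32.13
# Every field quoted, numbers included.
print(render_row(row, csv.QUOTE_ALL))         # "2015-08-28 12:40:23","1","12.91","32.13"
# Only non-numeric fields quoted.
print(render_row(row, csv.QUOTE_NONNUMERIC))  # "2015-08-28 12:40:23",1,12.91,32.13
```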

O Green · Accepted Answer · 2015-09-12 16:57:38Z

0

I'm not entirely sure how to parse your csv, but I would do something like this:

time = time.split(".")[0]

So if the timestamp does have milliseconds they get removed, and if it doesn't, nothing changes.
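
A quick check of that idea against the timestamps from the question, as a minimal sketch (drop_fraction is a made-up name). It assumes the split is applied to the timestamp field only. Applied to a whole line of the file, it would cut at the first '.' anywhere in the line, including the decimal point of a later numeric column.

```python
def drop_fraction(timestamp):
    """Return the timestamp without anything after the first '.'."""
    return timestamp.split('.')[0]


samples = ['2015-08-28 12:40:23.51', '2015-08-28 13:12:22']
cleaned = [drop_fraction(ts) for ts in samples]
print(cleaned)  # ['2015-08-28 12:40:23', '2015-08-28 13:12:22']
```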

answered Sep 12, 2015 at 16:57
