I have a csv file with several columns and a header spanning 4 rows. The first column contains the timestamp. Unfortunately it also includes milliseconds, but whenever those are 00 they are omitted from the file. It looks like this:

"TOA5","CR1000","CR1000","E9048"
"TIMESTAMP","RECORD","BattV_Avg","PTemp_C_Avg"
"TS","RN","Volts","Deg C"
"","","Avg","Avg"
"2015-08-28 12:40:23.51",1,12.91,32.13
"2015-08-28 12:50:43.23",2,12.9,32.34
"2015-08-28 13:12:22",3,12.91,32.54

As I don't need the milliseconds, I want to get rid of them, since they make further time calculations more complicated. My approach so far:

Skip the opening quote and extract the next 19 characters of each row to get a format such as 2015-08-28 12:40:23:

timestamp = []
with open(filepath) as f:
    for _ in xrange(4): #skip 4 header rows
        next(f)
    for line in f:
        time = line[1:20] #Get values for the current line
        timestamp.append(time) #Add values to list

From here on I'm struggling with how to proceed. I want to replace the first column in the csv file with the newly created timestamp list.

I tried creating a dictionary, but I don't know how to use the header caption in row 2 as the key:

import csv

d = {}
with open(filepath, 'rb') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for col in csv_reader:
        pass  # use header info from row 2 as key here

This would import the whole csv file into a dict, and I'd then replace the TIMESTAMP entry in the dict with the timestamp list above. Is this even possible?

Or is there an easier way to just replace the first column in the csv with my new list, so that the file ends up containing the timestamps without the millisecond information?

So the first column in my csv should look like this:

"TOA5"
"TIMESTAMP"
"TS"
""
2015-08-28 12:40:23
2015-08-28 12:50:43
2015-08-28 13:12:22
  • Can you show what the file looks like? Especially the header info from row 2 you are trying to get. Commented Sep 12, 2015 at 16:45
  • I would also advise adding what you want the csv to look like after your operation. Commented Sep 12, 2015 at 16:51
  • I edited the initial post, which now contains this information. Commented Sep 12, 2015 at 16:51
  • Is the 1 / 2 also present in the file? And are the headers all quoted like that in the file itself? I would advise you to copy and paste the first few lines of the csv directly here. Commented Sep 12, 2015 at 16:55
  • Updated the initial post. No enumeration in the file, sorry. But yes, the quotes are there. This is what it actually looks like now (there are 30 more columns though). Commented Sep 12, 2015 at 17:02

3 Answers

2

This should do it and preserve the quoting:

import csv

with open(filepath1, 'rb') as fin, open(filepath2, 'wb') as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC)
    for _ in xrange(4):  # copy first 4 header rows
        writer.writerow(next(reader))
    for row in reader:  # process data lines
        row[0] = row[0][:19] # strip fractional seconds from first column
        writer.writerow([row[0], int(row[1])] + map(float, row[2:]))

Since a csv.reader returns the columns of each row as a list of strings, it's necessary to convert any which contain numeric values into their actual int or float numeric value before they're written out to prevent them from being quoted.
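For readers on Python 3, the same idea can be sketched roughly as below. This is an adaptation rather than the answer's original code: it assumes text-mode streams, uses range instead of xrange, and builds the float list with a comprehension because map returns an iterator in Python 3. The function name strip_fractional_seconds, the header_rows parameter, and the in-memory SAMPLE data are made up for illustration.

```python
import csv
import io

# In-memory stand-in for the logger file shown in the question (illustrative only).
SAMPLE = (
    '"TOA5","CR1000","CR1000","E9048"\n'
    '"TIMESTAMP","RECORD","BattV_Avg","PTemp_C_Avg"\n'
    '"TS","RN","Volts","Deg C"\n'
    '"","","Avg","Avg"\n'
    '"2015-08-28 12:40:23.51",1,12.91,32.13\n'
    '"2015-08-28 12:50:43.23",2,12.9,32.34\n'
    '"2015-08-28 13:12:22",3,12.91,32.54\n'
)


def strip_fractional_seconds(fin, fout, header_rows=4):
    """Copy csv data from fin to fout, truncating column 0 to whole seconds."""
    reader = csv.reader(fin)
    # QUOTE_NONNUMERIC quotes every str field and leaves int/float fields bare.
    writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for _ in range(header_rows):
        writer.writerow(next(reader))  # header rows are all strings, so all quoted
    for row in reader:
        timestamp = row[0][:19]  # "YYYY-MM-DD HH:MM:SS" is 19 characters long
        record = int(row[1])
        values = [float(v) for v in row[2:]]  # assumes remaining columns are numeric
        writer.writerow([timestamp, record] + values)


out = io.StringIO()
strip_fractional_seconds(io.StringIO(SAMPLE), out)
result = out.getvalue()
print(result)
```

When working with real files in Python 3, the csv documentation recommends opening them in text mode with newline='' instead of the 'rb' and 'wb' modes used above.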

1

I believe you can easily create a new csv by iterating over the original csv and replacing the timestamp as you want.

Example -

import csv

with open(filepath, 'rb') as csv_file, open('<new file>','wb') as outfile:
    csv_reader = csv.reader(csv_file, delimiter=',')
    csv_writer = csv.writer(outfile, delimiter=',')
    for i, row in enumerate(csv_reader):    # Enumerate so the 4 header rows (indices 0-3) are copied unchanged.
        if i <= 3:
            csv_writer.writerow(row)
        else:
            # csv.reader has already removed the quotes, so keep the first 19 characters.
            csv_writer.writerow([row[0][:19]] + row[1:])

2 Comments

Wow, brilliant. I would have never thought about that! One more issue though: in my newly created file all quotation marks are erased. Any idea why? This could be a problem when replacing this file with the machine output that's accessing this file.
Do you want all fields to be quoted? If so, when creating the csv.writer you can add the extra argument quoting=csv.QUOTE_ALL.
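To illustrate the quoting behaviour discussed in these comments, here is a small Python 3 sketch comparing the writer's default (csv.QUOTE_MINIMAL) with csv.QUOTE_ALL on one data row. The render helper and the sample row are hypothetical and exist only for this demonstration. Note that QUOTE_ALL also quotes the numeric columns, which differs from the original file, where only the timestamp is quoted.

```python
import csv
import io

# One data row as csv.reader would return it: every field is a string.
row = ["2015-08-28 12:40:23", "1", "12.91", "32.13"]


def render(quoting):
    """Write the sample row with the given quoting mode and return the line."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(row)
    return buf.getvalue().strip()


minimal = render(csv.QUOTE_MINIMAL)  # default: quote only when a field requires it
quote_all = render(csv.QUOTE_ALL)    # quote every field, numbers included

print(minimal)
print(quote_all)
```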
0

I'm not entirely sure how to parse your csv, but I would do something like this:

time = time.split(".")[0]

That way, if the timestamp has milliseconds they are removed, and if it doesn't, nothing changes.
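A minimal sketch of this split-based approach, applied to the two timestamp shapes from the question. The helper name drop_fraction is hypothetical, and the sketch assumes the only "." in the timestamp is the one before the fractional seconds.

```python
def drop_fraction(timestamp):
    """Return the timestamp without anything after the first '.'."""
    # split(".") yields a single-element list when no "." is present,
    # so timestamps without a fractional part pass through unchanged.
    return timestamp.split(".")[0]


samples = ["2015-08-28 12:40:23.51", "2015-08-28 13:12:22"]
cleaned = [drop_fraction(s) for s in samples]
print(cleaned)
```

Unlike slicing to a fixed width, this does not depend on the timestamp always being 19 characters long.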
