I am trying to sum values from a CSV file for rows that belong to the same user. It is hard to explain clearly, so here is an example.

=====================
|E-mail  | M-count |
|[email protected] | 12      |
|[email protected] | 8       |
|[email protected] | 13      |
|[email protected] | 2       |
=====================

The values that belong to each user should then be added together:

=====================
|E-mail  | Total   |
|[email protected] | 25      |
|[email protected] | 8       |
|[email protected] | 2       |
=====================

I split the CSV and stored the values that I need in a set, but I can't think of a way to add them up per user. Any ideas?

Edit:

This is what my CSV looks like:

p_number,duration,clnup#
5436715524,00:02:26,2
6447654246,00:17:18,5
5996312484,00:01:19,1
5436715524,00:10:12,6

I would like to get the total duration and the total clnup# for each unique p_number. I am sorry for the confusion; the table above was just an example.

  • It would be better to show us a fragment of the file and the code you have written so far. Commented Jun 11, 2015 at 20:46
  • I edited the post. Regarding the code, it's currently just reading the CSV. Commented Jun 11, 2015 at 20:57

2 Answers

You can use an OrderedDict, storing the names as keys and updating each count as you go:

import csv
from collections import OrderedDict

od = OrderedDict()

with open("test.txt") as f:
    r = csv.reader(f)
    head = next(r)
    for name,val in r:
        od.setdefault(name, 0)
        od[name]  += int(val)

print(od)
OrderedDict([('[email protected]', 25), ('[email protected]', 8), ('[email protected]', 2)])

To update the original file, write to a NamedTemporaryFile and then use shutil.move to replace the original once the rows from od have been written with writerows:

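import csv
from collections import OrderedDict
from shutil import move
from tempfile import NamedTemporaryFile

od = OrderedDict()

# Text mode with newline="" is what the csv module expects on Python 3.
with open("test.txt", newline="") as f, \
        NamedTemporaryFile("w", dir=".", delete=False, newline="") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    head = next(r)  # keep the header so it can be written back unchanged
    wr.writerow(head)
    for name, val in r:
        od.setdefault(name, 0)
        od[name] += int(val)
    wr.writerows(od.items())  # use od.iteritems() on Python 2

# Replace the original only after both files have been closed.
move(out.name, "test.txt")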

Output:

E-mail,M-count
[email protected],25
[email protected],8
[email protected],2

If you don't care about order, use a defaultdict instead:

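import csv
from collections import defaultdict
from shutil import move
from tempfile import NamedTemporaryFile

od = defaultdict(int)  # missing keys start at 0, so no setdefault is needed

# Text mode with newline="" is what the csv module expects on Python 3.
with open("test.txt", newline="") as f, \
        NamedTemporaryFile("w", dir=".", delete=False, newline="") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    head = next(r)
    wr.writerow(head)
    for name, val in r:
        od[name] += int(val)
    wr.writerows(od.items())  # use od.iteritems() on Python 2

# Replace the original only after both files have been closed.
move(out.name, "test.txt")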
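
For the CSV shown in the question's edit (p_number,duration,clnup#), the same grouping idea might be extended to keep two running totals per key. The following is only a sketch: it assumes every duration is a well-formed HH:MM:SS string, and the helper names parse_duration and total_by_number are made up for illustration. It reads from an in-memory string so it can be run without a file.

```python
import csv
import io
from collections import OrderedDict
from datetime import timedelta


def parse_duration(text):
    """Convert an 'HH:MM:SS' string to a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def total_by_number(lines):
    """Return {p_number: [total_duration, total_clnup]} in first-seen order."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    totals = OrderedDict()
    for p_number, duration, clnup in reader:
        # Each key starts with a zero duration and a zero count.
        entry = totals.setdefault(p_number, [timedelta(), 0])
        entry[0] += parse_duration(duration)
        entry[1] += int(clnup)
    return totals


# Sample rows copied from the question's edit.
sample = io.StringIO(
    "p_number,duration,clnup#\n"
    "5436715524,00:02:26,2\n"
    "6447654246,00:17:18,5\n"
    "5996312484,00:01:19,1\n"
    "5436715524,00:10:12,6\n"
)

totals = total_by_number(sample)
for p_number, (duration, clnup) in totals.items():
    print(p_number, duration, clnup)
```

With a real file, passing an open file object (opened with newline="") in place of the StringIO should work the same way, since csv.reader accepts any iterable of lines.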

2 Comments

Thank you, sir! This is actually really interesting, it never came to mind that I could use this method. I will try it out!
@user5000054, no worries. This presumes the data is as posted and delimited by commas.
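import csv
from pprint import pprint

d = {}
# Text mode with newline='' is what the csv module expects on Python 3
# (the 'rb' mode only works on Python 2).
with open('sample.csv', newline='') as ifile:
    csv_reader = csv.reader(ifile)
    for row in csv_reader:
        # Start each key at 0, then add this row's count.
        d[row[0]] = d.get(row[0], 0) + int(row[1])

pprint(d)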

1 Comment

@user5000054, you may want to skip the first row of the input CSV if it contains a header.
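
The header-skipping suggestion in this comment could look roughly like the sketch below. The function name sum_by_first_column and its has_header flag are hypothetical, and the names alice, bob and carol are placeholders, since the real addresses are not shown in the question.

```python
import csv
import io


def sum_by_first_column(lines, has_header=True):
    """Sum the integer in column 2 for each distinct value in column 1."""
    reader = csv.reader(lines)
    if has_header:
        # Discard the header row; the None default avoids StopIteration
        # if the input is empty.
        next(reader, None)
    totals = {}
    for row in reader:
        if not row:  # ignore blank lines
            continue
        key, value = row[0], int(row[1])
        totals[key] = totals.get(key, 0) + value
    return totals


# Placeholder data in the same shape as the question's first table.
sample = io.StringIO("E-mail,M-count\nalice,12\nbob,8\nalice,13\ncarol,2\n")
totals = sum_by_first_column(sample)
print(totals)
```

Without the next() call, int() would be applied to the header text "M-count" and raise a ValueError.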
