I have a CSV file similar to the following:

title  title2  h1  h2  h3 ... 
l1.1     l1     1   1   0  
l1.2     l1     0   1   0
l1.3     l1     1   0   1
l2.1     l2     0   0   1
l2.2     l2     1   0   1
l3.1     l3     0   1   1
l3.2     l3     1   1   0
l3.3     l3     1   1   0
l3.4     l3     1   1   0    

I want to add up the columns in the following manner:
h1 (l1.1 + l1.2 + l1.3) = 2
h1 (l2.1 + l2.2) = 1
h1 (l3.1 + l3.2 + l3.3 + l3.4) = 3
and so on for every column. I want the final count for every such value as a summarised table:

title2  h1  h2  h3...
l1     2   2   1
l2     1   0   2
l3     3   4   1

How do I implement this?

  • That's not a csv, there are no commas! ;) Commented Jul 12, 2010 at 7:55
  • Well, it does not appear with commas in Excel... and CSV does not necessarily mean comma separated; it can be tab separated too. Commented Jul 12, 2010 at 7:56
  • @newbie: no one is interested in your deadline. Stop fiddling with tags, thanks. Commented Jul 12, 2010 at 8:54
  • Why not simply build a PivotTable in Excel? That's exactly what they're for. Commented Jul 12, 2010 at 9:09
  • But is this technically feasible in Python? Commented Jul 12, 2010 at 9:18

2 Answers

Something like this should work. It takes an input in the form

title,title2,h1,h2,h3
l1.1,l1,1,1,0
l1.2,l1,0,1,0
l1.3,l1,1,0,1
l2.1,l2,0,0,1
l2.2,l2,1,0,1
l3.1,l3,0,1,1
l3.2,l3,1,1,0
l3.3,l3,1,1,0
l3.4,l3,1,1,0

and outputs

title2,h1,h2,h3
l1,2,2,1
l2,1,0,2
l3,3,4,1

Tested with Python 3.1.2. In Python 2.x you'll need to change the open() calls to use binary mode and drop the newline="" argument. You can also drop the call to list(), since in Python 2.x map() already returns a list.

import csv
import operator

reader = csv.reader(open("test.csv", newline=""), dialect="excel")
result = {}

for pos, entry in enumerate(reader):
    if pos == 0:
        headers = entry
    else:
        if entry[1] in result:
            result[entry[1]] = list(map(operator.add, result[entry[1]], [int(i) for i in entry[2:]]))
        else:
            result[entry[1]] = [int(i) for i in entry[2:]]

writer = csv.writer(open("output.txt", "w", newline=""), dialect="excel")
writer.writerow(headers[1:])

keys = sorted(result.keys())
for key in keys:
    output = [key]
    output.extend(result[key])
    writer.writerow(output)
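
As a variation on the script above, here is a minimal sketch that accumulates the totals in a collections.defaultdict and reads from an in-memory string, so it can be run without any files. The summarise() helper and the sample variable are names made up for this illustration; they are not part of the original script.

```python
import csv
import io
from collections import defaultdict


def summarise(rows):
    """Sum the numeric columns of each row, grouped by the second column.

    `rows` is an iterable of lists as produced by csv.reader; the first
    row is treated as the header. (Hypothetical helper for illustration.)
    """
    rows = iter(rows)
    header = next(rows)
    n_values = len(header) - 2          # columns after title and title2
    totals = defaultdict(lambda: [0] * n_values)
    for row in rows:
        if not row:                     # skip blank lines
            continue
        key = row[1]
        for i, cell in enumerate(row[2:]):
            totals[key][i] += int(cell)
    return header[1:], dict(totals)


# Same data as the example input above, held in a string for the demo.
sample = """title,title2,h1,h2,h3
l1.1,l1,1,1,0
l1.2,l1,0,1,0
l1.3,l1,1,0,1
l2.1,l2,0,0,1
l2.2,l2,1,0,1
l3.1,l3,0,1,1
l3.2,l3,1,1,0
l3.3,l3,1,1,0
l3.4,l3,1,1,0
"""

header, totals = summarise(csv.reader(io.StringIO(sample)))

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
for key in sorted(totals):
    writer.writerow([key] + totals[key])

print(out.getvalue())
```

To use it on a real file, pass csv.reader(open("test.csv", newline="")) to summarise() instead of the StringIO object, ideally inside a with block so the file gets closed.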

2 Comments

I'm using Python 2.7. I've changed the open() part to open('test.csv', 'rb'), but I am getting a syntax error on line 12, the "result[entry[1]] = list(map(operator.add, result[entry[1]], [int(i) for i in entry[2:]]))" line. Is there any modification I might have to make?
Ah yes, remove the list(); that's only needed in Python 3, since map returns an iterator instead of a list in Python 3.
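
For anyone wondering what list() changes here, this is a small sketch of how map() behaves in Python 3, using two made-up rows (running and new_row are illustrative names only). The map object is lazy and can be consumed only once, so wrapping it in list() gives a concrete list that can be stored and reused.

```python
import operator

running = [1, 1, 0]   # totals so far for one title2 value (made-up data)
new_row = [0, 1, 0]   # the next row's h1, h2, h3 values (made-up data)

lazy = map(operator.add, running, new_row)   # a map object in Python 3
summed = list(lazy)                          # materialise it as a list
print(summed)        # [1, 2, 0]

# The map object is now exhausted, so a second pass yields nothing.
print(list(lazy))    # []
```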

Have a look at the csv module. Open the file with a csv.reader, then iterate over it one row at a time, accumulating the results of the additions in a temporary list. When you are done, write this list out with a csv.writer.

You might need to define a dialect as you are not really using CSV but some tab-delimited format.
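
The steps above could be sketched roughly like this, assuming the data really is tab-delimited and that the second column (title2) is the grouping key. It uses the csv module's built-in "excel-tab" dialect, and in-memory strings stand in for the real input and output files; tab_data, sums, and the other variable names are invented for this example.

```python
import csv
import io

# Hypothetical tab-separated input, standing in for the real file.
tab_data = (
    "title\ttitle2\th1\th2\th3\n"
    "l1.1\tl1\t1\t1\t0\n"
    "l1.2\tl1\t0\t1\t0\n"
    "l1.3\tl1\t1\t0\t1\n"
    "l2.1\tl2\t0\t0\t1\n"
    "l2.2\tl2\t1\t0\t1\n"
    "l3.1\tl3\t0\t1\t1\n"
    "l3.2\tl3\t1\t1\t0\n"
    "l3.3\tl3\t1\t1\t0\n"
    "l3.4\tl3\t1\t1\t0\n"
)

# Step 1: open the data with a csv.reader using a tab dialect.
reader = csv.reader(io.StringIO(tab_data), dialect="excel-tab")
header = next(reader)

# Step 2: iterate one row at a time, accumulating sums per title2 value.
sums = {}
for row in reader:
    group = row[1]
    values = [int(cell) for cell in row[2:]]
    if group in sums:
        sums[group] = [a + b for a, b in zip(sums[group], values)]
    else:
        sums[group] = values

# Step 3: write the accumulated totals with a csv.writer.
output = io.StringIO()
writer = csv.writer(output, dialect="excel-tab", lineterminator="\n")
writer.writerow(header[1:])
for group in sorted(sums):
    writer.writerow([group] + sums[group])

print(output.getvalue())
```

The dictionary keyed on title2 is what keeps the totals for l1, l2 and l3 separate even though the file is read row by row.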

5 Comments

Even if I read one row at a time, how do I accumulate the results for l1 alone, l2 alone and so on? Also, the file is in fact comma separated; I've just shown the table like this for better understanding.
I can, but I won't. Accumulation is basic programming practice.
I've done some editing to the above table... I have now managed to get it in the above format. Will this help?
Can you at least give me more instructions on how to do this? I really can't get the hang of it.
I don't mean to read it row-wise; I want to read it column-wise. Can this be done by comparing the title2 column's data?
