I have a MySQL table that has some integer fields and some text fields. Each text field holds multiple numbers separated by commas.

What I need is to return the result of summing all the columns, grouped by two particular keys. I want to sum all the integer fields (which is straightforward) as well as all the text fields, where the comma-separated values should be summed position by position.

I can't explain it more clearly without an example, so here is what I want:

Table:
Key1    |Key2   |Col1   |Col2   |Col3   |
A       |X      |2      |12     |2,4,6  |
A       |X      |4      |23     |3,6,9  |
A       |Y      |6      |54     |1,3,5  |
A       |Y      |8      |27     |4,8,12 |
B       |X      |1      |12     |5,10,5 |
B       |X      |3      |31     |6,3,1  |
B       |Y      |5      |23     |1,0,0  |
B       |Y      |7      |91     |2,5,6  |

Output I want:
Key1    |Key2   |Col1   |Col2   |Col3   |
A       |X      |6      |35     |5,10,15|
A       |Y      |14     |81     |5,11,17|
B       |X      |4      |43     |11,13,6|
B       |Y      |12     |114    |3,5,6  |

I am using MySQL and Python to store the output in a new table. For the integer fields, I simply use MySQL's SUM() function. For Col3, I use Python's map(add, a, b) to add the values element by element.
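
As a minimal sketch of that element-wise addition for two Col3 values, the helper below parses both strings, adds them position by position, and joins the result back into a string. The name `add_csv` is hypothetical, and both strings are assumed to hold the same number of integers.

```python
from operator import add

def add_csv(a: str, b: str) -> str:
    """Element-wise sum of two comma-separated integer strings."""
    xs = [int(v) for v in a.split(",")]
    ys = [int(v) for v in b.split(",")]
    # map() stops at the shorter input, so equal lengths are assumed here.
    return ",".join(str(v) for v in map(add, xs, ys))

print(add_csv("2,4,6", "3,6,9"))  # 5,10,15 (the A/X row of the wanted output)
```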

The problem is that the code I am using looks ugly, and I think it will be inefficient on large amounts of data. Any suggestions for doing this efficiently?

My current code stands:

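from operator import add

cursor = cnx.cursor()
# Integer columns: let MySQL aggregate them directly.
sqlout = "INSERT INTO tb2 (`key1`,`key2`,`col1`,`col2`) SELECT `key1`,`key2`,SUM(`col1`),SUM(`col2`) FROM tb1 GROUP BY `key1`,`key2`"
cursor.execute(sqlout)  # TESTED
cnx.commit()
sqlint = "SELECT `key1`,`key2`,`col3` FROM tb1"
cursor.execute(sqlint)
results = cursor.fetchall()
myres = {}
for key1, key2, col3 in results:
    # col3 arrives as text such as "2,4,6"; convert it to a list of ints first.
    values = [int(v) for v in col3.split(',')]
    key = (key1, key2)
    # Element-wise sum per key pair; the first row of a group is stored as-is.
    myres[key] = list(map(add, myres[key], values)) if key in myres else values
# USE MYSQL UPDATE COMMAND TO UPDATE tb2 from myres variable  # NOT TESTED
cursor.close()
cnx.close()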
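
The untested update step could be sketched roughly as follows. This assumes `myres` maps each `(key1, key2)` pair to a list of summed integers, and that tb2 has a `col3` text column, which the INSERT above does not show. The helper name `update_params` is hypothetical, and the sample dict uses the sums from the wanted output.

```python
def update_params(myres):
    """Build (col3, key1, key2) tuples for a parameterized UPDATE statement."""
    return [
        (",".join(str(v) for v in sums), key1, key2)
        for (key1, key2), sums in myres.items()
    ]

# Sample sums taken from the wanted output table.
myres = {
    ("A", "X"): [5, 10, 15],
    ("A", "Y"): [5, 11, 17],
    ("B", "X"): [11, 13, 6],
    ("B", "Y"): [3, 5, 6],
}
params = update_params(myres)
print(params[0])  # ('5,10,15', 'A', 'X')
```

With a DB-API cursor, the tuples could then be sent in one call before the cursor is closed, for example `cursor.executemany("UPDATE tb2 SET col3 = %s WHERE key1 = %s AND key2 = %s", params)` followed by `cnx.commit()`, assuming the driver uses the `%s` parameter style.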
  • May we see the code you consider ugly? Commented Jul 7, 2014 at 6:55
  • Updated. I haven't tested the last part, because I still have to loop through the dict myres and update tb2 (that is the part I called ugly). Commented Jul 7, 2014 at 7:27

1 Answer


You don't need Python at all; you can do it in plain MySQL. First, define some helpers (they assume every Col3 value holds exactly three comma-separated numbers):

create function column1(x text) returns integer deterministic
    return substring_index(x,',',1);

create function column2(x text) returns integer deterministic
    return substring_index(substring_index(x,',',-2),',',1);

create function column3(x text) returns integer deterministic
    return substring_index(substring_index(x,',',-1),',',1);
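
To trace what these helpers return, here is a rough Python stand-in for MySQL's SUBSTRING_INDEX, with the three helpers rebuilt on top of it. It is only an illustration of the logic, not part of the MySQL solution, and the last line shows why the helpers rely on exactly three values.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Rough stand-in for MySQL's SUBSTRING_INDEX (non-empty delim assumed)."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything before the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # everything after the count-th delimiter from the right
    return ""

def column1(x: str) -> int:
    return int(substring_index(x, ",", 1))

def column2(x: str) -> int:
    return int(substring_index(substring_index(x, ",", -2), ",", 1))

def column3(x: str) -> int:
    return int(substring_index(substring_index(x, ",", -1), ",", 1))

print(column1("2,4,6"), column2("2,4,6"), column3("2,4,6"))  # 2 4 6
print(column2("1,2,3,4"))  # 3: with four values, column2 picks the third one
```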

Then, here is the query:

select 
    Key1, Key2, 
    sum(Col1) as Col1,
    sum(Col2) as Col2, 
    concat_ws(',', 
        cast(sum(column1(Col3)) as char(50)),
        cast(sum(column2(Col3)) as char(50)),
        cast(sum(column3(Col3)) as char(50))
    ) as Col3
from YourTable
group by Key1, Key2;

1 Comment

Thanks a lot. I have never used functions in MySQL before, so I'll check that out. One more question: is there a good way to keep the number of columns, as well as the number of elements inside the CSV string, variable?
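
One possible way to handle a variable number of elements per string, sketched in Python rather than MySQL, is to pad the shorter lists with zeros before summing. The function name `sum_csv_strings` is hypothetical, and treating missing positions as 0 is an assumption. A variable number of columns is not covered here.

```python
from itertools import zip_longest

def sum_csv_strings(values):
    """Element-wise sum of CSV integer strings that may differ in length."""
    lists = [[int(v) for v in s.split(",")] for s in values]
    # zip_longest pads the shorter lists with 0, so missing positions add nothing.
    sums = [sum(column) for column in zip_longest(*lists, fillvalue=0)]
    return ",".join(str(v) for v in sums)

print(sum_csv_strings(["2,4,6", "3,6,9"]))  # 5,10,15
print(sum_csv_strings(["1,2", "3,4,5,6"]))  # 4,6,5,6
```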
