I hope someone can help me out. I have a table that logs our import jobs. I need a query that will produce a matrix with the names of tables on the vertical axis, the import dates on the horizontal axis, and the total number of records imported for that table on that date in the matrix cell. I don't care if we have to create a temporary table, but the whole thing must be done in MySQL.

Below is a simplified sample of our event log table. Not only does it have many more fields, but we import many more tables. Therefore, the solution should account for querying the table names. You will notice that data can be imported into a table more than once per day, as in records 5 and 6.

id  table_name  import_date          num_recs 
----+-----------+--------------------+------- 
0   customer    2010-06-20 00:00:00  10        
1   order       2010-06-20 00:00:00  15        
2   customer    2010-06-21 00:00:00  5         
3   order       2010-06-21 00:00:00  6         
4   customer    2010-06-22 00:00:00  1         
5   order       2010-06-22 00:00:00  6         
6   order       2010-06-22 00:00:00  1         
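
For anyone who wants to try the answers below against this sample, here is a minimal setup sketch. The table name `event_log` and the column types are assumptions (the real table has more fields); only the column names and values come from the sample above.

```sql
-- Assumed name and types; only the four columns shown in the sample are modelled.
CREATE TABLE event_log (
  id          INT         NOT NULL PRIMARY KEY,
  table_name  VARCHAR(64) NOT NULL,
  import_date DATETIME    NOT NULL,
  num_recs    INT         NOT NULL
);

-- The seven sample rows, including the two 'order' imports on 2010-06-22.
INSERT INTO event_log (id, table_name, import_date, num_recs) VALUES
  (0, 'customer', '2010-06-20 00:00:00', 10),
  (1, 'order',    '2010-06-20 00:00:00', 15),
  (2, 'customer', '2010-06-21 00:00:00', 5),
  (3, 'order',    '2010-06-21 00:00:00', 6),
  (4, 'customer', '2010-06-22 00:00:00', 1),
  (5, 'order',    '2010-06-22 00:00:00', 6),
  (6, 'order',    '2010-06-22 00:00:00', 1);
```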

We are looking for a result something like this. It does not have to be exact:

table_name  06-20 06-21 06-22
------------+-----+-----+------
customer    |  10 |   5 |   1
order       |  15 |   6 |   7

/me waits to hear about how it has to be dynamic dates... Commented Jun 23, 2010 at 20:20

3 Answers

3

What about output of the form:

table_name   date    imports
------------+-------+--------
customer    | 06-20 |   10
customer    | 06-21 |   5
order       | 06-20 |   15
order       | 06-21 |   6

This way you can do it with a simple GROUP BY:

SELECT table_name, DATE(import_date) AS date, SUM(num_recs) AS imports
FROM yourTable
GROUP BY table_name, date;

Otherwise, your query is going to be really nasty.


14 Comments

It's quite possible to do clever stuff to turn rows into columns (I'll avoid doing a critique of the answers trying to attempt this so far) but it's the WRONG WAY TO SOLVE THE PROBLEM. Go with Ben S's answer. If your job depends on you providing it in the original layout and your boss knows nothing about relational databases, then run the query from MS Excel or OpenOffice and generate a pivot table. And start looking for another job, 'cos pretty soon he's going to ask you to organise the first manned mission to Mars using lots of rubber bands.
Other databases (e.g. PostgreSQL, MS-SQL) have built-in functionality to do pivot tables/cross tabs/whatever you want to call them. Asking users to do pivot tables (or anything that involves more effort on their part) is sometimes not the way to do it, and we as developers should be able to work around MySQL's limitations. Using SQL to generate SQL is a standard tool, not "lots of rubber bands".
@symcbean: what is there to critique about a standard pivot query? Only SQL Server 2005+ and Oracle 11g+ have PIVOT/UNPIVOT syntax. I don't see how using this query as a first stage for alteration in Excel/Access is better than doing it entirely in a SQL query...
@Hamy: What part of the data running by date in columns, not rows, do you not comprehend?
This answer gives the data you need. If you need to display it in another form, just do that in your presentation code. Turning rows into columns is not that big a deal. Or is there something else that absolutely requires the result set from MySQL to be in a particular form?
2

MySQL has no built-in pivot syntax, but you can get the same result in two queries, using the output of the first query as the SQL for the second:

SELECT 'SELECT table_name'
UNION
SELECT CONCAT(', SUM(IF(import_date = "',import_date,'", num_recs,0)) AS "',DATE_FORMAT(import_date, "%m-%d"),'"')
FROM event_log
GROUP BY import_date
UNION
SELECT 'FROM event_log GROUP BY table_name'

Then execute the output of that query to get your final results. For your example you would get something like this (the generated date literals also carry the 00:00:00 time part, omitted here):

SELECT table_name                                                           
, SUM(IF(import_date = "2010-06-20", num_recs,0)) AS "06-20"
, SUM(IF(import_date = "2010-06-21", num_recs,0)) AS "06-21"
, SUM(IF(import_date = "2010-06-22", num_recs,0)) AS "06-22"
FROM event_log GROUP BY table_name

You can either write a stored procedure to concatenate, prepare, and then execute the results of the first query, or, if this is all run from a shell script, you can capture the results of the first query and feed them back into mysql.
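
A minimal sketch of the "concatenate, prepare, execute" route, run as plain session statements (the same statements could be wrapped in a stored procedure). It assumes the log table is named `event_log` as in the queries above; the names `@cols`, `@sql`, and `pivot_stmt` are made up for illustration. It differs from the generator above in two ways: it uses GROUP_CONCAT so the whole column list arrives as one string, and it compares on DATE(import_date) so rows with a non-midnight time still land in their day's column.

```sql
-- The default group_concat_max_len (1024) would truncate a long column list.
SET SESSION group_concat_max_len = 100000;

-- Build one "SUM(IF(...)) AS `mm-dd`" expression per distinct import day.
SELECT GROUP_CONCAT(
         CONCAT('SUM(IF(DATE(import_date) = ''', d, ''', num_recs, 0)) AS `',
                DATE_FORMAT(d, '%m-%d'), '`')
         ORDER BY d
         SEPARATOR ', ')
  INTO @cols
  FROM (SELECT DISTINCT DATE(import_date) AS d FROM event_log) AS days;

-- Assemble the pivot statement around the generated column list.
SET @sql = CONCAT('SELECT table_name, ', @cols,
                  ' FROM event_log GROUP BY table_name');

-- Run the generated statement.
PREPARE pivot_stmt FROM @sql;
EXECUTE pivot_stmt;
DEALLOCATE PREPARE pivot_stmt;
```

With the sample rows from the question this should return 10, 5, 1 for customer and 15, 6, 7 for order. Two caveats: if the table is empty, @cols is NULL and the PREPARE fails, and the %m-%d labels collide once the data spans more than one year, so a longer label format may be needed.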

5 Comments

See my query for how to do it in one statement.
And that's not dynamic SQL, only the queries necessary.
I didn't say it was dynamic SQL. Just a dynamic solution :-) Since the first query returns more than one row, you can't use MySQL's prepared statements directly, but you could write a stored procedure to concatenate, prepare and then execute the output of the first query. Or you could just capture the output of the first query from a shell script, and then feed it back into mysql.
We're losing out to an answer that only provides half of the actual answer. But yes, realistically the dynamic SQL would have to be inside a stored procedure.
+1: To keep you above Hamy's fundamentally flawed wiki answer.
0

I think Ben S is on the right track. I wanted to offer what I could here in case it helps anyone, who knows. Original source

Here is a method that takes two arbitrary dates, splits the span between them into blocks of time, and then performs an aggregate function on the data in each block. In your case, the block should probably be a single day, the start date would likely be 30 days before the current day, and the end date would likely be the current day. Each block can be returned with some aggregate metric of interest; in your case, this will likely be SUM(num_recs).

SELECT t1.table_name AS table_name, t1.imports AS imports
FROM (
    SELECT table_name,
           SUM(num_recs) AS imports,
           -- number of 24-hour blocks between the row and now (86400 seconds per day)
           CEIL((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(import_date)) / 86400) AS day_range
    FROM event_log
    WHERE import_date BETWEEN NOW() - INTERVAL 30 DAY AND NOW()
    GROUP BY table_name, day_range
) AS t1
ORDER BY t1.day_range DESC;

This might not help at all, but if it does then goody. It's easily modified to return the starting day for each range as a date column. To be clear, this produces the same kind of totals as Ben S's solution, but because it bins on elapsed time it still works when your dates are not all 00:00:00, whereas a GROUP BY on the raw import_date column would then produce one group per distinct timestamp.

To see what the return would look like, see Ben S's answer and mentally remove the date column. As I said however, that column could easily be added back into this query. FWIW, I have used this method on tables with upwards of 4 million rows and it still runs in < 1 second, which was good enough for my purposes.
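
As a rough sketch of adding the date column back, the start of each block can be computed by stepping back from the reference time by the block number. This assumes the table is named `event_log` with the columns from the question; `@now` and `block_start` are illustrative names, and `@now` is pinned to a fixed value here only so the 2010 sample rows fall inside the 30-day window.

```sql
-- Hypothetical reference point for the sample data; use NOW() for live data.
SET @now = '2010-06-23 00:00:00';

SELECT table_name,
       -- start of the 24-hour block the row falls in, counted back from @now
       DATE_SUB(@now,
                INTERVAL CEIL(TIMESTAMPDIFF(SECOND, import_date, @now) / 86400) DAY)
         AS block_start,
       SUM(num_recs) AS imports
FROM event_log
WHERE import_date BETWEEN DATE_SUB(@now, INTERVAL 30 DAY) AND @now
GROUP BY table_name, block_start
ORDER BY table_name, block_start;
```

Note that the blocks are 24-hour windows measured back from @now, so they only line up with calendar days when @now is itself a midnight value.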

Hamy

3 Comments

The point of answers like runrig's and mine is that you don't have to do that work outside of SQL, "mentally removing the date column".
You don't think I don't know you downvoted me? Can you get any more infantile?
The solution, as some of you mentioned, is multiple queries, and some processing in PHP.
