I need to speed up my query because it is taking far too long on a large DB.

I have the following tables:

    vb_user

+++++++++++++++++++++++++++++++++

++ userid ++ username ++ posts ++

+++++++++++++++++++++++++++++++++

    vb_post

++++++++++++++++++++++++

++ userid ++ dateline ++

++++++++++++++++++++++++

I use this query

SELECT VBU.userid AS USER_ID
     , VBU.username AS USER_NAME
     , COUNT(VBP.userid) AS NUMBER_OF_POSTS_FOR_30_DAYS
     , FROM_UNIXTIME(VBU.joindate) AS JOIN_DATE
  FROM vb_user AS VBU
  LEFT JOIN vb_post AS VBP
    ON VBP.userid = VBU.userid
 WHERE VBU.joindate BETWEEN '__START_DATE__' AND '__END_DATE__'
   AND VBP.dateline BETWEEN VBU.joindate AND DATE_ADD(FROM_UNIXTIME(VBU.joindate), INTERVAL 30 DAY)
 GROUP BY VBP.userid
 ORDER BY NUMBER_OF_POSTS_FOR_30_DAYS DESC

I have to select the users who posted the most between the day they joined and 30 days after that, and I can't figure out how to do it without the FROM_UNIXTIME function.

But it takes a lot of time. Any thoughts on how to improve the performance for the query?

Here is the output of EXPLAIN:

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,VBP,index,userid,threadid_visible_dateline,18,NULL,2968000,"Using where; Using index; Using temporary; Using filesort"
1,SIMPLE,VBU,eq_ref,PRIMARY,PRIMARY,4,vb_copilul.VBP.userid,1,"Using where"

And here is the info about the tables

Table,"Create Table"
vb_user,"CREATE TABLE `vb_user` (
  `userid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `username` varchar(100) NOT NULL DEFAULT '',
  `posts` int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`userid`),
  KEY `usergroupid` (`usergroupid`),
) ENGINE=MyISAM AUTO_INCREMENT=101076 DEFAULT CHARSET=latin1"

Table,"Create Table"
vb_post,"CREATE TABLE `vb_post` (
 `postid` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `threadid` int(10) unsigned NOT NULL DEFAULT '0',
 `parentid` int(10) unsigned NOT NULL DEFAULT '0',
 `username` varchar(100) NOT NULL DEFAULT '',
 `userid` int(10) unsigned NOT NULL DEFAULT '0',
 `title` varchar(250) NOT NULL DEFAULT '',
 `dateline` int(10) unsigned NOT NULL DEFAULT '0',
 `pagetext` mediumtext,
 `allowsmilie` smallint(6) NOT NULL DEFAULT '0',
 `showsignature` smallint(6) NOT NULL DEFAULT '0',
 `ipaddress` char(15) NOT NULL DEFAULT '',
 `iconid` smallint(5) unsigned NOT NULL DEFAULT '0',
 `visible` smallint(6) NOT NULL DEFAULT '0',
 `attach` smallint(5) unsigned NOT NULL DEFAULT '0',
 `infraction` smallint(5) unsigned NOT NULL DEFAULT '0',
 `reportthreadid` int(10) unsigned NOT NULL DEFAULT '0',
 PRIMARY KEY (`postid`),
 KEY `userid` (`userid`),
 KEY `threadid` (`threadid`,`userid`),
 KEY `threadid_visible_dateline` (`threadid`,`visible`,`dateline`,`userid`,`postid`),
 KEY `dateline` (`dateline`),
 KEY `ipaddress` (`ipaddress`)
) ENGINE=MyISAM AUTO_INCREMENT=3009320 DEFAULT CHARSET=latin1"
  • Have you done an EXPLAIN on the query? If so, what does that tell you? Commented May 8, 2013 at 10:31
  • yay on EXPLAIN. And could you please post the complete table definitions? Preferably the results of SHOW CREATE TABLE vb_user and SHOW CREATE TABLE vb_post and maybe even some example data in the form of INSERT INTO .... statements. Commented May 8, 2013 at 10:43

1 Answer

Two things you can do to improve the query:

  • Do not mix Unix timestamps and DATETIME values in the comparison. Use the BETWEEN condition on the raw column values directly. In your query the server has to run the conversion for every candidate row before it can compare, instead of comparing the native types. If you always use the column as a Unix timestamp, declare it as an integer type (for example INT UNSIGNED) instead of DATETIME or TIMESTAMP. This will speed up other operations too.
  • Add an index on the dateline column so that the BETWEEN condition is fast enough.

Everything else looks OK
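As a rough sketch of the first point, the 30-day window can be expressed in seconds so that both sides of the comparison stay integers. This assumes that vb_user.joindate is an integer Unix timestamp (the question's query passes it to FROM_UNIXTIME, but the CREATE TABLE shown does not list the column) and that the __START_DATE__ and __END_DATE__ placeholders are already Unix timestamps. It has not been run against the asker's data.

```sql
-- Sketch only: same query as in the question, with the date arithmetic
-- done on the raw integer timestamps (30 days = 30 * 86400 seconds).
SELECT VBU.userid AS USER_ID
     , VBU.username AS USER_NAME
     , COUNT(VBP.userid) AS NUMBER_OF_POSTS_FOR_30_DAYS
     , FROM_UNIXTIME(VBU.joindate) AS JOIN_DATE   -- conversion kept for display only
  FROM vb_user AS VBU
  LEFT JOIN vb_post AS VBP
    ON VBP.userid = VBU.userid
 WHERE VBU.joindate BETWEEN '__START_DATE__' AND '__END_DATE__'
   AND VBP.dateline BETWEEN VBU.joindate AND VBU.joindate + 30 * 86400
 GROUP BY VBP.userid
 ORDER BY NUMBER_OF_POSTS_FOR_30_DAYS DESC;
```

FROM_UNIXTIME remains in the SELECT list, where it is applied only to the rows that are returned and does not take part in any comparison.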

9 Comments

I just saw the EXPLAIN output and CREATE TABLE statements you added. Yes, the dateline column is already an int type. Looking at the EXPLAIN output, it does use one quite complex key but still scans nearly three million rows. So try removing the conversion and try actually dropping that threadid_visible_dateline key; it could force MySQL to use a better key instead.
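A less destructive way to try this than dropping the key might be an index hint, which tells MySQL to leave one index out of consideration for a single statement. The sketch below is a cut-down version of the question's query, and it assumes the integer form of the date condition and an integer joindate column, so treat it as an experiment and not as a fix.

```sql
-- Sketch only: check which key MySQL picks for vb_post when
-- threadid_visible_dateline is excluded for this one statement.
EXPLAIN
SELECT VBU.userid
     , COUNT(VBP.userid) AS NUMBER_OF_POSTS_FOR_30_DAYS
  FROM vb_user AS VBU
  LEFT JOIN vb_post AS VBP IGNORE INDEX (threadid_visible_dateline)
    ON VBP.userid = VBU.userid
 WHERE VBU.joindate BETWEEN '__START_DATE__' AND '__END_DATE__'
   AND VBP.dateline BETWEEN VBU.joindate AND VBU.joindate + 30 * 86400
 GROUP BY VBP.userid;
```

If the plan improves with the hint in place, that suggests the optimizer's choice of key is the problem, and the index itself can stay for whatever other queries rely on it.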
Really it needs an index on the dateline AND userid.
Sure, then there should be one, but adding index on way too many fields can be harmful! So my suggestion is to first drop that index and see how it goes. And after that (you are right) try to add an index just on those two fields. I am not sure how smart MySQL is in selecting indexes though - must try it to see if it will prefer that index.
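For reference, a two-column index of the kind discussed here could be added roughly as follows. The index name userid_dateline is made up for this example. The column order is a judgment call: userid comes first because the join matches it by equality, and dateline second because it is filtered by range.

```sql
-- Sketch only: composite index covering the join column and the range column.
-- On a MyISAM table with about three million rows this may take a while.
ALTER TABLE vb_post
  ADD INDEX userid_dateline (userid, dateline);

-- Confirm the index exists, then re-run EXPLAIN on the query to see
-- whether the optimizer actually chooses it.
SHOW INDEX FROM vb_post;
```

If it does not help, ALTER TABLE vb_post DROP INDEX userid_dateline removes it again.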
MySQL does not support function-based indexes. So, as Dimitar says, do not convert the timestamps inside the comparison. It prevents the DB from using the dateline index.
Ok that worked great, removing the TIMESTAMP conversion. But I have another problem with the following query now.... I updated the post with the new query