SQL Optimization : How to optimize the query in MySQL?

Question

I am writing a SQL query that needs to be highly optimized so that it does not time out, but I have no idea how to optimize the following query any further:

select distinct j.job,f.path,p.path 
from fixes f, jobs j, paths p where f.job=j.id and p.id =f.path 
and (p.path like '//Tools/Web/%' or p.path = '//Tools/Web');

I have created indexes on the following fields (essentially everything):

jobs.id
jobs.job
paths.path
paths.id
fixes.job
fixes.path

Each of the "fixes", "jobs", and "paths" tables has ~50,000 rows, and the current timeout is 6 min.

The 'explain' command shows the following information, which I am trying to decipher:

id  select_type  table  type    possible_keys  key      key_len  ref     rows   Extra
1   SIMPLE       j      index   PRIMARY        job      62       (null)  73226  Using index; Using temporary
1   SIMPLE       f      ref     path,job       job      8        j.id    825
1   SIMPLE       p      eq_ref  PRIMARY,path   PRIMARY  8        f.path  1      Using where

The table creation statements for the 'paths' table:

CREATE TABLE `paths` (
   `id` bigint(20) NOT NULL AUTO_INCREMENT,
   `path` varchar(250) NOT NULL,
   PRIMARY KEY (`id`),
   UNIQUE KEY `path` (`path`)
 ) ENGINE=InnoDB  DEFAULT CHARSET=utf8;

Please do not ever use implicit joins; they are a SQL antipattern, as they are more difficult to maintain and far more subject to accidental cross joins.
– HLGEM, Commented Oct 12, 2012 at 16:59
How long does the query take without the string comparisons on the path?
– Gordon Linoff, Commented Oct 12, 2012 at 17:06
Duration: 0.078 sec, and fetch time is 0.015 sec, without string comparison
– Chen Xie, Commented Oct 12, 2012 at 17:26
Can you edit the question and add the CREATE TABLE statements for the 3 tables?
– ypercubeᵀᴹ, Commented Oct 12, 2012 at 17:29

HLGEM · Accepted Answer · 2012-10-12 17:39:07Z

2

Wouldn't this get the same results?

select distinct j.job,f.path,p.path  
from fixes f
join  jobs j on  f.job=j.id 
join  paths p  on p.id =f.path  
where p.path like '//Tools/Web%'

OR is almost always a costly feature.

You could also try a UNION query; these are often faster than an OR.

select  j.job,f.path,p.path  
from fixes f
join  jobs j on  f.job=j.id 
join  paths p  on p.id =f.path  
where p.path like '//Tools/Web/%' 
union 
select  j.job,f.path,p.path  
from fixes f
join  jobs j on  f.job=j.id 
join  paths p  on p.id =f.path  
where  p.path = '//Tools/Web';
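
As a rough check that the UNION form returns the same rows as the original OR query, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table contents are made up for illustration, and `fixes` is assumed to have only the `job` and `path` columns that the query uses. It compares result sets only and says nothing about speed on MySQL.

```python
import sqlite3

# Hypothetical miniature of the three tables (SQLite, not MySQL);
# only the columns referenced by the query are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs  (id INTEGER PRIMARY KEY, job TEXT);
    CREATE TABLE paths (id INTEGER PRIMARY KEY, path TEXT UNIQUE);
    CREATE TABLE fixes (job INTEGER, path INTEGER);
    INSERT INTO jobs  VALUES (1, 'job-a'), (2, 'job-b');
    INSERT INTO paths VALUES (1, '//Tools/Web'),
                             (2, '//Tools/Web/css'),
                             (3, '//Tools/Other');
    -- the last row duplicates the first, so de-duplication matters
    INSERT INTO fixes VALUES (1, 1), (1, 2), (2, 2), (2, 3), (1, 1);
""")

# The question's query, written with explicit joins and the OR condition.
OR_QUERY = """
    SELECT DISTINCT j.job, f.path, p.path
    FROM fixes f
    JOIN jobs j  ON f.job = j.id
    JOIN paths p ON p.id = f.path
    WHERE p.path LIKE '//Tools/Web/%' OR p.path = '//Tools/Web'
"""

# The same condition split into two branches; UNION removes duplicates,
# which plays the role of DISTINCT in the OR version.
UNION_QUERY = """
    SELECT j.job, f.path, p.path
    FROM fixes f
    JOIN jobs j  ON f.job = j.id
    JOIN paths p ON p.id = f.path
    WHERE p.path LIKE '//Tools/Web/%'
    UNION
    SELECT j.job, f.path, p.path
    FROM fixes f
    JOIN jobs j  ON f.job = j.id
    JOIN paths p ON p.id = f.path
    WHERE p.path = '//Tools/Web'
"""

or_rows = sorted(conn.execute(OR_QUERY).fetchall())
union_rows = sorted(conn.execute(UNION_QUERY).fetchall())

print(or_rows)
print(or_rows == union_rows)
```

Because UNION (unlike UNION ALL) discards duplicate rows, the DISTINCT keyword is not needed in either branch.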

edited Oct 12, 2012 at 17:39

answered Oct 12, 2012 at 17:03

HLGEM


10 Comments

ypercubeᵀᴹ Over a year ago

This would also get the '//Tools/WebDesign/'

Chen Xie Over a year ago

Correct. There are ~50,000 paths, and there will be more as time goes on. It is possible for some paths to be "prefixes" of others.

HLGEM Over a year ago

@ypercube, so would his original query (at least in SQL Server it would)

HLGEM Over a year ago

@ChenXie did you try it and see if it returned different results?

Chen Xie Over a year ago

@HLGEM Thanks for the discussion; the 'OR' operation was the culprit in this case. After adapting this solution with 'UNION', the problem is solved and I now get very good query performance, with a duration of around 0.1s.
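
The prefix concern raised in these comments can be checked on a toy table. The sketch below uses Python's built-in sqlite3 module instead of MySQL, with three made-up paths, and compares the question's condition against the shortened `like '//Tools/Web%'` pattern. The `matching_paths` helper is hypothetical and exists only for this example.

```python
import sqlite3

# Toy paths table (SQLite, not MySQL) with made-up rows, including one
# path that merely shares the '//Tools/Web' prefix.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE paths (id INTEGER PRIMARY KEY, path TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO paths (path) VALUES (?)",
    [("//Tools/Web",), ("//Tools/Web/css",), ("//Tools/WebDesign/",)],
)


def matching_paths(where_clause):
    """Return the paths selected by a WHERE clause (illustration only)."""
    sql = f"SELECT path FROM paths WHERE {where_clause} ORDER BY path"
    return [row[0] for row in conn.execute(sql)]


# Condition from the question: the directory itself or anything below it.
original = matching_paths("path LIKE '//Tools/Web/%' OR path = '//Tools/Web'")

# Shortened pattern without the trailing slash before the wildcard.
shortened = matching_paths("path LIKE '//Tools/Web%'")

extra = sorted(set(shortened) - set(original))
print(original)
print(shortened)
print(extra)
```

On this data the shortened pattern picks up `//Tools/WebDesign/` in addition to the two rows the original condition selects, so the two forms are only interchangeable when no such sibling prefixes exist.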


Stephen O'Flynn · Accepted Answer · 2012-10-12 17:06:17Z

2

Do you need the DISTINCT? It may be that your dataset doesn't require it. You could try rewriting the query without it, and start the WHERE clause with the p.path conditions. You could also try starting from paths and joining the other two tables onto it.

E.g.

    select j.job, f.fix, p.path
    from paths p
    join fixes f on (f.path = p.id)
    join jobs j on (f.job = j.id)
    where (p.path like '//Tools/Web/%' or p.path = '//Tools/Web')
    group by j.job, f.fix, p.path

If you need the distinct, the GROUP BY might help. Also, you have two columns called "path" in your original query.

answered Oct 12, 2012 at 17:06

Stephen O'Flynn


4 Comments

Chen Xie Over a year ago

It does improve performance somewhat. The query now takes ~5 min, instead of timing out at 6 min.

Stephen O'Flynn Over a year ago

Out of curiosity, how long does the query take if you remove the fixes and jobs tables? i.e. SELECT path FROM paths WHERE (path like '//Tools/Web/%') or (path = '//Tools/Web') It might help you track down the slow table.

Chen Xie Over a year ago

Without joining those tables, the query is fast. So here is the thing, without joining tables, the string comparison is fast; without the string comparison, the joining operation is fast. But when it comes together, things just won't work

Stephen O'Flynn Over a year ago

What happens if you just JOIN fixes and jobs? You could SELECT the path information into a temporary table, then run the join on the results to see where it is slowing down.
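
The temporary-table idea could look roughly like the sketch below. It runs on Python's built-in sqlite3 module rather than MySQL, the rows are made up, `fixes` is assumed to have only `job` and `path` columns, and `matched_paths` is a hypothetical table name. Timing each step separately is one way to see which part is slow.

```python
import sqlite3
import time

# Hypothetical miniature of the schema (SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs  (id INTEGER PRIMARY KEY, job TEXT);
    CREATE TABLE paths (id INTEGER PRIMARY KEY, path TEXT UNIQUE);
    CREATE TABLE fixes (job INTEGER, path INTEGER);
    INSERT INTO jobs  VALUES (1, 'job-a'), (2, 'job-b');
    INSERT INTO paths VALUES (1, '//Tools/Web'),
                             (2, '//Tools/Web/css'),
                             (3, '//Tools/Other');
    INSERT INTO fixes VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

# Step 1: run the string comparison on its own and keep the matches.
start = time.perf_counter()
conn.execute("""
    CREATE TEMPORARY TABLE matched_paths AS
    SELECT id, path
    FROM paths
    WHERE path LIKE '//Tools/Web/%' OR path = '//Tools/Web'
""")
filter_seconds = time.perf_counter() - start

# Step 2: join the (hopefully small) temporary table to the other two.
start = time.perf_counter()
rows = sorted(conn.execute("""
    SELECT DISTINCT j.job, f.path, m.path
    FROM matched_paths m
    JOIN fixes f ON f.path = m.id
    JOIN jobs j  ON f.job = j.id
""").fetchall())
join_seconds = time.perf_counter() - start

print(rows)
print(f"filter: {filter_seconds:.6f}s, join: {join_seconds:.6f}s")
```

On the real data, whichever step dominates the total time is the one to look at first.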

simply-put · Accepted Answer · 2012-10-12 16:47:37Z

1

Use EXPLAIN on your SQL query to see whether these indexes are actually used by it.

I am sure your indexes are wrong, because 6 min is a lot of time for a query.

answered Oct 12, 2012 at 16:47

simply-put
