I need to optimize this query. My professor recommends using indexes, but I'm very confused about how. If I could get just one example of a good index, why it is good, and the actual code needed, I could definitely do the rest myself. Any help would be awesome. (This is PostgreSQL, by the way.)

    SELECT
        x.enteredBy
        , x.id
        , count(DISTINCT xr.id)
        , count(DISTINCT c.id)
        , 'l'
    FROM
        ((locationsV x LEFT OUTER JOIN locationReviews xr ON x.id = xr.lid)
         LEFT OUTER JOIN reviews r ON r.id = xr.id)
         LEFT OUTER JOIN comments c ON xr.id = c.reviewId
    WHERE
        x.vNo = 0
        AND (r.enteredBy IS NULL OR
             (r.enteredBy <> x.enteredBy
              AND c.enteredBy <> x.enteredBy
              AND r.enteredBy NOT IN
                  (SELECT requested FROM friends WHERE requester = x.enteredBy)
              AND r.enteredBy NOT IN
                  (SELECT requester FROM friends WHERE requested = x.enteredBy)))
        AND (c.enteredBy IS NULL OR
             (c.enteredBy NOT IN
                  (SELECT requested FROM friends WHERE requester = x.enteredBy)
              AND c.enteredBy NOT IN
                  (SELECT requester FROM friends WHERE requested = x.enteredBy)))
    GROUP BY
        x.enteredBy
        , x.id

I tried adding something like this to the beginning, but the overall time it took didn't change.

CREATE INDEX friends1_idx ON friends(requested);
CREATE INDEX friends2_idx ON friends(requester);
  • Anyone see any other optimizations to be made? Commented May 5, 2011 at 1:00

1 Answer

I think the SQL itself could be optimized to improve performance, in addition to looking at indexes. Those NOT IN subqueries in the WHERE clause may cause the optimizer to do full table scans, so if you could move them into the FROM clause as joined tables, you would likely get better performance. The COUNT(DISTINCT ...) expressions in the SELECT list also seem problematic. You would likely be better off restructuring the query so that DISTINCT is not necessary there and a plain COUNT aggregate is enough.
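To make the first suggestion a bit more concrete, here is a minimal sketch of what moving the friends subqueries into the FROM clause could look like. It covers only the reviews part of the query, not the comments part or the counts. The aliases f1 and f2 are made up for illustration, and the rewrite assumes friends.requester and friends.requested are never NULL, because NOT IN behaves differently from a join when the subquery returns a NULL.

```sql
-- Sketch only: reviews on version-0 locations written by someone who is
-- neither the location's author nor a friend of that author.
-- Assumption: friends.requester and friends.requested are declared NOT NULL.
SELECT r.id, r.enteredBy
FROM locationsV x
JOIN locationReviews xr ON xr.lid = x.id
JOIN reviews r ON r.id = xr.id
-- Friendship requested by the location's author
LEFT JOIN friends f1
       ON f1.requester = x.enteredBy
      AND f1.requested = r.enteredBy
-- Friendship requested by the reviewer
LEFT JOIN friends f2
       ON f2.requested = x.enteredBy
      AND f2.requester = r.enteredBy
WHERE x.vNo = 0
  AND r.enteredBy <> x.enteredBy
  -- Keep only rows where no friendship row matched in either direction
  AND f1.requester IS NULL
  AND f2.requester IS NULL;
```

Whether this is actually faster depends on the data and the indexes available, so it is worth comparing the plans of both versions before committing to one.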

Consider using a subquery (a derived table) in the FROM clause before you do the left join, with a structure something like this:

SELECT ... 
FROM Table1 LEFT JOIN 
     (SELECT ... FROM Table2 INNER JOIN Table3 ON ...) AS Table4 ON
        Table1.somecolumn = Table4.somecolumn
...
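As a rough illustration of that structure with the tables from the question, the comments could be counted per review inside a derived table and then joined in, so the outer query no longer needs COUNT(DISTINCT ...). This sketch deliberately leaves out the enteredBy and friends filters, so it does not return the same rows as the original query. It also assumes each review appears at most once in locationReviews. The alias cc and the output column names are made up.

```sql
-- Sketch only: per-location review and comment counts without DISTINCT.
-- Assumption: locationReviews.id is unique, so joining the pre-aggregated
-- comment counts cannot multiply rows.
SELECT x.enteredBy,
       x.id,
       count(xr.id)                       AS review_count,
       coalesce(sum(cc.comment_count), 0) AS comment_count
FROM locationsV x
LEFT JOIN locationReviews xr ON xr.lid = x.id
LEFT JOIN (
    -- One row per review, carrying its number of comments
    SELECT c.reviewId, count(*) AS comment_count
    FROM comments c
    GROUP BY c.reviewId
) AS cc ON cc.reviewId = xr.id
WHERE x.vNo = 0
GROUP BY x.enteredBy, x.id;
```

The filters from the original WHERE clause would still have to be added back, and the ones that compare comments against x.enteredBy cannot simply be pushed into the derived table, so treat this as a starting point rather than a drop-in replacement.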

I know this isn't giving you the solution, but hopefully it will help you to think about other aspects of the problem and to explore other ways to address performance.

2 Comments

Thanks! Any additional tips for indexes?
I think it is best to use the DB's profiling tools to guide index choices rather than guessing yourself. Most DBs have a way to analyze a SQL command and show how it will be executed (in PostgreSQL, EXPLAIN and EXPLAIN ANALYZE), which tells you where an index might help. Beyond that, you want an index that differentiates rows. If you index a column where 90% of the values are the same, it probably won't be too useful unless you are looking for something in the other 10%. A column with lots of unique values is usually a better one to index. You also want to index columns that are used in your query, perhaps "requester" or "requested" in your case.
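As a rough example of that advice in PostgreSQL, the sketch below first looks at the plan for one of the friends lookups and then creates indexes on the columns the query joins and filters on. The index names and the user id 42 are made up, and the literal assumes the enteredBy and requester columns are integers. If locationsV is a view, its index would have to go on the underlying table instead.

```sql
-- Show the plan PostgreSQL chooses for one friends lookup.
-- Note that EXPLAIN ANALYZE actually executes the statement.
EXPLAIN ANALYZE
SELECT requested FROM friends WHERE requester = 42;  -- 42 is a placeholder id

-- Two-column indexes cover both the filter column and the column being
-- compared, one for each direction of the friendship lookup.
CREATE INDEX friends_requester_requested_idx ON friends (requester, requested);
CREATE INDEX friends_requested_requester_idx ON friends (requested, requester);

-- Indexes on the join columns of the child tables.
CREATE INDEX locationreviews_lid_idx ON locationReviews (lid);
CREATE INDEX comments_reviewid_idx ON comments (reviewId);

-- Refresh planner statistics so the new indexes are costed sensibly.
ANALYZE friends;
ANALYZE locationReviews;
ANALYZE comments;
```

Running EXPLAIN ANALYZE on the full query before and after is the way to check whether any of these are picked up. On small tables the planner may still prefer a sequential scan, which would explain seeing no change in timing.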