Can someone please help with this? According to the slow query log, the query below takes 11 seconds to run and is eating up server resources. How do I rewrite it so that it runs faster?

P.S: The tables are indexed.

The query:

SELECT SUM(the_val) AS value
FROM
  (SELECT DISTINCT basic_data.id,
                   att2.the_val
   FROM province_create
   INNER JOIN basic_data ON province_create.province = basic_data.province
   INNER JOIN att2 ON att2.church_id = basic_data.id
   WHERE province_create.block = 0
     AND att2.month = 'Feb'
     AND att2.year = '2017'
     AND basic_data.parish = 1
     AND att2.report = 'ATTENDANCE'
     AND province_create.disable = 0 ) t1;

The EXPLAIN report:

id | select_type | table           | type | possible_keys           | key | key_len | ref                                   | rows  | Extra
1  | PRIMARY     |                 | ALL  |                         |     |         |                                       | 38339 |
2  | DERIVED     | province_create | ALL  | kk,province,kkk         |     |         |                                       | 261   | Using where; Using temporary
2  | DERIVED     | basic_data      | ref  | PRIMARY,kk,kkk,k,parish | kk  | 56      | databaseuser.province_create.province | 39    | Using index; Distinct
2  | DERIVED     | att2            | ref  | indpull,mmm             | mmm | 57      | databaseuser.basic_data.id            | 1     | Using where; Distinct

  • (a) How many records are involved? (b) Would it make a difference if you didn’t include basic_data.id in your subquery? Commented Apr 9, 2017 at 11:34
  • Please show a sample of the original data. Why is SELECT DISTINCT needed? Commented Apr 9, 2017 at 11:40
  • @Manngo. (a) The province_create table - about 300 records. The b Commented Apr 9, 2017 at 11:42
  • @uzor Is basic_data.id a primary key? There’s usually not much point in including a primary key in a SELECT DISTINCT clause, as it’s already distinct. Also, did you really want att2.the_val to be distinct? I have no idea what the value means, but you’re excluding multiple occurrences of the value. In other words, why are they distinct? Commented Apr 9, 2017 at 11:51
  • @Manngo (a) The province_create table has about 300 records, the basic_data table about 50,000 records, and the att2 table almost 2 million records. (b) The SELECT DISTINCT on basic_data.id is needed because some ids in the att2 table appear more than once. When I took it out, it gave me a different (wrong) result. Commented Apr 9, 2017 at 11:54

1 Answer

First, let me assume that SELECT DISTINCT is not needed. Then the query can be written as:

SELECT SUM(a.the_val)
FROM province_create pc INNER JOIN
     basic_data bd
     ON pc.province = bd.province INNER JOIN
     att2 a
     ON a.church_id = bd.id
WHERE pc.block = 0 AND
      a.month = 'Feb' AND
      a.year = '2017' AND
      bd.parish = 1 AND
      a.report = 'ATTENDANCE' AND
      pc.disable = 0 ;

Second, you should try indexes on the tables. It is hard to tell what the best index would be, so try adding the following:

  • att2(year, month, report, church_id, the_val)
  • basic_data(id, province, parish)
  • province_create(province, disable)

These indexes should help even if the SELECT DISTINCT is needed. However, you need to understand why you are getting duplicates and fix the root cause of that problem for best performance.
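
As a rough sketch, the three suggested indexes could be created with statements like the ones below. The index names are made up for illustration, and the sketch assumes the column types allow composite indexes of this width; the thread does not show the table definitions.

```sql
-- Hypothetical index names; the column lists are the ones suggested above.
CREATE INDEX idx_att2_period_report
    ON att2 (year, month, report, church_id, the_val);

CREATE INDEX idx_basic_data_id_province_parish
    ON basic_data (id, province, parish);

CREATE INDEX idx_province_create_province_disable
    ON province_create (province, disable);
```

Re-running EXPLAIN on the query afterwards shows which of these keys the optimizer actually picks.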


3 Comments

Thanks, it improved performance significantly. However, the issue is that in the att2 table the church_id (a foreign key referencing basic_data.id) is not unique (i.e. there are times when a particular church_id, month, year, report combination appears more than once). The value I get from your solution is higher than it should be. So (a) how can I make sure that it does not double-count, or (b) is there a way I can search for rows in the att2 table that are duplicates (based on basic_data.id) and delete them? Thanks. Your help is appreciated.
@uzor . . . Does the index work on your original query?
Yes, it worked on the original query but worked better with your solution. Right now I am researching ways of deleting duplicate entries in a MySQL table so that I can adopt your solution.
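
The two follow-up questions in these comments can be sketched roughly as follows. The first query lists the (church_id, month, year, report) combinations that occur more than once in att2. The second collapses identical (church_id, the_val) pairs in a derived table before summing, the same way the original SELECT DISTINCT does, and it assumes basic_data.id is unique (the thread suggests it is the primary key but never confirms it). Neither query has been run against the real tables.

```sql
-- (b) Find att2 key combinations that appear more than once.
SELECT church_id, month, year, report, COUNT(*) AS copies
FROM att2
GROUP BY church_id, month, year, report
HAVING COUNT(*) > 1;

-- (a) Sum without double counting: de-duplicate att2 first, then filter
-- by parish and province. EXISTS is used so that several matching
-- province_create rows cannot multiply the att2 rows.
SELECT SUM(d.the_val) AS value
FROM (SELECT DISTINCT a.church_id, a.the_val
      FROM att2 a
      WHERE a.year = '2017'
        AND a.month = 'Feb'
        AND a.report = 'ATTENDANCE') d
INNER JOIN basic_data bd ON bd.id = d.church_id
WHERE bd.parish = 1
  AND EXISTS (SELECT 1
              FROM province_create pc
              WHERE pc.province = bd.province
                AND pc.block = 0
                AND pc.disable = 0);
```

Deleting the duplicates is not sketched here because it needs some way to tell the copies apart, such as a unique row id on att2, which the thread does not show.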
