1

I have written sql query:

select `b`.`id` as `id`, `b`.`name`, count(a.institution_id) as total 
from `premises` as `a` 
left join `institutions` as `b` on `b`.`id` = `a`.`institution_id` 
where exists (select id from rental_schedules as c where a.id = c.premises_id and public = 1 and public_enterprise = 0 and rental_schedule_status = 1 and date >= CURDATE()) 
group by `a`.`institution_id`

I have very large data in table (Over 1000000 rows) and this query takes up to 8-10 sec. Is there any possibility to optimize better this query?

Thanks for answers in advance!

2
  • Provide complete CREATE TABLE for all mentioned tables. Specify table aliases for all columns in the query. Commented Nov 19, 2020 at 6:19
  • very large data in table (Over 1000000 rows) What table of 3 mentioned ones? Provide statistic for all 3 tables. Commented Nov 19, 2020 at 6:21

2 Answers 2

1

The join to the institutions table can somewhat benefit from the following index:

CREATE INDEX inst_idx (id, name);

This index will cover the join and the select clause on this table. The biggest improvement would come from the following index on the rental_schedules table:

CREATE INDEX rental_idx (premises_id, public, public_enterprise, rental_schedule_status, date);

This index would allow the exists clause to rapidly evaluate for each joined from the first two tables.

Also, I would rewrite your query to make it ANSI compliant, with the column in the GROUP BY clause matching the SELECT clause:

SELECT
    b.id AS id,
    b.name,    -- allowed, assuming that id be the primary key column of institutions
    COUNT(a.institution_id) AS total 
FROM premises AS a
LEFT JOIN institutions AS b ON b.id = a.institution_id
WEHRE EXISTS (SELECT 1 FROM rental_schedules AS c
              WHERE a.id = c.premises_id AND public = 1 AND
                    public_enterprise = 0 AND rental_schedule_status = 1 AND
                    date >= CURDATE()) 
GROUP BY
    b.id;
Sign up to request clarification or add additional context in comments.

1 Comment

Wow, thanks, really appreaciate your help. Now query runs in miliseconds instead of seconds :)
0

You can try to deal the subquery as in memory as possible as you can. When the memory is lack of space, a temporary is created and a long time will be wasted. As the MySQL documentation is described below:

The optimizer uses materialization to enable more efficient subquery processing. Materialization speeds up query execution by generating a subquery result as a temporary table, normally in memory. The first time MySQL needs the subquery result, it materializes that result into a temporary table. Any subsequent time the result is needed, MySQL refers again to the temporary table. The optimizer may index the table with a hash index to make lookups fast and inexpensive. The index contains unique values to eliminate duplicates and make the table smaller.

Subquery materialization uses an in-memory temporary table when possible, falling back to on-disk storage if the table becomes too large.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.