0

This is the first time I have run into a slow query. The problem is significant: the query takes more than 20 seconds to execute, which is very noticeable to the end user.

I have a fairly large database of topics (~8k). Each topic has a parameter, which is stored in a dictionary table (113 different parameters for the 8k topics).

I would like to show a report on the number of repetitions of those topics.

topic table:
----------------+---------+-----------------------------------------------------
 id             | integer | nextval('topic_id_seq'::regclass)
 topicengine_id | integer |
 description    | text    |
 topicparam_id  | integer |
 date           | date    |

topicparam table:
----------------+---------+----------------------------------------------------------
 id             | integer | nextval('topicparam_id_seq'::regclass)
 name           | text    |

and my query:

select distinct tp.id as tpid, tp.name as desc, (select count(*) from topic where topic.topicparam_id = tp.id) as count, t.date
from topicparam tp, topic t where t.topicparam_id =tp.id

Total runtime: 22372.699 ms

fragment of result :

 tpid |                     topicname               | count |    date
------+---------------------------------------------+-------+---------
 3823 | Topic1                                      |     6 | 2014-03-01
 3756 | Topic2                                      |    14 | 2014-03-01
 3803 | Topic3                                      |    28 | 2014-04-01
 3780 | Topic4                                      |  1373 | 2014-02-01

Is there any way to optimize the execution time of this query?

2
  • Please post the output of explain analyze (or upload it to explain.depesz.com). Also which indexes are defined on the table? And which exact Postgres version are you using? Commented Apr 8, 2014 at 6:05
  • Please read stackoverflow.com/tags/postgresql-performance/info then edit your question appropriately. Commented Apr 8, 2014 at 6:09
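
The information requested in these comments could be collected roughly as follows. This is only a sketch: it assumes the tables live in a schema on the default search path, and it reuses the query from the question unchanged.

```sql
-- Which exact Postgres version is running?
SELECT version();

-- Which indexes are defined on the two tables?
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('topic', 'topicparam');

-- Execution plan with actual timings for the query from the question.
EXPLAIN ANALYZE
select distinct tp.id as tpid, tp.name as desc,
       (select count(*) from topic where topic.topicparam_id = tp.id) as count,
       t.date
from topicparam tp, topic t
where t.topicparam_id = tp.id;
```

Note that EXPLAIN ANALYZE actually runs the query, so with the runtime reported above it will take about as long as the query itself.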

3 Answers

1

A simple group by should do the same thing (if I understood your query correctly):

select tp.id as tpid, 
       max(tp.name) as desc, 
       count(*) as count, 
       max(t.date) as date
from topicparam tp
  join topic t on t.topicparam_id = tp.id
group by tp.id;

Btw: date is a horrible name for a column. For one thing, it's also a reserved word, but more importantly it does not document what the column contains. A "start date", an "end date", a "due date", a "recording date", a "publish date", ...?

2 Comments

max() on tp.name doesn't make any sense. max() or min() on date could be useful to get the first or the last topic date if there are different dates, but judging by the original query, that does not seem to be the case.
@Ryx5: the original query uses distinct, which seems to indicate that the OP just wants unique combinations. It looked like an attempt to get what the group by does, but as the original question lacks a lot of necessary information, I had to guess. It could just as well be a group by on all columns, as you did in your answer.
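
The objection to max(tp.name) raised in these comments can be sidestepped without grouping by every column. The sketch below assumes that topicparam.id is declared as the primary key (the question does not say so) and that the server is PostgreSQL 9.1 or later; the aliases description, topic_count, first_date and last_date are made up for illustration.

```sql
-- Assumes topicparam.id is the declared PRIMARY KEY and PostgreSQL 9.1+.
-- When grouping by a table's primary key, its other columns (here tp.name)
-- may appear in the select list without an aggregate.
SELECT tp.id       AS tpid,
       tp.name     AS description,
       count(*)    AS topic_count,
       min(t.date) AS first_date,  -- earliest date for this parameter
       max(t.date) AS last_date    -- latest date for this parameter
FROM topicparam tp
JOIN topic t ON t.topicparam_id = tp.id
GROUP BY tp.id;
```

This returns one row per parameter. If one row per parameter and date is wanted instead, t.date has to go into the GROUP BY, and the count then applies to each date rather than to the whole parameter.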
0

You can try this query:

SELECT tp.id AS tpid,
       tp.name AS DESC,
       topic.cnt AS count,
       t.date
FROM topicparam tp
JOIN topic t
  ON t.topicparam_id =tp.id
JOIN (SELECT topicparam_id,
             count(*) cnt 
      FROM topic
      GROUP BY topicparam_id) topic
  ON topic.topicparam_id = tp.id
GROUP BY tp.id,
         tp.name,
         t.date,
         topic.cnt

0

In my opinion, the combination of DISTINCT and the correlated subquery is killing your performance. You should use GROUP BY both to deduplicate your data and to count.

SELECT 
    tp.id as tpid
    , tp.name as description
    , count(*) as numberOfTopics
    , t.date
FROM 
    topicparam tp
    INNER JOIN topic t 
        ON t.topicparam_id = tp.id
GROUP BY
    tp.id 
    , tp.name
    , t.date

Given the volume of data, you also have to pay attention to indexes:

In this case, use indexes on the columns used in the join: topicparam.id and topic.topicparam_id.

Remove indexes on columns that are never used in join clauses.

Try not to use SQL reserved words like "date", "desc" or "count" for aliases or column names.
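
As a rough sketch of the indexing advice, assuming the column that matters for this join is topic.topicparam_id and that no index on it exists yet (the question does not list the existing indexes). The index name is hypothetical.

```sql
-- Hypothetical index name; assumes topic.topicparam_id is not indexed yet.
CREATE INDEX topic_topicparam_id_idx ON topic (topicparam_id);

-- Update the planner's statistics for the table.
ANALYZE topic;
```

If topicparam.id is a primary key, it already has an index and needs nothing extra. The new index mainly helps lookups of the form topic.topicparam_id = <value>, such as the per-row subquery in the question. For a single GROUP BY over the whole join, the planner may still prefer a sequential scan, so it is worth checking the plan with EXPLAIN ANALYZE.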
