
I am doing a count query on a PostgreSQL table named simcards, which contains the fields id, card_state and 10 more. simcards contains around 13 million records.

My query is

SELECT CAST(count(*) AS INT) FROM simcards WHERE card_state = 'ACTIVATED';

This is taking more than 6 seconds and I want to optimize it. I tried creating the partial index below:

CREATE INDEX activated_count on simcards (card_state) where card_state = 'ACTIVATED';

But there was no improvement. I think it is because more than 12 million of the records have card_state = 'ACTIVATED'. Note that card_state can be 'ACTIVATED', 'PREPROVISIONED' or 'TERMINATED'.

Anyone got an idea on how the count can be drastically improved?

Running EXPLAIN (ANALYZE, BUFFERS) SELECT CAST(count(*) AS INT) FROM simcards WHERE card_state = 'ACTIVATED'; gives:

Finalize Aggregate  (cost=540300.95..540300.96 rows=1 width=4) (actual time=7103.814..7103.814 rows=1 loops=1)
  Buffers: shared hit=2295 read=155298
  ->  Gather  (cost=540300.74..540300.95 rows=2 width=8) (actual time=7103.773..7103.810 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=2295 read=155298
        ->  Partial Aggregate  (cost=539300.74..539300.75 rows=1 width=8) (actual time=7006.368..7006.368 rows=1 loops=3)
              Buffers: shared hit=5983 read=455025
              ->  Parallel Seq Scan on simcards  (cost=0.00..526282.77 rows=5207186 width=0) (actual time=2.677..6483.503 rows=4166620 loops=3)
                    Filter: (card_state = 'ACTIVATED'::text)
                    Rows Removed by Filter: 10965
                    Buffers: shared hit=5983 read=455025
Planning time: 0.333 ms
Execution time: 7123.739 ms
  • Please also add the EXPLAIN ANALYZE of your query. Commented Feb 27, 2020 at 12:30
  • PostgreSQL will always do a seq scan with count. This doesn't scale well with the number of rows. One way to make it faster is to have a separate table with the count and to put appropriate triggers on the original table. This will dramatically improve read speed at the cost of write speed (which might not be noticeable). Commented Feb 27, 2020 at 12:39
  • @a_horse_with_no_name most likely an index scan will still be slower than a separate table unless there's only one or so element in the index. Commented Feb 27, 2020 at 12:44
  • @kevin: you can try CREATE INDEX activated_count on simcards (id) where card_state = 'ACTIVATED'; (with id being the PK column of the table) - then you might get an index-only scan. Commented Feb 27, 2020 at 12:55
  • @KevinJoymungol: did you run vacuum analyze simcards after creating the index? Commented Feb 27, 2020 at 13:19

2 Answers

Counting is slow. Here are a few ideas for how to improve it:

  1. If you don't need exact results, use PostgreSQL's estimates:

    /* this will improve the results */
    ANALYZE simcards;
    
    SELECT t.reltuples * freqs.freq AS count
    FROM pg_class AS t
       JOIN pg_stats AS s
          ON t.relname = s.tablename
             AND t.relnamespace::regnamespace::name = s.schemaname
       CROSS JOIN LATERAL
          unnest(s.most_common_vals::text::text[]) WITH ORDINALITY AS vals(val, ord)
       JOIN LATERAL
          unnest(s.most_common_freqs::text::float8[]) WITH ORDINALITY AS freqs(freq, ord)
          USING (ord)
    WHERE s.tablename = 'simcards'
      AND s.attname = 'card_state'
      AND vals.val = 'ACTIVATED';
    
  2. If you need exact counts, create an extra “counter table” and triggers on simcards that update the counter whenever rows are added, removed or modified.
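A minimal sketch of the counter-table approach, assuming PostgreSQL 11+ (the table, function and trigger names here are illustrative; a production version would also need to handle TRUNCATE):

    -- Single-row table holding the current number of ACTIVATED cards.
    CREATE TABLE simcards_count (activated bigint NOT NULL);

    INSERT INTO simcards_count
    SELECT count(*) FROM simcards WHERE card_state = 'ACTIVATED';

    CREATE FUNCTION simcards_count_trg() RETURNS trigger
       LANGUAGE plpgsql AS
    $$BEGIN
       IF TG_OP = 'INSERT' THEN
          IF NEW.card_state = 'ACTIVATED' THEN
             UPDATE simcards_count SET activated = activated + 1;
          END IF;
       ELSIF TG_OP = 'DELETE' THEN
          IF OLD.card_state = 'ACTIVATED' THEN
             UPDATE simcards_count SET activated = activated - 1;
          END IF;
       ELSE  -- UPDATE: adjust only when card_state crosses the boundary
          IF OLD.card_state = 'ACTIVATED' AND NEW.card_state <> 'ACTIVATED' THEN
             UPDATE simcards_count SET activated = activated - 1;
          ELSIF OLD.card_state <> 'ACTIVATED' AND NEW.card_state = 'ACTIVATED' THEN
             UPDATE simcards_count SET activated = activated + 1;
          END IF;
       END IF;
       RETURN NULL;  -- return value is ignored for AFTER triggers
    END;$$;

    CREATE TRIGGER simcards_count_trg
       AFTER INSERT OR UPDATE OF card_state OR DELETE ON simcards
       FOR EACH ROW EXECUTE FUNCTION simcards_count_trg();

After that, SELECT activated FROM simcards_count; returns the exact count in constant time. Note that every writing transaction now contends on the single counter row, so this trades some write concurrency for read speed.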

For a more detailed discussion, read my blog post.


1 Comment

Correct. I did a VACUUM ANALYZE together with CREATE INDEX activated_count on simcards (id) where card_state = 'ACTIVATED'; This improved things. Thanks for the counter table solution. Will consider it in case I need better performance

Did you try setting the max_parallel_workers_per_gather = 4 parameter?

It is probable that some extra workers would help here.
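A quick sketch of the experiment (the value 4 is just an assumption to try; the effective number of workers is also capped by max_parallel_workers and max_worker_processes):

    -- Allow up to 4 workers per Gather node for this session (the default is 2).
    SET max_parallel_workers_per_gather = 4;

    -- Re-run the query and check "Workers Launched" in the plan:
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM simcards WHERE card_state = 'ACTIVATED';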

Regards
