2

I have a simple table in postgresql, say

id   fname
abc  bert
def  jaap
ghi  kees
jkl  jan
etc  piet

...etc...

The id column is a string primary key.

My table has millions of rows.

I want to get a list of every 10_000th (give or take) row.

Basically:

SELECT id 
FROM (
  SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rownum 
  FROM mytable
) as t 
WHERE ((t.rownum - 1) % 10000) = 0;

But that seems to be very slow. Is there an efficient alternative?

  • TABLESAMPLE comes to mind. Whether that's useful to you depends on how exact the "every 10k" has to be. Commented Feb 7, 2021 at 18:59
  • What is a "user"? Commented Feb 7, 2021 at 19:04
  • Please use numbers. How many millions? How slow is it? How fast do you need it to be? What is the output of EXPLAIN (ANALYZE, BUFFERS) <query>, preferably after turning track_io_timing on? Commented Feb 7, 2021 at 19:48
  • @Maarten: Yes, you need to know the number of rows, or the percentage of all rows, that corresponds to picking roughly every 10,000th row. And of course the gaps can vary greatly; in an extreme case two consecutive rows can even be picked. Commented Feb 7, 2021 at 20:14
  • @Maarten: I'm sorry, but I don't know if it's going to be faster. I think there's a good chance it will be, but you have to test that yourself to be sure. Commented Feb 7, 2021 at 20:34
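
The TABLESAMPLE idea from the comments could look roughly like the sketch below. It assumes the table and column names from the question (mytable, id) and PostgreSQL 9.5 or later. The sampling argument is a percentage, so 0.01 corresponds to about one row in 10,000. As the comments point out, the gaps between picked rows are random, so this is only a fit if "give or take" can be read loosely.

```sql
-- Assumption: mytable and id are the names used in the question.
-- BERNOULLI reads the whole table and keeps each row independently with the
-- given probability (a percentage), so gaps between picked ids vary.
SELECT id
FROM mytable TABLESAMPLE BERNOULLI (0.01)
ORDER BY id;

-- SYSTEM picks whole pages instead of single rows. It usually reads far less
-- data, but every row of a chosen page is returned, so the result comes in
-- clusters of neighbouring rows rather than evenly spread ids.
SELECT id
FROM mytable TABLESAMPLE SYSTEM (0.01)
ORDER BY id;
```

Adding REPEATABLE (seed) after the percentage makes repeated runs return the same sample as long as the table has not changed. Whether either variant beats the ROW_NUMBER() query on a given table still has to be measured.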

2 Answers

3

I'm afraid your query may already be about as good as it gets. I ran it in SQL Server on a table with almost 65 million rows and got the result in 18 seconds. Since id is the primary key column, an index on it already exists (a clustered index in SQL Server) to speed up the ordering. If you run your index maintenance regularly, this is probably the best you can ask for.

SELECT id 
FROM (
  SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rownum 
  FROM mytable
) as t 
WHERE ((t.rownum - 1) % 10000) = 0;

Please let me know the exact row count and your execution time, and run the query again after reindexing.
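
The timing above comes from SQL Server; for the PostgreSQL table in the question, the numbers asked for here could be collected along these lines. This is only a sketch that assumes the names mytable and id from the question, and the maintenance commands are the plain PostgreSQL ones, not something the question confirms is needed.

```sql
-- Exact row count of the table.
SELECT count(*) FROM mytable;

-- Rebuild all indexes on the table. Note that this blocks writes to the
-- table while it runs.
REINDEX TABLE mytable;

-- Refresh the visibility map and planner statistics, which helps the
-- planner consider an index-only scan on the primary key.
VACUUM (ANALYZE) mytable;

-- Optional: include I/O timings in the plan output
-- (changing this setting may require elevated privileges).
SET track_io_timing = on;

-- Run the query for real and report execution time and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM (
  SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rownum
  FROM mytable
) AS t
WHERE ((t.rownum - 1) % 10000) = 0;
```

The "Execution Time" line at the end of the EXPLAIN output is the figure to compare before and after the maintenance step.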

2

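You could try the NTILE() window function, which splits the ordered rows into a given number of roughly equal-sized buckets (3 in this example):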

WITH cte (id, fname) AS (
    -- Inline sample data standing in for the real table
    SELECT 'ABC', 'BERT'
    UNION ALL
    SELECT 'DEF', 'JAAP'
    UNION ALL
    SELECT 'GHI', 'KEES'
    UNION ALL
    SELECT 'JKL', 'JAN'
    UNION ALL
    SELECT 'ETC', 'PIET'
)
SELECT c.id,
       c.fname,
       NTILE(3) OVER (ORDER BY c.id ASC) AS xcol  -- bucket number, 1 to 3
FROM cte AS c;
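
The query above only labels each row with its bucket. To get one id per bucket, which is closer to what the question asks for, the buckets can be grouped and the smallest id of each taken. A minimal sketch on the same five sample rows, where the names bucketed, bucket and first_id are made up for illustration:

```sql
WITH cte (id, fname) AS (
    -- Same sample rows as above, written as a VALUES list
    VALUES ('ABC', 'BERT'),
           ('DEF', 'JAAP'),
           ('GHI', 'KEES'),
           ('JKL', 'JAN'),
           ('ETC', 'PIET')
),
bucketed AS (
    -- Assign each row to one of 3 buckets in id order
    SELECT id,
           NTILE(3) OVER (ORDER BY id) AS bucket
    FROM cte
)
-- Keep the first (smallest) id of every bucket
SELECT bucket,
       MIN(id) AS first_id
FROM bucketed
GROUP BY bucket
ORDER BY bucket;
-- bucket | first_id
--      1 | ABC
--      2 | ETC
--      3 | JKL
```

For the real table, cte would be replaced by mytable and the bucket count by roughly the total row count divided by 10,000, which means the row count has to be known up front. NTILE() still needs a full ordered pass over the table, so there is no guarantee this is faster than the ROW_NUMBER() version.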
