
I want to select rows from, say, the Nth row to the Mth row of a table. I don't want to use any ORDER BY because the table is huge: 38 million rows. I found a solution which says to use the following query

SELECT *
  FROM (SELECT suppliers2.*, ROWNUM rnum          -- number the ordered rows
          FROM (SELECT *
                  FROM suppliers
                 ORDER BY supplier_name) suppliers2
         WHERE ROWNUM <= 5)                        -- upper bound (Mth row)
 WHERE rnum >= 3;                                  -- lower bound (Nth row)

But since it has two SELECT statements and my table is very big (38 million rows), I wanted to know whether there is another way that is less taxing on the DB. I could also use MINUS, but again I see a performance problem. I basically want to select the first one million rows and put them into a file, then select the second million rows and put them into a file, and so on. Please help.

1 Answer

It's not clear to me why you need to page through the results in the first place. You apparently want to grab an arbitrary 1 million rows, put that data in one file, grab another arbitrary 1 million rows (ensuring that you don't grab the same row twice), put that in a second file, and repeat the process until you've generated 38 separate files. What benefit do you derive from issuing 38 separate SELECT statements rather than issuing a single SELECT statement and letting the caller simply write the first million rows that it fetches to one file and then write the second million rows that it fetches to a second file?
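A minimal sketch of that single-cursor approach, assuming a Python client using the cx_Oracle driver (the driver choice, connection details, table name, and file naming here are illustrative assumptions, not something from the question):

import csv
import cx_Oracle  # assumption: cx_Oracle is the client driver in use

CHUNK = 1_000_000  # rows per output file

conn = cx_Oracle.connect(user="scott", password="tiger", dsn="dbhost/orclpdb")  # placeholder credentials
cur = conn.cursor()
cur.arraysize = 10_000                    # fetch in large batches per network round trip
cur.execute("SELECT * FROM suppliers")    # one statement, no ORDER BY needed for a plain export

file_no = 1
while True:
    rows = cur.fetchmany(CHUNK)           # next million rows from the same open cursor
    if not rows:
        break
    with open(f"suppliers_{file_no:02d}.csv", "w", newline="") as out:
        csv.writer(out).writerows(rows)   # write this chunk as one delimited file
    file_no += 1

cur.close()
conn.close()

Note that fetchmany(CHUNK) buffers a full million rows in client memory before writing; with wide rows you may prefer to loop in smaller batches while still rolling over to a new file every million rows.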

Are you trying to generate the files in parallel from 38 separate worker processes? If so, it seems unlikely that you'll get much benefit from parallelizing the writes at the expense of substantially increasing the amount of work the database has to do. I guess I could envision a system where writes are slow on the client but easy to parallelize, reads on the server are very fast, and there is a ton of memory available for sorting on the database server; in that case it might be quicker to write the files in parallel. But there aren't many systems with those characteristics. If you do want to use parallelism, you'd generally be better served letting the client issue a single SELECT to the database and allowing the database to run that SELECT statement in parallel.

If you are determined to select the results in pages, the query you posted should be the most efficient. The fact that there are nested select statements isn't particularly relevant to the analysis of performance. The query will only hit the table once. It still may be very expensive if it needs to fetch and sort all 38 million rows in order to determine which is the 3rd row and which is the 5th row. And it will likely get steadily slower when you look for subsequent pages of data. Fetching rows 37,000,001 - 38,000,000 will require, at a minimum, reading the entire table. That's one reason that it's unlikely to be all that helpful to write the files in parallel -- pulling the first few pages of data is likely to be so much more efficient than pulling the last page that you're going to be limited by that query and the time required to pull 38 million rows over the network.
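For a concrete sense of what paging through the whole table would involve, here is a sketch of the question's query generalized to an arbitrary page with bind variables, again assuming a Python client with cx_Oracle (connection details and names are placeholders):

import cx_Oracle  # assumption: same driver as in the sketch above

PAGE_SIZE = 1_000_000

# The question's ROWNUM pattern with the bounds turned into bind variables.
PAGE_SQL = """
SELECT *
  FROM (SELECT s.*, ROWNUM rnum
          FROM (SELECT * FROM suppliers ORDER BY supplier_name) s
         WHERE ROWNUM <= :max_row)
 WHERE rnum >= :min_row
"""

def fetch_page(cur, page_no):
    # page_no is 1-based; page 38 asks for rows 37,000,001 to 38,000,000,
    # which still forces Oracle to read and order the whole table first
    min_row = (page_no - 1) * PAGE_SIZE + 1
    max_row = page_no * PAGE_SIZE
    cur.execute(PAGE_SQL, max_row=max_row, min_row=min_row)
    return cur.fetchall()  # memory permitting; each page is a million rows

conn = cx_Oracle.connect(user="scott", password="tiger", dsn="dbhost/orclpdb")  # placeholder
last_page = fetch_page(conn.cursor(), 38)  # the most expensive page

Each later page repeats the full scan and sort, which is the cost the single-cursor approach avoids.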


Comments

I am not planning on doing a parallel run, but I need a way whereby I can offload one million rows each to 38 files, in a way that doesn't tax the system. Can you please suggest some way I can do that? Since this is only a one-time job I don't want to invest in tools. If not the SELECT statement, is there any other way? Maybe there is some command available which can help me?
@AmitRaya - Run a single SELECT * FROM your_table from your application. Open file #1. Fetch 1 million rows, write them to file #1. Close file #1. Open file #2. Fetch 1 million rows from the same cursor, write them to file #2. Repeat 38 times.
I don't want to issue one SELECT statement because I feel that will be too taxing for the system, though I understand I then land in the same problem of paging through the table, which I hadn't thought about :) .. The only idea is that I am looking for a way to offload the table to a delimited file without taxing the system. That is the end result I am looking at, please help.
@AmitRaya - A single SELECT will be far less taxing than 38 separate SELECT statements particularly since the single statement doesn't need an ORDER BY. You can't help reading 38 million rows from the database if you want to write 38 million rows of data to files. Reading 38 million rows isn't trivial but it's not a whole lot on modern hardware either.
If I use spool as given below, is it still as taxing for the system as a SELECT statement? I am sorry for my ignorance, but I am very new to PL/SQL. spool c:\myfile.txt select field1||', '||field2||', '||field3 from my_table; spool off -- turn spooling off set head on -- turn the heading parameter back on