
I am trying to select all the columns in my table, which has about 2.5 million records.

But it throws the above exception after some time of execution. How do I solve this problem?

adapter.SelectCommand = new SqlCommand("SELECT * from Dwh_staging_table", con1);
adapter.Fill(DataSet2, "Transformed_Table");
  • The entire data is too large for the memory the application is allowed to allocate. Maybe you could solve the issue by reading row by row instead of loading all the rows at once. And do you really need to load all the columns of your table? Commented Nov 24, 2013 at 20:27
  • can you perform the same query in a database management tool? Commented Nov 24, 2013 at 20:27
  • maybe just partially load your data Commented Nov 24, 2013 at 20:27
  • Yes, I need to read all the rows. Commented Nov 24, 2013 at 20:32
  • And what do you need to do with the loaded data afterwards? Commented Nov 24, 2013 at 20:33

3 Answers


I guess you are dealing with some custom-built Data Warehouse solution, which means huge amounts of data. Whatever you do, you shouldn't be loading all the data from the database into the application just to calculate some numbers in a staging table.

The best thing you can do is to calculate whatever you need before you put the data into Dwh_staging_table, so the problem is solved before it happens. If this is not possible and you have already loaded the data into the database, you should do all the processing in place, in the database (e.g. using the much-hated Stored Procedures).

In general, when you are dealing with huge amounts of data, moving the data around is your biggest enemy. Try to solve all your problems where the data already lives, without unnecessary transfer.

If you still want to load the data back into C# code (which I don't advise), try to do it without materialising all the data in memory. Create a repository function which returns an IEnumerable and internally uses yield return, so the whole collection of data is never materialised at once, as sketched below.
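A minimal sketch of such a repository method (the StagingRow type, its two columns and the connection-string parameter are assumptions for illustration, not from the original question):

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public class StagingRow
{
    public string Category { get; set; }
    public decimal? Metric { get; set; }
}

public static class StagingRepository
{
    // Streams rows one at a time: SqlDataReader holds only the current row,
    // and yield return hands it to the caller, so the full 2.5 million rows
    // are never materialised in a collection.
    public static IEnumerable<StagingRow> ReadStagingRows(string connectionString)
    {
        using (var con = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "SELECT [Category], [Metric] FROM [Dwh_staging_table]", con))
        {
            con.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    yield return new StagingRow
                    {
                        Category = reader.GetString(0),
                        Metric = reader.IsDBNull(1) ? (decimal?)null : reader.GetDecimal(1)
                    };
                }
            }
        }
    }
}

As long as the caller iterates this with foreach (and never calls ToList() on it), memory use stays flat regardless of the table size.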

And if you still insist on materializing the data in some collection (which I advise against even more), look at collections which do not use sequential blocks of memory. Using collections like arrays, List or DataSet results in a higher chance of an out of memory exception, because they need one large contiguous allocation. Try something like LinkedList, or even better some chunked LinkedList of arrays (almost like the paging which was suggested in the other post).
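For illustration, a rough sketch of such a chunked collection (the chunk size is an arbitrary assumption): each chunk is a fixed-size array linked to the next, so the runtime never has to find one contiguous block for the whole data set.

using System;
using System.Collections.Generic;

public static class Chunked
{
    // Buffers items into fixed-size arrays linked together; no single
    // allocation is larger than one chunk, unlike List<T>, which keeps
    // doubling one contiguous backing array.
    public static LinkedList<T[]> ToChunkedList<T>(IEnumerable<T> source, int chunkSize = 65536)
    {
        var chunks = new LinkedList<T[]>();
        var buffer = new T[chunkSize];
        int count = 0;
        foreach (var item in source)
        {
            buffer[count++] = item;
            if (count == chunkSize)
            {
                chunks.AddLast(buffer);
                buffer = new T[chunkSize];
                count = 0;
            }
        }
        if (count > 0)   // flush the partially filled last chunk
        {
            var last = new T[count];
            Array.Copy(buffer, last, count);
            chunks.AddLast(last);
        }
        return chunks;
    }
}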

EDIT: From what you said

I have some missing values in the table; I want to fill some columns afterwards using the avg technique

it sounds to me like something that should be possible with just one UPDATE statement on the staging table in the database. I am not sure exactly what you want (e.g. "I want to set AvgMetric to the average of the Metric column grouped by the Category column"). In that case it would look like:

WITH t AS (
SELECT st.[Category]
      ,st.[AvgMetric]
      ,AVG(st.[Metric]) OVER (PARTITION BY st.[Category]) AS [CalculatedAvgMetric]
  FROM [Dwh_staging_table] st
)
UPDATE t
   SET [AvgMetric] = [CalculatedAvgMetric]

2 Comments

Is paging related to the SQL query or to C#? I am new to this concept.
What HaurkurHaf suggested is paging at the SQL query level using row_number() (beyondrelational.com/modules/2/blogs/28/posts/10434/…). The advantage is that you'll load just small chunks of data into memory. The disadvantage is that you'll execute multiple queries against the database. What I was suggesting is to open a connection and keep it open while processing all the data as a stream. The use of LinkedList (or a LinkedList of Lists) applies if you want to put everything in memory.

The obvious answer would be to reduce the data set returned. Do you need all the columns?

How about paging the data? Check out the row_number() function.
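A minimal sketch of what that could look like driven from C# (the table name comes from the question; the [Id] key column, the page size and the processing callback are assumptions for illustration):

using System;
using System.Data;
using System.Data.SqlClient;

public static class PagedReader
{
    // Processes [Dwh_staging_table] in pages of 1000 rows via ROW_NUMBER(),
    // so only one page is ever held in memory. [Id] is a hypothetical
    // stable key column used to give the rows a deterministic order.
    public static void ProcessInPages(string connectionString, Action<DataTable> processPage)
    {
        const int pageSize = 1000;
        const string sql = @"
            SELECT * FROM (
                SELECT ROW_NUMBER() OVER (ORDER BY st.[Id]) AS rn, st.*
                FROM [Dwh_staging_table] st
            ) AS paged
            WHERE paged.rn BETWEEN @from AND @to;";

        using (var con = new SqlConnection(connectionString))
        {
            con.Open();
            for (int from = 1; ; from += pageSize)
            {
                var page = new DataTable();
                using (var cmd = new SqlCommand(sql, con))
                {
                    cmd.Parameters.AddWithValue("@from", from);
                    cmd.Parameters.AddWithValue("@to", from + pageSize - 1);
                    using (var adapter = new SqlDataAdapter(cmd))
                        adapter.Fill(page);
                }
                if (page.Rows.Count == 0) break;  // past the last page
                processPage(page);                // page becomes collectable afterwards
            }
        }
    }
}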

2 Comments

How do I reduce the data set returned?
You are selecting all the columns by doing "select * ...". If you don't need all the columns, just specify the ones you need in the SELECT statement. Still, we are talking about 2.5 million records which you are trying to fit into memory. By using the row_number() function you can split this up into batches: start by getting just the first 1000 rows, then the next 1000, etc.

Use custom paging. Fetch only a limited number of records at a time.

