
Please see the following situation:

I have a CSV file from which I import a couple of fields (not all of them) into SQL Server, using Entity Framework with the Unit of Work and Repository design patterns.

var newGenericArticle = new GenericArticle
{
    GlnCode = data[2],
    Description = data[5],
    VendorId = data[4],
    ItemNumber = data[1],
    ItemUOM = data[3],
    VendorName = data[12]
};

var unitOfWork = new UnitOfWork(new AppServerContext());
unitOfWork.GenericArticlesRepository.Insert(newGenericArticle);

unitOfWork.Commit();

Now, the only way to uniquely identify a record is by checking 4 fields: GlnCode, Description, VendorId and ItemNumber.

So, before I can insert a record, I need to check whether or not it exists:

 var unitOfWork = new UnitOfWork(new AppServerContext());

 // If the article is already existing, update the vendor name.
 if (unitOfWork.GenericArticlesRepository.GetAllByFilter(
         x => x.GlnCode.Equals(newGenericArticle.GlnCode) &&
              x.Description.Equals(newGenericArticle.Description) &&
              x.VendorId.Equals(newGenericArticle.VendorId) &&
              x.ItemNumber.Equals(newGenericArticle.ItemNumber)).Any())
 {
     var foundArticle = unitOfWork.GenericArticlesRepository.GetByFilter(
         x => x.GlnCode.Equals(newGenericArticle.GlnCode) &&
              x.Description.Equals(newGenericArticle.Description) &&
              x.VendorId.Equals(newGenericArticle.VendorId) &&
              x.ItemNumber.Equals(newGenericArticle.ItemNumber));

     foundArticle.VendorName = newGenericArticle.VendorName;

     unitOfWork.GenericArticlesRepository.Update(foundArticle);
 }

If it exists, I need to update it, which you can see in the code above.

Now, you need to know that I'm importing around 1,500,000 records, so quite a lot. And it's the filter that causes the CPU to reach almost 100%.

The `GetAllByFilter` method is quite simple and does the following:

return !Entities.Any() ? null : !Entities.Where(predicate).Any() ? null : Entities.Where(predicate).AsQueryable();

where `predicate` is of type `Expression<Func<TEntity, bool>>`.

Is there anything that I can do to make sure that the server's CPU doesn't reach 100%?

Note: I'm using SQL Server 2012

Kind regards

  • I would suggest to use a stored procedure, or try a bulk insert extension Commented Apr 14, 2015 at 14:04
  • Why are you so .Any() happy? You are literally querying the database 10 times for every insert. Now, granted, that Any tends to use an EXISTS query, but it's still a query. In particular, you call Entities.Any(), then Any on the predicate, then return an iqueryable and then call Any on that again! Sheesh. Commented Apr 15, 2015 at 17:40
  • But beyond that, EF is just not designed for this.. It's not a batch or bulk job processor... Use SqlBulkCopy class instead. Commented Apr 15, 2015 at 17:43
  • @ErikFunkenbusch Any suggestion on how to get rid of the Any() implementation to make it more performant? Commented Apr 16, 2015 at 6:34
  • Yes, just do a single query with a where clause and your four conditions with a SingleOrDefault (assuming it can only return a single record), and if it's null it means it doesn't exist, so skip the update. Commented Apr 16, 2015 at 6:42
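Following the last comment's suggestion, here is a minimal sketch of collapsing the repeated `Any()`/`GetByFilter` calls into a single round trip. It assumes the repository exposes a `SingleOrDefaultByFilter`-style method (that name is hypothetical; your repository API may differ):

```csharp
// Sketch only: SingleOrDefaultByFilter is an assumed repository method that
// runs one SingleOrDefault query instead of three separate Any()/Where() queries.
var foundArticle = unitOfWork.GenericArticlesRepository.SingleOrDefaultByFilter(
    x => x.GlnCode == newGenericArticle.GlnCode &&
         x.Description == newGenericArticle.Description &&
         x.VendorId == newGenericArticle.VendorId &&
         x.ItemNumber == newGenericArticle.ItemNumber);

if (foundArticle == null)
{
    // Not found: insert the new article.
    unitOfWork.GenericArticlesRepository.Insert(newGenericArticle);
}
else
{
    // Found: only the vendor name needs updating.
    foundArticle.VendorName = newGenericArticle.VendorName;
    unitOfWork.GenericArticlesRepository.Update(foundArticle);
}
```

This turns up to ten queries per record into exactly one lookup plus one insert or update.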

3 Answers


Wrong tool for the task. You should never process a million-plus records one at a time. Insert the records into a staging table using bulk insert, clean them if need be, and then use a stored procedure to do the processing in a set-based way, or use the tool designed for this: SSIS.
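As a rough sketch of the staging-table load with `SqlBulkCopy` (the table name `dbo.GenericArticlesStaging`, the `csvRows` variable, and the column layout are assumptions for illustration):

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch: bulk-load the parsed CSV rows into a staging table in one streamed
// operation instead of 1.5M individual EF inserts. Names are assumed.
var table = new DataTable();
table.Columns.Add("GlnCode", typeof(string));
table.Columns.Add("Description", typeof(string));
table.Columns.Add("VendorId", typeof(string));
table.Columns.Add("ItemNumber", typeof(string));
table.Columns.Add("ItemUOM", typeof(string));
table.Columns.Add("VendorName", typeof(string));

foreach (var data in csvRows) // csvRows: your parsed CSV lines (string[] each)
{
    table.Rows.Add(data[2], data[5], data[4], data[1], data[3], data[12]);
}

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.GenericArticlesStaging";
        bulkCopy.BatchSize = 10000; // stream in chunks rather than one huge transaction
        bulkCopy.WriteToServer(table);
    }
}
```

For very large files you can also build the `DataTable` in chunks and call `WriteToServer` per chunk to keep memory bounded.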




I've found another solution which wasn't proposed here, so I'll be answering my own question.

I will have a temp table into which I will import all the data, and after the import, I'll execute a stored procedure which runs a MERGE command to populate the destination table. I believe this is the most performant approach.
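As a sketch, the stored procedure could be built around a `MERGE` on the four key columns and invoked once from EF after the load. The procedure and table names below are assumptions:

```csharp
// Sketch of the set-based upsert. The stored procedure (name assumed) would
// contain roughly this T-SQL:
//
//   MERGE dbo.GenericArticles AS target
//   USING dbo.GenericArticlesStaging AS source
//     ON  target.GlnCode     = source.GlnCode
//     AND target.Description = source.Description
//     AND target.VendorId    = source.VendorId
//     AND target.ItemNumber  = source.ItemNumber
//   WHEN MATCHED THEN
//     UPDATE SET target.VendorName = source.VendorName
//   WHEN NOT MATCHED THEN
//     INSERT (GlnCode, Description, VendorId, ItemNumber, ItemUOM, VendorName)
//     VALUES (source.GlnCode, source.Description, source.VendorId,
//             source.ItemNumber, source.ItemUOM, source.VendorName);
//
// One call from EF replaces 1.5M round trips:
using (var context = new AppServerContext())
{
    context.Database.ExecuteSqlCommand("EXEC dbo.MergeGenericArticles");
}
```

The existence check, insert and update all happen server-side in a single set-based statement, which is exactly what was killing the CPU when done row by row through LINQ.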



Have you indexed on those four fields in your database? That is the first thing that I would do.

Ok, I would recommend trying the following: Improving bulk insert performance in Entity framework

To summarize: do not call SaveChanges() after every insert or update. Instead, call it every 1,000-2,000 records so that the inserts/updates are sent to the database in batches.

Also, optionally change the following parameters on your context:

yourContext.Configuration.AutoDetectChangesEnabled = false;
yourContext.Configuration.ValidateOnSaveEnabled = false;
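A sketch of that batching approach (the batch size of 1,000 and recreating the context between batches are choices, not requirements; `articlesToImport` stands in for your parsed CSV rows):

```csharp
// Sketch: accumulate inserts and flush every 1,000 records. Recreating the
// context between batches keeps the change tracker from growing unbounded.
const int batchSize = 1000;
var context = new AppServerContext();
context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;

int count = 0;
foreach (var article in articlesToImport)
{
    context.GenericArticles.Add(article);
    if (++count % batchSize == 0)
    {
        context.SaveChanges();
        context.Dispose();
        context = new AppServerContext();
        context.Configuration.AutoDetectChangesEnabled = false;
        context.Configuration.ValidateOnSaveEnabled = false;
    }
}
context.SaveChanges(); // flush the final partial batch
context.Dispose();
```

With change detection disabled, remember that EF no longer notices modifications to already-tracked entities automatically, so this suits insert-heavy imports best.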


  • Not done it yet. Will do right away. Don't understand how I missed that. Any other ideas?
  • The fields are indexed right now, but the problem remains the same. I have a table with a 4-column key (the 4 columns that make a record unique). I've created an index on those 4 columns, but the problem still remains.
  • In that case you might want to use a stored procedure with a MERGE statement. You can call stored procedures from Entity Framework.
  • I will give that a try, but is that such a performance boost when I still need to execute the SP for every record in the file to import? Meaning 1,500,000 calls to the stored procedure?
  • No, you can use a table-valued parameter and send it in batches. mikesdotnet.wordpress.com/2013/03/17/…
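A sketch of the table-valued-parameter call mentioned in that comment (the user-defined table type `dbo.GenericArticleType`, the procedure name, and `batchTable` — a `DataTable` whose columns match that type — are all assumptions):

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch: send a whole batch of rows to the stored procedure as one
// table-valued parameter instead of one call per record. Names assumed.
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("dbo.UpsertGenericArticles", connection))
{
    command.CommandType = CommandType.StoredProcedure;

    var parameter = command.Parameters.AddWithValue("@Articles", batchTable);
    parameter.SqlDbType = SqlDbType.Structured;     // mark the parameter as a TVP
    parameter.TypeName = "dbo.GenericArticleType";  // the user-defined table type

    connection.Open();
    command.ExecuteNonQuery();
}
```

Inside the procedure, the `@Articles` parameter can be used as the `USING` source of a MERGE, so each batch of thousands of rows costs a single round trip.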
