(A) This version is slow... duration is measured in multiple minutes

  • DB is a typical EF Data Context to a SQL Server database
  • AA_Words_100 is a simple SQL Server table which is added to the EF designer
  • DB.AA_Words_100.Add is called ~3,000 times (confirmed via debugging with counter variables)
  • I have confirmed that >99% of the runtime is inside the inner loop
  • XCampaign is a Dictionary<string, Dictionary<string, _Word>> where _Word is a trivial non-EF object.
foreach (var XCampaign in Words100)
    foreach (var KVP in XCampaign.Value)
        DB.AA_Words_100.Add(KVP.Value.To_AA_Word_100());

DB.SaveChanges();

(B) This version is fast... the .Add() call is simply commented out (and replaced with a counter) to narrow the scope

var iTemp = 0;

foreach (var XCampaign in Words100)
  foreach (var KVP in XCampaign.Value)
    iTemp++;

DB.SaveChanges();

(C) This version is fast. I simply fill up a List before calling DB.AddRange(...)

var LIST_WordsToAdd = new List<AA_Words_100>();

foreach (var XCampaign in Words100)
{
    foreach (var KVP in XCampaign.Value)
    {
        LIST_WordsToAdd.Add(KVP.Value.To_AA_Word_100());
    }
}

DB.AA_Words_100.AddRange(LIST_WordsToAdd);

DB.SaveChanges();

(D) Documentation

According to the DbContext.Add documentation:

Begins tracking the given entity, and any other reachable entities that are not already being tracked, in the Added state such that they will be inserted into the database when SaveChanges() is called.

In particular, when SaveChanges() is called.

I recently migrated to EF from Linq-to-SQL in this application. Linq-to-SQL did not have this problem.

What reason could there be for the DB.AA_Words_100.Add(...) command being so slow?

Thank you!

Update - To_AA_Word_100() Code

public AA_Words_100 To_AA_Word_100()
{
    var W = new AA_Words_100();
    W.Word = Word;
    W.Word_Clean = Word.Replace("'", "");
    W.PhraseCount = PhraseCount;
    W.Clicks = Clicks;
    W.Impressions = Impressions;
    W.SalesAmt = SalesAmt;
    W.SalesOrders = SalesOrders;
    W.SalesUnits = SalesUnits;
    W.Spend = Spend;
    W.Campaign = Campaign;
    return W;
}

        
  • This should answer your question: entityframework.net/improve-ef-add-performance. Though multiple minutes is a bit too much. Commented Nov 6, 2020 at 22:14
  • Can you show us the code of KVP.Value.To_AA_Word_100()? Commented Nov 8, 2020 at 9:25
  • How long has your DbContext been alive prior to this call, and how many AA_Words_100 entities is it tracking? Before going into your loop, add the following: var count = DB.AA_Words_100.Local.Count() and see how many objects it is tracking (a quick sketch follows these comments). Every time you add an entity to a DbSet there is a cost, as EF assesses it against everything it is already tracking. The more entities (and related entities) it tracks, the bigger this cost gets. For bulk-like operations you will want fresh DbContexts, or consider approaches that bypass the EF DbContext. Commented Nov 8, 2020 at 22:10
  • As far as I understand, the difference with LinqToSql is that it calls a similar detect-changes method only on save, while EF by default calls it on every Add (unless you explicitly disable that). Commented Nov 9, 2020 at 19:46
  • @Brian, so if your test was done before the loops, the DbContext was already tracking over 3,000 entities, with each iteration adding one. The answer is to use AddRange when inserting numerous items. Commented Nov 9, 2020 at 20:50

1 Answer


As stated in the comments - Entity Framework (not sure about Entity Framework Core) by default calls DetectChanges on every Add. This function, among other things, scans all entities already tracked by the context to detect changes in and between them. That means the time complexity of this function is O(n), where n is the number of entities already tracked by the context. When you do a lot of Adds in a loop, the overall time complexity becomes O(n^2), where n is the total number of items added. So even with small numbers like 3,000 rows, performance drops very significantly.
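To put rough numbers on it: ~3,000 Add calls mean the tracker is scanned on the order of 1 + 2 + … + 3,000 ≈ 4.5 million entity comparisons in total (more if the context was already tracking entities before the loop), which is consistent with a runtime measured in minutes rather than seconds.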

To fix this (arguably a design) issue, there are a couple of options:

  1. Set the context's AutoDetectChangesEnabled to false, then manually call DetectChanges before SaveChanges (see the sketch after this list).

  2. Or use AddRange instead of adding entities one by one; it calls DetectChanges just once.
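A minimal sketch of option 1, reusing the DB context and Words100 collection from the question. The property path shown (DB.Configuration.AutoDetectChangesEnabled) is the EF6 one; EF Core exposes the same switch as DB.ChangeTracker.AutoDetectChangesEnabled:

DB.Configuration.AutoDetectChangesEnabled = false;   // EF6; EF Core: DB.ChangeTracker.AutoDetectChangesEnabled
try
{
    foreach (var XCampaign in Words100)
        foreach (var KVP in XCampaign.Value)
            DB.AA_Words_100.Add(KVP.Value.To_AA_Word_100());   // no DetectChanges per Add now

    DB.ChangeTracker.DetectChanges();   // one scan for the whole batch
    DB.SaveChanges();
}
finally
{
    DB.Configuration.AutoDetectChangesEnabled = true;   // restore the default behaviour
}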

A few other notes:

  1. Try to avoid reusing a context between operations. You said there were already 3,000 entities tracked by the context before you called the first Add. It's better to create a new context every time you need one, do the work, then dispose it. The performance impact is negligible (connections are managed by the connection pool and are not necessarily opened or closed every time you create/dispose a context), but you will have far fewer problems like this one (reusing a context can bite you not only in the scenario you have now, but in several others).

  2. Use AsNoTracking queries if you do not intend to modify the entities returned by a specific query (or if you intend to modify some of them later by attaching them to a context). The context will then not track them, which reduces the possibility of this and other performance problems (a short sketch follows these notes).
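A short sketch illustrating both notes together; AppDbContext is a hypothetical context type standing in for the question's DB:

using (var db = new AppDbContext())                       // fresh, short-lived context (AppDbContext is a stand-in name)
{
    // Read-only query: AsNoTracking keeps these rows out of the change tracker,
    // so they cannot slow down later Add calls.
    var existingWords = db.AA_Words_100.AsNoTracking().ToList();

    // ... add / modify entities here ...

    db.SaveChanges();
}                                                          // the change tracker is discarded with the context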

As for Linq To Sql - it has a similar concept of "detect changes", but it is automatically called only before committing changes to the database, not on every add, so you do not see the same problem there.


1 Comment

Just a note, usually it's not even necessary to call DetectChanges yourself. EF always calls DetectChanges internally before saving changes.
