
I have a table that I will populate with values from an expensive calculation (XQuery against an immutable XML column). To speed up deployment to production I have precalculated the values on a test server and saved them to a file with BCP.

My script is as follows:

-- Lots of other work, including modifying OtherTable

CREATE TABLE FOO (...)
GO

BULK INSERT FOO
FROM 'C:\foo.dat';
GO

-- rerun from here after the break

INSERT INTO FOO 
  (ID, TotalQuantity)
SELECT 
    e.ID, 
    SUM(e.TotalQuantity) AS TotalQuantity
FROM (SELECT 
        o.ID,
        h.n.value('TotalQuantity[1]/.', 'int') AS TotalQuantity
      FROM dbo.OtherTable o
          CROSS APPLY o.XmlColumn.nodes('(item/.../salesorder/)') h(n)
      WHERE o.ID NOT IN (SELECT DISTINCT ID FROM FOO)
) AS e
GROUP BY e.ID

When I run the script in Management Studio the first two statements complete within seconds, but the last statement takes 4 hours to complete. Since no rows have been added to OtherTable since my foo.dat was computed, Management Studio reports (0 row(s) affected).

If I cancel the query execution after a couple of minutes, select just the last query, and run it separately, it completes within 5 seconds.

Notable facts:

  • OtherTable contains 200k rows, and the data in XmlColumn is fairly large; total table size is ~3 GB
  • The FOO table gets 1.3M rows

What could possibly make the difference?
Management Studio has implicit transactions turned off. As far as I can understand, each statement then runs in its own transaction.

Update:
If I first select and run the script up to -- rerun from here after the break, then select and run just the last query, it is still slow until I cancel execution and try again. This at least rules out any effect of running "together" with the previous code in the script, and boils it down to the same query being slow on first execution and fast on the second (with all other conditions the same).
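One way to test whether the first-run slowness is tied to plan compilation is to flush the plan cache and rerun the statement. This is a hypothetical diagnostic sketch, not part of the original script, and only safe on a test server:

```sql
-- Test/dev server only: this flushes ALL cached plans on the instance
-- and forces a recompile for every subsequent query.
DBCC FREEPROCCACHE;
GO

-- Now rerun the INSERT ... SELECT. If it is slow again after the flush,
-- the first-vs-second-execution difference is plan-related.
```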

  • Can you see any differences in the execution plans? With the last statement taking 4 hours, you can look at the estimated plans instead of the actual (at least for a start). Commented Jan 4, 2012 at 8:12
  • "If I cancel the query execution after a couple of minutes and selects just the last query and run that separately it completes within 5 seconds." - are you running the select on its own, inserting the results into an empty foo or inserting the results into an already-populated foo? Does foo get 1.3M rows mostly from the BCP process or from the insert from OtherTable? Commented Jan 4, 2012 at 8:49
  • @MarkBannister, I'm running the select with the tables populated. I'm just continuing the same script from the point where I pressed cancel. All 1.3M rows come from the bulk insert. (That's what (0 row(s) affected) indicates). Commented Jan 4, 2012 at 9:08

3 Answers


Probably different execution plans. See Slow in the Application, Fast in SSMS? Understanding Performance Mysteries.
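To check whether the two executions really did get different plans, one option is to query the plan-cache DMVs (assumes VIEW SERVER STATE permission; the LIKE filter is just an illustrative way to locate the statement):

```sql
-- Pull the cached plan and execution stats for the slow statement,
-- so the first and second executions can be compared.
SELECT qs.execution_count,
       qs.total_elapsed_time / 1000 AS total_elapsed_ms,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE '%INSERT INTO FOO%';
```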


3 Comments

I used fulltablescan.com/index.php?/archives/… to get the execution plan of the slow-running query, but as far as I can see it uses the same execution plan the second time I run the query. Unfortunately I cannot get the execution count of each part (I'm interested in the number of XPath evaluations) unless I wait 4 hours. I'll try to let the query run overnight to get the full execution plan.
Besides, why would the execution plan change between two identical invocations from Management Studio with no other activity in between?
"why would the execution plan change between two identical invocations": stats.

Could it possibly be related to the statistics being completely wrong on the newly created Foo table? If SQL Server automatically updates the statistics when it first runs the query, the second run would have its execution plan created from up-to-date statistics.

What if you check the statistics right after the bulk insert (with the STATS_DATE function) and then check them again after having cancelled the long-running query? Did the stats get updated, even though the query was cancelled?

In that case, an UPDATE STATISTICS on Foo right after the bulk insert could help.
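A sketch of that check, using the sys.stats catalog view (WITH FULLSCAN is optional; a sampled update may be enough):

```sql
-- When were the statistics on FOO last updated?
SELECT s.name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.FOO');

-- If they are NULL or stale right after the bulk insert,
-- refresh them before running the big INSERT ... SELECT:
UPDATE STATISTICS dbo.FOO WITH FULLSCAN;
```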

1 Comment

The STATS_DATE for the PK of the FOO table is NULL after the query executes. Sounds like a possible fix. However, I managed to get reasonable execution times by rewriting the query to a LEFT OUTER JOIN.

Not sure exactly why it helped, but I rewrote the last query to use a left outer join instead, and suddenly the execution time dropped to 15 milliseconds.

INSERT INTO FOO 
  (ID, TotalQuantity)
SELECT 
    e.ID, 
    SUM(e.TotalQuantity) AS TotalQuantity
FROM (SELECT 
        o.ID,
        h.n.value('TotalQuantity[1]/.', 'int') AS TotalQuantity
      FROM dbo.OtherTable o
          LEFT OUTER JOIN FOO f ON o.ID = f.ID
          CROSS APPLY o.XmlColumn.nodes('(item/.../salesorder/)') h(n)
      WHERE f.ID IS NULL
) AS e
GROUP BY e.ID
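For comparison, here is a NOT EXISTS variant (a sketch only; it typically compiles to the same anti-semi-join plan as the LEFT JOIN ... IS NULL form, and some people find the intent clearer):

```sql
INSERT INTO FOO (ID, TotalQuantity)
SELECT e.ID, SUM(e.TotalQuantity) AS TotalQuantity
FROM (SELECT o.ID,
             h.n.value('TotalQuantity[1]/.', 'int') AS TotalQuantity
      FROM dbo.OtherTable AS o
          CROSS APPLY o.XmlColumn.nodes('(item/.../salesorder/)') h(n)
      WHERE NOT EXISTS (SELECT 1 FROM FOO f WHERE f.ID = o.ID)
) AS e
GROUP BY e.ID;
```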
