I have a table that I will populate with values from an expensive calculation (with xquery from an immutable XML column). To speed up deployment to production I have precalculated values on a test server and saved to a file with BCP.
My script is as follows
-- Lots of other work, including modifying OtherTable
CREATE TABLE FOO (...)
GO
BULK INSERT FOO
FROM 'C:\foo.dat';
GO
-- rerun from here after the break
INSERT INTO FOO
(ID, TotalQuantity)
SELECT
e.ID,
SUM(e.Quantity) as TotalQuantity
FROM (select
o.ID,
h.n.value('TotalQuantity[1]/.', 'int') as TotalQuantity
FROM dbo.OtherTable o
CROSS APPLY XmlColumn.nodes('(item/.../salesorder/)') h(n)
WHERE o.ID NOT IN (SELECT DISTINCT ID FROM FOO)
) as E
GROUP BY e.ID
When I run the script in management studio the first two statements completes within seconds, but the last statement takes 4 hours to complete. Since no rows are added to the OtherTable since my foo.dat was computed management studio reports (0 row(s) affected).
If I cancel the query execution after a couple of minutes and selects just the last query and run that separately it completes within 5 seconds.
Notable facts:
- The OtherTable contains 200k rows and the data in XmlColumn is pretty large, total table size ~3GB
- The FOO table gets 1.3M rows
What could possibly make the difference?
Management studio has implicit transactions turned off. Is far as I can understand each statement will then run in its own transaction.
Update:
If I first select and run the script until -- rerun from here after the break, then select and run just the last query, it is still slow until I cancel execution and try again. This at least rules out any effects of running "together" with the previous code in the script and boils down to the same query being slow on first execution and fast on the second (running with all other conditions the same).
(0 row(s) affected)indicates).