I'm working with a database where the details of a phone call a user places with a client are saved into a single VARCHAR(MAX) column. For detailed reporting purposes (SSRS), however, all of the various attributes embedded within the VARCHAR column need to be turned into separate columns. To accomplish this parsing, I'm using a cursor in a stored procedure which calls a UDF that handles the extraction of the desired substring.
When querying a relatively small number of rows (no more than a few hundred), the performance of the query is decent. However, when querying a larger number of rows (several thousand), performance is terrible and the query will often time out. So, I'm looking for an approach that scales far better than what I have in place.
The VARCHAR column contains full HTML markup. I've removed irrelevant stuff to illustrate how the data I'm interested in is stored:
<table>
<tr>
<td class="key">Dial Result:</td><td id="DialResult" class="value">No Answer</td></tr>
<tr><td class="key">Client Name:</td><td id="ClientName" class="value">SMITH, BOB</td></tr>
<tr><td class="key">Number Dialed:</td><td id="NumberDialed" class="value">5555555555 [Day]</td></tr>
<tr><td class="key">Dial Count:</td><td id="DialCount" class="value">1</td></tr><tr><td class="key">Contact Made:</td><td id="ContactMade" class="value">No</td></tr>
<tr><td class="key">Campaign Called On:</td><td id="CampaignCalledOn" class="value">TEST CAMPAIGN</td></tr>
<tr><td class="key">Call Outcome:</td><td id="CallOutcome" class="value">No answer</td></tr>
<tr><td class="key">Email Sent:</td><td id="EmailSent" class="value">No</td></tr>
<tr><td class="key">Do Not Call Requested:</td><td id="DoNotCallRequested" class="value">No</td></tr>
<tr><td class="key">Product Purchased:</td><td id="ProductPurchased" class="value">No</td></tr>
<tr><td class="key">Order Number:</td><td id="OrderNumber" class="value">N/A</td></tr><tr><td class="key">Order Dollar Value:</td><td id="PurchaseAmount" class="value">0.00</td></tr>
<tr><td class="key">Purchased SKUs:</td><td id="PurchasedSKUs" class="value">N/A</td></tr>
</table>
Currently, I have a stored procedure which uses a cursor to iterate over rows in the "live" database table. The procedure cursor looks like this:
OPEN MainNoteCursor;
FETCH NEXT FROM MainNoteCursor INTO @SequenceNumber,@ClientId,@ContactNumber,@UserDisplayName,@CreatorId,@DateCol,@NoteText
WHILE (@@FETCH_STATUS <> -1)
BEGIN
IF (@@FETCH_STATUS <> -2)
BEGIN TRY
DECLARE @NoteDate date = null
DECLARE @DialResult varchar(255) = null
DECLARE @ClientName varchar(255) = null
DECLARE @NumberDialed varchar(255) = null
DECLARE @DialCount int = null
DECLARE @ContactMade varchar(3) = null
DECLARE @CampaignCalledOn varchar(255) = null
DECLARE @CallOutcome varchar(255) = null
DECLARE @EmailSent varchar(3) = null
DECLARE @DoNotCallRequested varchar(3) = null
DECLARE @ProductPurchased varchar(3) = null
DECLARE @OrderNumber varchar(255) = null
DECLARE @PurchaseAmount money = null
DECLARE @PurchasedSKUs varchar(255) = null
SET @NoteDate = CONVERT(date, @DateCol)
SET @DialResult = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="DialResult" class="value">'',''</td>'',1)
SET @ClientName = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="ClientName" class="value">'',''</td>'',1)
SET @NumberDialed = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="NumberDialed" class="value">'',''</td>'',1)
SET @DialCount = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="DialCount" class="value">'',''</td>'',1)
SET @ContactMade = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="ContactMade" class="value">'',''</td>'',1)
SET @CampaignCalledOn = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="CampaignCalledOn" class="value">'',''</td>'',1)
SET @CallOutcome = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="CallOutcome" class="value">'',''</td>'',1)
SET @EmailSent = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="EmailSent" class="value">'',''</td>'',1)
SET @DoNotCallRequested = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="DoNotCallRequested" class="value">'',''</td>'',1)
SET @ProductPurchased = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="ProductPurchased" class="value">'',''</td>'',1)
SET @OrderNumber = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="OrderNumber" class="value">'',''</td>'',1)
SET @PurchaseAmount = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="PurchaseAmount" class="value">'',''</td>'',1)
SET @PurchasedSKUs = [dbo].[FN_PARSE_TEXT] (@NoteText,''<TD id="PurchasedSKUs" class="value">'',''</td>'',1)
INSERT INTO @Return
([SequenceNumber],[ClientId],[ContactNumber],[UserDisplayName],[CreatorId],
[DateCol],[NoteDate],[NoteText],[DialResult],[ClientName],[NumberDialed],[DialCount],
[ContactMade],[CampaignCalledOn],[CallOutcome],[EmailSent],[DoNotCallRequested],
[ProductPurchased],[OrderNumber],[PurchaseAmount],[PurchasedSKUs])
VALUES
(@SequenceNumber,@ClientId,@ContactNumber,@UserDisplayName,@CreatorId,
@DateCol,@NoteDate,@NoteText,@DialResult,@ClientName,@NumberDialed,@DialCount,
@ContactMade,@CampaignCalledOn,@CallOutcome,@EmailSent,@DoNotCallRequested,
@ProductPurchased,@OrderNumber,@PurchaseAmount,@PurchasedSKUs)
END TRY
BEGIN CATCH
--Nothing to do.
END CATCH
FETCH NEXT FROM MainNoteCursor INTO @SequenceNumber,@ClientId,@ContactNumber,@UserDisplayName,@CreatorId,@DateCol,@NoteText
END
CLOSE MainNoteCursor;
DEALLOCATE MainNoteCursor;
SELECT * FROM @Return
So, you can see that the cursor is parsing the full note body each time, extracting the values between two string delimiters, inserting the values into a table, and then returning the table.
When executed in SSMS, I get a result that looks like this:
The code in the FN_PARSE_TEXT uses CHARINDEX(), LEN(), SUBSTRING(), etc. to read the text between the "Start string" and the "end string". I won't post the full function because there's a lot of housekeeping code that's not relevant, but it can be distilled down to this:
--********************************************************************************
-- CHARINDEX returns the position of the fist character in @StartKey, but we need to
-- start reading after the *last* character in @StartKey. Re-adjust the position
-- at which we''ll start reading the string.
--********************************************************************************
SET @StartKeyIndex = @StartKeyIndex + @StartKeyLength
SET @ReadLength = @EndKeyIndex - @StartKeyIndex
--********************************************************************************
-- Start / End positions have been determined, so now read out the calculated number
-- of characters and return to the calling code.
--********************************************************************************
SET @ReturnValue = LTRIM(RTRIM(SUBSTRING(@noteText, @StartKeyIndex, @ReadLength)))
So, after all of that background information (I can provide more detail, if needed), I'm looking for a better approach.
I've thought of creating an overnight batch process that does all of the parsing during off-hours and then dumping the data into a flattened table. I would then report of the flattened data instead of trying to parse out the details in real time. This, of course, opens its own can of worms and I'd ideally like to continue to hit the live data. Depending on the date range the users supply when running the query, upwards of 10,000 rows may be returned.
Another alternative I've thought of would be to use a CLR method to handle the looping and parsing - but I don't have a lot of experience there and I'm not sure that would be any more efficient than what I'm doing now.
I have read articles (https://sqlperformance.com/2012/07/t-sql-queries/split-strings - for example) that talk about things like CTE, but I don't really have much experience there, so I'm not sure about how to make the leap. Most of the "string parsing" articles I've read are geared towards scenarios where the delimiter is the same - things like parsing comma-separated strings. But since I have "rolling" delimiters, I'm not sure how to deal with that.
Anyway, if you're still with me, any input would be greatly appreciated.

as XMLand use xml methods to extract the data you need?NoteTextas xml, then use XML methods to shred the values you need. learn.microsoft.com/en-us/sql/t-sql/xml/xml-data-type-methods