I read an answer that said you don't want to use WHILE loops in SQL Server. I don't understand that generalization. I'm fairly new to SQL so I might not understand the explanation yet. I also read that you don't really want to use cursors unless you must. The search results I've found are too specific to the problem presented and I couldn't glean useful technique from them, so I present this to you.

What I'm trying to do is take the values in a client file and shorten them where necessary. There are a couple of things that need to be achieved here. I can't simply hack the field values provided. My company has standard abbreviations that are to be used. I have put these in a table, Abbreviations, which has LongName and ShortName columns. I don't want to simply abbreviate every LongName in the row. I only want to keep applying abbreviations while the field value is still too long. This is why I need the WHILE loop.

My thought process was thus:

CREATE FUNCTION [dbo].[ScrubAbbrev]
(@Field nvarchar(25),@Abbrev nvarchar(255))
RETURNS varchar(255)
AS
BEGIN
    DECLARE @max int = (select MAX(stepid) from Abbreviations)
    DECLARE @StepID int = (select min(stepid) from Abbreviations)
    DECLARE @find varchar(150)=(select Longname from Abbreviations where Stepid=@stepid)
    DECLARE @replace varchar(150)=(select ShortName from Abbreviations where Stepid=@stepid)
    DECLARE @size int = (select max_input_length from FieldDefinitions where FieldName = 'title')
    DECLARE @isDone int = (select COUNT(*) from SizeTest where LEN(Title)>(@size))

    WHILE @StepID<=@max or @isDone = 0 and LEN(@Abbrev)>(@size) and @Abbrev is not null
    BEGIN
        RETURN
        REPLACE(@Abbrev,@find,@replace)
        SET @StepID=@StepID+1
        SET @find =(select Longname from Abbreviations where Stepid=@stepid)
        SET @replace =(select ShortName from Abbreviations where Stepid=@stepid)
        SET @isDone = (select COUNT(*) from SizeTest where LEN(Title)>(@size))
    END
END

Obviously the RETURN should go at the end, but I need to reset my variables to the next @stepID, @find, and @replace.

Is this one of those times where I'd have to use a cursor (which I've never yet written)?

1 Answer

Generally, you don't want to use cursors or while loops in SQL because they only process a single row at a time, and thus perform very poorly. SQL is designed and optimized to process (potentially very large) sets of data, not individual values.

You could factor out the while loop by doing something like this:

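-- @maxLength: the longest value the target column may hold (declare it or pass it in)
UPDATE t
SET t.targetColumn = a.ShortName          -- swap in the standard abbreviation
FROM targetTable AS t
INNER JOIN Abbreviations AS a
    ON t.targetColumn = a.LongName        -- only values that have an abbreviation
WHERE LEN(t.targetColumn) > @maxLength;   -- only values that are too long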

This is generalized and you will need to tweak it to fit your specific data model, but here's what's going on:

For every row in "targetTable", set the value of "targetColumn" (what you want to abbreviate) to the relevant abbreviation (found in Abbreviations.ShortName) if and only if the current value has a standardized abbreviation (the inner join) and the current value is longer than desired (the WHERE condition).

You'll need to add an integer parameter or local variable, @maxLength, to indicate what constitutes "too long". This query processes the target table all at once, updating the value in the target column for every eligible row, while a function will only find the abbreviation for a single item (the intersection of one row and one column) at a time.
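As a minimal sketch of supplying @maxLength, the limit could be read from the FieldDefinitions table used in the question's function. This assumes that table has FieldName and max_input_length columns and that SizeTest.Title is the column being shortened; both come from the question's code, so adjust the names to your schema.

```sql
-- Assumption: FieldDefinitions(FieldName, max_input_length) holds the limit,
-- as in the question's function; SizeTest.Title is the column being shortened.
DECLARE @maxLength int =
    (SELECT max_input_length
     FROM FieldDefinitions
     WHERE FieldName = 'title');

-- Whole-value match: only rows whose entire Title equals a LongName change.
UPDATE t
SET t.Title = a.ShortName
FROM SizeTest AS t
INNER JOIN Abbreviations AS a
    ON t.Title = a.LongName
WHERE LEN(t.Title) > @maxLength;
```

Note that LEN ignores trailing spaces, so padded values are measured by their trimmed length.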

Note that this won't do anything if the value is too long but doesn't have a standard abbreviation. Your existing code has this same limitation, so I assume this is desired behavior.
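If you want to see which values are left over, a follow-up query along these lines lists the rows that are still too long after the update. The table and column names are the same placeholders as above, and the limit of 25 is only an illustrative value.

```sql
-- Placeholder limit; in practice use the same @maxLength as the update.
DECLARE @maxLength int = 25;

-- Rows that are still over the limit, i.e. had no matching abbreviation.
SELECT t.targetColumn,
       LEN(t.targetColumn) AS CurrentLength
FROM targetTable AS t
WHERE LEN(t.targetColumn) > @maxLength
ORDER BY CurrentLength DESC;
```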

I also recommend making this a stored procedure rather than a function. Scalar and multi-statement functions in SQL Server are largely opaque to the optimizer: it can't estimate their cost well, and a scalar function called in a query typically runs once per row, which can seriously harm performance.
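A rough sketch of what that procedure might look like is below. The procedure name dbo.AbbreviateTargetColumn is hypothetical, and targetTable and targetColumn are the same placeholders used in the query above.

```sql
-- Hypothetical procedure name; targetTable/targetColumn are placeholders.
CREATE PROCEDURE dbo.AbbreviateTargetColumn
    @maxLength int   -- longest value the column is allowed to hold
AS
BEGIN
    SET NOCOUNT ON;  -- suppress "n rows affected" messages

    -- Single set-based pass over every row that is too long.
    UPDATE t
    SET t.targetColumn = a.ShortName
    FROM targetTable AS t
    INNER JOIN Abbreviations AS a
        ON t.targetColumn = a.LongName
    WHERE LEN(t.targetColumn) > @maxLength;
END;
GO

-- Example call with an illustrative limit of 25 characters.
EXEC dbo.AbbreviateTargetColumn @maxLength = 25;
```

GO is a batch separator understood by tools such as SSMS and sqlcmd, not a T-SQL statement; drop it if your client sends the batches separately.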

4 Comments

I see what you're saying, as anything that has to evaluate each line would be a weighty process. This is contact information, so I would have to apply it to each field based on its available length. Also, I can't replace the entire field. What I'm looking to achieve is transforming "Research Associate Professor of Biostatistics" into "Prof, Dept Pathology & Lab Medicine, UBC". That's why I'm using the REPLACE function, not SET.
Sure, makes sense. Looks like you might have more than one abbreviation per entry as well. In this case you'd want to join on targetColumn LIKE '%'+LongName+'%', and use REPLACE in the update statement. It might need some tweaks to avoid false positives (e.g. partial word matches), but the general technique is the same: use a set-based approach to avoid the performance troubles of loops.
Thanks for your help. I'll revisit this project next week. Apparently I can't upvote yet. :-/
Cheers. You'll be able to upvote once you've gathered a small amount of reputation. You can do so by getting upvotes on your questions and answers, accepting an answer, or making edits on others' posts (but only if necessary - your edits will be reviewed by others and must be approved).
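
The LIKE join with REPLACE suggested in the comments could be sketched as below. One caveat: when an UPDATE ... FROM join matches several Abbreviations rows for the same target row, SQL Server applies only one of them, so a single statement will not chain several abbreviations. This sketch therefore loops over the abbreviation steps (usually few) and runs one set-based UPDATE per step, rather than looping over the data rows. It assumes Abbreviations has a unique StepID column giving the order of application, as in the question's function; targetTable, targetColumn, and the limit of 25 are placeholders.

```sql
-- Assumptions: Abbreviations(StepID, LongName, ShortName), StepID unique and
-- defining the order of application; 25 is a placeholder limit.
DECLARE @maxLength int = 25;
DECLARE @stepID int = (SELECT MIN(StepID) FROM Abbreviations);

-- Loop over the abbreviation steps, not over the data rows.
WHILE @stepID IS NOT NULL
BEGIN
    -- One set-based pass: apply this step to every row still over the limit.
    UPDATE t
    SET t.targetColumn = REPLACE(t.targetColumn, a.LongName, a.ShortName)
    FROM targetTable AS t
    INNER JOIN Abbreviations AS a
        ON a.StepID = @stepID
       AND t.targetColumn LIKE '%' + a.LongName + '%'
    WHERE LEN(t.targetColumn) > @maxLength;

    -- Advance; MIN over no rows yields NULL, which ends the loop.
    SET @stepID = (SELECT MIN(StepID)
                   FROM Abbreviations
                   WHERE StepID > @stepID);
END;
```

Rows that drop under the limit after an early step are skipped by later steps, which matches the goal of abbreviating only as much as needed. Partial-word matches are still possible, and a LongName containing LIKE wildcards (%, _, [) would need escaping or a CHARINDEX test instead.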
