--- I went with a version of Andrey Gurinov's answer because I wanted to do it in the query and he posted it first. ----

I have a database with names, addresses, city, state, zip, etc. for people. I want to read the data into a C# program ordered by a group code, then name, then date. I am running into a problem, though, because the same name has been entered in multiple ways by different people.

Here is an example of the problem with a subset of the data:

| Dr. Kristen S   | 2011-04-15 00:00:00.000   | 00005573
| Kristen  S      | 2012-04-11 00:00:00.000   | 00005573
| Kristen S       | 2012-08-10 00:00:00.000   | 00005573
| Ms Kristen S    | 2011-08-12 00:00:00.000   | 00005573
| MS Kristen S    | 2012-01-27 00:00:00.000   | 00005573
| Ms. KRISTEN S   | 2012-04-09 00:00:00.000   | 00005573

As you can see, the name is essentially the same in every row, but because of the variations the rows are not sorted by date within the name. I want the dates in order.

When I read this data into my C# program, is there a way to make the select statement recognize the variations (Dr. , MS , Ms. , Ms , " " <- double space) and replace them with nothing or a single space, so that I can then sort each name group by date? Or would I have to remove the variations permanently in the database?

----- EDIT (SQL Query) -----

SELECT  [ListMP]
      ,[Name]
      ,[Address1]
      ,[City]
      ,[State]
      ,[ZipCode]
      ,[Date]
      ,[OrderCode]
      ,[SequenceNbr]
  FROM [Customer].[dbo].[Orders]

  ORDER BY [OrderCode], [Name], [Date]

Sample output:

ORDER |Kristen S| 203 My Street| Bristol| RI| 02809| 2012-04-11 00:00:00.000| 05632| 00005573

The OrderCode is not unique to an individual; it is unique to an address, and an address can have multiple people.

6
  • Why not use a comboBox which contains the variations (Dr. , MS , Ms. , Ms ..) and then sort them by date? Commented May 10, 2013 at 18:06
  • What is this column? 00005573 If it's person id or something, then it will be easy. Commented May 10, 2013 at 18:08
  • Might you be willing to share your SQL query? Does it have an ORDER BY clause? Commented May 10, 2013 at 18:08
  • @Obama that is not something I want to do. I would rather do it one of the ways mentioned by myself or Nate. Commented May 10, 2013 at 18:18
  • Some of the differences might be handled by SOUNDEX. Commented May 10, 2013 at 18:24

3 Answers

1

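You can try something like this:

SELECT LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(name, 'Dr.', ''), 'Ms.', ''), 'Ms', ''), '  ', ' ')) FROM ...

Note that REPLACE matches anywhere in the string, not just at the start, so under a case-insensitive collation a name that contains "ms" (such as "Williams") would also be altered.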

0

You could write a little "name cleaner" procedure in C# that strips these elements, and then sort the list by the stripped version and then by the date. You could also do this on the SQL side with a series of nested REPLACE calls. Finally, as you mentioned, you could clean the entries in the database itself (possibly by adding another field for the cleaned name).
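
The first option, a small name cleaner in C#, could be sketched with a regular expression. This is only a rough sketch: the names NameCleaner and Clean are made up for illustration, and the list of titles (Dr, Mr, Mrs, Ms) is an assumption that would need extending for real data.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameCleaner
{
    // Hypothetical helper: matches one leading title (Dr, Mrs, Mr, Ms),
    // with or without a trailing period, followed by whitespace.
    // The ^ anchor keeps names such as "Drake" or "Williams" untouched.
    private static readonly Regex LeadingTitle = new Regex(
        @"^(?:dr|mrs|mr|ms)\.?\s+",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Matches any run of whitespace so it can be collapsed to one space.
    private static readonly Regex WhitespaceRun = new Regex(@"\s+");

    public static string Clean(string name)
    {
        if (string.IsNullOrWhiteSpace(name)) return string.Empty;

        string result = name.Trim();
        result = LeadingTitle.Replace(result, string.Empty);
        result = WhitespaceRun.Replace(result, " ");
        return result;
    }
}
```

With this sketch, NameCleaner.Clean("Dr. Kristen S") and NameCleaner.Clean("Kristen  S") both give "Kristen S". Letter case is left as entered, so "Ms. KRISTEN S" becomes "KRISTEN S"; when sorting or grouping on the cleaned name, a case-insensitive comparer such as StringComparer.OrdinalIgnoreCase would be needed.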

Which method you choose would be determined by how much data you are dealing with, and how often you need to do something like this. If this is a lot of data, and you can see needing this in other future applications, I would recommend handling this at the database level. You could write a function in SQL that formats the names, and then decide if you want to use it at query time or when inserting the data.

The function might look something like this:

IF OBJECT_ID(N'dbo.fn_formatName', N'FN') IS NOT NULL
    DROP FUNCTION [dbo].[fn_formatName]
go
CREATE FUNCTION [dbo].[fn_formatName] 
(
    @Name nvarchar(4000)
)
RETURNS nvarchar(4000)
AS
BEGIN
    -- Drop periods and collapse double spaces before looking for titles
    set @Name = ltrim(replace(@Name, '.', ''))
    set @Name = replace(@Name, '  ', ' ')

    -- LIKE with a trailing % only matches at the start of the string,
    -- so a title is removed only when it is a prefix of the name
    if (@Name like 'Mrs %')
        set @Name = stuff(@Name, 1, 4, '')
    if (@Name like 'Dr %')
        set @Name = stuff(@Name, 1, 3, '')
    if (@Name like 'Mr %')
        set @Name = stuff(@Name, 1, 3, '')
    if (@Name like 'Ms %')
        set @Name = stuff(@Name, 1, 3, '')

    set @Name = ltrim(@Name)

    RETURN @Name
END

And then your query would look like this:

SELECT  [ListMP]
      ,[Name]
      ,dbo.fn_formatName([Name]) as 'CleanName'
      ,[Address1]
      ,[City]
      ,[State]
      ,[ZipCode]
      ,[Date]
      ,[OrderCode]
      ,[SequenceNbr]
  FROM [Customer].[dbo].[Orders]

  ORDER BY [OrderCode], CleanName, [Date]

3 Comments

So, when you say "use it at query time" what would that whole thing look like when I do my query in my C# program?
I updated the answer with the updated query using the function. I would probably want to fix up this function a little, as it currently has the potential for incorrectly updating names that end in "mr" or "ms".
I updated the function as I wasn't too happy with my first attempt. Feel free to clean it up a bit yourself as I'm not sure the performance is going to be that good as it is, but it should at least point you in the right direction
0

You could use C# to clean up the name like so:

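// Requires: using System.Globalization;
string FixedName(string name)
{
    name = name.Trim();
    // Titles to strip; "Ms" variants are included because they appear in the question's data
    var prefixes = new string[] { "Mrs. ", "Mrs ", "Mr. ", "Mr ", "Ms. ", "Ms ", "Dr. ", "Dr " };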
    foreach (var prefix in prefixes)
    {
        if (name.StartsWith(prefix, true, CultureInfo.InvariantCulture))
        {
            name = name.Substring(prefix.Length).Trim();
            break;
        }
    }
    return name;
}
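
Once a cleaner like FixedName exists, the ordering the question asks for (group code, then name, then date) could be done in memory with LINQ. The following is a sketch under assumptions: the Order class and OrderSorter are hypothetical stand-ins for whatever the program reads from the Orders table, and the cleaner is passed in as a delegate so that any cleaning method can be plugged in.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; property names mirror columns in the question's query.
public class Order
{
    public string OrderCode { get; set; }
    public string Name { get; set; }
    public DateTime Date { get; set; }
}

public static class OrderSorter
{
    // Sorts by order code, then by the cleaned name (ignoring case), then by date.
    // 'cleanName' is whichever cleaner is in use, for example a method like FixedName.
    public static List<Order> Sort(IEnumerable<Order> orders, Func<string, string> cleanName)
    {
        return orders
            .OrderBy(o => o.OrderCode, StringComparer.Ordinal)
            .ThenBy(o => cleanName(o.Name), StringComparer.OrdinalIgnoreCase)
            .ThenBy(o => o.Date)
            .ToList();
    }
}
```

A call would look like OrderSorter.Sort(orders, FixedName). The original Name stays on each row, so the program can still display the name exactly as it was entered while sorting on the cleaned version.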
