2

I am trying to validate email addresses in SQL using a regular expression, against the criteria below.

Create a query (use the operator LIKE) that searches for all the Email Addresses that contain:

  1. Only one symbol “@”
  2. At least one symbol “.”
  3. If there is only one “.” symbol it should be after the symbol “@” (and not before it)
  4. At least 6 characters
  5. The symbols “.” and “@” must not be next to each other
  6. The symbol “.” or “@” must not be at the beginning or end of the address

In addition, the Email Address must not contain the following symbols: “=”, “_”, “-”, “+”, “&”, “<”, “>” or “,”.

I wrote a query that covers all of the above points except the minimum length of 6 characters. It returns results, but every row is reported as invalid, and I am not sure why.

Below is the query that I wrote:

SELECT EmailAddress,
CASE WHEN EmailAddress Like '%^[A-Za-z0-9._%\-+!#$&/=?^|~]+@[A-Za-z0-9.-]+[.][A-Za-z]+$%' THEN 'Valid'
ELSE 'INVALID' END
AS valid_email
FROM Database
Where EmailAddress is not null
  • Regexes aren’t good for validating email addresses. Commented Jul 10, 2022 at 15:59
  • Hey @DanielA.White Thanks for the comment. Is there any other way to achieve this in SQL? Commented Jul 10, 2022 at 16:01
  • Does this answer your question? SQL Email Verification Function using Regex Commented Jul 10, 2022 at 16:10
  • Should [email protected] be treated as valid? Note that RFC 2606 tells us that nothing with a domain ending in .invalid should ever work. How about [email protected] (could exist, but the domain is not currently registered)? How about [email protected] (the domain is valid, but there is currently no such local part)? Commented Jul 10, 2022 at 16:15
  • At least get the terminology right. SQL Server does not support regex; it supports globbing, which has very limited capabilities. Commented Jul 10, 2022 at 16:48

3 Answers

1

I did some research and found that the LIKE operator on its own doesn't support regular expressions. Oracle SQL supports them through the REGEXP_LIKE condition instead.

(^[A-Za-z0-9.%\!#$/?^|~]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$) 

I tried your code in this way:

SELECT EmailAddress,
CASE WHEN REGEXP_LIKE(EmailAddress, '^[A-Za-z0-9.%\!#$/?^|~]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$') THEN 'Valid'
ELSE 'INVALID' END
AS valid_email
FROM Database
WHERE EmailAddress IS NOT NULL;
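
To see which inputs the pattern accepts without needing a database, here is a minimal sketch that tries it on a few made-up addresses. It uses Python's `re` module as a stand-in, which is an assumption: Oracle's REGEXP_LIKE follows POSIX-style syntax, and the two engines can differ in details such as how `\!` inside a bracket expression is read. The `classify` helper and the sample addresses are hypothetical.

```python
import re

# Pattern copied from the answer. Python's `re` is only a stand-in for
# Oracle's REGEXP_LIKE here; in Python, `\!` inside [...] just means "!".
EMAIL_PATTERN = re.compile(
    r"^[A-Za-z0-9.%\!#$/?^|~]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"
)


def classify(address: str) -> str:
    """Return 'Valid' if the whole address matches the pattern, else 'INVALID'."""
    return "Valid" if EMAIL_PATTERN.match(address) else "INVALID"


if __name__ == "__main__":
    # Hypothetical sample addresses, chosen to hit different branches.
    for sample in [
        "user@example.com",   # plain address
        "@example.com",       # empty local part
        "user@example",       # no '.' after '@'
        "a+b@example.com",    # '+' is not in the allowed character class
        ".user@example.com",  # leading '.', which the pattern does not reject
    ]:
        print(sample, classify(sample))
```

The last sample suggests that this pattern on its own does not enforce every criterion from the question (a leading "." is accepted), so it may need further checks alongside it.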

2 Comments

best answer, explaining how to write a query with regex instead of working around LIKE not using regex.
wait a second - you mention Oracle SQL. Is this even applicable to T-SQL?
0

Standard LIKE doesn't handle regex.

I don't recommend using the following code but it satisfies your criteria:

SELECT
    EmailAddress,
    CASE
        WHEN (
            ( NOT EmailAddress LIKE '%@%@%' )    -- 1
            AND ( EmailAddress LIKE '%.%' )      -- 2
            AND ( EmailAddress LIKE '%@%.%' )    -- 3
            AND ( EmailAddress LIKE '______%' )  -- 4
            AND ( NOT (                          -- 5
                ( EmailAddress LIKE '%@.%' )
                OR ( EmailAddress LIKE '%.@%' )
            ) )
            AND ( NOT (                          -- 6(a)
                ( EmailAddress LIKE '@%' )
                OR ( EmailAddress LIKE '.%' )
                OR ( EmailAddress LIKE '%@' )
                OR ( EmailAddress LIKE '%.' )
            ) )
            AND ( NOT (                          -- 6(b)
                ( EmailAddress LIKE '%=%' )
                OR ( EmailAddress LIKE '%\_%' ESCAPE '\' )
                OR ( EmailAddress LIKE '%-%' )
                OR ( EmailAddress LIKE '%+%' )
                OR ( EmailAddress LIKE '%&%' )
                OR ( EmailAddress LIKE '%<%' )
                OR ( EmailAddress LIKE '%>%' )
                OR ( EmailAddress LIKE '%,%' )
            ) )
        ) THEN 'Valid'
        ELSE 'INVALID'
    END
    AS valid_email
FROM Database
Where EmailAddress IS NOT NULL
;

Notes:

  • =, _, -, + and & are perfectly valid in the local part
  • - is valid in the domain part
  • <, > and , are valid in the local part if properly quoted
  • this expression does not correctly classify email addresses in general
  • no expression can determine a priori whether or not an address actually exists, even if it is syntactically valid
  • I used Database because you did, but you probably meant Database.Table
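
To try the LIKE conditions without a SQL Server instance, the sketch below runs an equivalent query against an in-memory SQLite database from Python. This is an assumption-laden stand-in: SQLite's LIKE is not identical to SQL Server's (for example, it is case-insensitive for ASCII and has no `[...]` sets), but the `%`, `_` and ESCAPE features used here behave the same way. The table name `Emails`, the helper `classify_all` and the sample addresses are hypothetical.

```python
import sqlite3

# Same conditions as the query above, flattened into NOT LIKE tests.
QUERY = r"""
SELECT EmailAddress,
       CASE WHEN EmailAddress NOT LIKE '%@%@%'      -- 1: no second '@'
             AND EmailAddress LIKE '%@%.%'          -- 2, 3: a '.' somewhere after '@'
             AND EmailAddress LIKE '______%'        -- 4: at least 6 characters
             AND EmailAddress NOT LIKE '%@.%'       -- 5: '@' and '.' not adjacent
             AND EmailAddress NOT LIKE '%.@%'
             AND EmailAddress NOT LIKE '@%'         -- 6(a): not at either end
             AND EmailAddress NOT LIKE '.%'
             AND EmailAddress NOT LIKE '%@'
             AND EmailAddress NOT LIKE '%.'
             AND EmailAddress NOT LIKE '%=%'        -- 6(b): forbidden symbols
             AND EmailAddress NOT LIKE '%\_%' ESCAPE '\'
             AND EmailAddress NOT LIKE '%-%'
             AND EmailAddress NOT LIKE '%+%'
             AND EmailAddress NOT LIKE '%&%'
             AND EmailAddress NOT LIKE '%<%'
             AND EmailAddress NOT LIKE '%>%'
             AND EmailAddress NOT LIKE '%,%'
            THEN 'Valid' ELSE 'INVALID' END AS valid_email
FROM Emails
WHERE EmailAddress IS NOT NULL
"""


def classify_all(addresses):
    """Load the addresses into a throwaway table and return {address: verdict}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Emails (EmailAddress TEXT)")
    conn.executemany("INSERT INTO Emails VALUES (?)", [(a,) for a in addresses])
    result = dict(conn.execute(QUERY).fetchall())
    conn.close()
    return result


if __name__ == "__main__":
    samples = [
        "user@example.com", "a@b@example.com", "user.name@example",
        "user@.example.com", ".user@example.com", "a@b.c",
        "first_last@example.com",
    ]
    for address, verdict in classify_all(samples).items():
        print(address, verdict)
```

Note that `first_last@example.com` is reported as INVALID only because the criteria forbid "_"; as the notes above say, such an address is perfectly legitimate in practice.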

5 Comments

Hi @jhnc, Unfortunately, I am not able to upvote the answer but thank you so much. It worked! I really appreciate your help :)
@ShubhamVajpayee you really shouldn't do email validation this way. it excludes many perfectly-valid email addresses
It would be really helpful if you could tell me the other ways to achieve this as I am not aware of any other ways :(
@ShubhamVajpayee depends what "achieve this" means. my code does what you requested but does not validate email correctly. The syntax of a valid email address is very complicated. The only reliable way to check validity is to do a call-out (eg. send email to the address requesting confirmation and update your table when you receive the reply - consider e-commerce site registration). Even that is time-limited: addresses that exist when you test can be deleted in future.
@ShubhamVajpayee serious tongue-in-cheek
0

Because the other (current) answers say that regular expressions cannot be used in Microsoft SQL Server, here is a way to do it without them:

WITH sampleaddresses as (
  SELECT '[email protected]' as email UNION ALL
  SELECT '@example.com' UNION ALL
  SELECT '[email protected]' UNION ALL
  SELECT 'invalid@a+b.com' UNION ALL
  SELECT 'invalid2@' UNION ALL
  SELECT 'a.b@c'
),
invalidchars as (
  SELECT value FROM STRING_SPLIT('=|_|-|+|&|<|>|,','|')
)
SELECT DISTINCT email 
FROM sampleaddresses
CROSS APPLY invalidchars
WHERE CHARINDEX(value, email)<>0       -- contains a forbidden character
   OR LEN(email)-LEN(REPLACE(email,'@','')) <> 1   -- not exactly one '@'
   OR LEN(email)-LEN(REPLACE(email,'.','')) < 1   -- no '.' at all
   OR CHARINDEX('.',SUBSTRING(email,CHARINDEX('@',email),100))=0  -- no '.' after '@'
   OR SUBSTRING(email,1,1) IN ('@','.')           -- starts with '@' or '.'
   OR SUBSTRING(email,LEN(email),1) IN ('@','.')  -- ends with '@' or '.'

Example: DBFIDDLE

The WHERE clause lists the conditions that make an address invalid, so the query returns the invalid addresses.

The sampleaddresses CTE can be expanded to test other situations. (Sorry for not providing test cases for every possible bad email address.)
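
The same "list the invalid conditions" idea can be sketched with plain string operations, which makes it easy to see why a given address is rejected. The version below is a rough Python illustration, not a translation of the query: the function name `invalid_reasons` and the reason strings are hypothetical, and it also includes the minimum-length and adjacency criteria from the question, which the query above does not test.

```python
FORBIDDEN = "=_-+&<>,"  # symbols the question says must not appear


def invalid_reasons(email: str) -> list:
    """Return the reasons an address fails the question's criteria (empty if none)."""
    reasons = []
    if any(ch in email for ch in FORBIDDEN):
        reasons.append("forbidden character")
    if email.count("@") != 1:
        reasons.append("not exactly one '@'")
    # Everything after the first '@' (the whole string if there is no '@').
    after_at = email[email.find("@") + 1:]
    if "." not in after_at:
        reasons.append("no '.' after '@'")
    if len(email) < 6:
        reasons.append("shorter than 6 characters")
    if "@." in email or ".@" in email:
        reasons.append("'@' and '.' are adjacent")
    if email[:1] in ("@", "."):
        reasons.append("starts with '@' or '.'")
    if email[-1:] in ("@", "."):
        reasons.append("ends with '@' or '.'")
    return reasons


if __name__ == "__main__":
    # Sample strings taken from the query above, plus one made-up valid address.
    for sample in ["user@example.com", "@example.com", "invalid@a+b.com",
                   "invalid2@", "a.b@c"]:
        print(sample, invalid_reasons(sample) or "ok")
```

Returning the reasons rather than a single flag is a design choice for debugging; collapsing it to valid/invalid is just a test for an empty list.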
