0

I'm trying to implement a recursive query in MS SQL Server 2008 using a CTE. I know there are a lot of posts about recursion in SQL Server, but this case is a bit different and I'm stuck. I have a table with this structure:

    CREATE TABLE [dbo].[Account](
        [ID] [nvarchar](20) NULL,
        [MAIN_EMAIL] [nvarchar](80) NULL,
        [SECONDARY_EMAIL] [nvarchar](80) NULL)

This table represents a list of accounts. Accounts can be duplicated in the table: an account is a duplicate if its MAIN_EMAIL or SECONDARY_EMAIL also appears as the MAIN_EMAIL or SECONDARY_EMAIL of another record with a different ID.

For example, these records are duplicates in my table:

[image: table of the four example records, IDs 21206, 21246, 21268 and 28169]

I know these records are duplicates because ID 21206 has a main email that also appears as the main email of ID 21246 and as the secondary email of ID 21268. Furthermore, ID 21246 has a secondary email that appears as the main email of ID 28169. So I consider these 4 records to be a single record (this rule comes from the project requirements).

Now suppose I know the ID from which to start the recursive query, say the first one, 21206. I wrote the query below, but it loops infinitely, and SQL Server raises an error saying I can do at most 100 recursions. If I select the top 100 rows, the result set contains the correct records (in this example, IDs 21206, 21246, 21268 and 28169), but they are repeated over and over; it seems the recursive part never stops. The query is:

    with cte (ID, MAIN_EMAIL, SECONDARY_EMAIL) as (
        select ad.ID,ad.MAIN_EMAIL,ad.SECONDARY_EMAIL
        from Account ad
        where ad.ID = '21206'
        union all
        select ade.ID,ade.MAIN_EMAIL,ade.SECONDARY_EMAIL
        from Account ade
        inner join cte c 
        on (
            (ade.MAIN_EMAIL = c.MAIN_EMAIL
            or ade.SECONDARY_EMAIL = c.MAIN_EMAIL
            or ade.MAIN_EMAIL = c.SECONDARY_EMAIL
            or ade.SECONDARY_EMAIL = c.SECONDARY_EMAIL)
            and ade.ID <> c.ID
        ) 
    )
    select top 100 * from cte

I extracted those 4 related records and changed the emails for privacy, so the result should be the 4 records above. What I get is a recordset with those 4 records, which is correct, but because the recursive query doesn't stop, they are repeated endlessly.

Could you help me? Thank you in advance

3
  • Given your example records above, what results are you actually looking for? - it is not clear from your question Commented Apr 3, 2014 at 9:08
  • You are right, it's not clear. I would like the result to be the 4 records I posted in the example. My table has about 120K records; I extracted those 4 related records and changed the emails for privacy. So the result should be the 4 records above. What I get is a recordset with those 4 records, which is correct, but because the recursive query doesn't stop, they are repeated endlessly. Commented Apr 3, 2014 at 9:15
  • Recursive CTEs are great, but use a plain old cursor here. Commented Apr 3, 2014 at 9:32
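
For reference, the iterative idea behind the cursor suggestion could be sketched with a table variable and a WHILE loop rather than an actual cursor. This is only a sketch under assumptions: `@start`, `@found` and `@rows` are hypothetical names, and rows with a NULL ID are skipped. Each pass adds the accounts that share an email with an already collected account, and the loop stops when a pass adds nothing.

```sql
-- Hypothetical names: @start, @found, @rows are for illustration only.
DECLARE @start nvarchar(20);
DECLARE @rows int;
SET @start = N'21206';

-- IDs collected so far, seeded with the starting account.
DECLARE @found TABLE (ID nvarchar(20) NOT NULL);
INSERT INTO @found (ID) VALUES (@start);
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- Add every account that shares an email with a collected account
    -- and has not been collected yet.
    INSERT INTO @found (ID)
    SELECT DISTINCT ade.ID
    FROM dbo.Account ade
    INNER JOIN dbo.Account c
        ON ade.MAIN_EMAIL = c.MAIN_EMAIL
        OR ade.SECONDARY_EMAIL = c.MAIN_EMAIL
        OR ade.MAIN_EMAIL = c.SECONDARY_EMAIL
        OR ade.SECONDARY_EMAIL = c.SECONDARY_EMAIL
    INNER JOIN @found f
        ON f.ID = c.ID
    WHERE ade.ID IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM @found x WHERE x.ID = ade.ID);

    -- Capture the row count immediately; a pass that adds nothing ends the loop.
    SET @rows = @@ROWCOUNT;
END;

-- Return the full rows for every collected ID.
SELECT a.*
FROM dbo.Account a
INNER JOIN @found f
    ON f.ID = a.ID;
```

Because an ID is only inserted when it is not already in `@found`, each account is collected at most once, so the loop cannot cycle the way the recursive CTE in the question does.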

3 Answers

1

As others have already said, there is no need for recursion:

    SELECT DISTINCT account.*
    FROM   account
           INNER JOIN (SELECT mail
                       FROM (SELECT main_email mail
                             FROM   account
                             UNION ALL
                             SELECT secondary_email mail
                             FROM   account) a
                       GROUP BY mail
                       HAVING count(1) > 1) mails
                      ON main_email = mails.mail OR secondary_email = mails.mail

It's probably possible to use UNPIVOT to get the list of all the mail addresses, but I'm not sure which would be best performance-wise.

I leave the fiddle link

If you want to check, the UNPIVOT version (with CTE) is:

    WITH mails AS
    (
      SELECT mail
      FROM (SELECT ID, main_email, secondary_email
            FROM account) p
      UNPIVOT (mail FOR col IN (main_email, secondary_email)) AS a
      GROUP BY mail
      HAVING count(mail) > 1
    )
    SELECT DISTINCT account.*
    FROM   account
           INNER JOIN mails ON main_email = mails.mail OR secondary_email = mails.mail

0

If I understood your requirements correctly, perhaps you don't even need recursion to achieve this.

Maybe this will work for you:

    SELECT *
    FROM account ade
    WHERE EXISTS (SELECT *
                  FROM account c
                  WHERE ade.ID <> c.ID
                    AND (ade.MAIN_EMAIL = c.MAIN_EMAIL
                         OR ade.SECONDARY_EMAIL = c.MAIN_EMAIL
                         OR ade.MAIN_EMAIL = c.SECONDARY_EMAIL
                         OR ade.SECONDARY_EMAIL = c.SECONDARY_EMAIL))

1 Comment

No, it doesn't work for me, since I need all the duplicate records starting from a specific record. The real problem with this query is that without recursion the record with ID 21246 would not be linked to the record with ID 28169.
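
One way to keep the recursion while stopping the loop, sketched here under assumptions, is to carry a list of already visited IDs along each path. The condition `ade.ID <> c.ID` in the question's query only rules out the row just joined from, so 21206 can reach 21246 and then come straight back. `VISITED` is a hypothetical column introduced for illustration, and the sketch assumes IDs contain no commas or LIKE wildcard characters (`%`, `_`, `[`).

```sql
-- Sketch: same join as the question, plus a VISITED path (hypothetical column)
-- holding the comma-delimited IDs already seen on the current path.
WITH cte (ID, MAIN_EMAIL, SECONDARY_EMAIL, VISITED) AS (
    SELECT ad.ID, ad.MAIN_EMAIL, ad.SECONDARY_EMAIL,
           CAST(N',' + ad.ID + N',' AS nvarchar(max))
    FROM dbo.Account ad
    WHERE ad.ID = N'21206'
    UNION ALL
    SELECT ade.ID, ade.MAIN_EMAIL, ade.SECONDARY_EMAIL,
           CAST(c.VISITED + ade.ID + N',' AS nvarchar(max))
    FROM dbo.Account ade
    INNER JOIN cte c
        ON (ade.MAIN_EMAIL = c.MAIN_EMAIL
            OR ade.SECONDARY_EMAIL = c.MAIN_EMAIL
            OR ade.MAIN_EMAIL = c.SECONDARY_EMAIL
            OR ade.SECONDARY_EMAIL = c.SECONDARY_EMAIL)
       -- Skip any account already on this path; this is what ends the recursion.
       AND c.VISITED NOT LIKE N'%,' + ade.ID + N',%'
)
-- An account can be reached along several paths, so collapse the repeats.
SELECT DISTINCT ID, MAIN_EMAIL, SECONDARY_EMAIL
FROM cte;
```

Every path contains each ID at most once, so the recursion ends once no unvisited account is left to join. On large or densely linked groups the number of paths can grow quickly, and chains deeper than 100 levels would need `OPTION (MAXRECURSION n)` on the final SELECT.
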
0

If you just want to match MAIN_EMAIL to SECONDARY_EMAIL then a UNION should work:

    SELECT DISTINCT R.MainId
    FROM
    (
        SELECT A1.ID MainId
        FROM dbo.Account A1 INNER JOIN dbo.Account A2 ON A2.MAIN_EMAIL = A1.SECONDARY_EMAIL
        UNION
        SELECT A2.ID
        FROM dbo.Account A2 INNER JOIN dbo.Account A1 ON A2.MAIN_EMAIL = A1.SECONDARY_EMAIL
    ) R
