0

Apologies if this a little confusing to follow but hopefully someone can help me with this. For some reason my mind is drawing a blank on this and also typing the problem out might help.

I am doing some ETL stuff and I have data that can be grouped by a specific Id value. Each record in my dataset has a LookupId value that I use to join to another table and get additional information. Pretty standard INNER JOIN stuff here. However, I need to also include missing LookupId values in my dataset. This turns my INNER JOIN into a LEFT JOIN which is also pretty standard stuff.

You can use the following sample schema and data below to get an idea of what I am looking to do:

-- Create sample schema
CREATE TABLE ETL_DataSet (
    Id INT NOT NULL,
    LookupId INT NULL,
    LookupCode CHAR(1) NULL,
    LookupDesc NVARCHAR(40) NULL
);

CREATE TABLE ETL_LookupTable (
    LookupId INT NOT NULL
    , LookupCode CHAR(1) NOT NULL
    , LookupDesc NVARCHAR(40) NOT NULL
    , CONSTRAINT PK_LookupTable_LookupId PRIMARY KEY CLUSTERED (LookupId)
);

-- Insert sampel data
INSERT INTO ETL_LookupTable (LookupId, LookupCode, LookupDesc) VALUES
    (1, 'Z', 'Z Description'),
    (2, 'A', 'A Description'),
    (3, 'B', 'B Description'),
    (4, 'C', 'C Description'),
    (5, 'D', 'D Description'),
    (6, 'E', 'E Description'),
    (7, 'X', 'X Description');

INSERT INTO ETL_DataSet (Id, LookupId, LookupCode, LookupDesc) VALUES 
    (1, 3, 'B', 'B Description'),
    (1, 5, 'D', 'D Description'),
    (1, 3, 'B', 'B Description'),
    (1, 2, 'A', 'A Description'),
    (2, 4, 'C', 'C Description'),
    (2, 6, 'E', 'E Description'),
    (2, 3, 'B', 'B Description'),
    (2, 2, 'A', 'A Description');

And here is the sample script to run:

DECLARE @id INT = 1; //change this to 1 or 2
;WITH LookupCTE AS
(
    SELECT d.*
    FROM ETL_LookupTable l INNER JOIN ETL_DataSet d ON l.LookupId = d.LookupId
    WHERE d.Id = @id --comment this out to get all data in ETL_DataSet
)
SELECT *
FROM ETL_LookupTable l LEFT JOIN LookupCTE cte ON l.LookupId = cte.LookupId

The end goal is to include all LookupTable values for each Id even if the set of Ids do not have the LookupTable values. Changing @Id between 1 and 2 you can see that this works but if you comment out the WHERE clause in the CTE, you get incorrect data. I am trying to do this without using cursors or any other row-by-agonizing-row technique so hopefully someone can assist or point me in the right direction.

Desired results should look like the following:

Id          LookupId    LookupCode LookupDesc
----------- ----------- ---------- ----------------------------------------
1           2           A          A Description
1           3           B          B Description
1           3           B          B Description
1           NULL        C          C Description
1           5           D          D Description
1           NULL        E          E Description
1           NULL        X          X Description
1           NULL        Z          Z Description
2           2           A          A Description
2           3           B          B Description
2           4           C          C Description
2           NULL        D          D Description
2           6           E          E Description
2           NULL        X          X Description
2           NULL        Z          Z Description
4
  • can you provide the "expected result" please Commented Dec 28, 2017 at 0:59
  • @Used_By_Already See Nolan Shang's answer for my desired output. The NULL rows will have other data still attached to it but the main thing is that the first column includes the missing data as well. Commented Dec 28, 2017 at 2:19
  • For MY expected result, look over there a result by someone else, it is almost close enough. Seriously? You can't be bothered showing is the real expected result? Commented Dec 28, 2017 at 2:23
  • You make a valid point and apologies if it seemed like I was being lazy. I looked at the answer and made comments and then started looking closer at the answer since I've been stuck off and on. Question is updated accordingly. Commented Dec 28, 2017 at 2:37

4 Answers 4

2

You seem to want a Cartesian product of the etl_dataset.id with the etl_lookupTable information, so a cross join forming that product should work.

select
ed.*, el.*, d.*
from (select distinct id from etl_dataset) ed
cross join (select distinct LookupId, LookupCode, LookupDesc from ETL_LookupTable) el
left join etl_dataset d on ed.id = d.id and el.LookupId = d.LookupId
order by 
ed.id, el.LookupCode
GO
id | LookupId | LookupCode | LookupDesc    |   Id | LookupId | LookupCode | LookupDesc   
-: | -------: | :--------- | :------------ | ---: | -------: | :--------- | :------------
 1 |        2 | A          | A Description |    1 |        2 | A          | A Description
 1 |        3 | B          | B Description |    1 |        3 | B          | B Description
 1 |        3 | B          | B Description |    1 |        3 | B          | B Description
 1 |        4 | C          | C Description | null |     null | null       | null         
 1 |        5 | D          | D Description |    1 |        5 | D          | D Description
 1 |        6 | E          | E Description | null |     null | null       | null         
 1 |        7 | X          | X Description | null |     null | null       | null         
 1 |        1 | Z          | Z Description | null |     null | null       | null         
 2 |        2 | A          | A Description |    2 |        2 | A          | A Description
 2 |        3 | B          | B Description |    2 |        3 | B          | B Description
 2 |        4 | C          | C Description |    2 |        4 | C          | C Description
 2 |        5 | D          | D Description | null |     null | null       | null         
 2 |        6 | E          | E Description |    2 |        6 | E          | E Description
 2 |        7 | X          | X Description | null |     null | null       | null         
 2 |        1 | Z          | Z Description | null |     null | null       | null         

dbfiddle here

Sign up to request clarification or add additional context in comments.

2 Comments

Thanks for this. This is a good solution that produces exactly what I was looking for. As a side note I should embrace Cartesian products more. I have always shied away from them (was taught early on to avoid them) and gone the temp table/CTE route but this works very well.
A cross join is a vital tool when needing rows that are not stored. Accidental Cartesian products are something to avoid but if you use explicit join syntax they are not possible (unlike older join syntax that relied on commas between tables often did create unexpected Cartesian products).
2
CREATE TABLE #ETL_DataSet (
    Id INT NOT NULL,
    LookupId INT NULL,
    LookupCode CHAR(1) NULL,
    LookupDesc NVARCHAR(40) NULL
);

CREATE TABLE #ETL_LookupTable (
    LookupId INT NOT NULL
    , LookupCode CHAR(1) NOT NULL
    , LookupDesc NVARCHAR(40) NOT NULL
    , CONSTRAINT PK_LookupTable_LookupId PRIMARY KEY CLUSTERED (LookupId)
);

-- Insert sampel data
INSERT INTO #ETL_LookupTable (LookupId, LookupCode, LookupDesc) VALUES
    (1, 'Z', 'Z Description'),
    (2, 'A', 'A Description'),
    (3, 'B', 'B Description'),
    (4, 'C', 'C Description'),
    (5, 'D', 'D Description'),
    (6, 'E', 'E Description'),
    (7, 'X', 'X Description');

INSERT INTO #ETL_DataSet (Id, LookupId, LookupCode, LookupDesc) VALUES 
    (1, 3, 'B', 'B Description'),
    (1, 5, 'D', 'D Description'),
    (1, 3, 'B', 'B Description'),
    (1, 2, 'A', 'A Description'),
    (2, 4, 'C', 'C Description'),
    (2, 6, 'E', 'E Description'),
    (2, 3, 'B', 'B Description'),
    (2, 2, 'A', 'A Description');


select * from #ETL_DataSet
select * from #ETL_LookupTable

SELECT E.ID,D.LOOKUPID,E1.LOOKUPCODE,E1.LOOKUPDESC 
FROM(
SELECT DISTINCT ID 
FROM #ETL_DATASET) AS E 
CROSS JOIN (SELECT * FROM #ETL_LOOKUPTABLE) AS E1 
LEFT JOIN #ETL_DATASET D ON D.LOOKUPID = E1.LOOKUPID AND D.ID = E.ID
ORDER BY E.ID,E1.LOOKUPCODE,E1.LOOKUPDESC 

OUTPUT

+----+----------+------------+---------------+
| ID | LOOKUPID | LOOKUPCODE |  LOOKUPDESC   |
+----+----------+------------+---------------+
|  1 | 2        | A          | A Description |
|  1 | 3        | B          | B Description |
|  1 | 3        | B          | B Description |
|  1 | NULL     | C          | C Description |
|  1 | 5        | D          | D Description |
|  1 | NULL     | E          | E Description |
|  1 | NULL     | X          | X Description |
|  1 | NULL     | Z          | Z Description |
|  2 | 2        | A          | A Description |
|  2 | 3        | B          | B Description |
|  2 | 4        | C          | C Description |
|  2 | NULL     | D          | D Description |
|  2 | 6        | E          | E Description |
|  2 | NULL     | X          | X Description |
|  2 | NULL     | Z          | Z Description |
+----+----------+------------+---------------+

1 Comment

Thanks for the provided answer but Used_By_Already beat you to it. Upvoted because you used the specific column names to provide the desired output.
1

try this:

DECLARE @id INT = 2; --//change this to 1 or 2
;WITH IdList AS(
   SELECT DISTINCT id FROM ETL_DataSet
), LookupCTE AS
(
    SELECT IdList.id,d.LookupId, d.LookupCode, d.LookupDesc
    FROM ETL_LookupTable l INNER JOIN IdList ON 1=1
    LEFT  JOIN ETL_DataSet d ON l.LookupId = d.LookupId AND d.id=IdList.id
   -- WHERE d.Id = @id --comment this out to get all data in ETL_DataSet
)

SELECT * FROM LookupCTE --WHERE id=@id
1   NULL    NULL    NULL
1   2   A   A Description
1   3   B   B Description
1   3   B   B Description
1   NULL    NULL    NULL
1   5   D   D Description
1   NULL    NULL    NULL
1   NULL    NULL    NULL
2   NULL    NULL    NULL
2   2   A   A Description
2   3   B   B Description
2   4   C   C Description
2   NULL    NULL    NULL
2   6   E   E Description
2   NULL    NULL    NULL

Comments

0

I think for the results you are looking for you should set ETL_DataSet as the left table. something like:

SELECT * FROM ETL_DataSet LEFT JOIN ETL_LookupTable ON ETL_DataSet.LookupId = ETL_LookupTable.LookupId 

Ref: https://www.w3schools.com/sql/sql_join_left.asp

Note: The LEFT JOIN keyword returns all records from the left table ... , even if there are no matches in the right table.

4 Comments

Hi Valeria I could make the LEFT JOIN a RIGHT JOIN instead and I would have the same thing.
True. Then, are you not getting what you want?
I am for single datasets but not for the entire dataset. Nolan Shang's answer is looking promising at the moment for the results I am looking for. I am still verifying it at the moment.
I am of the opinion that there are very few places that a RIGHT JOIN should be used instead of a properly done LEFT JOIN. I'm not a fan of RIGHT JOINs, but that probably has a lot to do with my preference for left-to-right reading in general.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.