106

Suppose I had the following 2 tables:

      Table1:                                Table2:
Col1:      Col2:     Col3:             Col1:       Col2:       Col4:
a          b         c                 a           b           d
e          <null>    f                 e           <null>      g
h          i         j                 h           i           k
l          <null>    m                 l           <null>      n
o          <null>    p                 o           <null>      q

Now, I want to join these tables on Col1 and Col2 and bring back the entire set to look like:

     Result:
Col1:      Col2:     Col3:     Col4:
a          b         c         d
e          <null>    f         g
h          i         j         k
l          <null>    m         n
o          <null>    p         q

So, I tried a SQL like:

SELECT Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4
FROM Table1
INNER JOIN Table2
    ON Table1.Col1 = Table2.Col1 
    AND Table1.Col2 = Table2.Col2

But it isn't matching the NULL values in Col2, so I end up with:

     Result:
Col1:      Col2:     Col3:     Col4:
a          b         c         d
h          i         j         k

How can I get the result I am looking for??

Thanks!

1
  • is this specific to sql-server? i can't seem to do this in postgres 11.5 Commented Jan 17, 2020 at 3:37

11 Answers 11

152

You can be explicit about the joins:

SELECT Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4
FROM Table1 INNER JOIN
     Table2
      ON (Table1.Col1 = Table2.Col1 or Table1.Col1 is NULL and Table2.Col1 is NULL) AND
         (Table1.Col2 = Table2.Col2 or Table1.Col2 is NULL and Table2.Col2 is NULL)

In practice, I would be more likely to use coalesce() in the join condition:

SELECT Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4
FROM Table1 INNER JOIN
     Table2
     ON (coalesce(Table1.Col1, '') = coalesce(Table2.Col1, '')) AND
        (coalesce(Table1.Col2, '') = coalesce(Table2.Col2, ''))

Where '' would be a value not in either of the tables.

Just a word of caution. In most databases, using any of these constructs prevents the use of indexes.

Sign up to request clarification or add additional context in comments.

5 Comments

another option for coalesce would be ON (T1.C1=T2.C1 or (coalesce(T1.C1,T2.C1) is null)) and (T1.C2=T2.C2 or (coalesce(T1.C2,T2.C2) is null))
This doesn't work in the case where you have Table1=(a,b,c)(a,null,d) Table2=(a,b,e). In this case, the join matches the case Table1(a,null,d) with the nonexistent case Table2(a,null,null). This should not result in a match.
In other words, this does not distinguish between cases where there is a NULL in Table2.Col2 which matches, versus no record at all in Table2.
Shouldn't join conditions be like below with null checks as a single unit after or operator. (Table1.Col1 = Table2.Col1 or (Table1.Col1 is NULL and Table2.Col1 is NULL)) AND (Table1.Col2 = Table2.Col2 or (Table1.Col2 is NULL and Table2.Col2 is NULL))
In T-SQL, using COALESCE() in a JOIN is way slower than explicitly checking NULL
53

Use Left Outer Join instead of Inner Join to include rows with NULLS.

SELECT Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4
FROM Table1
LEFT OUTER JOIN Table2 
    ON Table1.Col1 = Table2.Col1 
    AND Table1.Col2 = Table2.Col2

For more information, see here: http://technet.microsoft.com/en-us/library/ms190409(v=sql.105).aspx

3 Comments

This does not answer the original question. your solution will include null rows for table2 if no match found. look at second row in both tables in OP's question
This doesn't really solve the question as you'd now have to filter to select the values you want. To do that you'll have contend with the nulls again.
Thanks for good advice, thats what i have needed. My case was Table.Col1 was null and there was no null record in Table2 at all, so it wasn't select that row with null col at all.
18

Dirty and quick hack:

SELECT Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4
FROM Table1 INNER JOIN Table2 ON Table1.Col1 = Table2.Col1
 AND ((Table1.Col2 = Table2.Col2) OR (Table1.Col2 IS NULL AND Table2.Col2 IS NULL))

1 Comment

This solution is neither dirty nor a hack. It is simply the correct way to do it in T-SQL.
18

Try using ISNULL function:

SELECT Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4
FROM Table1 
INNER JOIN Table2
   ON Table1.Col1 = Table2.Col1 
   AND ISNULL(Table1.Col2, 'ZZZZ') = ISNULL(Table2.Col2,'ZZZZ')

Where 'ZZZZ' is some arbitrary value never in the table.

5 Comments

This introduces magic values which I tend to avoid unless absolutely necessary
This would work nicely, but assumes that '0' isn't used elsewhere in another row.
Yeah, I prefer @Gordon's reply, just offering an alternative. And I'd probably use a different value that I knew couldn't be represented in the return values like 'ZZZZ' or something :-)
This doesn't distinguish between cases where Table2.Col2 IS NULL versus cases where there is no matching record in Table2.
Frankly you can probably get away with '' instead of 'zzzz' for 99% of the time you find yourself needing to do something like this. Otherwise what you're saying is that an empty string is NOT equivalent to a NULL (for the purpose of doing the join), which seems odd. For my use-case I am comparing addresses and sometimes AddressLine2 is null, but that's logically equivalent to an empty string for an address. Of course, only you know your data - and it's important to be careful about making quick decisions and assumptions with things like this that might bite you later.
1

you can just map like that

select * from tableA a
join tableB b on isnull(a.colID,'') = isnull(b.colId,'')

1 Comment

This is problematic, because it doesn't differentiate between an empty string an an actual NULL value. In case a.colD is an empty string and b.colID is NULL the comparison would still evaluate to true, even though the two columns contain different values. In some cases you don't want to treat NULL and an empty string as being equal.
1

for some reason I couldn't get it to work with the outer join.

So I used:

SELECT * from t1 where not Id in (SELECT DISTINCT t2.id from t2)

Comments

1
declare @Table1 as Table(Col1 varchar(1),Col2 varchar(1), Col3 varchar(1))
declare @Table2 as Table(Col1 varchar(1),Col2 varchar(1), Col4 varchar(1))
insert into @Table1
values('a',          'b',     'c'),
('e',          null,    'f'), 
('h',          'i'  ,   'j'), 
('l',          null  ,  'm'), 
('o',          null  ,  'p') 
insert into @Table2
values('a',          'b',     'd'),
('e',          null,    'g'), 
('h',          'i'  ,   'k'), 
('l',          null  ,  'n'), 
('o',          null  ,  'q') 


select * from @Table1 tbl1
join @Table2 tbl2
on tbl1.Col1 =tbl2.Col1
and isnull(tbl1.Col2,0) =isnull(tbl2.Col2,0)

output:

 Col1   Col2    Col3    Col1    Col2    Col4
 a      b       c       a       b       d
 e      NULL    f       e       NULL    g
 h      i       j       h       i       k
 l      NULL    m       l       NULL    n
 o      NULL    p       o       NULL    q

Comments

1

Some SQL implementations have a special Null-safe equal operator.

For instance Snowflake has EQUAL_NULL so ou can do

SELECT 
  Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4 
FROM 
  Table1 
  INNER JOIN Table2 ON EQUAL_NULL(Table1.Col1, Table2.Col1) 
  AND EQUAL_NULL(Table1.Col2, Table2.Col2)

Comments

0

Try using additional condition in join:

SELECT Table1.Col1, Table1.Col2, Table1.Col3, Table2.Col4
FROM Table1 
INNER JOIN Table2
ON (Table1.Col1 = Table2.Col1 
    OR (Table1.Col1 IS NULL AND Table2.Col1 IS NULL)
   )

Comments

0

The only correct answer is not to join columns with null values. This can lead to unwanted behaviour very quickly.

e.g. isnull(b.colId,''): What happens if you have empty strings in your data? The join maybe duplicate rows which I guess is not intended in this case.

3 Comments

I disagree. The correct answer is not to avoid joining nullable columns. The correct answer is to do the comparison in the JOIN correctly (as demonstrated in the other answers to this question).
You are right David, but I think that we can agree on the fact that the ISNULL solution is really bad practice.
Yes, the ISNULL solution is not a good idea. Using magic values is generally problematic, especially in this situation. Using ISNULL in this situation is neither necessary nor advisable. NULL does have a special meaning in T-SQL. It means that the value is unknown. That is very different from any other possible value. Therefore SQL Server treats NULL very differently than any other value and one should generally not attempt to treat NULL the same as any other value.
-1

I think that if you have the possibility to change the schema table then you could modify the tables columns used in the join condition appending a default value like ''. so you can use the nomal join condition and index will be used.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.