I'm using SQL Server. I find myself doing complex queries in the WHERE clause with the following syntax:

SELECT ..
WHERE StudentID IS NULL OR StudentID NOT IN (SELECT StudentID from Students)

I was wondering if there's a better, cleaner approach to replace it with, because this is a small example of a bigger query that includes multiple conditions like this.

As you can see, I'm trying to find the rows where a specific column's value is either NULL or not a valid ID.

EDIT

Courses:

| CourseID | StudentID | StudentID2 |
|----------|-----------|------------|
| 1        | 100       | NULL       |
| 2        | NULL      | 200        |
| 3        | 1         | 1          |

Students:

| StudentID | Name |
|-----------|------|
| 1         | A    |
| 2         | B    |
| 3         | C    |

Query:

SELECT CourseID 
FROM Courses 
WHERE
   StudentID IS NULL OR StudentID NOT IN (SELECT StudentID FROM Students)
   OR StudentID2 IS NULL OR StudentID2 NOT IN (SELECT StudentID FROM Students)

Result:

| CourseID  |
|-----------|
| 1         |
| 2         |

As you can see, courses 1 and 2 have invalid students.
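The sample data above can be reproduced in a small script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server (an assumption made so the example runs anywhere), and the helper `course_ids` is a hypothetical name. It shows why the explicit IS NULL test is needed: `NULL NOT IN (...)` evaluates to unknown, so NOT IN alone silently skips rows whose ID is NULL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER, Name TEXT);
    CREATE TABLE Courses  (CourseID INTEGER, StudentID INTEGER, StudentID2 INTEGER);
    INSERT INTO Students VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO Courses  VALUES (1, 100, NULL), (2, NULL, 200), (3, 1, 1);
""")


def course_ids(sql):
    """Run a query and return the CourseID values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


# NOT IN alone: a NULL StudentID makes the predicate unknown, so course 2 is dropped.
not_in_only = course_ids("""
    SELECT CourseID FROM Courses
    WHERE StudentID NOT IN (SELECT StudentID FROM Students)
""")

# With the explicit IS NULL test, rows with a NULL ID are reported as well.
with_null_check = course_ids("""
    SELECT CourseID FROM Courses
    WHERE StudentID IS NULL
       OR StudentID NOT IN (SELECT StudentID FROM Students)
""")

print(not_in_only)      # [1]
print(with_null_check)  # [1, 2]
```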

  • Can a StudentID be non-null and still not be a student? If not, why do you need the subquery at all? Commented Nov 30, 2013 at 11:48
  • This is too little information to really help you. Consider adding a functional example of that select, with table structures, sample data and expected output. Commented Nov 30, 2013 at 11:49
  • Also tell us which DBMS you are using. Postgres? Oracle? Commented Nov 30, 2013 at 12:00
  • I have edited my question with an example of the tables; I hope you can help. Commented Nov 30, 2013 at 12:26

3 Answers

Alain was close, except that the StudentID2 column belongs to the Courses table. This query joins each student ID column to its own instance of the Students table, and the final WHERE tests whether EITHER student ID fails to match. So even if StudentID is valid but StudentID2 is not, the course is still captured, as you intend.

SELECT 
      C.CourseID 
   FROM 
      Courses C
         LEFT JOIN Students S
            ON C.StudentId = S.StudentId
         LEFT JOIN Students S2
            ON C.StudentId2 = S2.StudentId
   WHERE
         S.StudentId IS NULL 
      OR S2.StudentID IS NULL
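
To see how OR and AND differ in the final WHERE, here is a minimal sketch that runs the double LEFT JOIN against the question's sample data. It assumes SQLite (through Python's sqlite3) as a stand-in for SQL Server; course 4 and the names `JOIN_QUERY` and `flagged` are hypothetical additions for illustration only.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER, Name TEXT);
    CREATE TABLE Courses  (CourseID INTEGER, StudentID INTEGER, StudentID2 INTEGER);
    INSERT INTO Students VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Courses 1-3 come from the question; course 4 is a hypothetical extra row
    -- with one valid ID and one NULL.
    INSERT INTO Courses  VALUES (1, 100, NULL), (2, NULL, 200), (3, 1, 1), (4, 1, NULL);
""")

# Each student ID column is joined to its own instance of Students;
# an unmatched join leaves the S / S2 columns NULL.
JOIN_QUERY = """
    SELECT C.CourseID
    FROM Courses C
    LEFT JOIN Students S  ON C.StudentID  = S.StudentID
    LEFT JOIN Students S2 ON C.StudentID2 = S2.StudentID
    WHERE S.StudentID IS NULL {op} S2.StudentID IS NULL
"""


def flagged(op):
    """Return the sorted CourseIDs flagged when the two tests are combined with op."""
    return sorted(row[0] for row in conn.execute(JOIN_QUERY.format(op=op)))


either_invalid = flagged("OR")   # at least one ID fails to match
both_invalid = flagged("AND")    # both IDs fail to match

print(either_invalid)  # [1, 2, 4]
print(both_invalid)    # [1, 2]
```

Course 4 is only reported by the OR version, which is the case discussed in the comments below.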

6 Comments

Reposted at the same time as yours ;), but I think Popokoko wants all the unmatched students, so I think it should be an AND in the WHERE part, not an OR?
@Alain_Deloin, although his data did not show it, I think it is an OR. What if student ID #1 is 1 (valid) and #2 is NULL or 100 (both invalid)? They would want to review that record.
It should be OR, but you guys still rock. I ticked Alain's answer :) Thanks a lot for both solutions!
@Popokoko, if it should be OR, why did you check the AND answer, Alain's version? With his version, a course with StudentID = 1 and StudentID2 = NULL will NOT be returned in your result set.
Hmm, I'm guessing because I respect the effort from both of you, but you're right. I don't want to confuse others, so I re-edited. Thanks.

This is not a sure thing, but in my experience it performs better than the query in the question:

SELECT CourseID FROM Courses WHERE
-- ISNULL is SQL Server's equivalent of Oracle's NVL; -1 is assumed never to be a valid StudentID
NOT EXISTS (SELECT 1 FROM Students WHERE Students.StudentID = ISNULL(Courses.StudentID, -1));

Also create an index on StudentID in the Students table.

And if your data model supports it, create a primary key/foreign key relationship between the two tables. That way you definitely avoid invalid values in the Courses table.
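
As a rough illustration of the primary key/foreign key suggestion, the sketch below declares both student ID columns as references to Students and then tries to insert an invalid ID. It assumes SQLite (through Python's sqlite3) as a stand-in for SQL Server, so the DDL syntax and the `PRAGMA foreign_keys` switch are SQLite-specific. Note that a foreign key by itself still accepts NULL; a NOT NULL constraint would be needed to reject that as well.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on for the connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Courses (
        CourseID   INTEGER PRIMARY KEY,
        StudentID  INTEGER REFERENCES Students (StudentID),
        StudentID2 INTEGER REFERENCES Students (StudentID)
    );
    INSERT INTO Students VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")

conn.execute("INSERT INTO Courses VALUES (3, 1, 1)")     # valid IDs: accepted
conn.execute("INSERT INTO Courses VALUES (2, NULL, 2)")  # NULL passes a foreign key check

try:
    # 100 is not a StudentID, so the constraint rejects this row.
    conn.execute("INSERT INTO Courses VALUES (1, 100, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Courses").fetchone()[0]
print(rejected, row_count)
```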

After your update:

SELECT CourseID FROM Courses WHERE
NOT EXISTS (SELECT 1 FROM Students WHERE Students.StudentID = ISNULL(Courses.StudentID, -1) OR Students.StudentID = ISNULL(Courses.StudentID2, -1));
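
A variant worth trying, sketched here under the assumption that SQLite (through Python's sqlite3) behaves like SQL Server for this pattern: one NOT EXISTS per column, combined with OR. Because a comparison with NULL is never true, the subquery finds no row for a NULL ID and NOT EXISTS is satisfied without substituting -1. The name `NOT_EXISTS_QUERY` is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER, Name TEXT);
    CREATE TABLE Courses  (CourseID INTEGER, StudentID INTEGER, StudentID2 INTEGER);
    INSERT INTO Students VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO Courses  VALUES (1, 100, NULL), (2, NULL, 200), (3, 1, 1);
""")

# One correlated subquery per column; a NULL ID matches no student,
# so NOT EXISTS is true for it without any NULL substitution.
NOT_EXISTS_QUERY = """
    SELECT C.CourseID
    FROM Courses C
    WHERE NOT EXISTS (SELECT 1 FROM Students S WHERE S.StudentID = C.StudentID)
       OR NOT EXISTS (SELECT 1 FROM Students S WHERE S.StudentID = C.StudentID2)
"""

invalid_courses = sorted(row[0] for row in conn.execute(NOT_EXISTS_QUERY))
print(invalid_courses)  # [1, 2]
```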

7 Comments

1. What's nvl? 2. And what if there's no PK on the Students table?
You can create a PK on the table; if your data model is well defined, it should have one. NVL is a function that returns its second argument when the first argument is NULL, so it covers the "StudentID is NULL" part.
Thanks for your reply, but don't you think it is still a bit messy? Or is there no other way of doing that?
If you have two columns to check, it becomes messy. That is why I suggest you use a PK-FK relation; that way you avoid invalid students in the first place.
You can also make a specific change if you like: build a data-correction function. When executed, it updates the student ID to -1 wherever it finds an invalid value; then you just query for the -1 values. This is much faster and more efficient. You can schedule the function at regular intervals or run it inside a trigger.

The NOT EXISTS pattern is fine; however, there are several ways to do that.

You should check here and here

For example, with LEFT JOIN (two left joins, since two columns are checked):

SELECT *
from Courses 
LEFT JOIN Students Student1
    on Courses.StudentId = Student1.StudentId
LEFT JOIN Students Student2
    on Courses.StudentId2 = Student2.StudentId
WHERE
    -- No matching Student
    student1.StudentId IS NULL 
    and student2.StudentId IS NULL 

1 Comment

The problem with this query is that it doesn't solve the issue of having invalid IDs in my Courses table. This is why I added the "OR StudentID NOT IN (..)" part. The table is not perfect and I can't edit its relationships. Do you suggest I keep my original query?
