I have a table called Staff and a table called Supervisors.

Staff has StaffID, FirstName, LastName, etc...

Supervisors contains RelationshipID, StaffID, SupervisorID, SetBy, SetOn, Status.

Basically, the Supervisors table gives us an audit trail for the Staff self-relationship. We have a table of Staff, and we have a table of Staff:Staff relationships (supervisor:staff) with some extra information: a status (obsolete, current, incorrect), the StaffID of whoever set it, and when they set it.

Now, I'm writing a query to find all orphaned staff members. I have:

SELECT *
  FROM Staff
 WHERE StaffID NOT IN (SELECT StaffID
                         FROM Supervisors
                        WHERE Status = 0 
                           OR Status = 2);

(Status 0 is the initial load from the corporate DB and 2 is a modified record that has been verified. All others are 'obsolete', 'incorrect', etc.)

The issue is I have over 6000 staff and over 5000 staff:supervisor relationships, and this is basically an NxM query, meaning MySQL has to sift through roughly 30 million combinations.

I'm not an SQL ninja. Is there a better way to do this?

(Note, I do not expect to be running this particular query very often at all)

  • I can't imagine that would be very slow if you have the proper indexes in place. Set an index on StaffID and Status if you haven't already and watch your query time drop. Commented Jul 26, 2011 at 2:58
  • @AlienWebuy it was actually timing out at the 30 sec PHP limit, but I modified the DB to have more indexes and now it's taking less than a second with OMG Ponies' query. Commented Jul 26, 2011 at 4:57
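
The indexing suggestion in the comments could look roughly like the sketch below. The index name is made up for illustration, and it assumes Staff.StaffID is already the primary key of Staff; the thread does not say which indexes were actually added.

```sql
-- Hypothetical index name; a composite index lets MySQL resolve the
-- StaffID lookup and the Status filter from the index alone.
CREATE INDEX idx_supervisors_staffid_status
    ON Supervisors (StaffID, Status);

-- EXPLAIN shows whether the subquery now uses the index.
EXPLAIN
SELECT *
  FROM Staff
 WHERE StaffID NOT IN (SELECT StaffID
                         FROM Supervisors
                        WHERE Status IN (0, 2));
```

Putting StaffID first suits this query because each staff row is looked up by StaffID and then narrowed by Status.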

2 Answers

Assuming the Supervisors.StaffID and Supervisors.Status columns are not nullable, use:

   SELECT st.*
     FROM Staff st
LEFT JOIN Supervisors s ON s.StaffID = st.StaffID
                       AND s.Status IN (0, 2)
    WHERE s.StaffID IS NULL;

Otherwise, if the columns are nullable, NOT IN and NOT EXISTS are equivalent to each other and perform better than LEFT JOIN/IS NULL.
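
For comparison, a NOT EXISTS form of the same orphan check might look like this. It is a sketch that uses only the table and column names from the question and was not part of the original answer.

```sql
-- Keep a staff row only if no current relationship
-- (Status 0 or 2) exists for that StaffID.
SELECT st.*
  FROM Staff st
 WHERE NOT EXISTS (SELECT 1
                     FROM Supervisors s
                    WHERE s.StaffID = st.StaffID
                      AND s.Status IN (0, 2));
```

Unlike NOT IN, NOT EXISTS still returns rows when the subquery column contains a NULL, so it is the safer choice if Supervisors.StaffID could ever be nullable.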

This would be better performed as a join rather than a NOT IN:

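SELECT st.* 
FROM Staff st
LEFT JOIN Supervisors su ON st.StaffID = su.StaffID 
          AND su.Status IN (0, 2)
WHERE su.StaffID IS NULL

Here's how I transformed it:

NOT IN (SELECT StaffID FROM Supervisors WHERE Status = 0 OR Status = 2)

keeps a staff row only when no Supervisors row with that StaffID has Status 0 or 2. That is an anti-join: LEFT JOIN on the same conditions (StaffID matches and Status is 0 or 2), then keep only the rows where the join found nothing:

WHERE su.StaffID IS NULL

The status filter must not be negated. Applying De Morgan's law to get Status <> 0 AND Status <> 2 would match the obsolete records instead. It also has to sit in the ON clause rather than the WHERE clause, otherwise the unmatched rows, which have NULL in every su column, would be filtered out. This assumes Supervisors.StaffID can never be NULL.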
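
To check an anti-join rewrite against the original NOT IN query, a tiny throwaway dataset helps. The sketch below uses hypothetical demo_ tables and assumes status 1 stands for an obsolete record, which the question does not actually specify.

```sql
-- Hypothetical demo tables, not the real schema.
CREATE TABLE demo_staff (StaffID INT PRIMARY KEY, FirstName VARCHAR(50));
CREATE TABLE demo_supervisors (
    RelationshipID INT PRIMARY KEY,
    StaffID        INT NOT NULL,
    SupervisorID   INT NOT NULL,
    Status         INT NOT NULL
);

INSERT INTO demo_staff VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO demo_supervisors VALUES
    (1, 1, 2, 0),  -- Ann has a current supervisor
    (2, 2, 3, 1);  -- Bob only has an obsolete record (1 is assumed)
                   -- Cy has no records at all

-- Original NOT IN form: returns StaffID 2 and 3.
SELECT StaffID
  FROM demo_staff
 WHERE StaffID NOT IN (SELECT StaffID
                         FROM demo_supervisors
                        WHERE Status IN (0, 2))
 ORDER BY StaffID;

-- Anti-join with the filter in the ON clause: also returns 2 and 3.
SELECT st.StaffID
  FROM demo_staff st
  LEFT JOIN demo_supervisors su ON su.StaffID = st.StaffID
                               AND su.Status IN (0, 2)
 WHERE su.StaffID IS NULL
 ORDER BY st.StaffID;

-- Negated filter: returns 1 and 3, which is not the orphan list.
SELECT st.StaffID
  FROM demo_staff st
  LEFT JOIN demo_supervisors su ON su.StaffID = st.StaffID
                               AND su.Status NOT IN (0, 2)
 WHERE su.StaffID IS NULL
 ORDER BY st.StaffID;
```

Bob is the telling row: he has a Supervisors record, but only an obsolete one, so he counts as orphaned in the first two queries and is wrongly excluded by the third.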

2 Comments

Sorry, I may have been unspecific: Some staff have no supervisor relationships, i.e. there is no record in the Supervisors table with the StaffID. When I run your query with my initial data (all statuses are == 0), I just get the same result as my query with IN instead of NOT IN, which isn't what I need. I need to find orphaned staff.
@Ozzah: I've updated. If you accept any answer, I suggest accepting OMG Ponies's answer.
