1

Suppose we have a table employees which contains name, employee_id and citizen_id. We want to filter the table using also 3 python variables name, employee_id and citizen_id for the WHERE clause. Example:

name = 'John'
citizen_id = 'ID001'
employee_id = 'EMP023'

The condition is those 3 variables are NULLABLE, one or two of them can be NULL but at least one variable are not. How we create a sql query that allow one/two of that 3 variables in the WHERE clause?

('''SELECT 
    *
FROM
    people
WHERE
    name = '{0}'
OR citizen_id = '{1}'
OR employee_id = '{2}'''').format(name, citizen_id, employee_id)

1 Answer 1

1

You can distinguish the cases by adding

.. AND <theRespectiveParameter> IS NOT NULL) OR <theRespectiveParameter> IS NULL for each,

and change the currently stated operators between them from OR to AND considering parametrized query techniques for Spark SQL such as

Q1 = spark.sql(('''
                  SELECT *
                    FROM people
                   WHERE ((name = {0} AND {0} IS NOT NULL) OR {0} IS NULL)
                     AND ((citizen_id = {1} AND {1} IS NOT NULL) OR {1} IS NULL)
                     AND ((employee_id = {2} AND {2} IS NOT NULL) OR {2} IS NULL)
                ''').format(name, citizen_id, employee_id)

or another option would be

params  = {"name":name,               # based on current parameter names
           "citizen_id":citizen_id,
           "employee_id":employee_id}
           
Q1 = spark.sql(('''
                  SELECT *
                    FROM people
                   WHERE ((name = {name} AND {name} IS NOT NULL) OR {name} IS NULL)
                     AND ((citizen_id = {citizen_id} AND {citizen_id} IS NOT NULL) OR {citizen_id} IS NULL)
                     AND ((employee_id = {employee_id} AND {employee_id} IS NOT NULL) OR {employee_id} IS NULL)
                ''').format(**params)

getting rid of quotes for both cases

Sign up to request clarification or add additional context in comments.

4 Comments

the name, citizen_id, employee_id after OR are the table column name?
Good catch @AMR , needed to replace them with parameter names. Btw, what's your DBMS?
not all DBMS able to do parameter on the left side of equation. I am using SparkSQL so it cannot be done through your solution.
@AMR Check out as a future reference please : Why should I "tag my RDBMS"?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.