postgresql where condition returns at least one result

Question

PostgreSQL question: I have an update query (below) that sets a column to the result of a subquery. In some cases the subquery returns NULL, which violates the NOT NULL constraint on the column. How can I get it to NOT update when the subquery returns NULL?

I have tried EXISTS, but that only seems to work in a WHERE clause.

UPDATE user_stats as stats
SET ave_price = (
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
)

Please post your table definitions (at least the relevant columns and PK/FK constraints) and, if possible, a small script to populate the tables with test data.
– tscho, Commented Dec 17, 2011 at 13:33
@ErwinBrandstetter: I am curious to know which feature you are looking for. Is it CTEs, hence minimum PostgreSQL 8.4?
– tscho, Commented Dec 17, 2011 at 13:42
@tscho: Exactly. However, CTEs for data-modification commands were only introduced with PostgreSQL 9.1; see the release note "Allow data-modification commands (INSERT/UPDATE/DELETE) in WITH clauses".
– Erwin Brandstetter, Commented Dec 17, 2011 at 14:02

xQbert · Accepted Answer · 2011-12-17 13:20:47Z

4

coalesce (or nvl / ifnull in some other DB engines) takes the first non-null value in its argument list. Here, when the subselect returns null, ave_price is set to itself.

UPDATE user_stats as stats
SET ave_price = coalesce((
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
),ave_price)

This doesn't prevent the update as requested, but it has a similar effect on the data.

For more info on coalesce, see the PostgreSQL documentation.
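
To see the behaviour in isolation, here is a minimal sketch that can be run in psql without any tables. The column aliases are made up for illustration only:

```sql
-- COALESCE returns its first non-NULL argument, evaluated left to right.
SELECT COALESCE(NULL, 42)        AS fallback_used,  -- first argument is NULL, so 42
       COALESCE(10, 42)          AS first_kept,     -- first argument is non-NULL, so 10
       COALESCE(NULL::int, NULL) AS all_null;       -- every argument is NULL, so NULL
```

In the UPDATE above, the subquery plays the role of the first argument and the existing ave_price is the fallback.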

To actually prevent the update, you would need to add a WHERE clause to the update and re-execute the subquery, such as:

UPDATE user_stats as stats
SET ave_price = (
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
)
WHERE (
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
) IS NOT NULL

Logically, executing the subquery twice would cost about twice as much, whereas the coalesce version executes it only once. There are always multiple ways to do things; depending on requirements, you must choose the option that serves you best.
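
Since AVG() yields NULL only when there is no non-null price to aggregate, the WHERE condition can probably also be phrased with EXISTS, which the question mentions trying. This is only a sketch using the table and column names from the question; the explicit JOIN syntax and the price IS NOT NULL test are my additions, and I have not measured it against the versions above:

```sql
UPDATE user_stats AS stats
SET    ave_price = (
    SELECT AVG(l.price)
    FROM   lengths AS l
    JOIN   user_sessions AS us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    AND    us.user_id = stats.user_id
)
-- Only touch rows for which at least one non-null 'car' price exists,
-- which is exactly when AVG() returns a non-null result.
WHERE EXISTS (
    SELECT 1
    FROM   lengths AS l
    JOIN   user_sessions AS us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    AND    us.user_id = stats.user_id
    AND    l.price IS NOT NULL
);
```

EXISTS can stop at the first matching row instead of computing the average a second time, but whether that helps in practice depends on the data and the available indexes.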

edited Dec 17, 2011 at 13:20

answered Dec 17, 2011 at 13:08

xQbert


3 Comments

DaveB Over a year ago

thanks, the coalesce solution will work fine (takes nearly 1 min to execute though!)

Erwin Brandstetter Over a year ago

@DaveB: The first version with coalesce is not a good solution. Results in a lot of pointless updates, takes a lot longer, leaves a lot of additional dead tuples in your table and may fire triggers ON UPDATE that should not be fired. The second version is better (though unnecessarily slow). There are better ways to do this (which made me post another answer).

xQbert Over a year ago

Yep, there are lots of ways to accomplish the same task. I pointed out two; the option presented by DaveB is a third and better solution.

Erwin Brandstetter · Accepted Answer · 2011-12-28 22:25:35Z

1

Use an actual subquery to select from instead of a subquery expression:

UPDATE user_stats s
SET    ave_price = x.ave_price
FROM  (
    SELECT user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_sessions us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    ) x
WHERE x.user_id = s.user_id;

This will be faster, too. If a significant share of the user_id values in user_sessions do not exist in user_stats, the following query might be faster (both yield the same result in every case):

UPDATE user_stats s
SET    ave_price = x.ave_price
FROM  (
    SELECT us.user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    -- user_sessions is joined first: lengths is assumed to have no user_id column
    JOIN   user_sessions us ON us.session_id = l.session_id
    JOIN   user_stats usr ON usr.user_id = us.user_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    ) x
WHERE x.user_id = s.user_id;

The point of the second version is to exclude irrelevant rows early. The same query written with a CTE (somewhat more elegant and readable):

WITH x AS (
    SELECT us.user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    -- user_sessions is joined first: lengths is assumed to have no user_id column
    JOIN   user_sessions us ON us.session_id = l.session_id
    JOIN   user_stats usr ON usr.user_id = us.user_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    )
UPDATE user_stats s
SET    ave_price = x.ave_price
FROM   x
WHERE  x.user_id = s.user_id;

Be advised that while CTEs for SELECT queries were introduced with PostgreSQL 8.4, CTEs for data modification commands were only introduced with PostgreSQL 9.1. From the release notes:

Allow data-modification commands (INSERT/UPDATE/DELETE) in WITH clauses
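
Since no table definitions were posted, here is a small script to try the first query against. Everything in the schema is an assumption: only the columns used in the queries are included, and the types, keys and sample rows are made up for illustration.

```sql
-- Hypothetical schema (types and constraints are guesses, not the asker's real DDL).
CREATE TEMP TABLE user_stats (
    user_id   int PRIMARY KEY,
    ave_price numeric NOT NULL
);
CREATE TEMP TABLE user_sessions (
    session_id int PRIMARY KEY,
    user_id    int NOT NULL
);
CREATE TEMP TABLE lengths (
    session_id   int NOT NULL,
    product_type text NOT NULL,
    price        numeric
);

-- Made-up sample data: user 1 has 'car' rows, user 2 has none.
INSERT INTO user_stats    VALUES (1, 0), (2, 99);
INSERT INTO user_sessions VALUES (10, 1), (20, 2);
INSERT INTO lengths       VALUES (10, 'car', 100), (10, 'car', 200), (20, 'bike', 50);

UPDATE user_stats s
SET    ave_price = x.ave_price
FROM  (
    SELECT us.user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_sessions us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    ) x
WHERE  x.user_id = s.user_id;

-- User 1 now holds the average of 100 and 200; user 2 is not touched and keeps 99.
SELECT user_id, ave_price FROM user_stats ORDER BY user_id;
```

Because user 2 never appears in the derived table x, no row version is written for it, so the NOT NULL constraint is never tested for that row.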

edited Dec 28, 2011 at 22:25

answered Dec 17, 2011 at 13:41

Erwin Brandstetter


2 Comments

Erwin Brandstetter Over a year ago

@tscho: Thank you! I amended my answer to clarify this.

tscho Over a year ago

well, thank you! I deleted my original comment suggesting 8.4 as minimum version since I could not edit it to 9.1.
