PostgreSQL question: I have the update query below, which updates a column with the result of a subquery. In some cases the subquery returns NULL, which violates the NOT NULL constraint on the column. How can I get it to NOT update if the subquery returns NULL?

I have tried EXISTS, but this only seems to work in a WHERE clause?

UPDATE user_stats as stats
SET ave_price = (
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
)
  • Important for this question: version number of PostgreSQL? Commented Dec 17, 2011 at 13:30
  • Please post your table definitions (at least the relevant columns and PK/FK constraints) and if possible provide a little script to populate the tables with test data. Commented Dec 17, 2011 at 13:33
  • @ErwinBrandstetter: I am curious to know which feature you are looking for. Is it CTEs, hence minimum PostgreSQL 8.4? Commented Dec 17, 2011 at 13:42
  • @tscho: Exactly. However, CTEs for data-modification commands were only introduced with PostgreSQL 9.1: "Allow data-modification commands (INSERT/UPDATE/DELETE) in WITH clauses". Commented Dec 17, 2011 at 14:02

2 Answers

COALESCE (like the similar NVL and IFNULL functions found in other database engines) returns the first non-null value in its argument list. In this case, when the subselect returns NULL, ave_price is set to itself.

UPDATE user_stats as stats
SET ave_price = coalesce((
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
),ave_price)

This doesn't prevent the update as requested, but it has a similar effect on the data.

For more information on COALESCE, see the PostgreSQL documentation.
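As a minimal sketch of why the subquery yields NULL in the first place, and of what COALESCE does with it: the inline VALUES list and the fallback value 99 below are made up for illustration and are not part of the question's schema.

```sql
-- avg() over zero rows returns NULL instead of raising an error.
SELECT avg(price) AS ave_price
FROM  (VALUES (10.0), (20.0)) AS t(price)
WHERE  false;                       -- no row qualifies

-- coalesce() returns its first non-null argument, here the fallback 99.
SELECT coalesce(
           (SELECT avg(price)
            FROM  (VALUES (10.0), (20.0)) AS t(price)
            WHERE  false),
           99) AS ave_price;
```

In the UPDATE above, the fallback is the column's current value, so affected rows are rewritten with the value they already had.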

To actually prevent the update, you would need to add a WHERE clause to the UPDATE and re-execute the subquery there, such as:

UPDATE user_stats as stats
SET ave_price = (
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
)
WHERE (
    SELECT AVG(l.price)
    FROM lengths as l, user_sessions as us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
) IS NOT NULL

Logically, this executes the subquery twice, which costs roughly twice as much, whereas the COALESCE version only executes it once. There are always multiple ways to do things; which option serves best depends on the requirements.
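To try the guarded UPDATE on a small data set, something like the following sketch could be used. The table definitions and sample rows are assumptions, since the question does not show the real schema; only the table and column names are taken from the queries above.

```sql
-- Hypothetical schema: column types and constraints are guesses.
CREATE TEMP TABLE user_stats (
    user_id   int PRIMARY KEY,
    ave_price numeric NOT NULL
);
CREATE TEMP TABLE user_sessions (
    session_id int PRIMARY KEY,
    user_id    int NOT NULL
);
CREATE TEMP TABLE lengths (
    session_id   int NOT NULL,
    product_type text NOT NULL,
    price        numeric NOT NULL
);

INSERT INTO user_stats    VALUES (1, 0), (2, 0);
INSERT INTO user_sessions VALUES (10, 1), (20, 2);
-- User 1 has two 'car' rows; user 2 only has a 'boat' row.
INSERT INTO lengths VALUES (10, 'car', 100), (10, 'car', 200), (20, 'boat', 50);

-- Guarded UPDATE: rows whose subquery yields NULL are filtered out
-- by the WHERE clause, so the NOT NULL constraint is never violated.
UPDATE user_stats AS stats
SET ave_price = (
    SELECT avg(l.price)
    FROM lengths AS l, user_sessions AS us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
)
WHERE (
    SELECT avg(l.price)
    FROM lengths AS l, user_sessions AS us
    WHERE l.product_type = 'car'
    AND l.session_id = us.session_id
    AND stats.user_id = us.user_id
) IS NOT NULL;

-- Inspect the result.
SELECT user_id, ave_price FROM user_stats ORDER BY user_id;
```

With this sample data, user 1 ends up with the average of 100 and 200, while user 2 keeps its original ave_price of 0 because its row is never touched.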

3 Comments

thanks, the coalesce solution will work fine (takes nearly 1 min to execute though!)
@DaveB: The first version with coalesce is not a good solution. Results in a lot of pointless updates, takes a lot longer, leaves a lot of additional dead tuples in your table and may fire triggers ON UPDATE that should not be fired. The second version is better (though unnecessarily slow). There are better ways to do this (which made me post another answer).
Yep, there are lots of ways to accomplish the same task. I pointed out two; the option presented by DaveB is a third and better solution.

Use an actual subquery to select from instead of a subquery expression:

UPDATE user_stats s
SET    ave_price = x.ave_price
FROM  (
    SELECT us.user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_sessions us ON us.session_id = l.session_id
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    ) x
WHERE x.user_id = s.user_id;

This will be faster, too. If a significant proportion of the user_id values in user_sessions do not exist in user_stats, then the following query might be faster still (both yield the same result in every case):

UPDATE user_stats s
SET    ave_price = x.ave_price
FROM  (
    SELECT us.user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_sessions us ON us.session_id = l.session_id
    JOIN   user_stats usr ON usr.user_id = us.user_id  -- drop users missing from user_stats early
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    ) x
WHERE x.user_id = s.user_id;

The point of the second version is to exclude irrelevant rows early. The same query written with a CTE (somewhat more elegant and readable):

WITH x AS (
    SELECT us.user_id
          ,avg(l.price) AS ave_price
    FROM   lengths l
    JOIN   user_sessions us ON us.session_id = l.session_id
    JOIN   user_stats usr ON usr.user_id = us.user_id  -- drop users missing from user_stats early
    WHERE  l.product_type = 'car'
    GROUP  BY us.user_id
    HAVING avg(l.price) IS NOT NULL
    )
UPDATE user_stats s
SET    ave_price = x.ave_price
FROM   x
WHERE  x.user_id = s.user_id;

Be advised that while CTEs for SELECT queries were introduced with PostgreSQL 8.4, CTEs for data-modification commands were only introduced with PostgreSQL 9.1:

Allow data-modification commands (INSERT/UPDATE/DELETE) in WITH clauses
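If it is unclear which version a server runs, a quick check along these lines may help before choosing the CTE variant. The column alias is made up for illustration; server_version_num is a standard read-only setting that encodes the version as an integer.

```sql
-- 90100 corresponds to PostgreSQL 9.1.0
SELECT current_setting('server_version_num')::int >= 90100
       AS with_on_update_supported;
```

On servers older than 9.1, the plain subquery versions above remain the option to use.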

2 Comments

@tscho: Thank you! I amended my answer to clarify this.
well, thank you! I deleted my original comment suggesting 8.4 as minimum version since I could not edit it to 9.1.
