I have a very simple query that is not much more complicated than:

select *
from table_name
where id = 1234

...it takes less than 50 milliseconds to run.

I took that query and put it into a function:

CREATE OR REPLACE FUNCTION pie(id_param integer)
RETURNS SETOF record AS
$BODY$
BEGIN
    RETURN QUERY SELECT *
         FROM table_name
         where id = id_param;
END
$BODY$
LANGUAGE plpgsql STABLE;

When executed with select * from pie(123); this function takes 22 seconds.

If I hard code an integer in place of id_param, the function executes in under 50 milliseconds.

Why does using a parameter in the WHERE clause cause my function to run slowly?


Edit to add concrete example:

CREATE TYPE test_type AS (gid integer, geocode character varying(9));

CREATE OR REPLACE FUNCTION geocode_route_by_geocode(geocode_param character)
  RETURNS SETOF test_type AS
$BODY$
BEGIN
RETURN QUERY EXECUTE
    'SELECT     gs.geo_shape_id AS gid,     
        gs.geocode
    FROM geo_shapes gs
    WHERE geocode = $1
    AND geo_type = 1 
    GROUP BY geography, gid, geocode' USING geocode_param;
END;

$BODY$
  LANGUAGE plpgsql STABLE;
ALTER FUNCTION geocode_route_by_geocode(character)
  OWNER TO root;

--Runs in 20 seconds
select * from geocode_route_by_geocode('999xyz');

--Runs in 10 milliseconds
SELECT  gs.geo_shape_id AS gid,     
        gs.geocode
    FROM geo_shapes gs
    WHERE geocode = '9999xyz'
    AND geo_type = 1 
    GROUP BY geography, gid, geocode

  • Three orders of magnitude difference? Whoah Commented Feb 16, 2012 at 3:31
  • What happens when you use LANGUAGE SQL instead? Commented Feb 16, 2012 at 3:38
  • @DanielLyons using LANGUAGE SQL turned out to be the trick. Thanks! Commented Feb 17, 2012 at 3:32

1 Answer

Update in PostgreSQL 9.2

There was a major improvement. I quote the release notes here:

Allow the planner to generate custom plans for specific parameter values even when using prepared statements (Tom Lane)

In the past, a prepared statement always had a single "generic" plan that was used for all parameter values, which was frequently much inferior to the plans used for non-prepared statements containing explicit constant values. Now, the planner attempts to generate custom plans for specific parameter values. A generic plan will only be used after custom plans have repeatedly proven to provide no benefit. This change should eliminate the performance penalties formerly seen from use of prepared statements (including non-dynamic statements in PL/pgSQL).


Original answer for PostgreSQL 9.1 or older

A PL/pgSQL function has an effect similar to the PREPARE statement: queries are parsed and the query plan is cached.

The advantage is that some overhead is saved for every call.
The disadvantage is that the query plan is not optimized for the particular parameter values it is called with.
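
The same behavior can be observed outside a function with PREPARE and EXPLAIN EXECUTE. This is only a sketch: table_name and id are the placeholder names from the question, and pie_plan is a hypothetical statement name. On 9.1 or older, the plan shown for the prepared statement is the generic one, so comparing it with the plan for the literal query may show where the two diverge.

```sql
-- Sketch only: table_name and id are placeholders from the question;
-- pie_plan is a hypothetical statement name.
PREPARE pie_plan(integer) AS
SELECT *
FROM   table_name
WHERE  id = $1;

-- Plan used for the prepared statement with this parameter value.
EXPLAIN EXECUTE pie_plan(1234);

-- Plan for the same query with a literal constant, for comparison.
EXPLAIN
SELECT *
FROM   table_name
WHERE  id = 1234;

-- Clean up the prepared statement.
DEALLOCATE pie_plan;
```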

For queries on tables with even data distribution, this will generally be no problem, and PL/pgSQL functions will perform somewhat faster than raw SQL queries or SQL functions. But if your query can use certain indexes depending on the actual values in the WHERE clause or, more generally, could choose a better query plan for the particular values, you may end up with a sub-optimal query plan. Try an SQL function, or use dynamic SQL with EXECUTE to force the query to be re-planned for every call. It could look like this:

CREATE OR REPLACE FUNCTION pie(id_param integer)
RETURNS SETOF record AS
$BODY$
BEGIN        
    RETURN QUERY EXECUTE
        'SELECT *
         FROM   table_name
         where  id = $1'
    USING id_param;
END
$BODY$
LANGUAGE plpgsql STABLE;

Edit after comment:

If this variant does not change the execution time, there must be other factors at play that you may have missed or did not mention. Different database? Different parameter values? You would have to post more details.

I add a quote from the manual to back up my above statements:

An EXECUTE with a simple constant command string and some USING parameters, as in the first example above, is functionally equivalent to just writing the command directly in PL/pgSQL and allowing replacement of PL/pgSQL variables to happen automatically. The important difference is that EXECUTE will re-plan the command on each execution, generating a plan that is specific to the current parameter values; whereas PL/pgSQL normally creates a generic plan and caches it for re-use. In situations where the best plan depends strongly on the parameter values, EXECUTE can be significantly faster; while when the plan is not sensitive to parameter values, re-planning will be a waste.
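
For completeness, the SQL-function alternative mentioned above might look roughly like the following. This is a sketch, not a tested fix for the asker's case: pie_sql is a hypothetical name, and table_name and id are the placeholders from the question. It declares RETURNS SETOF table_name instead of SETOF record so that the call does not need a column definition list, and it uses $1 because referencing SQL-function parameters by name requires PostgreSQL 9.2 or later.

```sql
-- Sketch only: pie_sql is a hypothetical name; table_name and id are
-- placeholders from the question.
CREATE OR REPLACE FUNCTION pie_sql(id_param integer)
  RETURNS SETOF table_name AS
$BODY$
    SELECT *
    FROM   table_name
    WHERE  id = $1;  -- positional reference; works on 9.1 and older too
$BODY$
LANGUAGE sql STABLE;

-- Usage:
SELECT * FROM pie_sql(1234);
```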

5 Comments

...and I was just wondering the difference between RETURN QUERY and RETURN QUERY EXECUTE. Good answer, thanks :)
This did not seem to help the execution time.
Added concrete example in question.
After doing some more experimenting, I changed my function to use LANGUAGE 'sql' instead of plpgsql. The function now runs in the 10ms range.
After consulting the pgsql-performance list I got another correct answer to my problem. The column being matched is defined as character varying(9) in my database table, but the input parameter to the function is defined as character. Changing the function parameter type to match the column definition made my function perform just as well as via command line. Erwin Brandstetter's answer is also technically correct so I'm marking it as answer.
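
The type fix described in the last comment could be sketched as follows, assuming the names from the concrete example in the question. The only intended change is the parameter type, character varying instead of character. The old signature is dropped first because CREATE OR REPLACE with a different argument type would otherwise create a second, overloaded function next to the original.

```sql
-- Sketch only, using the names from the question's concrete example.
-- Drop the old signature so the two versions do not coexist as overloads.
DROP FUNCTION IF EXISTS geocode_route_by_geocode(character);

CREATE OR REPLACE FUNCTION geocode_route_by_geocode(geocode_param character varying)
  RETURNS SETOF test_type AS
$BODY$
BEGIN
RETURN QUERY EXECUTE
    'SELECT     gs.geo_shape_id AS gid,
        gs.geocode
    FROM geo_shapes gs
    WHERE geocode = $1
    AND geo_type = 1
    GROUP BY geography, gid, geocode' USING geocode_param;
END;
$BODY$
  LANGUAGE plpgsql STABLE;

-- Usage, as in the question:
SELECT * FROM geocode_route_by_geocode('999xyz');
```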
