2

I am trying to build a function that gives back a percentage of rows that have a particular condition.

Here's my attempt:

CREATE OR REPLACE FUNCTION osm_prozent ()
RETURNS integer AS $$
DECLARE 
      a integer;
      b integer;
      ergebnis integer;
BEGIN
  a = select into a count(*)
    from osm_street;
  b = select into b count(*)
    from osm_street
    where hausnummer like '%-%'
    or hausnummer like '% %' 
    or hausnummer like '%/%' 
    or hausnummer like '%;%'
    or hausnummer like '%,%'
    or hausnummer ~ '([a-z,A-Z])';  

  ergebnis = (b/a)*100;
  return ergebnis;
END;
$$ LANGUAGE plpgsql;

Variable a is 207000 and variable b is 11000. At the end I should have something like 5.31%.

Can someone help to write this function correctly?

1
  • if you use (b/a) * 100 u will get 5.31. In your calculation b is higher than a how you will be get 6 % Commented Mar 31, 2016 at 9:10

2 Answers 2

3

You can make two clear improvements, in order of importance:

  1. Make 1 pass over the table instead of 2 passes
  2. Simplify where condition of your second query

Dealing with the 2nd item first:

The where condition can be changed to have just one comparison using just one regex test. This expression:

where hausnummer like '%-%'
or hausnummer like '% %' 
or hausnummer like '%/%'
or hausnummer like '%;%'
or hausnummer like '%,%'
or hausnummer ~ '([a-z,A-Z])'

can be expressed as simply:

where hausnummer ~ '[a-z,A-Z /;-]'

Next, to make just one pass over the table, use sum() on the condition to capture the number of hits at the same time as you capture count(*). Using the above improvement as the condition, your query can become:

select 100.0 * sum(case when hausnummer ~ '[a-z,A-Z /;-]' then 1 end) / count(*)
from osm_street

Unless you need a store procedure, you can now dispense with it entirely, as you have the result in one simple query.

If you desperately require a stored procedure (not recommended), it would be just a thin wrapper (adding no value) around the above query:

CREATE OR REPLACE FUNCTION osm_prozent () RETURNS double precision AS $$
BEGIN
  RETURN select 100.0 * sum(case when hausnummer ~ '[a-z,A-Z /;-]' then 1 end) / count(*) from osm_street;
END;
$$ LANGUAGE plpgsql;

Note that I suspect that your last condition hausnummer ~ '([a-z,A-Z])' is not what you intend. Firstly, the brackets are redundant and may be removed without changing the meaning, ie it's the same as hausnummer ~ '[a-z,A-Z]' and is true if hausnummber contains a letter or a comma.

If that is not what you intend, leave a comment to this answer explaining what you actually want.

Sign up to request clarification or add additional context in comments.

3 Comments

What the OP actually wants should never just go into a comment. It should be in the question. And I assume by "store procedure" you mean stored procedure? Postgres doesn't have stored procedures, just functions which are almost, but not quite the same.
@erwin I respect your database knowledge very much, but I think you're splitting hairs saying the PG "doesn't have stored procedures"; the doc says PL/pgSQL ... can be used to create functions and trigger procedures, and they are stored, so they are stored procedures. Also, "stored procedures" is the industry's generic term for such things. Further, there is no chance IMHO that calling these things stored procedures in the context of Postgres would be misleading or lead to confusion.
The important distinction is that Postgres functions (as opposed to real stored procedures) are always run inside a single transaction so cannot begin or commit transactions and cannot run commands like VACUUM or CREATE INDEX CONCURRENTLY (which cannot run within a transaction block). More: stackoverflow.com/a/28251922/939860
0

Your filter by regular expression mixed with several likes can be simplified to a single regular expression (also note that your regular expression adds unnecessary load by adding brackets, and the comma between a-z and A-Z is not a separator but matches a comma as well, which you also included TWICE! in your list of like comparisons). A much simpler function code would be:

CREATE OR REPLACE FUNCTION osm_prozent () RETURNS double precision AS $$
  SELECT 100. * (SELECT COUNT(*) FROM osm_street WHERE hausnummer ~* '[a-z /;,-]') / (SELECT COUNT(*) FROM osm_street);
$$ LANGUAGE sql;

2 Comments

Hi, i am getting this error:ERROR: operator does not exist: text ~ unknown LINE 2: ...(SELECT COUNT() FROM osm_street WHERE hausnummer *~ '[a-z /... ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
Sorry, typo (I had put *~ instead of ~*). I've corrected it now in the code. ~ uses a case-sensitive regular expression match, whilst ~* uses a case-insensitive regular expression match :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.