Counting the number of occurrences of a substring within a string in PostgreSQL

Question

How can I count the number of occurrences of a substring within a string in PostgreSQL?

Example:

I have a table

CREATE TABLE test."user"
(
  uid integer NOT NULL,
  name text,
  result integer,
  CONSTRAINT pkey PRIMARY KEY (uid)
)

I want to write a query that stores in the column result the number of occurrences of the substring o in the column name. For instance, if name is hello world in some row, result should contain 2, since the string hello world contains two o's.

In other words, I'm trying to write a query that would take as input:

and update the result column:

I am aware of the function regexp_matches and its g option, which indicates that the whole string should be scanned for all occurrences of the substring (g = global).

Example:

SELECT * FROM regexp_matches('hello world', 'o', 'g');

returns

{o}
{o}

and

SELECT COUNT(*) FROM regexp_matches('hello world', 'o', 'g');

returns

2

But I don't see how to write an UPDATE query that sets the result column to the number of occurrences of the substring o in the column name.

Possible duplicate of PostgreSQL count number of times substring occurs in text
– Evan Carroll, Commented Mar 10, 2017 at 1:15

Franck Dernoncourt · Accepted Answer · 2016-04-02 17:35:55Z

96

A common solution is based on this logic: replace the search string with an empty string, then divide the difference between the old and new lengths by the length of the search string:

(CHAR_LENGTH(name) - CHAR_LENGTH(REPLACE(name, 'substring', ''))) 
/ CHAR_LENGTH('substring')

Hence:

UPDATE test."user"
SET result = 
    (CHAR_LENGTH(name) - CHAR_LENGTH(REPLACE(name, 'o', ''))) 
    / CHAR_LENGTH('o');
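
As a quick check of the arithmetic, the expression can be run on string literals before touching the table. This is a minimal sketch; the sample strings 'banana' and 'an' are illustrative values and do not come from the question.

```sql
-- Single-character search: 'hello world' has 11 characters, 9 once every 'o' is removed.
SELECT (CHAR_LENGTH('hello world') - CHAR_LENGTH(REPLACE('hello world', 'o', '')))
       / CHAR_LENGTH('o') AS occurrences;   -- (11 - 9) / 1 = 2

-- Multi-character search (illustrative values): removing 'an' from 'banana' leaves 'ba'.
SELECT (CHAR_LENGTH('banana') - CHAR_LENGTH(REPLACE('banana', 'an', '')))
       / CHAR_LENGTH('an') AS occurrences;  -- (6 - 2) / 2 = 2
```

REPLACE removes non-overlapping matches, so overlapping occurrences are not counted separately, and an empty search string would make the divisor zero.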

edited Apr 2, 2016 at 17:35

Franck Dernoncourt


answered Apr 2, 2016 at 17:28

dnoeth



3 Comments

Evan Carroll Over a year ago

This is a solid answer, and it's right. You may be interested in my write-up of all the methods of doing this.

Aleksandr Levchuk Over a year ago

Thanks! Does anyone know why there is no simpler way? I mean, REPLACE already goes through the trouble of scanning the whole string for all the occurrences, so why not have something that does half of the work of REPLACE and just counts the occurrences?

dnoeth Over a year ago

@AleksandrLevchuk: Well, you can write your own User Defined Function doing this calculation, e.g. there's Oracle's REGEXP_COUNT in enterprisedb.com/docs/en/9.5/eeguide/….
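
For readers on a recent server: PostgreSQL 15 and later ship a built-in regexp_count function, so no user-defined function is needed there. A minimal sketch, assuming PostgreSQL 15+ and the table from the question:

```sql
-- Requires PostgreSQL 15 or later; the pattern is a regular expression, not a literal string.
SELECT regexp_count('hello world', 'o');  -- 2

UPDATE test."user"
SET result = regexp_count(name, 'o');
```

Since the second argument is a regular expression, a search string containing characters such as . or * would need escaping to be matched literally.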

Gordon Linoff · Answer · 2016-04-02 17:31:28Z

78

A Postgres'y way of doing this converts the string to an array and counts the length of the array (and then subtracts 1):

select array_length(string_to_array(name, 'o'), 1) - 1

Note that this works with longer substrings as well.

Hence:

update test."user"
    set result = array_length(string_to_array(name, 'o'), 1) - 1;
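
One edge case may be worth checking before running the update: an empty string splits into an empty array, and array_length of an empty array is NULL rather than 0. The following is a sketch of that behavior on literals, with COALESCE as one possible guard (the guard is an addition here, not part of the answer above).

```sql
-- 'hello world' splits into 3 pieces around the two 'o' characters.
SELECT array_length(string_to_array('hello world', 'o'), 1) - 1;    -- 2

-- An empty string splits into an empty array, whose array_length is NULL.
SELECT array_length(string_to_array('', 'o'), 1) - 1;               -- NULL

-- One possible guard: treat the empty array as a single piece.
SELECT COALESCE(array_length(string_to_array('', 'o'), 1), 1) - 1;  -- 0
```

Note that this guard also maps a NULL name to 0 instead of NULL, which may or may not be what you want.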

answered Apr 2, 2016 at 17:31

Gordon Linoff


5 Comments

Le Droid Over a year ago

If someone needs regexp, this solution with "regexp_split_to_array" instead of "string_to_array" works too.
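
A minimal sketch of the regexp_split_to_array variant mentioned in this comment; the sample string 'a1b22c' and the pattern '[0-9]+' are illustrative values, not taken from the thread.

```sql
-- Count runs of digits: the string splits into the pieces a, b, c.
SELECT array_length(regexp_split_to_array('a1b22c', '[0-9]+'), 1) - 1;  -- 3 pieces, so 2

-- Applied to the table from the question; here 'o' is treated as a regular expression.
UPDATE test."user"
SET result = array_length(regexp_split_to_array(name, 'o'), 1) - 1;
```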

Evan Carroll Over a year ago

This solution is substantially slower than @dnoeth's suggestion. I don't think it's more-Postgres-y. When things are faster and more portable in a different method, I think we call that better. =)

WebWanderer Over a year ago

@EvanCarroll Unfortunately, dnoeth's answer won't work for regex matches, since you may not know the length of the match. This answer will work for both regex matches and raw string matches. I think what we call better is the solution that works for everything you are trying to do :)
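
For regex patterns, the regexp_matches call from the question can also be used directly by wrapping it in a correlated scalar subquery. A minimal sketch, assuming the table from the question:

```sql
-- The subquery is evaluated once per row; count(*) over zero matches is 0.
UPDATE test."user"
SET result = (SELECT count(*) FROM regexp_matches(name, 'o', 'g'));
```

The 'g' flag matters here: without it, regexp_matches returns at most one row per input string, so the count could never exceed 1.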

user2297550 Over a year ago

upvoted but i don't see what's "postgres'y" about it; let's not borrow pythonic characterizations :)

Gordon Linoff Over a year ago

@user2297550 . . . Most databases don't support arrays natively.

bnson · Answer · 2016-08-16 09:18:20Z

12

Another way:

UPDATE test."user" SET result = length(regexp_replace(name, '[^o]', '', 'g'));
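
To see why this works, it may help to look at the intermediate string. A small sketch on a literal:

```sql
-- '[^o]' matches every character that is not 'o'; with the 'g' flag all of them are removed.
SELECT regexp_replace('hello world', '[^o]', '', 'g');          -- 'oo'

-- What is left consists only of 'o' characters, so its length is the count.
SELECT length(regexp_replace('hello world', '[^o]', '', 'g'));  -- 2
```

Because a bracket expression describes a set of single characters, this trick counts characters only; it does not extend to multi-character substrings.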

answered Aug 16, 2016 at 9:18

bnson


1 Comment

Sébastien Clément Over a year ago

This is the simplest way.

rcanpahali · Answer · 2019-04-12 13:20:20Z

9

Return the count of a character:

-- Returns the number of '.' characters
SELECT (LENGTH('1.1.1.1') - LENGTH(REPLACE('1.1.1.1', '.', ''))) AS count;

edited Apr 12, 2019 at 13:20

rcanpahali


answered Apr 12, 2019 at 12:43

Guilherme Passos


3 Comments

Steven Over a year ago

This was the most readable solution for me.

StonedTensor Over a year ago

This works fine for just one character, but the division in the accepted answer is needed for substrings longer than one character.

Vérace Over a year ago

@StonedTensor - you could try with the TRANSLATE() function!

jrtc27 · Answer · 2019-04-12 13:40:18Z

7

Occurrence_Count = LENGTH(REPLACE(string_to_search, string_to_find, '~')) - LENGTH(REPLACE(string_to_search, string_to_find, ''))

This solution is a bit cleaner than many that I have seen, especially since it needs no divisor. You can turn this into a function or use it within a SELECT.
No variables are required. I use a tilde as the replacement character, but any character that is not in the dataset will work.
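
A worked instance of the formula on literals may make the arithmetic easier to follow. The values 'banana' and 'an' are illustrative and not part of the answer.

```sql
-- Illustrative values: 'banana' contains 'an' twice.
SELECT LENGTH(REPLACE('banana', 'an', '~'))   -- 'b~~a' has 4 characters
     - LENGTH(REPLACE('banana', 'an', ''))    -- 'ba' has 2 characters
       AS occurrence_count;                   -- 4 - 2 = 2
```

Each occurrence contributes exactly one character to the first length and none to the second, so the difference is the count. Since only lengths are compared, the placeholder appears to work even if it already occurs in the data, as long as it is exactly one character long.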

edited Apr 12, 2019 at 13:40

jrtc27


answered Apr 3, 2018 at 13:37

Robert Bondy



sparkle · Answer · 2021-12-09 11:52:17Z

1

SELECT array_length(string_to_array('a long name here', 'o'),1)

1 is the array dimension to measure (the array is one-dimensional)
'o' is the substring whose occurrences you want to count

answered Dec 9, 2021 at 11:52

sparkle


1 Comment

j3App Over a year ago

Clever :) To be exact, you need to subtract one at the end. E.g. with one occurrence, the array will have length 2.
