6

I have a table foo and would like to set its bar column to a random string. I've got the following query:

update foo
set bar = (select string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', round(random() * 30)::integer, 1), '')
           from generate_series(1, 9));

But it generates the random string once and reuses it for all rows. How can I make it generate a new random string for each row?

I know I can make it a function like this:

create function generate_bar() returns text language sql as $$
  select string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', round(random() * 30)::integer, 1), '')
  from generate_series(1, 9)
$$;

and then call the function in the update query. But I'd prefer to do it without a function.

4
  • 1
    Could you please refer this link Commented Jan 4, 2021 at 10:20
  • 2
    I'm not following. Why should I refer to the link? Commented Jan 4, 2021 at 10:24
  • 1
    Because using a custom function is the cleanest way to do this. Commented Jan 4, 2021 at 14:45
  • Right. Created as a function. Commented Jan 4, 2021 at 15:06

4 Answers

3

For a random string of up to 32 lowercase hexadecimal characters (digits 0-9 and letters a-f), use:

UPDATE "foo" SET "bar"= substr(md5(random()::text), 0, XXX);

and replace XXX with the desired string length plus one. For example, to replace every value with a 32-character string:

UPDATE "foo" SET "bar"= substr(md5(random()::text), 0, 33);

14235ccd21a408149cfbab0a8db19fb2 is the kind of value a row might receive. Each row gets its own random string, but the strings are not guaranteed to be unique.
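The "plus one" comes from the start position of 0: PostgreSQL counts string positions from 1, so a substr that starts at 0 spends one unit of its length on a position before the first character. A minimal sketch of that behavior, with left() shown as an alternative that takes the length directly (the literal 'abcdef' and the length 9 are only illustrations):

```sql
-- Start position 0 lies before the first character, so a count of 4 yields 3 characters.
SELECT substr('abcdef', 0, 4);  -- 'abc'

-- Starting at position 1 returns exactly the requested number of characters.
SELECT substr('abcdef', 1, 3);  -- 'abc'

-- left() takes the desired length directly.
SELECT left('abcdef', 3);       -- 'abc'

-- Applied to the update above: a 9-character hexadecimal string for each row.
UPDATE foo SET bar = left(md5(random()::text), 9);
```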

For generating strings with more than 32 characters:

Combine several md5(random()::text) calls with CONCAT.
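As a sketch of that combination, assuming the same foo table and bar column: each md5(random()::text) call contributes 32 hexadecimal characters, and left() trims the concatenated result to the wanted length (40 here is an arbitrary example).

```sql
-- Two independent md5 digests give 64 characters; left() keeps the first 40.
UPDATE foo
SET bar = left(concat(md5(random()::text), md5(random()::text)), 40);
```

random() is volatile, so each call is evaluated separately and the two halves differ.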


2

The problem is that the Postgres optimizer is being too clever: because the subquery does not reference the row being updated, it decides it can execute the subquery once and reuse the result for all rows. Arguably it is missing something obvious here, since random() makes the subquery volatile, so evaluating it only once is not appropriate behavior.

One way to get around this is to use a correlated subquery. Here is an example:

update foo
    set bar = array_to_string(array(select string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', round(random() * 30)::integer, 1), '')
                                    from generate_series(1, 9)
                                    where foo.bar is distinct from 'something'
                                   ), '');

Here is a db<>fiddle.
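One quick way to confirm that an update produced a different value per row is to compare the total row count with the number of distinct values. This is only a sketch, assuming the foo table and bar column from the question:

```sql
-- If the subquery had been evaluated only once, distinct_values would be 1.
SELECT count(*)            AS total_rows,
       count(DISTINCT bar) AS distinct_values
FROM foo;
```

The two counts can still differ slightly on a large table, since random strings may collide by chance.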


0

Not as good as the answer, but if you only need a random string of a few letters, you could also use:

UPDATE foo
    SET bar = CONCAT(
        -- floor() keeps the position in 1..26; round() could yield 27, past the end of the alphabet
        SUBSTRING('abcdefghijklmnopqrstuvwxyz', floor(random() * 26)::integer + 1, 1),
        SUBSTRING('abcdefghijklmnopqrstuvwxyz', floor(random() * 26)::integer + 1, 1),
        SUBSTRING('abcdefghijklmnopqrstuvwxyz', floor(random() * 26)::integer + 1, 1)
        );


0

Here's a sane function that picks from allowed characters:

CREATE OR REPLACE FUNCTION random_string(int) RETURNS TEXT as $$
select
  string_agg(substr(characters, floor(random() * length(characters))::integer + 1, 1), '') as random_word
from (values('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789')) as symbols(characters)
  join generate_series(1, $1) on 1 = 1
$$ language sql;  

Then use it as:

UPDATE mytable SET col1 = random_string(10), col2 = random_string(20);

Minimal test:

CREATE OR REPLACE FUNCTION random_string(int) RETURNS TEXT as $$
select
  string_agg(substr(characters, floor(random() * length(characters))::integer + 1, 1), '') as random_word
from (values('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789    --')) as symbols(characters)
  join generate_series(1, $1) on 1 = 1
$$ language sql;  
DROP TABLE IF EXISTS tmp;
CREATE TABLE "tmp" ("i" INTEGER, "j" INTEGER, "s" TEXT, "t" TEXT);
INSERT INTO "tmp" (i, j, s, t) SELECT i, i*2, 'a', 'b' FROM generate_series(1, 10) as s(i);
SELECT * FROM "tmp";
UPDATE "tmp" SET s = random_string(10), t = random_string(20);
SELECT * FROM "tmp";

which outputs:

CREATE FUNCTION
DROP TABLE
CREATE TABLE
INSERT 0 10
 i  | j  | s | t 
----+----+---+---
  1 |  2 | a | b
  2 |  4 | a | b
  3 |  6 | a | b
  4 |  8 | a | b
  5 | 10 | a | b
  6 | 12 | a | b
  7 | 14 | a | b
  8 | 16 | a | b
  9 | 18 | a | b
 10 | 20 | a | b
(10 rows)

UPDATE 10
 i  | j  |     s      |          t           
----+----+------------+----------------------
  1 |  2 | pqb0jVp i  | PImey082XovRskbK5mxY
  2 |  4 | DqOtVlf5r4 | 13MPe1WAiTi4Pr pEGHK
  3 |  6 | AITONX Xzg | VTU4gKsN4fuoRR8dVb7o
  4 |  8 | PcmsD5t1g- | JV4ohJ DtKGKwc kRGJ
  5 | 10 | oJ-RtapI-q | G XBIP2UqGpxOSroY3s7
  6 | 12 | ScecWoJ6jy | JDWdjTFBm0rseuVwqdJa
  7 | 14 | 3bigPU7GHG | 1u VEgNIhXYf ZZa7z2W
  8 | 16 |   4vLHduh- | Zk20QXq1t  Jb2fevaQ 
  9 | 18 | sW t7Jzr3v | Cvr3aD wd H8jdgHvSq
 10 | 20 | F3ylfYcqe4 | 0ccHaM9XW-Qzg2tV-gI0
(10 rows)

so we see that both columns were updated with a different value in each row.

According to my quick benchmark at "SQL Populate table with random data", it is about 10x slower than md5(random()::text).
