
I want to generate unique random numbers in PostgreSQL, just as I have done in MySQL (code below), and I want to do it inside a Postgres function.

MySQL:

DROP PROCEDURE IF EXISTS Generate_random;
DELIMITER $$
CREATE PROCEDURE Generate_random()
BEGIN
    Drop table if exists aa_dev.`Agents`;
    CREATE TABLE aa_dev.`Agents`(AgentID int PRIMARY KEY);

    SET @first = 1;
    SET @last = 1000;

    WHILE(@first <= @last) Do
        INSERT INTO aa_dev.`Agents` VALUES(FLOOR(RAND()*(2900000-2800000+1)+2800000))
                                          ON DUPLICATE KEY UPDATE AgentID = FLOOR(RAND()*(2900000-2800000+1)+2800000);
        IF ROW_COUNT() = 1 THEN
            SET @first = @first + 1;
        END IF;
    END WHILE;
END$$


DELIMITER ;

CALL Generate_random();

So far I have managed to generate random numbers in Postgres, but values are repeated in the column. How can I reproduce the behavior of the MySQL code above in PostgreSQL?

drop function if exists aa_dev.rand_cust(low INT, high INT, total INT);
CREATE OR REPLACE FUNCTION aa_dev.rand_cust(low INT ,high INT, total INT)
  RETURNS TABLE (Cust_id  int) AS
$$
declare

counter int := 0;
rand int := 0;


begin
------------------- Creating a customer table with Cust_id----------------------------
    DROP TABLE IF EXISTS aa_dev.Customer;

    CREATE TABLE IF NOT EXISTS aa_dev.Customer (
    Cust_id INT
    );
 --------------------- Loop to insert random -----------------------
    while counter < total loop
        rand = floor(random()* (high-low + 1) + low);
        Insert into aa_dev.Customer (Cust_id) values(rand);
        counter := counter + 1;
    end loop;

    return query
    select *
    from aa_dev.customer;
end
$$
LANGUAGE plpgsql;

select * from aa_dev.rand_cust(1, 50, 100);
  • The MySQL code is totally different from the PostgreSQL code. At least try to write the same logic. Furthermore, the PostgreSQL code can never produce unique values, since you generate 100 numbers from a range of only 50. Commented Jan 7, 2021 at 15:15
  • @PauloPereira They are different because I could not reproduce the exact MySQL code in PostgreSQL, and that is the point of posting the question. I tried an ON CONFLICT upsert, but it did not work because it gave an error. Commented Jan 7, 2021 at 16:52
  • I suggest you take a look at Migrate your mindset too. At a minimum your parameters should be (1, 100001, 100). Then you need to handle duplicates as a Postgres exception rather than complain that it is not the same as MySQL. Hint: put your insert in a nested block. Commented Jan 8, 2021 at 3:42

2 Answers


For Postgres you've asked for 100 numbers between 1 and 50 - there will naturally be duplicates!

The MySQL code has a much wider range of possible values (100,001, from 2,800,000 to 2,900,000 inclusive), and only 1,000 of them are sampled. The MySQL loop also keeps generating random numbers until an insert succeeds without a key collision, so the column ends up with no duplicates.

So, for Postgres, you could try checking for duplicates and retrying if found. Making the column unique will prevent duplicate insertion, but you have to handle it.

Also, the range must contain at least as many distinct values as the number of rows you ask for. Be careful with the retries and don't replicate the MySQL example blindly: if the range holds fewer values than the requested count, the loop will never terminate.
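As a rough sketch of the check-and-retry approach described above, the following assumes PostgreSQL 9.5 or later (needed for ON CONFLICT). The names customer_retry and rand_cust_retry are hypothetical and exist only for this illustration, and the table is truncated rather than dropped and recreated. The guard at the top raises an error when the range is too small, so the loop cannot spin forever.

```sql
-- Hypothetical table for this sketch; the primary key enforces uniqueness.
CREATE TABLE IF NOT EXISTS customer_retry (cust_id INT PRIMARY KEY);

-- Hypothetical function name; not part of the original question or answer.
CREATE OR REPLACE FUNCTION rand_cust_retry(low INT, high INT, total INT)
RETURNS VOID AS
$$
DECLARE
    inserted INT := 0;  -- rows successfully inserted so far
    n        INT;       -- rows affected by the most recent INSERT (0 or 1)
BEGIN
    -- Guard: a range with fewer values than requested would loop forever.
    IF high - low + 1 < total THEN
        RAISE EXCEPTION 'range [%, %] contains fewer than % distinct values',
            low, high, total;
    END IF;

    TRUNCATE customer_retry;

    WHILE inserted < total LOOP
        INSERT INTO customer_retry (cust_id)
        VALUES (floor(random() * (high - low + 1) + low)::INT)
        ON CONFLICT (cust_id) DO NOTHING;  -- a duplicate is skipped, not an error

        GET DIAGNOSTICS n = ROW_COUNT;     -- 1 if the row went in, 0 if it collided
        inserted := inserted + n;
    END LOOP;
END
$$
LANGUAGE plpgsql;

-- Same range and count as the MySQL procedure in the question.
SELECT rand_cust_retry(2800000, 2900000, 1000);
```

This stays close to the MySQL logic, where the counter only advances when a row is actually inserted. It gets slow when total approaches the size of the range, because most attempts then collide.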


Update

Here is a function that will generate unique random numbers within a range and populate a table with them:

DROP FUNCTION IF EXISTS rand_cust (low INT, high INT, total INT);
CREATE OR REPLACE FUNCTION rand_cust (low INT, high INT, total INT) 
RETURNS TABLE (Cust_id INT) 
AS 
$$ 
BEGIN
------------------- Creating a customer table with Cust_id----------------------------
    DROP TABLE IF EXISTS Customer;
    CREATE TABLE IF NOT EXISTS Customer(Cust_id INT);

    RETURN query
    INSERT INTO Customer(Cust_id)
    SELECT *
    FROM generate_series(low, high)
    ORDER BY random() LIMIT total
    RETURNING -- returns the id's you generated
        Customer.Cust_id;

END $$ 
LANGUAGE plpgsql;

SELECT *
FROM rand_cust(1000, 2000, 100);  -- 100 unique numbers between 1000 and 2000 inclusive

Note that this cannot generate more numbers than the range contains, e.g. you can't generate 100 numbers between 1 and 50, only a maximum of 50. That's a consequence of the uniqueness requirement. The LIMIT clause will not cause errors; it simply returns fewer rows than requested. You could add code to check that (high - low + 1) >= total before attempting the query.
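One possible form of that check is sketched below. The function name sample_checked and the output column val are hypothetical, and raising an exception is only one option; returning fewer rows silently may be acceptable in some cases.

```sql
-- Hypothetical variant that refuses to run when the range is too small.
CREATE OR REPLACE FUNCTION sample_checked(low INT, high INT, total INT)
RETURNS TABLE (val INT) AS
$$
BEGIN
    -- The inclusive range low..high holds (high - low + 1) distinct values.
    IF high - low + 1 < total THEN
        RAISE EXCEPTION 'cannot draw % unique values from a range of % values',
            total, high - low + 1;
    END IF;

    RETURN QUERY
    SELECT g
    FROM generate_series(low, high) AS g
    ORDER BY random()
    LIMIT total;
END
$$
LANGUAGE plpgsql;

-- Succeeds: 100 unique values out of 1001 candidates.
SELECT * FROM sample_checked(1000, 2000, 100);

-- Raises an exception: only 50 candidates for 100 requested values.
-- SELECT * FROM sample_checked(1, 50, 100);
```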

If you'd prefer a simple function to generate n random unique numbers:

DROP FUNCTION IF EXISTS sample(low INT, high INT, total INT);
CREATE OR REPLACE FUNCTION sample(low INT, high INT, total INT) 
RETURNS TABLE (Cust_id INT) 
AS 
$$ 
BEGIN
    RETURN query
    SELECT *
    FROM generate_series(low, high)
    ORDER BY random() LIMIT total;  
END $$ 
LANGUAGE plpgsql;

-- create a table of unique random values
SELECT * INTO Customer FROM sample(100, 200, 10);

3 Comments

Yes the range will produce duplicates, I will take care of the range now. Thanks. I changed the range to 1-150 and 100 rows, still, it gives duplicate numbers, can you tell me how can I use On conflict with it?
@Chloe: neither of the functions should produce duplicates, and none of my testing has shown any.
@Chloe: demo on sqlfiddle: sqlfiddle.com/#!17/b36f5a/8

As said before, you have a range between 1 and 50 and you want to create 100 records. That will never be unique. And your query doesn't ask for unique values anyway, so even with a million records you can have duplicates.

Your code can also be much simpler, with no loop and just a single query:

DROP FUNCTION IF EXISTS aa_dev.rand_cust ( low INT, high INT, total INT );
CREATE OR REPLACE FUNCTION aa_dev.rand_cust ( low INT, high INT, total INT ) 
RETURNS TABLE ( Cust_id INT ) 
AS 
$$ 
BEGIN
------------------- Creating a customer table with Cust_id----------------------------
    DROP TABLE IF EXISTS aa_dev.Customer;
    CREATE TABLE IF NOT EXISTS aa_dev.Customer ( Cust_id INT );
--------------------- No Loop to insert random -----------------------

    RETURN query
    INSERT INTO aa_dev.Customer ( Cust_id )
    SELECT FLOOR ( random( ) * ( high - low + 1 ) + low ) -- no uniqueness!
    FROM    generate_series(1, total) -- no loop needed
    RETURNING -- returns the id's you generated
        Customer.Cust_id;
    

END $$ 
LANGUAGE plpgsql;

SELECT
    * 
FROM
    aa_dev.rand_cust ( 1, 50, 100 );
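
Since this version makes no attempt at uniqueness, it can help to see how many duplicates a run actually produced. The queries below are a minimal sketch that assumes the aa_dev.customer table created by the function above.

```sql
-- Compare the number of rows with the number of distinct ids.
SELECT count(*)                AS total_rows,
       count(DISTINCT cust_id) AS distinct_ids
FROM aa_dev.customer;

-- List every id that appears more than once.
SELECT cust_id, count(*) AS occurrences
FROM aa_dev.customer
GROUP BY cust_id
HAVING count(*) > 1
ORDER BY occurrences DESC, cust_id;
```

If total_rows and distinct_ids differ, the second query shows which values were repeated.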

3 Comments

How would you handle duplicate prevention?
At least a constraint on the table. But why do you create id's like this? It just looks like some sequence, and that's standard within PostgreSQL.
This is helpful as it has eliminated the loop, but the problem for non-unique values is still there even if I choose a wider range.
