I am a student learning PostgreSQL, and I am trying to figure out how to seed a database with random data. I have 10M users and 100 stocks.

Currently, my tables look like this:

CREATE TABLE user (
  user_id INTEGER NOT NULL,   
  amount_of_stocks [][] array, -- this is just assumption
  PRIMARY KEY (user_id)
);

CREATE TABLE stock (
  stock_id INTEGER NOT NULL,   
  amount_per_stock INT,
  quantity INT,
  PRIMARY KEY (stock_id)
);

How would I store 100 different stocks for each user?

4 Comments
  • Use 4 SPACE characters indent to auto-format your code snippets. I made that edit for you. And welcome to Stack Overflow. Commented Dec 18, 2019 at 1:27
  • en.wikipedia.org/wiki/First_normal_form Commented Dec 18, 2019 at 1:28
  • Your title does not seem to fit the body of your question. Commented Dec 18, 2019 at 1:28
  • Please fix your title to match the question. Commented Dec 18, 2019 at 2:20

2 Answers

Sounds like a classical many-to-many relationship. Should not involve arrays at all. Assuming Postgres 10 or later, use something along these lines:

CREATE TABLE users (  -- "user" is a reserved word!
  user_id  int GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, username text UNIQUE NOT NULL  -- or similar
);

CREATE TABLE stock (
  stock_id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, stock    text UNIQUE NOT NULL  -- or similar
);

CREATE TABLE user_stock (
  user_id  int REFERENCES users
, stock_id int REFERENCES stock
, amount   int NOT NULL
, PRIMARY KEY (user_id, stock_id)
);
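
To illustrate how this layout is read back, here is a minimal sketch of a query that lists one user's holdings by joining through user_stock. The user id 1 is only a placeholder and not something from the question.

```sql
-- Holdings of a single user, with names resolved through the junction table.
SELECT u.username, s.stock, us.amount
FROM   user_stock us
JOIN   users u ON u.user_id  = us.user_id
JOIN   stock s ON s.stock_id = us.stock_id
WHERE  us.user_id = 1   -- placeholder user id
ORDER  BY s.stock;
```

Because (user_id, stock_id) is the primary key, a lookup by user_id can use the primary key index.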

Seed

Postgres provides generate_series() to conveniently generate rows. random() is instrumental in generating random numbers:

INSERT INTO users(username)
SELECT 'user_' || g
FROM   generate_series(1, 10000000) g; -- 10M (!) - try with just 10 first

INSERT INTO stock(stock)
SELECT 'stock_' || g
FROM   generate_series(1, 100) g;
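
To get a feel for what the two functions return before inserting anything, a preview along these lines may help. It is only a sketch; the column aliases are arbitrary.

```sql
-- Preview only: nothing is inserted.
SELECT g                          AS n
     , 'user_' || g               AS username
     , random()                   AS r        -- double precision, 0 <= r < 1
     , ceil(random() * 1000)::int AS amount   -- practically always 1 .. 1000
FROM   generate_series(1, 5) g;
```

Each call to random() is evaluated separately for every row, so r and amount are independent of each other.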

Experiment with a small number of users first. 10M users * 100 stocks produces a billion rows in user_stock, which takes a while to insert and occupies considerable disk space.
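
After a small trial run, the row count and on-disk size can be checked and scaled up for a rough estimate. A minimal sketch, using the table name from above:

```sql
-- Number of rows seeded so far.
SELECT count(*) AS user_rows FROM users;

-- Total size of the table including its indexes, in human-readable form.
SELECT pg_size_pretty(pg_total_relation_size('users')) AS users_total_size;
```

The same call works for 'user_stock' once that table has been filled.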

How would I store 100 different stocks for each user?

INSERT INTO user_stock
        (user_id,   stock_id, amount)
SELECT u.user_id, s.stock_id, ceil(random() * 1000)::int
FROM   users u, stock s;  -- cross join

Every user gets 100 different stocks, though everyone gets the same set in this basic example, since you did not specify the requirement more closely. I added a random amount per stock between 1 and 1000.
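
If users should hold different sets of stocks instead, one possible variation is to keep only a random share of the combinations. This is a sketch under assumptions: the 10 % ratio is made up, and each user ends up with about 10 stocks on average rather than a fixed number.

```sql
-- Run this INSTEAD of the full cross join, on an empty user_stock table;
-- otherwise the primary key (user_id, stock_id) raises duplicate-key errors.
INSERT INTO user_stock
        (user_id,   stock_id, amount)
SELECT u.user_id, s.stock_id, 1 + floor(random() * 1000)::int  -- exactly 1 .. 1000
FROM   users u
CROSS  JOIN stock s
WHERE  random() < 0.1;  -- assumed ratio: keep about 10 % of all pairs
```

The expression 1 + floor(random() * 1000) is used here because ceil(random() * 1000) can return 0 in the rare case that random() returns exactly 0.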

3 Comments

Thank you for the explanation, really helpful
Hi Erwin, what do you use to format your SQL statements? Do you do this by hand or with a tool?
@ErwinRooijakkers: I do it by hand, mostly. Seems tedious, but it helps me process, structure and understand complex queries a lot. I prefer this compact form over the often noisy format of various clients and frameworks. Actually, I like this special format for INSERT (see update) to see which expression goes into which column at a glance.
0
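CREATE TABLE users (  -- renamed: "user" is a reserved word in Postgres
  user_id integer NOT NULL,
  stocks  int[],      -- array of stock_id values (integer, to match stock.stock_id)
  PRIMARY KEY (user_id)
);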

Store a list of the primary keys from your stock table in the array, so you can easily look up their values with a SELECT statement.

If you wanted to, you could make the array two-dimensional and store the value too, but that violates some principle, I'm sure, as you already have a table for that purpose.
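
As a rough sketch of such a lookup, assuming the array-based table is created under the name users (user is a reserved word) and stocks is an int[] holding stock_id values, the array can be matched against the stock table with = ANY:

```sql
-- All stock rows referenced by one user's array (placeholder user id 1).
SELECT s.*
FROM   users u
JOIN   stock s ON s.stock_id = ANY (u.stocks)
WHERE  u.user_id = 1;
```

Note that Postgres cannot enforce a foreign key on array elements, so nothing stops the array from containing ids that do not exist in stock.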
