I am following the example of how to use 'create statistics' in postgresql from this page.
CREATE TABLE tbl (
col1 int,
col2 int
);
INSERT INTO tbl SELECT i/10000, i/100000
FROM generate_series (1,10000000) s(i);
CREATE STATISTICS s1 (dependencies) on col1, col2 from tbl;
ANALYZE tbl;
SELECT stxname, stxkeys, stxdependencies
FROM pg_statistic_ext
WHERE stxname = 's1';
returns:
stxname,stxkeys,stxdependencies
s1,1 2,{"1 => 2": 1.000000}
However, I'm confused by this line
Looking at this, we can see that Postgres realizes that col1 fully determines col2 and therefore has a coefficient of 1 to capture that information.
In the example there are many records in col2 that are repeated. So how does col1 'fully determine' col2? What does this mean?