3

I have a simple Postgres dataset that looks like this:

INSERT INTO mytable (day, person)
values
('Monday', 'A'),
('Monday', 'B'),
('Tuesday', 'A'),
('Thursday', 'B');

I then run a query that yields two array_aggs as follows:

SELECT *
FROM (select day as d1,
             array_agg(distinct person) as agg1
      from mytable
      group by day) AS AA
   cross join
     (select day as d2,
             array_agg(distinct person) as agg2
      from mytable
      group by day) AS BB

which yields this dataset:

Monday, {A,B}, Monday, {A,B}
Monday, {A,B}, Thursday, {B}
Monday, {A,B}, Tuesday, {A}
Thursday, {B}, Monday, {A,B}
Thursday, {B}, Thursday, {B}
Thursday, {B}, Tuesday, {A}
Tuesday, {A}, Monday, {A,B}
Tuesday, {A}, Thursday, {B}
Tuesday, {A}, Tuesday, {A}

I would like to add a fifth column to this query that identifies that number of repeat entries in agg1 and agg2 across each row.

So for example, the first row would be 2 and the second row would be 1. I was hoping to do it as follows, but this gives me a ambiguous syntax error:

SELECT *, count(select unnest(agg1) intersect select unnest(agg2))
FROM (select day as d1,
             array_agg(distinct person) as agg1
      from mytable
      group by day) AS AA
   cross join
     (select day as d2,
             array_agg(distinct person) as agg2
      from mytable
      group by day) AS BB
0

2 Answers 2

1

Postgresql has LATERAL.

Which can be used to do something with the content of fields on record level.

create table mytable (day varchar(30), person varchar(1));
INSERT INTO mytable (day, person)
values
('Monday', 'A'),
('Monday', 'B'),
('Tuesday', 'A'),
('Thursday', 'B');
SELECT *
FROM (
  select day as d1,
             array_agg(distinct person) as agg1
      from mytable
      group by day) AS AA
   cross join
     (select day as d2,
             array_agg(distinct person) as agg2
      from mytable
      group by day
) AS BB
CROSS JOIN LATERAL 
(
   SELECT COUNT(*) AS MatchingPersons
   FROM
   (
     SELECT unnest(agg1) person
     INTERSECT
     SELECT unnest(agg2)
   ) q
) lat 
d1       | agg1  | d2       | agg2  | matchingpersons
:------- | :---- | :------- | :---- | --------------:
Monday   | {A,B} | Monday   | {A,B} |               2
Thursday | {B}   | Monday   | {A,B} |               1
Tuesday  | {A}   | Monday   | {A,B} |               1
Monday   | {A,B} | Thursday | {B}   |               1
Thursday | {B}   | Thursday | {B}   |               1
Tuesday  | {A}   | Thursday | {B}   |               0
Monday   | {A,B} | Tuesday  | {A}   |               1
Thursday | {B}   | Tuesday  | {A}   |               0
Tuesday  | {A}   | Tuesday  | {A}   |               1

db<>fiddle here

Sign up to request clarification or add additional context in comments.

1 Comment

@user2242044 Thx. Btw, you can get the same result if you use that query with the count as a correlated sub-query. select *, (select count(*) ...) as matches from .... But a correlated sub-query can only return 1 column. While with LATERAL the SQL is more readable, and you could include more calculated columns.
0

Using the function from this answer, you could write:

SELECT *, array_length(array_intersect(arr1, arr2), 1) AS repeat_count
FROM /* your query */

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.