
I want to sort a table by an array column. The record with the most overlaps should come first. I already have a WHERE clause that filters the records by array overlap. I want to use the same array to determine the number of overlaps for sorting. Any idea what the ORDER BY clause might look like?

My Table

SELECT * FROM "nodes"
+-----------+---------------------------+
| name      |            tags           |
+-----------+---------------------------+
| Max       | ["foo", "orange", "app"]  |
| Peter     | ["foo", "bar", "baz"]     |
| Maria     | ["foo", "bar"]            |
| John      | ["apple"]                 |
+-----------+---------------------------+

Result with where

SELECT * FROM "nodes" WHERE (tags && '{"foo", "bar", "baz"}')
+-----------+---------------------------+
| name      |            tags           |
+-----------+---------------------------+
| Max       | ["foo", "orange", "app"]  |
| Peter     | ["foo", "bar", "baz"]     |
| Maria     | ["foo", "bar"]            |
+-----------+---------------------------+

Result with Order

SELECT * FROM "nodes" WHERE (tags && '{"foo", "bar", "baz"}') ORDER BY ????
+-----------+---------------------------+
| name      |            tags           |
+-----------+---------------------------+
| Peter     | ["foo", "bar", "baz"]     |
| Maria     | ["foo", "bar"]            |
| Max       | ["foo", "orange", "app"]  |
+-----------+---------------------------+

2 Answers


The only thing I can think of is to create a function that computes the number of common elements:

create or replace function num_overlaps(p_one text[], p_other text[])
  returns bigint
as
$$
  select count(*)
  from (
    select *
    from unnest(p_one)
    intersect   
    select *
    from unnest(p_other)
  ) x
$$
language sql
immutable;

Then use it in the order by clause:

SELECT *
FROM nodes 
WHERE tags && '{"foo", "bar", "baz"}'
order by num_overlaps(tags, '{"foo", "bar", "baz"}') desc;

The drawback is that you need to repeat the list of tags you are testing for.
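
One possible way around the repetition is to define the tag list once in a derived table and reference it in both clauses. This is only a sketch: it assumes tags is a native text[] column and that num_overlaps() from above exists; the alias q and the column name wanted are made up for illustration.

```sql
-- Assumption: tags is text[]; q and wanted are illustrative names.
SELECT n.*
FROM nodes AS n
CROSS JOIN (VALUES ('{foo,bar,baz}'::text[])) AS q(wanted)  -- tag list written once
WHERE n.tags && q.wanted                                    -- filter on overlap
ORDER BY num_overlaps(n.tags, q.wanted) DESC;               -- most overlaps first
```

Whether an index on tags is still used with this form depends on the planner, so it is worth checking with EXPLAIN.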


It's unclear to me whether those values are JSON arrays (the syntax in the sample data suggests so) or native Postgres arrays (the && operator doesn't work for JSON arrays). If you are using jsonb, you can replace unnest() with jsonb_array_elements_text().
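
For the jsonb case, the substitution could look roughly like the following. This is a sketch under the assumption that tags is a jsonb column; the function name num_overlaps_jsonb is hypothetical, and the filter uses the jsonb ?| operator because && is not defined for jsonb.

```sql
-- Assumption: tags is jsonb; num_overlaps_jsonb is a hypothetical name.
create or replace function num_overlaps_jsonb(p_one jsonb, p_other jsonb)
  returns bigint
as
$$
  select count(*)
  from (
    select * from jsonb_array_elements_text(p_one)
    intersect
    select * from jsonb_array_elements_text(p_other)
  ) x
$$
language sql
immutable;

SELECT *
FROM nodes
WHERE tags ?| array['foo', 'bar', 'baz']   -- any of these strings is an element of tags
ORDER BY num_overlaps_jsonb(tags, '["foo", "bar", "baz"]') DESC;
```

Because INTERSECT removes duplicates, a tag that appears twice in one row is still counted once.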


First of all, both operands of the && operator must be arrays. With tags stored as JSON, use STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[] instead of tags, and STRING_TO_ARRAY('foo,bar,baz', ',') instead of the '{"foo", "bar", "baz"}' pattern.

Then you can unnest the elements of the tags column with JSON_ARRAY_ELEMENTS() and count how many of the returned values occur in the '{"foo", "bar", "baz"}' pattern, using the STRPOS() and SIGN() functions together with SUM() aggregation:

SELECT name, tags::text
  FROM "nodes" 
 CROSS JOIN JSON_ARRAY_ELEMENTS(tags) AS js
 WHERE ( STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[] 
      && STRING_TO_ARRAY('foo,bar,baz',','))
 GROUP BY name, tags::text    
 ORDER BY SUM( SIGN( STRPOS('{"foo", "bar", "baz"}'::text,value::text) ) ) DESC

However, the tags column may contain repeated elements. In that case the query above counts the same element more than once. So I suggest using the query below, which eliminates the duplicate rows with the DISTINCT keyword:

SELECT name, tags 
  FROM
  (
   SELECT DISTINCT name, tags::text, STRPOS('{"foo", "bar", "baz"}'::text,value::text)
     FROM "nodes" 
    CROSS JOIN JSON_ARRAY_ELEMENTS(tags) AS js
    WHERE ( STRING_TO_ARRAY(translate(tags::text, '[] "', ''), ',')::text[] 
         && STRING_TO_ARRAY('foo,bar,baz',','))
  ) n    
  GROUP BY name, tags::text    
  ORDER BY SUM( SIGN( strpos ) ) DESC

