I have a Postgres (version 11) table with two columns that each hold comma-separated values, as shown below. Can the two columns be compared to identify which particular values are missing?

Category                                       Comparison category
2219,2220,2991,2992,3577,3617,3624,3884        2992,3617,3884
2145,2150,3594,3597,3600,3626                  2150,3594,3600,3626
2237                                           2237
2991,2992,3884                                 2991,2992,3884
2991,3884                                      2991,3884
2145,2993,3597,3631                            2993,3631
1113,2882,3490,4,4034,922,985                  2882,3490,4,4034,922,985

Expected output:

Category                          Comparison category   Comments
2219,2220,2991,2992..             2992,3617,3884        2219,2220,.. not in the comparison category column, but all categories in the comparison column are present in the Category column.
2145,2150,3594,3597,3600,3626     2150,3594,3600,3626   2145,3597 not in the comparison category column, but all categories in the comparison column are present in the Category column.
2237                              2237                  Matching
2991,2992,3884                    2991,2992,3884        Matching
2991,3884                         2991,3884             Matching
2145,2993,3597,3631               2993,3631             2145,3597 not in the comparison category column, but all categories in the comparison column are present in the Category column.
  • This would be so easy with a properly normalized data model. Do you have a chance to fix the broken database design? Commented Aug 12, 2020 at 16:09
  • @a_horse_with_no_name This is how the data arrives. So what you are suggesting is to split the comma-separated values into rows using regex_split and compare them? Commented Aug 12, 2020 at 16:17

1 Answer

If you can install the intarray extension, you can do this:

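-- int[] - int[] (from intarray) removes the right array's entries from the left array
select category, comparison_category, 
       string_to_array(category, ',')::int[] - string_to_array(comparison_category, ',')::int[] as missing_values
from the_table;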
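
A minimal, self-contained sketch of the same idea, assuming you have the privileges to run `create extension`. The inline `values` list only stands in for `the_table` (two rows taken from the question), and the alias `not_in_comparison` is an illustrative name, not something from the original query:

```sql
-- Needs to be run once per database (requires sufficient privileges).
create extension if not exists intarray;

-- Inline sample rows stand in for the_table.
select category, comparison_category,
       string_to_array(category, ',')::int[]
         - string_to_array(comparison_category, ',')::int[] as not_in_comparison
from (values
        ('2145,2993,3597,3631', '2993,3631'),
        ('2991,3884',           '2991,3884')
     ) as t(category, comparison_category);
```

For the first row this should leave 2145 and 3597; for the second, where both lists hold the same values, the result should be an empty array. Note that the cast to int[] fails if any element is not a valid integer.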

If you can't install the extension, it's a bit more complicated:

select category, comparison_category, 
       -- values of category that do not appear in comparison_category (NULL if none)
       (select string_agg(c,',')
        from unnest(string_to_array(category, ',')) as x(c)
        where x.c not in (select cc
                          from unnest(string_to_array(comparison_category, ',')) as t(cc))
        ) as missing_values
from the_table;
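
To get closer to the "Comments" column from the question, the leftover values could be wrapped in a `case` expression. This is only a sketch under a few assumptions: the comment wording and the names `missing`, `comments`, `ord`, `m` and `u` are made up for illustration, the inline `values` list stands in for `the_table`, and only one direction is checked (values in category that are absent from comparison_category):

```sql
-- No extension needed; inline sample rows stand in for the_table.
select t.category, t.comparison_category,
       case
         when m.missing is null then 'Matching'
         else m.missing || ' not in comparison category'
       end as comments
from (values
        ('2145,2993,3597,3631', '2993,3631'),
        ('2237',                '2237')
     ) as t(category, comparison_category)
cross join lateral (
    -- string_agg() returns NULL when no value is left over;
    -- WITH ORDINALITY keeps the values in their original order.
    select string_agg(x.c, ',' order by x.ord) as missing
    from unnest(string_to_array(t.category, ',')) with ordinality as x(c, ord)
    where x.c not in (select u.cc
                      from unnest(string_to_array(t.comparison_category, ',')) as u(cc))
) as m;
```

Because the lateral subquery is an aggregate without `group by`, it always returns exactly one row, so the `cross join` never drops a row of the outer table.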

But the proper solution would be to fix the data model.
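
As a rough illustration of what a normalized model might look like (all table and column names here are hypothetical, since the question does not show the real schema), each comma-separated list becomes one row per value, and the comparison turns into a plain set operation:

```sql
-- Hypothetical normalized layout: one row per (item, category) pair.
create table item_category (
    item_id     int not null,
    category_id int not null,
    primary key (item_id, category_id)
);

create table item_comparison_category (
    item_id     int not null,
    category_id int not null,
    primary key (item_id, category_id)
);

-- Categories of each item that have no counterpart in the comparison set.
select item_id, category_id
from item_category
except
select item_id, category_id
from item_comparison_category
order by item_id, category_id;
```

With this layout no string splitting or casting is needed, and both tables can be indexed and constrained in the usual way.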

3 Comments

I'm not seeing string_to_array in extension docs. Could you not use regexp_split_to_array(category, ',')?
@AdrianKlaver: string_to_array() is a built-in function and is typically faster than regexp_split_to_array(). The ability to use int[] - int[] is provided by the mentioned extension
Yeah, I found that out when I tried the example. Thanks.