0

A colleague of mines encountered this problem while working on a COBOL program and eventually solved it at the application level. Still I am curious if it is possible to solve it on the data access level, with SQL. This is somehow related to this other question, but I'd like to use only ANSI SQL.

I'm looking for a single SQL select query that acts on a VARCHAR field that contains variable length CSV rows. The purpose of the query is to split every CSV field in its own result set row.

Here is an example with schema and data (and here is the fiddle):

CREATE TABLE table1 (`field` varchar(100));

 INSERT INTO table1 (`field`)
      VALUES
             ('Hello,world,!')    ,
             ('Haloa,!')          ,
             ('Have,a,nice,day,!');

Here is the output I'd like to have from the query:

Hello
world
!
Haloa
!
Have
a
nice
day
!

The CSV separator used is the comma, and for now I wouldn't worry about escaping.

10
  • Depends on your DBMS. There are some implementation of a split function (much like in many languages) and you would need to invoke if for each table record (field) with , as separator. If not available in your DBMS, you can write a simple function that would return a simple array/cursor/result set. Commented Sep 7, 2016 at 12:18
  • Tag the dbms you're using. Commented Sep 7, 2016 at 13:04
  • 1
    Not storing comma-separated values into an SQL table in the first place prevents many headaches. You seem to have control over the database - design it properly instead of wasting your time with the creation of workarounds for completely avoidable problems. Commented Sep 7, 2016 at 13:17
  • 1
    @Tomalak you couldn't be more right, but no, I don't have any control over the database, that is a 30 years old bank database. In these kind of systems it happens that poor design choices have layered over the years. Replacing it with a more proper table design would cost too much and nobody wants to be responsible for that much of an effort, therefor you can just go along with it. Commented Sep 7, 2016 at 13:26
  • 1
    Yes, unnest() is ANSI SQL and will turn an array in a list of rows. The question is: how to turn a CSV string into an array. Commented Sep 7, 2016 at 13:45

3 Answers 3

4

As far as I can tell, this is ANSI SQL:

with recursive word_list (field, word, rest, field_id, level) as (             
  select field, 
         substring(field from 1 for position(',' in field) - 1) as word,
         substring(field from position(',' in field) + 1) as rest,
         row_number() over () as field_id,
         1
  from table1
  union all 
  select c.field, 
         case 
            when position(',' in p.rest) = 0 then p.rest
            else substring(p.rest from 1 for position(',' in p.rest) - 1) 
         end as word,
         case 
            when position(',' in p.rest) = 0 then null
            else substring(p.rest from position(',' in p.rest) + 1) 
         end as rest, 
         p.field_id,
         p.level + 1
  from table1 as c
    join word_list p on c.field = p.field and position(',' in p.rest) >= 0
) 
select word
from word_list
order by field_id, level;
     

This assumes that the values in field are unique.

Here is a running example

Sign up to request clarification or add additional context in comments.

1 Comment

this is amazing, here is an overview of the databases in which this could work: en.wikipedia.org/wiki/…
0

In Oracle you could use something like that (maybe it's not the most elegant but it gives the result you want) - simply replace tab with your_table_name:

WITH 
tab2 AS (
SELECT t.field,
       CASE WHEN INSTR(t.field, ',', 1, 1) > 0 AND regexp_count(t.field,',') >= 1 THEN INSTR(t.field, ',', 1, 1) ELSE NULL END AS pos1,
       CASE WHEN INSTR(t.field, ',', 1, 2) > 0 AND regexp_count(t.field,',') >= 2 THEN INSTR(t.field, ',', 1, 2) ELSE NULL END AS pos2,
       CASE WHEN INSTR(t.field, ',', 1, 3) > 0 AND regexp_count(t.field,',') >= 3 THEN INSTR(t.field, ',', 1, 3) ELSE NULL END AS pos3,
       CASE WHEN INSTR(t.field, ',', 1, 4) > 0 AND regexp_count(t.field,',') >= 4 THEN INSTR(t.field, ',', 1, 4) ELSE NULL END AS pos4,
       CASE WHEN INSTR(t.field, ',', 1, 5) > 0 AND regexp_count(t.field,',') >= 5 THEN INSTR(t.field, ',', 1, 5) ELSE NULL END AS pos5,
       CASE WHEN INSTR(t.field, ',', 1, 6) > 0 AND regexp_count(t.field,',') >= 6 THEN INSTR(t.field, ',', 1, 6) ELSE NULL END AS pos6
FROM tab t
),
tab3 AS (
SELECT SUBSTR(tt.field,1,tt.pos1-1) AS col1,
       SUBSTR(tt.field,tt.pos1+1, CASE WHEN tt.pos2 IS NULL THEN LENGTH(tt.field) - tt.pos1 ELSE tt.pos2 - tt.pos1 - 1 END) AS col2,
       SUBSTR(tt.field,tt.pos2+1, CASE WHEN tt.pos3 IS NULL THEN LENGTH(tt.field) - tt.pos2 ELSE tt.pos3 - tt.pos2 - 1 END) AS col3,
       SUBSTR(tt.field,tt.pos3+1, CASE WHEN tt.pos4 IS NULL THEN LENGTH(tt.field) - tt.pos3 ELSE tt.pos4 - tt.pos3 - 1 END) AS col4,
       SUBSTR(tt.field,tt.pos4+1, CASE WHEN tt.pos5 IS NULL THEN LENGTH(tt.field) - tt.pos4 ELSE tt.pos5 - tt.pos4 - 1 END) AS col5,
       SUBSTR(tt.field,tt.pos5+1, CASE WHEN tt.pos6 IS NULL THEN LENGTH(tt.field) - tt.pos5 ELSE tt.pos6 - tt.pos5 - 1 END) AS col6
       ,ROWNUM AS r
FROM tab2 tt
),
tab4 AS (
SELECT ttt.col1 AS col FROM tab3 ttt WHERE r  = 1
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r  = 1
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r  = 1
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r  = 1
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r  = 1
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r  = 1
UNION ALL
SELECT ttt.col1 FROM tab3 ttt WHERE r  = 2
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r  = 2
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r  = 2
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r  = 2
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r  = 2
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r  = 2
UNION ALL
SELECT ttt.col1 FROM tab3 ttt WHERE r  = 3
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r  = 3
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r  = 3
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r  = 3
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r  = 3
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r  = 3
UNION ALL
SELECT ttt.col1 FROM tab3 ttt WHERE r  = 4
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r  = 4
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r  = 4
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r  = 4
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r  = 4
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r  = 4
UNION ALL
SELECT ttt.col1 FROM tab3 ttt WHERE r  = 5
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r  = 5
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r  = 5
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r  = 5
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r  = 5
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r  = 5
)
SELECT col
FROM tab4
WHERE col IS NOT NULL

it gives me the result:

1   Hello
2   world
3   !
4   Haloa
5   !
6   Have
7   a
8   nice
9   day
10  !

Comments

0

FWIW, here's another Oracle specific method. Maybe it will give an idea or help a future searcher at least.

SQL> with tbl(rownbr, col1) as (
           select 1, 'Hello,world,!'     from dual union
           select 2, 'Haloa,!'           from dual union
           select 3, 'Have,a,nice,day,!' from dual
      )
   SELECT rownbr, column_value substring_nbr,
        regexp_substr(col1, '(.*?)(,|$)', 1, column_value, null, 1)
    FROM tbl,
                TABLE(
                  CAST(
                    MULTISET(SELECT LEVEL
                                FROM dual
                                CONNECT BY LEVEL <= REGEXP_COUNT(col1, ',')+1
                            ) AS sys.OdciNumberList
                  )
                )
      order by rownbr, substring_nbr;

    ROWNBR SUBSTRING_NBR REGEXP_SUBSTR(COL
---------- ------------- -----------------
         1             1 Hello
         1             2 world
         1             3 !
         2             1 Haloa
         2             2 !
         3             1 Have
         3             2 a
         3             3 nice
         3             4 day
         3             5 !

10 rows selected.

SQL>

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.