I am loading data from a CSV file into a table using sqlldr. One column is not present in every row of the CSV file, but the data needed to populate it is present in one of the other columns of the row. I need to split that column's value on the period character and use one of the pieces to populate the missing column.

For example:

 column1: abc.xyz.n

So the missing column (column2) should be

 column2: xyz

Also, there is another column that is present in the row, but its value is not what I want to load into the table. It also needs to be populated from column1, but there are around 50 if-else cases involved. Is DECODE preferable for this?

column1: abc.xyz.n

Then,

column2: 'hi' if column1 contains 'abc'
         'hello' if column1 contains 'abd'

...and so on, for around 50 cases in total.

Thanks for help.

1 Answer

For the first part of your question, define the column1 data in the control file as BOUNDFILLER, with a name that does not match any table column name; this tells sqlldr to remember the value but not load it directly. If you also need to load it into a column, reference the remembered name in an expression for that column. For column2, use the remembered BOUNDFILLER name in an expression that returns the part you need (in this case the 2nd field, allowing for NULLs):

  x        boundfiller,
  column1  EXPRESSION  ":x",
  column2  EXPRESSION  "REGEXP_SUBSTR(:x, '(.*?)(\\.|$)', 1, 2, NULL, 1)"

Note the double backslash is needed; otherwise one backslash is stripped as the string is passed from sqlldr to the regex engine, and the pattern is altered incorrectly. A quirk, I guess.

Anyway, after this, column1 ends up with "abc.xyz.n" and column2 gets "xyz".
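If you want to sanity-check that pattern outside the database, here is a quick Python sketch of the same extraction logic (Python's re engine treats this pattern the same way; the helper name nth_field is made up for illustration):

```python
import re

def nth_field(value, n):
    r"""Return the n-th period-delimited field (1-based) of value,
    mimicking REGEXP_SUBSTR(value, '(.*?)(\.|$)', 1, n, NULL, 1).
    Returns None (SQL NULL) when the field does not exist."""
    # Each match of the lazy group is one field; the delimiter is a
    # period or end-of-string, so trailing fields are captured too.
    fields = re.findall(r'(.*?)(?:\.|$)', value)
    if n <= len(fields) and fields[n - 1]:
        return fields[n - 1]
    return None

print(nth_field('abc.xyz.n', 2))  # xyz
```

The 4th argument to REGEXP_SUBSTR (the occurrence) corresponds to n here, which is why changing it selects a different field.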

For the second part of your question, you could use an expression as already shown, but have it call a custom function you create: pass in the extracted value and return the matching value from a lookup table. You certainly don't want to hardcode your 50 lookup values. You could do essentially the same thing in a table-level trigger too. Note the SELECT statement below is for illustration only; this logic should be encapsulated in a function for reusability and maintainability:

Just to show you can do it:

 col2  EXPRESSION  "(select 'hello' from dual where REGEXP_SUBSTR(:x, '(.*?)(\\.|$)', 1, 2, NULL, 1) = 'xyz')"

The right way:

 col2  EXPRESSION  "(myschema.mylookupfunc(REGEXP_SUBSTR(:x, '(.*?)(\\.|$)', 1, 2, NULL, 1)))"

mylookupfunc returns the result of looking up 'xyz' in the lookup table, i.e. 'hello' as per your example.
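The idea can be sketched in Python as well: the ~50 cases become rows of data in a mapping (standing in for the database lookup table that a function like myschema.mylookupfunc would query), not branches in code. The mapping contents here are hypothetical, taken from the example above:

```python
# Hypothetical lookup data, standing in for the database lookup table.
LOOKUP = {
    'xyz': 'hello',  # per the example above
    # ...the remaining ~50 cases are added here as data, not as if-else
}

def mylookupfunc(key):
    """Return the mapped value for key, or None (SQL NULL) when absent."""
    return LOOKUP.get(key)

print(mylookupfunc('xyz'))  # hello
```

This is the maintainability win: adding or changing a case means editing one row of data, not redeploying 50-branch conditional logic.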


Comments

I will try to implement this and get back. I was thinking about using CASE statements for the second part and SUBSTR(:column1,INSTR(:column1,'.'),INSTR(:column1,'.',-1)) for the first part. Will these work too?
I have tried your method, works great and using a function for second part has made the job faster. Thanks
@jaydeep glad it worked for you. Regarding your first comment, both could work but would be confusing and more maintenance in the long run should changes be needed. This way a non-coder could update lookup values in the table should they change for instance.
If one of my columns is a#b#c#d, can you help with an expression to split it into four columns as a, b, c and d, as done above?
@Jaydeep It would be basically the same as above, just change the delimiter from a period to the pound sign, and the 4th argument to REGEXP_SUBSTR() would change for each field you want. Call it for each field.
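That variant can be sketched the same way in Python (same pattern with '#' as the delimiter; one extraction per field, varying the occurrence, which is the 4th REGEXP_SUBSTR argument):

```python
import re

value = 'a#b#c#d'
# One lazy-group match per '#'-delimited field; index i corresponds to
# occurrence i+1 in the REGEXP_SUBSTR call.
fields = [re.findall(r'(.*?)(?:#|$)', value)[i] for i in range(4)]
print(fields)  # ['a', 'b', 'c', 'd']
```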
