Requirement:

Generic query/function to check if the value provided in a varchar column in a table is actually a number & the precision does not exceed the allowed precision.

Available values:

Table_Name, Column_Name, Allowed Precision, Allowed Scale

The general advice would be to create a function and use to_number() to validate the value; however, that won't validate the allowed length (precision - scale).

My solution:

Validate that the value is a number using a regexp: NOT REGEXP_LIKE(COLUMN_NAME, '^-?[0-9.]+$')

Validate the length of the left component (the digits before the decimal point; I don't know what it is actually called), because for the scale Oracle automatically rounds if required. As the actual column is a varchar, I will use substr and instr to find the component to the left of the decimal point.

As the above regexp allows values like 123...123124..55, I will also validate the number of decimal points. [If > 1 then error]

Query to find invalid numbers:

Select * From Table_Name 
Where
(NOT REGEXP_LIKE(COLUMN_NAME, '^-?[0-9.]+$')
OR
Function_To_Fetch_Left_Component(COLUMN_NAME) > (Precision-Scale)
/* Could use regexp_substr now, but I already had a function for that */
OR
LENGTH(Column_Name) - LENGTH(REPLACE(Column_Name,'.','')) > 1
/* Could use regexp_count as well */
)

I was happy and satisfied with my solution until a column containing only '.' escaped my check and I saw the limitation of my checks. Although adding another check for this case would solve my problem, the solution as a whole looks very inefficient to me.
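A minimal sketch of why a lone '.' slips through, run against inline sample values rather than the real table (the val column and the sample rows are assumptions for illustration). The character class [0-9.] accepts '.' on its own. In addition, Oracle treats an empty string as NULL, so REPLACE('.', '.', '') returns NULL and the length difference is NULL rather than 1; even a correct count of 1 would not trip the > 1 test.

```sql
-- Sample values are illustrative only; replace the inline view with the real table.
SELECT val,
       CASE WHEN REGEXP_LIKE(val, '^-?[0-9.]+$')
            THEN 'accepted' ELSE 'rejected'
       END                                         AS regexp_check,
       -- NULL when val consists only of dots, because REPLACE returns NULL
       LENGTH(val) - LENGTH(REPLACE(val, '.', '')) AS dots_by_length,
       -- counts literal dots directly (REGEXP_COUNT needs 11g or later)
       REGEXP_COUNT(val, '\.')                     AS dots_by_count
FROM  (SELECT '.'    AS val FROM dual UNION ALL
       SELECT '1.5'         FROM dual UNION ALL
       SELECT '1..5'        FROM dual);
```

For the '.' row, regexp_check is 'accepted', dots_by_length is NULL and dots_by_count is 1, so none of the original conditions flags it.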

I will really appreciate a better solution [in any way].

Thanks in advance.

2 Answers

Look for:

  • One-or-more digits optionally followed by a decimal point and zero-or-more digits; or
  • A leading decimal point (no preceding unit digit) and then one or more (decimal) digits.

Like this:

Select *
From   Table_Name 
Where  NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(\d+(\.\d*)?|\.\d+)$')
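To see how this pattern treats the edge cases from the question, here is a small sketch run against inline sample values (the val column and the rows are made up for illustration). Note that rows where the column is NULL are not returned by NOT REGEXP_LIKE, because the condition evaluates to NULL rather than true.

```sql
-- Sample values are illustrative only.
SELECT val,
       CASE WHEN REGEXP_LIKE(val, '^[+-]?(\d+(\.\d*)?|\.\d+)$')
            THEN 'valid' ELSE 'invalid'
       END AS status
FROM  (SELECT '123'   AS val FROM dual UNION ALL  -- digits only: valid
       SELECT '123.'         FROM dual UNION ALL  -- trailing point, zero decimals: valid
       SELECT '.5'           FROM dual UNION ALL  -- leading point with a digit: valid
       SELECT '+0.50'        FROM dual UNION ALL  -- optional sign: valid
       SELECT '.'            FROM dual UNION ALL  -- point without any digit: invalid
       SELECT '-'            FROM dual UNION ALL  -- sign without any digit: invalid
       SELECT '1.2.3'        FROM dual UNION ALL  -- two decimal points: invalid
       SELECT '1e5'          FROM dual);          -- scientific notation: invalid
```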

If you do not want zero-padded values in the number string then:

Select *
From   Table_Name 
Where  NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(([1-9]\d*|0)(\.\d*)?|\.\d+)$')

With precision and scale (assuming it works as per a NUMBER( precision, scale ) data type and scale < precision):

Select *
From   Table_Name 
Where  NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(\d{1,'||(precision-scale)||'}(\.\d{0,'||scale||'})?|\.\d{1,'||scale||'})$')
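As a worked instance, assuming precision = 5 and scale = 2 (values chosen only for illustration), the concatenation above produces the pattern ^[+-]?(\d{1,3}(\.\d{0,2})?|\.\d{1,2})$. The sketch below builds it from a one-row params subquery, which is a hypothetical stand-in for wherever the allowed precision and scale are actually stored; the samples rows are likewise made up.

```sql
WITH params AS (
  -- hypothetical source of the allowed precision and scale
  SELECT 5 AS allowed_precision, 2 AS allowed_scale FROM dual
),
samples AS (
  SELECT '123.45' AS val FROM dual UNION ALL
  SELECT '1234'          FROM dual UNION ALL
  SELECT '.5'            FROM dual UNION ALL
  SELECT '12.345'        FROM dual
)
SELECT s.val
FROM   samples s
       CROSS JOIN params p
WHERE  NOT REGEXP_LIKE(
         s.val,
         '^[+-]?(\d{1,' || (p.allowed_precision - p.allowed_scale)
           || '}(\.\d{0,' || p.allowed_scale
           || '})?|\.\d{1,' || p.allowed_scale || '})$'
       );
```

This should return '1234' (four digits before the decimal point where only three are allowed) and '12.345' (three decimals where only two are allowed). The second case is stricter than a NUMBER(5,2) column, which would round the value rather than reject it.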

or, for non-zero-padded numbers with precision and scale:

Select *
From   Table_Name 
Where  NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(([1-9]\d{0,'||(precision-scale-1)||'}|0)(\.\d{0,'||scale||'})?|\.\d{1,'||scale||'})$')

or, for any precision and scale:

Select *
From   Table_Name 
Where  NOT REGEXP_LIKE(
             COLUMN_NAME,
             CASE
               WHEN scale <= 0
               THEN '^[+-]?(\d{1,'||precision||'}0{'||(-scale)||'})$'
               WHEN scale < precision
               THEN '^[+-]?(\d{1,'||(precision-scale)||'}(\.\d{0,'||scale||'})?|\.\d{1,'||scale||'})$'
               WHEN scale >= precision
               THEN '^[+-]?(0(\.0{0,'||scale||'})?|0?\.0{'||(scale-precision)||'}\d{1,'||precision||'})$'
             END
           )
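Since the question asks for a generic function, one possible way to package the CASE expression above is a small PL/SQL function that returns the pattern for a given precision and scale. This is only a sketch: the name number_pattern and its parameter names are hypothetical, and the three patterns are copied from the query above unchanged.

```sql
CREATE OR REPLACE FUNCTION number_pattern (
  p_precision IN NUMBER,
  p_scale     IN NUMBER
) RETURN VARCHAR2
  DETERMINISTIC
IS
BEGIN
  RETURN CASE
           WHEN p_scale <= 0 THEN
             -- integers only, with -scale trailing zeros
             '^[+-]?(\d{1,' || p_precision || '}0{' || (-p_scale) || '})$'
           WHEN p_scale < p_precision THEN
             -- digits on both sides of the decimal point
             '^[+-]?(\d{1,' || (p_precision - p_scale) || '}(\.\d{0,' || p_scale
               || '})?|\.\d{1,' || p_scale || '})$'
           ELSE
             -- scale >= precision: value is below 1, with leading decimal zeros
             '^[+-]?(0(\.0{0,' || p_scale || '})?|0?\.0{' || (p_scale - p_precision)
               || '}\d{1,' || p_precision || '})$'
         END;
END number_pattern;
/

-- Usage, with 5 and 2 as example values for precision and scale:
SELECT *
FROM   Table_Name
WHERE  NOT REGEXP_LIKE(COLUMN_NAME, number_pattern(5, 2));
```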

12 Comments

Just for my understanding: ^ - beginning... [+-]? - 0 or 1 occurrence of either + or -... ([1-9]\d*|0) - starting with 1-9 followed by 0 or 1 occurrence of digits, or a zero, so that 1 zero is allowed but multiple are not... (\.\d*)? - 0 or 1 occurrence of a decimal point + digits... OR... a decimal point followed by 1 or more digits... $ - end... Thank you for your time.
@pOrinG [1-9]\d* is starting with the digit 1-9 then followed by zero-or-more of any digit (0-9) so ([1-9]\d*|0) is either the [1-9]\d* pattern or a single 0 so that the integer part can be any number of digits but must start with a non-zero digit unless it is the single digit 0. The bit about decimals - yes.
Thank you so much.
Can you please clarify how this works with the variable precision and scale that the OP wanted?
@MatthewMcPeak Updated

The precision means that you want at most allowed_precision digits in the number (strictly speaking, not counting leading zeros, but I'll ignore that). The scale means that at most allowed_scale digits can appear after the decimal point.

This suggests a regular expression such as:

^[-]?[0-9]{1,<before>}[.]?[0-9]{0,<after>}$

You can construct the regular expression (anchored with ^ and $ so that the whole value has to match):

NOT REGEXP_LIKE(COLUMN_NAME,
                REPLACE(REPLACE('^[-]?[0-9]{1,<before>}[.]?[0-9]{0,<after>}$',
                                '<before>', allowed_precision - allowed_scale),
                        '<after>', allowed_scale))

Now, regular expressions whose pattern is built at run time can be quite inefficient. You can do the logic using LIKE and other functions as well. I think the conditions are:

(column_name not like '%.%.%' and
 column_name not like '_%-%' and
 translate(column_name, '0123456789-.x', 'x') is null and
 length(translate(column_name, '-.x', 'x')) <= allowed_precision and
 length(translate(column_name, '-.x', 'x')) >= 1 and
 instr(translate(column_name, '-.x', 'x'), '.') <= allowed_precision - allowed_scale
)
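A hedged sketch of how the LIKE/TRANSLATE idea could be written so that it runs, here returning the invalid rows. It assumes an allowed precision of 5 and scale of 2 and uses made-up inline sample values. As pointed out in the comments, the placeholder character 'x' has to come first in the TRANSLATE from-list so that every other listed character is removed. Only the digits before the decimal point are counted, on the assumption that Oracle rounds excess decimals on insert.

```sql
-- 5 and 2 below stand for the allowed precision and scale (example values).
SELECT val
FROM  (SELECT '123.45' AS val FROM dual UNION ALL
       SELECT '1234.5'        FROM dual UNION ALL
       SELECT '-123'          FROM dual UNION ALL
       SELECT '1-2'           FROM dual UNION ALL
       SELECT '12a'           FROM dual UNION ALL
       SELECT '.'             FROM dual)
WHERE  val IS NOT NULL  -- TRANSLATE(NULL, ...) is NULL and would otherwise be flagged
AND   (   TRANSLATE(val, 'x0123456789-.', 'x') IS NOT NULL  -- a character other than digits, '-' or '.'
       OR TRANSLATE(val, 'x-.', 'x') IS NULL                -- no digit at all, e.g. '.', '-' or '-.'
       OR val LIKE '%.%.%'                                  -- more than one decimal point
       OR val LIKE '_%-%'                                   -- a minus sign after the first character
       OR INSTR(LTRIM(val, '-') || '.', '.') - 1 > 5 - 2    -- too many digits before the decimal point
      );
```

With these samples the query should return '1234.5', '1-2', '12a' and '.'. Leading zeros count as digits here, so a value such as '0001.5' would also be flagged even though Oracle would accept it.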

4 Comments

Hi, the suggested regexp [-]?[0-9]{1,<before>}[.]?[0-9]{0,<after>} does not work as intended: if the decimal point is missing, it accepts a total of before + after digits, which exceeds the number of digits allowed to the left of the decimal point. E.g. for number(4,2) the query accepts 1111, which will be rejected by Oracle later. I would really appreciate it if you could explain translate a little more, because I am not able to understand the LIKE query with translate.
If the purpose of the condition translate(column_name, '0123456789-.x', 'x') is null is to find whether there is any character apart from the ones listed, then the syntax should be translate(column_name, 'x0123456789-.', 'x') is null; otherwise it will replace 0 with x instead of replacing all the digits with ''/blank/empty. Please advise.
Your final hint to drop the regexp led me to the following query, which cut the execution time by more than half in my tests! with x as (Select '012341' temp_test from dual) Select * From x WHERE (Translate(temp_test,'x0123456789-.','x') is not null OR temp_test = '.' OR temp_test = '-' OR temp_test like '%-%-%' OR temp_test like '%.%.%' OR Instr(temp_test||'.','.')-1 > 5 ); I hope I covered all the checks required. I am not particularly worried about accepting scientific notation, or about exceeding the precision because of the decimal component, as Oracle will take care of that.
Updated Query: with x as (Select '012341' temp_test from dual) Select * From x WHERE (Translate(temp_test,'x0123456789-.','x') is not null OR temp_test = '.' OR temp_test = '-' OR temp_test = '-.' OR Instr(temp_test,'-') > 1 OR temp_test like '%-%-%' OR temp_test like '%.%.%' OR Instr(temp_test||'.','.')-1 > 5 ) ;
