29

I am trying to locate some problematic records in a very large Oracle table. The column should contain all numeric data even though it is a varchar2 column. I need to find the records which don't contain numeric data (The to_number(col_name) function throws an error when I try to call it on this column).

1
  • Starting with 12.2, the conversion error can be caught using the ON CONVERSION ERROR clause of to_number. (Commented Jan 18, 2022 at 18:39)

11 Answers

30

I was thinking you could use a regexp_like condition and use the regular expression to find any non-numerics. I hope this might help?!

SELECT * FROM table_with_column_to_search WHERE REGEXP_LIKE(varchar_col_with_non_numerics, '[^0-9]+');

1 Comment

Actually, the answer should be SELECT * FROM table_with_column_to_search WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '\d'); because the original answer accepts things like "123TEST".
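
To compare the two patterns side by side, here is a minimal sketch run against a few made-up values. The `samples` inline view and its contents are hypothetical and not part of the original posts. `'[^0-9]'` matches any value that contains at least one non-digit, whereas `NOT REGEXP_LIKE(..., '\d')` only matches values that contain no digit at all, so it would not report '123TEST'.

```sql
-- Hypothetical sample data, for illustration only
WITH samples AS (
  SELECT '12345'   AS val FROM dual UNION ALL
  SELECT '123TEST' FROM dual UNION ALL
  SELECT 'TEST'    FROM dual
)
SELECT val,
       -- 'Y' when the value contains at least one non-digit character
       CASE WHEN REGEXP_LIKE(val, '[^0-9]') THEN 'Y' ELSE 'N' END AS has_non_digit,
       -- 'Y' only when the value contains no digit at all
       CASE WHEN NOT REGEXP_LIKE(val, '\d') THEN 'Y' ELSE 'N' END AS has_no_digit
FROM samples;
```

For the goal in the question (finding every row that is not purely numeric), the `has_non_digit` column is the relevant one. Neither condition returns rows where the column is NULL.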
16

To get an indicator:

DECODE( TRANSLATE(your_number,' 0123456789',' '), NULL, 'number','contains char')

e.g.

SQL> select DECODE( TRANSLATE('12345zzz_not_numberee',' 0123456789',' '), NULL, 'number','contains char')
 2 from dual
 3 /

"contains char"

and

SQL> select DECODE( TRANSLATE('12345',' 0123456789',' '), NULL, 'number','contains char')
 2 from dual
 3 /

"number"

and

SQL> select DECODE( TRANSLATE('123405',' 0123456789',' '), NULL, 'number','contains char')
 2 from dual
 3 /

"number"

Oracle 11g has regular expressions, so you could use this to select only the rows that are entirely numeric:

SQL> SELECT colA
  2  FROM t1
  3  WHERE REGEXP_LIKE(colA, '^[[:digit:]]+$');

COLA
----------
47845
48543
12
...

If there is a non-numeric value like '23g' it will just be ignored.

4 Comments

Michael, there is a slight problem with your translate if the string you're checking contains a zero. TRANSLATE will turn any zeros into spaces. For example: select DECODE( TRANSLATE('123405','0123456789',' '), NULL, 'number','contains char') from dual returns "contains char"
@aiGuru I fixed that problem by adding a leading space to the second parameter of TRANSLATE. The problem is that TRANSLATE considered the zero a match because it was the first character.
This was very helpful!
14

In contrast to SGB's answer, I prefer doing the REGEXP_LIKE() defining the actual format of my data and negating that. This allows me to define values like '$DDD,DDD,DDD.DD'.

In the OP's simple scenario, it would look like...

SELECT * 
FROM table_with_column_to_search 
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^[0-9]+$');

...which finds every value that is not an unsigned integer (digits only). If you want to accept negative integers as well, it's an easy change: just add an optional leading minus...

SELECT * 
FROM table_with_column_to_search 
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^-?[0-9]+$');

To accept floating points...

SELECT * 
FROM table_with_column_to_search 
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^-?[0-9]+(\.[0-9]+)?$');

The same approach extends to any format. You will generally already have the formats used to validate input data, so when you want to find data that does not match a format, it's simpler to negate that format than to come up with another one, which in the case of SGB's approach would be a bit tricky if you want more than just positive integers.
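
As a sketch of the '$DDD,DDD,DDD.DD' case mentioned above: the pattern below is only an assumption about what such a format could look like as a regular expression (a dollar sign, one to three digits, optional comma-separated groups of three digits, and exactly two decimals). It is not taken from the original answer, so adjust it to your own validation rules.

```sql
-- Hypothetical pattern for values such as $1,234,567.89
-- Returns the rows that do NOT match the expected currency format
SELECT *
FROM table_with_column_to_search
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics,
                      '^\$[0-9]{1,3}(,[0-9]{3})*\.[0-9]{2}$');
```

As with the other queries, rows where the column is NULL are not returned, so add an explicit IS NULL check if those should be reported too.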

1 Comment

This worked perfectly for me. I wish I could tag this as the perfect / best answer, thanks :) I used it as: select /*+parallel(8)+*/ colName from TblName WHERE NOT REGEXP_LIKE(colName, '^[0-9]+$') and rownum < 10;
5

Starting with Oracle 12.2, the to_number function has an optional ON CONVERSION ERROR clause that can catch the exception and provide a default value.

This can be used to test for numeric values. Simply return NULL when the conversion fails and filter for the non-NULL results.

Example

with num as (
select '123' vc_col from dual union all
select '1,23'  from dual union all
select 'RV12P2000'  from dual union all
select null  from dual)
select
  vc_col 
from num
where /* filter numbers */
vc_col is not null and
to_number(vc_col DEFAULT NULL ON CONVERSION ERROR) is not null
;

VC_COL   
---------
123
1,23
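
To find the problematic rows instead (which is what the question asks for), the filter can be inverted. This is a minimal sketch that reuses the sample data above and assumes Oracle 12.2 or later. Whether '1,23' counts as a number depends on the session's NLS_NUMERIC_CHARACTERS setting, so it may or may not be returned.

```sql
WITH num AS (
  SELECT '123' AS vc_col FROM dual UNION ALL
  SELECT '1,23'      FROM dual UNION ALL
  SELECT 'RV12P2000' FROM dual UNION ALL
  SELECT NULL        FROM dual
)
SELECT vc_col
FROM num
WHERE vc_col IS NOT NULL  -- skip genuinely empty values
  -- conversion failed, so the value is not a valid number
  AND TO_NUMBER(vc_col DEFAULT NULL ON CONVERSION ERROR) IS NULL;
```

With this data, 'RV12P2000' is returned in any session, because it cannot be converted under any NLS setting.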

4

Use this

SELECT * 
FROM TableToSearch 
WHERE NOT REGEXP_LIKE(ColumnToSearch, '^-?[0-9]+(\.[0-9]+)?$');

4

After doing some testing, I came up with this solution; let me know if it helps.

Add the two conditions below to your query and it will find the records that don't contain numeric data:

 and REGEXP_LIKE(<column_name>, '\D') -- selects values containing at least one non-digit character
 and not REGEXP_LIKE(<column_name>, '^[-]{1}\d{1}') -- excludes values starting with a minus sign followed by a digit (negative numbers)

1

From http://www.dba-oracle.com/t_isnumeric.htm

LENGTH(TRIM(TRANSLATE(<column_name>, ' +-.0123456789', ' '))) is null

If there is anything left in the string after the TRIM it must be non-numeric characters.

2 Comments

This can exclude non-numeric-values like dates in format "YYYY-MM-DD" for example.
Numbers are not just sequences of the above list. They have a format. 61..01 will test as numeric using this method even though it is not a number.
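
The two comments above can be checked with a small sketch. The `samples` inline view and its values are hypothetical, and the regular expression is just one possible definition of a signed decimal number. The character-set check accepts anything built from the listed characters, while the format-based regex rejects '61..01' and the date string.

```sql
-- Hypothetical sample data, for illustration only
WITH samples AS (
  SELECT '123.45'     AS val FROM dual UNION ALL
  SELECT '61..01'     FROM dual UNION ALL
  SELECT '2022-01-18' FROM dual UNION ALL
  SELECT '12a'        FROM dual
)
SELECT val,
       -- character-set check: passes when only spaces, signs, dots and digits occur
       CASE WHEN LENGTH(TRIM(TRANSLATE(val, ' +-.0123456789', ' '))) IS NULL
            THEN 'passes' ELSE 'fails' END AS translate_check,
       -- format check: optional sign, digits, optional fractional part
       CASE WHEN REGEXP_LIKE(val, '^[+-]?[0-9]+(\.[0-9]+)?$')
            THEN 'passes' ELSE 'fails' END AS regex_check
FROM samples;
```

Only '12a' fails the TRANSLATE check here, whereas the regex check passes '123.45' alone.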
0

I've found this useful:

 select translate('your string','_0123456789','_') from dual

If the result is NULL, it's numeric (ignoring floating point numbers.)

However, I'm a bit baffled why the underscore is needed. Without it the following also returns null:

 select translate('s123','0123456789', '') from dual

There is also one of my favorite tricks - not perfect if the string contains stuff like "*" or "#":

 SELECT 'is a number' FROM dual WHERE UPPER('123') = LOWER('123')

2 Comments

The "underscore trick" works only if your data never contains an underscore, because TRANSLATE maps the underscore to itself and removes the digits. Without the placeholder character the third argument is an empty string, which Oracle treats as NULL, so TRANSLATE returns NULL for any input. This would actually work quite reliably if you use a character that is most unlikely to appear in your data.
As mentioned in my own answer, this solves it completely: TRANSLATE(replace(<char_column>,'0',''),'0123456789',' ') and does not have a speed impact
0

After doing some testing, building upon the suggestions in the previous answers, there seem to be two usable solutions.

Method 1 is fastest, but less powerful in terms of matching more complex patterns.
Method 2 is more flexible, but slower.

Method 1 - fastest
I've tested this method on a table with 1 million rows.
It seems to be 3.8 times faster than the regex solutions.
The 0-replacement solves the issue that 0 is mapped to a space, and does not seem to slow down the query.

SELECT *
FROM <table>
WHERE TRANSLATE(replace(<char_column>,'0',''),'0123456789',' ') IS NOT NULL;

Method 2 - slower, but more flexible
I've compared the speed of putting the negation inside or outside the regex statement. Both are equally slower than the translate-solution. As a result, @ciuly's approach seems most sensible when using regex.

SELECT *
FROM <table>
WHERE NOT REGEXP_LIKE(<char_column>, '^[0-9]+$');

0

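You can use a check like this:

-- Returns the numeric value of c, or the sentinel -123456 if the conversion fails.
-- Note: the sentinel must not occur as a real value in the data.
create or replace function to_n(c varchar2) return number is
begin return to_number(c);
exception when others then return -123456;
end;
/

-- Rows whose n column cannot be converted to a number
select id, n from t where to_n(n) = -123456;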

0

I tried an ORDER BY on the problematic column, which let me find the offending rows.

SELECT
    D.UNIT_CODE,
    D.CUATM,
    D.CAPITOL,
    D.RIND,
    D.COL1 AS COL1
FROM
    VW_DATA_ALL_GC D
WHERE
    D.PERIOADA IN (:pPERIOADA)
    AND D.FORM = 62
    AND D.COL1 IS NOT NULL
    -- AND REGEXP_LIKE(D.COL1, '[[:alpha:]]')
    -- AND REGEXP_LIKE(D.COL1, '[[:digit:]]')
    -- AND REGEXP_LIKE(TO_CHAR(D.COL1), '[^0-9]+')
GROUP BY
    D.UNIT_CODE,
    D.CUATM,
    D.CAPITOL,
    D.RIND,
    D.COL1
ORDER BY
    D.COL1
