Please consider the table below. I am trying to retrieve only the EUR amount from the Tax strings. The records vary in length, but the float values are always present.

OrderID    SKU      Price    Tax
****       ****     ****     [<TV<standard#21.0#false#21.36#EUR>VT>]
****       ****     ****     [<TV<standard#21.0#false#7.21#EUR>VT>]
****       ****     ****     [<TV<standard#17.0#false#5.17#EUR>VT>]

I wrote a regular expression that matches what I need: \d+\W\d+ matches both float values within the string. In Oracle SQL I can simply get the second occurrence with a query like:

SELECT REGEXP_SUBSTR(column, '\d+\W\d+',1,2) FROM table

Using the above approach I retrieve 21.36, 7.21 and 5.17 for those three records.

How can I achieve this with SQL Server?

  • SQL Server has no native Regex support. If you need or want to use Regex in SQL Server, you will need to implement CLR functions. A search for something like "SQL Server Regex CLR Functions" will most certainly give you plenty of choices. Commented Oct 11, 2021 at 14:53
  • While asking a question, you need to provide a minimal reproducible example: (1) DDL and sample data population, i.e. CREATE table(s) plus INSERT T-SQL statements. (2) What you need to do, i.e. logic and your code attempt implementation of it in T-SQL. (3) Desired output, based on the sample data in the #1 above. (4) Your SQL Server version (SELECT @@version;). Commented Oct 11, 2021 at 15:04

2 Answers

Regex would be the natural tool here, but SQL Server has little native regex support. Here is a pure T-SQL solution using PATINDEX and CHARINDEX. It finds the first digit that is immediately followed by #, moves on to the next #, and returns the text between that # and the one after it. It is a bit verbose, but gets the job done:

SELECT
    SUBSTRING(Tax,
              CHARINDEX('#', Tax, PATINDEX('%[0-9]#%', Tax) + 3) + 1,
              CHARINDEX('#', Tax, CHARINDEX('#', Tax, PATINDEX('%[0-9]#%', Tax) + 3) + 1) -
              CHARINDEX('#', Tax, PATINDEX('%[0-9]#%', Tax) + 3) - 1)
FROM yourTable;
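
The nested calls above can be hard to follow, so here is a sketch of a closely related variant (not the exact expression above): it walks the # delimiters from the start of the string with chained CROSS APPLY and names each position. It assumes the amount is always the fourth #-separated field and that every row contains at least four # characters. The table variable @tbl, the column alias EurAmount, and the aliases p1 to p4 are made up for illustration.

```sql
-- Hypothetical sample data mirroring the Tax column from the question
DECLARE @tbl TABLE (Tax VARCHAR(100));
INSERT INTO @tbl (Tax) VALUES
('[<TV<standard#21.0#false#21.36#EUR>VT>]'),
('[<TV<standard#21.0#false#7.21#EUR>VT>]'),
('[<TV<standard#17.0#false#5.17#EUR>VT>]');

SELECT t.Tax,
       -- Text strictly between the 3rd and the 4th '#'
       SUBSTRING(t.Tax, p3.pos + 1, p4.pos - p3.pos - 1) AS EurAmount
FROM @tbl AS t
-- Each CROSS APPLY finds the next '#' after the previous one
CROSS APPLY (SELECT CHARINDEX('#', t.Tax) AS pos) AS p1
CROSS APPLY (SELECT CHARINDEX('#', t.Tax, p1.pos + 1) AS pos) AS p2
CROSS APPLY (SELECT CHARINDEX('#', t.Tax, p2.pos + 1) AS pos) AS p3
CROSS APPLY (SELECT CHARINDEX('#', t.Tax, p3.pos + 1) AS pos) AS p4;
```

For the three sample rows this returns 21.36, 7.21 and 5.17 as strings. A row with fewer than four # characters can raise an invalid-length error in SUBSTRING or return the wrong text, so such rows should be filtered out first if they can occur.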

Please try the following solution.

This approach uses XML to tokenize the Tax column. For each row, it produces XML like the following:

<root>
  <r>[&lt;TV&lt;standard</r>
  <r>21.0</r>
  <r>false</r>
  <r>21.36</r>
  <r>EUR&gt;VT&gt;]</r>
</root>

The fourth r element holds the monetary value in question.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, Tax VARCHAR(MAX));
INSERT INTO @tbl (Tax) VALUES
('[<TV<standard#21.0#false#21.36#EUR>VT>]'),
('[<TV<standard#21.0#false#7.21#EUR>VT>]'),
('[<TV<standard#17.0#false#5.17#EUR>VT>]');
-- DDL and sample data population, end

DECLARE @separator CHAR(1) = '#';

SELECT t.*
    , c.value('(/root/r[4]/text())[1]', 'DECIMAL(10,2)') AS result
FROM @tbl AS t
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' + 
        REPLACE(tax, @separator, ']]></r><r><![CDATA[') + 
        ']]></r></root>' AS XML)) AS t1(c);

Output

+----+-----------------------------------------+--------+
| ID |                   Tax                   | result |
+----+-----------------------------------------+--------+
|  1 | [<TV<standard#21.0#false#21.36#EUR>VT>] |  21.36 |
|  2 | [<TV<standard#21.0#false#7.21#EUR>VT>]  |   7.21 |
|  3 | [<TV<standard#17.0#false#5.17#EUR>VT>]  |   5.17 |
+----+-----------------------------------------+--------+
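
The same tokenization idea can also be sketched with JSON instead of XML. This is a variant, not the method shown above, and it assumes SQL Server 2016 or later, where STRING_ESCAPE and JSON_VALUE are available. Each row is turned into a JSON array of strings by replacing # with "," and the fourth element is read with the zero-based path $[3]. The alias j and the column name arr are made up for illustration.

```sql
-- Same sample data as in the XML example
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, Tax VARCHAR(MAX));
INSERT INTO @tbl (Tax) VALUES
('[<TV<standard#21.0#false#21.36#EUR>VT>]'),
('[<TV<standard#21.0#false#7.21#EUR>VT>]'),
('[<TV<standard#17.0#false#5.17#EUR>VT>]');

SELECT t.ID, t.Tax,
       -- JSON array indexes are zero-based, so $[3] is the 4th token
       TRY_CAST(JSON_VALUE(j.arr, '$[3]') AS DECIMAL(10,2)) AS result
FROM @tbl AS t
-- STRING_ESCAPE guards against quotes or backslashes in the data;
-- REPLACE then turns every '#' into a JSON element boundary
CROSS APPLY (SELECT '["' + REPLACE(STRING_ESCAPE(t.Tax, 'json'), '#', '","') + '"]' AS arr) AS j;
```

If a row has fewer than four tokens, JSON_VALUE returns NULL in its default lax mode, and TRY_CAST returns NULL when the token is not a valid number, so neither case raises an error.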
