0

I have a column of strings, called "MyStrings" like the following:

...
Foo bar Jul15 blah blah.xlsx
Choo bar Jul-15 blah far.xlsx
Star bar 10-Jul-15 blah far.xlsx
Car Star bar 10.Jul.2015 blah far.xlsx
...
...

I'd like to do string manipulation so all dates, whatever format, are not included in the results.

So the following query:

SELECT  results = <manipulated "MyStrings">
FROM    aTable

Should have these results:

...
Foo bar  blah blah.xlsx
Choo bar  blah far.xlsx
Star bar  blah far.xlsx
Car Star bar  blah far.xlsx
...
...

Is there a quick way of doing this or do I need to consider each format individually?

10
  • In your example the month is always 3 characters, so maybe you could use that. Search for 'words' (units delimited by spaces), containing 3 letters and (some?) digits. Commented Nov 25, 2015 at 16:25
  • 2
    don't think you're gonna find a simple solution to this with multiple formats. Commented Nov 25, 2015 at 16:28
  • @Tanner ye of little faith...I'm still hopeful Commented Nov 25, 2015 at 16:29
  • 1
    LOL the difference between a pessimist and an optimist? Experience. :) Just from the very controlled values you posted this is challenging. From what is likely all kinds of formats this is just not feasible to do accurately. July, Jul, Jun, June, Jan, January etc...and that doesn't even take into account fat fingering. Commented Nov 25, 2015 at 16:32
  • 1
    @SeanLange fat fingering... i'm gonna steal that one! Commented Nov 25, 2015 at 16:36

1 Answer 1

1

You need a Split function

If you split first by <space> is easy create regular expresion for

monDD
mon-DD
DD-mon-YY
DD-mon-YYYY

SQL Fiddle Demo

WITH splitCTE AS (
  SELECT s.[id], f.Number, f.Item
  FROM dbo.SourceData AS s
  CROSS APPLY dbo.SplitStrings(s.[test], ' ') as f
)
  SELECT *,
         CASE
             WHEN item Like 'Jul[0-9][0-9]' THEN 'mmmdd'
             WHEN item Like 'Jul-[0-9][0-9]' THEN 'mmm-dd'
             WHEN item Like '[0-9][0-9]-Jul-[0-9][0-9]' THEN 'dd-mmm-yy'
             WHEN item Like '[0-9][0-9].Jul.[0-9][0-9][0-9][0-9]' THEN 'dd.mmm.yyyy'
             ELSE ''
         END matchType         
  FROM splitCTE

OUTPUT

  • Need a join with list of 3 char months to replace the wired Jul.
  • Easy expand to also include a version with full month name.
  • Will match Jul77 as mmmdd but is a start.
  • You can calculate a IsValidDate column in another step
    • For some of the format you can use CONVERT to check for a valid date
    • For other like Jul77 you can separate first 3 char with last 2 and try to get a date.

.

| id | Number |        Item |   matchType |
|----|--------|-------------|-------------|
|  1 |      1 |         Foo |             |
|  1 |      2 |         bar |             |
|  1 |      3 |       Jul15 |       mmmdd |
|  1 |      4 |        blah |             |
|  1 |      5 |   blah.xlsx |             |
|  2 |      1 |        Choo |             |
|  2 |      2 |         bar |             |
|  2 |      3 |      Jul-15 |      mmm-dd |
|  2 |      4 |        blah |             |
|  2 |      5 |    far.xlsx |             |
|  3 |      1 |        Star |             |
|  3 |      2 |         bar |             |
|  3 |      3 |   10-Jul-15 |   dd-mmm-yy |
|  3 |      4 |        blah |             |
|  3 |      5 |    far.xlsx |             |
|  4 |      1 |         Car |             |
|  4 |      2 |        Star |             |
|  4 |      3 |         bar |             |
|  4 |      4 | 10.Jul.2015 | dd.mmm.yyyy |
|  4 |      5 |        blah |             |
|  4 |      6 |    far.xlsx |             |

Then use your favorite XML PATH to join back without the matching elements

Sign up to request clarification or add additional context in comments.

1 Comment

there are quite a few - this is a great start

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.