
Can anybody find a mistake in the following regex:

regex = ([.0-9]+[/–_\":・’ー‘‐`─'.,-\0-9]*)

My intention is to match "numerical" strings of any kind, but if the number is followed by, e.g., a letter, I want to get just the number.

When I use it with the following sentences:
s1 = Bla bla 805P bla 1080P; bla bla
s2 = Bla bla 5600p bla 5400p
It finds 805P and 1080P; in s1, but only 5600 and 5400 in s2.
You can check it using http://regexpal.com.

I also tried this regex in RegexBuddy, and the description it gives says nothing about letters.
Does anybody have any idea why I am catching P and P; in s1 when no letters are included in the second character class?

2
  • What do you mean by \0 in the second regex group? Commented Aug 6, 2014 at 8:06
  • It should be a double backslash, i.e. escaped backslash. I think it displays it as a single backslash for whatever reason. When I edit my post there are two backslashes before 0. Commented Aug 6, 2014 at 8:41

2 Answers

2

Part of your regexp says [... ,-\\ ...]. Inside a character class, an unescaped - between two characters defines a range, so this matches every character from comma to backslash:

,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\

This, incidentally, includes the uppercase P and the semicolon, but not the lowercase p, which is why s2 behaves as you expected.
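The range can be checked directly. The following is a minimal Python sketch, not the asker's exact setup: `accidental` is a hypothetical, reduced pattern that keeps only the unintended `,-\\` range from the question's second class, and the behaviour of other regex engines may differ in detail.

```python
import re

# Every character from ',' (0x2C) to '\' (0x5C): the span the class really covers.
covered = "".join(chr(c) for c in range(ord(","), ord("\\") + 1))
print(covered)

# Reduced, hypothetical form of the question's regex: only the accidental range is kept.
accidental = re.compile(r"[.0-9]+[,-\\]*")

s1 = "Bla bla 805P bla 1080P; bla bla"
s2 = "Bla bla 5600p bla 5400p"

print(accidental.findall(s1))  # ['805P', '1080P;']  uppercase P and ';' are in the range
print(accidental.findall(s2))  # ['5600', '5400']    lowercase p is not
```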

To match a literal minus sign in a character class, it needs to be first, last or escaped. E.g.

[- ... ,\\ ... ]
[ ... ,\\ ... -]
[ ... ,\-\\ ... ]

would all be valid ways to write what you intended.
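As a rough illustration of those three placements, here is a Python sketch. It uses a shortened, hypothetical character class (only /, _, :, ., comma, backslash, hyphen and digits) rather than the full class from the question, so treat it as an assumption-laden example rather than a drop-in replacement.

```python
import re

# Three equivalent ways to make the hyphen literal inside the second class.
hyphen_first = re.compile(r"[.0-9]+[-/_:.,\\0-9]*")    # hyphen placed first
hyphen_last = re.compile(r"[.0-9]+[/_:.,\\0-9-]*")     # hyphen placed last
hyphen_escaped = re.compile(r"[.0-9]+[/_:.,\-\\0-9]*")  # hyphen escaped

s1 = "Bla bla 805P bla 1080P; bla bla"
s2 = "Bla bla 5600p bla 5400p"

for pattern in (hyphen_first, hyphen_last, hyphen_escaped):
    # Letters no longer match, so only the digits are captured.
    print(pattern.findall(s1), pattern.findall(s2))
    # The literal hyphen and slash are still accepted between digits.
    print(pattern.findall("range 10-20/30 end"))
```

With any of the three patterns, s1 yields 805 and 1080, and s2 yields 5600 and 5400.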


2

You're escaping the 0, not the -, with -\0. Also, that is very complex; all you need is

pattern = "([.0-9]+[^A-Za-z]*)"

I.e. "one or more digits or periods, followed by as many non-letter characters as possible". You can add more characters to exclude to that second class if required.
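A quick way to see what this simpler pattern does is to run it in Python, as in the sketch below. The third sample string is made up for illustration. One thing worth noting: because the negated class accepts anything that is not an ASCII letter, it also consumes trailing spaces and punctuation, which may or may not be what you want.

```python
import re

simple = re.compile(r"([.0-9]+[^A-Za-z]*)")

s1 = "Bla bla 805P bla 1080P; bla bla"
s2 = "Bla bla 5600p bla 5400p"

print(simple.findall(s1))  # ['805', '1080']
print(simple.findall(s2))  # ['5600', '5400']

# Hypothetical sample: the comma and space after the number are captured too.
print(simple.findall("price 12.50, then"))  # ['12.50, ']
```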
