I'm trying to solve a problem, and I figured that the best solution is to use a regex.

I don't have advanced knowledge of regex, so I really need your help.

I need to identify whether a string is a landline phone number or a cell phone number, written the way we use them here in Brazil:

The phone number could be in this format:

0 + XX + XXXXXXXX = 01121234567

Or

XX + XXXXXXXX = 1121234567

Or

XXXXXXXX = 21234567

In this case it's a landline phone number, so the 8-digit part needs to start with 2, 3, 4 or 5.

If it's a cell phone number, it has 9 digits: the first must be 9, and the second must be 6, 7 or 8:

0 + XX + 9XXXXXXXX = 011961234567

Or

XX + 9XXXXXXXX = 11971234567

Or

9XXXXXXXX = 981234567

So, if I get the strings "011984160986", "11984160986" or "984160986", for example, I need to be able to identify each one as a cell phone number, and likewise for landline numbers.

At the same time, I must be able to tell that the string "84160289" is not a valid phone number.

Can anyone enlighten me on how to solve this with a regex?

I'm not comfortable using a lot of "ifs" in my code to validate this.

Thanks.

  • You say "second digit must be 6,7 or 8" but the cell phone examples don't follow that. Commented Apr 26, 2017 at 12:35
  • My bad, it was wrong Commented Apr 26, 2017 at 13:31
  • "I'm trying to solve a problem and i figure out that the best solution is to use regex." Now you have two problems Commented Apr 26, 2017 at 14:10

2 Answers

The regex for the phone number is

\b(0?\d\d)?[2345]\d{7}\b

The regex for the cell phone number is

\b(0?\d\d)?9[678]\d{7}\b

The regex for validating both of them at the same time is

\b(0?\d\d)?([2345]|9[678])\d{7}\b

See demos: 1, 2, 3.

Whether the \d escape is supported depends on your language's regex flavor. If it is not, replace it with [0-9].
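As a rough illustration of how the two patterns could be used to tell the number types apart, here is a minimal Python sketch. The function name classify and the labels it returns are made up for this example. It uses fullmatch(), which requires the whole string to match, so the \b anchors are left out, and it writes [0-9] instead of \d.

```python
import re

# Patterns from the answer above, with [0-9] in place of \d.
# fullmatch() must consume the whole string, so no \b is needed here.
LANDLINE = re.compile(r"(0?[0-9]{2})?[2345][0-9]{7}")
CELL = re.compile(r"(0?[0-9]{2})?9[678][0-9]{7}")


def classify(number):
    """Return "cell", "landline" or "invalid" for a digits-only string.

    Hypothetical helper for illustration; not part of any library.
    """
    if CELL.fullmatch(number):
        return "cell"
    if LANDLINE.fullmatch(number):
        return "landline"
    return "invalid"


if __name__ == "__main__":
    for candidate in ["011984160986", "11984160986", "984160986",
                      "01121234567", "21234567", "84160289"]:
        print(candidate, classify(candidate))
```

This keeps the validation to two pattern checks rather than a chain of digit-by-digit conditions.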

2 Comments

Yes, the regex was exactly what I needed, except for one thing: I ran some tests, and if I enter a long number the regex produces 2 matches, as in regex101.com/r/0BI89u/1 . Doesn't the \d{7} at the end mean that only 7 characters are allowed after this?
@RaphaelTelatim Oh, right. I've updated the answer using word boundaries (see https://regex101.com/r/0BI89u/2).

When I start working on a regex, I like to start with the simplest possible approach first.

NOTE 1: I like to use free-spacing regexes, where whitespace inside the pattern is ignored. It helps with readability.

NOTE 2: I'm using [0-9] instead of \d to avoid possible problems with Unicode digits. If Unicode digits are allowed, then use \d.
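Both notes can be tried out in Python, used here only as an example language. There, free-spacing mode is enabled with the re.VERBOSE flag, and \d on a str pattern also matches non-ASCII decimal digits. The variable names are illustrative.

```python
import re

# In free-spacing mode (re.VERBOSE in Python) whitespace inside the pattern
# is ignored and "#" starts a comment that runs to the end of the line.
spaced = re.compile(r"0 [0-9]{2} [0-9]{8}  # 0 + area code + 8 digits", re.VERBOSE)
compact = re.compile(r"0[0-9]{2}[0-9]{8}")

sample = "01121234567"
spaced_ok = spaced.fullmatch(sample) is not None
compact_ok = compact.fullmatch(sample) is not None

# For str patterns, \d also matches decimal digits outside ASCII,
# while the explicit class [0-9] does not.
arabic_indic_two = "\u0662"  # ARABIC-INDIC DIGIT TWO
d_matches = re.fullmatch(r"\d", arabic_indic_two) is not None
class_matches = re.fullmatch(r"[0-9]", arabic_indic_two) is not None

if __name__ == "__main__":
    print(spaced_ok, compact_ok, d_matches, class_matches)
```

Other regex flavors enable free spacing differently (often with an x flag), so check the documentation for the engine you use.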

0 + XX + XXXXXXXX = 01121234567

0 [0-9]{2} [0-9]{8}

XX + XXXXXXXX = 1121234567

[0-9]{2} [0-9]{8}

XXXXXXXX = 21234567

[0-9]{8}

In this case it's a phone number, so it needs to start with 2, 3, 4 or 5

OK, now we have more restrictions. Let's start narrowing things down.

(0 [0-9]{2} [2-5] [0-9]{7} |
   [0-9]{2} [2-5] [0-9]{7} |
            [2-5] [0-9]{7})

With so much in common, we can compress that more.

(0 [0-9]{2} | [0-9]{2})? [2-5] [0-9]{7}

And finally compress again

(0? [0-9]{2})? [2-5] [0-9]{7}
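One way to gain confidence in each compression step is to run all three forms against the same inputs and compare the results. The sketch below does that in Python with re.VERBOSE. The sample list and the helper name verdicts are chosen for this example, and a handful of samples is a sanity check rather than a proof of equivalence.

```python
import re

# The three landline forms from the steps above, in free-spacing mode.
expanded = re.compile(r"""
    ( 0 [0-9]{2} [2-5] [0-9]{7}
    |   [0-9]{2} [2-5] [0-9]{7}
    |            [2-5] [0-9]{7} )
""", re.VERBOSE)
merged = re.compile(r"(0 [0-9]{2} | [0-9]{2})? [2-5] [0-9]{7}", re.VERBOSE)
compressed = re.compile(r"(0? [0-9]{2})? [2-5] [0-9]{7}", re.VERBOSE)

# Three valid landline strings from the question, then two invalid ones.
samples = ["01121234567", "1121234567", "21234567", "84160289", "121234567"]


def verdicts(pattern):
    """Return one True/False per sample: does the whole string match?"""
    return [pattern.fullmatch(s) is not None for s in samples]


if __name__ == "__main__":
    for name, pattern in [("expanded", expanded), ("merged", merged),
                          ("compressed", compressed)]:
        print(name, verdicts(pattern))
```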

And if it's a cell phone, it has 9 digits: the first must be 9, and the second digit must be 6, 7 or 8:

0 + XX + 9XXXXXXXX = 011961234567

0 [0-9]{2} 9 [6-8] [0-9]{7}

XX + 9XXXXXXXX = 11971234567

[0-9]{2} 9 [6-8] [0-9]{7}

9XXXXXXXX = 981234567

9 [6-8] [0-9]{7}

Combining them works the same way

(0? [0-9]{2})? 9 [6-8] [0-9]{7}

Putting both landlines and cell phones together

((0? [0-9]{2})? [2-5]   [0-9]{7} |
 (0? [0-9]{2})? 9 [6-8] [0-9]{7})

This can be compressed to

(0? [0-9]{2})? ([2-5] | 9 [6-8]) [0-9]{7}

To match this inside a larger text, add word boundaries if that fits your needs.

\b (0? [0-9]{2})? ([2-5] | 9 [6-8]) [0-9]{7} \b
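To see what the word boundaries change, here is a small Python sketch that searches free text with the final pattern. The names PHONE, UNBOUNDED and find_numbers, and the sample sentence in the usage part, are invented for this example. Without \b, a match can start in the middle of a longer digit run, which is the double-match behavior reported in the comments on the other answer.

```python
import re

# Final pattern from above, in free-spacing mode with comments.
PHONE = re.compile(r"""
    \b
    (0? [0-9]{2})?        # optional area code, with an optional leading 0
    ([2-5] | 9 [6-8])     # landline prefix, or cell prefix 9 then 6, 7 or 8
    [0-9]{7}              # remaining seven digits
    \b
""", re.VERBOSE)

# Same pattern without word boundaries, kept only for comparison.
UNBOUNDED = re.compile(r"(0?[0-9]{2})?([2-5]|9[6-8])[0-9]{7}")


def find_numbers(text):
    """Return every substring of text that PHONE matches (hypothetical helper)."""
    return [m.group(0) for m in PHONE.finditer(text)]


if __name__ == "__main__":
    text = "Call 11984160986 or 21234567, not 84160289 or 1234567890123456."
    print(find_numbers(text))
    # Without \b, part of the 16-digit run is accepted as a number.
    print(UNBOUNDED.search("1234567890123456").group(0))
```

When the whole input is supposed to be exactly one number, matching the entire string (for example with fullmatch() in Python) is a stricter alternative to word boundaries.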

The final form is similar to the other answer, but I felt it was worthwhile to work through the whole process.
