0

Let me explain my problem: I'm working with different kinds of addresses:

" 25 Down Street 15000 London "

" 25 B Down Street 15000 London "

" Building A 25 Down Street 15000 London "

I found a way to extract the street number in all of these cases with this regex:

 `^([1-9][0-9]{0,2}(?:\s*[A-Z])?)\b`

But now I have a problem that I can't solve: when they are present, I need to extract the characters that come before the street number.

Example: in " Building 2 25 Down Street 15000 London " I need to find only "Building 2".

I understand that I have to find the characters before the first number of this string.

I'll keep searching on my own, but it would be great if someone had a solution for me.

Thank you.

Edit: my code is now:

preg_match('/^(.*?)\d+\s+\D+/', $cleanAdressNode, $result, PREG_OFFSET_CAPTURE,0);
        print $result[0][0];

        return $result[0][0];

and the result is now: Résidence Les Thermes 1 15 boulevard Jean Jaurès instead of only: Résidence Les Thermes 1

1
  • Matching addresses using regular expressions is near impossible... it would take a purpose-built system to figure out what each token in the string means. Commented Oct 5, 2015 at 14:30

2 Answers

1

How about:

preg_match('/^(\D*)/', $str, $match);

$match[1] will then contain everything at the beginning of the string that is not a digit.
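To make the behaviour of this first pattern concrete, here is a minimal sketch. The function name `prefixBeforeFirstDigit` is hypothetical and the sample addresses are taken from the question; the `trim()` call is an added assumption to drop the trailing space that the capture usually ends with.

```php
<?php
// Hypothetical helper (not from the original post): returns whatever
// precedes the first digit of the address, with surrounding spaces removed.
function prefixBeforeFirstDigit(string $address): string
{
    // ^(\D*) captures the run of non-digit characters at the start.
    // The group may be empty, so the pattern matches any string.
    if (preg_match('/^(\D*)/', $address, $match) !== 1) {
        return '';
    }
    return trim($match[1]);
}

// 'Building A 25 Down Street 15000 London' gives 'Building A'
// '25 Down Street 15000 London'            gives ''
// 'Building 2 25 Down Street 15000 London' gives 'Building' (the "2" is lost)
```

The last case shows the limit of this approach: the capture stops at the first digit, so a prefix that itself contains a number is cut short.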

According to your example:

preg_match('/^(.*?)\d+\s+\D+/', $str, $match);
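As a rough sketch of how this second pattern can be used, the example below reads the first capture group rather than the whole match. The function name `leadingComplement` is hypothetical, and the `/u` modifier and the `trim()` call are additions that assume the input is valid UTF-8 and that surrounding spaces are unwanted.

```php
<?php
// Hypothetical helper (not from the original post): returns the text that
// precedes the street number, or null when there is none.
function leadingComplement(string $address): ?string
{
    // Group 1 lazily expands until it is followed by
    // "digits, whitespace, non-digits", i.e. a street number and a street name.
    // /u treats pattern and subject as UTF-8 (assumption: valid UTF-8 input).
    if (preg_match('/^(.*?)\d+\s+\D+/u', $address, $result, PREG_OFFSET_CAPTURE) !== 1) {
        return null;
    }
    // With PREG_OFFSET_CAPTURE every entry is [text, byte offset]:
    // $result[0] is the whole match, $result[1] is the first group.
    $prefix = trim($result[1][0]);
    return $prefix === '' ? null : $prefix;
}

// 'Résidence Les Thermes 1 15 boulevard Jean Jaurès' gives 'Résidence Les Thermes 1'
// 'Building 2 25 Down Street 15000 London'           gives 'Building 2'
// '25 Down Street 15000 London'                      gives null
```

Printing `$result[0][0]` instead returns the entire matched text, which is why the street number and street name show up in the output.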

3 Comments

I found a way to do it, look at this: regex101.com/r/nU5tA7/1. But my new problem is that I can have special characters like "é" or "ç", and my regex doesn't work anymore with them. Look at this example: regex101.com/r/yB1hI0/1. Maybe you can help me with this?
@VERYNET: I can't help you without seeing your code. Edit your question and add the relevant part of the code, some sample input strings, and the expected result.
@VERYNET: the result is in the first group, i.e. $result[1][0]
0

If you only want to match the leading non-numeric characters, ^([^0-9]*) should do the trick. It uses a negated character class to grab every non-numeric character at the start of the string.

9 Comments

I found a way to do it, look at this: regex101.com/r/nU5tA7/1. But my new problem is that I can have special characters like "é" or "ç", and my regex doesn't work anymore with them. Look at this example: regex101.com/r/yB1hI0/1. Maybe you can help me with this?
Here you're looking for the two groups "Résidence les Thermes 1" and "25 B Boulevard Emile Zola 13100 Aix en Provence", right?
Only "Résidence les Thermes 1"; it's the only match.
The 1 really is problematic, as there is no formal way to distinguish it from the street number... Let me think about it.
It seems to be resolved. Look at @Toto's edit, it works pretty well.