I have written this function:
import re

def my_func(s):
    wordlist = ('unit', 'room', 'lot')
    # Proceed only if s starts with a keyword (as a whole word) and contains a digit.
    if any(re.match(r'^' + word + r'\b' + r'.*$', s.lower()) for word in wordlist) and any(i.isdigit() for i in s):
        if ',' in s:
            out = re.findall(r"(.*),", s)  # Getting everything before the comma
            return out[0]
        else:
            out = re.findall(r"([^\s]*\s[^\s]*)", s)  # Getting everything before the second space
            return out[0]
My test data and the expected output
Unity 11 Lane. --> None
Unit 11 queen street --> Unit 11
Unit 7, king street --> Unit 7
Lot 12 --> Lot 12
Unit street --> None
My logic here is
- Take up to first comma, if there is ',' in the string.
- Take up to second space if there is no comma
- Don't return anything if the string does not start with a word from the wordlist.
- Bring all if no second space or comma in it.
Everything else is working fine. How can I capture Lot 12 here? That is, if the string starts with a word from the wordlist and contains no ',' and no second space, return the whole string.
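One possible way to cover all four rules, including the Lot 12 case, is to split on whitespace instead of matching the second space with a regex. This is only a sketch under some assumptions: the name extract_unit is hypothetical, the digit check from the original function is kept, and "up to the first comma" is taken literally (the original r"(.*)," is greedy and would stop at the last comma instead).

```python
import re

WORDLIST = ('unit', 'room', 'lot')

# Matches a keyword at the start of the string, as a whole word, ignoring case.
KEYWORD_RE = re.compile(r'^(?:' + '|'.join(WORDLIST) + r')\b', re.IGNORECASE)

def extract_unit(s):
    """Hypothetical helper: return the leading 'keyword number' part, or None."""
    # Rule: return nothing unless the string starts with a keyword.
    if not KEYWORD_RE.match(s):
        return None
    # Same digit check as the original function (rejects 'Unit street').
    if not any(ch.isdigit() for ch in s):
        return None
    # Rule: take everything up to the first comma, if there is one.
    if ',' in s:
        return s.split(',', 1)[0]
    # Rule: take up to the second space; with fewer than three tokens this
    # returns the whole string, which covers 'Lot 12'.
    # Note: joining with ' ' collapses runs of whitespace into one space.
    return ' '.join(s.split(None, 2)[:2])

for text in ('Unity 11 Lane.', 'Unit 11 queen street',
             'Unit 7, king street', 'Lot 12', 'Unit street'):
    print(text, '-->', extract_unit(text))
```

Because split(None, 2) simply yields fewer pieces when there is no second space, no separate branch is needed for the "bring all" rule.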
"Lot 12 --> Lot 12" and "Unit street --> None" are mutually exclusive if you want your rules to be "Take up to first comma, if there is ',' in the string" and "Take up to second space if there is no comma". "street" matches those conditions. Should those matches be only digits?