How to return multiple regex values as a tuple

Question

I am working on a Python program that searches through received emails and returns coordinates. I am trying to create a regular expression to select the Lat/long values from a string. (I am new to regex)

Here is a small example of one of the strings I have been using for testing:

content = """

WorkLocationBoundingBox
Latitude:30.556555Longitude:-97.659824
SecondLatitude:30.569138SecondLongitude:-97.650855

"""

I came up with Latitude:(\d+).(\d+)Longitude:(.*), which I believe is close to what I need, but it separates 30 and 556555 into separate groups, whereas -97.659824 is correctly placed into a single group.

My ideal result would look something like this:

[(30.556555, -97.659824, 30.569138, -97.650855)]

Try it like this: Latitude:(\d+(?:\.\d+)?)Longitude:(.*) or, more precisely, (?:Second)?Latitude:(-?\d+(?:\.\d+)?)(?:Second)?Longitude:(-?\d+(?:\.\d+)?) See regex101.com/r/OZgPXb/1
– The fourth bird, Commented Jun 4, 2021 at 14:12
Worked great, now to spend the time to figure out why! Thanks for your help!
– Ryan Bobo, Commented Jun 4, 2021 at 14:29

The fourth bird · Accepted Answer · 2021-06-04 14:41:26Z

You can use 3 capture groups, where the first group captures the optional word Second in front of Latitude, so that the same prefix can then be required in front of Longitude.

((?:Second)?)Latitude:(-?\d+(?:\.\d+)?)\1Longitude:(-?\d+(?:\.\d+)?)

((?:Second)?) Capture group 1, optionally match Second
Latitude: Match literally
(-?\d+(?:\.\d+)?) Capture group 2, match an optional - then 1+ digits with an optional decimal part
\1Longitude: A backreference to what was matched in group 1, followed by a literal Longitude:
(-?\d+(?:\.\d+)?) Capture group 3, match an optional - then 1+ digits with an optional decimal part
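To see what the backreference does in practice, here is a minimal sketch. The sample strings are made up for illustration and are not taken from the question. It runs the pattern above against four prefix combinations and prints the captured groups, or None when there is no match.

```python
import re

# Same pattern as above: group 1 captures an optional "Second" prefix,
# and \1 requires that identical text again right before "Longitude:".
regex = r"((?:Second)?)Latitude:(-?\d+(?:\.\d+)?)\1Longitude:(-?\d+(?:\.\d+)?)"

# Hypothetical test strings, made up for this illustration.
samples = [
    "Latitude:1.5Longitude:-2.5",              # no prefix on either key
    "SecondLatitude:1.5SecondLongitude:-2.5",  # same prefix on both keys
    "Latitude:1.5SecondLongitude:-2.5",        # prefix only on Longitude
    "SecondLatitude:1.5Longitude:-2.5",        # prefix only on Latitude
]

results = [re.search(regex, s) for s in samples]
for s, m in zip(samples, results):
    print(s, "->", m.groups() if m else None)
```

The third sample is rejected because group 1 is empty while the text in front of Longitude is not. The fourth sample still matches: re.search is unanchored, so the match simply starts at Latitude with an empty group 1. If that case must be rejected as well, the pattern would need something in front of group 1, such as a start-of-line anchor with re.MULTILINE (an assumption about the input layout).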

Regex demo or a Python demo

import re

regex = r"((?:Second)?)Latitude:(-?\d+(?:\.\d+)?)\1Longitude:(-?\d+(?:\.\d+)?)"
s = ("WorkLocationBoundingBox\n"
     "Latitude:30.556555Longitude:-97.659824\n"
     "SecondLatitude:30.569138SecondLongitude:-97.650855")

lst = []

# Group 1 is only the optional "Second" prefix; groups 2 and 3 hold the values.
for match in re.finditer(regex, s):
    lst.append(match.group(2))
    lst.append(match.group(3))

print(lst)

Output

['30.556555', '-97.659824', '30.569138', '-97.650855']
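Each value comes back in one piece because the decimal part sits inside a non-capturing group, (?:\.\d+)?, within a single capture group. As a rough side-by-side sketch, the attempt from the question and the value pattern can be run on one line of the sample (the prefix handling is left out here to keep the comparison short):

```python
import re

line = "Latitude:30.556555Longitude:-97.659824"

# The attempt from the question: two capture groups around an unescaped dot,
# so the integer and fractional parts land in different groups.
attempt = re.search(r"Latitude:(\d+).(\d+)Longitude:(.*)", line)
print(attempt.groups())  # ('30', '556555', '-97.659824')

# One capture group per value; (?:...) groups the decimal part
# without creating a capture group of its own.
fixed = re.search(r"Latitude:(-?\d+(?:\.\d+)?)Longitude:(-?\d+(?:\.\d+)?)", line)
print(fixed.groups())  # ('30.556555', '-97.659824')
```

Escaping the dot as \. also matters: an unescaped . matches any character except a newline, not just a decimal point.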

A slightly less strict pattern matches optional word characters before both Latitude and Longitude:

\w*Latitude:(-?\d+(?:\.\d+)?)\w*Longitude:(-?\d+(?:\.\d+)?)

Regex demo

In that case, the pattern contains only the two value groups, so you can also use re.findall, which returns the group values as a list of tuples:

import re

pattern = r"\w*Latitude:(-?\d+(?:\.\d+)?)\w*Longitude:(-?\d+(?:\.\d+)?)"

s = ("WorkLocationBoundingBox\n"
            "Latitude:30.556555Longitude:-97.659824\n"
            "SecondLatitude:30.569138SecondLongitude:-97.650855")
print(re.findall(pattern, s))

Output

[('30.556555', '-97.659824'), ('30.569138', '-97.650855')]
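The question asks for a single tuple of numbers rather than pairs of strings. One possible way to get that shape, sketched here under the assumption that all coordinates should end up in one flat tuple, is to flatten the re.findall result with itertools.chain and convert each value with float:

```python
import re
from itertools import chain

pattern = r"\w*Latitude:(-?\d+(?:\.\d+)?)\w*Longitude:(-?\d+(?:\.\d+)?)"

s = ("WorkLocationBoundingBox\n"
     "Latitude:30.556555Longitude:-97.659824\n"
     "SecondLatitude:30.569138SecondLongitude:-97.650855")

pairs = re.findall(pattern, s)  # list of (latitude, longitude) string tuples

# Flatten the pairs and convert each value to float, giving one tuple
# wrapped in a list, as in the question's expected result.
result = [tuple(float(v) for v in chain.from_iterable(pairs))]
print(result)  # [(30.556555, -97.659824, 30.569138, -97.650855)]
```

If the pairs are more useful kept apart, dropping the chain step and converting each tuple on its own would give a list of (latitude, longitude) float pairs instead.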

A very thorough and helpful answer. Thanks again for your help!
