Using regex to find email address and other values in Python String

Question

So I have a string that contains data I need to extract for my main program.
It looks something like this:

string = "[email:[email protected]][days:90]"

From this string I want to extract the data within the brackets and split each piece on the colon, so that I can store the word email and the email address separately, like this:

string = "[email:[email protected]]"
... some regex here ...
param_type = "email"
param_value = "[email protected]"

if param_type == 'email':
   ... my code to send an email to param_value ...

The string could ultimately have at most 2 pairs of brackets for different parameter types so that I can specify what functions to handle:

string = "[email:[email protected]] [days:90]"
...regex to split by bracket group ....
param_type1 = "email"
param1 = "[email protected]"

param_type2 = "days"
param2 = "90"

if param_type1 != "":
   ... email code ...
if param_type2 != "":
   ... run other code for the specified number of days ...

The main program already has default values for these 2 param_types, but I want there to be the option to specify the email address, days, both, or neither. If anything, I mainly need to know how to retrieve the email address as the online examples don't work for my situation.

juanpa.arrivillaga · Accepted Answer · 2022-10-21 20:18:05Z

1

So, in this case, you can just use a regex to extract what is between the brackets, then split on a colon character to get the param type and param, something like:

[s.split(":") for s in re.findall(r"\[(.+?)\]", string)]

So your code would be something like:

import re

string = "[email:[email protected]][days:90]"
type_and_param_pairs = [s.split(":") for s in re.findall(r"\[(.+?)\]", string)]
for param_type, param in type_and_param_pairs:
    if param_type == "email":
        pass  # do something with the email address
    elif param_type == "days":
        pass  # do something else with the number of days
    ...
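
As a rough sketch of how this could tie in with the default values mentioned in the question, the extracted pairs can be collected into a dict that overrides the defaults. The function name parse_params, the DEFAULTS values, and the example.com addresses are illustrative assumptions, not taken from the original program.

```python
import re

# Hypothetical defaults; the real program's values are not shown in the question.
DEFAULTS = {"email": "admin@example.com", "days": "30"}


def parse_params(text, defaults=DEFAULTS):
    """Return a copy of the defaults, overridden by any [type:value] groups in text."""
    params = dict(defaults)
    for group in re.findall(r"\[(.+?)\]", text):
        # partition splits on the first colon only, so a value may itself contain colons
        param_type, sep, value = group.partition(":")
        if sep:  # ignore bracket groups that have no colon at all
            params[param_type] = value
    return params


print(parse_params("[email:user@example.com][days:90]"))
print(parse_params("[days:7]"))
print(parse_params(""))
```

With this shape, specifying the email, the days, both, or neither all go through the same code path, and the caller only ever reads from the returned dict.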

answered Oct 21, 2022 at 20:18


chrslg · Accepted Answer · 2022-10-21 20:24:38Z

0

You could use

\[([^:]*):([^\]]*)\]

That regular expression matches any [attribute:value] substring, with one capture group for the attribute and one for the value.

It searches for a [, then any number of characters that are not :, then a :, then any number of characters that are not ], then a ]. The part between [ and : and the part between : and ] are each enclosed in parentheses, so they are captured as groups.

If you use findall with this regex, it returns a list of (attribute, value) tuples, one for each [attribute:value] found in the string.

Example:

import re

string = "[email:[email protected]] [days:90]"
pairs = re.findall(r'\[([^:]*):([^\]]*)\]', string)
# pairs = [('email', '[email protected]'), ('days', '90')]
for attr, val in pairs:
    if attr == 'email':
        doSomethingWithEmail(val)
    elif attr == 'days':
        doSomethingWithDays(val)
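
If the one-line pattern is hard to read later, the same idea can be spelled out piece by piece with re.VERBOSE and named groups. This is only a sketch: the names PAIR, attr, and val and the address user@example.com are assumptions for illustration, and the attribute class is tightened slightly to [^:\]] so that an attribute cannot run past a closing bracket.

```python
import re

# The bracket pattern written out with comments; whitespace in the pattern is ignored.
PAIR = re.compile(
    r"""
    \[                  # opening bracket
    (?P<attr>[^:\]]*)   # attribute, up to the first colon
    :                   # separator
    (?P<val>[^\]]*)     # value, up to the closing bracket
    \]                  # closing bracket
    """,
    re.VERBOSE,
)

text = "[email:user@example.com] [days:90]"
pairs = [(m.group("attr"), m.group("val")) for m in PAIR.finditer(text)]
print(pairs)
```

Named groups also make it possible to read m.group("val") directly instead of relying on tuple positions.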

edited Oct 21, 2022 at 20:24

answered Oct 21, 2022 at 20:18


1 Comment

finman69 Over a year ago

Oh my god that is so elegant. All the answers online had regex strings that were way too long to understand and that wasn't reasonable for maintaining the program. You are a life saver!
