I have a string containing data that I need to extract for my main program.
It looks something like this:

string = "[email:[email protected]][days:90]"

From this string I want to extract the data within each pair of brackets and split it at the colon, so that I can store the word email and the email address separately, like this:

string = "[email:[email protected]]"
... some regex here ...
param_type = "email"
param_value = "[email protected]"

if param_type == 'email':
   ... my code to send an email to param_value ...

The string can contain at most 2 pairs of brackets, one per parameter type, so that I can decide which functions to run:

string = "[email:[email protected]] [days:90]"
...regex to split by bracket group ....
param_type1 = "email"
param1 = "[email protected]"

param_type2 = "days"
param2 = "90"

if param_type1 != "":
   ... email code ...
if param_type2 != "":
   ... run other code for the specified number of days ...

The main program already has default values for these 2 param_types, but I want there to be the option to specify the email address, days, both, or neither. If anything, I mainly need to know how to retrieve the email address as the online examples don't work for my situation.

2 Answers

In this case, you can use a regex to extract what is between the brackets, then split on the colon to get the param type and the param, something like:

[s.split(":") for s in re.findall(r"\[(.+?)\]", string)]

So your code would be something like:

import re

string = "[email:[email protected]][days:90]"
# Grab the text inside each pair of brackets, then split it at the colon.
type_and_param_pairs = [s.split(":") for s in re.findall(r"\[(.+?)\]", string)]
for param_type, param in type_and_param_pairs:
    if param_type == "email":
        pass  # do something with the email address
    elif param_type == "days":
        pass  # do something else with the number of days
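
If a value might itself contain a colon (a URL, say), a plain split(":") returns more than two pieces and the unpacking in the loop fails. Here is a minimal sketch of one way around that, using str.partition and collecting the results in a dict. The function name parse_params and the example.com address are placeholders of mine, not something from the question:

```python
import re


def parse_params(text):
    """Return a dict mapping each bracketed type to its value."""
    params = {}
    for chunk in re.findall(r"\[(.+?)\]", text):
        # partition splits at the first colon only, so later colons stay in the value
        param_type, sep, value = chunk.partition(":")
        if sep:  # skip bracket groups that contain no colon at all
            params[param_type] = value
    return params


print(parse_params("[email:user@example.com][days:90]"))
# {'email': 'user@example.com', 'days': '90'}
```

With a dict you can then check `"email" in params` instead of tracking param_type1 and param_type2 separately.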

You could use

\[([^:]*):([^\]]*)\]

This regular expression matches any [attribute:value] substring, with one capture group for the attribute and one for the value.

It searches for a [, then for any number of characters that are not :, then for a :, then for any number of characters that are not ], then for a ]. The part between [ and : and the part between : and ] are each enclosed in parentheses, so they are captured as groups.
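
If it helps to see those steps side by side with the pattern, here is a sketch of the same bracket/attribute/colon/value/bracket structure written with re.VERBOSE, which lets you put each piece on its own line with a comment. The name PAIR is just my own label for the compiled pattern:

```python
import re

PAIR = re.compile(
    r"""
    \[          # a literal opening bracket
    ([^:]*)     # group 1: the attribute, any run of characters except ':'
    :           # the separating colon
    ([^\]]*)    # group 2: the value, any run of characters except ']'
    \]          # a literal closing bracket
    """,
    re.VERBOSE,
)

match = PAIR.search("[days:90]")
print(match.group(1), match.group(2))  # days 90
```

In verbose mode, whitespace outside character classes and anything after a # is ignored, so this should behave like the one-line version.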

If you call findall with this regex, it returns a list of (attribute, value) tuples, one for each [attribute:value] pair found in the string.

Example:

import re

string = "[email:[email protected]] [days:90]"
pairs = re.findall(r'\[([^:]*):([^\]]*)\]', string)
# pairs = [('email', '[email protected]'), ('days', '90')]
for attr, val in pairs:
    # doSomethingWithEmail and doSomethingWithDays stand in for your own functions
    if attr == 'email':
        doSomethingWithEmail(val)
    elif attr == 'days':
        doSomethingWithDays(val)
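
Since the question mentions that the main program already has defaults and that the string may give the email, the days, both, or neither, one possible way to handle that is to overlay whatever findall returns on top of a defaults dict. This is only a sketch: DEFAULTS, its values, and resolve_params are made up for illustration, and the real defaults would come from your program:

```python
import re

# Hypothetical defaults; the real values live in the main program.
DEFAULTS = {"email": "admin@example.com", "days": "30"}


def resolve_params(text, defaults=DEFAULTS):
    """Overlay any [type:value] pairs found in text on top of the defaults."""
    found = dict(re.findall(r"\[([^:]*):([^\]]*)\]", text))
    # Ignore unknown types so a typo cannot add unexpected keys.
    overrides = {k: v for k, v in found.items() if k in defaults}
    return {**defaults, **overrides}


for text in ("[email:user@example.com] [days:90]", "[days:7]", ""):
    settings = resolve_params(text)
    print(settings["email"], int(settings["days"]))
```

An empty string simply yields the defaults, so the "neither" case needs no special handling.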

1 Comment

Oh my god that is so elegant. All the answers online had regex strings that were way too long to understand and that wasn't reasonable for maintaining the program. You are a life saver!
