I have text of the type:

  • "Choice values selected: Option 1, or Option 2, or Option 3"
  • "Choice value selected: Option 1, or Option 2, or Option 3"
  • "Choice value selected = Option 1 , or Option 2, or Option 3"

I need to extract everything that is after the : or =

I have tried to go about it this way:

import regex as re
r = re.compile(r'Choice(.+?)selected')
r.split(str)

I don't know how to capture the : or =

  • Just re.split should be enough @alannaC Commented May 2, 2019 at 10:35
  • ok darn yes that didn't occur to me. :facepalm: Commented May 2, 2019 at 10:37
  • No issues, I have done the same in my answer below! Please check it and accept it if it helped you :) @alannaC Commented May 2, 2019 at 10:38

2 Answers

You don't need a capture group for this; just use re.split with a pattern that splits on either : or =

li = ["Choice values selected: Option 1, or Option 2, or Option 3", "Choice value selected: Option 1, or Option 2, or Option 3",
      "Choice value selected = Option 1 , or Option 2, or Option 3"]

import re
for item in li:
    # Split once on the first : or =, take the part after it, and strip whitespace
    print(re.split(':|=', item, maxsplit=1)[1].strip())

The output will be

Option 1, or Option 2, or Option 3
Option 1, or Option 2, or Option 3
Option 1 , or Option 2, or Option 3
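
If some inputs might contain neither delimiter, indexing with [1] would raise an IndexError. Here is a minimal sketch of a guarded version; the helper name extract_choices is hypothetical and only used for illustration:

```python
import re

def extract_choices(text):
    """Return the text after the first ':' or '=', or None if neither is present.

    Hypothetical helper for illustration only.
    """
    # maxsplit=1 keeps any later ':' or '=' inside the extracted value
    parts = re.split(r'[:=]', text, maxsplit=1)
    if len(parts) < 2:
        return None
    return parts[1].strip()

print(extract_choices("Choice value selected = Option 1 , or Option 2, or Option 3"))
print(extract_choices("No delimiter here"))
```

Returning None for a missing delimiter is just one possible choice; raising a ValueError may suit stricter callers better.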

You can use this regex,

[:=]\s*(.*)

and read your value from group 1.

The pattern first matches either : or =, then \s* consumes any optional whitespace, and (.*) captures the rest of the line in group 1.

Python code,

import re

arr = ['Choice values selected: Option 1, or Option 2, or Option 3','Choice value selected: Option 1, or Option 2, or Option 3','Choice value selected = Option 1 , or Option 2, or Option 3']

for s in arr:
    # Group 1 holds everything after the delimiter and any following whitespace
    m = re.search(r'[:=]\s*(.*)', s)
    if m:
        print(s, '-->', m.group(1))

Output,

Choice values selected: Option 1, or Option 2, or Option 3 --> Option 1, or Option 2, or Option 3
Choice value selected: Option 1, or Option 2, or Option 3 --> Option 1, or Option 2, or Option 3
Choice value selected = Option 1 , or Option 2, or Option 3 --> Option 1 , or Option 2, or Option 3
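
To see the difference between what the pattern matches and what it captures, the short sketch below compares group(0) with group(1) on one of the sample strings. The variable name sample is only illustrative:

```python
import re

sample = "Choice value selected = Option 1 , or Option 2, or Option 3"
m = re.search(r'[:=]\s*(.*)', sample)

# group(0) is the whole match: the delimiter, the whitespace, and the rest
print(repr(m.group(0)))
# group(1) is only the parenthesised part: the text after the delimiter
print(repr(m.group(1)))
```

Because \s* sits outside the parentheses, the whitespace after the delimiter is consumed but not captured, so no strip() call is needed on the leading side.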

Alternatively, if you prefer re.split, you can split on the character class [=:], which matches either = or :. Note that the result keeps its leading space because it is not stripped.

import re

arr = ['Choice values selected: Option 1, or Option 2, or Option 3','Choice value selected: Option 1, or Option 2, or Option 3','Choice value selected = Option 1 , or Option 2, or Option 3']

# Compile the pattern once, outside the loop
r = re.compile(r'[:=]')
for s in arr:
    print(r.split(s)[1])

Output,

 Option 1, or Option 2, or Option 3
 Option 1, or Option 2, or Option 3
 Option 1 , or Option 2, or Option 3
