Is there a way to combine multiple strings using Regex?

Question

I'm having an issue with regex and don't really understand its usefulness yet. I'm trying to extract data from a file. Each line of the file consists of a first name, a last name, and a grade.

File:

Peter Jenkins: A  
Robert Right: B  
Kim Long: C  
Jim Jim: B

Opening file code:

# Regex code
regcode = r'([A-Za-z]+)(: B)'

answer = re.findall(regcode, file)
return answer

The expected result is the first name and last name. The actual result is the last name and the letter grade. How do I get just the first name and last name for all B grades?

Do you need regex? I think a simple split and some filter / comprehension should do the job
– Jan Stránský, Commented Sep 7, 2020 at 19:35
OK, why do you need regex? There are easier (IMO) solutions. But the regex way: r'([A-Za-z]+)(: B)' matches a word (in the 1st group) followed by : B (in the 2nd group). Just match one more word and it should work
– Jan Stránský, Commented Sep 7, 2020 at 19:38

HamiltonPharmD · Accepted Answer · 2020-09-19 17:19:49Z

2

Since you must use regex for this task, here's a simple regex solution that returns the full name:

'(.*): B'

This works in this case because:

(.*) captures all the text on a line up to a match of : B (by default . does not match a newline, so a match never spans two lines)
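For illustration, here is a minimal sketch of that pattern with re.findall, assuming the file contents have already been read into a string (the name text is only a placeholder). The second pattern is not part of the answer above; it is a stricter variant, added as an assumption, that anchors the grade at the end of the line so that something like ": B+" would not count as a B.

```python
import re

# Sample data from the question, assumed to be already read into a string.
text = """Peter Jenkins: A
Robert Right: B
Kim Long: C
Jim Jim: B"""

# Pattern from the answer: capture everything on a line before ": B".
names = re.findall(r'(.*): B', text)
print(names)  # ['Robert Right', 'Jim Jim']

# Stricter variant (an assumption, not from the answer): the grade must be
# the last thing on the line, apart from optional trailing spaces or tabs.
strict = re.findall(r'^(.*): B[ \t]*$', text, flags=re.MULTILINE)
print(strict)  # ['Robert Right', 'Jim Jim']
```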

Click here to see my test and matching output. I recommend this site for your regex testing needs.

edited Sep 19, 2020 at 17:19

answered Sep 7, 2020 at 19:58

HamiltonPharmD

Roman Zhak · Accepted Answer · 2020-09-07 20:02:33Z

1

You can do it without regex:

students = '''Peter Jenkins: A
Robert Right: B
Kim Long: C
Jim Jim: B'''
    
for x in students.split('\n'):
    string = x.split(': ')
    if string[1] == 'B':
        print(string[0])

# Robert Right
# Jim Jim

or

[x[0:-3] for x in students.split('\n') if x[-1] == 'B']
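Note that the file in the question has trailing spaces after some grades, which the slicing in the one-liner above does not expect. A hedged sketch of a variant that strips each line first and splits on the last ': ' (the names line, name, sep and grade are just illustrative):

```python
# Sample data with the trailing spaces shown in the question's file.
students = "Peter Jenkins: A  \nRobert Right: B  \nKim Long: C  \nJim Jim: B"

names = []
for line in students.splitlines():
    # rpartition splits on the last ': '; sep is '' when no ': ' is found.
    name, sep, grade = line.strip().rpartition(': ')
    if sep and grade == 'B':
        names.append(name)

print(names)  # ['Robert Right', 'Jim Jim']
```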

edited Sep 7, 2020 at 20:02

answered Sep 7, 2020 at 19:55

Roman Zhak

Jan Stránský · Accepted Answer · 2020-09-07 20:02:35Z

0

If a regex solution is required (I personally prefer Roman Zhak's solution), put what you are interested in inside groups, i.e. the first name and the last name, followed by a colon and B:

import re

file = """
Peter Jenkins: A  
Robert Right: B  
Kim Long: C  
Jim Jim: B
"""

regcode = r'([A-Za-z]+) ([A-Za-z]+): B'
answer = re.findall(regcode, file)
print(answer)  # [('Robert', 'Right'), ('Jim', 'Jim')]
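Since the question title asks about combining strings: with two groups, findall returns one tuple per match, and each tuple can be joined back into a single full name. A small sketch, assuming the same sample data (full_names is an illustrative name, not from the answer):

```python
import re

file = """
Peter Jenkins: A  
Robert Right: B  
Kim Long: C  
Jim Jim: B
"""

regcode = r'([A-Za-z]+) ([A-Za-z]+): B'
answer = re.findall(regcode, file)  # list of (first, last) tuples

# Join each (first, last) tuple into one string.
full_names = [' '.join(pair) for pair in answer]
print(full_names)  # ['Robert Right', 'Jim Jim']
```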

answered Sep 7, 2020 at 20:02

Jan Stránský

DYZ · Accepted Answer · 2020-09-07 20:10:42Z

0

Add a capturing group ('()') to your expression. When the pattern contains a group, findall returns only the text captured by the group; everything outside the group must still match, but it is left out of the result.

re.findall(r'(\w+\s+\w+):\s+B', file)
# ['Robert Right', 'Jim Jim']

'\w' matches any alphanumeric character or underscore, '\s' matches any whitespace character.
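To make the effect of the group concrete, here is a short sketch comparing the same pattern with and without parentheses (the variable names are illustrative):

```python
import re

text = """Peter Jenkins: A
Robert Right: B
Kim Long: C
Jim Jim: B"""

# No group: findall returns the whole match, grade included.
whole = re.findall(r'\w+\s+\w+:\s+B', text)
print(whole)    # ['Robert Right: B', 'Jim Jim: B']

# One group: findall returns only what the group captured.
grouped = re.findall(r'(\w+\s+\w+):\s+B', text)
print(grouped)  # ['Robert Right', 'Jim Jim']
```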

You can add two groups, one for the first name and one for the last name:

re.findall(r'(\w+)\s+(\w+):\s+B', file)
# [('Robert', 'Right'), ('Jim', 'Jim')]

The latter will not work as intended if there are more than two names on one line: only the last two words before the colon are captured.
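A sketch of that limitation and one possible workaround. The line "Mary Ann Smith: B" is made up for illustration and is not in the original file; the second pattern is an assumption of mine that allows any number of space-separated words inside a single group.

```python
import re

# Hypothetical data with a three-word name (not from the question).
text = """Peter Jenkins: A
Mary Ann Smith: B
Jim Jim: B"""

# Two groups: only the last two words before the colon are captured.
two = re.findall(r'(\w+)\s+(\w+):\s+B', text)
print(two)   # [('Ann', 'Smith'), ('Jim', 'Jim')]

# One group with a repeated non-capturing part: captures the full name.
full = re.findall(r'(\w+(?: \w+)*): B', text)
print(full)  # ['Mary Ann Smith', 'Jim Jim']
```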

answered Sep 7, 2020 at 20:10

DYZ
