How do you do regex in Python?

Question

I have a string like this:

data='WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'

I need to get rid of everything up to and including the first underscore, using a regex.

I've tried this:

re.sub("(^.*\_)", "", data)

but this gets rid of everything up to the last underscore:

ProcessCpuUsage

I need it to be:

jvmRuntimeModule_ProcessCpuUsage

You really don't even need to use regex for this.

– l'L'l, Commented Jan 6, 2015 at 20:29
Definitely don't use regex. It's much slower.

– mbomb007, Commented Jan 6, 2015 at 20:40
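The speed claim in the comment above is easy to check on your own machine. A rough sketch using the standard timeit module follows; the function names are made up for illustration, and the numbers depend on the Python version and hardware, so no particular result is promised here.

```python
import re
import timeit

data = 'WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'

# Precompile so the regex variant is not penalised for pattern parsing.
pattern = re.compile(r'^[^_]*_')

def with_regex():
    return pattern.sub('', data)

def with_split():
    return data.split('_', 1)[1]

def with_find():
    return data[data.find('_') + 1:]

# All three approaches should agree before their speed is compared.
assert with_regex() == with_split() == with_find()

for fn in (with_regex, with_split, with_find):
    seconds = timeit.timeit(fn, number=100_000)
    print(f'{fn.__name__}: {seconds:.4f} s per 100,000 calls')
```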

mbomb007 · Accepted Answer · 2015-01-06 20:39:18Z

2

Use this instead:

data = 'WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'
# str.find returns the index of the first underscore; slice just past it
result = data[data.find("_") + 1:]
print(result)

answered Jan 6, 2015 at 20:39

mbomb007


Chad Miller · Accepted Answer · 2015-01-06 20:33:59Z

1

re.sub("(^.*\_)", "", data)

This makes .* match every character in the line. Once it reaches the end and cannot match any more characters, it moves on to the next token. Oops, that's an underscore! So it backtracks until it is just before _ProcessCpuUsage, where it can match the underscore, and then completes the match. That is why everything up to the last underscore is removed.

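You should ask the * quantifier to be less greedy by adding a ? after it. You also do not need to capture the contents, so drop the parens. The backslash before the underscore does nothing. Drop it. Keep the leading ^ anchor, though: re.sub replaces every match, and without the anchor the lazy pattern would go on to strip the second prefix as well.

re.sub("^.*?_", "", data)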
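To see the difference between greedy and lazy matching side by side, here is a small sketch. The variable names are only for illustration. Note that re.sub replaces all non-overlapping matches unless the pattern is anchored or count is given.

```python
import re

data = 'WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'

# Greedy: .* runs to the end of the string, then backtracks to the LAST underscore.
greedy = re.sub(r'^.*_', '', data)

# Lazy and anchored: .*? stops at the FIRST underscore, and ^ prevents
# any further match later in the string.
lazy_anchored = re.sub(r'^.*?_', '', data)

# Lazy without an anchor: re.sub replaces every match, so both prefixes are removed.
lazy_unanchored = re.sub(r'.*?_', '', data)

# Lazy without an anchor, but limited to a single replacement via count=1.
lazy_once = re.sub(r'.*?_', '', data, count=1)

print(greedy)           # ProcessCpuUsage
print(lazy_anchored)    # jvmRuntimeModule_ProcessCpuUsage
print(lazy_unanchored)  # ProcessCpuUsage
print(lazy_once)        # jvmRuntimeModule_ProcessCpuUsage
```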

answered Jan 6, 2015 at 20:33

Chad Miller


Mark Ransom · Accepted Answer · 2015-01-06 20:34:54Z

1

You have become a victim of greedy matching. The expression matches the longest sequence that it possibly can.

I know there's a way to turn off greedy matching, but I never remember it. Instead there's a trick I use when there's a character I want to stop at. Instead of matching on every character with . I match on every character except the one I want to stop at.

re.sub("^[^_]*_", "", data)
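A minimal sketch of the negated character class trick, including what happens when the string has no underscore at all (the second input string is made up for illustration):

```python
import re

data = 'WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'

# [^_]* can only consume non-underscore characters, so the match
# necessarily ends at the first underscore; no lazy quantifier is needed.
first_removed = re.sub(r'^[^_]*_', '', data)
print(first_removed)  # jvmRuntimeModule_ProcessCpuUsage

# With no underscore present, nothing matches and the string is unchanged.
unchanged = re.sub(r'^[^_]*_', '', 'NoUnderscoreHere')
print(unchanged)  # NoUnderscoreHere
```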

answered Jan 6, 2015 at 20:34

Mark Ransom


Finwood · Accepted Answer · 2015-01-06 20:35:29Z

1

This should do:

import re
def get_last_part(d):
    m = re.match('[^_]*_(.*)', d)
    if m:
        return m.group(1)
    else:
        return None

print(get_last_part('WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'))

answered Jan 6, 2015 at 20:35

Finwood


Hackaholic · Accepted Answer · 2015-01-06 20:41:05Z

1

You can use str.index:

>>> data = 'WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'
>>> data[data.index('_')+1:]
'jvmRuntimeModule_ProcessCpuUsage'

Using str.split:

>>> data.split('_',1)[1]
'jvmRuntimeModule_ProcessCpuUsage'

Using str.find:

>>> data[data.find('_')+1:]
'jvmRuntimeModule_ProcessCpuUsage'

Take a look at the string methods in the Python documentation.
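The three string methods above agree when an underscore is present but behave differently when it is missing. The sketch below illustrates that; the helper names are hypothetical, and str.partition is an extra option not mentioned in the answer, included only for comparison.

```python
def strip_prefix_index(s):
    # str.index raises ValueError when '_' is absent.
    return s[s.index('_') + 1:]

def strip_prefix_find(s):
    # str.find returns -1 when '_' is absent, so the slice starts at 0
    # and the whole string comes back unchanged.
    return s[s.find('_') + 1:]

def strip_prefix_split(s):
    # split('_', 1) yields a one-element list when '_' is absent,
    # so indexing with [1] raises IndexError.
    return s.split('_', 1)[1]

def strip_prefix_partition(s):
    # str.partition always returns a 3-tuple; the tail is '' when '_' is absent.
    return s.partition('_')[2]

data = 'WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'
for fn in (strip_prefix_index, strip_prefix_find,
           strip_prefix_split, strip_prefix_partition):
    # Report the result, or the exception type, for an input without '_'.
    try:
        missing = repr(fn('NoUnderscoreHere'))
    except (ValueError, IndexError) as exc:
        missing = type(exc).__name__
    print(fn.__name__, '->', fn(data), '| no underscore:', missing)
```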

edited Jan 6, 2015 at 20:41

answered Jan 6, 2015 at 20:34

Hackaholic


Andie2302 · Accepted Answer · 2015-01-06 20:45:17Z

1

Try this regex:

result = re.sub("^.*?_", "", text)

What the regex ^.*?_ does:

^ .. Assert that the position is at the beginning of the string.
.*? .. Match any character that is not a line break, between zero and unlimited times, as few times as possible.
_ .. Match the character _
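Because the dot does not match a line break by default, the pattern behaves differently on multi-line input. A small sketch, assuming a made-up two-line string where the first underscore sits on the second line:

```python
import re

text = 'header\nWebSpherePMI_jvmRuntimeModule'

# Default flags: ^ matches only at the start of the string and . cannot
# cross the newline, so the pattern never reaches an underscore and
# the text is returned unchanged.
default_result = re.sub(r'^.*?_', '', text)

# re.DOTALL lets . match newlines too, so the lazy match now extends
# from the start of the string to the first underscore.
dotall_result = re.sub(r'^.*?_', '', text, flags=re.DOTALL)

print(repr(default_result))  # 'header\nWebSpherePMI_jvmRuntimeModule'
print(repr(dotall_result))   # 'jvmRuntimeModule'
```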

edited Jan 6, 2015 at 20:45

answered Jan 6, 2015 at 20:30

Andie2302


l'L'l · Accepted Answer · 2015-01-06 20:46:14Z

1

Try using split():

s = 'WebSpherePMI_jvmRuntimeModule_ProcessCpuUsage'
print(s.split('_',1)[1])

Result:

jvmRuntimeModule_ProcessCpuUsage

edited Jan 6, 2015 at 20:46

answered Jan 6, 2015 at 20:35

l'L'l
