I have a string that I need to extract values from, but its format is inconsistent. Here is an example script containing the string:

import re

RAW_Data = "Name Multiple Words Zero Row* (78.59/0) Name Multiple Words2* (96/24.56) Name Multiple Words3* (0/32.45) Name Multiple Words4* (96/12.58) Name Multiple Words5* (96/0) Name Multiple Words Zero Row6* (0) Name Multiple Words7* (96/95.57) Name Multiple Words Zero Row8* (0) Name Multiple Words9*"

First_Num = re.findall(r'\((.*?)\/*', RAW_Data)
Seg_Length = re.findall(r'\/(.*?)\)', RAW_Data)
#WithinParenthesis = re.findall(r'\((.*?)\)', RAW_Data) #This works correctly

print(First_Num)
print(Seg_Length)

del RAW_Data

What I need to get out of the string are all the values inside the parentheses. However, I need some logic to handle the case where there is no "/" between the numbers: if the "/" doesn't exist, both First_Num and Seg_Length should be set to "0". I hope this makes sense.

2 Answers

Use a simple regex and add some programming logic:

import re
rx = r'\(([^)]+)\)'
string = """Name Multiple Words Zero Row* (78.59/0) Name Multiple Words2* (96/24.56) Name Multiple Words3* (0/32.45) Name Multiple Words4* (96/12.58) Name Multiple Words5* (96/0) Name Multiple Words Zero Row6* (0) Name Multiple Words7* (96/95.57) Name Multiple Words Zero Row8* (0) Name Multiple Words9*"""

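for match in re.finditer(rx, string):
    # Split the text inside each pair of parentheses on the slash
    parts = match.group(1).split('/')
    First_Num = parts[0]
    try:
        Seg_Length = parts[1]
    except IndexError:
        # No "/" in this group, so there is no second value
        Seg_Length = None

    # This form of print runs under both Python 2 and Python 3
    print("First_Num, Seg_Length: %s %s" % (First_Num, Seg_Length))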

You could probably manage with a regex-only solution (for example, with a conditional regex), but this approach is more likely to still be understandable in three months. See a demo on ideone.com.
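For comparison, here is a rough sketch of what a regex-only route might look like. It is not the code from the demo: it uses an optional second group rather than a true conditional, it is written for Python 3, and the names `PAIR_RX` and `pairs` are made up for illustration. It also assumes the question's rule that a group without a "/" should yield "0" for both values.

```python
import re

RAW_Data = ("Name Multiple Words Zero Row* (78.59/0) Name Multiple Words2* (96/24.56) "
            "Name Multiple Words3* (0/32.45) Name Multiple Words4* (96/12.58) "
            "Name Multiple Words5* (96/0) Name Multiple Words Zero Row6* (0) "
            "Name Multiple Words7* (96/95.57) Name Multiple Words Zero Row8* (0) "
            "Name Multiple Words9*")

# Group 1: everything up to the first "/" or ")".
# Group 2 (optional): everything after the "/" up to ")".
PAIR_RX = re.compile(r'\(([^/)]+)(?:/([^)]+))?\)')

pairs = []
for first, seg in PAIR_RX.findall(RAW_Data):
    # findall returns '' for an optional group that did not take part in the match
    if not seg:
        # Assumption taken from the question: no "/" means both values are "0"
        first, seg = "0", "0"
    pairs.append((first, seg))

for first, seg in pairs:
    print("First_Num, Seg_Length:", first, seg)
```

Collecting the tuples in a list also keeps every pair available after the loop, rather than only the last one.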

4 Comments

That looks like it'll work Jan. I ran it on some sample data, and it's doing exactly what I want it to. Awesome! Thanks
@user1457123: Glad to help :)
Jan, do you have any suggestions on how I can run this within a nested loop? When I try, I get the last set of values for First_Num and Seg_Length only. Maybe I should post a new question for this?
Yes, post a new question :-)

You are trying to find values on each side of a '/' that you know may not exist. Fall back to the condition that always holds for your initial search: use a regular expression with findall to capture everything within each pair of parentheses, then process each captured value based on whether it contains a '/'.
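A minimal sketch of that idea, assuming Python 3 and the question's rule that a missing '/' means both values become "0". The function name `split_values` and the shortened `sample` string are hypothetical, and the pattern is the one the question already reports as working:

```python
import re

def split_values(raw):
    """Return a list of (First_Num, Seg_Length) string pairs found in raw."""
    results = []
    # Step 1: grab whatever sits inside each pair of parentheses
    for inner in re.findall(r'\((.*?)\)', raw):
        if '/' in inner:
            # Step 2a: a slash is present, so split into the two values
            first_num, seg_length = inner.split('/', 1)
        else:
            # Step 2b: no slash, so default both values to "0" (per the question)
            first_num, seg_length = "0", "0"
        results.append((first_num, seg_length))
    return results

sample = "Name One* (78.59/0) Name Two* (96/24.56) Name Zero Row* (0) Name Three*"
for first_num, seg_length in split_values(sample):
    print(first_num, seg_length)
```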
