1

I have some strings that look like this:

S25m\S25m_16Q_-2dB.png
S25m\S25m_1_16Q_0dB.png
S25m\S25m_2_16Q_2dB.png

I want to get the string between the backslash and the last underscore, and also the string between the last underscore and the extension.

Desired:

[S25m_16Q, S25m_1_16Q, S25m_2_16Q]
[-2dB, 0dB, 2dB]

I was able to get the whole thing between the backslash and the extension by doing:

import re
foo = r"S25m\S25m_16Q_-2dB.png"
match = re.search(r'([a-zA-Z0-9_-]*)\.(\w+)', foo)
match.group(1)

But I don't know how to write a pattern that splits it at the last underscore.

3 Answers

5

Capture the groups you want to get.

>>> import re
>>> re.search(r'([-\w]*)_([-\w]+)\.\w+', r"S25m\S25m_16Q_-2dB.png").groups()
('S25m_16Q', '-2dB')
>>> re.search(r'([-\w]*)_([-\w]+)\.\w+', r"S25m\S25m_1_16Q_0dB.png").groups()
('S25m_1_16Q', '0dB')
>>> re.search(r'([-\w]*)_([-\w]+)\.\w+', r"S25m\S25m_2_16Q_2dB.png").groups()
('S25m_2_16Q', '2dB')

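* is greedy: [-\w]* first consumes as many characters as it can, underscores included, because \w matches letters, digits, and the underscore. It then backtracks only as far as needed for the rest of the pattern to match, so the first group ends at the last _. The backslash is not in [-\w], which is why the match starts right after it.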
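To see that greediness is what picks the last underscore, here is a small sketch comparing the pattern above with a lazy variant ([-\w]*?). The names GREEDY, LAZY, and path are only illustrative.

```python
import re

# Greedy first group: grabs as much as possible, then backtracks to the LAST "_".
GREEDY = re.compile(r'([-\w]*)_([-\w]+)\.\w+')
# Lazy first group: grows one character at a time, so it stops at the FIRST "_".
LAZY = re.compile(r'([-\w]*?)_([-\w]+)\.\w+')

path = r"S25m\S25m_1_16Q_0dB.png"

greedy_groups = GREEDY.search(path).groups()
lazy_groups = LAZY.search(path).groups()

print(greedy_groups)  # ('S25m_1_16Q', '0dB')
print(lazy_groups)    # ('S25m', '1_16Q_0dB')
```

The only difference between the two patterns is the ? after the first *, which is enough to move the split from the last underscore to the first one.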


>>> list(zip(*[m.groups() for m in re.finditer(r'([-\w]*)_([-\w]+)\.\w+', r'''
... S25m\S25m_16Q_-2dB.png
... S25m\S25m_1_16Q_0dB.png
... S25m\S25m_2_16Q_2dB.png
... ''')]))
[('S25m_16Q', 'S25m_1_16Q', 'S25m_2_16Q'), ('-2dB', '0dB', '2dB')]
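If the strings are already in a list rather than one block of text, the same pattern can be wrapped in a small helper. This is only a sketch: split_names and PATTERN are hypothetical names, and silently skipping strings that do not match is an assumption about the desired behaviour.

```python
import re

# Same pattern as above, compiled once for reuse.
PATTERN = re.compile(r'([-\w]*)_([-\w]+)\.\w+')

def split_names(paths):
    """Return (prefixes, suffixes) for every string that PATTERN matches."""
    prefixes, suffixes = [], []
    for path in paths:
        match = PATTERN.search(path)
        if match is None:
            continue  # assumption: strings the pattern does not match are skipped
        prefix, suffix = match.groups()
        prefixes.append(prefix)
        suffixes.append(suffix)
    return prefixes, suffixes

paths = [
    r"S25m\S25m_16Q_-2dB.png",
    r"S25m\S25m_1_16Q_0dB.png",
    r"S25m\S25m_2_16Q_2dB.png",
]
prefixes, suffixes = split_names(paths)
print(prefixes)  # ['S25m_16Q', 'S25m_1_16Q', 'S25m_2_16Q']
print(suffixes)  # ['-2dB', '0dB', '2dB']
```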

3 Comments

Thanks! How did the pattern differentiate different underscores? Where is the "last underscore" part?
I see. Now I can understand this one. I guess I don't know re very well in general.
It also helps to know that the underscore is included in the \w character class.
0

A non-regex solution (albeit rather messy):

>>> import os
>>> s = r"S25m\S25m_16Q_-2dB.png"
>>> first, _, last = s.partition("\\")[2].rpartition('_')
>>> print((first, os.path.splitext(last)[0]))
('S25m_16Q', '-2dB')
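A somewhat tidier non-regex variant, sketched under the assumption that the strings are Windows-style paths: pathlib.PureWindowsPath treats the backslash as a separator on any platform, and its stem attribute drops both the directory and the extension. The function name split_last_underscore is hypothetical.

```python
from pathlib import PureWindowsPath

def split_last_underscore(path):
    """Split the file name (without extension) at its last underscore."""
    # stem is the final path component with the extension removed,
    # e.g. 'S25m_16Q_-2dB' for 'S25m\\S25m_16Q_-2dB.png'.
    stem = PureWindowsPath(path).stem
    # rpartition splits at the LAST occurrence of the separator.
    prefix, _, suffix = stem.rpartition("_")
    return prefix, suffix

result = split_last_underscore(r"S25m\S25m_16Q_-2dB.png")
print(result)  # ('S25m_16Q', '-2dB')
```

Note that if the name contains no underscore, rpartition returns an empty prefix and puts the whole stem in the second element.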

-3

I know it says using re, but why not just use split?

strings = r"""S25m\S25m_16Q_-2dB.png
S25m\S25m_1_16Q_0dB.png
S25m\S25m_2_16Q_2dB.png"""

strings = strings.split("\n")

parts = []
for string in strings:
    string = string.split(".png")[0]  # get rid of the file extension
    string = string.split("\\")  # split the directory from the file name
    splitString = string[1].split("_")
    firstPart = "_".join(splitString[:-1])  # string between backslash and last underscore
    parts.append([firstPart, splitString[-1]])


for line in parts:
    print(line)
['S25m_16Q', '-2dB']
['S25m_1_16Q', '0dB']
['S25m_2_16Q', '2dB']

Then just transpose the list with zip:

for line in zip(*parts):
    print(line)
('S25m_16Q', 'S25m_1_16Q', 'S25m_2_16Q')
('-2dB', '0dB', '2dB')

1 Comment

Thanks for the downvotes, guys, while I was editing a question I accidentally submitted before it was finished!
