0

Given this string:

l="\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

I want to extract "DIR", so I wrote this regex:

j = re.search(r'cmd: \w+', l)

But when I run:

print(j.group())

I get:

cmd: DIR

What should I do to get only "DIR", without the "cmd: " prefix? For example:

print(j.group())
DIR

Thanks for all answers.

0

4 Answers

5

You need to capture the DIR group in your regex:

j = re.search(r'cmd: (\w+)', l)

Then reference it when retrieving:

print(j.group(1))
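
To make the difference between the whole match and the captured group concrete, here is a minimal sketch using the string from the question. The helper name extract_command is hypothetical (it is not part of the question or of the re module); it only illustrates guarding against a failed search.

```python
import re

l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"


def extract_command(text):
    """Hypothetical helper: return the word after 'cmd: ', or None if absent."""
    m = re.search(r'cmd: (\w+)', text)
    # re.search returns None when nothing matches, so guard before group().
    return m.group(1) if m else None


j = re.search(r'cmd: (\w+)', l)
print(j.group(0))  # the whole match, identical to j.group()
print(j.group(1))  # only the text captured by the parentheses
print(extract_command("no command here"))
```

group(0) still contains the "cmd: " prefix; only group(1) is limited to the parenthesized part.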

2 Comments

Well, but if DIR is a directory (normally something like "some/directory", not just a single "directory"), it won't match.
@Dr.Kameleon It will match up to the slash, which appears to be what is requested.
4

Make it a positive lookbehind assertion:

j = re.search(r'(?<=cmd: )\w+', l)

A group starting with ?<= is a positive lookbehind assertion. It is not part of the match and consumes no characters; it only requires that its content appear immediately before the pattern you want to match.
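
As a quick check of the lookbehind behaviour described above, here is a small sketch using the string from the question. The variable name variable_width_ok is just for illustration. It also shows one limitation worth knowing: Python's re module only accepts fixed-width lookbehinds.

```python
import re

l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

j = re.search(r'(?<=cmd: )\w+', l)
print(j.group())  # the lookbehind text is not part of the match

# Python's re module rejects variable-width lookbehinds at compile time,
# so something like (?<=cmd:\s*) cannot be used to allow flexible spacing.
try:
    re.compile(r'(?<=cmd:\s*)\w+')
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

If the spacing after "cmd:" can vary, a capturing group is the easier route in Python.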

5 Comments

Wouldn't that get cmd rather than what follows it?
@Marcin no. The characters ?<= at the start of the group tell the regex engine that it is a positive lookbehind; that is, that the match should be preceded by that group.
@katrielalex I see. Rather convoluted alternative to just capturing what follows, no?
@Marcin I actually think it's simpler! Using an extra group first matches the wrong thing, then restricts to only part of that match. A lookbehind precisely captures the meaning of "find a word preceded by cmd: ".
@Marcin a lookaround is an advanced regex feature, I wouldn't say convoluted. But those are your two choices: either match only what you want with my solution, or match more and capture the part you want, as in the other solutions.
4

You need to place a group (that is, parentheses) around the part that you want to capture:

j = re.search(r'cmd: (\w+)', l)
k = re.search(r'cmd:\s*(\w+)', l)
print(j.group(1))

You might prefer to use the k version, which handles a variable amount of whitespace between "cmd:" and what follows.
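
To see where the two patterns differ, here is a minimal sketch that runs both against a few made-up inputs (the sample strings are assumptions for illustration, not taken from the question):

```python
import re

# Hypothetical inputs with different spacing after "cmd:".
samples = ["cmd: DIR", "cmd:DIR", "cmd:   DIR", "cmd:\tDIR"]

fixed = []     # results for the single-space pattern (j)
flexible = []  # results for the \s* pattern (k)
for s in samples:
    j = re.search(r'cmd: (\w+)', s)
    k = re.search(r'cmd:\s*(\w+)', s)
    fixed.append(j.group(1) if j else None)
    flexible.append(k.group(1) if k else None)

print(fixed)
print(flexible)
```

The j pattern only matches when exactly one space follows "cmd:". Note that \s also matches newlines, so the k pattern can pick up a word on the next line if nothing else follows "cmd:" on its own line.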

4 Comments

Well, but if DIR is a directory (normally something like "some/directory", not just a single "directory"), it won't match.
@Dr.Kameleon What are you talking about?
I mean it won't match something like cmd: another/dir. Isn't it possible that the DIR the OP refers to is a "directory path"? In that case, I suppose we should also take into account the / and \ characters when matching...
@Dr.Kameleon Perhaps s/he could face that issue, but there is nothing in the question to suggest that OP has that issue, and s/he certainly does not request help with that issue.
-1

RE-RE-FIXED

Here's your regex: cmd:\s([\w/\\]+)\s@[0-9]+\s


Hint: it matches cmd: somedir @12312312 as well as cmd: another/dir @123123 (group 1 holds the name or path).
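
A rough sketch of how this pattern behaves in Python, with the character class written as [\w/\\] (word characters, forward slash, backslash). The sample strings are made up for illustration:

```python
import re

pattern = re.compile(r'cmd:\s([\w/\\]+)\s@[0-9]+\s')

samples = [
    "cmd: somedir @12312312 state",
    "cmd: another/dir @123123 state",
    r"cmd: another\dir @123123 state",
    "cmd: another/dir @123123",  # nothing after the digits
]

captured = []
for s in samples:
    m = pattern.search(s)
    # group(1) is the name or path between "cmd: " and the "@" timestamp.
    captured.append(m.group(1) if m else None)

print(captured)
```

Because the pattern ends in \s, the last sample does not match: some whitespace has to follow the digits. That is the case in the question's string, where " (2012.03.20 ..." comes after the timestamp.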

2 Comments

Did you read the question? This doesn't do what he asks, nor does it respect the requirement implicit in the question.
@sr2222 Well... my mistake... I just corrected it... (hopefully)
