0

I am trying to parse the output of a Windows command prompt command that gets the Caption and ProcessId of child processes for a process. The command returns output in the following format:

Caption   ProcessId\r\r\nnotepad++.exe 40000 \r\r\nnfilezilla.exe 90000 \r\r\n\r\r\n

The regex I am trying to use is:

Caption\s*ProcessId((?:\r\r\n)([a-zA-z\W]+.exe)\s*(\d+)\s*)*

Here is what I am trying to do:

  1. Match the start of the output Caption ProcessId
  2. Capture the caption and process ID of each process in the output
    1. Using the non-capture group, match the two carriage returns \r and the single newline character \n that precede the process information.
    2. Within the first capture group, capture the caption of the process
    3. Match any whitespace between the caption and process ID
    4. Within the second capture group, capture the process ID
    5. Continue matching within the non-capture group zero or more times

I have been using https://regex101.com/r/Zqo6FW/47 with the regex and example string I used above. Doing so, I match only Caption ProcessId and I can't seem to match the carriage returns and newline characters.

How can I modify my regex to successfully match the example output?

3
  • Are you sure the regex in your regex101 link is the right one? It does not look like the one in your question. Commented Oct 19, 2018 at 13:02
  • @Ekrem good catch, I just fixed it Commented Oct 19, 2018 at 13:06
  • The cmd tag is related to Microsoft Windows cmd.exe. If this is not about Windows, please remove the cmd tag. Commented Oct 19, 2018 at 19:40

3 Answers

2
((?:\\r\\r\\n)([a-zA-z\W]+\.exe)\s*(\d+)\s*)

Could you try this? You did not escape the backslashes in \r and \n: to match those sequences as literal text, each one needs an extra backslash. Apply this regex repeatedly over the output to get every process.

Regex101
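As a rough illustration of iterating with this regex, here is a minimal Python sketch. It assumes the sample text really contains the two-character sequences \r and \n (hence the raw string), drops the outer capture group so that group 1 is the caption and group 2 the process ID, and escapes the dot before exe. The variable names are made up for the example.

```python
import re

# Sample from the question. The raw string keeps \r and \n as literal
# backslash sequences (assumption: that is how the text reaches the regex).
output = r"Caption   ProcessId\r\r\nnotepad++.exe 40000 \r\r\nnfilezilla.exe 90000 \r\r\n\r\r\n"

# Each \\ in the pattern matches one literal backslash in the text.
pattern = re.compile(r"\\r\\r\\n([a-zA-z\W]+\.exe)\s*(\d+)\s*")

# One match per process: group 1 is the caption, group 2 the process ID.
processes = [(m.group(1), int(m.group(2))) for m in pattern.finditer(output)]
print(processes)
```

If the output instead holds real carriage-return and line-feed characters, the doubled backslashes would not be needed.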

1 Comment

I tried this with a few tweaks, like removing the first capture group that wraps the rest of the expression, and it was just what I was looking for
1

If you want to match that string literally, you have to escape the backslash: use \\r to match the two characters \r.

To match Caption ProcessId, capture the caption of the process in the first capturing group, and capture the process ID in the second capturing group, you could use an alternation:

^Caption\s*ProcessId|\\r\\r\\n(\S+)\s+(\d+)

Regex demo

That will match:

  • ^ Assert the start of the string
  • Caption\s*ProcessId Match Caption ProcessId
  • | Or
  • \\r\\r\\n(\S+)\s+(\d+) Match \r\r\n literally, then capture one or more non-whitespace characters in group 1 (\S+), match one or more whitespace characters \s+, and capture one or more digits in group 2 (\d+)

If the caption of the process should end in .exe, you could change (\S+) to (\S+\.exe)
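A short Python sketch of how the alternation behaves, again assuming the text contains literal \r and \n sequences. The header match leaves both groups empty, so it has to be filtered out when collecting processes; `header_found` and `processes` are names chosen only for this example.

```python
import re

# Sample from the question, kept as literal backslash sequences (assumption).
output = r"Caption   ProcessId\r\r\nnotepad++.exe 40000 \r\r\nnfilezilla.exe 90000 \r\r\n\r\r\n"

pattern = re.compile(r"^Caption\s*ProcessId|\\r\\r\\n(\S+)\s+(\d+)")
matches = list(pattern.finditer(output))

# The first alternative matches only the header and sets no groups.
header_found = bool(matches) and matches[0].group(0).startswith("Caption")

# The second alternative fills group 1 (caption) and group 2 (process ID).
processes = [(m.group(1), int(m.group(2)))
             for m in matches if m.group(1) is not None]
print(header_found, processes)
```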

1 Comment

Thanks for the great explanation
1

If your regex engine is PCRE-compatible, you can use \G to anchor each match at the point where the previous match ended. You can also just use \s to match the \r and \n characters. Try this:

(?:^Caption\s*ProcessId|\G)((?:\s*)([a-zA-z\W]+\.exe)\s*(\d+))

Caption is in Group 2 and ProcessID in Group 3.

Demo on Regex101
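Python's built-in re module does not support \G, so the sketch below only approximates the idea: it matches the header once, then calls `Pattern.match` repeatedly with a start position so that each row must begin exactly where the previous match ended. It assumes the output holds real carriage-return and line-feed characters, which is what lets \s consume them; the pattern is adapted from the one above and the variable names are illustrative.

```python
import re

# Here the string holds real CR and LF characters (assumption), not
# literal backslash sequences, so \s can match them.
output = "Caption   ProcessId\r\r\nnotepad++.exe 40000 \r\r\nnfilezilla.exe 90000 \r\r\n\r\r\n"

header = re.match(r"Caption\s*ProcessId", output)
row = re.compile(r"\s*([a-zA-z\W]+\.exe)\s*(\d+)")

processes = []
if header:
    pos = header.end()
    while True:
        # match() anchors at pos, mimicking \G: rows must be contiguous.
        m = row.match(output, pos)
        if m is None:
            break
        processes.append((m.group(1), int(m.group(2))))
        pos = m.end()
print(processes)
```

Because the header is handled separately, the caption and process ID land in groups 1 and 2 here rather than 2 and 3.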
