1

I need to parse multiple lines of text that look, for example, like this:

{"Name":"pathology[876]", "cpu":"0.58","mem":"18.39", "vm":"1542.14"}
{"Name":"/opt/pathology/bin/pathology[876]", "cpu":"0.58","mem":"18.39", "vm":"1542.14"}
{"Name":"/usr/sbin/ofonod[760]", "cpu":"0.00","mem":"0.00", "vm":"0.00"}
{"Name":"/opt/networking/bin/network_manager[370]", "cpu":"0.20","mem":"53.43", "vm":"4225.69"}
{"Name":"/usr/bin/dmrouterd[913]", "cpu":"0.00","mem":"0.00", "vm":"0.00"}

I have to extract every process name, but some names appear on their own and some with their path, which I have to ignore. For example, pathology[876] is the same thing as /opt/pathology/bin/pathology[876]. I need to generalize this so that the process name is extracted regardless of the path. How can I capture the string between the last / and the end of the string?

So far I have come up with the following regex, which handles paths like /opt/<anything>/bin/<anything> by extracting the part after bin/. The problem is longer paths: for /opt/<anything>/bin/pat/pathology[876] I get pat/pathology[876], whereas I want only pathology[876].

"(Name)":("\/opt\/(.*?)\/bin\/(.*?)"|"(.*?)")
4
  • This looks like JSON. Do you want to iterate over each JSON record? Or do you want to treat them as a single entry like a string? Commented Jun 15, 2020 at 14:23
  • I want to treat them like a string, the idea is that this JSON structure was embedded in a longer message, but I only selected it to be more concise with my question. Commented Jun 15, 2020 at 14:36
  • @LiviuIosim Anything wrong with the answer I gave 10 minutes ago? Commented Jun 15, 2020 at 14:42
  • Can we assume that every entry will look like your excerpt? Something like an alphanumeric string followed by a process number enclosed in square brackets? Commented Jun 15, 2020 at 14:49

2 Answers

2

My steps for building such a regex are:

  1. Which characters can (not) appear in the target string? Here every character is allowed except " and /: ([^/\"]+)
  2. What comes before the target string? Here it is an optional prefix like /.../.../ that always starts and ends with /. To match the repeated .../ segments we can write ([^"\/]+\/)*, and to match the leading / and make the whole prefix optional we extend it to (\/([^"\/]+\/)*)?
  3. What comes after the target string? The closing "

The final regex could be:

"Name":"(?:\/(?:[^"\/]+\/)*)?([^/\"]+)"

(Note: the syntax (?:X) groups the expression X without capturing it as a "result group".)
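As a quick check outside regex101, here is a minimal Python sketch that applies the final regex to the sample lines from the question. The names NAME_RE and process_names are made up for illustration, the backslashes before / are dropped because Python does not need them, and the last sample line (with the longer, hypothetical path /opt/x/bin/pat/...) is my own variation on the case described in the question.

```python
import re

# Same pattern as the final regex; '/' needs no escaping in Python.
# Only the part after the last '/' is captured.
NAME_RE = re.compile(r'"Name":"(?:/(?:[^"/]+/)*)?([^/"]+)"')


def process_names(text):
    """Return the process name, without any leading path, for every record."""
    return NAME_RE.findall(text)


sample = '''
{"Name":"pathology[876]", "cpu":"0.58","mem":"18.39", "vm":"1542.14"}
{"Name":"/opt/pathology/bin/pathology[876]", "cpu":"0.58","mem":"18.39", "vm":"1542.14"}
{"Name":"/usr/sbin/ofonod[760]", "cpu":"0.00","mem":"0.00", "vm":"0.00"}
{"Name":"/opt/x/bin/pat/pathology[876]", "cpu":"0.58","mem":"18.39", "vm":"1542.14"}
'''

for name in process_names(sample):
    print(name)
```

Because the pattern is anchored on "Name":" rather than on the start of a line, it should also work when the records are embedded in a longer message, as mentioned in the comments on the question.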

I've tested and saved this regex here: https://regex101.com/r/WnSNNk/2


2 Comments

Albeit a little hard to decipher, this is much more performant than my answer so +1 :-)
@MonkeyZeus thanks for your +1. But I like your "simple" regex more. It is easier to understand. I just didn't see it.
2

This would do it for you:

[^\/"]+(?=", "cpu")

In English:

Per line, find everything that is neither a forward slash nor a double quote, leading up to ", "cpu"
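A minimal Python sketch of the same lookahead, assuming the name is always followed by exactly `", "cpu"` (quote, comma, one space) as in the sample lines. The names BEFORE_CPU_RE and names_before_cpu are invented for this example, and the input without a space after the comma is a hypothetical variation that shows where the assumption matters.

```python
import re

# One or more characters that are neither '/' nor '"', immediately followed
# by the literal text ", "cpu" (checked by the lookahead, not consumed).
BEFORE_CPU_RE = re.compile(r'[^/"]+(?=", "cpu")')


def names_before_cpu(text):
    """Return every slash-free, quote-free run that directly precedes ", "cpu"."""
    return BEFORE_CPU_RE.findall(text)


lines = '''
{"Name":"/opt/networking/bin/network_manager[370]", "cpu":"0.20","mem":"53.43", "vm":"4225.69"}
{"Name":"/usr/bin/dmrouterd[913]", "cpu":"0.00","mem":"0.00", "vm":"0.00"}
'''

print(names_before_cpu(lines))

# Hypothetical input without the space after the comma: the lookahead fails.
print(names_before_cpu('{"Name":"/usr/bin/dmrouterd[913]","cpu":"0.00"}'))
```

The pattern is short because it leans on the fixed text that follows the name; if the field order or spacing can vary, anchoring on "Name":" may be the safer choice.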

https://regex101.com/r/u3rhUf/1/

1 Comment

I am sorry for being inactive yesterday. This works really well, thank you for your input.
