I am trying to extract a substring after matching a pattern in a string.

I can't share my whole file, but let's take this example.

From this string:

{"code":"S02A5","name":"18\" Leichtmetallräder Doppelspeiche 397","price":"0","standard":"false"}

I want to extract this substring

18\" Leichtmetallräder Doppelspeiche 397

So far I tried the following :

This matches too many results:

grep -oP '(?<="code":".....","name":")[^"]+'

I know that the first character after "name":" is always 1, so I tried to use that in the following command. It returns 8\, which is not that bad because I can add the 1 back afterwards.

grep -oP '(?<="code":".....","name":"1)[^"]+'

The problem is that I can't find a way to retrieve the rest of the substring needed, because there's an extra quotation mark after that backslash.

Any ideas how I can solve this?

1
  • Please use a tool like jq for handling structured, formatted data like JSON. Using grep to do it is like stirring paint with a screwdriver. Commented Dec 10, 2018 at 15:23

2 Answers

2

That looks like JSON, so use a JSON tool such as jq:

$ jq '.name' file
"18\" Leichtmetallräder Doppelspeiche 397"

or

$ jq -r '.name' file
18" Leichtmetallräder Doppelspeiche 397

Update:

If you need to use grep:

$ grep -oP '(?<="name":")(\\"|[^"])+' file
18\" Leichtmetallräder Doppelspeiche 397

Explained:

  • (?<="name":") is a positive lookbehind: the match must be preceded by "name":"
  • (\\"|[^"])+ then matches one or more escaped quotes (\") or non-quote characters
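To see the difference the escaped-quote alternative makes, here is a minimal sketch that runs both patterns on the sample record from the question. It assumes GNU grep built with PCRE support (-P) and reads the record from stdin rather than from file; the variable names json, short and full are only illustrative.

```shell
# Sample record from the question, kept in single quotes so \" stays literal.
json='{"code":"S02A5","name":"18\" Leichtmetallräder Doppelspeiche 397","price":"0","standard":"false"}'

# Plain negated class: stops at the first double quote, even an escaped one.
short=$(printf '%s\n' "$json" | grep -oP '(?<="name":")[^"]+')

# Escaped-quote alternative listed first, so \" is consumed as one unit.
full=$(printf '%s\n' "$json" | grep -oP '(?<="name":")(\\"|[^"])+')

printf '%s\n' "$short"   # 18\
printf '%s\n' "$full"    # 18\" Leichtmetallräder Doppelspeiche 397
```

The order of the alternatives does not change the result here, because [^"] can never consume the quote itself, but putting \\" first makes the intent easier to read.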

OR:

Maybe it should be:

$ grep -oP '(?<="name":")((?<![^\\]\\)\\"|[^"])+' file

since that matches \" and \\\" but stops before the quote in \\" (an escaped backslash followed by the closing quote).
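The three cases can be checked with a small sketch. The records below are made up for illustration (they are not from the poster's data), and GNU grep with -P is assumed.

```shell
# Refined pattern from the answer; single quotes keep the backslashes literal.
pat='(?<="name":")((?<![^\\]\\)\\"|[^"])+'

# Hypothetical records with one, three and two backslashes before a quote.
one=$(printf '%s\n' '{"name":"a\"b"}' | grep -oP "$pat")
three=$(printf '%s\n' '{"name":"a\\\"b"}' | grep -oP "$pat")
two=$(printf '%s\n' '{"name":"a\\","x":"y"}' | grep -oP "$pat")

printf '%s\n' "$one"     # a\"b
printf '%s\n' "$three"   # a\\\"b
printf '%s\n' "$two"     # a\\   (stops at the real closing quote)
```

The lookbehind only inspects the two characters before the backslash, so a run of four backslashes before a quote would still be misread. That is one more reason a real JSON parser is the more robust choice when it is available.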


4 Comments

This looks fine, but the problem is that I have just a text file (a payload) as an input file, nothing more.
Updated with a grep solution.
This works just fine for this example, but on my whole file it retrieves more substrings than I want. If I adjust it, it should work fine. Thanks.
That's why it's important to make the sample as detailed as possible. We can't guess what is in the real data, so we leave it to the question poster to finalize the implementation.
0

If you are considering Perl, this should work:

/tmp> export data='{"code":"S02A5","name":"18\" Leichtmetallräder Doppelspeiche 397","price":"0","standard":"false"}'
/tmp> echo "$data" | perl -ne  ' /\"name\":(.+?),/ and print "$1\n" '
"18\" Leichtmetallräder Doppelspeiche 397"
/tmp>
