2

How do I match the URL in this string? I have other code that matches text and seems to work, but when I try it here it keeps saying "No such file or directory". I didn't know grep -o only worked on files.

matchString='url={"urlPath":"http://www.google.com/","thisIsOtherText"'
array=($(grep -o 'url={"urlPath":"([^"]+)"' "$matchString"))
grep: url={"urlPath":"http://www.google.com/","thisIsOtherText": No such file or directory

Anyway, could you please help me match the URL in the "matchString" variable? It doesn't have to use grep.

Preferred output: http://www.google.com/

2 Answers

5

You need to echo the string through a pipe to grep:

array=($(echo "$matchString" | grep -o 'url={"urlPath":"([^"]+)"'))

Grep reads from a file or standard input. It doesn't accept a string argument to search within.
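
As a small sketch of those two input modes, the snippet below gives grep the same text once on standard input and once as a file. The `-E` flag, the shortened pattern, the temporary file, and the variable names `pattern`, `fromStdin`, `tmpFile` and `fromFile` are my own assumptions for illustration, not part of the question.

```shell
matchString='url={"urlPath":"http://www.google.com/","thisIsOtherText"'
pattern='"urlPath":"[^"]+"'

# 1. Standard input: pipe the string into grep.
fromStdin=$(printf '%s\n' "$matchString" | grep -oE "$pattern")

# 2. File: write the string to a temporary file and pass the file name.
tmpFile=$(mktemp)
printf '%s\n' "$matchString" > "$tmpFile"
fromFile=$(grep -oE "$pattern" "$tmpFile")
rm -f "$tmpFile"

echo "$fromStdin"
echo "$fromFile"
```

Both variables end up holding the same matched text. Passing "$matchString" itself as the last argument would instead make grep look for a file with that name, which is where the error in the question comes from.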

Also, grep -o prints the entire match, not just the part in parentheses. (Without -E, grep also treats the parentheses and the + as literal characters.) You probably need to use sed instead.

array=($(echo "$matchString" | sed 's/url={"urlPath":"\([^"]\+\).*"/\1/'))

The sed command works like this:

  • s/// is the substitute command and its delimiters. You can use a different delimiter if it makes the expression more readable or saves you some escaping. The text between the first two delimiters is the pattern we want to change; the text between the second and third is what we want to change it to.

  • url={"urlPath":" is just the literal text we are using to help make the match

  • \( \) encloses a capture group. What falls between them is what we want to snag.

  • [^"] matches any character that's not a double-quote

  • \+ matches one or more of the preceding pattern (this is a GNU extension to basic regular expressions). So, in this case, that's one or more characters that are not quotation marks.

  • .* matches zero or more of any character. In this case, it starts at the quote after google.com/ and goes to the end of the string.

  • \1 outputs what was captured by the first (and only in this case) capture group.

Visually:

url={"urlPath":"       http://www.google.com/       ","thisIsOtherText"
-----literal----       -------non-quote------       ---any character---
url={"urlPath":"   \(  [^"]                    \)   .*
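
To try the breakdown above, here is a minimal sketch that runs the substitution on the sample string. The first command is the one from this answer and relies on `\+`, which is a GNU sed extension. The second is a variant of my own that assumes your sed supports `-E` (extended regex), so the group and `+` need no backslashes and `\{` is escaped to keep the brace literal. The names `url` and `urlExtended` are illustrative only.

```shell
matchString='url={"urlPath":"http://www.google.com/","thisIsOtherText"'

# Original form: basic regex, \( \) is the capture group, \+ is GNU-specific.
url=$(printf '%s\n' "$matchString" | sed 's/url={"urlPath":"\([^"]\+\).*"/\1/')

# Assumed alternative: extended regex via -E; plain ( ) and + are operators here.
urlExtended=$(printf '%s\n' "$matchString" | sed -E 's/url=\{"urlPath":"([^"]+)".*/\1/')

echo "$url"
echo "$urlExtended"
```

With GNU sed, both lines print http://www.google.com/. If the pattern does not match, sed prints the input line unchanged, so it can be worth checking the result before relying on it.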

3 Comments

Cheers, the sed one works. I'm not sure how my other code works with grep, though I think it might be because it reads from a file.
Also could you please explain how the regex in there all works and the \1?
Thanks! Very detailed. I would give you two ticks if I could :)
0

I am not familiar with grep, but have knowledge of regex.

You may have to add escapes for the " characters:

 array=($(grep -o 'url\=\{\"urlPath\"\:\"([^\"]*)\"' "$matchString"))

1 Comment

user:~# array=($(grep -o 'url\=\{\"urlPath\"\:\"([^\"]*)\"' "$matchString")); echo "$array"
grep: Unmatched \{
user:~# array=($(grep -o 'url\={\"urlPath\"\:\"([^\"]*)\"' "$matchString")); echo "$array"
grep: : No such file or directory
