I am looking to extract a string from a URL. Here is an example to illustrate what I am looking for.

Input URL: http://www.nba.com/bulls/stats/
Output: bulls

In other words, I want to be able to extract the string between the second last and last "/" in the URL. I know that I can split by "/" and extract the second last term, but I am looking for a cleaner regex solution.

Any ideas?

  • I don't grok regex, and your string-splitting option seemed pretty straightforward to me: head(tail(unlist(strsplit(URL, "/")), 2), 1) Commented Mar 28, 2011 at 2:04
  • The string between the second last and the last / is stats, not bulls. Did you mean you want the string between the third last and second last /? Commented Mar 28, 2011 at 3:35
  • Yes, I meant the one between the third last and second last /. Commented Mar 31, 2011 at 14:05

3 Answers

Try this:

http://[^/]+/([^/]+)/[^/]+/?
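
As a rough sketch of how this pattern behaves, here it is in Python. The choice of Python and the helper name first_path_segment are my assumptions; the question does not name a language. Note that the pattern captures the first path segment after the host, so it only equals the "third last to second last" segment when the path has exactly two segments, as in the example URL.

```python
import re

# Pattern from the answer above: scheme, host, captured first path segment,
# then one more segment and an optional trailing slash.
PATTERN = re.compile(r"http://[^/]+/([^/]+)/[^/]+/?")

def first_path_segment(url):
    """Return the captured segment, or None if the URL does not match."""
    m = PATTERN.match(url)
    return m.group(1) if m else None

print(first_path_segment("http://www.nba.com/bulls/stats/"))  # bulls
```

Because the scheme is written as a literal http://, an https URL would not match this pattern as written.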

If you must do it by regex, you could simply do this (assuming JavaScript-style regex syntax):

/\/([^\/]*)\/[^\/]*\/$/

For the sake of making it easier to understand, the .NET version would be this:

@"/([^/]*)/[^/]*/$"

However, I think the idea of splitting on / is really the right way to do this.
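
To compare the two approaches side by side, here is a minimal sketch in Python (an assumption on my part, since the question does not name a language). The function names by_regex and by_split are hypothetical. The regex is the same end-anchored pattern as above; the split version strips the trailing slash first so that the last list item is not empty.

```python
import re

# End-anchored pattern: capture the segment that precedes the final "segment/".
TAIL = re.compile(r"/([^/]*)/[^/]*/$")

def by_regex(url):
    """Regex approach: requires the URL to end with a slash."""
    m = TAIL.search(url)
    return m.group(1) if m else None

def by_split(url):
    """Split approach: drop any trailing slash, then take the second last part."""
    parts = url.rstrip("/").split("/")
    return parts[-2] if len(parts) >= 2 else None

print(by_regex("http://www.nba.com/bulls/stats/"))  # bulls
print(by_split("http://www.nba.com/bulls/stats/"))  # bulls
```

One difference worth noting: the regex version returns nothing when the trailing slash is missing, while the split version in this sketch tolerates both forms.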

The following regex can do the job:

http[s]?://[\w\.]+/(\w+)/.*

3 Comments

I probably meant http[s]? instead of http[s]*
This won't match if there are any special characters in the string, such as - or %. I'm not sure if that's relevant or not.
https? is simpler, as Czechnology mentioned above; you don't need to put [] around the s.
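
The points raised in these comments can be illustrated with a short Python sketch (Python is my assumption; the thread does not fix a language). ORIGINAL is the pattern from the answer. RELAXED is a hypothetical variant, not something the answerer posted, that uses https? and swaps \w+ for [^/]+ so that characters such as - or % are accepted. The trail-blazers URL is made up purely to show the difference.

```python
import re

# Pattern exactly as posted in the answer.
ORIGINAL = re.compile(r"http[s]?://[\w\.]+/(\w+)/.*")

# Hypothetical variant: "https?" without brackets, and "[^/]+" instead of
# "\w+" so that segments containing "-" or "%" still match.
RELAXED = re.compile(r"https?://[^/]+/([^/]+)/.*")

def capture(pattern, url):
    """Return the first captured group, or None if the URL does not match."""
    m = pattern.match(url)
    return m.group(1) if m else None

print(capture(ORIGINAL, "http://www.nba.com/bulls/stats/"))          # bulls
print(capture(ORIGINAL, "http://www.nba.com/trail-blazers/stats/"))  # None
print(capture(RELAXED, "http://www.nba.com/trail-blazers/stats/"))   # trail-blazers
```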
