I'm sure this has been asked numerous times, but although I've checked all the similar questions, I couldn't come up with a solution.

The problem is that I have input URLs similar to:

  1. http://www.justin.tv/peacefuljay
  2. http://www.justin.tv/peacefuljay#/w/778713616/3
  3. http://de.justin.tv/peacefuljay#/w/778713616/3

I want to match the slug part of each URL (in the examples above, it's peacefuljay).

The regexes I've tried so far are:

 http://.*\.justin\.tv/(?<Slug>.*)(?:#.)?
 http://.*\.justin\.tv/(?<Slug>.*)(?:#.)

But I can't come up with a solution. Each pattern fails either on the first URL or on the others.

Any help is appreciated.

3 Answers

3

The easiest way to parse a URI is to use the Uri class:

string justin = "http://www.justin.tv/peacefuljay#/w/778713616/3";
Uri uri = new Uri(justin);
string s1 = uri.LocalPath; // "/peacefuljay"
string s2 = uri.Segments[1]; // "peacefuljay"

If you insist on a regex, you can try something a bit more specific:

Match mate = Regex.Match(str, @"http://(\w+\.)*justin\.tv(?:/(?<Slug>[^#]*))?");
  • (\w+\.)* - Ensures you match the domain, not anywhere else in the string (e.g., the hash or query string).
  • (?:/(?<Slug>[^#]*))? - Optional group with the string you need. [^#] limits the characters you expect to see in your slug, so it should eliminate the need for the extra group after it.
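
To check how this pattern behaves on the three URLs from the question, here is a minimal sketch. The `SlugDemo` class and the `ExtractSlug` helper are names made up for illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugDemo
{
    // Pattern from the answer above; the named group "Slug" holds the result.
    private static readonly Regex JustinTv = new Regex(
        @"http://(\w+\.)*justin\.tv(?:/(?<Slug>[^#]*))?");

    // Hypothetical helper: returns the slug, or null when the URL does not
    // match or has no path after the host.
    public static string ExtractSlug(string url)
    {
        Match match = JustinTv.Match(url);
        Group slug = match.Groups["Slug"];
        return match.Success && slug.Success ? slug.Value : null;
    }

    public static void Main()
    {
        string[] urls =
        {
            "http://www.justin.tv/peacefuljay",
            "http://www.justin.tv/peacefuljay#/w/778713616/3",
            "http://de.justin.tv/peacefuljay#/w/778713616/3",
        };

        foreach (string url in urls)
        {
            Console.WriteLine(ExtractSlug(url)); // peacefuljay for each URL
        }
    }
}
```

Note that `[^#]*` only stops at the hash, so a query string or a further path segment before the `#` would end up in the group as well.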

3 Comments

Thanks for this, which is indeed one way to solve it, but in my situation I have to implement this with regexes, because I have far more URLs to parse and I can't parse them all with Uri segments.
Actually, the more you have, the more complex the regex will be. Unless you're doing URL rewriting, which is sometimes confined to regex, this should be the better option. This will also handle tricky URLs, like http://www.justin.tv/warandhate?source=justin.tv/peacefuljay , which currently fail on your regex. Either way, I've added a regex alternative.
Thanks for the regex method. Actually, the URLs I have are one for livestream, one for ustream, and so on, so each will have its own regex to process it.
2

As I see it, there's no reason to deal with the parts after the "slug".

Therefore you only need to match all characters after the host that aren't "/" or "#".

http://.*\.justin\.tv/(?<Slug>[^/#]+)
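
In C#, this pattern could be used roughly as follows. This is only a sketch: `HostSlug` and `TryGetSlug` are hypothetical names, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HostSlug
{
    // [^/#]+ stops at the first "/" or "#", so the fragment never reaches the group.
    private static readonly Regex Pattern = new Regex(
        @"http://.*\.justin\.tv/(?<Slug>[^/#]+)");

    // Hypothetical helper: returns true and sets slug on a match;
    // returns false and sets slug to null otherwise.
    public static bool TryGetSlug(string url, out string slug)
    {
        Match match = Pattern.Match(url);
        slug = match.Success ? match.Groups["Slug"].Value : null;
        return match.Success;
    }
}
```

Because the group uses `+`, a URL with nothing after the host's slash (such as http://www.justin.tv/) does not match at all, and `TryGetSlug` returns false.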

0
http://.*\.justin\.tv/(?<Slug>[^#]*)

or

http://.*\.justin\.tv/(?<Slug>.*?)(#|$)
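
Whether the quantifier inside the Slug group is greedy or lazy matters here. The following sketch compares `.*` with `.*?` on a URL that has a fragment; `GreedyVsLazy` and its `Slug` helper are hypothetical names used only for this comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: value of the "Slug" group, or null if there is no match.
    public static string Slug(string pattern, string url)
    {
        Match match = Regex.Match(url, pattern);
        return match.Success ? match.Groups["Slug"].Value : null;
    }

    public static void Main()
    {
        string url = "http://www.justin.tv/peacefuljay#/w/778713616/3";

        // Greedy: .* runs to the end of the string, and $ then satisfies (#|$).
        Console.WriteLine(Slug(@"http://.*\.justin\.tv/(?<Slug>.*)(#|$)", url));
        // peacefuljay#/w/778713616/3

        // Lazy: .*? grows one character at a time and stops at the first "#".
        Console.WriteLine(Slug(@"http://.*\.justin\.tv/(?<Slug>.*?)(#|$)", url));
        // peacefuljay
    }
}
```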
