1

I'm struggling with getting my regex pattern to match. Here are my requirements...

Match a given domain (for example, google.com) over both http and https.

I have an ArrayList of various URLs:

http://stackoverflow.com/questions/ask
https://ask.com/search
http://google.com/images
https://google.com/images

This is my Pattern:

final Pattern p = Pattern.compile( "(http:(?!.*google.com).*)" );

However, it currently returns true for all my URLs.

Again, I only want it to return true if http://www.google.com or https://www.google.com matches my current URL.

3
  • You need to post the code where it returns true for all of the URLs. In particular, your pattern should not match any URL with https either, so something is wrong in how you test whether it matches. Commented Mar 9, 2012 at 20:54
  • 2
    Not quite the answer you wanted, but why not use a proper URL parser instead of a homebrew regex? docs.oracle.com/javase/1.4.2/docs/api/java/net/URL.html Commented Mar 9, 2012 at 20:57
  • I'll look into the URL parser as well, thank you! Commented Mar 9, 2012 at 21:14

6 Answers

2

Use this:

Pattern.compile("^https?://(?!.*\\.google\\.com/)[^/]*");

RegEx Demo

RegEx Details:

  • ^: Start
  • https?: Match http or https
  • ://: Match ://
  • (?!.*\\.google\\.com/): Negative lookahead that fails the match if .google.com/ appears anywhere after the ://
  • [^/]*: Match 0 or more of any non-/ characters
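
To see how the pattern above behaves, here is a small sketch that runs it with Matcher.find() over the URLs from the question plus two www.google.com variants. The class name LookaheadDemo and the helper passes are made up for illustration. Note that the lookahead requires a literal "." before "google" and a "/" after "com", so in this sketch only the www.google.com/... URL is rejected, while a bare google.com/... still passes.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // Pattern copied from the answer above.
    static final Pattern P = Pattern.compile("^https?://(?!.*\\.google\\.com/)[^/]*");

    // Hypothetical helper: true when the pattern finds a match at the start of the URL.
    static boolean passes(String url) {
        return P.matcher(url).find();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://stackoverflow.com/questions/ask", // true
            "https://ask.com/search",                 // true
            "http://google.com/images",               // true: no "." before "google"
            "https://www.google.com/images",          // false: ".google.com/" is present
            "http://www.google.com"                   // true: no trailing "/"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + passes(url));
        }
    }
}
```

If the bare domain should be excluded as well, the lookahead would need adjusting for your data, as the follow-up comments suggest.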

2 Comments

Thank you! That worked. I did have to change it slightly to: final Pattern p = Pattern.compile("^(https?://(?!.*.google.com)[/]*)");
You can make it more useful by describing your regex.
1

How about just .contains("//google.com")? Or if "google.com" is at position seven or eight?
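
A minimal sketch of both ideas, assuming plain string checks are good enough for your input. The class and method names are hypothetical. Both checks are looser than real URL parsing: contains also fires when "//google.com" shows up later in the URL (for example in a query string), and the position check also accepts a host such as agoogle.com behind http://.

```java
class SubstringCheck {
    // True when the URL contains "//google.com" anywhere.
    static boolean containsGoogle(String url) {
        return url.contains("//google.com");
    }

    // True when "google.com" first appears right after "http://" (index 7)
    // or "https://" (index 8).
    static boolean googleAtHostPosition(String url) {
        int i = url.indexOf("google.com");
        return i == 7 || i == 8;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://stackoverflow.com/questions/ask",
            "http://google.com/images",
            "https://google.com/images",
            "http://www.google.com"
        };
        for (String url : urls) {
            System.out.println(url + " -> contains=" + containsGoogle(url)
                    + ", position=" + googleAtHostPosition(url));
        }
    }
}
```

Neither check matches http://www.google.com, so the www prefix would need separate handling.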


1

How about java.net.URI or URL classes...

try {
    URI url = new URI("https://www.google.com/foo?test=horse");
    System.out.println(url.getScheme()); // https
    System.out.println(url.getHost()); // www.google.com
    System.out.println(url.getPath()); // /foo
    System.out.println(url.getQuery()); // test=horse
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Edit: I used URI because I remembered hearing that URL has side effects. I just checked, and it does: its hashCode() method performs DNS lookups. So stick to URI if you just want to reuse the URL-parsing functionality. See this question
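
Building on the parsing approach, here is one possible sketch of the actual domain check. It assumes that "matches google.com" means the host is google.com or any subdomain of it, and that only http and https count; the class GoogleHostCheck and the method isGoogleUrl are names invented for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

class GoogleHostCheck {
    // Hypothetical helper: true for http/https URLs whose host is google.com
    // or a subdomain of google.com.
    static boolean isGoogleUrl(String url) {
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            String host = uri.getHost();
            if (scheme == null || host == null) {
                return false; // relative or otherwise unusable URL
            }
            boolean web = scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https");
            String h = host.toLowerCase(Locale.ROOT);
            return web && (h.equals("google.com") || h.endsWith(".google.com"));
        } catch (URISyntaxException e) {
            return false; // treat unparseable input as "not a match"
        }
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://stackoverflow.com/questions/ask",
            "https://ask.com/search",
            "http://google.com/images",
            "https://www.google.com",
            "http://example.com/?ref=google.com/"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + isGoogleUrl(url));
        }
    }
}
```

Because only the parsed host is compared, a domain name that merely appears in the path or query string does not cause a match.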

1 Comment

Thanks, I'm taking a look at this right now. Think this will work!
0

final Pattern p = Pattern.compile( "(https?:(?!.*google.com).*)" );

1 Comment

This one works too. I had to make a slight change: notice the slash I added at the end. There were some other URLs that matched as true because they had the domain name I'm searching for in the tracking query tag. final Pattern p = Pattern.compile( "(https?:(?!.*google.com/).*)" );
0

I only want it to return true if http://www.google.com or https://www.google.com matches my current url.

Pattern.compile("(?i)^https?://www\\.google\\.com\\z");
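
A quick sketch of what this pattern accepts, with a made-up class name and helper. The \z anchors at the very end of the input, so anything after the host (even a single "/") makes it fail, and (?i) makes the comparison case-insensitive.

```java
import java.util.regex.Pattern;

class ExactHostDemo {
    // Pattern copied from the answer above.
    static final Pattern P = Pattern.compile("(?i)^https?://www\\.google\\.com\\z");

    // Hypothetical helper: true only for exactly http(s)://www.google.com.
    static boolean isExactGoogle(String url) {
        return P.matcher(url).find();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.google.com",    // true
            "HTTPS://WWW.GOOGLE.COM",   // true: case-insensitive
            "http://www.google.com/",   // false: trailing slash
            "http://google.com/images"  // false: no www, and has a path
        };
        for (String url : urls) {
            System.out.println(url + " -> " + isExactGoogle(url));
        }
    }
}
```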


0
    String[] urls = new String[] {
        "http://stackoverflow.com/questions/ask",
        "https://ask.com/search",
        "http://google.com/images",
        "https://google.com/images",
        "http://www.google.com"
    };

    final Pattern p = Pattern.compile( "https?://.*?google\\.com.*?" );

    for (String url : urls) {
        Matcher m = p.matcher(url);
        System.out.println(m.matches());
    }

Output is:

   false
   false
   true
   true
   true

