0

I need a regex that finds website names which are not preceded by http:// or https://, e.g.:

http://www.google.co.in   --- don't match
https://www.google.co.in  --- don't match
www.google.co.in          --- match

The URL can also be part of a larger string, like

<p><a href="https://www.w3schools.com/html/">www.w3schools.com</a></p>

or

The URL To be Matched is www.w3schools.com and www.abc.com , URL Not to be matched is https://www.w3schools.com/html/

Here www.w3schools.com (and, in the second example, www.abc.com) should be matched. A string may contain multiple URLs.

Thanks in advance.

2
  • Don't parse HTML with regex stackoverflow.com/questions/1732348/… Commented Apr 12, 2018 at 6:59
  • The string is not HTML; I just gave that as an example. I've updated the question accordingly, thanks. Commented Apr 12, 2018 at 7:01

3 Answers

1

Is this what you need?

/(?<!https:\/\/)(?<!http:\/\/)(www\.[\w-.]*?[\w-]+?(\/[\w-]*?)*?)((?=[^\w.\/-]+?)|$)+/ig

You can have a look here:

https://regex101.com/r/XvmR4V/4

If you have a large string that contains website names, this regex matches all names that are not preceded by "http://" or "https://". Note that the website names must always start with "www".
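
A minimal sketch of the lookbehind idea in JavaScript, assuming an engine that supports lookbehind assertions (recent Node.js versions and most current browsers do). The pattern is a simplified variant, not the exact regex above, and `findBareHosts` is a hypothetical helper name:

```javascript
// Simplified variant: "www." followed by dot-separated labels,
// not directly preceded by http:// or https://.
const bareHost = /(?<!https?:\/\/)\bwww\.[\w-]+(?:\.[\w-]+)+/gi;

// Hypothetical helper: returns every bare host name found in the text.
function findBareHosts(text) {
    return Array.from(text.matchAll(bareHost), (m) => m[0]);
}

const sample = "The URL To be Matched is www.w3schools.com and www.abc.com , " +
    "URL Not to be matched is https://www.w3schools.com/html/";
console.log(findBareHosts(sample)); // [ 'www.w3schools.com', 'www.abc.com' ]
```

Because every label needs at least one word character, a string such as www..google.com is not matched by this variant.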

Without lookaheads and lookbehinds, you can try the following. The match is then in the second capture group ($2).

/([^\/]{2,2})(www\.[\w-.]*?[\w-]+?(\/[\w-]*?)*?)(([^\w.\/-]+?)|$)+/ig

https://regex101.com/r/XvmR4V/5

Now it also matches when the string is just www.google.de:

([^\/]{2,2}|^)(www\.[\w-.]*?[\w-]+?(\/[\w-]*?)*?)(([^\w.\/-]+?)|$)+

https://regex101.com/r/XvmR4V/6
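
Another lookbehind-free option is to capture the scheme as an optional group and discard the matches in which it is present. This is a different technique from the patterns above, sketched here with a simplified host pattern; `findBareHostsNoLookbehind` is a hypothetical name:

```javascript
// Group 1 captures an optional http:// or https:// prefix.
const hostWithOptionalScheme = /(https?:\/\/)?\bwww\.[\w-]+(?:\.[\w-]+)+/gi;

function findBareHostsNoLookbehind(text) {
    const hosts = [];
    let m;
    hostWithOptionalScheme.lastIndex = 0; // reset the state of the global regex
    while ((m = hostWithOptionalScheme.exec(text)) !== null) {
        if (m[1] === undefined) { // no scheme in front: keep the match
            hosts.push(m[0]);
        }
    }
    return hosts;
}

console.log(findBareHostsNoLookbehind(
    '<p><a href="https://www.w3schools.com/html/">www.w3schools.com</a></p>'
)); // [ 'www.w3schools.com' ]
```

The scheme-prefixed URL is still consumed by the regex, which is what keeps its "www" part from being reported as a separate match.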

The pattern can also be used for replacement. Here the 'www...' part is replaced with 'Test', while groups 1 and 4 are kept:

/([^\/]{2,2}|^)(www\.[\w-.]*?[\w-]+?(\/[\w-]*?)*?)(([^\w.\/-]+?)|$)+/$1Test$4/gi

I tested it with the regex tool in IntelliJ.

My input was:

<p><a href="https://www.w3schools.com/html/"><a href="http://www.w3schools.com/html/">www.w3schools.com</a></p>
<p><a href="https://www.google.com/html/"><a href="http://www.google.com/html/">www.google.com</a>

The output was:

<p><a href="https://www.w3schools.com/html/"><a href="http://www.w3schools.com/html/">Test</a></p>
<p><a href="https://www.google.com/html/"><a href="http://www.google.com/html/">Test</a>
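
In JavaScript, a comparable replacement can be sketched with `String.prototype.replace` and a callback. This uses a simplified pattern with an optional scheme group rather than the regex above, so treat it as an illustration under that assumption; `replaceBareHosts` is a hypothetical name, and only the first input line is used:

```javascript
function replaceBareHosts(text, replacement) {
    // Group 1 is the optional scheme; when it is present, the match is left unchanged.
    const pattern = /(https?:\/\/)?\bwww\.[\w-]+(?:\.[\w-]+)+/gi;
    return text.replace(pattern, (match, scheme) => (scheme ? match : replacement));
}

const input = '<p><a href="https://www.w3schools.com/html/">' +
    '<a href="http://www.w3schools.com/html/">www.w3schools.com</a></p>';
console.log(replaceBareHosts(input, "Test"));
// <p><a href="https://www.w3schools.com/html/"><a href="http://www.w3schools.com/html/">Test</a></p>
```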

If it helps, it would be great if you upvoted it :-)


7 Comments

The regex in the above link does not work in JavaScript; when I run it I get this error: ERROR SyntaxError: Invalid regular expression: /(?^!https://)(?<!http://)(www.[w-.]*?[w-]+?(/[w-]*?)*?)((?=[^w./-]+?)|$)+/gi/: Invalid group at new RegExp (<anonymous>)
@biff: I added another possibility to try.
Hi, it was helpful, but if the string is just "www.google.com" the regex will not catch it.
It also catches strings like "www..google.com", which is wrong. Can you help?
@biff: adapted it :-)
0

If you just want to exclude strings beginning with http:// or https://, this is easy enough to do with a negative lookahead:

var match = "www.google.co.in";
var nomatch = "http://www.google.co.in";

// The negative lookahead (?!https?:\/\/) rejects strings that start with http:// or https://
var re = /^(?!https?:\/\/).*$/;
if (re.test(match)) {
    console.log(match + " is valid");      // printed
}
if (re.test(nomatch)) {
    console.log(nomatch + " is valid");    // not printed
}

One advantage of this type of pattern is that the positively matched URLs can also be filtered on further conditions within the same expression.
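
As a sketch of such a further condition, the pattern below additionally requires that the string contains no whitespace and at least one dot. The extra condition and the name `looksLikeBareUrl` are assumptions for illustration, and the pattern tests a whole string rather than searching inside a longer one:

```javascript
// Negative lookahead for the scheme, then: no whitespace, at least one dot.
const bareUrl = /^(?!https?:\/\/)\S+\.\S+$/i;

function looksLikeBareUrl(s) {
    return bareUrl.test(s);
}

console.log(looksLikeBareUrl("www.google.co.in"));        // true
console.log(looksLikeBareUrl("http://www.google.co.in")); // false
console.log(looksLikeBareUrl("not a url"));               // false
```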


0

You can use the regular expression ^(http|https):// to match strings that start with http:// or https://. Then negate the result with the not (!) operator, so that strings starting with http:// or https:// are excluded:

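// Matches strings that start with http:// or https:// (case-insensitive)
var regEx = new RegExp("^(http|https)://", "i");

var str = "http://www.google.co.in";
var match = !regEx.test(str);          // negate: true means "no scheme in front"
console.log(match + " for " + str);    // false for http://www.google.co.in

str = "https://www.google.co.in";
match = !regEx.test(str);
console.log(match + " for " + str);    // false for https://www.google.co.in

str = "www.google.co.in";
match = !regEx.test(str);
console.log(match + " for " + str);    // true for www.google.co.in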
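
The same negated test can be applied to a list of candidate strings with `Array.prototype.filter`. A small sketch; the `candidates` array is made-up sample data:

```javascript
const startsWithHttp = /^https?:\/\//i;

const candidates = [
    "http://www.google.co.in",
    "https://www.google.co.in",
    "www.google.co.in",
];

// Keep only the strings that do NOT start with http:// or https://
const withoutScheme = candidates.filter((s) => !startsWithHttp.test(s));
console.log(withoutScheme); // [ 'www.google.co.in' ]
```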
