I need to validate a URL like those used by web servers, for example http://localhost:8080/xyz

How can I do that with a regex? Sorry, I'm new to regex.

  • What do you have so far? Commented Jun 12, 2011 at 10:29
  • What do you want to verify URLs using a regular expression with? Commented Jun 12, 2011 at 10:38
  • How do you expect a regular expression to validate a URL? Wouldn't you be better trying to access it to see if you get a 2xx? Commented Jun 12, 2011 at 11:00

1 Answer

The relevant specs can be found in RFC 3986, which includes syntax definitions (given in ABNF) for all possible URL components. However, for your purposes these will probably be too general. A somewhat condensed expression matching only URLs under the http(s) protocol would be

http[s]?://(([[:alpha:][:digit:]-._~!$&'\(\)*+,;=]|%([0-9A-F]{2}))+|([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]))(:[0-9]+)?(/([[:alpha:][:digit:]-._~!$&'\(\)*+,;=]|%([0-9A-F]{2}))*)+(\?([[:alpha:][:digit:]-._~!$&'\(\)*+,;=/?]|%([0-9A-F]{2}))+)?(#([[:alpha:][:digit:]-._~!$&'\(\)*+,;=/?]|%([0-9A-F]{2}))+)?

which can be simplified to

http[s]?://(([^/:\.[:space:]]+(\.[^/:\.[:space:]]+)*)|([0-9]{1,3}(\.[0-9]{1,3}){3}))(:[0-9]+)?((/[^?#[:space:]]+)(\?[^#[:space:]]+)?(\#.+)?)?

if you can be confident that the URL components are already syntactically valid.
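As a rough illustration, the simplified pattern could be tried out in Python roughly as follows. This is only a sketch under some assumptions: Python's `re` module has no POSIX classes, so `[:space:]` is replaced by `\s`; the separate numeric-host alternative is left out because the dot-separated host labels already cover dotted IPv4 addresses; and the names `SIMPLE_HTTP_URL` and `looks_like_http_url` are made up for this example.

```python
import re

# Hypothetical Python translation of the simplified pattern above.
SIMPLE_HTTP_URL = re.compile(
    r"https?://"
    r"[^/:.\s]+(?:\.[^/:.\s]+)*"             # host: dot-separated labels
    r"(?::[0-9]+)?"                          # optional port
    r"(?:/[^?#\s]+(?:\?[^#\s]+)?(?:#.+)?)?"  # optional path, query, fragment
)


def looks_like_http_url(text: str) -> bool:
    """Return True if the whole string matches the simplified pattern."""
    return SIMPLE_HTTP_URL.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("http://localhost:8080/xyz", "ftp://example.com"):
        print(candidate, looks_like_http_url(candidate))
```

Using `fullmatch` means the entire string has to be a URL; `search` would instead find a URL embedded in longer text.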

Note that you might want more restrictive patterns, e.g. for full-text search, or to allow only IANA-registered top-level domains.
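One way such a top-level-domain restriction might look in Python is sketched below. The set `ALLOWED_TLDS` is a deliberately tiny, hypothetical stand-in; the real list published by IANA is far longer and would have to be loaded from there. The names `HOST_RE` and `has_allowed_tld` are likewise invented for this example.

```python
import re

# Hypothetical, deliberately tiny allow-list (NOT the real IANA list).
ALLOWED_TLDS = {"com", "org", "net"}

# Capture everything between "://" and the first "/", ":" or whitespace.
HOST_RE = re.compile(r"https?://([^/:\s]+)")


def has_allowed_tld(url: str) -> bool:
    """Return True if the URL's host ends in an allow-listed TLD."""
    match = HOST_RE.match(url)
    if match is None:
        return False
    labels = match.group(1).lower().split(".")
    # A single-label host such as "localhost" has no TLD at all.
    return len(labels) > 1 and labels[-1] in ALLOWED_TLDS
```

Note that a check like this rejects http://localhost:8080/xyz from the question, so it only makes sense when public host names are expected.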

Hope it helps,

Best regards, Carsten

1 Comment

This regex doesn't match if there is a trailing slash. How would you extend it to match even with trailing slashes?
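A possible way to handle this, sketched in Python: in the simplified pattern the path part requires at least one character after the slash, so changing that `+` to `*` lets a bare trailing slash through. This assumes the comment refers to a URL such as http://localhost:8080/ being matched as a whole string. Python's `re` has no POSIX classes, so `\s` stands in for `[:space:]`, and the names `TRAILING_SLASH_OK` and `matches_with_trailing_slash` are made up for the example.

```python
import re

# Hypothetical variant of the simplified pattern: the path is "/" followed
# by zero or more characters, so "http://host/" is accepted as well.
TRAILING_SLASH_OK = re.compile(
    r"https?://"
    r"[^/:.\s]+(?:\.[^/:.\s]+)*"             # host: dot-separated labels
    r"(?::[0-9]+)?"                          # optional port
    r"(?:/[^?#\s]*(?:\?[^#\s]+)?(?:#.+)?)?"  # path may now be just "/"
)


def matches_with_trailing_slash(url: str) -> bool:
    """Return True if the whole string matches the relaxed pattern."""
    return TRAILING_SLASH_OK.fullmatch(url) is not None
```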
