
I am trying to write a JavaScript function that extracts only the domain and top-level domain from a URL string.

The existing questions on Stack Overflow do not cover input that is not a full URL (for example, a string without a protocol).

Examples:

  1. https://www.google.com/imgres?imgurl -> google.com
  2. yahoo.com/mail -> yahoo.com
  3. http://helloworld.net/index/test/help -> helloworld.net
  4. www.stackoverflow.com/ -> stackoverflow.com
  5. invalid.url -> "return false or an empty string"

Any/All help is welcome and appreciated. Thank you.

  • Unless you want this for learning purposes, there is already something that does this: URL. It does need the protocol part for the input to be valid, and you will have to strip the subdomain parts (www) yourself to get the output you seek. Commented Jan 15, 2018 at 20:10
  • You can use window.location as the argument to the URL constructor. Commented Jan 15, 2018 at 20:13
  • Possible duplicate of How to get top level domain (base domain) from the url in javascript Commented Jan 15, 2018 at 20:15

3 Answers

1

A regex can suit your needs, for example:

^(?:https?://)?(?:[^/]+\.)?([^./]+\.[^./]+).*$

See on Debuggex

In JS:

function extractDomain(url) {
  return url.replace(/^(?:https?:\/\/)?(?:[^\/]+\.)?([^.\/]+\.[^.\/]+).*$/, "$1");
}
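
One thing to be aware of: replace returns the input unchanged when the pattern does not match, so a string with no dot in it (such as localhost) comes back as-is. Below is a minimal sketch of a variant that returns an empty string instead, assuming that is the behaviour wanted for non-matching input. The name extractDomainOrEmpty is made up for illustration.

```javascript
// Same pattern as above; only the handling of a failed match differs.
const DOMAIN_RE = /^(?:https?:\/\/)?(?:[^\/]+\.)?([^.\/]+\.[^.\/]+).*$/;

// Hypothetical helper: returns the captured "domain.tld" or "" when nothing matches.
function extractDomainOrEmpty(url) {
  const match = DOMAIN_RE.exec(url);
  return match ? match[1] : "";
}

console.log(extractDomainOrEmpty("https://www.google.com/imgres?imgurl")); // google.com
console.log(extractDomainOrEmpty("yahoo.com/mail"));                       // yahoo.com
console.log(extractDomainOrEmpty("localhost") === "");                     // true
```

A string like invalid.url still matches, because the pattern only checks the shape of the text, not whether the TLD exists.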

If you want to handle TLDs that contain dots (like .co.uk), then I'm afraid the only solution is to hardcode them, for example:

^(?:https?://)?(?:[^/]+\.)?([^./]+\.(?:co\.uk|com|de|es|fr)).*$

See on Debuggex
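
If the list of suffixes grows, it may be easier to build the pattern from an array than to edit the regex by hand. This is only a sketch under that assumption: KNOWN_SUFFIXES, escapeRegExp, buildDomainRegex and extractKnownDomain are illustrative names, and the list is deliberately incomplete.

```javascript
// Hypothetical, incomplete list of suffixes to recognise.
const KNOWN_SUFFIXES = ["co.uk", "com", "de", "es", "fr"];

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build the same pattern as above, with the suffix alternation generated from the list.
function buildDomainRegex(suffixes) {
  const alternation = suffixes.map(escapeRegExp).join("|");
  return new RegExp("^(?:https?://)?(?:[^/]+\\.)?([^./]+\\.(?:" + alternation + ")).*$");
}

const knownDomainRe = buildDomainRegex(KNOWN_SUFFIXES);

// Returns "domain.suffix" for a listed suffix, or "" otherwise.
function extractKnownDomain(url) {
  const match = knownDomainRe.exec(url);
  return match ? match[1] : "";
}

console.log(extractKnownDomain("https://www.bbc.co.uk/news")); // bbc.co.uk
console.log(extractKnownDomain("www.example.de/path"));        // example.de
```

Because the pattern still ends in .*, a listed suffix also matches as a prefix of a longer TLD: example.community yields example.com. Anchoring the end of the host would be needed to rule that out.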


0

This is quite simple.

The window.location.href property holds the URL of the current page.

From that string you can derive the top domain.

var url = window.location.href;
var split1 = url.split('//')[1];    // Everything after the protocol.
var split2 = split1.split('/')[0];  // The host part, up to the first slash.
// Strip a leading "www." to get the top domain.
var domain = split2;
if (split2.substring(0, 4) === 'www.')
    domain = split2.slice(4);

The variable domain then holds the value you want.
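
The same steps can be wrapped in a function so that they also work on arbitrary strings, including ones without a protocol such as yahoo.com/mail. This is a rough sketch rather than a drop-in replacement: the name stripToHost is hypothetical, and it only removes a leading www., so other subdomains are kept.

```javascript
// Hypothetical helper following the split-based steps above.
function stripToHost(url) {
  // Drop a leading "scheme://" if present; input without a protocol is left alone.
  const afterProtocol = url.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  // Keep everything up to the first slash.
  const host = afterProtocol.split("/")[0];
  // Remove a leading "www." only; other subdomains are not touched.
  return host.startsWith("www.") ? host.slice(4) : host;
}

console.log(stripToHost("https://www.google.com/imgres?imgurl")); // google.com
console.log(stripToHost("yahoo.com/mail"));                       // yahoo.com
console.log(stripToHost("https://accounts.google.com"));          // accounts.google.com
```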

3 Comments

This will fail for https://accounts.google.com or even https://foo.bar.foo.bar.
In that case, I think the top domain would be "accounts.google.com", and the code handles it that way.
Sorry, I made a mistake in the code. The split2 variable must be computed from split1, but I had used url (my mistake).

-2

function getBaseHostname(urlStr: string): false | string {
  let url;
  try {
    url = new URL(urlStr);
  } catch (e) {
    return false;
  }
  return url.hostname.split('.').splice(-2).join('.');
}

The other answers overcomplicate it.
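
As noted in the comments on the question, URL needs the protocol part, so input such as yahoo.com/mail makes the function above return false. A possible workaround, sketched here in plain JavaScript, is to prepend a protocol when none is present. The name getBaseHostnameLenient and the choice of http:// as the default are assumptions for illustration.

```javascript
// Hypothetical variant: accepts input with or without a protocol.
function getBaseHostnameLenient(urlStr) {
  // Assume http:// when the string does not start with "scheme://".
  const hasProtocol = /^[a-z][a-z0-9+.-]*:\/\//i.test(urlStr);
  const candidate = hasProtocol ? urlStr : "http://" + urlStr;
  let url;
  try {
    url = new URL(candidate);
  } catch (e) {
    return false; // Not parseable as a URL even with a protocol added.
  }
  // Keep the last two labels of the hostname.
  return url.hostname.split(".").slice(-2).join(".");
}

console.log(getBaseHostnameLenient("yahoo.com/mail"));         // yahoo.com
console.log(getBaseHostnameLenient("www.stackoverflow.com/")); // stackoverflow.com
console.log(getBaseHostnameLenient(""));                       // false
```

Taking the last two labels still gives co.uk for a host like www.bbc.co.uk, so multi-part suffixes need separate handling.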
