Top-level domains (TLDs) and second-level domains (SLDs) may be 2 characters long, but a registered subdomain must be at least 3 characters long.
EDIT: Because of pjv's comment, I learned that Australian domain names are an exception: they use 5 longer labels as SLDs (com, net, org, asn, id), for example somedomain.com.au. I'm guessing com.au is a nationally controlled domain name that is shared among registrants. So, technically, "com.au" would still be the "base domain", but that's not useful.
EDIT: There are 47,952 possible three-character domain names (pattern: [a-z0-9][a-z0-9-][a-z0-9], case-insensitive, or 36 * 37 * 36). Combined with just 8 of the most common TLDs (com, org, etc.), that gives 383,616 possibilities, without even adding in the entire scope of TLDs. 1-letter and 2-letter domain names still exist, but are not valid going forward.
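As a quick check of the arithmetic above, here is a minimal sketch. The variable names are only illustrative, and the figure of 8 TLDs is the same assumption made in the text.

```php
<?php
// First and last characters: a letter or a digit (26 + 10 = 36 choices each).
$edge = 26 + 10;
// Middle character: a letter, a digit, or a hyphen (37 choices).
$middle = $edge + 1;

$names = $edge * $middle * $edge; // 36 * 37 * 36 = 47952
$tlds  = 8;                       // "8 of the most common TLDs", as assumed in the text
$total = $names * $tlds;          // 47952 * 8 = 383616

echo $names, "\n", $total, "\n";
```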
In google.com -- "google" is a subdomain of "com".
In google.co.uk -- "google" is a subdomain of "co", which in turn is a subdomain of "uk". Here "co" is really a second-level domain, even though "co" is also a valid top-level domain on its own.
In www.google.com -- "www" is a subdomain of "google", which is a subdomain of "com".
"co.uk" is NOT a valid host, because it contains no registered domain name.
Going with that assumption, this function will return the proper "basedomain" in almost all cases, without requiring a "url map".
If you happen to be one of the rare cases, perhaps you can modify this to fulfill your particular needs...
EDIT: You must pass the domain string as a URL with its protocol (http://, ftp://, etc.), or parse_url() will not return a host for it (unless you want to modify the code to behave differently).
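    function basedomain( $str = '' )
    {
        // $str must be passed WITH protocol. ex: http://domain.com
        $url = @parse_url( $str );
        if ( empty( $url['host'] ) ) return;
        $parts = explode( '.', $url['host'] );
        // reset() takes its argument by reference, so it needs a real variable.
        $second = array_slice( $parts, -2, 1 );
        // A 2-character second-to-last label (e.g. "co" in co.uk) is treated as an SLD: keep 3 labels.
        $slice = ( strlen( reset( $second ) ) == 2 ) && ( count( $parts ) > 2 ) ? 3 : 2;
        return implode( '.', array_slice( $parts, ( 0 - $slice ), $slice ) );
    }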
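To illustrate the protocol requirement, here is a small sketch of what parse_url() does with and without a scheme. The helper with_scheme() and its default of "http" are hypothetical and not part of the original function; treat it as one possible way to normalize input before calling basedomain().

```php
<?php
// Hypothetical helper: prepend a default scheme when the string has none.
function with_scheme( $str, $default = 'http' )
{
    // A scheme looks like "name://" at the very start of the string.
    if ( preg_match( '#^[a-z][a-z0-9+.-]*://#i', $str ) ) return $str;
    return $default . '://' . $str;
}

$bare = parse_url( 'www.google.com' );                // no scheme: parsed as a path only
$full = parse_url( with_scheme( 'www.google.com' ) ); // scheme added: host is populated

var_dump( isset( $bare['host'] ) ); // bool(false)
var_dump( $full['host'] );          // string(14) "www.google.com"
```

With a helper like this, a bare value such as $_SERVER['SERVER_NAME'] could be wrapped before being passed in.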
If you need to be accurate, use fopen or curl to open this URL:
http://data.iana.org/TLD/tlds-alpha-by-domain.txt
Then read the lines into an array and use that to compare the domain parts.
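A minimal sketch of that approach follows, assuming the file keeps its current layout: a "#" comment line followed by one uppercase TLD per line. The function names parse_tld_list() and has_known_tld() are made up for this example, and the sample data is inline so that it runs without network access.

```php
<?php
// Turn the raw text of the list into a lookup table keyed by lowercase TLD.
function parse_tld_list( $text )
{
    $tlds = array();
    foreach ( preg_split( '/\r\n|\r|\n/', $text ) as $line ) {
        $line = strtolower( trim( $line ) );
        // Skip blank lines and the "# ..." comment header.
        if ( $line === '' || $line[0] === '#' ) continue;
        $tlds[ $line ] = true;
    }
    return $tlds;
}

// True when the last label of $host appears in the parsed list.
function has_known_tld( $host, $tlds )
{
    $parts = explode( '.', strtolower( $host ) );
    return isset( $tlds[ end( $parts ) ] );
}

// Small inline sample in the same layout as the real file.
$sample = "# sample data, not the real list\nCOM\nORG\nUK\nAU\n";
$tlds = parse_tld_list( $sample );

var_dump( has_known_tld( 'www.google.co.uk', $tlds ) ); // bool(true)
var_dump( has_known_tld( 'localhost', $tlds ) );        // bool(false)

// To use the live list instead (needs network access and allow_url_fopen):
// $tlds = parse_tld_list( file_get_contents( 'http://data.iana.org/TLD/tlds-alpha-by-domain.txt' ) );
```

Note that this list covers only top-level labels, so on its own it cannot tell you that "co.uk" or "com.au" act as shared suffixes.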
EDIT: To allow for Australian domains:
    function au_basedomain( $str = '' )
    {
        // $str must be passed WITH protocol. ex: http://domain.com
        $url = @parse_url( $str );
        if ( empty( $url['host'] ) ) return;
        $parts = explode( '.', $url['host'] );
        // reset() takes its argument by reference, so it needs a real variable.
        $second = array_slice( $parts, -2, 1 );
        $slice = ( strlen( reset( $second ) ) == 2 ) && ( count( $parts ) > 2 ) ? 3 : 2;
        // Australian SLDs are longer than 2 characters, so match them explicitly.
        if ( preg_match( '/\.(com|net|asn|org|id)\.au$/i', $url['host'] ) ) $slice = 3;
        return implode( '.', array_slice( $parts, ( 0 - $slice ), $slice ) );
    }
IMPORTANT ADDITIONAL NOTES: I don't use this function to validate domains. It is generic code that I only use to extract the base domain of the server it is running on, from the global $_SERVER['SERVER_NAME'], for use within various internal scripts. Since I have only ever worked on sites within the US, I had never encountered the Australian variants that pjv asked about. It is handy for internal use, but it is a long way from a complete domain validation process. If you are trying to use it that way, I recommend against it, because there are too many ways for it to match invalid domains.
For google.com, you're interested in the TLD and second-level domain name. For google.co.uk, you want the TLD and the second- and third-level domain names. There's no defined "base name"; what you mean by "base name" is different for different registrars/TLDs.