I want to extract the website name from a link, so I wrote the following function:

protected function getWebsiteName()
{
    $prefixs = ['https://', 'http://', 'www.'];

    foreach($prefixs as $prefix)
    {
        if(strpos($this->website_link, $prefix) !== false)
        {
            $len = strlen($prefix);
            $this->website_name = substr($this->website_link, $len);
            $this->website_name = substr($this->website_name, 0, strpos($this->website_name, '.'));
        }
    }
}

The problem is that when I use a website link that looks like https://www.github.com, the result is s://www, and the function only works when I remove 'www.' from the array.

Any ideas why this is happening, or how I can improve this function?

3 Answers


You could use parse_url(). Try:

print_r(parse_url('https://www.name/'));
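As a rough sketch of how the host returned by parse_url() might be reduced to the site name: the helper name websiteNameFromUrl and the choice to drop only a leading "www." are assumptions made for illustration, not part of this answer, and multi-part suffixes such as .co.uk are not handled.

```php
<?php
// Hypothetical helper: returns the first label of the host, ignoring a leading "www.".
function websiteNameFromUrl(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // No host found, e.g. the URL has no scheme.
    }
    // Strip a leading "www." only, then keep everything before the first dot.
    $host = preg_replace('/^www\./i', '', $host);
    return explode('.', $host)[0];
}

var_dump(websiteNameFromUrl('https://www.github.com'));
```

Note that parse_url() only reports a host when the URL includes a scheme, so a bare "github.com" yields null in this sketch.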

3 Comments

OK, I will try it, but what about the behavior above? Why do I get that result?
@sjagr what do you mean? It's doing exactly what I need it to do!
From the documentation of parse_url(), I still need to parse the hostname (www.example.com) to extract (example) from it.

Let's look at your code. On every pass through the foreach, you apply your logic to the original website_link, so each matching prefix overwrites the result of the previous one. For www., the last prefix in the array, this is what happens after the first two iterations:

  1. $prefix is www.
  2. Therefore, $len = 4 (the length of $prefix)
  3. $this->website_link is still https://www.github.com
  4. You apply substr($this->website_link, 4)
  5. Result is $this->website_name = 's://www.github.com'
  6. You apply substr($this->website_name, 0, 7) (7 being the result of strpos($this->website_name, '.'))
  7. The result is $this->website_name = 's://www'
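The steps above can be reproduced in isolation. This is a minimal sketch, assuming a standalone script in which the local variables $link and $name (illustrative names) stand in for the class properties from the question:

```php
<?php
// Illustrative stand-in for $this->website_link from the question.
$link = 'https://www.github.com';
$name = null;

foreach (['https://', 'http://', 'www.'] as $prefix) {
    // strpos() matches the prefix anywhere in the string, not only at the start.
    if (strpos($link, $prefix) !== false) {
        // Always cuts from the ORIGINAL $link, so earlier results are overwritten.
        $name = substr($link, strlen($prefix));
        $name = substr($name, 0, strpos($name, '.'));
        echo $prefix, ' => ', $name, PHP_EOL;
    }
}
```

The 'https://' pass produces www, 'http://' does not match, and the 'www.' pass then overwrites the result with s://www.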

To fix this, copy $this->website_link into a temporary variable $temp, strip each matching prefix from $temp so the cuts accumulate, and only then take the part before the first dot:

$temp = $this->website_link;
foreach($prefixs as $prefix)
{
    if(strpos($temp, $prefix) !== false)
    {
        $len = strlen($prefix);
        $temp = substr($temp, $len);
    }
}
$this->website_name = substr($temp, 0, strpos($temp, '.'));

I'd suggest @dynamic's answer, but if you want to continue the strategy of string replacement, use str_replace. It accepts arrays for the needle!

$prefixes = ['https://', 'http://', 'www.'];
$this->website_name = str_replace($prefixes, '', $this->website_link);
$this->website_name = substr($this->website_name, 0, strpos($this->website_name, '.'));
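One caveat: str_replace() removes every occurrence of each needle, not only a leading one. If that matters, an anchored regular expression is one possible alternative. A minimal sketch, with $link standing in for $this->website_link and $host, $dot, $name as illustrative names:

```php
<?php
$link = 'https://www.github.com'; // stand-in for $this->website_link

// Remove an optional scheme and an optional "www." only at the start of the string.
$host = preg_replace('#^(?:https?://)?(?:www\.)?#i', '', $link);

// Keep everything before the first dot; fall back to the whole string if there is none.
$dot = strpos($host, '.');
$name = $dot === false ? $host : substr($host, 0, $dot);

echo $name, PHP_EOL;
```

The explicit check on $dot avoids passing false to substr() when the link contains no dot.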

8 Comments

In step 2, how do you get 4? I have $len = strlen('https://'), which is the first element of $prefixs, so $this->website_name should be the substring from that position to the end. Am I getting this wrong?
@OussamaELGOUMRI Because we're dealing with the last iteration. See the edit, I've broken it down further for you.
@OussamaELGOUMRI You misunderstand the fundamental problem. You are applying the logic to the original $this->website_link every time, where you should instead be compounding the substr operations to a temporary variable through each iteration. Therefore the first two iterations of http:// and https:// do not matter whatsoever because the last iteration is the only thing that remains! If you try the modified code it will work. Even better if you use the last code block that I suggested with str_replace, you will be golden.
Yep, exactly, I finally get it: I forgot to update the website link to the new value. Thank you, sjagr.
The test is green now, thank you. str_replace() cleans up my code a little :)

Yes, using parse_url along with preg_match should do the job:

function getWebsiteName($url)
{
  $pieces = parse_url($url);
  $domain = isset($pieces['host']) ? $pieces['host'] : '';
  if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
    return $regs['domain'];
  }
  return false;
}

And this fixes your code by stripping a prefix only when it appears at the start of the string:

function getWebsiteName()
{
    $this->website_name = $this->website_link;
    $prefixs = array('https://', 'http://', 'www.');

    foreach($prefixs as $prefix)
    {
        if (substr($this->website_name, 0, strlen($prefix)) === $prefix) {
            $this->website_name = substr($this->website_name, strlen($prefix));
        }
    }
}
