1

I want to manipulate a URL to get just the base site name.

For example I have the URL:

http://stackoverflow.com/questions/ask,

which should just return stackoverflow.

Also if I have the URL:

http://stackoverflow.com/questions/4988199/rails-3-mechanize-socketerror-getaddrinfo-host-or-name-not-known

it should also be only stackoverflow.

Final example, if I have the URL:

http://www.google.dk/search?q=ruby+string+manipulation

it should be google.

How do I strip away everything but the domain name of a URL?

3 Answers

9

You could use the URI library that ships with the standard Ruby distribution to extract the host:

irb(main):001:0> require "uri"
=> true
irb(main):002:0> a = URI.parse("http://www.google.com")
=> #<URI::HTTP:0x3b3eb78 URL:http://www.google.com>
irb(main):003:0> a.host
=> "www.google.com"
irb(main):004:0>
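
Note that `host` still includes the `www.` prefix and the top-level domain, so one more step is needed to get down to `google`. Here is a minimal sketch, assuming the site name is simply the first label of the host once an optional leading `www.` is dropped. The helper name `site_name` is made up for illustration, and the approach is naive: it would return `blog` for `blog.example.com`.

```ruby
require "uri"

# Hypothetical helper: returns the first host label after an optional "www.".
# Returns nil when the string has no host (e.g. no scheme was given).
def site_name(url)
  host = URI.parse(url).host
  return nil if host.nil?

  # Drop a leading "www." and keep everything before the next dot.
  host.sub(/\Awww\./, "").split(".").first
end

puts site_name("http://stackoverflow.com/questions/ask")
puts site_name("http://www.google.dk/search?q=ruby+string+manipulation")
```

For the URLs in the question this should print `stackoverflow` and `google`.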

2

Probably the easiest solution would be to use the Domainatrix gem to take care of this for you. Once you have it installed, it's just a matter of doing this:

url = Domainatrix.parse("http://www.google.dk/search?q=ruby+string+manipulation")
url.domain # returns 'google'

There are a few more code examples and options on the project's GitHub page.


1

If you don't want to use a separate gem, you can try a simple regex; the third capture group holds the site name:

(https?:\/\/)?(www\.)?([^\.]+).* 
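
As a rough sketch of how that pattern could be applied in Ruby: the `\A` anchor, the constant `SITE_PATTERN`, and the helper `site_from_regex` are additions for illustration, not part of the pattern above. It shares the pattern's limits, so any subdomain other than `www` is returned as the site name.

```ruby
# The pattern from above, anchored to the start of the string with \A.
# Group 1: optional scheme, group 2: optional "www.", group 3: text up to the first dot.
SITE_PATTERN = /\A(https?:\/\/)?(www\.)?([^.]+).*/

# Hypothetical helper: returns capture group 3, or nil when nothing matches.
def site_from_regex(url)
  match = SITE_PATTERN.match(url)
  match && match[3]
end

puts site_from_regex("http://stackoverflow.com/questions/ask")
puts site_from_regex("http://www.google.dk/search?q=ruby+string+manipulation")
```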
