how to extract domain name without sub domain from url

Chem Leakhina · Jun 23, 2009

Hi everyone,

Does anyone know how to extract the domain name, without the subdomain, from a URL?

Example: http://test.domain.com => http://domain.com

Could you give me some example code in Ruby?

Thanks,
Leakhina

Justin Collins · Jun 23, 2009

This is actually quite difficult, because there is a multitude of
possible second-level domains which can be used (such as .co.uk), and
they are not really standardized. Just picking one at random, the
country of Jordan has .com.jo, .net.jo, .gov.jo, .edu.jo, .org.jo,
.mil.jo, .name.jo, and .sch.jo.

If one were to ignore such things, then it becomes easier:

$ irb
irb(main):001:0> require 'uri'
=> true
irb(main):002:0> u = URI.parse "http://test.domain.com/"
=> #<URI::HTTP:0xb7bbf848 URL:http://test.domain.com/>
irb(main):003:0> u.host
=> "test.domain.com"
irb(main):004:0> u.host.split(".")[-2,2]
=> ["domain", "com"]
irb(main):005:0> u.host.split(".")[-2,2].join(".")
=> "domain.com"

However, as mentioned above, there are a lot of domains this will not
work for.
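One way to cover a few of those cases is to check the last two labels against a list of known two-label suffixes before deciding how many labels to keep. The following is only a minimal sketch: the method name base_domain and the constant TWO_LABEL_SUFFIXES are made up for illustration, and the list is hand-picked and deliberately incomplete, so any suffix missing from it is still handled wrongly.

```ruby
require 'uri'

# Hypothetical, deliberately incomplete list of two-label suffixes.
# Anything not listed here falls back to the naive "last two labels" rule.
TWO_LABEL_SUFFIXES = %w[co.uk org.uk com.jo net.jo gov.jo edu.jo org.jo].freeze

# Returns the host without subdomains, or nil if the URL has no host.
def base_domain(url)
  host = URI.parse(url).host
  return nil if host.nil?

  labels = host.downcase.split('.')
  # Keep three labels when the last two form a known suffix, otherwise two.
  keep = TWO_LABEL_SUFFIXES.include?(labels.last(2).join('.')) ? 3 : 2
  labels.last(keep).join('.')
end

puts base_domain("http://test.domain.com/")
puts base_domain("http://www.google.co.uk/")
```

The list has to be maintained by hand, which is the real cost of this approach.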

-Justin

Robert Klemme · Jun 23, 2009

We can get better results by ignoring particular known domain prefixes
such as "ftp" and "www":

# this works with 1.8 and 1.9
%w{
  www.google.com
  google.co.uk
  www.google.co.uk
  foo.bar
}.each do |domain|
  # drop a leading "www." or "ftp.", then take the first remaining label
  dom = domain.sub(/^(?:www|ftp)\./, '')[/^[^.]+/]
  printf "%p -> %p\n", domain, dom
  # alternative: a single regexp with an optional prefix group
  dom = domain[/^(?:(?:ftp|www)\.)?([^.]+)/, 1]
  printf "%p -> %p\n", domain, dom
end
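
If the goal is the form asked for in the original question (domain plus TLD rather than just the first label), the same prefix idea can be applied to the host of a parsed URL while keeping everything after the prefix. This is a rough variation, not a complete solution: the method name strip_known_prefix and the constant KNOWN_PREFIXES are invented for this sketch, and a subdomain that is not in the list (such as "test") is left in place.

```ruby
require 'uri'

# Hypothetical list of host prefixes to drop; extend as needed.
KNOWN_PREFIXES = %w[www ftp].freeze

# Strips one leading known prefix from the URL's host and keeps the rest.
# Returns nil if the URL has no host.
def strip_known_prefix(url)
  host = URI.parse(url).host
  return nil if host.nil?

  # Regexp.union escapes each prefix and joins them with "|".
  host.sub(/\A(?:#{Regexp.union(KNOWN_PREFIXES).source})\./, '')
end

puts strip_known_prefix("http://www.google.co.uk/")
puts strip_known_prefix("http://test.domain.com/")
```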

Kind regards

robert
