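Python's urlparse module (urllib.parse in Python 3) can parse a URL into its components, but it has one shortcoming: it cannot extract the top-level domain from the host name. For more involved extraction you can use the tld module, which is installed with pip: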
pip install tld
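To make the limitation concrete, here is a minimal sketch that uses only the standard library. The helper naive_fld and the URL http://www.example.co.uk are hypothetical, introduced only for illustration. urlparse returns the whole host name, and guessing the registered domain by keeping the last two labels breaks as soon as the suffix has more than one label.

```python
from urllib.parse import urlparse

# urlparse hands back the whole host name; it does not know where the
# registered domain ends and the suffix begins.
host = urlparse("http://www.python66.com").hostname
print(host)  # www.python66.com


def naive_fld(url):
    """Guess the registered domain by keeping the last two labels (illustrative only)."""
    labels = urlparse(url).hostname.split(".")
    return ".".join(labels[-2:])


print(naive_fld("http://www.python66.com"))   # python66.com
print(naive_fld("http://www.example.co.uk"))  # co.uk, which is a suffix, not a domain
```

This is the gap that a suffix-aware library such as tld is meant to fill.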
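The code below uses this module to extract four parts of a URL: the domain without its suffix, the domain suffix, the first-level domain (domain plus suffix), and the subdomain (without the suffix).
Note: the URL must include a scheme such as http or https; otherwise an error is raised, as shown below: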
Is not a valid URL www.python66.com!
# -*- coding: utf-8 -*-
import tld

# The scheme (http:// or https://) is missing, so get_tld raises TldBadUrl.
url = 'www.python66.com'
obj = tld.get_tld(url, as_object=True)
Traceback (most recent call last):
  File "D:/pyscript/py3script/python66/test/a.py", line 6, in <module>
    obj = tld.get_tld(url,as_object=True)
  File "D:\python3install\lib\site-packages\tld\utils.py", line 490, in get_tld
    parser_class=parser_class
  File "D:\python3install\lib\site-packages\tld\utils.py", line 328, in process_url
    raise TldBadUrl(url=url)
tld.exceptions.TldBadUrl: Is not a valid URL www.python66.com!
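One way to avoid this error is to normalize the input before handing it to tld. The following is a minimal sketch using only the standard library. The helper name ensure_scheme and the default of "http" are assumptions made for illustration, not part of tld.

```python
from urllib.parse import urlsplit


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if a host can be parsed from it, else prepend a scheme."""
    # Without a scheme (or a leading //), urlsplit cannot identify the host,
    # so hostname is None for input such as 'www.python66.com'.
    if urlsplit(url).hostname:
        return url
    return f"{default_scheme}://{url}"


print(ensure_scheme("www.python66.com"))          # http://www.python66.com
print(ensure_scheme("https://www.python66.com"))  # https://www.python66.com
```

The normalized string can then be passed to tld.get_tld. Recent versions of tld also document a fix_protocol=True argument to get_tld that serves the same purpose; check the documentation of your installed version before relying on it.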
1. An ordinary domain name
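# -*- coding: utf-8 -*-
import tld

url = 'http://www.python66.com'
# as_object=True returns a result object instead of a plain string.
obj = tld.get_tld(url, as_object=True)
print(obj.domain)     # domain without the suffix
print(obj.extension)  # domain suffix
print(obj.fld)        # first-level domain (domain plus suffix)
print(obj.subdomain)  # subdomain part
print(obj.suffix)     # same value as extension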
python66
com
python66.com
www
com
2. A subdomain with many levels
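# -*- coding: utf-8 -*-
import tld

url = 'http://www.python66.com.cn.uk'
obj = tld.get_tld(url, as_object=True)
print(obj.domain)     # domain without the suffix
print(obj.extension)  # domain suffix
print(obj.fld)        # first-level domain (domain plus suffix)
print(obj.subdomain)  # everything to the left of the domain
print(obj.suffix)     # same value as extension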
cn
uk
cn.uk
www.python66.com
uk
3. A domain with an unusual suffix (if the suffix you use is obscure and the tld library has no record of it, an error is raised)
didn't match any existing TLD name!
# -*- coding: utf-8 -*-
import tld

# 'ui' is not a suffix known to tld, so get_tld raises TldDomainNotFound.
url = 'http://www.anjuke.co.ui'
obj = tld.get_tld(url, as_object=True)
print(obj.domain)
print(obj.extension)
print(obj.fld)
print(obj.subdomain)
print(obj.suffix)
Traceback (most recent call last):
  File "D:/pyscript/py3script/python66/test/a.py", line 6, in <module>
    obj = tld.get_tld(url,as_object=True)
  File "D:\python3install\lib\site-packages\tld\utils.py", line 490, in get_tld
    parser_class=parser_class
  File "D:\python3install\lib\site-packages\tld\utils.py", line 378, in process_url
    raise TldDomainNotFound(domain_name=domain_name)
tld.exceptions.TldDomainNotFound: Domain www.anjuke.co.ui didn't match any existing TLD name!
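The three results above can be understood as matching the host name against a list of known suffixes. The sketch below is a simplified illustration, not tld's actual implementation. The KNOWN_SUFFIXES set is a tiny hand-picked stand-in for the suffix data the library ships with, and split_host is a hypothetical helper name.

```python
# Tiny hand-picked stand-in for real suffix data (an assumption for this sketch).
KNOWN_SUFFIXES = {"com", "cn", "com.cn", "uk", "co.uk"}


def split_host(host):
    """Split a host name into (subdomain, domain, suffix).

    Returns None when no known suffix matches, which corresponds to the
    case where tld raises TldDomainNotFound.
    """
    labels = host.lower().split(".")
    # Try the longest candidate suffix first; start at index 1 so that at
    # least one label remains to serve as the domain.
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in KNOWN_SUFFIXES:
            domain = labels[i - 1]
            subdomain = ".".join(labels[: i - 1])
            return subdomain, domain, suffix
    return None


print(split_host("www.python66.com"))        # ('www', 'python66', 'com')
print(split_host("www.python66.com.cn.uk"))  # ('www.python66.com', 'cn', 'uk')
print(split_host("www.anjuke.co.ui"))        # None
```

In the second case only "uk" matches a known suffix, so "cn" becomes the domain and everything to its left is the subdomain. In the third case nothing matches. If you would rather get None than an exception from tld itself, get_tld accepts fail_silently=True.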
Note from python编程网: when reposting, please credit the source, www.python66.com.