Python Library: urllib

built-in library for URL handling

Library resources
PyPI ---
GitHub https://github.com/python/cpython/tree/3.10/Lib/urllib/
Documentation https://docs.python.org/3/library/urllib.html

29 Sep 2022

urllib is a package that collects several modules for working with URLs:

  • urllib.request for opening and reading URLs
  • urllib.error containing the exceptions raised by urllib.request
  • urllib.parse for parsing URLs
  • urllib.robotparser for parsing robots.txt files
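As a rough offline sketch of how the four modules divide the work: the example.com URL, the "demo-bot" user agent, and the robots.txt rules below are made up for illustration, and nothing here touches the network.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urlparse
from urllib.request import Request
from urllib.robotparser import RobotFileParser

# Hypothetical URL, used only for illustration.
url = "https://example.com/private/report.html"

# urllib.parse: split the URL into its components.
parts = urlparse(url)

# urllib.request: describe a request without sending it.
req = Request(url, headers={"User-Agent": "demo-bot"})

# urllib.robotparser: evaluate made-up robots.txt rules supplied as lines.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /private/"])
allowed = rules.can_fetch("demo-bot", url)

# urllib.error: HTTPError subclasses URLError, so catching URLError covers both.
is_subclass = issubclass(HTTPError, URLError)

print(parts.netloc, req.get_method(), allowed, is_subclass)
# example.com GET False True
```

Actually fetching the page would mean passing req to urllib.request.urlopen, typically inside a try/except URLError block.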

urllib.parse — Parse URLs into components

from urllib.parse import urlparse

urlparse("scheme://netloc/path;parameters?query#fragment")

# ParseResult(scheme='scheme', netloc='netloc', path='/path;parameters', params='',
#             query='query', fragment='fragment')

o = urlparse("http://docs.python:80/3/library/urllib.parse.html?highlight=params#url-parsing")
o
# ParseResult(scheme='http', netloc='docs.python:80',
#             path='/3/library/urllib.parse.html', params='',
#             query='highlight=params', fragment='url-parsing')

o.scheme
# 'http'
o.netloc
# 'docs.python:80'
o.hostname
# 'docs.python'
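The parsed result can also be taken apart further and put back together. The following is a small sketch using the same URL as above; the variable names (port, query, secure, sibling) are only illustrative.

```python
from urllib.parse import parse_qs, urljoin, urlparse, urlunparse

o = urlparse("http://docs.python:80/3/library/urllib.parse.html?highlight=params#url-parsing")

# The port is split out of netloc and returned as an integer.
port = o.port
# 80

# parse_qs turns the query string into a dict mapping names to lists of values.
query = parse_qs(o.query)
# {'highlight': ['params']}

# ParseResult is a named tuple: _replace returns a modified copy,
# and urlunparse assembles the components back into a string.
secure = urlunparse(o._replace(scheme="https", netloc=o.hostname, fragment=""))
# 'https://docs.python/3/library/urllib.parse.html?highlight=params'

# urljoin resolves a relative reference against a base URL.
sibling = urljoin(o.geturl(), "urllib.request.html")
# 'http://docs.python:80/3/library/urllib.request.html'
```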
