[utils] Encode hostnames before passing to urllib

With an IDN (Internationalized Domain Name) and a proxy, non-ASCII URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError.

Fixes #8890
Yen Chi Hsuan 2016-03-23 22:24:52 +08:00
parent 7da2c87119
commit efbed08dc2
2 changed files with 11 additions and 0 deletions

@@ -1746,6 +1746,7 @@ def escape_url(url):
     """Escape URL as suggested by RFC 3986"""
     url_parsed = compat_urllib_parse_urlparse(url)
     return url_parsed._replace(
+        netloc=url_parsed.netloc.encode('idna').decode('ascii'),
         path=escape_rfc3986(url_parsed.path),
         params=escape_rfc3986(url_parsed.params),
         query=escape_rfc3986(url_parsed.query),
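
For illustration only, here is a minimal standalone sketch of the netloc encoding step in the hunk above. It assumes Python 3 and uses the standard library's `urllib.parse.urlparse` as a stand-in for `compat_urllib_parse_urlparse`. The helper name `idna_encode_netloc` and the example hostname are hypothetical and not part of this commit.

```python
from urllib.parse import urlparse


def idna_encode_netloc(url):
    """Return url with its network location converted to ASCII via IDNA.

    Hypothetical helper: it mirrors only the netloc line added in this
    commit, not the rest of escape_url().
    """
    parsed = urlparse(url)
    # The 'idna' codec converts each non-ASCII label to its
    # Punycode ("xn--...") form and leaves ASCII labels unchanged.
    ascii_netloc = parsed.netloc.encode('idna').decode('ascii')
    return parsed._replace(netloc=ascii_netloc).geturl()


if __name__ == '__main__':
    print(idna_encode_netloc('http://bücher.example/path'))
    # → http://xn--bcher-kva.example/path
```

Only the hostname part is changed here. The path, params, and query are left as they are, since the hunk handles those separately through `escape_rfc3986`.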