It's a little strange that it gives all that HTML back.
I'm not going to try to fix it, as you shouldn't be using urllib at all.
It works fine with Requests, though you also need to do a little parsing with BS (BeautifulSoup).
Example:
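from bs4 import BeautifulSoup
import requests

url = 'https://www.ietf.org/rfc/rfc2324.txt'
response = requests.get(url)
# Parse the response body with the lxml parser
soup = BeautifulSoup(response.content, 'lxml')
# Grab the first <p> element and print its text without surrounding whitespace
pre = soup.find('p')
print(pre.text.strip())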
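If you want it a bit more defensive, here is a rough sketch of the same idea wrapped in functions. The function names, the `parser` parameter and the 10 second timeout are my own choices for illustration, not anything from your code. It adds a timeout, raises on HTTP errors, and returns None when no <p> is found instead of crashing on `.text`.

```python
from bs4 import BeautifulSoup
import requests


def first_paragraph_text(html, parser='lxml'):
    """Return the stripped text of the first <p> element, or None if there is none.

    `parser` is an illustrative parameter; 'lxml' matches the example above.
    """
    soup = BeautifulSoup(html, parser)
    p = soup.find('p')
    return p.text.strip() if p is not None else None


def fetch_first_paragraph(url, timeout=10):
    """Download `url` and return the text of its first <p> element."""
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page
    response.raise_for_status()
    return first_paragraph_text(response.content)


# Example call (needs network access):
# print(fetch_first_paragraph('https://www.ietf.org/rfc/rfc2324.txt'))
```

Splitting the parsing from the download also makes it easy to try the parsing part on a saved string without hitting the site every time.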