I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
+--- Thread: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, (/thread-10404.html)
I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - eddywinch82 - May-19-2018

I would like to download files of the same file types, .utu and .zip, from the following Microsoft Flight Simulator AI Traffic websites:

http://web.archive.org/web/20050315112710/http://www.projectai.com:80/libraries/acfiles.php?cat=6 (Current Repaints)
http://web.archive.org/web/20050315112940/http://www.projectai.com:80/libraries/acfiles.php?cat=1 (Vintage Repaints)

On each of those pages there are subcategories (Airbus, Boeing, etc.) for the AI aircraft types, and the repaint .zip file choices are shown when you click on an aircraft image. The folder name then becomes:

http://web.archive.org/web/20041114195147/http://www.projectai.com:80/libraries/repaints.php?ac=number&cat=(number)

Then, when you click the downloads, repaints.php? becomes download.php?fileid=(4 digit number).

What do I need to type to download all the .zip files at once? Clicking on them individually to download would take ages.

I would also like to download all files with the .utu extension, for Flight 1 Ultimate Traffic AI aircraft repaints, from the following webpage:

http://web.archive.org/web/20060512161232/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0

When you click to download an Ultimate Traffic aircraft texture, the last folder path becomes /utfiles.asp?mode=download&id=F1AIRepaintNumbers-Numbers-Numbers.utu, and I would like to do the same as for the other websites.
I used the following code, written in Python 2.7.9 and found in a video on YouTube, inserting my info to achieve my aim, but unsurprisingly it didn't work when I ran it, just timeouts and errors etc., probably due to its simplicity:

import requests
from bs4 import BeautifulSoup
import wget

def download_links(url):
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")
    for link in soup.findAll('a'):
        href = link.get('href')
        print(href)
        wget.download(href)

download_links('http://web.archive.org/web/20041225023002/http://www.projectai.com:80/libraries/acfiles.php?cat=6')

Traceback error readout from running the code :-

Any help would be much appreciated
Eddie

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - j.crater - May-19-2018

Put your code in Python code tags and the full error traceback message in error tags. You can find help here.

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - wavic - May-19-2018

wget -r -np -A zip,utu -c -U Mozilla http://example.com

This should work. Move to the desired destination directory before that.

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - eddywinch82 - May-19-2018

Hi there wavic, many thanks for your help. I tried out your suggestion in the wget program:

wget -e robots=off -r -np -A utu -c -U Mozilla http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0

But got the following error messages :-

Some of the .utu file downloads might have broken links; could that be what is causing this problem? How do I get round that?
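A note on why the script in the first post struggles: many hrefs on the archived pages are relative, so wget.download receives something that is not a full URL, and the loop tries to fetch every link on the page rather than just the archive files. Below is a minimal sketch of a filtered downloader, assuming the wanted links end in .zip or .utu; the function names are mine, not from the thread:

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def collect_file_links(base_url, html, exts=('.zip', '.utu')):
    """Return absolute URLs for links ending in one of the given extensions."""
    soup = BeautifulSoup(html, 'html.parser')
    found = []
    for a in soup.find_all('a', href=True):
        absolute = urljoin(base_url, a['href'])  # resolve relative hrefs
        if absolute.lower().endswith(exts):
            found.append(absolute)
    return found


def download_all(base_url):
    page = requests.get(base_url, timeout=30)
    page.raise_for_status()
    for link in collect_file_links(base_url, page.text):
        # Prefer the 'id=' query value as the filename (the flight1.net
        # pattern); otherwise fall back to the last path component.
        if 'id=' in link:
            name = link.split('id=')[-1]
        else:
            name = os.path.basename(urlparse(link).path)
        try:
            resp = requests.get(link, timeout=30)
            resp.raise_for_status()
        except requests.RequestException as exc:
            print('skipping', link, '-', exc)  # broken archive links are common
            continue
        with open(name, 'wb') as f:
            f.write(resp.content)
```

The endswith check runs on the whole URL rather than just the path on purpose: on the flight1.net listing the .utu name sits in the id= query parameter, not in the path.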
RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - wavic - May-19-2018

Try enclosing the address in quotes.

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - eddywinch82 - May-19-2018

How do I do that? What shall I type?

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - wavic - May-19-2018

Double quotes around the address: "http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0"

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - eddywinch82 - May-19-2018

I see, many thanks. I will try that and get back to you.

Hi wavic, I did what you suggested, and still no .utu files were downloaded, only a .tmp file.

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - wavic - May-19-2018

I tested it and got nothing. I see that the link to the file starts with tp://. I think that means transport protocol, and I am not sure wget can handle it. Basically, this is how I download part of a website or a bunch of other files, but here it doesn't work that way. Perhaps some web scraping has to be involved.

RE: I Want To Download Many Files Of Same File Extension With Either Wget Or Python, - snippsat - May-19-2018

(May-19-2018, 10:55 AM)eddywinch82 Wrote: I used the following code, written in Python 2.7.9 and found in a video on YouTube, inserting my info to achieve my aim, but unsurprisingly it didn't work when I ran it, just timeouts and errors etc., probably due to its simplicity :-

Yes, I guess there is some lack of understanding of this topic, and maybe of Python in general. The "wget everything" method may or may not work; when it doesn't, go back and look at the site source for another approach. As I took a look, it's not so difficult to get all the .utu files.
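For context on why the quotes matter: in an unquoted command line the shell treats & as a command separator, so wget is backgrounded with only the part of the URL before &index=0, and index=0 runs as a separate command. A quick illustration:

```shell
url='http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0'

# Unquoted, the shell splits the command at '&' and wget never sees the
# full query string. Quoting "$url" delivers it intact, i.e. the working
# form of the earlier command is:
#   wget -e robots=off -r -np -A utu -c -U Mozilla "$url"
printf '%s\n' "$url"
```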
from bs4 import BeautifulSoup
import requests

url = 'http://web.archive.org/web/20070611232047/http://ultimatetraffic.flight1.net:80/utfiles.asp?mode=1&index=0'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
# The download links on this page sit inside <b> tags
for b_tag in soup.find_all('b'):
    a = b_tag.find('a')
    if a is None:  # some <b> tags hold no link; skip them
        continue
    link = a['href']
    f_name = link.split('id=')[-1]  # filename is the 'id' query parameter
    with open(f_name, 'wb') as f:
        f.write(requests.get(link).content)
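One fragile spot in the script above is f_name = link.split('id=')[-1]: for a link without an id= parameter, the whole URL becomes the filename. A slightly safer variant parses the query string instead (the helper name is mine, not from the thread):

```python
import os
from urllib.parse import urlparse, parse_qs


def filename_from_link(link):
    """Use the 'id' query parameter as the filename, else the URL's basename."""
    parts = urlparse(link)
    query = parse_qs(parts.query)
    if 'id' in query:
        return query['id'][0]
    return os.path.basename(parts.path)
```

This handles both the flight1.net pattern (utfiles.asp?mode=download&id=….utu) and plain file links like files/plane.zip.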