Python Forum
Scraping all website text using Python
#1
I am very new to Python (so sorry in advance for asking stupid questions). I have an Excel sheet that lists a couple of companies, each with a unique company identifier and the respective URL next to it.

What I would like to do is open each URL and save all the website text (the complete text from the first page of the website) for each company to a separate .txt file. The name of each file should be the unique identifier from the Excel sheet.

Has anyone done something similar in the past, or could someone help me with the code for this task?

That would be great!!
#2
I suggest that you go through snippsat's web scraping tutorials here:
web scraping part 1
web scraping part 2
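
To make the task from post #1 concrete, here is a minimal sketch of one possible approach, not necessarily the one the tutorials take. It assumes the identifier is in column A and the URL in column B of the first worksheet, with a header in row 1; the layout of the sheet is not given in the thread, so adjust as needed. The function names (html_to_text, fetch_html, read_rows, save_texts) and the file name companies.xlsx are made up for illustration. Reading the Excel file uses the third-party openpyxl package; everything else is standard library. Note that this only sees the HTML the server returns, so text that a page builds with JavaScript will be missing.

```python
import re
import urllib.request
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of script/style tags."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, one text node per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_html(url, timeout=10):
    """Download a page and decode it with the charset the server reports."""
    # Some sites reject requests without a browser-like User-Agent header.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def read_rows(xlsx_path):
    """Return (identifier, url) pairs; ASSUMES columns A and B, header in row 1."""
    from openpyxl import load_workbook  # third-party: pip install openpyxl

    wb = load_workbook(xlsx_path, read_only=True)
    try:
        rows = []
        for row in wb.active.iter_rows(min_row=2, values_only=True):
            if len(row) >= 2 and row[0] and row[1]:
                rows.append((str(row[0]), str(row[1])))
        return rows
    finally:
        wb.close()


def save_texts(rows, out_dir, fetch=fetch_html):
    """Write <identifier>.txt for each row; skip pages that fail to load."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for identifier, url in rows:
        try:
            text = html_to_text(fetch(url))
        except Exception as exc:  # network errors, bad URLs, etc.
            print(f"Skipping {identifier} ({url}): {exc}")
            continue
        # Replace characters that are not safe in file names.
        safe_name = re.sub(r"[^\w.-]", "_", identifier)
        path = out / f"{safe_name}.txt"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written


# Example call (hypothetical file and folder names):
# save_texts(read_rows("companies.xlsx"), "company_texts")
```

The fetch function is passed in as a parameter so the saving logic can be tried out with canned HTML before hitting real websites. For messy real-world pages, a dedicated HTML library such as BeautifulSoup may cope better than the standard-library parser used here.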