Python Forum
Scraping all website text using Python
#1
I am very new to Python (so sorry in advance for asking stupid questions). I have an Excel sheet that lists a couple of companies, each with a unique company identifier and the corresponding URL next to it.

What I would like to do is open each URL and save all the website text (the complete text from the first page of the website) to a separate .txt file for each company. The name of each file should be the unique identifier from the Excel sheet.

Has anyone done something similar in the past, or could someone help me with the code for this task?

That would be great!!
#2
I suggest that you go through snippsat's web scraping tutorials here:
web scraping part 1
web scraping part 2
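
To make the task from post #1 concrete, here is a minimal sketch using only the standard library. It is not taken from the tutorials; it is just one possible approach under some assumptions: the Excel sheet has been exported to a CSV file with columns named "id" and "url" (hypothetical names), each identifier is safe to use as a file name, and the pages serve their text as plain HTML rather than building it with JavaScript. The function names (html_to_text, fetch_html, read_rows, save_company_texts) are made up for this example.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import Request, urlopen


class TextExtractor(HTMLParser):
    """Collect the text of a page, skipping <script> and <style> contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text of an HTML document, one text fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_html(url):
    """Download a page and decode it with the charset the server declares."""
    # Some sites reject requests without a User-Agent header.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def read_rows(csv_path):
    """Read (identifier, url) pairs; the column names are assumptions."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [(row["id"], row["url"]) for row in csv.DictReader(handle)]


def save_company_texts(rows, out_dir, fetch=fetch_html):
    """Write <identifier>.txt for every (identifier, url) pair in rows."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for identifier, url in rows:
        try:
            text = html_to_text(fetch(url))
        except Exception as exc:
            # One unreachable site should not stop the whole run.
            print(f"Skipping {identifier} ({url}): {exc}")
            continue
        path = out_dir / f"{identifier}.txt"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

With a file called companies.csv (again, a hypothetical name), calling save_company_texts(read_rows("companies.csv"), "output") would write one text file per company into the output folder. If you would rather read the Excel file directly, a library such as pandas or openpyxl can supply the same (identifier, url) pairs, and the requests and BeautifulSoup packages covered in the tutorials are a common alternative to the download and parsing parts.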