Python Forum
Email extraction from websites
#15
(Aug-18-2017, 08:55 PM)snippsat Wrote:
(Aug-18-2017, 07:46 PM)stefanoste78 Wrote: I do not want to extract emails from a file. The Excel file contains URLs in a column, and I was hoping it would be possible to extract all the emails from each site and put them beside its cell.
You can read the Excel file with e.g. openpyxl or pandas.
Pass the URL from each cell to Requests.
From there it is web scraping; I have a tutorial here.
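
A minimal sketch of those steps, assuming the URLs sit in column A of the first sheet and the results should go into column B. The names iter_url_cells, fetch_html and process_workbook are hypothetical, and extract stands for any function you supply that turns a page's HTML into a list of addresses (for instance one built on the mailto selector described further down in this post).

```python
import requests
from openpyxl import load_workbook


def iter_url_cells(ws, column=1, first_row=1):
    """Yield (row_number, url) for every non-empty text cell in the column."""
    for row in range(first_row, ws.max_row + 1):
        value = ws.cell(row=row, column=column).value
        if isinstance(value, str) and value.strip():
            yield row, value.strip()


def fetch_html(url, timeout=10):
    """Return the page HTML, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.RequestException:
        return None
    return response.text


def process_workbook(path, extract, fetch=fetch_html, url_column=1, out_column=2):
    """Fetch every URL in url_column and write the extracted emails beside it."""
    wb = load_workbook(path)
    ws = wb.active  # assumption: the URLs are on the active (first) sheet
    for row, url in iter_url_cells(ws, url_column):
        html = fetch(url)
        emails = extract(html) if html else []
        ws.cell(row=row, column=out_column, value=", ".join(emails))
    wb.save(path)  # overwrites the original file; work on a copy first
```

Calling process_workbook("sites.xlsx", extract=my_extract) would then fill column B, where both the file name and my_extract are placeholders. The fetch parameter is only there so the download step can be swapped out, for example when trying the code without network access.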

Email links in HTML have a distinctive form: the href begins with mailto:.
So you can use a CSS selector like a[href^="mailto"].
Example:
from bs4 import BeautifulSoup

# Simulate a web page
html = '''\
<html>
  <body>
    <p>Email me at <a class="emaillink" href="mailto:[email protected]">[email protected]</a></p>
    <p>Email<a id='foo' href="mailto:[email protected]">[email protected]</a></p>
  </body>
</html>'''

soup = BeautifulSoup(html, 'lxml')
email = soup.select('a[href^="mailto"]')
for link in email:
    print(link.text)
Output:
[email protected]
[email protected]
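
Note that the loop above prints the link text, which on real pages is often something like "Contact us" rather than the address itself. A minimal sketch that reads the address from the href instead, assuming the same mailto selector; emails_from_html is a hypothetical name, and the built-in html.parser is used in place of lxml only to avoid an extra install.

```python
from urllib.parse import unquote, urlsplit

from bs4 import BeautifulSoup


def emails_from_html(html):
    """Return the unique addresses found in mailto: links, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    found = []
    for link in soup.select('a[href^="mailto:"]'):
        # urlsplit separates the address part from any ?subject=... query.
        path = unquote(urlsplit(link["href"]).path)
        # A single mailto: link may list several comma-separated addresses.
        for address in path.split(","):
            address = address.strip()
            if address and address not in found:
                found.append(address)
    return found
```

This only finds addresses that are published as mailto: links; addresses written as plain text on the page are not picked up.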

Hello. I noticed that you've already dealt with extracting emails from links.
The problem is that I'm not a programmer, and I could not put into practice what you described in the links you posted.
Have you ever written something where the code fetches the data from Excel? That would be easier for me: I would just need to put the links in a column and then start the extraction.


Messages In This Thread
Email extraction from websites - by stefanoste78 - Aug-13-2017, 12:54 PM
RE: Email extraction from websites - by nilamo - Aug-13-2017, 05:07 PM
RE: Email extraction from websites - by nilamo - Aug-17-2017, 01:41 PM
RE: Email extraction from websites - by nilamo - Aug-17-2017, 06:32 PM
RE: Email extraction from websites - by nilamo - Aug-17-2017, 07:00 PM
RE: Email extraction from websites - by wavic - Aug-18-2017, 09:32 AM
RE: Email extraction from websites - by DeaD_EyE - Aug-18-2017, 11:34 AM
RE: Email extraction from websites - by snippsat - Aug-18-2017, 08:55 PM
RE: Email extraction from websites - by stefanoste78 - Aug-18-2017, 09:44 PM
