Aug-18-2017, 09:44 PM
(Aug-18-2017, 08:55 PM)snippsat Wrote:
(Aug-18-2017, 07:46 PM)stefanoste78 Wrote: I do not want to extract emails from a file. The Excel file contains URLs in a column, and I was hoping it is possible to extract all the emails from each site into the cell beside its URL.
You can read the Excel file with e.g. Openpyxl or Pandas.
The URL from each cell you pass to Requests.
After that it is web scraping; I have a tutorial here.
Email links in HTML have a unique look: the href begins with mailto:
So you can use a CSS selector like a[href^="mailto"]
Example:
from bs4 import BeautifulSoup

# Simulate a web page
html = '''\
<html>
  <body>
    <p>Email me at <a class="emaillink" href="mailto:[email protected]">[email protected]</a></p>
    <p>Email<a id='foo' href="mailto:[email protected]">[email protected]</a></p>
  </body>
</html>'''

soup = BeautifulSoup(html, 'lxml')
# Select every <a> tag whose href starts with "mailto"
email = soup.select('a[href^="mailto"]')
for link in email:
    print(link.text)
Output:
[email protected]
[email protected]
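The steps above (read the URLs from Excel, fetch each page with Requests, pick out the mailto links) could be put together roughly like this. It is only a sketch under some assumptions: the URLs are in column A of the active sheet with a header in row 1, and the results go into column B. The function names and the column layout are made up for illustration. It only looks at the one page behind each URL, not the whole site, and it only finds addresses that appear as mailto links.

```python
from urllib.parse import unquote


def address_from_mailto(href):
    """Turn 'mailto:a@b.com?subject=Hi' into 'a@b.com'."""
    address = href.split(":", 1)[1]     # drop the 'mailto' scheme
    address = address.split("?", 1)[0]  # drop '?subject=...' if present
    return unquote(address).strip()     # decode %40 etc. and trim spaces


def emails_in_html(html):
    """Return the sorted, de-duplicated addresses from mailto links."""
    # Third-party packages are imported inside the functions so the
    # helper above can be tried without them installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    links = soup.select('a[href^="mailto:"]')
    found = {address_from_mailto(a["href"]) for a in links}
    return sorted(addr for addr in found if addr)


def fill_emails(path, url_column=1, email_column=2, first_row=2):
    """Read a URL from each row and write the emails found beside it.

    Assumed layout (not from the thread): URLs in column A, header in
    row 1, results written to column B of the same workbook.
    """
    import openpyxl
    import requests

    wb = openpyxl.load_workbook(path)
    ws = wb.active
    for row in range(first_row, ws.max_row + 1):
        url = ws.cell(row=row, column=url_column).value
        if not url:
            continue  # skip empty cells
        try:
            resp = requests.get(url, timeout=10)
            resp.raise_for_status()
            found = emails_in_html(resp.text)
        except requests.RequestException as exc:
            # Record the failure in the sheet instead of stopping
            found = [f"error: {exc}"]
        ws.cell(row=row, column=email_column, value=", ".join(found))
    wb.save(path)
```

With the three packages installed, calling fill_emails("sites.xlsx") would process a workbook of that name (the file name is just an example) and save the results back into the same file, so it is worth running it on a copy first.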
Hello. I noticed that you've already dealt with the case of extracting emails from the links.
The problem is that I'm not a programmer, and I could not put into practice what you explained in the links you posted.
Have you ever done something like this where the code reads the data from Excel? For me that would be easier. I would just need to put the links in a column and then start the extraction.