Aug-18-2017, 07:46 PM
(Aug-18-2017, 11:34 AM)DeaD_EyE Wrote: You can use regex: http://emailregex.com/
But if you want to open a regular Excel file, you have formatting and maybe binary data inside. The newer format is based on XML; the older Excel format is something else. You can use the hammer method and scan the whole file for e-mail addresses.
import re

email_regex = r"([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)"
with open('youfile.xls') as fd:
    emails = set(re.findall(email_regex, fd.read()))
print(emails)

Maybe you are lucky and the text is encoded as UTF-8.
But it's better to use a library to open Excel files, or to export the Excel sheet as CSV and use Python's stdlib for this task.
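A minimal sketch of the CSV route, assuming the sheet has already been exported as CSV. The helper name `emails_from_csv` and the sample data are made up for illustration; the regex is the one from above.

```python
import csv
import io
import re

EMAIL_REGEX = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+")

def emails_from_csv(csv_file):
    """Return the set of e-mail addresses found in any cell of a CSV file object."""
    found = set()
    for row in csv.reader(csv_file):
        for cell in row:
            found.update(EMAIL_REGEX.findall(cell))
    return found

# Stand-in for an exported sheet (hypothetical data).
# With a real export, pass open('export.csv', newline='') instead.
sample = io.StringIO("name,contact\nAnn,ann@example.com\nBob,write to bob@example.org\n")
print(emails_from_csv(sample))
```

Because the `csv` module hands back plain text cells, the binary and formatting data of the Excel file never reach the regex.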
Thank you for the reply
I do not want to extract emails from a file. The Excel file contains URLs in a column, and I was hoping it is possible to extract all the emails from each site and put them beside the corresponding cell.
I mentioned Excel because I use it frequently ... what I'm interested in is all the emails extracted from the links.
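For reference, what is being asked for could be sketched roughly like this, using only the standard library. It assumes the URLs have already been read out of the column into a list; `fetch_html` and `emails_per_url` are hypothetical helper names, and only the single page behind each URL is scanned, not the whole site.

```python
import re
import urllib.request

EMAIL_REGEX = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+")

def fetch_html(url, timeout=10):
    """Download a page and decode it leniently; return '' on URL or connection errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="ignore")
    except (OSError, ValueError):
        return ""

def emails_per_url(urls, fetch=fetch_html):
    """Map each URL to the sorted list of e-mail addresses found in its page."""
    return {url: sorted(set(EMAIL_REGEX.findall(fetch(url)))) for url in urls}

# Offline demonstration: a stub replaces the real download (made-up page content).
pages = {"http://example.com": "<a href='mailto:info@example.com'>mail</a>"}
print(emails_per_url(pages, fetch=lambda url: pages[url]))
```

The resulting dictionary can then be written back next to each URL, for example with `csv.writer`, one row per URL with the addresses joined into a second column.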