BeautifulSoup help !
Python Forum (https://python-forum.io) > Web Scraping & Web Development (https://python-forum.io/forum-13.html) > Thread: /thread-346.html
BeautifulSoup help ! - navsid - Oct-06-2016

I've just started with Python, and a company gave me an assignment as a recruitment task. I need to scrape the coupons of all the websites listed on www.couponraja.in and export them to CSV. The CSV needs to contain the coupon title, vendor, validity, description/detail, URL of the vendor, and image URL of the coupon.

I have gone through many tutorials on BeautifulSoup and have a beginner's understanding of it. I wrote some code as well, but the problem I'm facing is that when I collect the info from the divs that contain all those details, I get it with all the HTML tags, and the info is clustered together.

The code I'm using:

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.couponraja.in/amazon"
    r = requests.get(url)  # fetch the page
    soup = BeautifulSoup(r.content, "html.parser")  # name the parser explicitly
    g_data = soup.find_all("div", {"class": "nw-offrtxt"})  # every offer div
    for item in g_data:
        print(item.contents)  # prints the child nodes, HTML tags included

I also need help exporting the info to CSV. I know I need to import csv and write the information to a CSV file, but I can't work out how to do that. Any help will be appreciated.

RE: BeautifulSoup help ! - wavic - Oct-06-2016

Hello! What is in g_data? Give us a portion of the output.

RE: BeautifulSoup help ! - navsid - Oct-06-2016

(Oct-06-2016, 09:48 AM)wavic Wrote: Hello! What is in g_data? Give us a portion of the output.

g_data holds the information from the nw-offrtxt divs of the web page. The place I have circled includes all the information I'm seeking; on the other side is the div "nw-offrtxt", which contains that information. I'm a noob here, so I might be a bit off track.

RE: BeautifulSoup help ! - wavic - Oct-06-2016

I meant the output of print(g_data[0]).
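
For the two things asked about in this thread (text without the HTML tags, and CSV export), here is a minimal sketch, not a full solution for the site. It assumes the tag's get_text() method is enough to get readable text out of each div, and it runs on a made-up HTML snippet instead of the live page: only the class name "nw-offrtxt" comes from the thread, while the inner tags, the sample offers, and the single "text" column are assumptions for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup: only the "nw-offrtxt" class name comes from the thread;
# the inner tags and the offer texts are invented for this example.
SAMPLE_HTML = """
<div class="nw-offrtxt"><h3>Flat 10% off</h3><p>Valid till <b>31 Oct</b></p></div>
<div class="nw-offrtxt"><h3>Free delivery</h3><p>On orders above Rs. 499</p></div>
"""


def extract_offers(html):
    """Return one dict per div.nw-offrtxt holding its text without HTML tags."""
    soup = BeautifulSoup(html, "html.parser")
    offers = []
    for div in soup.find_all("div", {"class": "nw-offrtxt"}):
        # get_text() drops the tags; strip=True trims each text fragment and
        # separator=" " joins the fragments with single spaces.
        offers.append({"text": div.get_text(separator=" ", strip=True)})
    return offers


def write_csv(rows, fileobj):
    """Write the rows to an open text file object, header row first."""
    writer = csv.DictWriter(fileobj, fieldnames=["text"])
    writer.writeheader()
    writer.writerows(rows)


offers = extract_offers(SAMPLE_HTML)

# An in-memory buffer keeps the example self-contained; a real file works the same way.
buffer = io.StringIO()
write_csv(offers, buffer)
print(buffer.getvalue())
```

To write a real file, pass something like open("coupons.csv", "w", newline="") (the file name is just an example) instead of the StringIO buffer; in Python 3 the csv module expects newline="" on files it writes. Separate columns such as title, vendor, or validity would need a look at the actual structure inside each div, which the thread does not show.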