python 2.7 - BeautifulSoup, ignore <a> </a> tags and get all the text inside <p> </p>
I want the text inside every <p> tag that belongs to news1.
import requests
from bs4 import BeautifulSoup

r1 = requests.get("http://www.metalinjection.net/shocking-revelations/machine-heads-robb-flynn-addresses-controversial-photo-from-his-past-in-the-wake-of-charlottesville")
data1 = r1.text
soup1 = BeautifulSoup(data1, "lxml")
news1 = soup1.find_all("div", {"class": "article-detail"})
for x in news1:
    # find() returns only the first matching <p> in each div
    print(x.find("p").text)
This prints only the text of the first <p>. When I call find_all instead, it gives the following error:
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
So I made a list, but I am still getting the same kind of error:
text1 = []
for x in news1:
    # find_all() returns a ResultSet, which has no .text attribute
    text1.append(x.find_all("p").text)
print(text1)
The error when running this code is: AttributeError: 'ResultSet' object has no attribute 'text'
That error is reasonable, because a bs4 ResultSet is a list of Tag elements, and .text exists on each Tag, not on the list. You can get the text of every 'p' tag if you loop over that iterable:
text1 = []
for x in news1:
    # iterate over the ResultSet and read .text from each individual Tag
    for i in x.find_all("p"):
        text1.append(i.text)
Or as a one-liner, using a list comprehension:
text1 = [i.text for x in news1 for i in x.find_all("p")]
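
To try this without fetching the page, here is a minimal sketch that runs the same nested loop on a small inline document. The HTML string is a made-up stand-in for the real article markup (only the "article-detail" class comes from the question), and "html.parser" is used instead of "lxml" just to avoid the extra dependency. It also shows that the text of a <p> keeps the words inside a nested <a> while dropping the tag itself, and that a CSS selector via select() is one possible alternative to the nested loop.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real article page.
html = """
<div class="article-detail">
  <p>First paragraph with a <a href="#">link</a> inside.</p>
  <p>Second paragraph.</p>
</div>
<div class="article-detail">
  <p>Third paragraph.</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list of Tag objects), so it must be iterated.
news = soup.find_all("div", {"class": "article-detail"})

# Nested comprehension: every <p> inside every matching <div>.
texts = [p.get_text() for div in news for p in div.find_all("p")]

# Alternative: a CSS selector that matches the same <p> tags in one call.
texts_css = [p.get_text() for p in soup.select("div.article-detail p")]

for t in texts:
    print(t)
```

Both lists should hold the same three strings, with the <a> markup gone and its text kept as part of the first paragraph.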