Python 2.7 - BeautifulSoup, ignore <a> </a> tags and get all the text inside <p> </p>

May 15, 2014

I want the text inside every <p> tag that belongs to news1.

    import requests
    from bs4 import BeautifulSoup

    r1 = requests.get("http://www.metalinjection.net/shocking-revelations/machine-heads-robb-flynn-addresses-controversial-photo-from-his-past-in-the-wake-of-charlottesville")
    data1 = r1.text
    soup1 = BeautifulSoup(data1, "lxml")
    news1 = soup1.find_all("div", {"class": "article-detail"})

    for x in news1:
        print x.find("p").text

This gives me the text of the first <p> only, and when I call find_all instead I get the following error:

AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

So I made a list, but I am still getting the same error:

    text1 = []
    for x in news1:
        text1.append(x.find_all("p").text)

    print text1

The error when running that code is: AttributeError: 'ResultSet' object has no attribute 'text'. That is reasonable, because a bs4 ResultSet is a list of Tag elements and has no text of its own. You can get the text of every 'p' tag if you loop over that iterable:

    text1 = []
    for x in news1:
        for i in x.find_all("p"):
            text1.append(i.text)

Or as a one-liner, using a list comprehension:

    text1 = [i.text for x in news1 for i in x.find_all("p")]
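
To try this without hitting the network, here is a minimal sketch on an inline HTML string. The markup, the "sidebar" class, and the variable names are made up for illustration and are not the real page; it also uses the built-in "html.parser" instead of "lxml" so nothing extra needs to be installed. It reproduces the AttributeError and then applies the nested loop from above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; not the real article markup.
html = """
<div class="article-detail">
  <p>First paragraph with a <a href="#">link</a> inside.</p>
  <p>Second paragraph.</p>
</div>
<div class="sidebar"><p>Not part of the article.</p></div>
"""

soup = BeautifulSoup(html, "html.parser")
# find_all() returns a ResultSet, i.e. a list of Tag objects.
news = soup.find_all("div", {"class": "article-detail"})

# A ResultSet has no .text of its own, so this raises AttributeError.
try:
    news[0].find_all("p").text
    failed = False
except AttributeError:
    failed = True

# Iterating over the ResultSet yields Tag objects, and each Tag has .text.
paragraphs = [p.text for div in news for p in div.find_all("p")]
print(paragraphs)
```

Note that .text drops the <a> markup but keeps the words inside it, so the link text still appears in the paragraph string.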
