python 2.7 - BeautifulSoup, ignore <a> </a> tags and get all the text inside <p> </p>


I want the text inside every <p> tag that belongs to news1.

import requests
from bs4 import BeautifulSoup

r1 = requests.get("http://www.metalinjection.net/shocking-revelations/machine-heads-robb-flynn-addresses-controversial-photo-from-his-past-in-the-wake-of-charlottesville")
data1 = r1.text
soup1 = BeautifulSoup(data1, "lxml")
news1 = soup1.find_all("div", {"class": "article-detail"})

for x in news1:
    print x.find("p").text

This prints only the text of the first <p>. Beyond that, when I call find_all, it gives the following error:

AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

So I made a list, but I am still getting the same error.

text1 = []
for x in news1:
    text1.append(x.find_all("p").text)

print text1

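The error when running this code is: AttributeError: 'ResultSet' object has no attribute 'text'. That is reasonable, because a bs4 ResultSet is a list of Tag elements, and only the individual Tag elements have a .text attribute. You can get the text of every 'p' tag if you loop over that iterable.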

text1 = []
for x in news1:
    for i in x.find_all("p"):
        text1.append(i.text)

Or, as a one-liner, using a list comprehension:

text1 = [i.text for x in news1 for i in x.find_all("p")]
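
To try this without fetching the page, here is a minimal sketch on a small inline document. The HTML snippet is a hypothetical stand-in for the real article markup, and it uses the standard-library "html.parser" instead of "lxml" so that no extra parser is needed. It reproduces the AttributeError and then collects the paragraph text by iterating over the result of find_all.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
html = """
<div class="article-detail">
  <p>First paragraph with a <a href="#">link</a> inside.</p>
  <p>Second paragraph.</p>
</div>
"""

# "html.parser" ships with Python, so lxml is not required for this sketch.
soup = BeautifulSoup(html, "html.parser")
news = soup.find_all("div", {"class": "article-detail"})

# find_all() returns a list-like ResultSet, which has no .text of its own.
try:
    news[0].find_all("p").text
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Iterating over the ResultSet yields individual Tag objects, which do have .text.
paragraphs = [p.text for div in news for p in div.find_all("p")]

print(paragraphs)
```

Note that .text keeps the text of nested tags, so the words inside the <a> element appear as part of the first paragraph's text rather than being dropped.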
