Parsing XML in Python
I have a large XML file and need to extract the data I need from particular elements in it, then write only that data out to a file. The XML file contains a number of text tags belonging to different conversations, each with its own ID, and every author has an ID inside the author tag. I do not need the texts of all authors, only of the specific ones whose IDs I have. How do I write a function that selects and writes out the conversations where author = id1 or id2 or id3, etc.? The document looks like this:
    <conversations>
      <conversation id="e621da5de598c9321a1d505ea95e6a2d">
        <message line="1">
          <author>97964e7a9e8eb9cf78f2e4d7b2ff34c7</author>
          <time>03:20</time>
          <text>hola.</text>
        </message>
        <message line="2">
          <author>0158d0d6781fc4d493f243d4caa49747</author>
          <time>03:20</time>
          <text>hi.</text>
        </message>
      </conversation>
      <conversation id="3c517e43554b6431f932acc138eed57e">
        <message line="1">
          <author>505166bca797ceaa203e245667d56b34</author>
          <time>18:11</time>
          <text>hi</text>
        </message>
        <message line="2">
          <author>505166bca797ceaa203e245667d56b34</author>
          <time>18:11</time>
          <text>aujourd.</text>
        </message>
        <message line="3">
          <author>4b66cb4831680c47cc6b66060baff894</author>
          <time>18:11</time>
          <text>hey</text>
        </message>
      </conversation>
    </conversations>
    import xml.etree.ElementTree as ET

    tree = ET.parse('conversations.xml')

    for node in tree.iter():
        if node.tag == "conversations":
            continue  # root element, nothing to print
        if node.tag == "conversation":
            print("\n")  # visual break, new conversation
            print("{} {}".format(node.tag, node.attrib))
            continue
        if node.tag == "message":
            print("{} {}".format(node.tag, node.attrib))
            continue
        # remaining tags (author, time, text) carry their data as text
        print("{} {}".format(node.tag, node.text))

Using the above, you should be able to check the ID. With similar logic, if you are searching for 97964e7a9e8eb9cf78f2e4d7b2ff34c7, etc., put the IDs in a list or dict:
    authors = ['97964e7a9e8eb9cf78f2e4d7b2ff34c7']

    for node in tree.iter():
        # match only <author> elements whose ID is in the list
        if node.tag == "author" and node.text in authors:
            print('found')
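To go from printing 'found' to actually writing out the matching conversations, something along these lines might work. This is only a sketch: the function name `filter_conversations`, the `SAMPLE` string, and the choice to keep a whole conversation as soon as any one of its messages is by a wanted author are my assumptions, not something stated in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the same shape as the document in the question.
SAMPLE = """
<conversations>
  <conversation id="e621da5de598c9321a1d505ea95e6a2d">
    <message line="1">
      <author>97964e7a9e8eb9cf78f2e4d7b2ff34c7</author>
      <time>03:20</time>
      <text>hola.</text>
    </message>
    <message line="2">
      <author>0158d0d6781fc4d493f243d4caa49747</author>
      <time>03:20</time>
      <text>hi.</text>
    </message>
  </conversation>
  <conversation id="3c517e43554b6431f932acc138eed57e">
    <message line="1">
      <author>505166bca797ceaa203e245667d56b34</author>
      <time>18:11</time>
      <text>hi</text>
    </message>
    <message line="3">
      <author>4b66cb4831680c47cc6b66060baff894</author>
      <time>18:11</time>
      <text>hey</text>
    </message>
  </conversation>
</conversations>
"""


def filter_conversations(root, author_ids):
    """Return a new root element holding only the conversations in which
    at least one message was written by an author in author_ids.
    (Hypothetical helper, assumed selection rule.)"""
    wanted = set(author_ids)
    result = ET.Element(root.tag)
    for conversation in root.iter("conversation"):
        # collect every author ID that appears in this conversation
        authors = {a.text for a in conversation.iter("author")}
        if authors & wanted:
            result.append(conversation)
    return result


root = ET.fromstring(SAMPLE)
selected = filter_conversations(root, ["97964e7a9e8eb9cf78f2e4d7b2ff34c7"])

# Serialize the selection; ET.ElementTree(selected).write(path) would
# write it to a file instead.
print(ET.tostring(selected, encoding="unicode"))
```

For the real file you would replace `ET.fromstring(SAMPLE)` with `ET.parse('conversations.xml').getroot()`. If you only want the individual messages by those authors rather than whole conversations, the same loop could iterate over `message` elements and check `message.findtext("author")` instead.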