r - Merging a weird html-like txt file with an Excel file -


I've got two files that I'm supposed to merge (mostly using statistical software such as R or SPSS). One of them is a normal Excel table with 3 variables (names at the top of the columns). The second one, however, was sent to me in a format I haven't seen before: a large txt file with one input per case (identified by an id variable, which I should use to merge with the Excel file). It looks like this:

<organizations> <organization id="b0101"> <type1>e</type1> <type2>v</type2> <name>international association official statistics</name> <acronym>iaos</acronym> <country_first_address>not known</country_first_address> <city_first_address>not known</city_first_address> <countries_in_which_members_located>not known</countries_in_which_members_located> <subject_headings>government; statistics</subject_headings> <foundation_year>1985</foundation_year> <history>[[history]] founded 1985, amsterdam (netherlands), @ 45th session of #a2590, specialized section of isi. absorbed, 1989, #d1316, had been set 22 oct 1958, geneva (switzerland), following recommendations of isi, [international association of municipal statisticians -- association internationale de statisticiens municipaux]. </history> <history_relations>#a2590; #d1316</history_relations> <consultative_status>none known</consultative_status> <igo_relations>none known</igo_relations> <ngo_relations>#e1209; #m4975; #d1976; #e2125; #e3673; #d2578; #m0084</ngo_relations> <member_organizations>none known</member_organizations> </organization>  <organization id="b8500"> <type1>b</type1> <type2>y</type2> <name>world blind union</name> <acronym>wbu</acronym> <country_first_address>canada</country_first_address> <city_first_address>toronto</city_first_address> <countries_in_which_members_located>algeria; angola; benin; burkina faso; burundi; cameroon; cape verde; central african rep; chad; congo brazzaville; congo dr; côte d'ivoire; djibouti; egypt; equatorial guinea; eritrea; ethiopia; gabon; gambia; ghana; guinea; guinea-bissau; kenya; lesotho; liberia; libyan aj; madagascar; malawi; mali; mauritania; mauritius; morocco; mozambique; namibia; niger; nigeria; rwanda; sao tomé-principe; senegal; seychelles; sierra leone; somalia; south africa; south sudan; sudan; swaziland; tanzania ur; togo; tunisia; uganda; zambia; zimbabwe; anguilla; antigua-barbuda; argentina; bahamas; barbados; belize; bolivia; brazil; canada; chile; 
colombia; costa rica; cuba; dominica; dominican rep; ecuador; el salvador; grenada; guatemala; guyana; haiti; honduras; jamaica; martinique; mexico; montserrat; nicaragua; panama; paraguay; peru; st kitts-nevis; st lucia; st vincent-grenadines; trinidad-tobago; turks-caicos; uruguay; usa; venezuela; virgin uk; afghanistan; bahrain; bangladesh; brunei darussalam; cambodia; china; hong kong; india; indonesia; iraq; israel; japan; jordan; kazakhstan; korea rep; kuwait; kyrgyzstan; laos; lebanon; macau; malaysia; mongolia; myanmar; nepal; pakistan; philippines; qatar; singapore; sri lanka; syrian ar; taiwan; tajikistan; thailand; timor-leste; turkmenistan; united arab emirates; uzbekistan; vietnam; yemen; australia; fiji; new zealand; tonga; albania; armenia; austria; azerbaijan; belarus; belgium; bosnia-herzegovina; bulgaria; croatia; cyprus; czech rep; denmark; estonia; finland; france; georgia; germany; greece; hungary; iceland; ireland; italy; latvia; lithuania; luxembourg; macedonia; malta; moldova; montenegro; netherlands; norway; poland; portugal; romania; russia; serbia; slovakia; slovenia; spain; sweden; switzerland; turkey; uk; ukraine;</countries_in_which_members_located> <subject_headings>blind, visually impaired</subject_headings> <foundation_year>1984</foundation_year> <history>[[history]] founded 26 oct 1984, riyadh (saudi arabia), 1 united world body composed of representatives of national associations of blind , agencies serving blind, successor body both #b3499, set 20 july 1951, paris (france), , #b2024, formed in aug 1964, new york ny (usa). constitution adopted 26 oct 1984; amended at: 3rd general assembly, 2-6 nov 1992, cairo (egypt); 26-30 aug 1996, toronto (canada); 20-24 nov 2000, melbourne (australia); 22-26 nov 2004, cape town (south africa); 18-22 aug 2008, geneva (switzerland); 12-16 nov 2012, bangkok (thailand). registered in accordance french law, 20 dec 1984, paris , again 20 dec 2004, paris. 
incorporated in canada not-share-capital not-for-profit corporation, 16 mar 2007. </history> <history_relations>#b3499; #b2024</history_relations> <consultative_status>#e3377; #b2183; #b3548; #b0971; #f3380; #b3635</consultative_status> <igo_relations>#e7552; #f1393; #a3375; #b3408</igo_relations> <ngo_relations>#e0409; #e6422; #j5215; #f5821; #c1224; #d5392; #f6792; #a1945; #b2314; #d1758; #f5810; #d1612; #j0357; #d1038; #g6537; #b2221; #b0094; #b3536; #d7556</ngo_relations> <member_organizations>#f6063; #f4959; #j1979; #c1224; #b0094; #d5392; #a1945; #d2362; #f2936; #j4730; #f3167; #d8743; #f1898; #d0043; #g0853</member_organizations> </organization> 

Any help is appreciated. What type of file is this, and how can I transform it into a manageable table?

I think the data is XML. I copied the sample data, pasted it into a blank file, and saved it as sample.xml. I made sure to add the line </organizations> at the end (line 37 in the sample) to close off the tag.
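If the real file is too large to fix by hand, the closing tag can also be appended from R. This is only a sketch: the temporary file and its one-record content are stand-ins made up for illustration, and it assumes the only problem is the missing </organizations> at the end.

```r
library(XML)

# Hypothetical stand-in for the pasted sample, which lacks the closing root tag
path <- tempfile(fileext = ".xml")
writeLines(c("<organizations>",
             "<organization id=\"b0101\"><name>iaos</name></organization>"),
           path)

lines <- readLines(path)

# Append the closing tag only if it is not already present
if (!any(grepl("</organizations>", lines, fixed = TRUE))) {
  writeLines(c(lines, "</organizations>"), path)
}

# The file should now parse; the root element name confirms it
doc <- xmlParse(path)
root_name <- xmlName(xmlRoot(doc))
```

With the real data, path would point at the txt file (or a copy of it) instead of a temporary file.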

Then I followed the instructions here to read it in:

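    library(XML)

    # Parse the file and get the root <organizations> node
    xmlfile <- xmlTreeParse(file = "sample.xml")
    xmltop <- xmlRoot(xmlfile)

    # Collect the value of every child element of every <organization>
    orgs <- xmlSApply(xmltop, function(x) xmlSApply(x, xmlValue))

    # Transpose so that each organization becomes one row
    orgs_df <- data.frame(t(orgs), row.names = NULL)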

This returns a data frame orgs_df with 2 obs. of 15 variables. I presume you can go ahead and merge it with the Excel file from there.
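One thing to watch for when merging: the 15 variables are the child elements of each organization, while the id sits in an attribute of the <organization> tag, so as far as I can tell it has to be pulled out separately with xmlGetAttr. Below is a rough sketch of that step plus the merge. The two-record XML string is a cut-down stand-in for sample.xml, and the Excel table is faked with a data frame: its column names (id, var1, var2) and values are made up, since the question does not show them.

```r
library(XML)

# Cut-down stand-in for sample.xml: two organizations, two fields each
xml_text <- '<organizations>
  <organization id="b0101"><name>iaos</name><foundation_year>1985</foundation_year></organization>
  <organization id="b8500"><name>wbu</name><foundation_year>1984</foundation_year></organization>
</organizations>'

xmltop <- xmlRoot(xmlTreeParse(xml_text, asText = TRUE))

# Child element values, transposed so each organization is a row
orgs <- t(xmlSApply(xmltop, function(x) xmlSApply(x, xmlValue)))
rownames(orgs) <- NULL
orgs_df <- as.data.frame(orgs, stringsAsFactors = FALSE)

# The id is an attribute, not a child element, so add it as its own column
orgs_df$id <- unname(xmlSApply(xmltop, xmlGetAttr, "id"))

# Hypothetical stand-in for the Excel table; with the real file this could be
# something like readxl::read_excel("yourfile.xlsx") (file name assumed)
excel_df <- data.frame(id   = c("b0101", "b8500"),
                       var1 = c(10, 20),
                       var2 = c("x", "y"),
                       stringsAsFactors = FALSE)

# Keep every organization, even those without a match in the Excel table
merged <- merge(orgs_df, excel_df, by = "id", all.x = TRUE)
```

If the id column in the Excel file has a different name, merge() takes by.x and by.y instead of by.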

