regex - How to remove a middle name that appears twice in a name list in R


My name list has the following error: the middle name appears twice (for example, rows 1 and 2). The data is in data.table format with 100k observations and 15 variables, including the name column. How can I get the expected output by removing the middle name that appears twice?

 name column               expected
 1.a michael michael aura  1.a michael aura
 2.a thomas thomas parsa   2.a thomas parsa
 3.a gul                   3.a gul
 4.clark                   4.clark

We can use sub, capturing a word and matching its immediate repeat with a backreference:

sub("\\s+(\\w+\\s*)\\1+", " \\1", df1[,1])
#[1] "1.a michael aura" "2.a thomas parsa" "3.a gul"          "4.clark"
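The pattern above does not anchor the captured word at word boundaries, so it appears that it can also match a doubled letter at the start of a single word (for example, shortening "aaron" to "aron"). A minimal sketch of a stricter variant follows. It assumes a plain data.frame with a character column called name, and the helper dedupe_words and the extra fifth row are hypothetical, added only for illustration. It uses gsub with perl = TRUE so that every run of repeated whole words is collapsed, not just the first.

```r
# Hypothetical sample data mirroring the question's name column,
# plus one extra row with doubled letters inside single words.
df1 <- data.frame(
  name = c("1.a michael michael aura", "2.a thomas thomas parsa",
           "3.a gul", "4.clark", "5.a aaron lloyd"),
  stringsAsFactors = FALSE
)

# \\b(\\w+)        : a whole word, captured as group 1
# (?:\\s+\\1\\b)+  : one or more immediate repeats of that same whole word
# Each run of repeats is replaced by a single copy of the word.
dedupe_words <- function(x) {
  gsub("\\b(\\w+)(?:\\s+\\1\\b)+", "\\1", x, perl = TRUE)
}

df1$name <- dedupe_words(df1$name)
print(df1$name)
```

The match is case-sensitive, so "Michael michael" would be left alone unless ignore.case = TRUE is added. If the data really is a data.table, df1[, 1] returns a one-column table rather than a character vector, so it is safer to refer to the column by name, for instance by assigning by reference with df1[, name := dedupe_words(name)].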
