pandas - How to count unique rows in a column based on multiple conditions in Python
I have a data frame that looks like this (the treatment column can take many different string values; this is a simplified version of the question):

    id            position  treatment
    --20axecvv-    0        a
    --20axecvv-   -1        a
    --20axecvv-   -2        a
    --h9inkewqf-   0        a
    --h9inkewqf-  -1        b
    zzu7a@8jn      0        b
    quesnexmdb     0        c
    quesnexmdb    -1        c
    qu72ql@h79     0        c

I want to keep only the ids with an exclusive treatment. In other words, an id that was treated several times is kept only if it always received the same single treatment. After that, I want to count the number of ids for each treatment. The result should be:

    id            position  treatment
    --20axecvv-    0        a
    --20axecvv-   -1        a
    --20axecvv-   -2        a
    zzu7a@8jn      0        b
    quesnexmdb     0        c
    quesnexmdb    -1        c
    qu72ql@h79     0        c

and the sum:

    a : 1
    b : 1
    c : 2

I am not sure how to solve this, maybe with a loop within a loop? I am a beginner in Python/pandas. Thanks.
You can group by id and keep only the groups where the number of unique treatment values equals 1:

    df1 = df.loc[df.groupby('id').treatment.filter(lambda x: x.nunique() == 1).index]

or, as @igor raush suggested,

    df1 = df.groupby('id').filter(lambda g: g.treatment.nunique() == 1)

                 id  position treatment
    0   --20axecvv-         0         a
    1   --20axecvv-        -1         a
    2   --20axecvv-        -2         a
    5     zzu7a@8jn         0         b
    6    quesnexmdb         0         c
    7    quesnexmdb        -1         c
    8    qu72ql@h79         0         c

And for the count of unique ids per treatment:

    df1.groupby('treatment').id.nunique()

    treatment
    a    1
    b    1
    c    2
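For anyone who wants to run the answer above end to end, here is a minimal sketch. The sample frame is rebuilt from the question, and the treatment letters of the first four rows are an assumption inferred from the expected counts (a : 1). The `transform("nunique")` mask is not part of the original answer; it is just an alternative way to get the same selection without calling a Python lambda per group.

```python
import pandas as pd

# Sample data rebuilt from the question. The "a" values in the first four
# rows are an assumption inferred from the expected result (a : 1).
df = pd.DataFrame({
    "id": ["--20axecvv-"] * 3 + ["--h9inkewqf-"] * 2
          + ["zzu7a@8jn", "quesnexmdb", "quesnexmdb", "qu72ql@h79"],
    "position": [0, -1, -2, 0, -1, 0, 0, -1, 0],
    "treatment": ["a", "a", "a", "a", "b", "b", "c", "c", "c"],
})

# Keep only the ids whose rows all share a single treatment.
df1 = df.groupby("id").filter(lambda g: g["treatment"].nunique() == 1)

# Alternative (not from the answer): broadcast the per-id number of distinct
# treatments back onto every row and use it as a boolean mask.
mask = df.groupby("id")["treatment"].transform("nunique") == 1
df2 = df[mask]

# Count the distinct ids per treatment.
counts = df1.groupby("treatment")["id"].nunique()
print(counts)
```

Both `df1` and `df2` should drop the two `--h9inkewqf-` rows (index 3 and 4), and `counts` should match the result asked for in the question. On a large frame the `transform` version is usually the faster of the two, since it avoids running a Python function for every id.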