pandas - How to count unique rows in a column based on multiple conditions in python

January 15, 2013

I have a data frame that looks like this (the treatment column can take many possible character values; this is a simplified version of the question):

id              position    treatment
--20axecvv-     0           a
--20axecvv-     -1          a
--20axecvv-     -2          a
--h9inkewqf-    0           a
--h9inkewqf-    -1          b
zzu7a@8jn       0           b
quesnexmdb      0           c
quesnexmdb      -1          c
qu72ql@h79      0           c

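I want to keep only the ids that have an exclusive treatment. In other words, keep an id only if it received a single treatment, even if it was treated several times. After that, I want to count the number of ids for each treatment. Expected result: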

id              position    treatment
--20axecvv-     0           a
--20axecvv-     -1          a
--20axecvv-     -2          a
zzu7a@8jn       0           b
quesnexmdb      0           c
quesnexmdb      -1          c
qu72ql@h79      0           c

And the count of ids per treatment:

a : 1
b : 1
c : 2

I have no idea how to solve this. Maybe a loop within a loop? I am a beginner with Python/pandas. Thanks.

You can group by id, then keep only the rows of the ids whose number of unique treatment values is 1:

df1 = df.loc[df.groupby('id').treatment.filter(lambda x: x.nunique()==1).index]

Or, as @igor raush suggested, filter on the whole group directly:

df1 = df.groupby('id').filter(lambda g: g.treatment.nunique() == 1)

    id              position    treatment
0   --20axecvv-     0           a
1   --20axecvv-     -1          a
2   --20axecvv-     -2          a
5   zzu7a@8jn       0           b
6   quesnexmdb      0           c
7   quesnexmdb      -1          c
8   qu72ql@h79      0           c
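For anyone who wants to run the filtering step end to end, here is a minimal sketch. The sample frame is rebuilt by hand, and using `a` as the treatment for the first four rows is an assumption based on the expected counts in the question. The `transform("nunique")` variant is not part of the original answer; it is shown only as a possible alternative that avoids calling a Python lambda once per group.

```python
import pandas as pd

# Sample data rebuilt by hand; the "a" treatment values are an assumption.
df = pd.DataFrame({
    "id": ["--20axecvv-", "--20axecvv-", "--20axecvv-",
           "--h9inkewqf-", "--h9inkewqf-",
           "zzu7a@8jn", "quesnexmdb", "quesnexmdb", "qu72ql@h79"],
    "position": [0, -1, -2, 0, -1, 0, 0, -1, 0],
    "treatment": ["a", "a", "a", "a", "b", "b", "c", "c", "c"],
})

# Approach from the answer: keep whole groups with a single distinct treatment.
df1 = df.groupby("id").filter(lambda g: g["treatment"].nunique() == 1)

# Alternative sketch: broadcast the per-id distinct count back onto every row,
# then keep the rows where that count is exactly 1.
n_treatments = df.groupby("id")["treatment"].transform("nunique")
df1_alt = df[n_treatments == 1]

print(df1)
```

Both versions keep the original row order and index, so `--h9inkewqf-` (treated with two different treatments) is the only id that disappears.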

And to get the number of unique ids per treatment:

df1.groupby('treatment').id.nunique()

treatment
a            1
b            1
c            2
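The counting step can be checked in isolation with a small sketch like the one below. The filtered frame is rebuilt by hand (the `a` values are again an assumption), and the `drop_duplicates` plus `value_counts` variant is only an illustrative equivalent, not something the original answer proposed.

```python
import pandas as pd

# Filtered frame rebuilt by hand; the "a" treatment values are an assumption.
df1 = pd.DataFrame({
    "id": ["--20axecvv-"] * 3 + ["zzu7a@8jn", "quesnexmdb", "quesnexmdb", "qu72ql@h79"],
    "position": [0, -1, -2, 0, 0, -1, 0],
    "treatment": ["a", "a", "a", "b", "c", "c", "c"],
})

# Number of distinct ids within each treatment group.
counts = df1.groupby("treatment")["id"].nunique()

# Equivalent sketch: reduce to one row per (id, treatment) pair,
# then count how many rows each treatment has.
counts_alt = (
    df1.drop_duplicates(["id", "treatment"])["treatment"]
    .value_counts()
    .sort_index()
)

print(counts.to_dict())  # {'a': 1, 'b': 1, 'c': 2}
```

Using `nunique()` rather than `count()` or `size()` matters here: `quesnexmdb` appears in two rows but should be counted as one id.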
