python - Counting unique values in a column in pandas dataframe like in Qlik?


If I have a table like this:

import pandas as pd

df = pd.DataFrame({
    'hid': [101, 102, 103, 101, 102, 104, 105, 101],
    'did': [10, 11, 12, 10, 11, 10, 12, 10],
    'uid': ['james', 'henry', 'abe', 'james', 'henry', 'brian', 'claude', 'james'],
    'mid': ['a', 'b', 'a', 'b', 'a', 'a', 'a', 'c']
})

I can do count(distinct hid) in Qlik and get a count of 5 unique hid values. How do I do that in Python using a pandas DataFrame, or maybe a NumPy array? Similarly, count(hid) gives 8 in Qlik. What is the equivalent in pandas?

To count distinct values, use nunique:

df['hid'].nunique()
5
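Since the question also asks about a NumPy array, here is a minimal sketch of the same distinct count using np.unique. The variable names are only illustrative, and the array simply repeats the hid values from the question.

```python
import numpy as np

# The hid values from the question, as a plain NumPy array
hid = np.array([101, 102, 103, 101, 102, 104, 105, 101])

# np.unique returns the sorted distinct values; its size is the distinct count
n_distinct = np.unique(hid).size

# The total number of elements, comparable to count(hid) when nothing is missing
n_total = hid.size

print(n_distinct, n_total)
```

Note that np.unique does not skip missing values the way nunique does by default, so the two can differ on data containing NaN.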

To count non-null values, use count:

df['hid'].count()
8

To count all values, including nulls, use the size attribute:

df['hid'].size
8
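On the sample data all three results look alike because hid has no missing values. The sketch below uses a hypothetical variant of the column with one NaN (not part of the original data) to show where nunique, count and size diverge.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the hid column with one missing value
s = pd.Series([101, 102, 103, 101, np.nan, 104, 105, 101], name="hid")

distinct = s.nunique()                       # distinct non-null values
distinct_with_nan = s.nunique(dropna=False)  # NaN counted as its own value
non_null = s.count()                         # non-null values only
total = s.size                               # every row, nulls included

print(distinct, distinct_with_nan, non_null, total)
```

Which one matches Qlik's count(hid) on data with nulls is worth checking against your own Qlik results; this only illustrates the pandas side.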

Edit: to add a condition

Use boolean indexing:

df.loc[df['mid']=='a','hid'].agg(['nunique','count','size']) 

Or use query:

df.query('mid == "a"')['hid'].agg(['nunique','count','size']) 

Output:

nunique    5
count      5
size       5
Name: hid, dtype: int64
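For anyone who wants to run the conditional count end to end, here is a self-contained sketch. It rebuilds the DataFrame from the question, applies both filters, and checks that they select the same rows; the names mask, filtered, via_query and summary are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'hid': [101, 102, 103, 101, 102, 104, 105, 101],
    'did': [10, 11, 12, 10, 11, 10, 12, 10],
    'uid': ['james', 'henry', 'abe', 'james', 'henry', 'brian', 'claude', 'james'],
    'mid': ['a', 'b', 'a', 'b', 'a', 'a', 'a', 'c'],
})

# Boolean indexing: keep the hid values of rows where mid == 'a'
mask = df['mid'] == 'a'
filtered = df.loc[mask, 'hid']

# The same selection written with query
via_query = df.query('mid == "a"')['hid']

# Collect the three counts for the filtered column
summary = {
    'nunique': filtered.nunique(),
    'count': filtered.count(),
    'size': filtered.size,
}

print(summary)
print(filtered.equals(via_query))
```

Boolean indexing with loc is usually the safer choice when column names contain spaces or clash with Python keywords, while query tends to read better for longer conditions.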
