python 3.x - Add columns with normalised rankings to a pandas DataFrame


I want to add a column with normalized rankings to a pandas DataFrame. The process is as follows:

First, import the pandas package.

# import packages
import pandas as pd

Then define a pandas DataFrame.

# create dataframe
data = {'name': ['jason', 'jason', 'tina', 'tina', 'tina'],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data)

After the DataFrame is created, I want to add a column to it. The column contains the rank based on the values in the coverage column, computed for every name separately.

df['coveragerank'] = df.groupby('name')['coverage'].rank()
print(df)
   coverage   name  reports  coveragerank
0        25  jason        4           1.0
1        94  jason       24           2.0
2        57   tina       31           1.0
3        62   tina        2           2.0
4        70   tina        3           3.0

Now I want to normalize the values in the ranking column, so that each rank is divided by the number of rows in its group.

The desired output:

   coverage   name  reports  coveragerank
0        25  jason        4      0.500000
1        94  jason       24      1.000000
2        57   tina       31      0.333333
3        62   tina        2      0.666667
4        70   tina        3      1.000000

Does anyone know a way to do this without using an explicit for-loop?

You can use transform with 'size' to get a Series of group sizes that has the same length as the original df, then divide the ranks by it with div:

a = df.groupby('name')['coverage'].transform('size')
print(a)
0    2
1    2
2    3
3    3
4    3
Name: coverage, dtype: int64

df['coveragerank'] = df.groupby('name')['coverage'].rank().div(a)
print(df)
   coverage   name  reports  coveragerank
0        25  jason        4      0.500000
1        94  jason       24      1.000000
2        57   tina       31      0.333333
3        62   tina        2      0.666667
4        70   tina        3      1.000000
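The rank-then-divide steps above can be wrapped into a small reusable function. The following is only a sketch: the helper name `add_normalized_rank` and its parameters are made up for illustration and are not part of pandas. It returns a copy rather than modifying the input frame, which is a design choice, not a requirement.

```python
import pandas as pd


def add_normalized_rank(df, group_col, value_col, out_col):
    """Hypothetical helper: rank value_col within each group, divided by group size."""
    grouped = df.groupby(group_col)[value_col]
    ranks = grouped.rank()             # rank within each group: 1.0, 2.0, ...
    sizes = grouped.transform('size')  # group size repeated for every row
    result = df.copy()                 # leave the caller's frame untouched
    result[out_col] = ranks.div(sizes)
    return result


data = {'name': ['jason', 'jason', 'tina', 'tina', 'tina'],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data)

out = add_normalized_rank(df, 'name', 'coverage', 'coveragerank')
print(out)
```

Because both `ranks` and `sizes` keep the original row index, the division lines up row by row without any explicit loop.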

Another solution is apply:

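# group_keys=False keeps the original row index, so the result aligns with df
# (newer pandas versions otherwise prepend the group key to the index)
df['coveragerank'] = df.groupby('name', group_keys=False)['coverage'].apply(lambda x: x.rank() / len(x))
print(df)
   coverage   name  reports  coveragerank
0        25  jason        4      0.500000
1        94  jason       24      1.000000
2        57   tina       31      0.333333
3        62   tina        2      0.666667
4        70   tina        3      1.000000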
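As a possible shortcut not mentioned in the answer above, the groupby `rank` method accepts a `pct=True` argument that returns ranks as a fraction of the group. The sketch below assumes the coverage column has no missing values; under that assumption it should give the same numbers as dividing the rank by the group size, and it checks this against the manual calculation.

```python
import pandas as pd

data = {'name': ['jason', 'jason', 'tina', 'tina', 'tina'],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data)

by_name = df.groupby('name')['coverage']

# Manual version: rank within each group divided by the group size
manual = by_name.rank().div(by_name.transform('size'))

# Shortcut: let rank() return the percentage rank directly
df['coveragerank'] = by_name.rank(pct=True)

# Largest absolute difference between the two approaches
max_diff = (df['coveragerank'] - manual).abs().max()
print(df)
print(max_diff)
```

If the column can contain NaN, the two approaches may differ, so it is worth checking the behaviour on your own data before relying on the shortcut.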
