python - Pandas: how to use pd.cut()


Here is the snippet:

    import pandas as pd

    test = pd.DataFrame({'days': [0, 31, 45]})
    test['range'] = pd.cut(test.days, [0, 30, 60])

Output:

       days     range
    0     0       NaN
    1    31  (30, 60]
    2    45  (30, 60]

I am surprised that 0 is not in (0, 30]. What should I do to categorize 0 as (0, 30]?

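By default the bins are open on the left and closed on the right, so 0 lies outside (0, 30] and gets NaN. Pass include_lowest=True so that the first interval also includes its left edge:

    test['range'] = pd.cut(test.days, [0, 30, 60], include_lowest=True)
    print(test)

       days           range
    0     0  (-0.001, 30.0]
    1    31    (30.0, 60.0]
    2    45    (30.0, 60.0]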
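The left-open default can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas; the variable names first_bin, default_bins and with_lowest are only illustrative. It tests membership on a pd.Interval and then compares which rows end up as NaN with and without include_lowest.

```python
import pandas as pd

# The first default bin produced by pd.cut(..., [0, 30, 60]) is (0, 30]:
# open on the left, closed on the right.
first_bin = pd.Interval(0, 30, closed="right")
print(0 in first_bin)    # the left edge is excluded
print(30 in first_bin)   # the right edge is included

days = pd.Series([0, 31, 45])

# Default behaviour: 0 falls outside every bin and becomes NaN.
default_bins = pd.cut(days, [0, 30, 60])

# include_lowest=True makes the first bin include its left edge.
with_lowest = pd.cut(days, [0, 30, 60], include_lowest=True)

print(default_bins.isna().tolist())
print(with_lowest.isna().tolist())
```

Only the first interval is affected by include_lowest; the other bins stay open on the left.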

See the difference:

    test = pd.DataFrame({'days': [0, 20, 30, 31, 45, 60]})

    test['range1'] = pd.cut(test.days, [0, 30, 60], include_lowest=True)
    # 30 falls in the [30, 60) group
    test['range2'] = pd.cut(test.days, [0, 30, 60], right=False)
    # 30 falls in the (0, 30] group
    test['range3'] = pd.cut(test.days, [0, 30, 60])
    print(test)

       days          range1    range2    range3
    0     0  (-0.001, 30.0]   [0, 30)       NaN
    1    20  (-0.001, 30.0]   [0, 30)   (0, 30]
    2    30  (-0.001, 30.0]  [30, 60)   (0, 30]
    3    31    (30.0, 60.0]  [30, 60)  (30, 60]
    4    45    (30.0, 60.0]  [30, 60)  (30, 60]
    5    60    (30.0, 60.0]       NaN  (30, 60]
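If the adjusted edge in (-0.001, 30.0] is unwanted in the output, one option is to pass explicit labels to pd.cut. This is only a sketch: the label strings '0-30' and '31-60' are made up for illustration and assume that days holds whole numbers.

```python
import pandas as pd

test = pd.DataFrame({'days': [0, 20, 30, 31, 45, 60]})

# One label per bin; with two bins ([0, 30] and (30, 60]) we need two labels.
# The label text is hypothetical and assumes integer day counts.
test['range'] = pd.cut(
    test.days,
    [0, 30, 60],
    include_lowest=True,
    labels=['0-30', '31-60'],
)
print(test)
```

The bin boundaries are the same as for range1 above; only the displayed category names change.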

Alternatively, use numpy.searchsorted. The array of bin edges has to be sorted:

    import numpy as np

    arr = np.array([0, 30, 60])
    test['range1'] = arr.searchsorted(test.days)
    test['range2'] = arr.searchsorted(test.days, side='right') - 1
    print(test)

       days  range1  range2
    0     0       0       0
    1    20       1       0
    2    30       1       1
    3    31       2       1
    4    45       2       1
    5    60       2       2
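A short sketch of how the searchsorted result relates to pd.cut follows. NumPy documents the sortedness requirement for the array being searched (the bin edges here), so the days values in this example are deliberately left unsorted. The names edges, idx and codes are illustrative.

```python
import numpy as np
import pandas as pd

edges = np.array([0, 30, 60])        # bin edges, sorted ascending
days = pd.Series([45, 0, 31, 20])    # values to bin, deliberately unsorted

# For each value, the index i of the left-closed bin [edges[i], edges[i+1]).
idx = edges.searchsorted(days.to_numpy(), side='right') - 1

# pd.cut with right=False and labels=False yields integer bin codes
# for the same left-closed bins.
codes = pd.cut(days, edges, right=False, labels=False)

print(idx.tolist())
print(codes.tolist())
```

The two agree for values inside [0, 60). Values outside the bins differ: pd.cut returns NaN, while searchsorted still returns an integer position (for example 2 for 60 in the table above).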
