pandas groupby unexpectedly returns negative sum
I have data on port freight volumes by year that I wish to sum and turn into percentages later on, but I find negative sums unexpectedly:

    import pandas as pd

    # Freight volumes indexed by (year, port)
    data = pd.Series({
        ('2006', 'oakland, ca (port)'): 7460155164,
        ('2006', 'rest of california'): 32868692124,
        ('2006', 'san francisco, ca (port)'): 2262901767,
        ('2007', 'oakland, ca (port)'): 7881218797,
        ('2007', 'rest of california'): 38595482723,
        ('2007', 'san francisco, ca (port)'): 1897361592,
        ('2008', 'oakland, ca (port)'): 8325019179,
        ('2008', 'rest of california'): 46200094019,
        ('2008', 'san francisco, ca (port)'): 2732413994,
        ('2009', 'oakland, ca (port)'): 9077952296,
        ('2009', 'rest of california'): 42642020668,
        ('2009', 'san francisco, ca (port)'): 2998130982,
        ('2010', 'oakland, ca (port)'): 9596205900,
        ('2010', 'rest of california'): 48091887406,
        ('2010', 'san francisco, ca (port)'): 2623519555,
        ('2011', 'oakland, ca (port)'): 10316313358,
        ('2011', 'rest of california'): 54869935898,
        ('2011', 'san francisco, ca (port)'): 2591413704,
    })

Summing this series by year behaves as expected:

    data.sum(level=0)
    Out[27]:
    2006    42591749055
    2007    48374063112
    2008    57257527192
    2009    54718103946
    2010    60311612861
    2011    67777662960
    dtype: int64

Or, using `groupby`:

    data.groupby(level=0).sum()
    Out[26]:
    2006    42591749055
    2007    48374063112
    2008    57257527192
    2009    54718103946
    2010    60311612861
    2011    67777662960
    dtype: int64

I want to apply something like `lambda x: x/x.sum()` to get within-group percentages, but `x.sum()` gives unexpected results. When I take the sum inside a lambda function, I get negative values:

    data.groupby(level=0).apply(lambda x: x.sum())
    Out[28]:
    2006    -357923905
    2007    1129422856
    2008    1422952344
    2009   -1116470902
    2010     182070717
    2011    -941813776
    dtype: int64

For the record, the grouping itself looks healthy and returns the expected sets of values:

    data.groupby(level=0).apply(lambda x: [x])
    Out[21]:
    2006     [[7460155164, 32868692124, 2262901767]]
    2007     [[7881218797, 38595482723, 1897361592]]
    2008     [[8325019179, 46200094019, 2732413994]]
    2009     [[9077952296, 42642020668, 2998130982]]
    2010     [[9596205900, 48091887406, 2623519555]]
    2011    [[10316313358, 54869935898, 2591413704]]
    dtype: object

But why the negative sums?
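
One observation that may help narrow this down, offered as a sketch rather than a confirmed diagnosis: each value returned by `apply(lambda x: x.sum())` appears to equal the correct sum reduced into the signed 32-bit integer range. That is the pattern integer overflow in a 32-bit intermediate would produce. Where such a cast might happen inside pandas is an assumption here, not something the output proves. The helper `wrap_int32` is hypothetical and written only for this check; the numbers are copied from the outputs in the question.

```python
# Correct per-year sums, as returned by data.groupby(level=0).sum().
expected = {
    "2006": 42591749055,
    "2007": 48374063112,
    "2008": 57257527192,
    "2009": 54718103946,
    "2010": 60311612861,
    "2011": 67777662960,
}

# Values returned by data.groupby(level=0).apply(lambda x: x.sum()).
observed = {
    "2006": -357923905,
    "2007": 1129422856,
    "2008": 1422952344,
    "2009": -1116470902,
    "2010": 182070717,
    "2011": -941813776,
}


def wrap_int32(n):
    """Reduce an integer into the signed 32-bit range (hypothetical helper)."""
    return (n + 2**31) % 2**32 - 2**31


# Compare the wrapped correct sum against the value apply() reported.
for year, total in expected.items():
    wrapped = wrap_int32(total)
    print(year, wrapped, wrapped == observed[year])
```

Every year matches, so the discrepancy is consistent with the sum passing through a 32-bit integer somewhere inside `apply`, even though the series itself reports `dtype: int64`.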