group by - pandas groupby unexpectedly returns negative sum


I have data on port freight volumes by year that I wish to sum and later turn into percentages, but I unexpectedly get negative sums:

    import pandas as pd

    # Freight volumes keyed by (year, region); pd.Series builds a MultiIndex from the tuples
    data = pd.Series({
        ('2006', 'oakland, ca (port)'): 7460155164,
        ('2006', 'rest of california'): 32868692124,
        ('2006', 'san francisco, ca (port)'): 2262901767,
        ('2007', 'oakland, ca (port)'): 7881218797,
        ('2007', 'rest of california'): 38595482723,
        ('2007', 'san francisco, ca (port)'): 1897361592,
        ('2008', 'oakland, ca (port)'): 8325019179,
        ('2008', 'rest of california'): 46200094019,
        ('2008', 'san francisco, ca (port)'): 2732413994,
        ('2009', 'oakland, ca (port)'): 9077952296,
        ('2009', 'rest of california'): 42642020668,
        ('2009', 'san francisco, ca (port)'): 2998130982,
        ('2010', 'oakland, ca (port)'): 9596205900,
        ('2010', 'rest of california'): 48091887406,
        ('2010', 'san francisco, ca (port)'): 2623519555,
        ('2011', 'oakland, ca (port)'): 10316313358,
        ('2011', 'rest of california'): 54869935898,
        ('2011', 'san francisco, ca (port)'): 2591413704,
    })
    data

This series behaves as expected:

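    # Note: Series.sum(level=...) was removed in pandas 2.0;
    # data.groupby(level=0).sum() is the equivalent call there.
    data.sum(level=0)
    Out[27]:
    2006    42591749055
    2007    48374063112
    2008    57257527192
    2009    54718103946
    2010    60311612861
    2011    67777662960
    dtype: int64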

Or, using `groupby`:

    data.groupby(level=0).sum()
    Out[26]:
    2006    42591749055
    2007    48374063112
    2008    57257527192
    2009    54718103946
    2010    60311612861
    2011    67777662960
    dtype: int64

I want to apply `lambda x: x/x.sum()` to get within-group percentages, but `x.sum()` gives unexpected results. When I take the sum inside a lambda function, I get negative values:

    data.groupby(level=0).apply(lambda x: x.sum())
    Out[28]:
    2006    -357923905
    2007    1129422856
    2008    1422952344
    2009   -1116470902
    2010     182070717
    2011    -941813776
    dtype: int64
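For reference, the within-group percentages that the lambda is meant to produce can be sketched without calling `x.sum()` inside `apply`, by broadcasting each year's total back onto the rows with `transform`. This is only a sketch: the names `totals` and `shares` are illustrative, the series is trimmed to two years to keep it short, and the explicit `astype("int64")` is an assumption about the intended dtype rather than a confirmed fix for the behaviour above.

```python
import pandas as pd

# Trimmed copy of the series above (two years only), for illustration.
data = pd.Series({
    ("2006", "oakland, ca (port)"): 7460155164,
    ("2006", "rest of california"): 32868692124,
    ("2006", "san francisco, ca (port)"): 2262901767,
    ("2007", "oakland, ca (port)"): 7881218797,
    ("2007", "rest of california"): 38595482723,
    ("2007", "san francisco, ca (port)"): 1897361592,
}).astype("int64")  # assumption: make the 64-bit integer dtype explicit

# Per-year totals, repeated on every row of the corresponding year.
totals = data.groupby(level=0).transform("sum")

# Share of each region within its year; the shares in each year sum to 1.
shares = data / totals
print(shares)
```

Each row of `shares` is that region's fraction of its year's total, which is what `lambda x: x / x.sum()` would return per group when the sum is computed correctly.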

For the record, the grouping itself looks healthy and returns the expected sets of values:

    data.groupby(level=0).apply(lambda x: [x])
    Out[21]:
    2006     [[7460155164, 32868692124, 2262901767]]
    2007     [[7881218797, 38595482723, 1897361592]]
    2008     [[8325019179, 46200094019, 2732413994]]
    2009     [[9077952296, 42642020668, 2998130982]]
    2010     [[9596205900, 48091887406, 2623519555]]
    2011    [[10316313358, 54869935898, 2591413704]]
    dtype: object

But why the negative sums?
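One observation that may help narrow this down: every value returned by the `apply` call equals the correct sum reduced to the signed 32-bit integer range. The check below uses plain Python integers and only the numbers already shown above. The helper `wrap_int32` is a hypothetical name introduced for this illustration, and the check shows consistency with 32-bit wraparound without establishing where in pandas it would occur.

```python
# Correct per-year sums, as returned by data.groupby(level=0).sum().
true_sums = {
    "2006": 42591749055,
    "2007": 48374063112,
    "2008": 57257527192,
    "2009": 54718103946,
    "2010": 60311612861,
    "2011": 67777662960,
}

# Values returned by data.groupby(level=0).apply(lambda x: x.sum()).
reported = {
    "2006": -357923905,
    "2007": 1129422856,
    "2008": 1422952344,
    "2009": -1116470902,
    "2010": 182070717,
    "2011": -941813776,
}


def wrap_int32(value):
    """Reduce an integer to the signed 32-bit two's-complement range."""
    return (value + 2**31) % 2**32 - 2**31


# Wrap each correct sum and compare it with the value that apply returned.
wrapped = {year: wrap_int32(total) for year, total in true_sums.items()}
for year in true_sums:
    print(year, wrapped[year], wrapped[year] == reported[year])
```

All six years match, so the negative numbers look like 64-bit sums being truncated to 32 bits somewhere in the `apply` path rather than a problem with the grouping or the data.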

