Python 3.x - pandas DataFrame: concatenate strings from a subset of columns and put them into a list
I tried to retrieve the strings from a subset of the columns of a DataFrame, concatenate them into one string per row, and put these strings into a list:
    # row_subset is a sub-DataFrame of a DataFrame
    sub_columns = ['a', 'b', 'c']
    string_list = [""] * row_subset.shape[0]
    for x in range(0, row_subset.shape[0]):
        for y in range(0, len(sub_columns)):
            string_list[x] += str(row_subset[sub_columns[y]].iloc[x])
So the result looks like this:
['row 0 string concatenation','row 1 concatenation','row 2 concatenation','row3 concatenation']
I am wondering what the best way to do this is. Can it be done more efficiently?
I think you need to select the subset of columns with [] first and then use sum, or, if you need a separator, use join:
    import pandas as pd

    df = pd.DataFrame({'a':list('abcdef'),
                       'b':list('qwerty'),
                       'c':list('fertuj'),
                       'd':[1,3,5,7,1,0],
                       'e':[5,3,6,9,2,4],
                       'f':list('aaabbb')})
    print (df)
       a  b  c  d  e  f
    0  a  q  f  1  5  a
    1  b  w  e  3  3  a
    2  c  e  r  5  6  a
    3  d  r  t  7  9  b
    4  e  t  u  1  2  b
    5  f  y  j  0  4  b
    sub_columns = ['a', 'b', 'c']

    # No separator: summing string columns row-wise concatenates them
    print (df[sub_columns].sum(axis=1).tolist())
    ['aqf', 'bwe', 'cer', 'drt', 'etu', 'fyj']

    # With a separator: join the values of each row
    print (df[sub_columns].apply(' '.join, axis=1).tolist())
    ['a q f', 'b w e', 'c e r', 'd r t', 'e t u', 'f y j']
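The loop in the question calls str() on every value, which suggests the selected columns may not all be strings. In that case join fails on the non-string values. A minimal sketch, assuming a subset that mixes string and integer columns (the names `mixed_columns` and `as_text` and the sample data are only for illustration), converts everything with astype(str) first:

```python
import pandas as pd

# Small illustrative frame: one string column and two integer columns
df = pd.DataFrame({'a': list('abc'), 'd': [1, 3, 5], 'e': [5, 3, 6]})
mixed_columns = ['a', 'd', 'e']  # hypothetical subset with mixed dtypes

# Convert every selected value to its string form before joining
as_text = df[mixed_columns].astype(str)

joined = as_text.apply(''.join, axis=1).tolist()           # no separator
joined_with_sep = as_text.apply('-'.join, axis=1).tolist() # with separator

print(joined)
print(joined_with_sep)
```

This keeps the same select-then-join shape as the answer; only the conversion step is added.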
A very similar NumPy solution:
    print (df[sub_columns].values.sum(axis=1).tolist())
    ['aqf', 'bwe', 'cer', 'drt', 'etu', 'fyj']
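As a quick sanity check, the explicit loop from the question and the NumPy one-liner can be compared on the same data. This is only a sketch: it uses to_numpy(dtype=object) instead of .values so that the array holds plain Python strings regardless of the pandas string dtype in use, and the names `loop_result` and `numpy_result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': list('abcdef'),
                   'b': list('qwerty'),
                   'c': list('fertuj')})
sub_columns = ['a', 'b', 'c']

# Explicit loop, in the style of the question
loop_result = [""] * df.shape[0]
for x in range(df.shape[0]):
    for col in sub_columns:
        loop_result[x] += str(df[col].iloc[x])

# NumPy version: summing an object array along axis 1 concatenates the strings
numpy_result = df[sub_columns].to_numpy(dtype=object).sum(axis=1).tolist()

print(loop_result == numpy_result)
```

Both approaches build the same list; which one is faster depends on the size of the frame and was not measured here.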