shell - Bash group by on the basis of n number of columns

April 15, 2013

This is related to a previous question I [asked] (bash command group count).

What if I want to generalize this? For instance, given the input file:

 abc|1|2
 abc|3|4
 bcd|7|2
 abc|5|6
 bcd|3|5

The output should be:

 abc|9|12
 bcd|10|7

The result is calculated by grouping on the first column and adding up the values of the 2nd column and of the 3rd column, similar to a GROUP BY in SQL.

I tried modifying the command provided in the link but failed. I don't know whether I'm making a conceptual error or a silly mistake, but none of the commands below work.

Commands used:

awk -F "|" '{arr[$1]+=$2} END arr2[$1]+=$5 END  {for (i in arr) {print i"|"arr[i]"|"arr2[i]}}' sample
awk -F "|" '{arr[$1]+=$2} END {arr2[$1]+=$5} END  {for (i in arr) {print i"|"arr[i]"|"arr2[i]}}' sample
awk -F "|" '{arr[$1]+=$2 arr2[$1]+=$5} END  {for (i in arr2) {print i"|"arr[i]"|"arr2[i]}}' sample

Additionally, what I'm trying here is limited to summing up to 2 columns only. What if there are n columns and I want to perform different operations, such as addition in one column and subtraction in another? How can this be further modified?

Example:

abc|1|2|4|......... up to n columns
abc|4|5|6|......... up to n columns
def|1|4|6|......... up to n columns

Let's say a sum is needed for the first column, an average for the second column, some other operation for the third column, etc. How can this be tackled?

For 3 fields (the key and 2 data fields):

$ awk '
BEGIN { FS=OFS="|" }      # set input and output separators
{
    a[$1]+=$2             # sum the second field into hash a
    b[$1]+=$3             # ... and the third field into hash b
}
END {                     # in the end
    for(i in a)           # loop over all keys
        print i,a[i],b[i] # and output
}' file
bcd|10|7
abc|9|12
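Note that the output above lists bcd before abc, while the question expects abc first. The order in which `for (i in a)` visits the keys is unspecified in awk, so if the order matters the result can be piped through `sort`. A minimal sketch, assuming the same `|`-separated input; the function name `group_sum` is hypothetical and only used here for illustration:

```shell
# group_sum is a hypothetical helper name, not part of the original answer.
# It sums fields 2 and 3 per key (field 1) and sorts the result by key.
# Input comes from the files given as arguments, or from stdin if there are none.
group_sum() {
    awk '
    BEGIN { FS = OFS = "|" }
    {
        a[$1] += $2              # running sum of field 2 per key
        b[$1] += $3              # running sum of field 3 per key
    }
    END {
        for (k in a)             # iteration order is unspecified ...
            print k, a[k], b[k]
    }' "$@" | sort -t '|' -k1,1  # ... so sort on the key afterwards
}
```

With the sample input from the question, `group_sum file` should print `abc|9|12` followed by `bcd|10|7`.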

A more generic solution for n columns using GNU awk:

$ awk '
BEGIN { FS=OFS="|" }
{
    for(i=2;i<=NF;i++)                    # loop over all data fields
        a[$1][i]+=$i                      # sum them up into the related cells
    a[$1][1]=i                            # store NF+1 (loop end value) in the first cell
}
END {
    for(i in a) {
        for((j=2)&&b="";j<a[i][1];j++)    # buffer the output
            b=b (b==""?"":OFS)a[i][j]
        print i,b                         # output
    }
}' file
bcd|10|7
abc|9|12

The latter was tested with only 2 data fields (busy at a meeting :).
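For the second part of the question (a different operation per column), one possible approach is to pass a list of operation names to awk and keep the per-group statistics in arrays keyed by group and column. The following is only a sketch under stated assumptions: the helper name `group_ops` and the operation names `sum`, `avg`, `min` and `max` are made up for this example and are not awk features. It uses plain `a[key, i]` subscripts rather than GNU awk arrays of arrays, so it should also run on other POSIX awks.

```shell
# group_ops is a hypothetical helper; the op names (sum, avg, min, max) are an
# assumed mini-vocabulary for illustration only.
# Usage: group_ops "sum,avg" file    (the k-th op applies to field k+1)
group_ops() {
    local ops="$1"
    shift
    awk -v ops="$ops" '
    BEGIN { FS = OFS = "|"; n = split(ops, op, ",") }
    {
        cnt[$1]++                              # rows seen for this key
        for (i = 1; i <= n; i++) {
            v = $(i + 1) + 0                   # data field as a number
            sum[$1, i] += v
            if (cnt[$1] == 1 || v < min[$1, i]) min[$1, i] = v
            if (cnt[$1] == 1 || v > max[$1, i]) max[$1, i] = v
        }
    }
    END {
        for (k in cnt) {
            line = k
            for (i = 1; i <= n; i++) {
                if      (op[i] == "sum") r = sum[k, i]
                else if (op[i] == "avg") r = sum[k, i] / cnt[k]
                else if (op[i] == "min") r = min[k, i]
                else if (op[i] == "max") r = max[k, i]
                else                     r = "?"   # unknown op name
                line = line OFS r
            }
            print line
        }
    }' "$@" | sort -t '|' -k1,1                # sort by key for a stable order
}
```

With the 3-field sample input from the question, `group_ops "sum,avg" file` should give `abc|9|4` and `bcd|10|3.5`. Other operations could be added as further branches in the `END` block in the same way.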
