SQL like group by and sum for text files in command line?
Posted
by dnkb
on Super User
See other posts from Super User
or by dnkb
Published on 2010-05-02T04:25:57Z
Indexed on
2010/05/02
4:28 UTC
Read the original article
Hit count: 404
I have huge text files with two fields, the first is a string the second is an integer. The files are sorted by the first field. What I'd like to get in the output is one line per unique string and the sum of the numbers for the identical strings. Some strings appear only once while other appear multiple times. E.g. Given the sample data below, for the string glehnia I'd like to get 10+22=32 in the result.
Any suggestions how to do this either with gnuwin32 command line tools or in linux shell?
Thanks!
glehnia 10
glehnia 22
glehniae 343
glehnii 923
glei 1171
glei 2283
glei 3466
gleib 914
gleiber 652
gleiberg 495
gleiberg 709
© Super User or respective owner