How to transform a csv to combine matching rows?

Posted by Christian Wolf on Super User
Published on 2013-10-19T10:24:49Z Indexed on 2013/11/02 9:58 UTC

I have a CSV file with some transaction data, say date, volume, price and direction (sell/buy). Additionally, there is an ID for each transaction, and each closing transaction (the newer one) carries a reference to the ID of its corresponding transaction. Classical database referencing.

Now I want to do some statistics and draw some plots. This could be done via Octave, LaTeX/TikZ, Gnuplot or whatever. To do this I need both the buy and the sell price in one row. My thought was to preprocess the CSV into another CSV containing the needed information and then do the statistics on that. In the end I'd like a solution based on scripts rather than a spreadsheet, because the data might change often (it is exported from an online DB).

My current solution (see http://paste.ubuntu.com/6262822/ ) is a bash script that parses the CSV line by line and checks whether a corresponding transaction exists. If one is found, a new row is written to the destination CSV. If not, a warning is printed.

The bad news: for each row in the source file I have to read the whole file a few times. This leads to long running times, about 10 seconds for 300 lines. As the number of lines might grow soon (>10k lines), this is far from ideal. I am aware that the script opens many subshells, which might be causing the performance problems.
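The repeated re-reading described here can be avoided by indexing all rows by ID once and then looking up each reference directly. The following is only a minimal sketch in Python, not a port of the linked bash script. The column names `id`, `date`, `volume`, `price`, `direction` and `ref` are assumptions (the real export's header is not shown), as is the convention that `ref` is empty on opening transactions.

```python
import csv
import io
import sys


def combine(src, dst, warn=sys.stderr):
    """Write one output row per closing transaction, joined with the row it references."""
    rows = list(csv.DictReader(src))
    # Index every row by its ID once, so each lookup is a dictionary access
    # instead of another scan over the whole file.
    by_id = {row["id"]: row for row in rows}

    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["open_id", "close_id", "open_date", "close_date",
                     "volume", "open_price", "close_price", "open_direction"])
    for close in rows:
        ref = close["ref"]  # hypothetical column holding the referenced ID
        if not ref:
            continue  # assumed to be an opening transaction: nothing to match
        opening = by_id.get(ref)
        if opening is None:
            print(f"warning: {close['id']} references unknown id {ref}", file=warn)
            continue
        writer.writerow([opening["id"], close["id"], opening["date"], close["date"],
                         close["volume"], opening["price"], close["price"],
                         opening["direction"]])


# Tiny made-up sample, only to show the assumed shape of the data.
SAMPLE = """id,date,volume,price,direction,ref
1,2013-10-01,10,100.0,buy,
2,2013-10-05,10,110.0,sell,1
3,2013-10-07,5,50.0,sell,9
"""

if __name__ == "__main__":
    combine(io.StringIO(SAMPLE), sys.stdout)
```

With real data, `src` and `dst` would be files opened with `newline=""`. The work then grows roughly linearly with the number of lines, because the file is read once and no external process is started per row.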

Now my questions:

Is bash/awk/sed/... a good way to do this?
Should I first import all data into a "real" local database to use SQL?
Is there an easy way to achieve the desired results?
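On the SQL question, one option that stays script-based is an in-memory SQLite database, where pairing the rows becomes a self-join on the reference column. This is a sketch using Python's standard `sqlite3` module. The table name `tx` and all column names are assumptions for illustration and are not taken from the actual export.

```python
import csv
import io
import sqlite3


def paired_rows(src):
    """Load the CSV into an in-memory table and join closing rows to the rows they reference."""
    con = sqlite3.connect(":memory:")
    # Hypothetical schema; adjust names and types to the real export.
    con.execute("CREATE TABLE tx (id TEXT PRIMARY KEY, date TEXT, volume REAL, "
                "price REAL, direction TEXT, ref TEXT)")
    con.executemany(
        "INSERT INTO tx VALUES (:id, :date, :volume, :price, :direction, :ref)",
        csv.DictReader(src),
    )
    # Self-join: c is the closing row, o is the row it references.
    # Rows with an empty or dangling ref simply produce no match.
    query = """
        SELECT o.id, c.id, o.date, c.date, c.volume, o.price, c.price
        FROM tx AS c
        JOIN tx AS o ON c.ref = o.id
        ORDER BY c.id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Tiny made-up sample, only to show the assumed shape of the data.
SAMPLE = """id,date,volume,price,direction,ref
1,2013-10-01,10,100.0,buy,
2,2013-10-05,10,110.0,sell,1
3,2013-10-07,5,50.0,sell,9
"""

if __name__ == "__main__":
    for row in paired_rows(io.StringIO(SAMPLE)):
        print(row)
```

The inner join silently drops closing rows whose reference cannot be resolved. A `LEFT JOIN` with a check for `o.id IS NULL` would be one way to list them and print warnings instead.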

Filed under: bash, script, csv
