Running an Awk command on a cluster
- by alex
How do you execute a Unix shell command (an awk script, a pipe, etc.) on a cluster in parallel (step 1) and collect the results back on a central node (step 2)?
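To make the two steps concrete, here is a rough sketch of the kind of thing I mean, not a finished tool. It assumes passwordless (key-based) ssh from the central node to every worker; the names `ssh_runner` and `run_on_cluster` are made up for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(host, command):
    """Run `command` on `host` over ssh and return its stdout.

    Assumption: key-based ssh login to `host` works without a prompt.
    """
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


def run_on_cluster(hosts, command, runner=ssh_runner, max_workers=16):
    """Step 1: run `command` on every host in parallel (one thread per call).
    Step 2: collect each host's output on the calling (central) node.

    Returns a dict mapping host name -> stdout of the command on that host.
    """
    hosts = list(hosts)  # iterated twice below, so materialize it once
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `hosts`
        outputs = pool.map(lambda h: runner(h, command), hosts)
        return dict(zip(hosts, outputs))
```

With hypothetical host names and a hypothetical data path, a call would look like `run_on_cluster(["node1", "node2"], "awk '{ s += $1 } END { print s }' /data/part.txt")`. The `runner` argument is pluggable so the fan-out and collect logic can be tried without a real cluster.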
Hadoop seems to be huge overkill, with its 600k LOC, and its performance is terrible (it takes minutes just to initialize the job).
I don't need shared memory or something like MPI/OpenMP, as I…