How can I make a unix command (say a one-liner using cut and awk on a HUGE file) use all 16 cores instead of just 1? This isn't a program where I can use -j and specify the number of CPUs to use...
3 Answers
Have you tried GNU parallel to parallelize jobs? See http://www.gnu.org/software/parallel/
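For a filter like the one in the question, parallel's --pipe mode is the relevant one: it chops stdin into blocks and feeds each block to its own copy of the command. A minimal sketch, assuming GNU parallel is installed and the input is the tab-separated HUGEFILE.txt from the question:

```shell
# Split stdin into ~10 MB blocks and run one copy of the filter per block,
# up to 16 jobs at a time. --keep-order makes the output come out in the
# same order as the input blocks.
parallel --pipe --block 10M -j16 --keep-order \
    "cut -f 5 | awk '{print \$1+\$2}'" < HUGEFILE.txt
```

Chopping the input has its own cost, so this only wins when the per-line work outweighs the splitting overhead; a pipeline that is purely disk-bound won't get faster.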
1 Comment
Ole Tange
For your task you should look into the --pipe option demonstrated here: youtube.com/watch?v=1ntxT-47VPA
One possible way is to split your input file into a number of pieces and then launch a separate shell pipeline for each piece. The kernel will spread those processes across multiple cores.
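A sketch of that approach; the piece.* names are made up, and -n l/4 (split into 4 pieces without breaking lines mid-way) assumes GNU split:

```shell
# Cut the file into 4 line-aligned pieces, run one pipeline per piece
# in the background, then stitch the results back together.
split -n l/4 HUGEFILE.txt piece.            # writes piece.aa .. piece.ad
for p in piece.a?; do
    cut -f 5 "$p" | awk '{print $1+$2}' > "$p.out" &
done
wait                                        # all four pipelines have finished here
cat piece.a?.out > result.txt
```

Note this only preserves the input order because split keeps lines contiguous and the output files are concatenated in the same order the pieces were made.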
2 Comments
Nick
But all of the CPUs won't necessarily be busy at the same time, right? Won't there be moments when some CPU sits idle?
Noufal Ibrahim
Multiple processes should use separate CPUs but I'm not totally sure.
If you use both cut and awk in the same pipeline, they run as separate processes and will likely be scheduled on different CPUs, so you'll use two of them. But there's no simple way to make a short pipeline like that use more CPUs than it has stages.
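You can actually watch this: each stage of a pipeline is its own process, and on Linux ps can report which processor each one is currently on (the psr field; column names differ on other systems):

```shell
# Start the pipeline in the background, give the scheduler a moment,
# then list the processor (PSR) each stage is running on while it's
# still chewing through the file.
cut -f 5 HUGEFILE.txt | awk '{print $1+$2}' > /dev/null &
sleep 1
ps -o pid,psr,comm -C cut,awk   # procps ps; -C selects by command name
wait
```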
1 Comment
Nick
I have a long command, something like (cut -f 5 HUGEFILE.txt | awk '{print $1+$2}'). The computer I'm working on says it's using 100% of one CPU and that the rest are idle.