
I have a 10 GB file with a couple of billion entries and many columns. I want to plot each column in a separate subplot. I used the following MWE:

set datafile separator ","
set terminal png
set output "a.png"
set multiplot layout 2,1 title ""
plot "camkii.dat" using 1:2 with lines
plot "camkii.dat" using 1:23 with lines

This script takes a few tens of seconds. As you can see, I call plot "camkii.dat" ... twice, and I suspect that the file is read each time. This is not very efficient, and I might run out of memory.

Ideally, I would read the file into a variable (say foo) and then draw each subplot from that variable, with something like plot foo[1] ... and plot foo[2] ... etc. That way the file would be read only once.

Am I right in suspecting that gnuplot loads the file twice? If so, would reading the file into a variable and plotting from that help? Suggested changes to the MWE would be great.

  • No, you cannot cache data to reuse it in a second plot. If the amount of data is causing trouble, you could try storing it in a more efficient format, such as HDF5. You could then use a tool such as h5totxt to extract only the parts you need without reading the whole file. This is just a guess; I haven't benchmarked it. Commented Feb 20, 2016 at 21:12

1 Answer


I guess the entire file is read twice, but I'm not sure. If you are on a Linux system, you could invoke awk to extract only the needed columns (although the first column is still read twice):

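# -F, splits the input on commas; OFS=, keeps awk's output comma-separated for gnuplot
plot "<awk -F, -v OFS=, '{print $1, $2}' camkii.dat" with lines
plot "<awk -F, -v OFS=, '{print $1, $23}' camkii.dat" with lines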
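
If the repeated full-file reads turn out to be the bottleneck, one option might be to let awk read the large file once, keep only the columns that will be plotted, and point gnuplot at the much smaller result. A minimal sketch, assuming a POSIX shell and awk: the file name camkii.dat and the columns 1, 2 and 23 come from the question, while camkii_subset.dat and the extract_columns helper are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: read the big CSV a single time and keep only the columns to plot.
# "camkii_subset.dat" and "extract_columns" are hypothetical names.

extract_columns() {
    # $1 = input CSV, $2 = output CSV
    # -F, splits on commas; OFS=, writes comma-separated output,
    # matching `set datafile separator ","` in the gnuplot script.
    awk -F, -v OFS=, '{ print $1, $2, $23 }' "$1" > "$2"
}

# Run the extraction only if the input file from the question is present.
if [ -f camkii.dat ]; then
    extract_columns camkii.dat camkii_subset.dat
fi
```

gnuplot could then plot "camkii_subset.dat" using 1:2 and using 1:3 (column 23 of the original becomes column 3 of the subset). The subset is still read once per plot, but it holds three columns instead of all of them. I have not benchmarked this.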

3 Comments

Now awk must read the file twice. Not sure if that's better.
Maybe it's better because the full file is not loaded twice at the same time. For more control over how memory is allocated, though, you should have a look at other tools (e.g. gnuplot.py or Python's matplotlib).
@RaphaelRoth I was already using matplotlib. Unfortunately, plotting the data takes much longer with matplotlib (unless I subsample the data).
