6

I have a large amount of data from which I would like to create a scatter plot to include in my LaTeX document. I use gnuplot with the epslatex terminal to generate the plot so that I can import it into my LaTeX document easily.

My problem is that the EPS files are far too big (approximately 14 MB per figure), which results in a very large output document. Clearly the reason is that every data point is stored in the EPS file, which is more detail than the figure needs.

However, I couldn't find a way to compress the EPS file. The only option I see is to reduce the number of sample points, but for technical reasons I would prefer not to do that.

Can anyone suggest a way to reduce the size of EPS plots?

I tried using ImageMagick to reduce the resolution of the EPS files (e.g. convert -units PixelsPerInch plot.eps -density 300 plot2.eps), but that shrinks the dimensions, which is not what I want.

Thanks in advance,

3 Answers

4

My solution for this problem is the "every" keyword of gnuplot's plot command, i.e.

plot "datafile" u 1:2 every 10

This way you can already reduce the size of the EPS graphics by roughly a factor of 10. Of course, you need to find out for yourself how much data you can omit without losing too much information, i.e. the figure should still contain all the features you want to visualize.

If this is not wanted, I normally convert the EPS to a raster image of appropriate size and convert it back to EPS. Here, too, you have to play around with the resolution etc.
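As a minimal sketch of how the thinning could sit in a complete script: the variable name n_skip is purely illustrative, and the terminal and output lines assume the epslatex setup described in the question. Larger values of n_skip keep fewer points and give a smaller file.

```gnuplot
# n_skip is a hypothetical variable name: keep only every n_skip-th data point
n_skip = 10

# Assumed setup: the epslatex terminal mentioned in the question
set terminal epslatex color
set output "plot.tex"

# Same plot command as above, with the thinning factor taken from n_skip
plot "datafile" using 1:2 every n_skip with points notitle

# Close the output file so it is written completely
set output
```

Comparing the output for a few values of n_skip is probably the quickest way to see where features of the data start to disappear.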

1 Comment

Thank you very much. I tried this and it helps to some extent, but the EPS output is still big (if I include 5 plots of this type in my LaTeX document, the resulting PDF file is as big as 6 MB). Perhaps I should use a bigger downsampling factor, but one still small enough to keep the features I want to show.

2

The problem with .eps files is not necessarily resolution (they are vector graphics), but the amount of information gnuplot includes when creating the file. Gnuplot has a tendency to draw .eps files with lots of extra information, especially for 2D plots and plots with lots of points. For instance, for a grid of red squares joined at the edges to make a big red square, gnuplot would draw tons of little red squares instead of the big square. This issue is mentioned at the end of this blog post, where they say that plot ... with image creates a much smaller output than splot for making heat maps.

It sounds like you are not using splot, though, so you could try making a .pdf instead of .eps, and if you need .eps convert it using pdf2ps or another program. That might help...
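A minimal sketch of the PDF route, assuming gnuplot was built with the pdfcairo terminal; the file names and the 8cm size are placeholders. The result is still vector graphics, so its size still grows with the number of points, but PDF content is usually stored compressed, which may help.

```gnuplot
# Assumption: the pdfcairo terminal is available in this gnuplot build
set terminal pdfcairo size 8cm,8cm
set output "plot.pdf"

# "datafile" and the column choice are placeholders for the real data
plot "datafile" using 1:2 with points notitle

# Close the output file so the PDF is finalized
set output
```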

Out of curiosity, how many points are you plotting? If you could give an idea of the amount of data you use, along with some example code you are using right now, we might be able to give better ideas.

3 Comments

I am using plot: set terminal epslatex size 8cm,8cm color colortext; set output "plot.tex"; plot "temp.dat" using 2:4 w p title ""; where "temp.dat" is a file containing around 1'000'000 lines each of which having the desired data in second and fourth columns.
I see. I agree with Raphael's answer--if you want to show a million data points, that will result in a huge file no matter what if you use vector graphics. I would suggest making a high-resolution .png image instead, or filtering the data in some way, like with the every command.
The current gnuplot development version has a level3 option for the postscript and epslatex terminal which uses PNG encoding for the embedded bitmaps from plot ... with image. This can reduce the image size considerably.
2

I encountered a similar problem where a scatter plot of more than 10^6 points resulted in PDF files of >100 MB. The points were drawn with a very low opacity (1%) so that only many layered points would be visible at all, resulting in something more like a smooth density distribution than a scatter plot. Thus, I was very reluctant to follow Raphael Roth's advice and thin out the data.

Instead, I found it useful to create a separate gnuplot script that renders the data to a PNG bitmap image of sufficient resolution using the pngcairo terminal. This plot has no axes, no tics, no border and no margins, just the data drawn in the appropriate coordinates:

set terminal pngcairo transparent size 400,400
set output 'foo.png'
set margins 0,0,0,0
set border 0
unset xtics
unset ytics
# set xrange, yrange appropriately
plot ... with points notitle

Then, in the actual plot (for which I used the cairolatex terminal), I plot this PNG image:

set terminal cairolatex pdf
# regular setup, using the same xrange and yrange
plot 'foo.png' binary filetype=png with rgbalpha axes x2y2

Note that I plot using the other (ticless) set of axes to ensure that the image fills the graph area without any border, so the tics on the x1y1 axes match the actual positions of the points in the scatter plot.

The PNG ended up being only a few dozen kilobytes, the PDF was a couple of MB. I think the rgbalpha plot style (similar to with image) is not the most efficient but this was good enough for me.
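For reference, the two passes might be combined into one script roughly as follows. This is only a sketch under assumptions: the data file name, the column choice and the xmin/xmax/ymin/ymax limits are placeholders, and the explicit x2/y2 ranges assume that a 400x400 image has its pixel centres at 0..399, so that it spans -0.5 to 399.5 on each axis.

```gnuplot
# Hypothetical data limits; replace with the real extent of your data
xmin = 0; xmax = 1
ymin = 0; ymax = 1

# Pass 1: render only the points into a bare bitmap
set terminal pngcairo transparent size 400,400
set output 'foo.png'
set margins 0,0,0,0
set border 0
unset xtics
unset ytics
set xrange [xmin:xmax]
set yrange [ymin:ymax]
plot "datafile" using 1:2 with points notitle
set output

# Pass 2: draw the real axes and place the bitmap behind them
reset   # restores margins, border and tics; user variables are kept
set terminal cairolatex pdf
set output 'plot.tex'
set xrange [xmin:xmax]
set yrange [ymin:ymax]
# Assumption: pixel centres lie at 0..399, so the image covers -0.5..399.5
set x2range [-0.5:399.5]
set y2range [-0.5:399.5]
plot 'foo.png' binary filetype=png with rgbalpha axes x2y2 notitle
set output
```

Keeping the limits in variables means both passes are guaranteed to use the same xrange and yrange, which is what makes the tics line up with the bitmap.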

Hope someone will find this useful.

6 Comments

See also tex.stackexchange.com/a/131106/33933 for related tests from my side.
@Christoph You don't happen to know if there's an option like level3 for the cairolatex terminal? Or is that the default?
On another note, something to be aware of is the (unfortunate) behavior of Apple's PDF engine (and also the pdf.js viewer in Firefox and (possibly?) Chrome) of applying a smoothed upscaling (interpolation) to the embedded PNGs drawn with image. To be clear, I don't mean the (desired) interpolation of pm3d but one which shouldn't really be there. This can be avoided by using with image pixels, which falls back to drawing rectangles (resulting in large files). See gnuplot> help image pixels
cairolatex doesn't have such an option AFAIK, such details are deferred to cairo, whereas epslatex is written byte-by-byte by gnuplot. You may also want to try set terminal tikz externalimages, haven't tried this by myself, yet. Does the interpolation problem also happen with pdf? I was aware only of eps making problems, see my comments to stackoverflow.com/q/25736904/2604213
Yes, it also happens with pdf output. And it's not even limited to Preview.