I am trying to evaluate several methods to compress some 2D data points. The algorithm itself is not relevant, but from the output, I can compute the MSE and the number of points (which can be used to calculate the compression ratio). Is there any metric that combines compression quality (MSE) with the resulting number of points (compression ratio)?
Given some feedback in the comments, I will explain the problem in more detail and propose one possible metric (hoping to get some comments on it).
As stated, we have a large set of (x, y) points (millions of points) that needs to be simplified before it can be analysed. The points are not produced by a smooth function and can be noisy. The analysis includes finding relevant peaks and valleys. I am evaluating several different methods that reduce/compress this set of points into a smaller one while minimizing a cost function (based on MSE). The reduction algorithm itself is based on RDP (Ramer-Douglas-Peucker).
There are several different configurations that need to be evaluated, so I need a single metric that summarizes the quality contributed by each retained point. The error is minimal if I use all the points in the set, but that does not reduce the number of points. The idea is to keep the minimum number of points, retaining the ones that reduce the error the most.
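To make the two quantities concrete, here is a minimal sketch of how they could be computed for one configuration. It assumes (this is my assumption, not something fixed by the methods above) that x is strictly increasing and that the error is the vertical distance between each original point and the straight-line interpolation through the retained points. The function names and `keep_idx` (indices of the retained points) are hypothetical.

```python
import numpy as np

def mse_of_reduction(x, y, keep_idx):
    """MSE between the original y values and the piecewise-linear
    interpolation through the retained points.
    Assumes x is strictly increasing and keep_idx is sorted."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep_idx = np.asarray(keep_idx)
    # Reconstruct every original sample from the retained points only.
    y_hat = np.interp(x, x[keep_idx], y[keep_idx])
    return float(np.mean((y - y_hat) ** 2))

def compression_ratio(n_original, n_kept):
    """Original number of points divided by the retained number."""
    return n_original / n_kept

# Tiny zig-zag example: keeping only the endpoints loses both peaks.
x = np.arange(5.0)
y = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
print(mse_of_reduction(x, y, [0, 4]), compression_ratio(len(x), 2))
```

If the cost function uses perpendicular distance (as plain RDP does) rather than vertical distance, the interpolation step would need to be replaced accordingly.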
Based on the comments, one idea would be to compute a rough estimate of the improvement contributed by each point. Let us define improvement as the difference between a reference MSE and the final MSE: I = MSE_r - MSE_f. One rough estimate for the reference MSE (MSE_r) would be to compute the MSE using only the first and the last point in the set, connected by a straight line.
The final metric (AIP) could then be the improvement divided by the number of points (N): AIP = I/N = (MSE_r - MSE_f)/N.
AIP stands for Average Improvement per Point.
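As a rough sketch of what I mean, the metric could be computed as follows. I am assuming here that N is the number of retained points (endpoints included), that x is strictly increasing, and that the MSE is measured vertically against the straight-line interpolation through the retained points. The names `piecewise_mse`, `aip` and `keep_idx` are just illustrative.

```python
import numpy as np

def piecewise_mse(x, y, keep_idx):
    """MSE of the piecewise-linear reconstruction through the retained points.
    Assumes x is strictly increasing and keep_idx is sorted."""
    keep_idx = np.asarray(keep_idx)
    y_hat = np.interp(x, x[keep_idx], y[keep_idx])
    return float(np.mean((y - y_hat) ** 2))

def aip(x, y, keep_idx):
    """Average Improvement per Point: (MSE_r - MSE_f) / N."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Reference: straight line between the first and the last point.
    mse_r = piecewise_mse(x, y, [0, len(x) - 1])
    # Final: reconstruction from the retained points of this configuration.
    mse_f = piecewise_mse(x, y, keep_idx)
    return (mse_r - mse_f) / len(keep_idx)

# Compare two hypothetical configurations on a tiny zig-zag signal.
x = np.arange(5.0)
y = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
for keep in ([0, 4], [0, 1, 2, 3, 4]):
    print(keep, aip(x, y, keep))
```

With this definition, the endpoints-only configuration always scores 0, and a configuration can score below 0 if its reconstruction is worse than the straight line.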
Any better idea?