I have a simple static class whose purpose is, given an RDD of Point, to find the median of each dimension and return it as a new Point, using Spark's reduce functions.
This is the class:
public class MedianPointFinder {

    public static Point findMedianPoint(JavaRDD<Point> points) {
        Point biggestPointByXDimension = points.reduce((a, b) -> getBiggestPointByXDimension(a, b));
        Point biggestPointByYDimension = points.reduce((a, b) -> getBiggestPointByYDimension(a, b));
        double xDimensionMedian = biggestPointByXDimension.getX() / 2.0;
        double yDimensionMedian = biggestPointByYDimension.getY() / 2.0;
        return new Point(xDimensionMedian, yDimensionMedian);
    }

    private static Point getBiggestPointByXDimension(Point first, Point second) {
        return first.getX() > second.getX() ? first : second;
    }

    private static Point getBiggestPointByYDimension(Point first, Point second) {
        return first.getY() > second.getY() ? first : second;
    }
}
The Point class is a simple class for storing an (x, y) point.
Given 1 2 9, I consider the median to be 2, not 4.5.
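To make that difference concrete, here is a minimal plain-Java sketch (no Spark involved) that compares half of the maximum, which is what the posted code computes for each dimension, with a sort-based median. The class `MedianSketch` and the methods `halfOfMax` and `median` are hypothetical names introduced only for this illustration, and taking the mean of the two middle values for an even count is an assumed convention.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the posted code.
class MedianSketch {

    // What the posted approach yields for one dimension: half of the largest value.
    static double halfOfMax(double[] values) {
        return Arrays.stream(values).max().getAsDouble() / 2.0;
    }

    // Sort-based median: the middle element of the sorted values, or the mean
    // of the two middle elements when the count is even (assumed convention).
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("median of an empty array is undefined");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 9};
        System.out.println(halfOfMax(xs)); // prints 4.5
        System.out.println(median(xs));    // prints 2.0
    }
}
```

This sketch sorts an in-memory array, so it only shows the definition. On an RDD the values of each dimension would have to be ordered or ranked in a distributed way, which a single pairwise reduce that keeps the larger element cannot do.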