I am writing a simple script in numpy which takes a 640 x 480 depth image (a 2D numpy array of bytes) and converts it into a num_points x 3 numpy array of points, given a pinhole camera model. The math is fairly simple, and I've gotten the script to work, but it's extremely slow: it takes 2 seconds for each frame. I've written a similar script in C++ before and gotten ~100 ms per frame. I'm wondering what optimizations I can make to my Python script. Can any of this be vectorized? Could I benefit from parallelization?

def create_point_cloud(self, depth_image):
    shape = depth_image.shape;
    rows = shape[0];
    cols = shape[1];

    points = np.zeros((rows * cols, 3), np.float32);

    bytes_to_units = (1.0 / 256.0);

    # Linear iterator for convenience
    i = 0
    # For each pixel in the image...
    for r in xrange(0, rows):
        for c in xrange(0, cols):
            # Get the depth in bytes
            depth = depth_image[r, c, 0];

            # If the depth is 0x00 or 0xFF, it's invalid.
            # By convention it should be replaced by a NaN depth.
            if(depth > 0 and depth < 255):
                # The true depth of the pixel in units
                z = depth * bytes_to_units;

                # Get the x, y, z coordinates in units of the pixel
                # using the intrinsics of the camera (cx, cy, fx, fy)
                points[i, 0] = (c - self.cx) / self.fx * z;
                points[i, 1] = (r - self.cy) / self.fy * z;
                points[i, 2] = z
            else:
                # Invalid points have a NaN depth
                points[i, 2] = np.nan;
            i = i + 1
    return points

1 Answer

I can't check it because I don't have your data, but the following code should do the job:

def create_point_cloud_vectorized(self, depth_image):
    rows, cols = depth_image.shape[:2]

    # get the depth channel
    d = depth_image[:, :, 0]

    # scale valid depths to units; replace invalid ones (0x00 or 0xFF) with np.nan
    depth = np.where((d > 0) & (d < 255), d / 256., np.nan)

    # x depends on the column index (cx, fx), y on the row index (cy, fy);
    # broadcasting against depth gives full rows x cols arrays
    x = (np.arange(cols)[None, :] - self.cx) / self.fx * depth
    y = (np.arange(rows)[:, None] - self.cy) / self.fy * depth

    # combine x, y, depth and bring it into the shape (rows * cols, 3)
    # note: unlike the loop version, invalid pixels get NaN in x and y too
    return np.array((x, y, depth)).reshape(3, -1).swapaxes(0, 1)
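
Since the answer was posted untested, one way to gain confidence in a vectorized version is to compare it against the original loop on synthetic data. The following is only a sketch under a few assumptions: both functions are standalone and take the intrinsics (cx, cy, fx, fy) as arguments instead of reading them from self, the test image and intrinsic values are made up, and x and y are set to 0 at invalid pixels so that the output matches the loop in the question exactly.

```python
import numpy as np


def point_cloud_loop(depth_image, cx, cy, fx, fy):
    """Reference version: the per-pixel loop from the question."""
    rows, cols = depth_image.shape[:2]
    points = np.zeros((rows * cols, 3), np.float32)
    i = 0
    for r in range(rows):
        for c in range(cols):
            depth = depth_image[r, c, 0]
            if 0 < depth < 255:
                z = depth / 256.0
                points[i, 0] = (c - cx) / fx * z
                points[i, 1] = (r - cy) / fy * z
                points[i, 2] = z
            else:
                # Invalid pixel: x and y stay 0, z becomes NaN
                points[i, 2] = np.nan
            i += 1
    return points


def point_cloud_vectorized(depth_image, cx, cy, fx, fy):
    """Vectorized version built on broadcasting."""
    rows, cols = depth_image.shape[:2]
    d = depth_image[:, :, 0]
    valid = (d > 0) & (d < 255)

    # z is NaN at invalid pixels; z_or_zero keeps x and y at 0 there,
    # which mirrors the behaviour of the loop above
    z = np.where(valid, d / 256.0, np.nan)
    z_or_zero = np.where(valid, d / 256.0, 0.0)

    # Column index drives x, row index drives y
    x = (np.arange(cols)[None, :] - cx) / fx * z_or_zero
    y = (np.arange(rows)[:, None] - cy) / fy * z_or_zero

    # Stack to (rows, cols, 3), then flatten row-major to (rows * cols, 3)
    return np.stack((x, y, z), axis=-1).reshape(-1, 3).astype(np.float32)


# Hypothetical test data: a small random image and made-up intrinsics
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(48, 64, 1), dtype=np.uint8)
intrinsics = dict(cx=32.0, cy=24.0, fx=50.0, fy=50.0)

expected = point_cloud_loop(image, **intrinsics)
actual = point_cloud_vectorized(image, **intrinsics)
assert np.allclose(expected, actual, equal_nan=True)
```

If the NaN-in-x-and-y behaviour of the answer is acceptable, the z_or_zero array can be dropped and z used directly. Wrapping each call in time.perf_counter() on a real 640 x 480 frame would show how much the vectorized version actually gains on a given machine.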
1 Comment

Thank you. I will try this out. For the record I decided to just do this on the GPU, and it has become a moot point.
