I want to develop a face alignment program. A face is extracted from a video and aligned frame by frame: a result frame is constructed from the first frame of the video, and then the face from every subsequent frame is aligned to it and re-recorded as the new result frame. The alignment is performed via homography, so for every frame I need to find keypoints, match them between the current face and the result face, and compute a homography.
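To make the setup concrete, the overall loop is roughly the sketch below; extract_face, detect_and_describe and match_descriptors are placeholders for my actual preprocessing, detection and matching code, not real functions:

import cv2
import numpy as np

result = extract_face(first_frame)                   # reference ("result") face
for frame in remaining_frames:
    face = extract_face(frame)                       # current face
    kp_cur, des_cur = detect_and_describe(face)      # keypoints/descriptors on the current face
    kp_res, des_res = detect_and_describe(result)    # keypoints/descriptors on the result face
    matches = match_descriptors(des_cur, des_res)
    src = np.float32([kp_cur[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_res[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    h, _ = cv2.findHomography(src, dst, cv2.RANSAC)
    # the warped current face becomes the new result frame
    result = cv2.warpPerspective(face, h, (result.shape[1], result.shape[0]))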
Here is the problem: in my pipeline, the keypoints for the current frame must not be detected from scratch on every frame. Instead, the following algorithm is proposed:

1. There are some predefined points in the format of a 2D numpy array. (In general, they could be any points on the image, but for this example let's imagine they are face landmarks.)
2. For the first frame, I use the AKAZE feature detector to search for keypoints in the area close to the initial points from item 1 (see the sketch after this list).
3. I use cv2.calcOpticalFlowPyrLK to track those keypoints, so in the next frame I do not detect them again, but reuse the keypoints tracked from the previous frame.
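For reference, the first-frame detection near the predefined points looks roughly like the following sketch (the function name, the radius and the threshold value are illustrative assumptions, not my exact code):

import cv2
import numpy as np

def detect_near_landmarks(gray_face, landmarks, radius=15, threshold=0.001):
    # binary mask that is non-zero only in small disks around the predefined points
    mask = np.zeros(gray_face.shape[:2], dtype=np.uint8)
    for x, y in landmarks:
        cv2.circle(mask, (int(x), int(y)), radius, 255, -1)
    akaze = cv2.AKAZE_create(threshold=threshold)
    # detect only inside the masked regions, then describe the detected keypoints
    keypoints = akaze.detect(gray_face, mask)
    keypoints, descriptors = akaze.compute(gray_face, keypoints)
    return keypoints, descriptors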
And here is the code for the tracking and matching part:
# Parameters for Lucas-Kanade optical flow
lk_params = dict(winSize=(15, 15),
                 maxLevel=2,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))
# previous_keypoints are the keypoints from the previous frame (a list of cv2.KeyPoint)
# here I cast them to the input format for optical flow
coord_keypoints = np.array(list(map(lambda point: [point.pt[0], point.pt[1]], previous_keypoints)), dtype = np.float32)
p0 = coord_keypoints.copy().reshape((-1, 1, 2))
# oldFace_gray and faceImg1 are the faces from previous and current frame respectively
p1, st, err = cv2.calcOpticalFlowPyrLK(oldFace_gray, faceImg1, p0, None, **lk_params)
indices = np.where(st==1)[0]
good_new = p1[st==1]
good_old = p0[st==1]
# Here I cast the tracked points back to cv2.KeyPoint for description and matching
keypoints1 = []
for idx, point in zip(indices, good_new):
    keypoint = cv2.KeyPoint(x=point[0], y=point[1],
                            _size=previous_keypoints[idx].size,
                            _class_id=previous_keypoints[idx].class_id,
                            _response=previous_keypoints[idx].response)
    keypoints1.append(keypoint)
# here I create descriptors for keypoints defined above for current face and find and describe keypoints for result face
akaze = cv2.AKAZE_create(threshold = threshold)
keypoints1, descriptors1 = akaze.compute(faceImg1, keypoints1)
keypoints2, descriptors2 = akaze.detectAndCompute(faceImg2, mask=None)
# Then I want to filter keypoints for result face by their distance to points on current face and previous result face
# For that, first define a helper function
def landmarkCondition(point, landmarks, eps):
    for borderPoint in landmarks:
        if np.linalg.norm(np.array(point.pt) - np.array(borderPoint)) < eps:
            return True
    return False
# Then apply the filters. landmarks_result is a 2D numpy array of coordinates of the keypoints found on the previous result face.
keypoints_descriptors2 = (filter(lambda x : landmarkCondition(x[0], landmarks_result, eps_result), zip(keypoints2, descriptors2)))
keypoints_descriptors2 = list(filter(lambda x : landmarkCondition(x[0], good_new, eps_initial), keypoints_descriptors2))
keypoints2, descriptors2 = [], []
for keypoint, descriptor in keypoints_descriptors2:
    keypoints2.append(keypoint)
    descriptors2.append(descriptor)
descriptors2 = np.array(descriptors2)
# Match the found keypoints
height, width, channels = coloredFace2.shape
matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_SL2)
matches = matcher.match(descriptors1, descriptors2, None)
# Sort matches by score (smallest distance first)
matches = sorted(matches, key=lambda x: x.distance)
numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
matches = matches[:numGoodMatches]
# I want to eliminate obviously bad matches. Since the two images are meant to be similar, a line connecting two corresponding points (with the images placed side by side) should be almost horizontal, with a length approximately equal to the width of the image.
def correct(point1, point2, width, eps=NOT_ZERO_DIVIDER):
    x1, y1 = point1
    x2, y2 = point2
    angle = abs((y2 - y1) / (x2 - x1 + width + eps))
    length = x2 - x1 + width
    return angle < CRITICAL_ANGLE and (1 - RELATIVE_DEVIATION) * width < length < (1 + RELATIVE_DEVIATION) * width
goodMatches = []
for match in matches:
    if correct(keypoints1[match.queryIdx].pt, keypoints2[match.trainIdx].pt, width):
        goodMatches.append(match)
# Find homography
points1 = np.zeros((len(goodMatches), 2), dtype=np.float32)
points2 = np.zeros((len(goodMatches), 2), dtype=np.float32)
for i, match in enumerate(goodMatches):
    points1[i, :] = keypoints1[match.queryIdx].pt
    points2[i, :] = keypoints2[match.trainIdx].pt
h, mask = cv2.findHomography(points1, points2, method)
height, width, channels = coloredFace2.shape
result = cv2.warpPerspective(coloredFace1, h, (width, height))
resultGray = cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)
The result of such matching and aligning is very poor. If I detect keypoints for both images at every step instead of tracking them, the result is quite good. Am I making a mistake somewhere?
P.S. I am not sure about posting a minimal reproducible example because there is a lot of preprocessing of the video frames.