I am trying to get a geometrical representation of an image (ultimately as a WKT string, but any geometrical encoding is fine). I'm using OpenCV's Canny edge detector to find edges and then the Hough Line Transform to detect lines in the resulting edge map. Despite trying different parameters for both Canny and the Hough transform, I am unable to get satisfactory results.
import numpy as np
import cv2 as cv
import requests
from matplotlib import pyplot as plt
img_link = 'https://i.ibb.co/pwTwWfg/13581.jpg'
resp = requests.get(img_link)
arr = np.asarray(bytearray(resp.content), dtype=np.uint8)
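img = cv.imdecode(arr, -1)
# Check the decode result before using the image; imdecode returns None on failure
assert img is not None, "image could not be decoded, check the URL and the response content"
img = cv.GaussianBlur(img, (15, 15), 0)
edges = cv.Canny(img, 100, 120, apertureSize=3)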
plt.subplot(121)
plt.imshow(img, cmap='gray')
plt.title('Original Image')
plt.xticks([])
plt.yticks([])
plt.subplot(122)
plt.imshow(edges, cmap='gray')
plt.title('After Canny')
plt.xticks([])
plt.yticks([])
plt.show()
Plot of the original image alongside the result after running Canny edge detection
Then I run the (Probabilistic) Hough Line Transform on the resulting edge map to get lines:
linesP = cv.HoughLinesP(edges, 1, np.pi / 360, 3, None, 50, 10)
cdstP = cv.cvtColor(edges, cv.COLOR_GRAY2BGR)
if linesP is not None:
    for i in range(0, len(linesP)):
        l = linesP[i][0]  # one segment as [x1, y1, x2, y2]
        cv.line(cdstP, (l[0], l[1]), (l[2], l[3]), (0,0,255), 3, cv.LINE_AA)
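# OpenCV images are BGR while matplotlib expects RGB, so convert before plotting
plt.imshow(cv.cvtColor(cdstP, cv.COLOR_BGR2RGB))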
plt.show()
Hough Line Transform result overlaid on the edge map generated earlier
There are certain problems with this approach:
- Not all edges are detected by the Hough Line Transform.
- Sometimes, the interior of the shape is detailed and is also detected as edges.
I have tried different aperture sizes (for Canny), Gaussian kernel sizes, and Hough Transform voting thresholds as well as resolutions. None work perfectly, and the ideal parameters seem to vary image by image. Is there something I am missing?
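Regarding the per-image variation: one heuristic that is sometimes used is to derive the Canny thresholds from the median pixel intensity instead of hard-coding them. Below is a minimal sketch, assuming a grayscale `uint8` image. The helper name `auto_canny_thresholds` and the `sigma=0.33` default are illustrative choices, not part of OpenCV, and I have not verified that this resolves the problems listed above.

```python
import numpy as np


def auto_canny_thresholds(gray, sigma=0.33):
    """Return (lower, upper) Canny thresholds centred on the median intensity.

    Hypothetical helper, not an OpenCV function; `sigma` sets how wide the
    band around the median is.
    """
    med = float(np.median(gray))
    # Clamp both thresholds to the valid 8-bit range.
    lower = int(max(0.0, (1.0 - sigma) * med))
    upper = int(min(255.0, (1.0 + sigma) * med))
    return lower, upper


# Synthetic stand-in for a blurred grayscale image.
demo = np.full((4, 4), 100, dtype=np.uint8)
lower, upper = auto_canny_thresholds(demo)
print(lower, upper)
# With OpenCV available, the pair would be passed on as:
# edges = cv.Canny(gray, lower, upper)
```

This only removes the two Canny thresholds from the manual tuning; the blur kernel size and the Hough parameters would still have to be chosen separately.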
If there is a better way to do this, please let me know.
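For reference, this is roughly the final step I have in mind once the segments are reliable: a minimal sketch that encodes them as a WKT `MULTILINESTRING`. It assumes the input has the layout `cv.HoughLinesP` returns (entries shaped `[[x1, y1, x2, y2]]`, or `None` when nothing is found). The function name `segments_to_wkt` and the example segments are made up for illustration.

```python
def segments_to_wkt(lines):
    """Encode line segments as a WKT MULTILINESTRING.

    `lines` is expected in the layout cv.HoughLinesP returns: a sequence of
    entries shaped [[x1, y1, x2, y2]], or None when nothing was found.
    """
    if lines is None or len(lines) == 0:
        return "MULTILINESTRING EMPTY"
    parts = []
    for entry in lines:
        # Cast to plain ints so NumPy scalars format the same way as Python ints.
        x1, y1, x2, y2 = (int(v) for v in entry[0])
        parts.append(f"({x1} {y1}, {x2} {y2})")
    return "MULTILINESTRING (" + ", ".join(parts) + ")"


# Hypothetical segments standing in for the real `linesP` array.
example = [[[0, 0, 10, 0]], [[10, 0, 10, 5]]]
print(segments_to_wkt(example))
# prints: MULTILINESTRING ((0 0, 10 0), (10 0, 10 5))
```

The coordinates are pixel coordinates with the y axis pointing down, so they would still need flipping or georeferencing before use in a GIS context.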