
I'm trying to understand how I can show image segmentation results to users.

I mean that if I have this input image:

[input image]

I want to show the user this result:

[segmentation result]

These images are from this GitHub repository. I have checked their code, but I haven't found where they show their results to the users.

How can I show semantic segmentation results to the user?

  • Take a look at the implementation of labelVisualize(); that's the function they use to visualize their results (you do need to adapt the code, though). In principle, the first operation you normally do on the output mask is a threshold at 0.5 to binarize the output image. Then you can just display it or postprocess it as needed. Commented Jan 27, 2020 at 11:05
  • @GPhilo Thanks. Now I have to understand what the network output is. Is it an image region plus a label for that region, or maybe a set of coordinates indicating where each class is? Thanks again. Commented Jan 27, 2020 at 11:09
  • Ok, let me expand in a proper answer then. Commented Jan 27, 2020 at 11:10

1 Answer


What is the output of a semantic segmentation network?

UNet (the one in the example) and essentially every other network that deals with semantic segmentation produce as output an image whose size is proportional to the input image's, and in which each pixel is classified as one of the specified classes.

For binary classification, the raw output is typically a single-channel float image with values in [0, 1] that must be thresholded at 0.5 to obtain the binary "foreground" mask. It's also possible that the network was trained with two explicit classes (foreground/background); in that case, handle the output as described below for multi-class classification.
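As a minimal sketch of that thresholding step (assuming the output is already a NumPy array; pred here is a hypothetical stand-in for the network's raw prediction):

    import numpy as np

    # Hypothetical stand-in for the raw network output:
    # a single-channel float image with values in [0, 1].
    pred = np.random.rand(256, 256, 1).astype(np.float32)

    # Threshold at 0.5 to binarize, then scale to 0/255 so the
    # mask can be displayed or saved as a regular grayscale image.
    mask = (pred[..., 0] > 0.5).astype(np.uint8) * 255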

For multi-class classification, the raw output image has N channels, one per class, with the value at index [x, y, c] being the score for pixel (x, y) belonging to class c (think of it as the probability that the pixel belongs to class c, although in principle scores don't have to be probabilities). For each pixel, the selected class is the one whose channel has the highest score.
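A minimal sketch of that class-selection step (again with a hypothetical pred array standing in for the raw output):

    import numpy as np

    # Hypothetical raw multi-class output with shape (height, width, num_classes).
    pred = np.random.rand(256, 256, 4).astype(np.float32)

    # For each pixel, pick the class (channel) with the highest score.
    class_map = np.argmax(pred, axis=-1)  # shape (256, 256), values in 0..3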

The result can then be postprocessed for display, e.g., flattening the N channels into a single class map and assigning each pixel the color of its "winning" class. That seems to be what the example you link does: if you take a look at the implementation of labelVisualize(), they use a dict mapping class codes to colors.
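As an illustration (not the repo's exact code, and the color values here are made up; the linked labelVisualize() takes its own color mapping), colorizing a class map could look like this:

    import numpy as np

    # Hypothetical class-to-color mapping (RGB).
    color_dict = {
        0: (0, 0, 0),      # background
        1: (255, 0, 0),    # class 1
        2: (0, 255, 0),    # class 2
        3: (0, 0, 255),    # class 3
    }

    def colorize(class_map):
        """Turn an (H, W) map of class indices into an (H, W, 3) RGB image."""
        out = np.zeros(class_map.shape + (3,), dtype=np.uint8)
        for cls, color in color_dict.items():
            out[class_map == cls] = color
        return out

    rgb = colorize(np.random.randint(0, 4, size=(256, 256)))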


1 Comment

Thanks a lot! Maybe you could also be interested in this question: ai.stackexchange.com/questions/17695/…

