
I am following the object_detection_tutorial.ipynb tutorial.

Here is the code (I only included the parts that differ; the rest is the same as in the notebook):

my_results = []   # I added this, a list to hold the detected classes

PATH_TO_LABELS = 'D:\\TensorFlow\\models\\research\\object_detection\\data\\oid_v4_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

PATH_TO_TEST_IMAGES_DIR = pathlib.Path('C:\\Users\\Bhavin\\Desktop\\objects')
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))
TEST_IMAGE_PATHS

model = load_model()

def run_inference_for_single_image(model, image):
  image = np.asarray(image)
  # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
  input_tensor = tf.convert_to_tensor(image)
  # The model expects a batch of images, so add an axis with `tf.newaxis`.
  input_tensor = input_tensor[tf.newaxis,...]

  # Run inference
  output_dict = model(input_tensor)

  # All outputs are batch tensors.
  # Convert to numpy arrays, and take index [0] to remove the batch dimension.
  # We're only interested in the first num_detections.
  num_detections = int(output_dict.pop('num_detections'))
  output_dict = {key:value[0, :num_detections].numpy() 
                 for key,value in output_dict.items()}
  output_dict['num_detections'] = num_detections

  # detection_classes should be ints.
  output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

  # Handle models with masks:
  if 'detection_masks' in output_dict:
    # Reframe the bbox mask to the image size.
    detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
              output_dict['detection_masks'], output_dict['detection_boxes'],
               image.shape[0], image.shape[1])      
    detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                       tf.uint8)
    output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()

  return output_dict

def show_inference(model, image_path):
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = np.array(Image.open(image_path))
  # Actual detection.
  output_dict = run_inference_for_single_image(model, image_np)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks_reframed', None),
      use_normalized_coordinates=True,
      line_thickness=8)

  name = "Image" + str(i) + ".jpg"
  img = Image.fromarray(image_np)
  plt.imsave(name,image_np)
  my_results.append(output_dict['detection_classes']) # I added this
  print(my_results) # I added this
  #img.show()

i = 1
for image_path in TEST_IMAGE_PATHS:
  show_inference(model, image_path)
  i += 1

I checked some related Stack Overflow questions, and the answers had something to do with the category index. But the code and examples used are very different from the tutorial I am following.

The line my_results.append(output_dict['detection_classes'])

gives me this output: [array([55], dtype=int64)]

How do I extract the classes of the detected objects?

1 Answer
First, import six.

Then add the following get_classes_name_and_scores method before def show_inference(model, image_path). It returns something like {'name': 'person', 'score': '91%'}:

def get_classes_name_and_scores(
        boxes,
        classes,
        scores,
        category_index,
        max_boxes_to_draw=20,
        min_score_thresh=.9):  # only keep detections above 90% confidence
    display_str = {}
    if not max_boxes_to_draw:
        max_boxes_to_draw = boxes.shape[0]
    for i in range(min(max_boxes_to_draw, boxes.shape[0])):
        if scores is None or scores[i] > min_score_thresh:
            if classes[i] in six.viewkeys(category_index):
                # Note: a plain dict holds a single entry, so only the
                # last detection above the threshold is kept.
                display_str['name'] = category_index[classes[i]]['name']
                display_str['score'] = '{}%'.format(int(100 * scores[i]))

    return display_str

Then, after the call to vis_util.visualize_boxes_and_labels_on_image_array, add:

print(get_classes_name_and_scores(
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index))
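If you want the names of every detection above the threshold (rather than a single dict, which only keeps the last match), a minimal sketch along the same lines; the helper name get_detected_class_names is my own, not part of the tutorial:

```python
def get_detected_class_names(classes, scores, category_index, min_score_thresh=0.5):
    """Return the display names of all detections scoring above the threshold."""
    names = []
    for class_id, score in zip(classes, scores):
        if score > min_score_thresh and class_id in category_index:
            names.append(category_index[class_id]['name'])
    return names

# Dummy category index with the same shape as the one returned by
# label_map_util.create_category_index_from_labelmap:
category_index = {55: {'id': 55, 'name': 'example_class'}}
print(get_detected_class_names([55], [0.95], category_index))  # -> ['example_class']
print(get_detected_class_names([55], [0.30], category_index))  # -> []
```

You can call it with output_dict['detection_classes'] and output_dict['detection_scores'] in place of the dummy lists, and append the result to my_results instead of the raw class IDs.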

2 Comments

The method seems to be valid, but in my case it only outputs empty curly braces {}. I am using a custom model and custom annotations (.pbtxt labels). Maybe the name/score keys changed names in the detection model?
My bad, it was min_score_thresh=0.9 that made the output empty. My detections were a little less confident (around 0.7), so after I adjusted the threshold accordingly I get a very nice output. Thanks!
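The threshold effect described in the comment above can be reproduced in isolation. A self-contained sketch with a dummy detection and a dummy label-map entry (the values are illustrative only; plain `in` replaces six.viewkeys, which is only needed for Python 2 compatibility):

```python
import numpy as np

# Simplified copy of the answer's helper.
def get_classes_name_and_scores(boxes, classes, scores, category_index,
                                max_boxes_to_draw=20, min_score_thresh=.9):
    display_str = {}
    if not max_boxes_to_draw:
        max_boxes_to_draw = boxes.shape[0]
    for i in range(min(max_boxes_to_draw, boxes.shape[0])):
        if scores is None or scores[i] > min_score_thresh:
            if classes[i] in category_index:
                display_str['name'] = category_index[classes[i]]['name']
                display_str['score'] = '{}%'.format(int(100 * scores[i]))
    return display_str

# One dummy detection with 75% confidence:
boxes = np.array([[0.1, 0.1, 0.5, 0.5]])
classes = np.array([55])
scores = np.array([0.75])
category_index = {55: {'id': 55, 'name': 'example_class'}}

print(get_classes_name_and_scores(boxes, classes, scores, category_index))
# -> {}  (0.75 is below the default 0.9 threshold)
print(get_classes_name_and_scores(boxes, classes, scores, category_index,
                                  min_score_thresh=0.5))
# -> {'name': 'example_class', 'score': '75%'}
```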
