How do you use a trained neural net to identify multiple objects in an image?

Question

I've been exploring neural networks and have successfully trained a network on my own images to label individual pictures, but I don't know how to use that trained network to identify, and ideally locate, multiple objects in one image. For example, if you trained it on cats and dogs and one image contains several cats and dogs, how would you apply the trained network and return their locations in the image?

Here is the main tutorial I followed for implementation in Python: http://machinelearningmastery.com/object-recognition-convolutional-neural-networks-keras-deep-learning-library/

A general answer would suffice: is a sliding window over the image the best solution, or is there something easier?

A specific example (particularly in Python) would be appreciated. I've been using matplotlib for most of the image work, so I'd prefer to stay away from PIL slicing.

Thanks!

Enigma · Accepted Answer · 2016-11-23 18:36:42Z


Since you want to use your existing trained network:

Brute-force sliding window: if you don't know the size and location of the objects in the image, you will have to process many windows, sliding by some number of pixels that depends on the image size. Each window may produce a different outcome, and only one or a few of those outcomes are the results you actually want, so the complexity grows quickly. It is also hard to pick the real detections out of the many candidates.
Preprocessing: images can be preprocessed before being fed to the network. For instance, take an image with a monkey and a snake and compute its energy (e.g. with a Sobel filter). The monkey's footprint in the image is more like a round balloon (more area), while the snake's is thread-like (less area). Based on this, a Python script can crop the image to that particular region and feed the crop to the network. You can think of other preprocessing techniques as well.
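
To make the brute-force sliding window option concrete, here is a minimal sketch using only NumPy slicing (no PIL), which works on the arrays that matplotlib's imread returns. The names sliding_windows, detect and classify are hypothetical, and the window size, stride and threshold are arbitrary values chosen for illustration. classify stands in for your trained network: it is assumed to take one patch and return a 1-D array of class probabilities.

```python
import numpy as np


def sliding_windows(image, window=(64, 64), stride=32):
    """Yield (x, y, patch) for every window position that fits in the image."""
    win_h, win_w = window
    height, width = image.shape[:2]
    for y in range(0, height - win_h + 1, stride):
        for x in range(0, width - win_w + 1, stride):
            # Plain NumPy slicing: rows are y, columns are x.
            yield x, y, image[y:y + win_h, x:x + win_w]


def detect(image, classify, window=(64, 64), stride=32, threshold=0.9):
    """Run `classify` on every window and keep the confident hits.

    `classify` is a hypothetical stand-in for the trained network: it takes
    one patch and returns a 1-D array of class probabilities.
    """
    win_h, win_w = window
    hits = []
    for x, y, patch in sliding_windows(image, window, stride):
        probs = np.asarray(classify(patch))
        label = int(np.argmax(probs))
        score = float(probs[label])
        if score >= threshold:
            # Box format: (x, y, width, height, class index, score).
            hits.append((x, y, win_w, win_h, label, score))
    return hits
```

With a Keras model, classify could be something like `lambda p: model.predict(p[np.newaxis])[0]`, assuming each patch already has the size, scaling and channel order the model was trained on. Neighbouring windows will usually fire on the same object, so the hits still need to be merged (for example by keeping only the highest-scoring box among overlapping ones), and objects of unknown size require repeating the scan at several window sizes. That is where the cost mentioned above comes from.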

If you are open to other networks, check out CRF as Recurrent Neural Networks (CRF-RNN), for example: https://github.com/torrvision/crfasrnn

Hope this helps.


1 Comment

Beutler Over a year ago

Thanks NKU - I read about the sliding window approach before, and yes, the complexity and processing requirements seem unrealistic. I will look at other preprocessing techniques to limit the computation time.
