Neural networks can be trained to recognize an object, then detect occurrences of that object in an image, regardless of their position and apparent size. An example of doing this in PyTorch is at https://towardsdatascience.com/object-detection-and-tracking-in-pytorch-b3cf1a696a98

As the text observes,

Most of the code deals with resizing the image to a 416px square while maintaining its aspect ratio and padding the overflow.

So the idea is that the model always deals with 416px images, both in training and in the actual object detection. Detected objects, being only part of the image, will typically be smaller than 416px, but that's okay because the model has been trained to detect patterns in a scale-invariant way. The only thing fixed is the size in pixels of the input image.
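For reference, the letterboxing step the article describes amounts to something like the following sketch (my own illustration using Pillow; the helper name letterbox and the gray fill value are assumptions, not the article's code):

```python
from PIL import Image

def letterbox(img, size=416):
    """Resize so the longer side is `size`, keeping the aspect ratio,
    then pad the shorter side to make a size x size square."""
    w, h = img.size
    scale = size / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    resized = img.resize((new_w, new_h), Image.BILINEAR)
    # Pad with mid-gray, centered; YOLO implementations commonly do this.
    canvas = Image.new("RGB", (size, size), (128, 128, 128))
    canvas.paste(resized, ((size - new_w) // 2, (size - new_h) // 2))
    return canvas
```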

I'm looking at a context in which it is necessary to do the reverse: train to detect patterns of a fixed size, then detect them in a variable-sized image. For example, train to detect patterns 10px square, then look for them in an image that could be 500px or 1000px square, without resizing the image, but with the assurance that it is only necessary to look for 10px occurrences of the pattern.

Is there an idiomatic way to do this in PyTorch?

1 Answer

Even if you trained your detector with a fixed-size image, you can use different sizes at inference time, because everything is convolutional in Faster R-CNN/YOLO architectures. And if you only care about 10×10 bounding-box detections, you can simply define those as your anchors. I would recommend the detectron2 framework, which is implemented in PyTorch and is easily configurable/hackable.
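To illustrate the "everything is convolutional" point, here is a minimal sketch (a hypothetical toy network, not Faster R-CNN/YOLO itself) whose receptive field is exactly 10×10: trained on 10×10 patches it produces a single score, and at inference the same weights accept any image size and yield a dense score map:

```python
import torch
import torch.nn as nn

# Toy fully convolutional scorer with a 10x10 receptive field
# (5x5 conv followed by 6x6 conv: 5 + (6 - 1) = 10; the final 1x1
# conv turns the features into a single objectness logit per location).
detector = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=6),
    nn.ReLU(),
    nn.Conv2d(64, 1, kernel_size=1),
)

# Training: each 10x10 patch yields exactly one logit.
patches = torch.randn(8, 3, 10, 10)
print(detector(patches).shape)   # torch.Size([8, 1, 1, 1])

# Inference: the same weights slide over any image size, giving one
# logit per 10x10 window at stride 1 (491 = 500 - 10 + 1).
image = torch.randn(1, 3, 500, 500)
print(detector(image).shape)     # torch.Size([1, 1, 491, 491])
```

And in detectron2, pinning the anchors to 10×10 is just a config change (a sketch; a single feature-map level is assumed here):

```python
from detectron2.config import get_cfg

cfg = get_cfg()
# Propose only square anchors of side 10px. SIZES takes one list per
# feature-map level; with an FPN backbone you would supply one entry
# per level instead of the single level assumed here.
cfg.MODEL.ANCHOR_GENERATOR.SIZES = [[10]]
cfg.MODEL.ANCHOR_GENERATOR.ASPECT_RATIOS = [[1.0]]
```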
