A computer-implemented method is provided for image-based, self-guided object detection. The method includes receiving, by a processor device, a set of images. Each of the images has a respective grid thereon that is labeled regarding a respective object to be detected using grid level label data. The method further includes training, by the processor device, a grid-based object detector using the grid level label data. The method also includes determining, by the processor device, a respective bounding box for the respective object in each of the images, by applying local segmentation to each of the images. The method additionally includes training, by the processor device, a Region-based Convolutional Neural Network (RCNN) for joint object localization and object classification using the respective bounding box for the respective object in each of the images as an input to the RCNN.
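To make the data flow in the abstract concrete, the following is a minimal sketch of how one labeled grid cell might be turned into a bounding box. It is not the patented method: the function name `box_from_grid_cell`, the `threshold` parameter, and the use of simple intensity thresholding as a stand-in for the local segmentation step are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates
Box = Tuple[int, int, int, int]


def box_from_grid_cell(
    image: List[List[int]], cell: Box, threshold: int
) -> Optional[Box]:
    """Fit a tight box around above-threshold pixels inside one labeled cell.

    Thresholding is a hypothetical stand-in for local segmentation.
    Returns None when no pixel in the cell exceeds the threshold.
    """
    x0, y0, x1, y1 = cell
    xs: List[int] = []
    ys: List[int] = []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if image[y][x] > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 6x6 grayscale image containing a bright object 3 pixels wide, 2 tall.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 9, 9, 0],
    [0, 0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# A single grid cell carrying the grid level label "object present".
labeled_cell: Box = (1, 1, 5, 5)
box = box_from_grid_cell(image, labeled_cell, threshold=5)
print(box)  # (2, 2, 4, 3)
```

In a pipeline of the kind the abstract describes, boxes obtained this way would then serve as training input for the region-based detector in place of hand-drawn annotations.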
Legal claims defining the scope of protection, as filed with the USPTO.
1. A computer-implemented method for image-based, self-guided object detection, comprising: receiving, by a processor device, a set of input training images, each of the input training images having a respective grid thereon formed from a plurality of pixels that is labeled using a single label as grid level label data regarding a respective object to be detected; training, by the processor device, a grid-based object detector using the grid level label data; determining, by the processor device, a respective bounding box for the respective object in each of the input training images, by applying local segmentation to each of the input training images; and training, by the processor device, a Region-based Convolutional Neural Network (RCNN) for joint object localization and object classification using the respective bounding box for the respective object in each of the input training images as an input to the RCNN.
2. The computer-implemented method of claim 1, further comprising performing an action responsive to the object localization and object classification for a respective new object in a new image to which the RCNN is applied.
3. The computer-implemented method of claim 2, wherein the action comprises autonomously controlling a motor vehicle to avoid a collision with the new object responsive to the object localization and object classification for the respective new object.
4. The computer-implemented method of claim 1, wherein the local segmentation is performed using a self-similarity search and template matching to provide the respective bounding box around the respective object in the set of input training images.
5. The computer-implemented method of claim 1, wherein the local segmentation is applied to each of the input training images to segment a respective target region therein.
6. The computer-implemented method of claim 1, wherein the Region-based Convolutional Neural Network (RCNN) forms a model during an object training stage that is to detect objects in new images during an inference stage.
7. The computer-implemented method of claim 1, wherein the method is performed by a system selected from the group consisting of a surveillance system, a face detection system, a face recognition system, a cancer detection system, an object tracking system, and an Advanced Driver-Assistance System.
8. The computer-implemented method of claim 1, wherein the respective bounding box is determined by the local segmentation being performed using a self-similarity search and template matching approach to fit the bounding box around the respective object.
9. A computer program product for image-based, self-guided object detection, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising: receiving, by a processor device, a set of input training images, each of the input training images having a respective grid thereon formed from a plurality of pixels that is labeled using a single label as grid level label data regarding a respective object to be detected; training, by the processor device, a grid-based object detector using the grid level label data; determining, by the processor device, a respective bounding box for the respective object in each of the input training images, by applying local segmentation to each of the input training images; and training, by the processor device, a Region-based Convolutional Neural Network (RCNN) for joint object localization and object classification using the respective bounding box for the respective object in each of the input training images as an input to the RCNN.
10. The computer program product of claim 9, wherein the method further comprises performing an action responsive to the object localization and object classification for a respective new object in a new image to which the RCNN is applied.
11. The computer program product of claim 10, wherein the action comprises autonomously controlling a motor vehicle to avoid a collision with the new object responsive to the object localization and object classification for the respective new object.
12. The computer program product of claim 9, wherein the local segmentation is performed using a self-similarity search and template matching to provide the respective bounding box around the respective object in the set of input training images.
13. The computer program product of claim 9, wherein the local segmentation is applied to each of the input training images to segment a respective target region therein.
14. The computer program product of claim 9, wherein the Region-based Convolutional Neural Network (RCNN) forms a model during an object training stage that is to detect objects in new images during an inference stage.
15. The computer program product of claim 9, wherein the method is performed by a system selected from the group consisting of a surveillance system, a face detection system, a face recognition system, a cancer detection system, an object tracking system, and an Advanced Driver-Assistance System.
16. A computer processing system for image-based, self-guided object detection, comprising: a memory device for storing program code; and a processor device for running the program code to receive a set of input training images, each of the input training images having a respective grid thereon formed from a plurality of pixels that is labeled using a single label as grid level label data regarding a respective object to be detected; train a grid-based object detector using the grid level label data; determine a respective bounding box for the respective object in each of the input training images, by applying local segmentation to each of the images; and train a Region-based Convolutional Neural Network (RCNN) for joint object localization and object classification using the respective bounding box for the respective object in each of the input training images as an input to the RCNN.
17. The computer processing system of claim 16, wherein the processor device further runs the program code to perform an action responsive to the object localization and object classification for a respective new object in a new image to which the RCNN is applied.
18. The computer processing system of claim 17, wherein the action comprises autonomously controlling a motor vehicle to avoid a collision with the new object responsive to the object localization and object classification for the respective new object.
19. The computer processing system of claim 16, wherein the local segmentation is performed using a self-similarity search and template matching to provide the respective bounding box around the respective object in the set of input training images.
20. The computer processing system of claim 16, wherein the computer processing system is comprised in a system selected from the group consisting of a surveillance system, a face detection system, a face recognition system, a cancer detection system, an object tracking system, and an Advanced Driver-Assistance System.
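Several dependent claims recite template matching as part of fitting the bounding box. As a rough illustration only, the sketch below slides a template over an image and returns the box at the best match. The claims do not specify a similarity measure, so the sum of squared differences used here is an assumption, as are the function name `match_template` and the toy data; the self-similarity search that the claims pair with template matching is not modeled.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates
Box = Tuple[int, int, int, int]


def match_template(image: List[List[int]], template: List[List[int]]) -> Box:
    """Return the box where the template best matches the image.

    Uses an exhaustive sliding-window search scored by the sum of squared
    differences (lower is better). The scoring choice is an assumption.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_score = None
    best_xy = (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            score = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            if best_score is None or score < best_score:
                best_score, best_xy = score, (x, y)
    x, y = best_xy
    return (x, y, x + tw - 1, y + th - 1)


# Toy 5x5 image with a 2x2 pattern whose top-left corner is at x=2, y=1.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 4, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [1, 2],
    [3, 4],
]

matched_box = match_template(image, template)
print(matched_box)  # (2, 1, 3, 2)
```

An exhaustive search like this is quadratic in image size and is shown only for clarity; a practical system would likely use an optimized library routine.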
October 11, 2018
March 23, 2021