Despite the impressive advances made in recent decades, past digital image processing systems faced significant technical challenges in solving important technical problems. The digital image processing system described below helps to solve these technical challenges with regard to determining the spatial location and orientation of arbitrary objects in real-world environments. The digital image processing system performs image segmentation to accurately identify objects in an image, then locates the objects and determines their orientations.
Legal claims defining the scope of protection, as filed with the USPTO.
1. A digital image processing system comprising: a communication interface configured to communicate with an external object system; and image processing circuitry coupled to the communication interface and configured to: obtain object data through the communication interface from the external object system, the object data comprising an expected object location and expected object orientation for a specific object in an environment; obtain image data from multiple distinct viewpoints of the specific object in the environment; segment the image data to identify the specific object in the image data; from the image data, determine a measured object location for the specific object in the environment; and from the image data, determine a measured object orientation for the specific object in the environment.
2. The system of claim 1, where: the image processing circuitry is further configured to: determine a reference point for the specific object in the environment; and determine sample points offset from the reference point; and where the image data from multiple distinct viewpoints comprises: image data taken within a predetermined radius of the sample points from a camera heading toward the specific object.
3. The system of claim 2, where: the image processing circuitry is further configured to: request and receive a road graph for the environment; and where: the reference point is on a road represented in the road graph.
4. The system of claim 3, where: the reference point is on a road represented in the road graph that minimizes distance to the specific object.
5. The system of claim 1, where: the image processing circuitry is further configured to: provide the image data to a trained scene segmentation model; and obtain image masks comprising predicted object labels from the trained scene segmentation model.
6. The system of claim 1, where: the image processing circuitry is further configured to: reduce image artifacts by applying a linear model to the image masks to obtain smooth mask boundaries.
7. The system of claim 1, where: the image processing circuitry is further configured to: apply a foreground object identification model to the image masks to determine a foreground object; and remove the foreground object from the masks.
8. The system of claim 1, where: the image processing circuitry is further configured to: determine a bearing to the specific object from each sample point; and determine an intersection of the bearings as the measured object location for the specific object in the environment.
9. The system of claim 1, where: the image processing circuitry is further configured to: determine an angle of rotation and a direction of rotation from object edge height and object width as captured in the image data of the specific object; and determine the measured object orientation as a function of the direction of rotation, angle of rotation, and a camera heading associated with the image data.
10. An image processing method comprising: providing a communication interface configured to communicate with an external object system; and with image processing circuitry: obtaining object data through the communication interface from the external object system, the object data comprising an expected object location and expected object orientation for a specific object in an environment; obtaining image data from multiple distinct viewpoints of the specific object in the environment; segmenting the image data to identify the specific object in the image data; from the image data, determining a measured object location for the specific object in the environment; and from the image data, determining a measured object orientation for the specific object in the environment.
11. The method of claim 10, further comprising: determining a reference point for the specific object in the environment; and determining sample points offset from the reference point; and where the image data from multiple distinct viewpoints comprises: image data taken within a predetermined radius of the sample points from a camera heading toward the specific object.
12. The method of claim 11, further comprising: requesting and receiving a road graph for the environment; and where: the reference point is on a road represented in the road graph.
13. The method of claim 12, where: the reference point is on a road represented in the road graph that minimizes distance to the specific object.
14. The method of claim 10, further comprising: providing the image data to a trained scene segmentation model; and obtaining image masks comprising predicted object labels from the trained scene segmentation model.
15. The method of claim 10, further comprising: reducing image artifacts by applying a linear model to the image masks to obtain smooth mask boundaries.
16. The method of claim 10, further comprising: applying a foreground object identification model to the image masks to determine a foreground object; and removing the foreground object from the masks.
17. The method of claim 10, further comprising: determining a bearing to the specific object from each sample point; and determining an intersection of the bearings as the measured object location for the specific object in the environment.
18. The method of claim 10, further comprising: determining an angle of rotation and a direction of rotation from object edge height and object width as captured in the image data of the specific object; and determining the measured object orientation as a function of the direction of rotation, angle of rotation, and a camera heading associated with the image data.
19. A digital image processing system comprising: a communication interface operable to communicate with: an external object system; an external road data provider; and an external image data provider; and image processing circuitry coupled to the communication interface, the image processing circuitry configured to: obtain object data through the communication interface from the external object system, the object data comprising an expected object location and expected object orientation for a specific object in an environment; obtain a road graph for the environment through the communication interface from the external road data provider; determine a reference point for the specific object in the environment, where the reference point is on a road represented in the road graph; and determine sample points offset from the reference point; and obtain image data, through the communication interface from the external image data provider, from multiple distinct viewpoints of the specific object within a predetermined radius of the sample points; from the image data, determine a measured object location for the specific object in the environment, the measured object location comprising an intersection of bearings to the specific object from the sample points; and from the image data, determine a measured object orientation for the specific object in the environment by determining an angle of rotation and a direction of rotation, with respect to a specific camera heading, from object edge height and object width as captured in the image data of the specific object.
20. The system of claim 19, where: the image processing circuitry is further configured to: provide the image data to a trained scene segmentation model; obtain image masks comprising predicted object labels from the trained scene segmentation model; reduce image artifacts by applying a linear model to the image masks to obtain smooth mask boundaries; apply a foreground object identification model to the image masks to determine a foreground object; and remove the foreground object from the masks.
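As an illustration only, the location step recited in claims 8, 17, and 19 (taking the intersection of bearings from the sample points as the measured object location) might be sketched as follows. The claims do not specify a coordinate system or a numerical method, so everything here is an assumption: a flat local frame with x pointing east and y pointing north, bearings given in degrees clockwise from north, and a least-squares solution so that more than two slightly inconsistent bearings still yield a single point. The function names `bearing_direction` and `intersect_bearings` are hypothetical and do not come from the patent.

```python
import math


def bearing_direction(bearing_deg):
    """Return the (east, north) unit vector for a compass bearing.

    Assumption: bearings are degrees clockwise from north.
    """
    theta = math.radians(bearing_deg)
    return math.sin(theta), math.cos(theta)


def intersect_bearings(sample_points, bearings_deg):
    """Least-squares intersection of bearing lines in a flat local frame.

    Each line passes through a sample point (x, y) along its bearing.
    The returned point minimizes the sum of squared perpendicular
    distances to all lines. For two non-parallel lines this is their
    exact intersection. Lines are treated as infinite, so a crossing
    behind a camera is not rejected here.
    """
    a11 = a12 = a22 = 0.0
    b1 = b2 = 0.0
    for (x, y), bearing in zip(sample_points, bearings_deg):
        dx, dy = bearing_direction(bearing)
        # Projector onto the line's normal direction: I - d d^T
        m11 = 1.0 - dx * dx
        m12 = -dx * dy
        m22 = 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * x + m12 * y
        b2 += m12 * x + m22 * y
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel; no unique intersection")
    # Solve the symmetric 2x2 system with Cramer's rule.
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


if __name__ == "__main__":
    # Two hypothetical sample points 10 m apart, looking north-east
    # and north-west; the lines cross at approximately (5, 5).
    points = [(0.0, 0.0), (10.0, 0.0)]
    bearings = [45.0, 315.0]
    print(intersect_bearings(points, bearings))
```

A least-squares formulation is one possible design choice among several: with exactly two sample points it reduces to the plain line intersection, and with three or more it tolerates small bearing errors rather than requiring all lines to meet at one point.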
July 2, 2019
February 23, 2021