Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A computer-implemented method for detecting one or more object contours in an image comprising: receiving a feature map, wherein the feature map encodes high level features of the one or more object contours detected in the image with respect to a plurality of grid cells, and further encodes respective start locations of each of the one or more object contours with respect to the plurality of grid cells; selecting a grid cell from among the plurality of grid cells including at least one of the respective start locations of at least one of the one or more object contours; determining a location of the at least one start location within the selected grid cell; determining a set of feature values from a subset of the plurality of grid cells of the feature map within a proximity threshold of the selected grid cell; processing the location and the set of feature values using a machine learning network to output a displacement vector to indicate a next coordinate of the at least one object contour; and updating a cursor of the machine learning network based on the displacement vector.
This invention relates to computer vision techniques for detecting object contours in images. The method addresses the challenge of accurately identifying and tracing object boundaries in digital images, which is crucial for applications like object recognition, segmentation, and image analysis. The approach leverages a feature map that encodes high-level features of object contours and their starting locations relative to a grid of cells. The grid cells represent spatial regions within the image, and the feature map captures both the presence of contours and their initial positions. The method begins by selecting a grid cell that contains a start location of an object contour. It then determines the precise position of the contour's start point within that cell. Next, it gathers feature values from neighboring grid cells within a defined proximity threshold. These features, along with the start location, are processed by a machine learning network, which outputs a displacement vector. This vector indicates the next coordinate along the object contour. The process iteratively updates the cursor position based on the displacement vector, effectively tracing the contour step-by-step. The machine learning network is trained to predict these vectors, enabling accurate contour detection and tracking. This approach improves contour detection by combining spatial grid-based features with learned displacement predictions, enhancing precision and robustness in identifying object boundaries.
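The single tracing step described above can be sketched as follows. This is a minimal illustration rather than the patented implementation. The names `neighborhood` and `trace_step`, the nested-list layout of the feature map, the use of a square window as the "proximity threshold", and the `predict` callable standing in for the machine learning network are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Cell = Tuple[int, int]          # (row, col) index of a grid cell
Offset = Tuple[float, float]    # (x, y) position inside a cell, as fractions of the cell size
FeatureMap = List[List[List[float]]]  # feature_map[row][col] -> feature vector (assumed layout)


def neighborhood(feature_map: FeatureMap, cell: Cell, radius: int) -> List[float]:
    """Collect feature values from all cells within `radius` cells of `cell`.

    The square window is an assumed interpretation of the proximity threshold.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    r0, c0 = cell
    values: List[float] = []
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            values.extend(feature_map[r][c])
    return values


def trace_step(
    feature_map: FeatureMap,
    cell: Cell,
    offset: Offset,
    cell_size: float,
    radius: int,
    predict: Callable[[Offset, Sequence[float]], Tuple[float, float]],
) -> Tuple[float, float]:
    """Run one step: gather nearby features, predict a displacement, move the cursor."""
    features = neighborhood(feature_map, cell, radius)
    dx, dy = predict(offset, features)  # placeholder for the machine learning network
    # Convert the in-cell start location to image coordinates, then apply the displacement.
    x = (cell[1] + offset[0]) * cell_size + dx
    y = (cell[0] + offset[1]) * cell_size + dy
    return (x, y)
```

Calling `trace_step` with a trained model in place of `predict` would return the updated cursor position in image coordinates. Here the displacement is assumed to be expressed in pixels.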
2. The method of claim 1, wherein the feature map is generated by a convolutional neural network.
This claim narrows the method of claim 1 by specifying where the feature map comes from: it is generated by a convolutional neural network (CNN). A CNN applies a series of convolutional layers with learnable filters to the input image, extracting hierarchical features that range from edges and textures to higher-level structures. In the context of claim 1, the resulting feature map encodes the high-level contour features and the contour start locations for each grid cell, and it is this map that the contour-tracing steps consume. The claim does not specify the CNN's architecture, preprocessing, or training procedure.
3. The method of claim 1, wherein the high level features of the one or more object contours include a location of the one or more object contours, a slope of the one or more object contours, a characteristic of the one or more object contours, or a combination thereof.
This invention relates to object contour analysis in image processing, specifically extracting and utilizing high-level features from object contours to improve detection, recognition, or tracking in images or video. The problem addressed is the need for more robust and informative representations of object contours beyond basic pixel-level data, which can enhance accuracy in computer vision tasks. The method involves analyzing one or more object contours detected in an image or video frame to extract high-level features. These features include the spatial location of the contours, their slope or orientation, and specific characteristics such as curvature, length, or shape descriptors. The extracted features are then used to improve object recognition, classification, or tracking by providing a more detailed and structured representation of the object's shape and structure. This approach helps distinguish between similar objects or handle variations in appearance due to lighting, occlusion, or perspective changes. The method may be applied in various computer vision applications, including autonomous navigation, surveillance, medical imaging, or industrial inspection, where accurate object contour analysis is critical. By leveraging these high-level features, the system can achieve more reliable and efficient processing of visual data.
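One way to picture the per-cell features is as a small record attached to every grid cell. The sketch below is hypothetical: the field names, the confidence values, and the `start_cells` helper are assumptions chosen for illustration, and the claims do not prescribe any particular encoding.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CellFeatures:
    """Hypothetical per-cell encoding of the high-level contour features."""
    contour_present: float        # confidence in [0, 1] that a contour crosses this cell
    offset: Tuple[float, float]   # contour location inside the cell, as fractions of cell size
    slope: float                  # local contour direction, in radians
    is_start: float               # confidence in [0, 1] that a contour starts in this cell
    kind: int                     # index of a contour characteristic (assumed label set)

    def to_vector(self) -> List[float]:
        """Flatten the record into the numeric vector a network would consume."""
        return [
            self.contour_present,
            self.offset[0],
            self.offset[1],
            self.slope,
            self.is_start,
            float(self.kind),
        ]


def start_cells(grid: List[List[CellFeatures]], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, col) indices of cells likely to contain a contour start."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell.is_start >= threshold
    ]
```

A grid of such records gives both the features the tracing step reads and the candidate cells from which tracing can begin.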
4. The method of claim 1, wherein the machine learning network is a recurrent neural network, the method further comprising: instantiating the recurrent neural network at the at least one start location within the selected grid cell to output the displacement vector to indicate the next coordinate.
This claim narrows the method of claim 1 by specifying that the machine learning network is a recurrent neural network (RNN). The RNN is instantiated at the contour's start location within the selected grid cell. From that position it processes the start location and the nearby feature values and outputs the displacement vector, which points to the next coordinate along the object contour. Because an RNN carries an internal state from one step to the next, it is suited to tracing a contour as a sequence of points, where each prediction can depend on the path traced so far.
5. The method of claim 4, wherein the recurrent neural network is instantiated at the next coordinate to continue to iteratively determine another next coordinate in the at least one object contour.
This invention relates to computer vision and object contour detection using recurrent neural networks (RNNs). The problem addressed is the accurate and efficient tracing of object contours in digital images, which is challenging due to variations in shape, lighting, and noise. The method involves using a recurrent neural network to iteratively determine coordinates along an object contour. The RNN is initialized at a starting coordinate on the contour and predicts the next coordinate in the sequence. Once a next coordinate is determined, the RNN is re-instantiated at that new position to continue the process, allowing the network to trace the entire contour step-by-step. This approach leverages the RNN's ability to model sequential dependencies, improving contour detection accuracy by maintaining contextual information across iterations. The method may include preprocessing the input image to enhance contour features, such as edge detection or noise reduction. The RNN may be trained on labeled contour data to learn the relationship between coordinates in an object's boundary. The system can handle closed contours by detecting when the traced path returns to the starting point, ensuring complete contour extraction. This technique is particularly useful in applications like medical imaging, autonomous navigation, and industrial inspection, where precise object boundary detection is critical. The iterative RNN-based approach provides a robust solution for contour tracing in complex scenes.
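The role of the recurrent state can be illustrated with a toy example. The sketch below uses a hand-rolled Elman-style cell with fixed placeholder weights; it is not a trained network and not the patent's architecture. The class `TinyRecurrentCell`, the `direction_at` callable that supplies a local direction hint at the cursor, and the choice to read the displacement directly from the hidden state are all assumptions.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]


class TinyRecurrentCell:
    """Toy Elman-style cell. The weights are fixed placeholders, not learned values."""

    def __init__(self, w_in: float = 0.5, w_rec: float = 0.5) -> None:
        self.w_in = w_in
        self.w_rec = w_rec

    def step(self, inputs: Point, hidden: Point) -> Tuple[Point, Point]:
        """Blend the new input with the previous hidden state."""
        hx = math.tanh(self.w_in * inputs[0] + self.w_rec * hidden[0])
        hy = math.tanh(self.w_in * inputs[1] + self.w_rec * hidden[1])
        new_hidden = (hx, hy)
        # In this sketch the displacement is read directly from the hidden state.
        return new_hidden, new_hidden


def trace(
    start: Point,
    direction_at: Callable[[Point], Point],
    cell: TinyRecurrentCell,
    steps: int,
) -> List[Point]:
    """Apply the cell at the start, then again at each next coordinate."""
    cursor = start
    hidden: Point = (0.0, 0.0)
    path = [cursor]
    for _ in range(steps):
        hidden, (dx, dy) = cell.step(direction_at(cursor), hidden)
        cursor = (cursor[0] + dx, cursor[1] + dy)
        path.append(cursor)
    return path
```

Because the hidden state is passed from one call to the next, each displacement reflects the directions seen earlier on the path as well as the features at the current cursor.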
6. The method of claim 1, further comprising: processing the location and the set of feature values to output state information regarding the at least one object contour.
This claim adds a second output to the method of claim 1. The same inputs used to predict the displacement vector, namely the start location within the selected grid cell and the feature values gathered from nearby grid cells of the feature map, are also processed to output state information about the object contour being traced. The claim does not define the state information further; dependent claim 7 specifies that it includes a contour line start, a contour line stop, a type of contour line, or a combination of these. State information of this kind lets the tracing process know, for instance, when a contour begins or ends, rather than relying on the displacement vector alone.
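A step that returns state information alongside the displacement vector might be decoded as in the sketch below. The names `ContourEvent`, `StepOutput`, and `decode_step` are hypothetical, as is the score-based representation of the raw network outputs. The `CONTINUE` value is an added assumption for steps that are neither a start nor a stop.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Sequence, Tuple


class ContourEvent(Enum):
    START = "start"
    CONTINUE = "continue"   # assumed extra state, not named in the claims
    STOP = "stop"


# Assumed ordering of the event scores produced by the network.
EVENT_ORDER = (ContourEvent.START, ContourEvent.CONTINUE, ContourEvent.STOP)


@dataclass(frozen=True)
class StepOutput:
    displacement: Tuple[float, float]
    event: ContourEvent
    contour_type: str


def decode_step(
    dx: float,
    dy: float,
    event_scores: Sequence[float],
    type_scores: Dict[str, float],
) -> StepOutput:
    """Pick the highest-scoring event and contour type for this step."""
    best_event = max(range(len(EVENT_ORDER)), key=lambda i: event_scores[i])
    best_type = max(type_scores, key=type_scores.get)
    return StepOutput((dx, dy), EVENT_ORDER[best_event], best_type)
```

A tracing loop could then stop when `event` is `ContourEvent.STOP` and record `contour_type` with the traced points.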
7. The method of claim 6, wherein the state information includes a contour line start, a contour line stop, a type of contour line, or a combination thereof.
This claim specifies what the state information of claim 6 contains. A contour line here is the boundary of an object traced in an image under claim 1. The state information includes a contour line start, a contour line stop, a type of contour line, or any combination of these. A start or stop indication marks where a traced contour begins or ends, and a type indication classifies the contour. For the lane lines covered by the apparatus claims, the type might distinguish, for example, solid from dashed markings.
8. The method of claim 1, further comprising: using the machine learning network to iteratively generate another displacement vector based on the next coordinate and another set of feature values corresponding to the next coordinate to indicate another next coordinate of the at least one object contour until an end of the at least one object contour is reached.
This invention relates to machine learning-based contour tracking in images or video frames. The problem addressed is the accurate and efficient tracing of object contours, which is challenging due to variations in shape, lighting, and noise. The solution involves a machine learning network trained to predict displacement vectors that guide the tracking process along an object's contour. The method begins by initializing a starting coordinate on the object contour. The machine learning network processes feature values extracted from the image at this coordinate to generate a displacement vector, which indicates the next coordinate along the contour. This process is repeated iteratively, using the next coordinate and its corresponding feature values to generate subsequent displacement vectors, effectively stepping along the contour. The iteration continues until the end of the contour is detected, ensuring the entire contour is traced. The machine learning network is trained to generalize across different object shapes and imaging conditions, improving robustness. The iterative approach allows for real-time or near-real-time contour tracking, which is useful in applications like medical imaging, autonomous navigation, and quality inspection. The system adapts dynamically to contour variations, reducing the need for manual adjustments or pre-defined templates.
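The iteration described above amounts to a loop that repeats the prediction at each new coordinate until an end condition is met. The sketch below assumes a `predict` callable that returns both a displacement and an end-of-contour flag, a `features_at` callable that looks up feature values for a coordinate, and a `max_steps` safety limit. None of these names come from the claims.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def trace_contour(
    start: Point,
    features_at: Callable[[Point], Sequence[float]],
    predict: Callable[[Point, Sequence[float]], Tuple[Point, bool]],
    max_steps: int = 1000,
) -> List[Point]:
    """Follow displacement vectors from `start` until the model signals the end."""
    cursor = start
    path = [cursor]
    for _ in range(max_steps):  # assumed guard against a model that never signals the end
        displacement, at_end = predict(cursor, features_at(cursor))
        if at_end:
            break
        cursor = (cursor[0] + displacement[0], cursor[1] + displacement[1])
        path.append(cursor)
    return path
```

The returned list is the traced contour as an ordered sequence of coordinates, beginning with the start location.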
9. The method of claim 8, wherein the feature map includes one or more intermediate layers including respective grid cells at different respective scales, and wherein the set of feature values also includes the one or more intermediate layers at the different respective scales.
This invention relates to a method for processing feature maps in machine learning, particularly for tasks like object detection or image segmentation. The method addresses the challenge of efficiently extracting and utilizing multi-scale features from input data, which is critical for capturing both fine-grained details and broader contextual information. The method involves generating a feature map from input data, where the feature map includes one or more intermediate layers. Each intermediate layer is composed of grid cells, and these layers are processed at different scales to capture features at varying levels of granularity. The feature values extracted from the feature map include these intermediate layers at their respective scales, allowing the system to leverage multi-scale information for improved accuracy in tasks like object detection or segmentation. By incorporating intermediate layers at different scales, the method enables the model to retain both high-resolution details and broader contextual features, enhancing performance in applications requiring multi-scale analysis. The approach is particularly useful in convolutional neural networks (CNNs) or other deep learning architectures where hierarchical feature extraction is essential. The method ensures that the feature map retains a rich representation of the input data across multiple scales, improving the model's ability to recognize and classify objects or regions within the input.
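Gathering feature values across scales can be sketched as a lookup into several grids, each with its own cell size. The representation of the feature map as a list of `(cell_size, grid)` pairs and the names `cell_index` and `gather_multiscale` are assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Grid = List[List[List[float]]]  # grid[row][col] -> feature vector (assumed layout)


def cell_index(point: Point, cell_size: float) -> Tuple[int, int]:
    """Map an image coordinate (x, y) to the (row, col) of the cell that contains it."""
    x, y = point
    return (int(y // cell_size), int(x // cell_size))


def gather_multiscale(pyramid: List[Tuple[float, Grid]], point: Point) -> List[float]:
    """Concatenate the feature vectors found at `point` in every scale of the pyramid."""
    values: List[float] = []
    for cell_size, grid in pyramid:
        r, c = cell_index(point, cell_size)
        # Clamp to the grid so coordinates on the image border remain valid.
        r = min(max(r, 0), len(grid) - 1)
        c = min(max(c, 0), len(grid[0]) - 1)
        values.extend(grid[r][c])
    return values
```

The concatenated vector gives the network both the coarse context and the fine detail around the current coordinate in a single input.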
10. The method of claim 9, further comprising: processing the set of feature values at a first one of the one or more intermediate layers corresponding to a coarser one of the different respective scales to determine a general direction or a general location of the next coordinate; and processing the set of feature values at a second one of the one or more intermediate layers corresponding to a finer one of the different respective scales to determine the displacement vector or the next coordinate exactly.
This invention relates to a method for processing feature values in a multi-scale neural network to determine a displacement vector or a next coordinate. The method addresses the challenge of accurately identifying precise locations or movements in a hierarchical feature representation, where features are extracted at different scales of resolution. The approach involves analyzing feature values at multiple intermediate layers of the network, each corresponding to different scales of resolution. A first intermediate layer, associated with a coarser scale, is used to determine a general direction or location of the next coordinate. This provides a broad estimate, reducing the search space for finer details. A second intermediate layer, associated with a finer scale, then refines this estimate to determine the exact displacement vector or next coordinate. This two-step process improves accuracy by leveraging both coarse and fine-grained feature information, ensuring precise localization while maintaining computational efficiency. The method is particularly useful in applications requiring high-resolution spatial or temporal tracking, such as object detection, image segmentation, or motion prediction.
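One possible reading of the coarse-then-fine step is sketched below: the coarser layer selects a general direction as one of several angular sectors, and the finer layer supplies a small angular correction and a step length. This decomposition, the function name `coarse_to_fine`, and its parameters are illustrative assumptions and not the patent's stated formulation.

```python
import math
from typing import Sequence, Tuple


def coarse_to_fine(
    sector_scores: Sequence[float],
    residual_angle: float,
    step_length: float,
) -> Tuple[float, float]:
    """Combine a coarse direction choice with a fine correction into a displacement.

    sector_scores  : one score per angular sector, assumed to come from the coarser layer
    residual_angle : angular correction in radians, assumed to come from the finer layer
    step_length    : distance to move, assumed to come from the finer layer
    """
    n = len(sector_scores)
    sector_width = 2.0 * math.pi / n
    best = max(range(n), key=lambda i: sector_scores[i])
    # Keep the fine correction inside the sector chosen by the coarse layer.
    half = sector_width / 2.0
    residual = min(max(residual_angle, -half), half)
    theta = best * sector_width + residual
    return (step_length * math.cos(theta), step_length * math.sin(theta))
```

With four sectors and the highest score on the second one, the result points along the positive y axis, and the residual angle then nudges the direction within that sector.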
11. An apparatus for detecting one or more lane lines in an image comprising: at least one processor; and at least one memory including computer program code for one or more programs, the at least one memory and the computer program code configured to, within the at least one processor, cause the apparatus to perform at least the following, receive a feature map, wherein the feature map encodes high level features of the one or more lane lines detected in the image with respect to a plurality of grid cells, and further encodes respective start locations of each of the one or more lane lines with respect to the plurality of grid cells; select a grid cell from among the plurality of grid cells including at least one of the respective start locations of at least one of the one or more lane lines; determine a location of the at least one start location within the selected grid cell; determine a set of feature values from a subset of the plurality of grid cells of the feature map within a proximity threshold of the selected grid cell; process the location and the set of feature values using a machine learning network to output a displacement vector to indicate a next coordinate of the at least one lane line; and update a cursor of the machine learning network based on the displacement vector.
This invention relates to a system for detecting lane lines in images, particularly for autonomous driving or advanced driver-assistance systems. The problem addressed is accurately identifying and tracking lane lines in real-time from camera images, which is challenging due to varying road conditions, lighting, and occlusions. The apparatus includes at least one processor and memory storing computer program code. The system receives a feature map derived from an input image, where the feature map encodes high-level features of lane lines and their start locations relative to a grid of cells. The system selects a grid cell containing a lane line start location and determines its precise position within that cell. It then extracts feature values from neighboring grid cells within a defined proximity threshold. These features and the start location are processed by a machine learning network, which outputs a displacement vector predicting the next coordinate of the lane line. The system updates a cursor position based on this vector, allowing iterative tracking of the lane line's path. This approach improves lane detection accuracy by leveraging spatial context and machine learning-based prediction.
12. The apparatus of claim 11, wherein the machine learning network is a recurrent neural network, and wherein the apparatus is further caused to: instantiate the recurrent neural network at the at least one start location within the selected grid cell to output the displacement vector to indicate the next coordinate, wherein the recurrent neural network is instantiated at the next coordinate to continue to iteratively determine another next coordinate in the at least one lane line.
This invention relates to a system for detecting lane lines in images using a recurrent neural network (RNN). The problem addressed is the accurate and efficient detection of lane lines in visual data, which is critical for autonomous driving and advanced driver-assistance systems. Traditional methods often struggle with varying road conditions, occlusions, and complex environments. The apparatus includes a machine learning network, specifically a recurrent neural network, designed to process grid cells representing image regions. The RNN is instantiated at a start location within a selected grid cell to output a displacement vector, which indicates the next coordinate along a lane line. The process is iterative, with the RNN continuing to determine subsequent coordinates by being instantiated at each new coordinate, effectively tracing the lane line step-by-step. This approach leverages the RNN's ability to maintain state information across iterations, improving accuracy in tracking continuous lane structures. The system may also include preprocessing steps to generate the grid cells and select initial start locations, ensuring robust lane detection under varying conditions. The invention enhances lane detection by combining spatial grid analysis with sequential learning, improving reliability in dynamic driving scenarios.
13. The apparatus of claim 11, wherein the apparatus is further caused to: process the location and the set of feature values to output state information regarding the at least one lane line, wherein the state information includes a line start, a line stop, a type of line, or a combination thereof.
This claim adds a state output to the lane line apparatus of claim 11, which is aimed at applications such as autonomous or assisted driving. The apparatus processes the same inputs used to predict the displacement vector, namely the start location within the selected grid cell and the feature values from nearby grid cells, to output state information about the lane line. The state information includes a line start, a line stop, a type of line, or a combination of these. The start and stop indicate where a lane line begins and ends, and the type classifies the marking (for example, solid or dashed). This lets the system trace and classify lane lines together, which supports navigation, lane-keeping, and other driving assistance functions.
14. The apparatus of claim 11, wherein the feature map is generated by a convolutional neural network, and wherein the feature map includes one or more intermediate layers including respective grid cells at different respective scales, and wherein the set of feature values also includes the one or more intermediate layers at the different respective scales.
The invention relates to image processing using convolutional neural networks (CNNs) for generating feature maps. The problem addressed is the need for efficient and scalable feature extraction from images at multiple resolutions. The apparatus includes a CNN that processes an input image to produce a feature map. The feature map consists of multiple intermediate layers, each containing grid cells at different scales. These intermediate layers capture hierarchical features of the input image, allowing the system to analyze the image at various levels of detail. The feature values extracted from the feature map include data from these intermediate layers, enabling the apparatus to leverage multi-scale information for tasks such as object detection, classification, or segmentation. By incorporating multiple scales, the system improves robustness and accuracy in identifying and processing image features. The apparatus is designed to enhance computational efficiency while maintaining high performance in image analysis applications.
15. The apparatus of claim 14, wherein the apparatus is further caused to: process the set of feature values at a first one of the one or more intermediate layers corresponding to a coarser one of the different respective scales to determine a general direction or a general location of the next coordinate; and process the set of feature values at a second one of the one or more intermediate layers corresponding to a finer one of the different respective scales to determine the displacement vector or the next coordinate exactly.
This invention relates to a machine learning apparatus for processing feature values at multiple scales to determine precise coordinates or displacement vectors. The apparatus operates within a neural network framework, where feature values extracted from input data are processed at different hierarchical layers corresponding to varying scales of resolution. The apparatus first processes these feature values at a coarser scale layer to estimate a general direction or approximate location of a target coordinate. This coarse estimation provides a broad contextual understanding, reducing the search space for finer adjustments. Subsequently, the apparatus refines this estimate by processing the same feature values at a finer scale layer, which enables precise determination of the exact displacement vector or coordinate. This multi-scale approach improves accuracy by leveraging both high-level contextual information and fine-grained details, making it suitable for tasks requiring precise spatial or positional predictions, such as object detection, image segmentation, or pose estimation. The apparatus dynamically adjusts its processing based on the scale of the intermediate layers, ensuring efficient and accurate coordinate determination.
16. A non-transitory computer-readable storage medium for detecting one or more object contours in an image, carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause an apparatus to perform: receiving a feature map, wherein the feature map encodes high level features of the one or more object contours detected in the image with respect to a plurality of grid cells, and further encodes respective start locations of each of the one or more object contours with respect to the plurality of grid cells; selecting a grid cell from among the plurality of grid cells including at least one of the respective start locations of at least one of the one or more object contours; determining a location of the at least one start location within the selected grid cell; determining a set of feature values from a subset of the plurality of grid cells of the feature map within a proximity threshold of the selected grid cell; processing the location and the set of feature values using a machine learning network to output a displacement vector to indicate a next coordinate of the at least one object contour; and updating a cursor of the machine learning network based on the displacement vector.
This invention relates to computer vision techniques for detecting object contours in images using machine learning. The problem addressed is the accurate and efficient extraction of object boundaries from images, which is challenging due to variations in object shapes, lighting conditions, and background clutter. The solution involves a machine learning-based approach that processes a feature map encoding high-level features of object contours and their start locations relative to a grid of cells. The system selects a grid cell containing a contour start location, determines the precise position of the start within that cell, and gathers feature values from neighboring grid cells within a defined proximity. These features are then processed by a machine learning network to predict a displacement vector, which indicates the next coordinate along the object contour. The network's cursor is updated iteratively using these vectors to trace the full contour. This method improves contour detection by leveraging spatial relationships between grid cells and refining predictions at each step, enhancing accuracy and robustness in complex scenes. The approach is particularly useful in applications requiring precise object segmentation, such as medical imaging, autonomous navigation, and augmented reality.
17. The non-transitory computer-readable storage medium of claim 16, wherein the machine learning network is a recurrent neural network, and wherein the apparatus is caused to further perform: instantiating the recurrent neural network at the at least one start location within the selected grid cell to output the displacement vector to indicate the next coordinate, wherein the recurrent neural network is instantiated at the next coordinate to continue to iteratively determine another next coordinate in the at least one lane line.
This claim narrows the storage medium of claim 16 by specifying that the machine learning network is a recurrent neural network (RNN). The stored instructions cause the apparatus to instantiate the RNN at the start location within the selected grid cell, where it outputs a displacement vector indicating the next coordinate. The RNN is then instantiated at that next coordinate and the process repeats, so the line is traced one point at a time. The RNN's recurrent nature lets it maintain contextual information across iterations, so each prediction can draw on the path traced so far. Note that claim 16 is directed to object contours in general, while the final clause of this claim refers to a lane line, as in apparatus claims 11 and 12.
18. The non-transitory computer-readable storage medium of claim 16, wherein the apparatus is caused to further perform: processing the location and the set of feature values to output state information regarding the at least one object contour, wherein the state information includes a contour line start, a contour line stop, a type of contour line, or a combination thereof.
This invention relates to computer vision and object contour analysis, specifically improving the detection and classification of object contours in digital images. The technology addresses challenges in accurately identifying and characterizing object boundaries, which are critical for applications like image segmentation, object recognition, and autonomous navigation. The system processes image data to extract location and feature values associated with object contours, then further analyzes these parameters to generate detailed state information about the contours. This state information includes key attributes such as the start and end points of contour lines, the type of contour line (e.g., edge, boundary, or silhouette), and combinations of these attributes. By refining contour analysis with these additional details, the system enhances the precision of object detection and classification tasks. The method involves computational techniques for feature extraction and contour state determination, leveraging machine learning or pattern recognition algorithms to interpret the spatial and feature data. This approach improves upon prior systems by providing more granular and context-aware contour information, which is essential for applications requiring high-fidelity object representation and interaction.
19. The non-transitory computer-readable storage medium of claim 16, wherein the feature map is generated by a convolutional neural network, and wherein the feature map includes one or more intermediate layers including respective grid cells at different respective scales, and wherein the set of feature values also includes the one or more intermediate layers at the different respective scales.
The invention relates to computer vision systems using convolutional neural networks (CNNs) for image analysis. The problem addressed is the need for efficient and scalable feature extraction from images at multiple resolutions to improve object detection, classification, or segmentation tasks. Traditional methods often struggle with capturing multi-scale features effectively, leading to performance limitations in complex scenes. The invention describes a non-transitory computer-readable storage medium containing instructions for generating a feature map using a CNN. The feature map includes one or more intermediate layers, each composed of grid cells at different scales. These intermediate layers capture hierarchical representations of the input image, allowing the system to analyze features at varying resolutions. The set of feature values extracted from the feature map includes these intermediate layers, enabling the system to leverage multi-scale information for downstream tasks. The CNN processes the input image through convolutional operations, progressively refining the feature representations at each layer. By incorporating multiple intermediate layers, the system can detect both fine-grained details and broader contextual information, improving accuracy in tasks like object detection or image segmentation. The invention enhances the flexibility and robustness of CNNs by explicitly utilizing multi-scale feature extraction, addressing limitations in traditional single-scale approaches.
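Gathering a set of feature values across scales might look like the sketch below. It assumes, for illustration only, that each intermediate layer is a 2D grid holding one scalar per cell, that each layer is paired with its cell size in pixels, and that cells outside the grid are zero-padded. The function name `gather_features` and the `radius` parameter (standing in for a proximity threshold) are hypothetical.

```python
from typing import List, Sequence, Tuple

Grid = Sequence[Sequence[float]]  # one scalar feature per grid cell (simplification)


def gather_features(
    pyramid: Sequence[Tuple[int, Grid]],
    x: float,
    y: float,
    radius: int = 1,
) -> List[float]:
    """Collect feature values around (x, y) from every scale in `pyramid`.

    `pyramid` is a list of (cell_size_in_pixels, grid) pairs. For each scale,
    the cell containing (x, y) and its neighbours within `radius` cells are
    read out; positions outside the grid contribute 0.0.
    """
    values: List[float] = []
    for cell_size, grid in pyramid:
        rows, cols = len(grid), len(grid[0])
        r0, c0 = int(y // cell_size), int(x // cell_size)
        for r in range(r0 - radius, r0 + radius + 1):
            for c in range(c0 - radius, c0 + radius + 1):
                inside = 0 <= r < rows and 0 <= c < cols
                values.append(grid[r][c] if inside else 0.0)
    return values
```

With a fine grid of 2-pixel cells and a coarse grid of 4-pixel cells over the same image, a single call returns the neighbourhood at both resolutions in one flat list, which could then be passed to the network that predicts the displacement.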
20. The non-transitory computer-readable storage medium of claim 19, wherein the apparatus is caused to further perform: processing the set of feature values at a first one of the one or more intermediate layers corresponding to a coarser one of the different respective scales to determine a general direction or a general location of the next coordinate; and processing the set of feature values at a second one of the one or more intermediate layers corresponding to a finer one of the different respective scales to determine the displacement vector or the next coordinate exactly.
This invention relates to computer vision and machine learning, specifically improving the accuracy of coordinate prediction when tracing a contour over a multi-scale feature map. Coarse layers provide broad context but lack fine detail, while fine layers offer precision but cover less spatial context. The claim addresses this by processing feature values at two scales. First, the feature values at a coarser intermediate layer are processed to estimate a general direction or general location of the next coordinate. Then, the feature values at a finer intermediate layer are processed to determine the displacement vector or the next coordinate exactly. Combining the two scales hierarchically lets the final prediction benefit from both global context and local precision, which is useful wherever a contour must be traced with high spatial accuracy.
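The two-stage idea can be illustrated with a deliberately simple stand-in. In the sketch below, a hand-written argmax over per-cell scores replaces the learned network, each layer holds one hypothetical "contour likelihood" score per cell, and the coarse cell size is assumed to be an integer multiple of the fine cell size. None of these details come from the claim.

```python
from typing import Sequence, Tuple

Grid = Sequence[Sequence[float]]


def coarse_to_fine_step(
    cursor: Tuple[float, float],
    coarse: Grid,
    fine: Grid,
    coarse_size: int,
    fine_size: int,
) -> Tuple[float, float]:
    """Return a displacement (dx, dy) chosen in two stages.

    Assumes the coarse grid has at least one in-bounds neighbour of the
    cursor's cell and that coarse_size is a multiple of fine_size.
    """
    x, y = cursor
    r0, c0 = int(y // coarse_size), int(x // coarse_size)

    # Stage 1 (coarse): pick the best-scoring neighbouring coarse cell,
    # which fixes the general location of the next coordinate.
    neighbours = [
        (r, c)
        for r in range(r0 - 1, r0 + 2)
        for c in range(c0 - 1, c0 + 2)
        if (r, c) != (r0, c0) and 0 <= r < len(coarse) and 0 <= c < len(coarse[0])
    ]
    cr, cc = max(neighbours, key=lambda rc: coarse[rc[0]][rc[1]])

    # Stage 2 (fine): search only the fine cells inside that coarse cell
    # and take the centre of the best one as the exact next coordinate.
    ratio = coarse_size // fine_size
    fine_cells = [
        (r, c)
        for r in range(cr * ratio, (cr + 1) * ratio)
        for c in range(cc * ratio, (cc + 1) * ratio)
    ]
    fr, fc = max(fine_cells, key=lambda rc: fine[rc[0]][rc[1]])
    next_x, next_y = (fc + 0.5) * fine_size, (fr + 0.5) * fine_size
    return (next_x - x, next_y - y)
```

The design point is that the fine stage never looks outside the region chosen by the coarse stage, so a strong fine-scale response elsewhere in the image cannot pull the cursor off course.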
September 3, 2019