Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. An apparatus, comprising: a first module of at least three modules, wherein the first module configured to generate class independent region proposals to provide a region; a second module of the at least three modules is configured to classify the region as face or non-face using a multi-task analysis, wherein the second module comprises a five convolutional layers with three fully connected layers, a network configured to fuse the three fully connected layers, separate networks for face detection, landmark detection, visibility determination, pose estimation, and gender determination; and a third module of the at least three modules is configured to perform post-processing on the classified region.
This claim covers a computer vision apparatus for detecting and analyzing faces within an image. The apparatus includes at least three functional modules. The first module generates region proposals that are not tied to specific object classes, providing candidate regions for further analysis. The second module classifies each candidate region as face or non-face using a multi-task analysis. It comprises a neural network with five convolutional layers and three fully connected layers, a fusion network that combines the outputs of those fully connected layers, and separate sub-networks dedicated to face detection, landmark detection (e.g., eyes, nose, mouth), visibility determination, pose estimation, and gender determination. The third module performs post-processing on the regions that the second module has classified.
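The three-module flow described above can be pictured as a simple pipeline. The following is a minimal sketch, not the patented implementation: the names `Analysis`, `run_pipeline`, and the `demo_*` stand-ins are hypothetical, and the toy scoring rule exists only so the example runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed layout


@dataclass
class Analysis:
    """Result of the second module for one region (hypothetical structure)."""
    box: Box
    is_face: bool
    score: float  # face-detection confidence in [0, 1]


def run_pipeline(
    image: object,
    propose: Callable[[object], List[Box]],
    classify: Callable[[object, Box], Analysis],
    post_process: Callable[[List[Analysis]], List[Analysis]],
) -> List[Analysis]:
    """Chain the three modules: propose regions, classify each, post-process."""
    regions = propose(image)                              # first module
    analyses = [classify(image, b) for b in regions]      # second module
    return post_process(analyses)                         # third module


# Toy stand-ins so the sketch runs; real modules would be learned models.
def demo_propose(image: object) -> List[Box]:
    return [(0, 0, 10, 10), (5, 5, 20, 20), (30, 30, 40, 40)]


def demo_classify(image: object, box: Box) -> Analysis:
    area = (box[2] - box[0]) * (box[3] - box[1])
    score = min(area / 225.0, 1.0)  # arbitrary rule: larger boxes score higher
    return Analysis(box=box, is_face=score >= 0.5, score=score)


def demo_post_process(analyses: List[Analysis]) -> List[Analysis]:
    faces = [a for a in analyses if a.is_face]
    return sorted(faces, key=lambda a: a.score, reverse=True)


results = run_pipeline(None, demo_propose, demo_classify, demo_post_process)
```

Passing the modules in as callables mirrors the claim's wording, which fixes the role of each module but not how it is implemented.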
2. The apparatus of claim 1, wherein the third module comprises at least one of an iterative region proposal or landmark-based non-maximum suppression.
This claim narrows the apparatus of claim 1 by specifying the third (post-processing) module. That module includes at least one of two techniques: iterative region proposal or landmark-based non-maximum suppression. Iterative region proposal progressively refines candidate regions over multiple passes, adjusting each proposal based on feedback from the earlier analysis. Landmark-based non-maximum suppression uses the landmarks detected within each region to filter and merge overlapping detections, reducing redundancy while preserving the information needed to localize each face. Either technique reduces false positives and duplicate detections in the regions that the first two modules propose and classify.
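The claim does not spell out how landmark-based non-maximum suppression works. One plausible reading, sketched below as an assumption, rebuilds each detection's box from its visible landmarks and then applies ordinary greedy suppression by intersection over union (IoU). The detection dictionary layout, the `0.5` threshold, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def box_from_landmarks(landmarks: List[Point], visible: List[bool]) -> Optional[Box]:
    """Tightest box around the visible landmarks, or None if none are visible."""
    pts = [p for p, v in zip(landmarks, visible) if v]
    if not pts:
        return None
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def landmark_nms(detections: List[Dict], iou_threshold: float = 0.5) -> List[Dict]:
    """Greedy suppression on landmark-derived boxes (assumed variant)."""
    ranked = sorted(detections, key=lambda d: d["score"], reverse=True)
    kept: List[Tuple[Dict, Box]] = []
    for det in ranked:
        box = box_from_landmarks(det["landmarks"], det["visible"])
        if box is None:
            continue  # assumption: drop detections with no visible landmarks
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((det, box))
    return [det for det, _ in kept]


detections = [
    {"score": 0.9, "landmarks": [(10, 10), (20, 10), (15, 20)], "visible": [True] * 3},
    {"score": 0.8, "landmarks": [(11, 11), (21, 11), (16, 21)], "visible": [True] * 3},
    {"score": 0.7, "landmarks": [(50, 50), (60, 50), (55, 60)], "visible": [True] * 3},
]
kept = landmark_nms(detections)
```

In this toy input the second detection overlaps the first heavily and is suppressed, while the third lies elsewhere in the image and survives.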
3. An apparatus, comprising: at least one processor; and at least one memory including computer program instructions, wherein the at least one memory and the computer program instructions are configured to select a set of data for facial analysis; and apply the set of data to a network comprising at least three modules, wherein a first module of the at least three modules is configured to generate class independent region proposals to provide a region, wherein a second module of the at least three modules is configured to classify the region as face or non-face using a multi-task analysis, wherein the second module comprises a five convolutional layers with three fully connected layers, a network configured to fuse the three fully connected layers, and separate networks for face detection, landmark detection, visibility determination, pose estimation, and gender determination; and wherein a third module of the at least three modules is configured to perform post-processing on the classified region.
This invention relates to a facial analysis apparatus built around a multi-stage neural network. The apparatus includes at least one processor and at least one memory storing instructions for processing input data. The instructions select a set of data for facial analysis and apply it to a network with at least three modules. The first module generates class-independent region proposals to identify potential face regions in the input data. The second module classifies these regions as face or non-face using a multi-task analysis. It comprises five convolutional layers with three fully connected layers, a network that fuses the three fully connected layers, and separate sub-networks for face detection, landmark detection, visibility determination, pose estimation, and gender determination, allowing several facial attributes to be analyzed together. The third module performs post-processing on the classified regions to refine the results.
4. The apparatus of claim 3, wherein the third module comprises at least one of an iterative region proposal or landmark-based non-maximum suppression.
This apparatus, comprising a processor and memory, performs facial analysis. It operates by selecting input data and applying it to a neural network with at least three modules. The first module generates initial, class-independent region proposals, identifying potential areas of interest. The second module then classifies these regions as either "face" or "non-face" through a multi-task analysis. This second module is structured with five convolutional layers and three fully connected layers that are fused by a dedicated network, and it incorporates separate sub-networks for specific tasks: face detection, landmark detection, visibility determination, pose estimation, and gender determination. The third module performs post-processing on these classified regions, employing at least one of an iterative region proposal technique or landmark-based non-maximum suppression to refine the final detection results.
5. A method, comprising: selecting a set of data for facial analysis; and applying the set of data to a network comprising at least three modules, wherein a first module of the at least three modules is configured to generate class independent region proposals to provide a region, wherein a second module of the at least three modules is configured to classify the region as face or non-face using a multi-task analysis, wherein the second module comprises a five convolutional layers with three fully connected layers, a network configured to fuse the three fully connected layers, and separate networks for face detection, landmark detection, visibility determination, pose estimation, and gender determination, and wherein a third module of the at least three modules is configured to perform post-processing on the classified region.
This invention relates to a facial analysis method that processes image data to detect and analyze facial features. The method addresses the challenge of accurately identifying and classifying facial regions in images while also extracting additional facial attributes. It involves selecting a set of data for facial analysis and applying it to a neural network with at least three modules. The first module generates class-independent region proposals to identify potential facial regions within the input data. The second module classifies these regions as either face or non-face using a multi-task analysis approach. This module includes a deep convolutional neural network with five convolutional layers and three fully connected layers. A dedicated network fuses these fully connected layers, and separate sub-networks are used for specific tasks: face detection, landmark detection, visibility determination, pose estimation, and gender determination. The third module performs post-processing on the classified regions to refine the results. This approach enables comprehensive facial analysis by integrating multiple detection and classification tasks into a unified framework.
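The fuse-then-branch structure of the second module can be illustrated in a few lines. This is a sketch under stated assumptions: the claim does not say how the fusion network combines the three fully connected layers, so plain concatenation is assumed here, and each task sub-network is reduced to a single hypothetical linear layer. The convolutional layers are omitted entirely.

```python
from typing import Callable, Dict, List

Vector = List[float]

# The five tasks named in the claim; the key names are hypothetical.
TASKS = ("face_detection", "landmark_detection", "visibility", "pose", "gender")


def fuse(fc_outputs: List[Vector]) -> Vector:
    """Combine the fully connected outputs by concatenation (assumed rule)."""
    fused: Vector = []
    for features in fc_outputs:
        fused.extend(features)
    return fused


def linear_head(weights: List[Vector], bias: Vector) -> Callable[[Vector], Vector]:
    """Build a toy task head: one linear layer, y = W x + b."""
    def head(x: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return head


def multi_task_forward(
    fc_outputs: List[Vector],
    heads: Dict[str, Callable[[Vector], Vector]],
) -> Dict[str, Vector]:
    """Fuse three feature vectors once, then run every task head on the result."""
    if len(fc_outputs) != 3:
        raise ValueError("expected outputs from exactly three fully connected layers")
    fused = fuse(fc_outputs)
    return {task: heads[task](fused) for task in TASKS}


# Toy usage: every head simply sums the four fused features.
heads = {task: linear_head([[1.0, 1.0, 1.0, 1.0]], [0.0]) for task in TASKS}
outputs = multi_task_forward([[1.0, 2.0], [3.0], [4.0]], heads)
```

Sharing one fused feature vector across all five heads is what makes the analysis multi-task: the expensive feature extraction runs once per region, and only the small task-specific heads differ.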
6. The method of claim 5, wherein the third module comprises at least one of an iterative region proposal or landmark-based non-maximum suppression.
This claim narrows the method of claim 5 by specifying the third module, which post-processes the regions classified by the second module. The third module uses at least one of an iterative region proposal approach or landmark-based non-maximum suppression. The iterative region proposal approach repeatedly adjusts and refines candidate regions to improve detection precision. Landmark-based non-maximum suppression uses key points, or landmarks, within each detection to filter and merge overlapping detections, reducing redundant or incorrect predictions. Because the third module runs after the region proposal and classification stages, it refines their output rather than replacing it, which helps separate closely spaced faces and eliminate false positives.
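Iterative region proposal can be read as a feedback loop between the classifier and the candidate box. The sketch below assumes one such loop: when a region scores poorly but the network still predicts landmarks, a new box derived from those landmarks is re-classified. The `classify` signature, the threshold, and the iteration cap are all hypothetical and are not taken from the claims.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
# Hypothetical classifier: returns (face score, box implied by predicted landmarks).
Classifier = Callable[[Box], Tuple[float, Optional[Box]]]


def iterative_region_proposal(
    box: Box,
    classify: Classifier,
    max_iters: int = 3,
    face_threshold: float = 0.5,
) -> Tuple[Box, float]:
    """Re-propose a region from predicted landmarks until it scores as a face."""
    score, landmark_box = classify(box)
    for _ in range(max_iters):
        if score >= face_threshold or landmark_box is None:
            break  # good enough, or nothing to refine with
        box = landmark_box
        score, landmark_box = classify(box)
    return box, score


# Toy classifier: the score is the IoU with a fixed "true" face box, and the
# landmark-derived box is always that true box.
TRUE_FACE: Box = (20.0, 20.0, 40.0, 40.0)


def demo_classify(box: Box) -> Tuple[float, Optional[Box]]:
    inter_w = max(0.0, min(box[2], TRUE_FACE[2]) - max(box[0], TRUE_FACE[0]))
    inter_h = max(0.0, min(box[3], TRUE_FACE[3]) - max(box[1], TRUE_FACE[1]))
    inter = inter_w * inter_h
    area = (box[2] - box[0]) * (box[3] - box[1])
    true_area = (TRUE_FACE[2] - TRUE_FACE[0]) * (TRUE_FACE[3] - TRUE_FACE[1])
    return inter / (area + true_area - inter), TRUE_FACE


refined_box, refined_score = iterative_region_proposal((10.0, 10.0, 35.0, 35.0), demo_classify)
```

The iteration cap keeps the cost bounded when a region never reaches the threshold.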
7. An apparatus, comprising: means for selecting a set of data for facial analysis; and means for applying the set of data to a network comprising at least three modules, wherein a first module of the at least three modules is configured to generate class independent region proposals to provide a region, wherein a second module of the at least three modules is configured to classify the region as face or non-face using a multi-task analysis, wherein the second module comprises a five convolutional layers with three fully connected layers, a network configured to fuse the three fully connected layers, and separate networks for face detection, landmark detection, visibility determination, pose estimation, and gender determination, and wherein a third module of the at least three modules is configured to perform post-processing on the classified region.
The apparatus is designed for facial analysis in computer vision, addressing the challenge of accurately detecting and analyzing facial features from input data. It includes means for selecting a set of data for processing and means for applying that data to a neural network with at least three modules. The first module generates class-independent region proposals, identifying potential facial regions without prior classification. The second module classifies these regions as either face or non-face using a multi-task analysis approach. This module includes five convolutional layers with three fully connected layers and a fusion network that combines the fully connected layers. It also incorporates separate sub-networks for face detection, landmark detection, visibility determination, pose estimation, and gender determination, enabling comprehensive facial analysis. The third module performs post-processing on the classified regions to refine results. The apparatus integrates multiple analysis tasks into a unified framework.
8. The apparatus of claim 7, wherein the third module comprises at least one of an iterative region proposal or landmark-based non-maximum suppression.
This claim narrows the apparatus of claim 7 by specifying the third module, which refines the regions classified by the second module. The module employs at least one of an iterative region proposal method or landmark-based non-maximum suppression. The iterative region proposal method progressively refines candidate regions, adjusting them based on learned features or prior detections to reduce false positives and improve localization. Landmark-based non-maximum suppression uses the landmarks within each detected region to filter overlapping detections so that only the most relevant and accurate regions are retained. Because the claim requires only one of the two techniques, an implementation can choose between them based on computational constraints or accuracy requirements.
December 8, 2020