Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method for one or more processors to implement in an extended reality (XR) device including a camera, the one or more processors, at least one memory, and a display, the method comprising: acquiring, from the camera, a camera data sequence including a first image frame of a real object in a scene; tracking a pose of the real object with respect to the camera along the camera data sequence, the pose being derived based at least on the first image frame and original training data generated from at least one of (i) a synthetic image of a 3D model rendered from a predetermined view and (ii) a camera image of a reference real object captured from the view, where the 3D model and the reference real object correspond to the real object; displaying an XR object on the display by rendering the XR object based at least on the pose; setting flag data, in a memory area of the at least one memory, indicative of whether or not the displayed XR object is consistent in pose with the real object, in response to receipt of an input of a user of the XR device; storing, in another memory area of the at least one memory, second image frames in the camera data sequence acquired when the flag data indicates that the displayed XR object is consistent in pose with the real object; and outputting the stored second image frames to a separate computing device having another processor.
This invention relates to extended reality (XR) devices, such as augmented reality (AR) or mixed reality (MR) systems, and addresses the challenge of accurately tracking and aligning virtual objects with real-world objects in dynamic environments. The method involves using an XR device equipped with a camera, processors, memory, and a display to capture a sequence of image frames containing a real object in a scene. The device tracks the pose (position and orientation) of the real object relative to the camera by comparing the captured frames with pre-generated training data. This training data is derived from either synthetic images of a 3D model rendered from a specific viewpoint or real camera images of a reference object captured from the same viewpoint, ensuring the 3D model or reference object matches the real object being tracked. The system then renders and displays an XR object on the device's display, positioning it based on the tracked pose. A user can provide input to indicate whether the XR object is correctly aligned with the real object. If the alignment is confirmed, the system stores subsequent image frames from the camera sequence where the XR object remains consistently aligned. These stored frames are then transmitted to an external computing device for further processing or analysis. The method improves the accuracy and reliability of XR object placement by leveraging pre-trained data and user feedback to ensure proper alignment.
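As a rough illustration of the flag-gated storage in claim 1, the Python sketch below keeps a frame only while a user-set flag says the overlay is aligned, then hands the stored frames to a sender. The class name `FrameRecorder`, its methods, and the use of byte strings as frames are hypothetical choices made for this example; the claim does not prescribe any data structure or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FrameRecorder:
    """Hypothetical helper: stores frames only while the user-set flag is True."""

    # The "flag data" of claim 1, held in one memory area.
    pose_consistent: bool = False
    # The stored "second image frames", held in another memory area.
    stored_frames: List[bytes] = field(default_factory=list)

    def on_user_input(self, is_consistent: bool) -> None:
        # The flag changes only in response to a user input.
        self.pose_consistent = is_consistent

    def on_frame(self, frame: bytes) -> None:
        # Keep the frame only if the flag currently indicates consistency.
        if self.pose_consistent:
            self.stored_frames.append(frame)

    def output(self, send: Callable[[List[bytes]], None]) -> None:
        # Hand a copy of the stored frames to a separate computing device.
        send(list(self.stored_frames))


# Usage: frames 1 and 2 arrive while the flag is set, frames 0 and 3 do not.
recorder = FrameRecorder()
recorder.on_frame(b"frame-0")
recorder.on_user_input(True)
recorder.on_frame(b"frame-1")
recorder.on_frame(b"frame-2")
recorder.on_user_input(False)
recorder.on_frame(b"frame-3")

received: List[bytes] = []
recorder.output(received.extend)
```

In this sketch `send` stands in for whatever transport connects the XR device to the other computing device, which the claim leaves open.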
2. The method according to claim 1, further comprising receiving, after outputting the stored second image frames, another training data to replace or update the original training data, the another training data being based at least in part on the output second image frames.
This claim adds a feedback step to the method of claim 1. After the XR device has sent the stored image frames (those acquired while the flag indicated that the XR object matched the real object's pose) to the separate computing device, the XR device receives new training data in return. The new training data is based at least in part on those frames, and it either replaces or updates the original training data used for pose tracking. The claim does not say how the new training data is produced; claim 7 covers a corresponding method performed on the computing device.
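The "replace or update" wording of claim 2 can be pictured with a small Python sketch. It assumes, purely for illustration, that training data is a dictionary mapping a view identifier to a feature vector; that layout, the function name `apply_new_training_data`, and the `mode` parameter are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical layout: view identifier -> feature vector for that view.
TrainingData = Dict[str, List[float]]


def apply_new_training_data(
    original: TrainingData, new: TrainingData, mode: str = "update"
) -> TrainingData:
    """Return the training data to use after new data is received."""
    if mode == "replace":
        # Discard the original entries entirely.
        return dict(new)
    if mode == "update":
        # Keep original entries, overwriting or adding those present in `new`.
        merged = dict(original)
        merged.update(new)
        return merged
    raise ValueError(f"unknown mode: {mode!r}")


original = {"view_0": [0.1, 0.2], "view_1": [0.3, 0.4]}
new = {"view_1": [0.5, 0.6], "view_2": [0.7, 0.8]}

replaced = apply_new_training_data(original, new, mode="replace")
updated = apply_new_training_data(original, new, mode="update")
```

Here `replaced` holds only the two new views, while `updated` holds three views with `view_1` taken from the new data.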
3. The method according to claim 1, further comprising receiving an input from the user of the XR device through a user interface indicating that the displayed XR object is consistent in pose with the real object.
This claim adds an explicit user-interface step to the method of claim 1. The user of the XR device indicates, through a user interface, that the displayed XR object is consistent in pose with the real object, that is, that the virtual overlay lines up with the physical object. Under claim 1, a user input of this kind is what causes the flag data to be set, and the flag in turn determines which image frames are stored. The claim does not specify the form of the user interface.
4. The method according to claim 1, further comprising not storing the second image frames in the camera data sequence acquired when the flag data indicates that the displayed XR object is not consistent in pose with the real object.
This claim adds a negative limitation to claim 1: image frames acquired while the flag data indicates that the displayed XR object is not consistent in pose with the real object are not stored. As in claim 1, the flag is set in response to user input rather than computed automatically. The result is that the stored set, and therefore the set sent to the separate computing device, contains only frames captured while the overlay was aligned with the real object. Frames captured during misalignment are excluded.
5. The method according to claim 1, wherein tracking a pose of the real object with respect to the camera along the camera data sequence comprises deriving the pose based on the first image frame and training data only generated from a synthetic image of the 3D model rendered from the predetermined view.
This claim narrows the tracking step of claim 1. The pose of the real object is derived from the first image frame and from training data generated only from a synthetic image of the 3D model rendered from the predetermined view. Of the two training-data sources permitted by claim 1, this claim therefore uses the synthetic rendering alone and no camera image of a reference real object. The word "only" limits the source of the training data; it does not limit tracking to a single camera frame, since the pose is still tracked along the camera data sequence.
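To make "a synthetic image of a 3D model rendered from a predetermined view" more concrete, the sketch below projects model vertices into image coordinates with a standard pinhole camera model. It is a minimal stand-in for rendering, not the patent's method: the function name `project_model`, the restriction to a yaw rotation plus a viewing distance, and the default focal length and principal point are all assumptions made for this example.

```python
import math
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_model(
    vertices: List[Point3D],
    yaw_rad: float,
    distance: float,
    focal_px: float = 500.0,  # assumed focal length in pixels
    cx: float = 320.0,        # assumed principal point (x)
    cy: float = 240.0,        # assumed principal point (y)
) -> List[Point2D]:
    """Project 3D model vertices as seen from one predetermined view."""
    cos_y, sin_y = math.cos(yaw_rad), math.sin(yaw_rad)
    projected = []
    for x, y, z in vertices:
        # Rotate the model about the vertical axis, then push it away
        # from the camera along the optical axis.
        xc = cos_y * x + sin_y * z
        yc = y
        zc = -sin_y * x + cos_y * z + distance
        # Pinhole projection into pixel coordinates.
        projected.append((focal_px * xc / zc + cx, focal_px * yc / zc + cy))
    return projected


# Two vertices one unit apart, viewed head-on from five units away.
points = project_model([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], yaw_rad=0.0, distance=5.0)
```

With these inputs the model origin lands at the assumed principal point and the second vertex lands 100 pixels to its right. Projected points of this kind could serve as the raw material for training data tied to one view.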
6. The method according to claim 1, wherein the original training data includes only shape-based training data.
This claim narrows claim 1 by requiring that the original training data include only shape-based training data. The claim itself does not define "shape-based"; a natural reading is data describing the object's geometry, such as contours or edges, as opposed to appearance data such as surface texture or color. The remaining steps of claim 1 (tracking, display, flag setting, storing, and outputting frames) are unchanged.
7. A method for one or more processors to implement in a computing device including the one or more processors and a memory storing original training data for tracking a real object using an extended reality (XR) device, the method comprising: receive image frames of the real object acquired by the XR device, the images being acquired when flag data indicated that an XR object displayed on the XR device was consistent in pose with the real object; extracting feature data of the real object from the image frames, the feature data including the tracked pose of the real object in the respective image frames; and generating another training data to replace or update the original training data, the another training data based at least in part on the extracted feature data.
This claim covers the other side of the workflow: a method performed by a computing device whose memory stores the original training data for tracking a real object with an XR device. The computing device receives image frames of the real object that the XR device acquired while flag data indicated that the displayed XR object was consistent in pose with the real object. It extracts feature data of the real object from those frames, including the tracked pose of the real object in each frame. It then generates new training data, based at least in part on the extracted feature data, to replace or update the original training data. Because the frames were acquired while the flag indicated correct alignment, the tracked poses attached to them are presumably reliable enough to serve as training input.
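The computing-device steps of claim 7 (receive frames, extract feature data including the tracked pose, generate new training data) can be sketched as follows. Everything concrete here is an assumption for illustration: the six-value pose tuple, the grayscale image as nested lists, the simple horizontal-gradient edge detector used as a stand-in feature, and the names `ReceivedFrame`, `FeatureData`, `extract_features`, and `generate_training_data`.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical pose layout: x, y, z, roll, pitch, yaw.
Pose = Tuple[float, float, float, float, float, float]
Image = List[List[int]]  # grayscale rows of pixel intensities


@dataclass
class ReceivedFrame:
    image: Image
    pose: Pose  # pose tracked by the XR device for this frame


@dataclass
class FeatureData:
    pose: Pose
    edge_points: List[Tuple[int, int]]  # (row, col) of strong horizontal gradients


def extract_features(frame: ReceivedFrame, threshold: int = 50) -> FeatureData:
    """Stand-in extractor: mark pixels that differ sharply from their left neighbor."""
    points = []
    for r, row in enumerate(frame.image):
        for c in range(1, len(row)):
            if abs(row[c] - row[c - 1]) >= threshold:
                points.append((r, c))
    return FeatureData(pose=frame.pose, edge_points=points)


def generate_training_data(
    original: List[FeatureData], frames: List[ReceivedFrame]
) -> List[FeatureData]:
    """Update variant: keep the original entries and append newly extracted ones."""
    return original + [extract_features(f) for f in frames]


original: List[FeatureData] = []
frame = ReceivedFrame(
    image=[[0, 0, 255], [0, 255, 255]],
    pose=(0.0, 0.0, 1.0, 0.0, 0.0, 0.0),
)
updated = generate_training_data(original, [frame])
```

The sketch implements the "update" branch by appending; a "replace" branch would return only the newly extracted entries.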
8. The method according to claim 7, further comprising outputting the another training data to the XR device.
This claim adds one step to claim 7: the computing device outputs the newly generated training data to the XR device. Together with claim 7, this closes the loop in which the XR device supplies pose-consistent image frames and the computing device returns training data meant to replace or update the original training data.
9. The method according to claim 7, wherein the original training data included only shape-based training data.
This claim narrows claim 7 by requiring that the original training data stored on the computing device included only shape-based training data. The new training data is still generated from feature data extracted from the received image frames, as in claim 7. The claim does not define "shape-based", and it does not say whether the new training data goes beyond shape information.
10. A method for one or more processors to implement in an extended reality (XR) device including a camera, the one or more processors, at least one memory, and a display, the method comprising: acquiring, from the camera, a camera data sequence including a first image frame of a real object in a scene; tracking a pose of the real object with respect to the camera along the camera data sequence, the pose being derived based at least on the first image frame and original training data generated from at least one of (i) a synthetic image of a 3D model rendered from a predetermined view and (ii) a camera image of a reference real object captured from the view, where the 3D model and the reference real object correspond to the real object; displaying an XR object on the display by rendering the XR object based at least on the pose; setting flag data, in a memory area of the at least one memory, indicative of whether or not the displayed XR object is consistent in pose with the real object, in response to receipt of an input of a user of the XR device; outputting, to a separate computing device having another processor, second image frames in the camera data sequence acquired when the flag data indicates that the displayed XR object is consistent in pose with the real object.
This invention relates to extended reality (XR) systems, specifically methods for tracking real-world objects and displaying consistent XR overlays. The problem addressed is ensuring accurate pose alignment between virtual objects and their real-world counterparts in XR environments, which is critical for applications like augmented reality (AR) and mixed reality (MR). The method involves an XR device with a camera, processor, memory, and display. The camera captures a sequence of image frames containing a real object in a scene. The system tracks the object's pose (position and orientation) relative to the camera using the first image frame and pre-generated training data. This training data is derived from either synthetic images of a 3D model rendered from a specific view or real camera images of a reference object captured from the same view. The 3D model or reference object must correspond to the real object being tracked. The tracked pose is used to render and display an XR object on the device's display, ensuring it appears correctly aligned with the real object. The system includes a flag in memory that indicates whether the displayed XR object's pose is consistent with the real object. When a user provides input, this flag is set accordingly. If the flag indicates consistency, the system outputs selected image frames from the camera sequence to an external computing device for further processing or analysis. This ensures only frames with properly aligned XR overlays are transmitted, improving data quality for downstream applications.
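Claim 10 differs from claim 1 in that it recites outputting the qualifying frames without a separate storing step. One way to picture that is a generator that forwards frames as they arrive, shown in the sketch below. The event encoding (`("flag", bool)` and `("frame", bytes)` tuples) and the function name `frames_to_output` are hypothetical and chosen only for this example.

```python
from typing import Iterable, Iterator, List, Tuple, Union

# Hypothetical event stream: ("flag", bool) for a user input,
# ("frame", bytes) for a newly acquired camera frame.
Event = Tuple[str, Union[bool, bytes]]


def frames_to_output(events: Iterable[Event]) -> Iterator[bytes]:
    """Yield only the frames acquired while the flag indicates consistency."""
    consistent = False  # flag data, initially "not consistent"
    for kind, payload in events:
        if kind == "flag":
            # The flag changes only in response to a user input.
            consistent = bool(payload)
        elif kind == "frame" and consistent:
            # Forward the frame immediately instead of buffering it.
            yield payload  # type: ignore[misc]


events: List[Event] = [
    ("frame", b"f0"),
    ("flag", True),
    ("frame", b"f1"),
    ("flag", False),
    ("frame", b"f2"),
    ("flag", True),
    ("frame", b"f3"),
]
sent = list(frames_to_output(events))
```

In this run only `f1` and `f3` are forwarded, because they are the only frames acquired while the flag was set.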
11. The method according to claim 10, further comprising receiving, after outputting the second image frames, another training data to replace or update the original training data, the updated training data being based at least in part on the output second image frames.
This claim adds a feedback step to the method of claim 10. After outputting the second image frames to the separate computing device, the XR device receives new training data to replace or update the original training data. The new training data is based at least in part on the frames that were output. This parallels claim 2, applied here to the method of claim 10, which outputs the frames without reciting a separate storing step.
12. The method according to claim 10, further comprising receiving an input from the user of the XR device through a user interface indicating that displayed XR object is consistent in pose with the real object.
This claim adds an explicit user-interface step to the method of claim 10. The user indicates, through a user interface on the XR device, that the displayed XR object is consistent in pose with the real object. Under claim 10, a user input of this kind causes the flag data to be set, and the flag determines which image frames are output to the separate computing device. The claim does not recite adjusting the XR object's pose or refining the tracking in response to this input.
13. The method according to claim 10, further comprising not outputting the second image frames in the camera data sequence acquired when the flag data indicates that the displayed XR object is not consistent in pose with the real object.
This claim adds a negative limitation to claim 10: image frames acquired while the flag data indicates that the displayed XR object is not consistent in pose with the real object are not output to the separate computing device. The limitation concerns what is sent to that device, not what is shown on the XR device's display. As in claim 10, the flag is set in response to user input.
14. The method according to claim 10, wherein tracking a pose of the real object with respect to the camera along the camera data sequence comprises deriving the pose based on the first image frame and training data only generated from a synthetic image of the 3D model rendered from a predetermined view.
This claim narrows the tracking step of claim 10. The pose of the real object is derived from the first image frame and from training data generated only from a synthetic image of the 3D model rendered from a predetermined view. No camera image of a reference real object contributes to the training data. This parallels claim 5, applied here to the method of claim 10.
15. The method according to claim 10, wherein the original training data includes only shape-based training data.
This claim narrows claim 10 by requiring that the original training data include only shape-based training data. The claim does not define the term; it plausibly refers to data describing the object's geometry rather than its surface appearance. This parallels claim 6, applied here to the method of claim 10.
16. A method for one or more processors to implement in an extended reality (XR) device including a camera, the one or more processors, at least one memory, and a display, the method comprising: acquiring, from the camera, a camera data sequence including a first image frame of a real object in a scene; tracking a pose of the real object with respect to the camera along the camera data sequence, the pose being derived based at least on the first image frame and original training data generated from at least one of (i) a synthetic image of a 3D model rendered from a predetermined view and (ii) a camera image of a reference real object captured from the view, where the 3D model and the reference real object correspond to the real object; displaying an XR object on the display by rendering the XR object based at least on the pose; setting flag data, in a memory area of the at least one memory, indicative of whether or not the displayed XR object is consistent in pose with the real object, in response to receipt of an input of a user of the XR device; extracting feature data of the real object from second image frames in the camera data sequence acquired when the flag data indicates that the displayed XR object is consistent in pose with the real object, the feature data including the tracked pose of the real object in the respective second image frames; and generating another training data to replace or update the original training data, the another training data based at least in part on the extracted feature data.
This invention relates to extended reality (XR) devices, specifically methods for improving object tracking and training data generation in XR environments. The problem addressed is the inconsistency between virtual objects and real-world objects in XR applications, which can degrade user experience. The method involves an XR device with a camera, processors, memory, and a display. The device captures a sequence of images containing a real object and tracks its pose relative to the camera using original training data derived from either synthetic images of a 3D model or reference images of a similar real object. The XR device then renders a virtual object based on the tracked pose. A user can flag whether the virtual object's pose aligns with the real object. When the flag indicates consistency, the device extracts feature data, including the tracked pose, from subsequent image frames. This data is used to generate new training data, which replaces or updates the original training data, improving future tracking accuracy. The method ensures that virtual objects remain accurately aligned with real-world objects, enhancing the realism and reliability of XR applications.
17. The method according to claim 16, further comprising receiving an input from the user of the XR device through a user interface indicating that displayed XR object is consistent in pose with the real object.
This claim adds an explicit user-interface step to the method of claim 16. The user indicates, through a user interface on the XR device, that the displayed XR object is consistent in pose with the real object. Under claim 16, a user input of this kind causes the flag data to be set, and the flag determines which image frames are used for feature extraction and for generating new training data. The claim does not recite calibration or rendering adjustments.
18. The method according to claim 16, further comprising not extracting feature data of the real object from the second image frames in the camera data sequence when the flag data indicates that the displayed XR object is not consistent in pose with the real object.
This claim adds a negative limitation to claim 16: when the flag data indicates that the displayed XR object is not consistent in pose with the real object, feature data of the real object is not extracted from the image frames. The new training data is therefore built only from frames acquired while the overlay was aligned with the real object. The claim concerns which frames feed the training data; it does not recite freezing the XR object's pose or switching to another tracking technique.
19. The method according to claim 16, wherein tracking a pose of the real object with respect to the camera along the camera data sequence comprises deriving the pose based on the first image frame and training data only generated from a synthetic image of the 3D model rendered from a predetermined view.
This claim narrows the tracking step of claim 16. The pose of the real object is derived from the first image frame and from training data generated only from a synthetic image of the 3D model rendered from a predetermined view. The word "only" restricts the source of the training data to the synthetic rendering; it does not restrict tracking to a single real image frame. This parallels claims 5 and 14, applied here to the method of claim 16.
20. The method according to claim 16, wherein the original training data includes only shape-based training data.
This claim narrows claim 16 by requiring that the original training data include only shape-based training data. The new training data is still generated from feature data extracted from pose-consistent image frames, as in claim 16. The claim does not define "shape-based", and it recites no data augmentation or validation step.
April 21, 2020