Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A displacement-oriented view synthesis system, comprising: a plurality of three-dimensional (3D) warping devices each coupled to receive at least one input image captured from a corresponding reference view, and each performing 3D warping on the input image to generate at least one corresponding warped image in a target view; a view blending device coupled to receive the warped images, and performing view blending on the warped images to generate at least one blended image in the target view; and an inpainting device coupled to receive the blended image, and performing inpainting on the blended image to generate a synthesized image in the target view; wherein the inpainting is performed according to a difference displacement between frames of different views; wherein the difference displacement is obtained by the following steps: determining a foreground displacement associated with a foreground object, the foreground displacement being constructed by connecting a foreground-associated point in the target view with a corresponding point in the reference view; determining a background displacement associated with a background object, the background displacement being constructed by connecting a background-associated point in the target view with the corresponding point in the reference view; and obtaining the difference displacement by subtracting the background displacement from the foreground displacement.
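As a rough illustration of the last three steps of claim 1, the sketch below treats each displacement as a 2D vector from a point in the target view to its corresponding point in the reference view, then subtracts the background displacement from the foreground displacement. The function names, the direction convention (target to reference), and the pixel coordinates are all assumptions made for illustration; none of them come from the patent.

```python
from typing import Tuple

Vec = Tuple[float, float]


def displacement(target_pt: Vec, reference_pt: Vec) -> Vec:
    """Vector connecting a target-view point to its reference-view point.

    The target-to-reference direction is an assumed convention.
    """
    return (reference_pt[0] - target_pt[0], reference_pt[1] - target_pt[1])


def difference_displacement(foreground: Vec, background: Vec) -> Vec:
    """Foreground displacement minus background displacement."""
    return (foreground[0] - background[0], foreground[1] - background[1])


# Hypothetical (x, y) pixel coordinates, chosen only for illustration.
fg_disp = displacement((100.0, 50.0), (112.0, 50.0))  # foreground-associated point
bg_disp = displacement((140.0, 50.0), (143.0, 50.0))  # background-associated point
diff = difference_displacement(fg_disp, bg_disp)
print(diff)  # (9.0, 0.0)
```

In this made-up case the foreground point shifts by 12 pixels between views and the background point by 3, so the difference displacement is 9 pixels along the horizontal axis.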
2. The system of claim 1, wherein the 3D warping device adopts backward warping that maps from the warped image to the input image.
In the view synthesis system of claim 1, each 3D warping device uses backward warping. Rather than pushing every pixel of the input image forward into the target view, backward warping starts from each pixel position in the warped image and maps it back to a location in the input image captured from the reference view, from which the pixel value is taken. The claim specifies only the direction of this mapping, from the warped image to the input image. It does not recite an interpolation method, particular hardware, or a field of application.
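A minimal sketch of backward warping, under simplifying assumptions, may help make the direction of the mapping concrete. It assumes a purely horizontal, integer, per-pixel offset that is already known in the target view; the `disparity` input and the function name are hypothetical and are not taken from the patent.

```python
from typing import List, Optional

Image = List[List[Optional[int]]]


def backward_warp(src: List[List[int]], disparity: List[List[int]]) -> Image:
    """For each pixel of the warped (target) image, look up its source pixel.

    disparity[y][x] is the assumed horizontal offset, in pixels, from target
    pixel (x, y) to its source location in the input image. Target pixels
    whose source falls outside the input image are left as None (holes).
    """
    height, width = len(src), len(src[0])
    warped: Image = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx = x + disparity[y][x]  # map from warped image back to input image
            if 0 <= sx < width:
                warped[y][x] = src[y][sx]
    return warped


src = [[10, 20, 30, 40]]
disparity = [[1, 1, 1, 1]]
warped = backward_warp(src, disparity)
print(warped)  # [[20, 30, 40, None]]
```

Because the loop runs over positions in the warped image, every target pixel is visited exactly once. In general this tends to avoid the small cracks that forward mapping can leave between projected pixels, although the claim itself states no such advantage.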
3. The system of claim 1, wherein the view blending is performed by a weighted average or winner-take-all technique.
In the view synthesis system of claim 1, the view blending device combines the warped images, one per reference view, into a blended image in the target view using one of two techniques. With a weighted average, each warped image contributes to an output pixel in proportion to a weight. With winner-take-all, each output pixel is taken from a single warped image, the one judged best for that pixel, and the others are ignored. The claim does not specify how the weights or the winner are chosen.
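The two blending options can be sketched per pixel as follows. This is an illustrative sketch only: the weights are assumed inputs (for example, they might reflect how close each reference view is to the target view), `None` is an assumed marker for a pixel that a warped image could not supply, and the function names are hypothetical.

```python
from typing import List, Optional


def blend_weighted(pixels: List[Optional[float]],
                   weights: List[float]) -> Optional[float]:
    """Weighted average over the views that supply a valid pixel."""
    valid = [(p, w) for p, w in zip(pixels, weights) if p is not None]
    total = sum(w for _, w in valid)
    if not valid or total == 0:
        return None  # no view covers this pixel, so it stays a hole
    return sum(p * w for p, w in valid) / total


def blend_winner_take_all(pixels: List[Optional[float]],
                          weights: List[float]) -> Optional[float]:
    """Take the pixel from the single valid view with the largest weight."""
    valid = [(w, p) for p, w in zip(pixels, weights) if p is not None]
    if not valid:
        return None
    return max(valid, key=lambda wp: wp[0])[1]


# One target pixel as seen in two warped images, with assumed weights.
print(blend_weighted([100.0, 200.0], [0.75, 0.25]))         # 125.0
print(blend_winner_take_all([100.0, 200.0], [0.75, 0.25]))  # 100.0
print(blend_weighted([None, 200.0], [0.75, 0.25]))          # 200.0
```

Pixels that no warped image covers remain holes after blending, which is what the inpainting stage of claim 1 then has to fill.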
4. The system of claim 1, wherein holes in an area between a foreground edge and a background edge are filled by pixels or patches searched along the difference displacement.
In the view synthesis system of claim 1, the inpainting device fills holes that lie in the area between a foreground edge and a background edge. Such holes typically appear where the target view exposes background that a foreground object hid in the reference views. The pixels or patches used to fill them are found by searching along the difference displacement, which claim 1 defines as the foreground displacement minus the background displacement. The difference displacement is therefore a geometric vector between views, not a measure of color difference. The claim does not specify how candidates found along this direction are compared or selected.
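A heavily simplified sketch of a directed hole-filling search is given below. It assumes the difference displacement has already been reduced to an integer step direction `(dx, dy)`, and it copies the first valid pixel found along that direction. The claim covers pixels or patches and does not state a selection rule, so the first-valid-pixel rule, the function name, and the `max_steps` limit are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple

Image = List[List[Optional[int]]]


def fill_holes_along(image: Image, step: Tuple[int, int],
                     max_steps: int = 16) -> Image:
    """Fill each hole (None) with the first valid pixel found along `step`.

    `step` is an assumed integer (dx, dy) direction derived from the
    difference displacement. Lookups read the original image, so freshly
    filled pixels are never used as sources for other holes.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    dx, dy = step
    for y in range(height):
        for x in range(width):
            if image[y][x] is not None:
                continue
            for k in range(1, max_steps + 1):
                sx, sy = x + k * dx, y + k * dy
                if not (0 <= sx < width and 0 <= sy < height):
                    break  # left the image without finding a candidate
                if image[sy][sx] is not None:
                    out[y][x] = image[sy][sx]
                    break
    return out


# 9 = foreground, 1 and 2 = background, None = hole between the two edges.
row = [[9, 9, None, None, 1, 2]]
filled = fill_holes_along(row, step=(1, 0))
print(filled)  # [[9, 9, 1, 1, 1, 2]]
```

Searching in one direction rather than in all directions is the point of the sketch: with the step chosen here, the holes are filled from the background side and the foreground value 9 is never copied into them.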
5. The system of claim 1, wherein the at least one input image comprises an input texture image and an associated input depth image.
6. The system of claim 1, further comprising: a plurality of cameras each capturing the at least one input image from the corresponding reference view.
7. A displacement-oriented view synthesis method, comprising: receiving a plurality of input images captured from corresponding reference views respectively; performing three-dimensional (3D) warping on the input images to respectively generate corresponding warped images in a target view; performing view blending on the warped images to generate at least one blended image in the target view; and performing inpainting on the blended image to generate a synthesized image in the target view; wherein the inpainting is performed according to a difference displacement between frames of different views; wherein the difference displacement is obtained by the following steps: determining a foreground displacement associated with a foreground object, the foreground displacement being constructed by connecting a foreground-associated point in the target view with a corresponding point in the reference view; determining a background displacement associated with a background object, the background displacement being constructed by connecting a background-associated point in the target view with the corresponding point in the reference view; and obtaining the difference displacement by subtracting the background displacement from the foreground displacement.
8. The method of claim 7, wherein the step of performing the 3D warping adopts backward warping that maps from the warped image to the input image.
9. The method of claim 7, wherein the view blending is performed by a weighted average or winner-take-all technique.
10. The method of claim 7, wherein holes in an area between a foreground edge and a background edge are filled by pixels or patches searched along the difference displacement.
This is the method counterpart of claim 4. In the view synthesis method of claim 7, holes in the area between a foreground edge and a background edge are filled during inpainting with pixels or patches found by searching along the difference displacement. As defined in claim 7, the difference displacement is the foreground displacement minus the background displacement, where each displacement connects a point in the target view with its corresponding point in the reference view. The claim does not specify the matching criterion used to choose among candidates along that direction.
11. The method of claim 7, wherein the input image comprises an input texture image and an associated input depth image.
In the view synthesis method of claim 7, each input image comprises two parts: an input texture image and an associated input depth image. The texture image carries the color content of the reference view, and the depth image gives, for the same view, the distance of the scene at each pixel. This depth information is what the 3D warping step can use to map pixels between the reference view and the target view. Claim 5 recites the same limitation for the system.
12. The method of claim 7, further comprising: providing a plurality of cameras for capturing the plurality of input images from the corresponding reference views respectively.
In the view synthesis method of claim 7, a plurality of cameras is provided, and each camera captures the input image for its corresponding reference view. The claim adds only this capture step to the method. It does not recite camera synchronization, calibration, or real-time processing. Claim 6 recites the corresponding cameras for the system.
Unknown
January 12, 2021