Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method for enhancing a facial image of a user in real time, by digital generation of a portion of a facial image using artificial intelligence (AI) during a video conference with a plurality of participants, comprising: receiving, at a control system, a digital image of a first portion of a user's face in real time, the digital image being captured from a camera viewing the first portion of the user's face, the first portion being part of a complete facial image which includes the first portion and one or more second portions, wherein the camera is unable to view the second portion of the user's face, the digital image thereby being incomplete and/or lacking in resolution for the second portion of the user's face; improving resolution and/or digitally completing the second portion of the user's facial image that the camera is not able to capture, using an AI system; the improving resolution including, receiving the digital image at the AI system which includes a Generative Adversarial Network (GAN), the GAN using first additional user facial images of the user to generate enhanced additional facial images using a training method by the GAN; and generating, in real time, a complete enhanced digital facial image of the user's face, using the GAN, which includes the digital image of the first portion of the user's face, the first additional user facial images, and the AI generated enhanced additional facial images.
This invention relates to real-time facial image enhancement during video conferences, addressing the problem of incomplete or low-resolution facial images when a camera can see only part of the user's face. A control system receives, in real time, a digital image of a first portion of the user's face; the camera cannot view the remaining (second) portions, so the image is incomplete or lacks resolution there. An artificial intelligence (AI) system that includes a Generative Adversarial Network (GAN) improves the resolution of, or digitally completes, those second portions. The GAN uses additional facial images of the user, through a training method, to generate enhanced additional facial images. The GAN then generates, in real time, a complete enhanced facial image that includes the captured first portion, the additional user images, and the AI-generated enhanced images. The result is a complete view of the user's face for the video conference even though the camera captured only part of it.
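The combining step can be pictured with a small sketch. It assumes, purely for illustration, that an image is a grid of grayscale values and that `None` marks a pixel the camera could not see; the function name `composite_face` and this data layout are hypothetical and do not come from the patent, which does not describe how the portions are merged.

```python
from typing import List, Optional

# A tiny grayscale "image": rows of pixel intensities. None marks a pixel
# the camera could not see (a hypothetical convention for this sketch).
Image = List[List[Optional[int]]]


def composite_face(captured: Image, generated: List[List[int]]) -> List[List[int]]:
    """Keep every pixel the camera captured; fill the gaps from the generated image."""
    if len(captured) != len(generated):
        raise ValueError("images must have the same number of rows")
    result = []
    for cap_row, gen_row in zip(captured, generated):
        if len(cap_row) != len(gen_row):
            raise ValueError("rows must have the same width")
        # Prefer the real captured pixel; fall back to the generated one.
        result.append([g if c is None else c for c, g in zip(cap_row, gen_row)])
    return result


captured = [[10, None], [None, 40]]   # first portion, seen by the camera
generated = [[1, 2], [3, 4]]          # stand-in for GAN output
print(composite_face(captured, generated))  # [[10, 2], [3, 40]]
```

In this sketch the live camera pixels always win, so the generated content only ever fills regions the camera missed. A real system would more likely blend the two at the boundary.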
2. The method of claim 1, further comprising: transmitting, using the control system, the real time complete enhanced digital facial image of the user's face in a video conference, whereby the real time complete enhanced digital facial image is shared with participants of the video conference in real time during the video conference.
Building on claim 1, this claim adds a transmission step. The control system transmits the complete enhanced facial image in the video conference, so that it is shared with the other participants in real time while the conference is in progress. The claim adds no further image processing; the enhancement itself is defined in claim 1.
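As a rough sketch of the per-frame flow (receive, enhance, transmit), the loop below pushes each enhanced frame to every participant as soon as it is produced. Everything here is a stand-in: frames are plain strings, `enhance` is any callable supplied by the caller, and `run_conference` is a hypothetical name, not an interface described in the patent.

```python
from typing import Callable, Dict, Iterable, List

Frame = str  # stand-in for real image data in this sketch


def run_conference(
    frames: Iterable[Frame],
    enhance: Callable[[Frame], Frame],
    participants: List[str],
) -> Dict[str, List[Frame]]:
    """Enhance each incoming frame and deliver it to every participant in order."""
    delivered: Dict[str, List[Frame]] = {name: [] for name in participants}
    for frame in frames:
        enhanced = enhance(frame)          # stand-in for the GAN-based completion
        for name in participants:
            delivered[name].append(enhanced)  # stand-in for network transmission
    return delivered


result = run_conference(
    frames=["f1", "f2"],
    enhance=lambda f: f + "+enhanced",
    participants=["alice", "bob"],
)
print(result["alice"])  # ['f1+enhanced', 'f2+enhanced']
```

Processing one frame at a time, rather than batching, is what keeps the sharing "real time" in this simplified picture.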
3. The method of claim 1, further comprising: the first additional user facial images being accessible, with permission from the user, from social media sources of the user, and/or receiving the first additional user facial images from the user; and the training method including capturing second additional digital images of the user's face using another camera when the user is in view of the another camera to use the second additional digital images in the generating of the enhanced second additional facial images.
Building on claim 1, this claim specifies where the GAN's source images come from. The first additional facial images are obtained from the user's social media sources with the user's permission, received directly from the user, or both. In addition, the training method captures second additional images of the user's face with another camera whenever the user is in that camera's view, and uses those images in generating the enhanced second additional facial images. The purpose is to give the GAN more images of this particular user to draw on when completing the face during a video conference.
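The permission requirement on social media images can be sketched as a simple filter applied before any image reaches training. The record type `SourceImage`, its field names, and the origin labels are hypothetical choices for this illustration; the patent does not define a data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceImage:
    path: str
    origin: str      # "social_media", "user_upload", or "second_camera" (illustrative labels)
    permitted: bool  # whether the user granted permission for this image


def select_training_images(images: List[SourceImage]) -> List[SourceImage]:
    """Drop social media images that lack the user's permission; keep everything else."""
    selected = []
    for img in images:
        if img.origin == "social_media" and not img.permitted:
            continue  # no permission, so the image is never used
        selected.append(img)
    return selected


candidates = [
    SourceImage("a.jpg", "social_media", permitted=True),
    SourceImage("b.jpg", "social_media", permitted=False),
    SourceImage("c.jpg", "user_upload", permitted=True),
    SourceImage("d.jpg", "second_camera", permitted=True),
]
print([img.path for img in select_training_images(candidates)])  # ['a.jpg', 'c.jpg', 'd.jpg']
```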
4. The method of claim 1, further comprising: performing facial mesh training cycles for the GAN, as part of the training method, the training cycles including a user speaking a specialized audiologist-created paragraph to create training data to generate, using an interpolation, the enhanced additional facial images.
Building on claim 1, this claim adds facial mesh training cycles to the GAN's training method. In these cycles the user reads aloud a specialized paragraph written by an audiologist, and the session produces training data. The enhanced additional facial images are then generated from that data using interpolation. The claim does not specify the paragraph's content or the type of interpolation; a plausible reading is that the paragraph is designed to make the user produce a broad range of speech sounds, and therefore of mouth and face positions, in a short recording.
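The claim names interpolation without saying which kind. As one possible illustration, the sketch below linearly interpolates between two facial meshes, each represented as a list of 2D landmark points, to produce in-between meshes. Linear interpolation, the 2D landmark representation, and the function names are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Mesh = List[Point]  # one (x, y) landmark per mesh vertex


def interpolate_mesh(mesh_a: Mesh, mesh_b: Mesh, t: float) -> Mesh:
    """Linearly blend two meshes: t=0 gives mesh_a, t=1 gives mesh_b."""
    if len(mesh_a) != len(mesh_b):
        raise ValueError("meshes must have the same number of landmarks")
    return [
        (ax + (bx - ax) * t, ay + (by - ay) * t)
        for (ax, ay), (bx, by) in zip(mesh_a, mesh_b)
    ]


def in_between_meshes(mesh_a: Mesh, mesh_b: Mesh, count: int) -> List[Mesh]:
    """Generate `count` evenly spaced meshes strictly between mesh_a and mesh_b."""
    return [interpolate_mesh(mesh_a, mesh_b, i / (count + 1)) for i in range(1, count + 1)]


mouth_closed = [(0.0, 0.0), (4.0, 0.0)]
mouth_open = [(0.0, 2.0), (4.0, 4.0)]
print(interpolate_mesh(mouth_closed, mouth_open, 0.5))  # [(0.0, 1.0), (4.0, 2.0)]
```

Two recorded mouth positions thus yield any number of intermediate positions, which is one way a short spoken paragraph could be stretched into a larger set of training examples.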
5. The method of claim 1, wherein the digital image of the first portion of the user's face is received from a camera in a vicinity of the user and viewing the first portion of the user's face.
Building on claim 1, this claim specifies the image source: the digital image of the first portion of the user's face is received from a camera located in the vicinity of the user and viewing that portion of the face. The claim does not name a camera type. The remaining steps, including GAN-based completion of the portions the camera cannot see, are as in claim 1.
6. The method of claim 1, wherein the digital image of the first portion of the user's face is received from a head-set having a camera viewing the first portion of the user's face.
Building on claim 1, this claim specifies that the digital image of the first portion of the user's face is received from a headset that has a camera viewing that portion of the face. A headset-mounted camera sits close to the face and would presumably see only part of it; the GAN-based completion of claim 1 supplies the portions it cannot view. The claim adds nothing beyond the image source.
7. The method of claim 1, further comprising: receiving the digital image of the first portion of the user's face from a camera in a head set wherein the camera is viewing the first portion of the user's face and the user is moving or in motion such that a stationary camera would not view the user's face to provide a real-time facial image of the user to video conference participants; and transmitting, using the control system, the real time complete enhanced digital facial image of the user's face in a video conference, whereby the real time complete enhanced digital facial image is shared with participants of the video conference in real time during the video conference.
Building on claim 1, this claim covers a user who is moving. The digital image of the first portion of the user's face is received from a camera in a headset, in a situation where the user is in motion such that a stationary camera would not view the user's face and so could not provide a real-time facial image to the video conference participants. Because the headset camera moves with the user, it continues to view the first portion of the face. The control system then transmits the complete enhanced facial image in the video conference, sharing it with participants in real time during the conference.
8. The method of claim 1, wherein the AI generated enhanced additional facial images and the first additional user facial images correspond to the second portions of the user's facial image, for the generation, in real time, of the complete enhanced digital facial image of the user's face.
Building on claim 1, this claim specifies that the AI-generated enhanced additional facial images and the first additional user facial images correspond to the second portions of the user's face, that is, the portions the camera cannot view. These images supply the missing portions when the complete enhanced facial image is generated in real time, while the camera's live image supplies the first portion.
9. A system for enhancing a facial image of a user in real time, by digital generation of a portion of a facial image using artificial intelligence (AI) during a video conference with a plurality of participants, which comprises: a computer system comprising: a computer processor, a computer-readable storage medium, and program instructions stored on the computer-readable storage medium being executable by the processor, to cause the computer system to: receive, at a control system, a digital image of a first portion of a user's face in real time, the digital image being captured from a camera viewing the first portion of the user's face, the first portion being part of a complete facial image which includes the first portion and one or more second portions, wherein the camera is unable to view the second portion of the user's face, the digital image thereby being incomplete and/or lacking in resolution for the second portion of the user's face; improve resolution and/or digitally completing the second portion of the user's facial image that the camera is not able to capture, using an AI system; the improving resolution including, receiving the digital image at the AI system which includes a Generative Adversarial Network (GAN), the GAN using first additional user facial images of the user to generate enhanced additional facial images using a training method by the GAN; and generate, in real time, a complete enhanced digital facial image of the user's face, using the GAN, which includes the digital image of the first portion of the user's face, the first additional user facial images, and the AI generated enhanced additional facial images.
This claim covers the method of claim 1 as a system: a computer system with a processor, a computer-readable storage medium, and stored program instructions that cause the system to perform the steps. The system receives, in real time, a digital image of a first portion of the user's face from a camera that cannot view the remaining (second) portions, so the image is incomplete or lacks resolution there. An AI system that includes a Generative Adversarial Network (GAN) improves the resolution of, or digitally completes, those portions. The GAN uses additional facial images of the user, through a training method, to generate enhanced additional facial images. In real time, the GAN generates a complete enhanced facial image that includes the captured first portion, the additional user images, and the AI-generated enhanced images, so that video conference participants can see the full face even when the camera cannot capture all of it.
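The claims name a GAN but do not state its training objective. For orientation only, the standard formulation from the general GAN literature trains a generator $G$ against a discriminator $D$ with a minimax objective; whether the patented system uses this exact form is not stated in the claims.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here $x$ would be real images of the user's face and $z$ the generator's input. A face-completion GAN would typically condition $G$ on the captured partial image rather than on noise alone, which is an assumption of this note and not something the claims specify.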
10. The system of claim 9, further comprising: transmitting, using the control system, the real time complete enhanced digital facial image of the user's face in a video conference, whereby the real time complete enhanced digital facial image is shared with participants of the video conference in real time during the video conference.
Building on claim 9, the system also transmits the complete enhanced facial image, using the control system, in the video conference, so that it is shared with participants in real time during the conference. This is the system counterpart of method claim 2. The claim does not add sensors or further image processing; the enhancement is defined in claim 9.
11. The system of claim 9, further comprising: the first additional user facial images being accessible, with permission from the user, from social media sources of the user, and/or receiving the first additional user facial images from the user; and the training method including capturing second additional digital images of the user's face using another camera when the user is in view of the another camera to use the second additional digital images in the generating of the enhanced second additional facial images.
Building on claim 9, this claim specifies the sources of the GAN's images of the user. The first additional facial images are obtained from the user's social media sources with the user's permission, received directly from the user, or both. The training method also captures second additional images of the user's face with another camera when the user is in that camera's view, and uses them in generating the enhanced second additional facial images. This is the system counterpart of method claim 3; the images serve to complete the user's face in a video conference.
12. The system of claim 9, further comprising: performing facial mesh training cycles for the GAN, as part of the training method, the training cycles including a user speaking a specialized audiologist-created paragraph to create training data to generate, using an interpolation, the enhanced additional facial images.
Building on claim 9, the system performs facial mesh training cycles for the GAN as part of the training method. In these cycles the user speaks a specialized paragraph created by an audiologist, which produces training data, and the enhanced additional facial images are generated from that data using interpolation. This is the system counterpart of method claim 4. The claim does not specify the paragraph's content or the type of interpolation, and it is directed at completing the user's face during a video conference.
13. The system of claim 9, wherein the digital image of the first portion of the user's face is received from a camera in a vicinity of the user and viewing the first portion of the user's face.
Building on claim 9, this claim specifies that the digital image of the first portion of the user's face is received from a camera located in the vicinity of the user and viewing that portion of the face. This is the system counterpart of method claim 5. The image is used for the GAN-based facial completion of claim 9.
14. The system of claim 9, wherein the digital image of the first portion of the user's face is received from a head-set having a camera viewing the first portion of the user's face.
Building on claim 9, this claim specifies that the digital image of the first portion of the user's face is received from a headset that has a camera viewing that portion of the face. This is the system counterpart of method claim 6. The claim does not recite a second capture device or a display; the portions of the face that the headset camera cannot view are completed by the GAN as described in claim 9.
15. The system of claim 9, further comprising: receiving the digital image of the first portion of the user's face from a camera in a head set wherein the camera is viewing the first portion of the user's face and the user is moving or in motion such that a stationary camera would not view the user's face to provide a real-time facial image of the user to video conference participants; and transmitting, using the control system, the real time complete enhanced digital facial image of the user's face in a video conference, whereby the real time complete enhanced digital facial image is shared with participants of the video conference in real time during the video conference.
Building on claim 9, this claim covers a user who is moving during a video conference. The system receives the digital image of the first portion of the user's face from a camera in a headset, in a situation where the user is in motion such that a stationary camera would not view the user's face and so could not provide a real-time facial image to participants. The complete enhanced facial image, generated as in claim 9 from the partial image, the additional user images, and the AI-generated images, is then transmitted by the control system and shared with video conference participants in real time. This is the system counterpart of method claim 7.
16. The system of claim 9, wherein the AI generated enhanced additional facial images and the first additional user facial images correspond to the second portions of the user's facial image, for the generation, in real time, of the complete enhanced digital facial image of the user's face.
Building on claim 9, this claim specifies that the AI-generated enhanced additional facial images and the first additional user facial images correspond to the second portions of the user's face, that is, the portions the camera cannot view. These images supply the missing portions when the system generates the complete enhanced facial image in real time, while the camera's live image supplies the first portion. This is the system counterpart of method claim 8.
17. A computer program product for enhancing a facial image of a user in real time, by digital generation of a portion of a facial image using artificial intelligence (AI) during a video conference with a plurality of participants, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a computer to cause the computer to perform a method, comprising: receiving, at a control system, a digital image of a first portion of a user's face in real time, the digital image being captured from a camera viewing the first portion of the user's face, the first portion being part of a complete facial image which includes the first portion and one or more second portions, wherein the camera is unable to view the second portion of the user's face, the digital image thereby being incomplete and/or lacking in resolution for the second portion of the user's face; improving resolution and/or digitally completing the second portion of the user's facial image that the camera is not able to capture, using an AI system; the improving resolution including, receiving the digital image at the AI system which includes a Generative Adversarial Network (GAN), the GAN using first additional user facial images of the user to generate enhanced additional facial images using a training method by the GAN; and generating, in real time, a complete enhanced digital facial image of the user's face, using the GAN, which includes the digital image of the first portion of the user's face, the first additional user facial images, and the AI generated enhanced additional facial images.
This claim covers the method of claim 1 as a computer program product: program instructions stored on a non-transitory computer-readable storage medium that cause a computer to perform the steps. A control system receives, in real time, a digital image of a first portion of the user's face from a camera that cannot view the remaining (second) portions, so the image is incomplete or lacks resolution there. An AI system that includes a Generative Adversarial Network (GAN) improves the resolution of, or digitally completes, those portions. The GAN uses additional facial images of the user, through a training method, to generate enhanced additional facial images. The GAN then generates, in real time, a complete enhanced facial image that includes the captured first portion, the additional user images, and the AI-generated enhanced images, so that video conference participants can see the full face even when the camera cannot capture all of it.
18. The computer program product of claim 17, further comprising: transmitting, using the control system, the real time complete enhanced digital facial image of the user's face in a video conference, whereby the real time complete enhanced digital facial image is shared with participants of the video conference in real time during the video conference.
Building on claim 17, the program instructions also cause the control system to transmit the complete enhanced facial image in the video conference, so that it is shared with participants in real time during the conference. This is the computer program product counterpart of method claim 2. The claim adds only the transmission step; the enhancement is defined in claim 17.
19. The computer program product of claim 17, further comprising: the first additional user facial images being accessible, with permission from the user, from social media sources of the user, and/or receiving the first additional user facial images from the user; and the training method including capturing second additional digital images of the user's face using another camera when the user is in view of the another camera to use the second additional digital images in the generating of the enhanced second additional facial images.
Building on claim 17, this claim specifies the sources of the GAN's images of the user. The first additional facial images are obtained from the user's social media sources with the user's permission, received directly from the user, or both. The training method also captures second additional images of the user's face with another camera when the user is in that camera's view, and uses them in generating the enhanced second additional facial images. This is the computer program product counterpart of method claim 3; the images serve to complete the user's face in a video conference.
20. The computer program product of claim 17, further comprising: performing facial mesh training cycles for the GAN, as part of the training method, the training cycles including a user speaking a specialized audiologist-created paragraph to create training data to generate, using an interpolation, the enhanced additional facial images.
Building on claim 17, the program instructions also perform facial mesh training cycles for the GAN as part of the training method. In these cycles the user speaks a specialized paragraph created by an audiologist, which produces training data, and the enhanced additional facial images are generated from that data using interpolation. This is the computer program product counterpart of method claim 4. The claim does not specify the paragraph's content or the type of interpolation, and it does not describe generating facial animation from audio input; the generated images are used to complete the user's face during a video conference.
Unknown
April 21, 2020