Techniques are presented for inserting markers into a video stream. For each frame of an encoded video stream, the disclosed techniques may determine a structure of macroblocks from the code of the frame and then select, from that structure, the macroblocks to be replaced. A marker may be inserted into a frame by replacing the codes of the selected macroblocks with the code of a marker that identifies the frame. Marking the frames may make it easier to find, based on the inserted markers, the correspondence between frames of the video stream before transmission over a channel and frames of the video stream received from the channel. Knowing the frame correspondence may in turn enable estimation of a video quality metric based on a comparison of the corresponding frames.
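The replacement step in the abstract can be illustrated with a toy model. The sketch below is not the patented method and does not parse a real bitstream: it assumes a frame is just a sequence of opaque per-macroblock codes, and the names Frame, marker_codes, insert_marker, and read_marker, as well as the two-byte marker layout, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Frame:
    """Toy stand-in for an encoded frame: one opaque code per macroblock."""
    macroblocks: Tuple[bytes, ...]


def marker_codes(frame_id: int, count: int) -> List[bytes]:
    """Hypothetical marker layout: the frame id spread over `count` codes.

    Each code is a tag byte b"M" followed by one byte of the id.
    """
    raw = frame_id.to_bytes(count, "big")
    return [b"M" + bytes([value]) for value in raw]


def insert_marker(frame: Frame, frame_id: int, selected: Sequence[int]) -> Frame:
    """Replace the codes of the selected macroblocks with marker codes."""
    codes = list(frame.macroblocks)
    for index, code in zip(selected, marker_codes(frame_id, len(selected))):
        codes[index] = code
    return Frame(tuple(codes))


def read_marker(frame: Frame, selected: Sequence[int]) -> int:
    """Recover the frame id from the marker codes at the selected positions."""
    raw = bytes(frame.macroblocks[index][1] for index in selected)
    return int.from_bytes(raw, "big")
```

For example, marking a 12-macroblock frame with id 300 at positions 0 and 1 changes only those two codes, and read_marker returns 300 when given the same positions.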
Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
2. The method of claim 1, wherein the diagnostic information is monochrome content.
A system and method for processing diagnostic information in medical imaging involves capturing and analyzing visual data to assist in medical diagnosis. The system includes a camera configured to capture images of a patient's body, particularly focusing on areas of interest such as skin lesions, wounds, or other visible medical conditions. The captured images are processed to extract diagnostic information, which is then displayed on a display device for review by medical professionals. The diagnostic information is presented in monochrome content, enhancing clarity and reducing visual distractions. The system may also include a processing unit that applies image processing techniques to enhance the diagnostic information, such as adjusting contrast, brightness, or applying filters to highlight relevant features. The display device may be integrated with the camera or a separate unit, allowing for flexible deployment in clinical or remote settings. The method ensures that the diagnostic information is accurately captured, processed, and presented in a format that aids in efficient and effective medical diagnosis.
3. The method of claim 1, wherein the syntactic element(s) are H.264 macroblocks.
The invention relates to video processing, specifically to methods for analyzing and manipulating syntactic elements within video data. The problem addressed is the need for efficient and accurate processing of video data at the level of individual syntactic elements, such as macroblocks in H.264 encoded video streams. These macroblocks are fundamental units of video compression and encoding, containing pixel data and motion information. The method involves identifying and processing H.264 macroblocks within a video stream. These macroblocks are the basic building blocks of the video data, each containing a set of pixels and associated motion vectors that describe how the pixels move between frames. The method may include steps such as extracting, modifying, or analyzing these macroblocks to achieve specific video processing goals, such as compression, error correction, or content analysis. The processing of macroblocks can involve techniques like motion compensation, where the motion vectors within the macroblocks are used to predict pixel values in subsequent frames, reducing redundancy and improving compression efficiency. Additionally, the method may include error detection and correction mechanisms to handle corrupted or lost macroblocks, ensuring smooth playback and maintaining video quality. By focusing on H.264 macroblocks, the method provides a granular approach to video processing, enabling precise control over video data manipulation. This can be particularly useful in applications requiring high-quality video transmission, storage, or analysis, such as video streaming, surveillance, or medical imaging. The method ensures that the video data is processed efficiently while maintaining the integrity and quality of the video content.
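As a rough illustration of the macroblock structure discussed above, the sketch below computes how many macroblocks cover a frame and the raster-order address of one of them. It relies on the fact that an H.264 macroblock covers 16x16 luma samples; the function names are hypothetical, and plain raster ordering is an assumption that ignores features such as flexible macroblock ordering.

```python
MB_SIZE = 16  # an H.264 macroblock covers 16x16 luma samples


def macroblock_grid(width: int, height: int) -> tuple:
    """Return (macroblocks per row, macroblock rows) needed to cover a frame.

    Dimensions that are not multiples of 16 are rounded up, because the
    coded picture is padded to whole macroblocks.
    """
    mbs_wide = (width + MB_SIZE - 1) // MB_SIZE
    mbs_high = (height + MB_SIZE - 1) // MB_SIZE
    return mbs_wide, mbs_high


def macroblock_address(mb_x: int, mb_y: int, mbs_wide: int) -> int:
    """Raster-order address of the macroblock at column mb_x, row mb_y."""
    return mb_y * mbs_wide + mb_x
```

A 1920x1080 frame, for instance, needs a grid of 120 by 68 macroblocks, since 1080 is not a multiple of 16.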
4. The method of claim 1, wherein the syntactic element(s) are HEVC coding units.
The invention relates to video compression techniques, specifically within the High Efficiency Video Coding (HEVC) standard. HEVC is widely used for encoding and decoding video data, but existing methods may not efficiently handle the hierarchical structure of coding units (CUs), which are the basic building blocks in HEVC. The invention addresses this by providing a method for processing syntactic elements in video data, where these elements are specifically HEVC coding units. The method involves analyzing and manipulating these coding units to improve compression efficiency, reduce computational overhead, or enhance decoding performance. The approach may include techniques for partitioning, predicting, or transforming coding units in a way that optimizes bitrate or quality. By focusing on HEVC coding units, the invention ensures compatibility with existing HEVC encoders and decoders while introducing improvements in handling these syntactic elements. The method may also involve adaptive techniques that adjust processing based on the characteristics of the coding units, such as size, content, or motion. This allows for more efficient encoding and decoding, particularly in complex video scenes where traditional methods may struggle. The overall goal is to enhance the performance of HEVC-based video compression systems by refining how coding units are processed.
5. The method of claim 1, wherein the element(s) are one of AV1, AV2, or VP9 coding units.
This invention relates to video coding techniques, specifically methods for encoding or decoding video data using specific coding units. The problem addressed is the need for efficient and standardized video compression methods that balance computational complexity and compression efficiency. The invention provides a method for processing video data where the coding units are selected from AV1, AV2, or VP9 coding units. These coding units are standardized structures used in video compression to organize and encode video data efficiently. The method involves determining the coding unit type for a given block of video data and applying the corresponding encoding or decoding process. AV1, AV2, and VP9 are different video coding standards, each with unique features and optimizations. The selection of the coding unit type may depend on factors such as the video content, desired compression ratio, or hardware capabilities. The method ensures compatibility with multiple video coding standards while maintaining efficient compression and decoding performance. This approach allows for flexible and adaptable video processing, supporting different coding standards within a single system. The invention aims to improve video compression efficiency and reduce computational overhead by leveraging standardized coding units.
6. The method of claim 1, wherein the replaced syntactic element(s) correspond to a common predetermined spatial location of the frames across the video stream.
This invention relates to video processing, specifically techniques for modifying video streams by replacing syntactic elements while maintaining spatial consistency. The problem addressed is ensuring that replaced elements in a video stream appear in consistent spatial locations across frames, which is critical for maintaining visual coherence and realism in edited or synthesized video content. The method involves identifying and replacing syntactic elements within a video stream, where the replaced elements are positioned at a common predetermined spatial location across multiple frames. This ensures that the modifications appear naturally integrated into the video, avoiding visual discontinuities that could arise from inconsistent positioning. The technique may be applied to various types of syntactic elements, such as objects, text, or graphical overlays, and is particularly useful in applications like video editing, augmented reality, and automated content generation. By enforcing spatial consistency, the method improves the quality of video modifications, making them less detectable and more visually plausible. This is particularly important in scenarios where dynamic content is inserted or altered, such as in real-time video processing or post-production editing. The approach may also involve analyzing the video stream to determine the optimal spatial location for replacement, ensuring that the modifications align with the natural structure of the video. The technique can be implemented in software, hardware, or a combination thereof, and may be integrated into existing video processing pipelines.
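One way to picture a "common predetermined spatial location" is a fixed rectangle of blocks whose indices are computed once and reused for every frame. The following is a minimal sketch under that assumption; region_macroblocks and its parameters are hypothetical, and blocks are assumed to be numbered in raster order.

```python
from typing import List


def region_macroblocks(mbs_wide: int, mbs_high: int,
                       left: int, top: int, cols: int, rows: int) -> List[int]:
    """Raster-order indices of a cols x rows rectangle of blocks.

    (left, top) is the block coordinate of the rectangle's upper-left corner.
    """
    if left < 0 or top < 0 or cols <= 0 or rows <= 0:
        raise ValueError("region must have positive size and non-negative origin")
    if left + cols > mbs_wide or top + rows > mbs_high:
        raise ValueError("region does not fit inside the frame")
    return [(top + r) * mbs_wide + (left + c)
            for r in range(rows)
            for c in range(cols)]


# Computed once, then applied to every frame of the stream.
MARKER_REGION = region_macroblocks(120, 68, left=0, top=0, cols=4, rows=1)
```

Because the same index list is used for all frames, a receiver that knows the region can look for the replaced elements without searching each frame.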
7. The method of claim 1, wherein the diagnostic information of each frame uniquely identifies the frame within the video stream.
The invention relates to video processing systems that analyze video streams to extract diagnostic information from individual video frames. The problem addressed is the need to uniquely identify each frame within a video stream to enable accurate tracking, analysis, and synchronization of diagnostic data across multiple frames. Existing systems may struggle with frame misidentification, leading to errors in video analysis, synchronization, or diagnostic reporting. The method involves processing a video stream to extract diagnostic information from each frame. The diagnostic information is generated in a way that ensures each frame is uniquely identifiable within the video stream. This uniqueness allows for precise tracking of frames, even in scenarios where frames may be dropped, reordered, or corrupted. The diagnostic information may include timestamps, frame sequence numbers, or other identifiers that distinguish each frame from others in the stream. The method ensures that the diagnostic data remains associated with the correct frame, improving the reliability of video analysis applications such as surveillance, medical imaging, or industrial inspection. The uniqueness of the diagnostic information enables accurate frame-level diagnostics, synchronization, and error detection in video processing workflows.
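The value of unique per-frame identifiers when frames are dropped or reordered can be sketched as follows. This is an illustrative assumption about how the identifiers might be used, not a procedure taken from the claims; match_frames is a hypothetical helper that works only on lists of already-decoded frame ids.

```python
from typing import List, Sequence, Tuple


def match_frames(sent_ids: Sequence[int],
                 received_ids: Sequence[int]) -> Tuple[List[Tuple[int, int]], List[int], bool]:
    """Pair up frames by identifier.

    Returns (pairs, dropped, reordered) where pairs holds
    (index in sent stream, index in received stream), dropped lists ids
    that never arrived, and reordered is True if arrival order differs
    from sending order.
    """
    sent_position = {frame_id: i for i, frame_id in enumerate(sent_ids)}
    received = set(received_ids)
    pairs = [(sent_position[frame_id], j)
             for j, frame_id in enumerate(received_ids)
             if frame_id in sent_position]
    dropped = [frame_id for frame_id in sent_ids if frame_id not in received]
    reordered = any(a[0] > b[0] for a, b in zip(pairs, pairs[1:]))
    return pairs, dropped, reordered
```

With sent ids [10, 11, 12, 13] and received ids [10, 12, 11], the helper pairs three frames, reports 13 as dropped, and flags the stream as reordered.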
10. The method of claim 8, further comprising storing the displayable video data.
A method for processing and storing displayable video data involves capturing video frames from a video source, such as a camera or video file, and analyzing the frames to identify regions of interest. The method includes applying image processing techniques to enhance or modify the identified regions, such as adjusting brightness, contrast, or applying filters. The processed frames are then encoded into a compressed video format to reduce file size while maintaining visual quality. The encoded video data is stored in a memory or storage device for later retrieval and display. The method may also include metadata generation, where additional data such as timestamps, frame rates, or processing parameters are associated with the video data. The stored video data can be accessed and displayed on a display device, such as a monitor or screen, for viewing by a user. The method ensures efficient storage and retrieval of high-quality video content while optimizing processing resources.
12. The method of claim 8, wherein the marker contains monochrome content.
A method for processing visual markers in augmented reality or computer vision applications addresses the challenge of efficiently detecting and decoding markers in varying lighting conditions. The method involves capturing an image of a marker, which is a visual pattern used to trigger augmented reality content or provide spatial references. The marker contains monochrome content, meaning it consists of distinct black and white regions without color variations, simplifying detection and reducing computational complexity. The method includes preprocessing the captured image to enhance contrast and noise reduction, followed by edge detection to identify the marker's boundaries. A decoding step interprets the marker's pattern to extract embedded data, such as identifiers or positional information. The monochrome design ensures robust detection under different lighting conditions and minimizes processing overhead, making it suitable for real-time applications. This approach improves reliability and speed in augmented reality systems, robotics navigation, and object tracking.
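To make the robustness argument for monochrome content concrete, the sketch below encodes an identifier as a row of black or white blocks and decodes it by thresholding, so moderate brightness changes do not alter the result. The bit layout, the 0/255 levels, the threshold of 128, and the function names are all assumptions chosen for illustration.

```python
from typing import List, Sequence

BLACK, WHITE = 0, 255


def encode_monochrome(frame_id: int, bits: int) -> List[int]:
    """Encode frame_id as `bits` blocks, most significant bit first."""
    if frame_id < 0 or frame_id >= 1 << bits:
        raise ValueError("frame_id does not fit in the requested number of bits")
    return [WHITE if (frame_id >> (bits - 1 - i)) & 1 else BLACK
            for i in range(bits)]


def decode_monochrome(levels: Sequence[int], threshold: int = 128) -> int:
    """Decode block brightness levels back to an integer by thresholding."""
    value = 0
    for level in levels:
        value = (value << 1) | (1 if level >= threshold else 0)
    return value
```

Decoding still succeeds when the levels have drifted, for example [12, 240, 30, 201] decodes to the same value as the clean pattern [0, 255, 0, 255].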
13. The method of claim 8, wherein the marker contains a QR code.
This invention relates to a system and method for tracking and identifying objects using markers, particularly in environments where visual identification is challenging. The invention addresses the need for reliable object tracking in automated systems, such as manufacturing, logistics, or inventory management, where traditional identification methods may fail due to environmental conditions or object movement. The system employs markers attached to objects, which are detected and analyzed by a vision system to determine the object's identity, orientation, or other relevant data. The markers may include encoded information, such as a QR code, which can be scanned and decoded to retrieve specific details about the object. The vision system processes images of the markers, extracts the encoded data, and uses this information to track or manage the object within the system. The use of QR codes allows for high-density data storage and quick, accurate decoding, even in dynamic environments. The system may also include calibration techniques to ensure accurate marker detection and decoding, improving reliability in real-world applications. This approach enhances automation by enabling precise object identification and tracking without requiring direct physical contact or complex sensor setups.
14. The method of claim 8, wherein the syntactic element(s) are H.264 macroblocks.
The invention relates to video processing, specifically to methods for analyzing and manipulating syntactic elements within video data. The problem addressed is the need for efficient and accurate identification and processing of specific structural components in video streams to enable tasks such as compression, error correction, or content analysis. The invention provides a method for processing video data by identifying and manipulating syntactic elements, which are discrete units of video information. In particular, the method involves recognizing and handling H.264 macroblocks, which are fundamental building blocks in the H.264/AVC video coding standard. Each macroblock covers a fixed-size 16x16 region of luma samples, together with the associated chroma samples, and is the basic unit at which prediction and residual data are coded. The method may include steps such as detecting macroblock boundaries, extracting macroblock data, or modifying macroblock attributes to achieve desired processing outcomes. By focusing on H.264 macroblocks, the invention enables precise control over video data at a granular level, facilitating applications in video editing, transcoding, or quality assessment. The approach leverages the structured nature of H.264 encoding to ensure compatibility with existing video systems while enabling advanced processing capabilities.
15. The method of claim 8, wherein the syntactic element(s) are HEVC coding units.
This invention relates to video encoding and decoding, specifically improving efficiency in High Efficiency Video Coding (HEVC) by optimizing the processing of coding units. HEVC is a widely used video compression standard that divides video frames into hierarchical coding units for efficient encoding. However, existing methods may not fully leverage the structural relationships between these units, leading to suboptimal compression performance. The invention addresses this by analyzing and processing syntactic elements, specifically HEVC coding units, to enhance encoding efficiency. In HEVC, each frame is divided into coding tree units (CTUs); each CTU can be recursively split through a quadtree into coding units (CUs), and each CU is further partitioned into prediction units (PUs) and transform units (TUs). The method involves identifying and utilizing the hierarchical relationships and dependencies between these units during encoding and decoding. By doing so, the invention reduces redundancy, improves compression ratios, and minimizes computational overhead. The approach may include techniques such as adaptive splitting of coding units based on content characteristics, optimized prediction modes for smaller units, and efficient syntax parsing to streamline the encoding process. This results in faster encoding and decoding while maintaining or improving video quality. The method is particularly useful in applications requiring high compression efficiency, such as streaming, video conferencing, and storage systems.
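The recursive splitting of HEVC blocks described above follows a quadtree pattern, which the sketch below models in isolation. It assumes a 64x64 coding tree unit and an 8x8 minimum coding unit (both are sizes HEVC permits), and it takes the split decision from a caller-supplied function; split_ctu and should_split are hypothetical names, and no real encoder decision logic is implied.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block in samples


def split_ctu(x: int, y: int, size: int,
              should_split: Callable[[int, int, int], bool],
              min_size: int = 8) -> List[Block]:
    """Return the leaf coding units of a quadtree rooted at (x, y, size).

    A block is divided into four equal quadrants whenever should_split
    says so and the block is still larger than min_size.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves: List[Block] = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(split_ctu(x + dx, y + dy, half, should_split, min_size))
        return leaves
    return [(x, y, size)]
```

Splitting only at the 64x64 level, for example, yields four 32x32 coding units in upper-left, upper-right, lower-left, lower-right order.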
16. The method of claim 8, wherein the syntactic element(s) are one of AV1, AV2, or VP9 coding units.
The invention relates to video encoding and decoding, specifically improving efficiency in handling syntactic elements within video codecs. The problem addressed is the computational overhead and inefficiency in processing certain syntactic elements during video compression, particularly in modern codecs like AV1, AV2, or VP9. These codecs use coding units (CUs) to partition video frames, but existing methods often lack optimized techniques for managing these units, leading to suboptimal performance. The invention provides a method for processing syntactic elements in video encoding or decoding, where the syntactic elements are coding units (CUs) from AV1, AV2, or VP9. The method involves analyzing and manipulating these CUs to enhance compression efficiency, reduce computational complexity, or improve decoding speed. The approach may include techniques such as adaptive partitioning, optimized prediction modes, or selective quantization adjustments tailored to the specific CU structure of the codec. By focusing on these syntactic elements, the method ensures compatibility with existing codec standards while improving overall encoding and decoding performance. The solution is particularly useful in applications requiring real-time video processing, such as streaming, video conferencing, or high-definition video playback.
17. The method of claim 8, wherein the marker is displayed at a common predetermined spatial location of the video, the spatial location determined by a location represented by the syntactic elements.
This invention relates to video processing and display systems, specifically addressing the challenge of visually indicating syntactic elements within video content. The method involves analyzing video frames to identify syntactic elements, such as text, objects, or other structured data, and then displaying a marker at a predetermined spatial location in the video that corresponds to the identified elements. The marker serves as a visual indicator to highlight or annotate the syntactic elements, improving user comprehension or interaction with the video content. The spatial location of the marker is fixed relative to the video frame, ensuring consistency across multiple frames. This approach is particularly useful in applications like video editing, machine learning training, or user interface design, where clear visual feedback on syntactic elements is required. The method may involve preprocessing the video to extract syntactic elements, determining their spatial coordinates, and then rendering the marker at the calculated position. The marker can be a graphical symbol, text label, or other visual cue, and its appearance may be customized based on the type or significance of the syntactic elements. The invention enhances the usability of video analysis tools by providing a standardized way to visually reference syntactic elements within the video content.
18. The method of claim 8, wherein the diagnostic information of each frame uniquely identifies the frame within the video sequence.
A system and method for video processing involves analyzing a video sequence to extract diagnostic information from individual frames. The diagnostic information uniquely identifies each frame within the sequence, allowing for precise tracking and analysis. This is particularly useful in applications where frame-level identification is critical, such as video compression, error detection, or content verification. The method may involve generating a unique identifier for each frame based on its content, metadata, or a combination of both. This identifier ensures that each frame can be distinctly recognized, even in cases where frames may appear similar or identical. The system may also include mechanisms to compare frames across different video sequences or within the same sequence to detect duplicates, errors, or other anomalies. The diagnostic information can be used to improve video encoding efficiency, enhance error correction, or verify the integrity of the video data. By uniquely identifying each frame, the system enables more accurate and reliable video processing operations.
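Once each received frame has been matched to its source frame by identifier, the two can be compared numerically. The sketch below uses PSNR as the comparison; the source text does not name a specific metric, so the choice of PSNR, the flat list-of-samples frame representation, and the function name psnr are assumptions.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], received: Sequence[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are given as flat sequences of sample values. Identical frames
    yield infinity.
    """
    if len(reference) != len(received) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)
```

Running this on each matched pair gives a per-frame quality figure, which is only meaningful if the pairing is correct; that is where unique frame identification matters.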
19. The method of claim 8, wherein the portion of visual content from the respective source frame are located within a non-visible portion of the frame.
This invention relates to video processing, specifically techniques for handling visual content within video frames. The problem addressed is the inefficient or inaccurate processing of visual content that may be located in non-visible portions of video frames, such as during transitions, overlays, or when content is intentionally placed outside the visible display area. The method involves analyzing video frames to identify and extract portions of visual content that are located within non-visible regions of the frame. These regions may include areas outside the standard display boundaries, such as during wipes, fades, or when content is intentionally placed in margins. The extracted content is then processed separately from the visible portions of the frame, allowing for more accurate analysis, editing, or compression. This approach ensures that important visual information is not lost or corrupted during processing, even if it is not immediately visible to the viewer. The method may also involve tracking the movement or transformation of non-visible content across multiple frames, enabling seamless integration or removal of such content in post-processing. Additionally, the technique can be applied to various video formats and encoding standards, ensuring compatibility with existing systems. By distinguishing between visible and non-visible content, the method improves video quality, reduces artifacts, and enhances overall processing efficiency.
21. The system of claim 20, wherein the video coder operates according to H.264.
A video coding system is designed to efficiently encode and decode video data using the H.264 standard. The system includes a video coder that processes video frames to reduce data size while maintaining quality. The coder employs techniques such as motion compensation, intra-frame prediction, and entropy coding to achieve compression. It supports both encoding and decoding operations, allowing for bidirectional video transmission. The system may also include a memory for storing encoded or decoded video data and a processor for executing the coding algorithms. The use of H.264 ensures compatibility with widely adopted video compression standards, enabling efficient storage and transmission of video content across various applications, including streaming, broadcasting, and video conferencing. The system optimizes bandwidth usage and processing efficiency, making it suitable for real-time and high-definition video applications.
22. The system of claim 20, wherein the video coder operates according to HEVC.
The system relates to video coding, specifically improving efficiency in video compression and decompression. The problem addressed is the computational complexity and bandwidth demands of modern video coding standards, which can limit real-time processing and storage efficiency. The system includes a video coder configured to encode or decode video data using the High Efficiency Video Coding (HEVC) standard, which is designed to achieve higher compression ratios than earlier standards like H.264/AVC. HEVC employs advanced techniques such as larger block partitioning, improved motion compensation, and more sophisticated entropy coding to reduce redundancy in video data. The system may also include a memory buffer to store encoded or decoded video data, ensuring smooth processing and reducing latency. Additionally, the system may incorporate a processor to manage the video coder's operations, optimizing resource allocation and ensuring compatibility with various video formats. The integration of HEVC in the video coder allows for efficient compression while maintaining high video quality, making it suitable for applications requiring high-definition video transmission and storage, such as streaming services, video conferencing, and digital broadcasting.
23. The system of claim 20, wherein the video coder operates according to one of AV1, AV2, or VP9 coding units.
A system for video coding includes a video coder configured to encode or decode video data using coding units. The video coder processes video data by dividing it into coding units, which are blocks of pixels used for compression. The system further includes a memory storing the video data and a processor executing instructions to control the video coder. The video coder operates according to one of the AV1, AV2, or VP9 coding standards, which define how video data is partitioned, transformed, and compressed. VP9 is an open video coding format developed by Google, AV1 is its successor developed by the Alliance for Open Media, and AV2 is the planned successor to AV1. The system may also include a display for rendering decoded video or a network interface for transmitting encoded video streams. The video coder dynamically selects coding units based on the video content to optimize compression efficiency and quality. The system is designed for applications in video streaming, broadcasting, and real-time communication, where efficient video compression is essential to reduce bandwidth and storage requirements.
24. The system of claim 20, wherein the select pixel blocks correspond to a common spatial location of the frames.
The invention relates to a video processing system that aligns and processes pixel blocks from multiple video frames to improve image quality or reduce computational complexity. The system addresses the challenge of accurately matching corresponding regions across frames, which is essential for tasks like motion estimation, super-resolution, or noise reduction. The system includes a frame alignment module that identifies and selects pixel blocks from different frames that correspond to the same spatial location in the scene. These selected pixel blocks are then processed together to enhance the video output. The alignment ensures that the pixel blocks represent the same scene content, allowing for effective temporal filtering or fusion. The system may also include a motion compensation module to handle slight misalignments caused by camera movement or object motion. By focusing on pixel blocks from the same spatial location, the system improves the accuracy of frame-based processing tasks, such as deblurring, denoising, or frame interpolation. The invention is particularly useful in applications requiring high-quality video reconstruction from multiple frames, such as surveillance, medical imaging, or consumer electronics.
25. The system of claim 20, wherein the select pixel blocks correspond to a non-visible portion of the frames when displayed.
The invention relates to video processing systems that optimize data handling by selectively processing specific pixel blocks within video frames. The problem addressed is the inefficient use of computational resources when processing entire video frames, including portions that are not visible or relevant to the final output. The system includes a video frame analyzer that identifies and selects pixel blocks corresponding to non-visible portions of the frames, such as those outside the display area, obscured by overlays, or otherwise not rendered in the final output. These selected pixel blocks are then processed differently from visible portions, such as by skipping, compressing, or applying reduced-quality processing to conserve resources. The system may also include a frame buffer manager that dynamically adjusts processing based on the visibility status of pixel blocks, ensuring that only necessary computations are performed. This approach improves efficiency by reducing unnecessary processing of non-visible data, leading to faster encoding, decoding, or rendering while maintaining visual quality for the visible portions. The invention is particularly useful in video streaming, real-time rendering, and display systems where resource optimization is critical.
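One familiar way a coded frame can contain pixel blocks that are never displayed is padding: block-based coders code whole blocks, so a 1080-line picture coded with 16x16 blocks occupies 1088 coded lines, and the extra lines are cropped before display. Whether the claim relies on this particular mechanism is an assumption; the sketch below only computes the size of such a hidden margin, and hidden_margin is a hypothetical name.

```python
from typing import Tuple


def hidden_margin(display_width: int, display_height: int,
                  block: int = 16) -> Tuple[int, int]:
    """Return (extra columns, extra rows) coded but not displayed.

    The coded size is the display size rounded up to a multiple of the
    block size; the difference is cropped away by the decoder.
    """
    coded_width = -(-display_width // block) * block    # ceiling to block multiple
    coded_height = -(-display_height // block) * block
    return coded_width - display_width, coded_height - display_height
```

For a 1920x1080 picture this gives 0 extra columns and 8 extra rows, while 1280x720 has no hidden margin because both dimensions are multiples of 16.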
27. The method of claim 8, wherein the syntactic element(s) corresponding to a non-visible spatial portion of visual content replaces a coded portion of the visual content.
This invention relates to digital image processing, specifically techniques for handling non-visible spatial portions of visual content. The problem addressed is the inefficient or inaccurate representation of visual data when certain regions are not meant to be displayed, such as obscured or hidden areas in an image or video. The solution involves replacing these non-visible spatial portions with syntactic elements, which are structured data representations that can be more efficiently processed or transmitted. These syntactic elements are used to replace coded portions of the visual content, allowing for optimized storage, transmission, or further processing. The method ensures that the non-visible regions do not consume unnecessary resources while maintaining the integrity of the visible content. The syntactic elements may include metadata, compression markers, or other structured data that describe the non-visible regions without requiring full pixel-level encoding. This approach is particularly useful in applications like video streaming, augmented reality, or image compression, where bandwidth and processing efficiency are critical. By dynamically replacing non-visible portions with syntactic elements, the system reduces computational overhead and improves overall performance.
October 1, 2021
April 23, 2024