In general, techniques are described regarding adaptation of image capture modes to provide refined image capture capabilities. Cameras comprising image sensors configured to perform the techniques are also disclosed. An image sensor may use different image capture modes to capture frames of image data based at least in part on various operating conditions. For example, a processor can cause an image sensor to transition between image capture modes specifying different binning levels in the case that a digital zoom level satisfies a predefined threshold.
Original claims text from the patent document.
Claim 1: . An apparatus configured to capture image data, the apparatus comprising:
Claim 2: . The apparatus of, wherein the one or more processors are further configured to:
Claim 3: . The apparatus of, wherein the second image sensor and the third image sensor that corresponds to the second camera comprises the first image sensor comprise the same number of pixels.
Claim 4: . The apparatus of, wherein the one or more processors are further configured to:
Claim 5: . The apparatus of, wherein to cause the first image sensor to capture the first one or more frames of the image data using the first image capture mode, the one or more processors are configured to cause the first image sensor to combine four pixels of the first image sensor for each output pixel of the image data, and
Claim 6: . The apparatus of claim, wherein the one or more processors are further configured to:
Claim 7: . The apparatus of, wherein the one or more processors are further configured to:
Claim 8: . The apparatus of, wherein the reduced portion comprises a predefined percentage of a full field of view of the first image sensor, wherein the full field of view comprises all pixels of the first image sensor.
Claim 9: . The apparatus of, wherein the first image capture mode provides a lower pixel density compared to that of the second image capture mode.
Claim 10: . The apparatus of, wherein the one or more processors are further configured to:
Claim 11: . The apparatus of, wherein to cause the first image sensor to capture the first one or more frames of the image data using the first image capture mode, the one or more processors are configured to cause the first image sensor to combine at least five pixels of the first image sensor for each output pixel of the image data, wherein to cause the first image sensor to capture the second one or more frames of the image data using the second image capture mode, the one or more processors are configured to cause the first image sensor to use four pixels of the first image sensor for each output pixel of the image data, and
Claim 12: . The apparatus of, wherein the one or more processors are further configured to:
Claim 13: . The apparatus of, wherein the one or more processors are further configured to:
Claim 14: . The apparatus of, further comprising:
Claim 15: . A method of capturing image data comprising:
Claim 16: . The method of, further comprising:
Claim 17: . The method of, wherein the number of pixels of the second image sensor and the third image sensor corresponding to the second camera that are used for each output pixel is based at least in part on a difference between a total number of pixels of the first image sensor and a total number of pixels of the image sensor corresponding to the second camera comprise the same number of pixels.
Claim 18: . The method of, further comprising:
Claim 19: . The method of, further comprising:
Claim 20: . The method of, wherein causing the first image sensor to capture the first one or more frames of the image data using the first image capture mode comprises causing the first image sensor to combine at least five pixels of the first image sensor for each output pixel of the image data, wherein causing the first image sensor to capture the second one or more frames of the image data using the second image capture mode comprises causing the first image sensor to use four pixels of the first image sensor for each output pixel of the image data, and
Claim 21: . The method of, further comprising:
Claim 22: . The method of, further comprising:
Claim 23: . An apparatus configured to capture image data, the apparatus comprising:
Claim 24: . The apparatus of, the apparatus further comprising:
Claim 25: . The apparatus of, wherein the means for causing the image sensor to capture the first one or more frames of the image data using the first image capture mode comprises means for causing the image sensor to combine four pixels of the image sensor for each output pixel of the image data, and
Claim 26: . The apparatus of, further comprising:
Claim 27: . A non-transitory computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to:
Claim 28: . The non-transitory computer-readable storage medium of, wherein the one or more processors are further configured to:
Claim 29: . The non-transitory computer-readable storage medium of, wherein the instructions, when executed, further cause one or more processors to at least:
Claim 30: . The non-transitory computer-readable storage medium of, wherein the instructions, when executed, further cause one or more processors to at least:
Claim 31: . The non-transitory computer-readable storage medium of, wherein the instructions, when executed, further cause one or more processors to at least:
Claim 32: 32. The method of, wherein the second image sensor and the third image sensor are 12MP image sensors.
Claim 33: 33. The method of, further comprising:
Claim 34: 34. The method of, wherein the reduced portion comprises a predefined percentage of a full field of view of the first image sensor, wherein the full field of view comprises all pixels of the first image sensor.
Claim 35: 35. The method of, wherein the first image capture mode provides a lower pixel density compared to that of the second image capture mode.
Claim 36: 36. The apparatus of, further comprising:
Claim 37: 37. The apparatus of, wherein the second image sensor and the third image sensor comprise the same number of pixels.
Claim 38: 38. The apparatus of, further comprising:
Claim 39: 39. The method of, wherein the reduced portion comprises a predefined percentage of a full field of view of the first image sensor, wherein the full field of view comprises all pixels of the first image sensor.
Claim 40: 40. The apparatus of, further comprising:
Claim 41: 41. The apparatus of, wherein the second image sensor and the third image sensor are 12MP image sensors.
Claim 42: 42. The apparatus of, further comprising:
Claim 43: 43. The apparatus of, wherein the first image capture mode provides a lower pixel density compared to that of the second image capture mode.
Claim 44: 44. The non-transitory computer-readable storage medium of, wherein the second image sensor and the third image sensor comprise the same number of pixels.
Claim 45: 45. The non-transitory computer-readable storage medium of, wherein the second image sensor and the third image sensor are 12MP image sensors.
Claim 46: 46. The non-transitory computer-readable storage medium of, wherein the instructions further cause the one or more processors to:
Claim 47: 47. The non-transitory computer-readable storage medium of, wherein the instructions further cause the one or more processors to:
Claim 48: 48. The non-transitory computer-readable storage medium of, wherein the reduced portion comprises a predefined percentage of a full field of view of the first image sensor, wherein the full field of view comprises all pixels of the first image sensor.
Claim 49: 49. The non-transitory computer-readable storage medium of, wherein the first image capture mode provides a lower pixel density compared to that of the second image capture mode.
Claim 50: 50. The non-transitory computer-readable storage medium of, wherein instructions further cause the one or more processors to:
Complete technical specification and implementation details from the patent document.
This application claims priority to U.S. Provisional Patent Application No. 62/897,697, filed Sep. 9, 2019, which is hereby incorporated by reference in its entirety.
The disclosure relates to image capture and processing.
Image capture devices are commonly incorporated into a wide variety of devices. In this disclosure, an image capture device refers to any device that can capture one or more digital images, including devices that can capture still images and devices that can capture sequences of images to record video. By way of example, image capture devices may comprise stand-alone digital cameras or digital video camcorders, camera-equipped wireless communication device handsets, such as mobile telephones having one or more cameras, cellular or satellite radio telephones, camera-equipped personal digital assistants (PDAs), panels or tablets, gaming devices, computer devices that include cameras, such as so-called “web-cams,” or any devices with digital imaging or video capabilities.
Certain digital cameras feature the ability to capture specified portions of a potential field-of-view (FOV) for cameras, such as with zoom-in, zoom-out, panoramic, telescope, telephoto, or periscope features. Such features allow a digital camera, using an image sensor and camera processor, to enhance, or otherwise alter, the capture of a scene (or sequence of images, in the case of a video) depending on user settings and/or manipulations (e.g., pinch-to-zoom, flash settings, aspect ratio settings, etc.). In some examples, the potential FOV refers to the entirety of pixels that a particular image sensor can use to sense a scene or image sequence.
In addition, certain image capture devices may include multiple image sensors and/or multiple lenses that may be used in conjunction with one another, or otherwise, may be toggled from one camera to another camera. Example lens types include wide-angle lenses, ultra-wide-angle lenses, telephoto lenses, telescope lenses, periscope-style zoom lenses, fisheye lenses, macro lenses, prime lenses, or various combinations thereof. For example, a dual camera configuration may include both a wide lens and a telephoto lens. Similarly, a triple camera configuration may include an ultra-wide lens, in addition to a wide lens and a telephoto lens. By using multiple lenses and/or image sensors, image capture devices are able to capture images with different FOVs and/or optical zoom levels. Image sensors can then output a number of pixels and pixel values to a camera processor for further processing.
In general, this disclosure describes image capture techniques involving digital cameras having image sensors, lenses having respective optical zooms, and camera processors. The camera processors may be configured to cause the image sensors to capture image data, such as frames of video data and/or still image shots, using various image capture modes. For example, a camera processor may cause an image sensor to capture one or more frames of image data using pixel binning. An image sensor performs pixel binning by combining multiple pixels of an image sensor into fewer pixels for output to a camera processor (e.g., 4×4 binning, 3×3 binning, 2×2 binning, horizontal binning, vertical binning, etc.).
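The binning operation described above can be illustrated with a minimal sketch. It assumes a monochrome frame held as a list of rows and combines pixels by averaging; the function name `bin_pixels` and the sample values are hypothetical, and a real sensor would typically bin same-color pixels of its color filter array in hardware rather than in software.

```python
def bin_pixels(frame, factor):
    """Average each factor x factor block of `frame` into one output pixel.

    `frame` is a list of equal-length rows of numeric pixel values.
    A factor of 1 corresponds to non-binning (one input pixel per output pixel).
    """
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame dimensions must be multiples of the binning factor")
    binned = []
    for row in range(0, height, factor):
        out_row = []
        for col in range(0, width, factor):
            # Collect the pixels that contribute to this output pixel.
            block = [frame[r][c]
                     for r in range(row, row + factor)
                     for c in range(col, col + factor)]
            out_row.append(sum(block) / len(block))
        binned.append(out_row)
    return binned


# Hypothetical 4x4 frame; 2x2 binning yields a 2x2 output.
frame = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 32, 40, 42],
    [34, 36, 44, 46],
]
binned = bin_pixels(frame, 2)
print(binned)  # [[13.0, 23.0], [33.0, 43.0]]
```

With 2×2 binning, the sixteen input pixels become four output pixels, which is the reduction in output pixel count that the camera processor then has to work with during digital zoom.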
The image sensor may output combined pixels to the camera processor for further processing. Based on a desired zoom level or zoom ratio, a camera processor may perform digital zoom techniques, such as digital cropping, upsampling, downsampling, scaling, or combinations thereof. As the zoom level increases, however, so does the amount of upsampling or scaling performed by the camera processor. As such, the resulting image data from the increasing zoom levels tends to provide a distorted image or video because of the amount of upsampling or scaling performed to achieve the desired digital zoom level.
In some examples, as the user-requested digital zoom level exceeds a camera transition threshold, the camera processor may be configured to transition between cameras having different optical zoom levels and/or different effective focal lengths. Before the transition between cameras, however, the camera processor may continue to increase the amount of upsampling or scaling used to achieve the desired digital zoom level. In some instances, transitions between cameras may not occur until a large amount of zoom increase has transpired (e.g., from 1.0× zoom to 5.0× zoom, or more).
In an attempt to bridge such wide gaps, certain cameras may use image sensors with increasingly high pixel counts (e.g., 48MP, 64MP, 108MP), so as to provide as many output pixels as possible for the camera processor to perform digital zoom operations. Because pixel binning is then used with such large image sensors, however, a camera processor will still have fewer pixels to use in order to achieve a desired digital zoom level at a desired digital resolution. As such, the camera processor will nevertheless use upsampling or scaling to bridge the gap between camera transitions, and the amount of image distortion caused by the upsampling or scaling will continue to worsen until a camera transition occurs. Thus, with wide gaps between camera transitions, the amount of image distortion caused by upsampling or scaling can seriously degrade image or video quality, and the user experience will suffer as the user attempts to perform zoom operations that reach the outer limits of each camera.
Moreover, at the moment when the camera transition occurs, the image or video quality will abruptly change due to the sudden decrease in the amount of upsampling or scaling performed by the camera processor in view of the different optical zoom level of the second camera. This abrupt change is undesirable as it provides inconsistent image or video quality as the digital zoom level increases and transitions between cameras occur.
In accordance with various techniques of this disclosure, the camera processor may cause the image sensor to change binning levels when the camera processor detects that the desired zoom level satisfies a predefined binning transition threshold or camera transition threshold. As such, the image sensor may utilize different binning levels to capture frames of image data, whether that be video data, still photos, or combinations thereof, in response to the desired digital zoom level increasing or decreasing. In some examples, the desired zoom level may be specified by a user or may be the result of an automatic zoom operation. A camera processor may then receive the image data from the image sensor in the form of pixel information binned or not binned in accordance with the various binning level transitions.
In some instances, depending on the difference between transitioning binning levels, the image sensor may determine a reduced amount of output pixels that were binned at the transitioned binning level prior to outputting those pixels to the camera processor. For example, the image sensor may determine a center portion of the pixels binned using a second, lower level of pixel binning and output the center portion of pixels to the camera processor, rather than the full amount of pixels binned at the second level. In some examples, this may be done in order to maintain a constant or near constant throughput or bit rate to the camera processor between binning transitions. The camera processor may then perform additional digital zoom operations to achieve the desired zoom level using the received output pixels.
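The center-portion output described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical 8000×6000 (48MP) sensor that moves from 2×2 binning to non-binning and that the goal is to keep the number of output pixels per frame unchanged; the function name `center_window` and the dimensions are assumptions, not values from the disclosure.

```python
import math


def center_window(width, height, target_pixels):
    """Return (x0, y0, w, h) of a centered window holding about `target_pixels`.

    `width` and `height` describe the frame produced at the new (lower)
    binning level. The window keeps the frame's aspect ratio.
    """
    total = width * height
    if target_pixels >= total:
        # Nothing to trim: the whole frame already fits the pixel budget.
        return 0, 0, width, height
    scale = math.sqrt(target_pixels / total)
    w = int(width * scale)
    h = int(height * scale)
    x0 = (width - w) // 2
    y0 = (height - h) // 2
    return x0, y0, w, h


# Hypothetical sensor: 8000 x 6000 native pixels.
# At 2x2 binning the sensor outputs 4000 x 3000 pixels per frame.
pixels_before_transition = 4000 * 3000

# After the transition to non-binning, output only the center portion so the
# per-frame pixel count (and hence throughput) stays the same.
window = center_window(8000, 6000, pixels_before_transition)
print(window)  # (2000, 1500, 4000, 3000)
```

In this sketch the center window covers one quarter of the non-binned frame, so the camera processor receives the same 12 million pixels per frame before and after the transition, but drawn from a narrower field of view at full pixel density.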
In accordance with techniques of this disclosure, a camera processor may receive consistent image quality as a user increases a desired zoom level. In addition, the camera processor may use less upsampling and scaling to achieve the desired zoom level at a desired output resolution. The techniques of this disclosure further provide extended zoom ranges for the camera processor due to, for example, timely transitions of the image sensor between binning levels and a decrease in the amount of upsampling the camera processor performs to achieve the desired zoom level between transitions.
In one example, the techniques of the disclosure are directed to an apparatus configured to capture image data, the apparatus comprising: a memory configured to store image data, and one or more processors in communication with the memory, the one or more processors configured to: cause a first image sensor to capture one or more frames of image data using a first image capture mode in the case that a digital zoom level is less than a first predefined threshold, wherein the first image capture mode includes a first binning level that uses two or more pixels of the first image sensor for each output pixel of the image data, determine that a requested digital zoom level is greater than the first predefined threshold, and cause the first image sensor to capture one or more frames of the image data using a second image capture mode, wherein the second image capture mode includes a second binning level that uses fewer pixels of the first image sensor relative to the first binning level for each output pixel of the image data.
In another example, the techniques of the disclosure are directed to a method of capturing image data, the method comprising: causing a first image sensor to capture one or more frames of image data using a first image capture mode in the case that a digital zoom level is less than a first predefined threshold, wherein the first image capture mode includes a first binning level that uses two or more pixels of the first image sensor for each output pixel of the image data, determining that a requested digital zoom level satisfies the first predefined threshold, and causing the first image sensor to capture one or more frames of the image data using a second image capture mode, wherein the second image capture mode includes a second binning level that uses fewer pixels of the first image sensor relative to the first binning level for each output pixel of the image data.
In another example, the techniques of the disclosure are directed to an apparatus configured to capture image data, the apparatus comprising: means for causing an image sensor to capture a first one or more frames of image data using a first image capture mode in the case that a digital zoom level is less than a first predefined threshold, wherein the first image capture mode includes a first binning level that uses two or more pixels of the image sensor for each output pixel of the image data, means for determining that a requested digital zoom level satisfies the first predefined threshold, and means for causing the image sensor to capture a second one or more frames of the image data using a second image capture mode, wherein the second image capture mode includes a second binning level that uses fewer pixels of the image sensor relative to the first binning level for each output pixel of the image data.
In another example, the techniques of the disclosure are directed to a non-transitory computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to: cause a first image sensor to capture one or more frames of image data using a first image capture mode in the case that a digital zoom level is less than a first predefined threshold, wherein the first image capture mode includes a first binning level that uses two or more pixels of the first image sensor for each output pixel of the image data, determine that a requested digital zoom level satisfies the first predefined threshold, and cause the first image sensor to capture one or more frames of the image data using a second image capture mode, wherein the second image capture mode includes a second binning level that uses fewer pixels of the first image sensor relative to the first binning level for each output pixel of the image data.
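The threshold logic common to the examples above can be sketched in a few lines. The names `CaptureMode` and `select_capture_mode`, the choice of 2×2 binning as the first mode, and the 2.0× threshold are all hypothetical; the sketch also assumes that a zoom level equal to the threshold counts as satisfying it, which the examples above leave open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureMode:
    name: str
    pixels_per_output_pixel: int  # sensor pixels combined into each output pixel


# Assumed modes: the first mode bins four pixels, the second bins none.
FIRST_MODE = CaptureMode("2x2 binning", 4)
SECOND_MODE = CaptureMode("non-binning", 1)

# Hypothetical first predefined threshold (digital zoom level).
BINNING_TRANSITION_ZOOM = 2.0


def select_capture_mode(requested_zoom, threshold=BINNING_TRANSITION_ZOOM):
    """Pick the capture mode for the requested digital zoom level."""
    if requested_zoom < threshold:
        return FIRST_MODE
    # Assumption: reaching the threshold is treated as satisfying it.
    return SECOND_MODE


for zoom in (1.0, 1.9, 2.0, 3.5):
    print(zoom, select_capture_mode(zoom).name)
```

The only property the sketch relies on is that the second mode uses fewer sensor pixels per output pixel than the first, which mirrors the relationship between the first and second binning levels stated in each example.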
The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.
An image sensor may use various capture modes under various circumstances to capture frames of image data, such as video data or snapshots, using various binning levels. For example, certain camera processors can cause image sensors to use various binning techniques to capture image data by combining multiple pixels of an image sensor into a fewer number of pixels. The image sensor may then output the combined pixels to the camera processor. Binning techniques can be particularly advantageous in cases where a camera processor interfaces with high-resolution or ultra-high resolution image sensors (e.g., 12-megapixels (MP), 48MP, 64MP, 108MP, 120MP, 144MP, etc.). For example, the pixels on such image sensors can have small physical sizes. In such examples, the incident light on each pixel can be limited, especially in low-light conditions, and as such, each individual pixel can become increasingly susceptible to noise.
As such, binning techniques improve the signal-to-noise ratio (SNR) by combining pixels together through various combination schemes, including averaging or summing multiple pixels together for each output pixel. In addition, binning techniques can provide higher frame rates for digital cameras, for example, in the case of recording video or providing image previews to a user via a display device. In addition, binning techniques can reduce the processing burden and demands placed on the camera processor, thereby reducing the amount of processing performed on the image data received from an image sensor, and also, increasing efficiency of the system overall. Furthermore, when an image sensor uses binning techniques at certain zoom levels, the human eye may not be able to perceive that the image sensor is using pixel binning, much less whether the image sensor is using 4×4 binning, 2×2 binning, or non-binning, etc. That is, the human eye may not perceive degradation due to binning until the zoom level increases to certain high zoom levels. While binning techniques may come at the cost of spatial resolution, the output resolution can still be high due to the pixel density of certain image sensors.
The image sensor may then output the combined pixels to a camera processor. Using the output pixels from the image sensor, the camera processor may perform digital zoom techniques to achieve a desired digital zoom level at a desired output resolution (e.g., 1080p, 4K, 8K, etc.). In order to do so, the camera processor may perform any number of digital zoom techniques, including downsampling, cropping, upsampling, scaling, or combinations thereof. In some examples, the camera processor may first crop the pixels received from the image sensor to remove those pixels that fall outside of a field of zoom before then upsampling or downsampling the remaining pixels. Upsampling or scaling, however, tends to degrade the resulting image quality as the zoom level increases. This is because upsampling or scaling involves the process of using known pixel values received from the image sensor and interpolating between those pixel values in order to artificially create new pixel values until the desired output resolution is achieved. Performing large amounts of upsampling can, thus, distort the resulting image or video. The image quality may continue to degrade and become distorted as the amount of upsampling increases.
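The crop-then-scale arithmetic described above can be made concrete with a small sketch. It assumes a hypothetical sensor that is 8000 pixels wide and a 4K output that is 3840 pixels wide, and it considers only the horizontal axis; the function name `upsampling_factor` is an illustrative choice.

```python
def upsampling_factor(sensor_output_width, zoom_level, target_width):
    """Return the scale factor needed after cropping to the field of zoom.

    A result above 1.0 means the camera processor must upsample (interpolate
    new pixel values); a result below 1.0 means it can downsample instead.
    """
    if zoom_level < 1.0:
        raise ValueError("zoom_level must be at least 1.0")
    # Cropping keeps only the pixels inside the field of zoom.
    cropped_width = sensor_output_width / zoom_level
    return target_width / cropped_width


TARGET_WIDTH = 3840          # 4K output width
BINNED_WIDTH = 8000 // 2     # hypothetical sensor output with 2x2 binning
NON_BINNED_WIDTH = 8000      # same sensor without binning

print(upsampling_factor(BINNED_WIDTH, 1.0, TARGET_WIDTH))      # below 1.0: downsampling
print(upsampling_factor(BINNED_WIDTH, 4.0, TARGET_WIDTH))      # well above 1.0
print(upsampling_factor(NON_BINNED_WIDTH, 4.0, TARGET_WIDTH))  # half of the binned case
```

Under these assumptions, the binned output needs no interpolation at 1.0× zoom, but at 4.0× zoom the processor must create almost four output pixels per axis for every received pixel, while the non-binned output needs half that amount.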
In addition, the camera processor may switch between multiple cameras having different effective focal lengths. For example, a camera processor may switch between cameras in response to detecting a camera transition trigger, such as a particular zoom level satisfying a camera transition threshold. In some examples, however, the amount of increasing zoom levels may be quite high before a particular camera transition threshold is met. For example, a first camera may be used for digital zoom levels between 1.0× and 5.0× zoom, with a second camera being used for zoom levels greater than 5.0× zoom. Thus, in order to bridge the gap between various cameras, larger and larger image sensors may be used in an effort to provide the camera processor with as many pixels as possible until the next camera transition occurs.
In such instances, the image quality will continue to degrade due to the camera processor using digital upsampling or digital scaling techniques until the camera processor switches to the next camera, which may not occur until a high enough zoom level has been reached. At that time, the camera processor may perform less upsampling due to the difference in effective focal length of the transitioned camera. Thus, switching between cameras also provides a user with inconsistent image quality. For example, as the camera processor uses upsampling or scaling, the image quality will continue to degrade until the desired zoom level reaches the camera transition threshold, at which point the image quality will suddenly improve and then start to degrade again once the camera processor starts upsampling again with the new camera. These distortions and inconsistencies in image quality can be particularly noticeable when using digital zoom while recording high resolution video at high frame rates that may switch between multiple cameras as the user attempts to zoom in and out of a scene while recording video.
The aforementioned problems, among others, may be addressed by the disclosed capture mode adaptation techniques by providing cameras capable of leveraging various pixel binning levels at various zoom levels. Specifically, depending on the binning technique used, a camera processor may determine a binning transition threshold at which to cause the image sensor to alter the pixel binning level at the image sensor. In one example, the camera processor may cause the image sensor to transition from one binning level (e.g., 2×2, 3×3, 4×4, 8×8, etc.) to a second binning level that provides less binning than the first binning level (e.g., non-binning). It should be noted that “non-binning” or “no binning” generally involves an image sensor using one pixel of the image sensor, rather than using multiple pixels, for each output pixel of the image data that is to be output to a camera processor. In other words, non-binning may mean that there is a 1-to-1 correspondence between input pixels and output pixels.
In some examples, an image sensor may transition from one binning level to a lower binning level once a predefined binning transition threshold is met (e.g., a particular zoom level). Advantageously, the human eye may not perceive a difference between images captured using binning techniques or not at low zoom levels, such as zoom levels below the binning transition threshold amount. This allows the image capture device to utilize binning at certain lower zoom levels without impacting the user experience. That is, the image sensor may start with a binning level and then transition to a lower binning level as the zoom level increases in order to decrease the amount of upsampling used and to preserve the user experience as much as possible. In addition, camera transitions, coupled with such binning transitions, may allow a user to span the entire zoom spectrum smoothly without readily noticing when the camera transition occurs. Due to the change between binning levels that occurs prior to a particular camera transition, the camera processor may decrease the amount of upsampling or scaling used near the camera transition threshold compared to if a uniform binning level were used for the same zoom span.
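To illustrate the comparison with a uniform binning level, the following sketch tabulates the scale factor a camera processor would need across a zoom span that ends just before a camera transition. All numbers are assumptions chosen for illustration: an 8000-pixel-wide sensor, a 3840-pixel-wide output, a binning transition at 2.0× zoom, and a camera transition at 5.0× zoom.

```python
NATIVE_WIDTH = 8000        # hypothetical sensor width in pixels
TARGET_WIDTH = 3840        # hypothetical output width (4K)
BINNING_THRESHOLD = 2.0    # hypothetical binning transition threshold
CAMERA_THRESHOLD = 5.0     # hypothetical camera transition threshold


def scale_factor(native_width, binning_per_axis, zoom, target_width):
    """Scale factor needed after binning and cropping to the field of zoom."""
    sensor_output_width = native_width / binning_per_axis
    cropped_width = sensor_output_width / zoom
    return target_width / cropped_width


def binning_for_zoom(zoom, binning_threshold):
    """Per-axis binning factor: 2 (2x2 binning) below the threshold, else 1."""
    return 2 if zoom < binning_threshold else 1


for zoom in (1.0, 2.0, 3.0, 4.0, 4.9):
    assert zoom < CAMERA_THRESHOLD  # stay on the first camera
    uniform = scale_factor(NATIVE_WIDTH, 2, zoom, TARGET_WIDTH)
    adaptive = scale_factor(
        NATIVE_WIDTH, binning_for_zoom(zoom, BINNING_THRESHOLD), zoom, TARGET_WIDTH
    )
    print(f"{zoom:.1f}x  uniform 2x2: {uniform:.2f}  with transition: {adaptive:.2f}")
```

In this sketch the two strategies agree below the binning transition threshold and diverge above it, where the non-binned capture halves the scale factor needed as the zoom level approaches the camera transition.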
As is described in detail below, the camera processor may cause such a transition in response to a particular zoom level satisfying a predefined binning transition threshold or, in some cases, as the zoom level approaches the predefined binning transition threshold. The image sensor may then output combined pixels, non-binned pixels, or portions of binned or non-binned pixels to the camera processor, which may then perform digital zoom using the pixels received from the image sensor.
The accompanying figure is a block diagram of a device configured to perform one or more of the example techniques described in this disclosure. Examples of the computing device include a computer (e.g., personal computer, a desktop computer, or a laptop computer), a mobile device such as a tablet computer, a wireless communication device (such as, e.g., a mobile telephone, a cellular telephone, a satellite telephone, and/or a mobile telephone handset), an Internet telephone, a digital camera, a digital video recorder, a handheld device, such as a portable video game device or a personal digital assistant (PDA), a drone device, or any device that may include one or more cameras.
As illustrated in the example, the computing device includes one or more image sensor(s). The image sensor(s) may be referred to in some instances herein simply as “sensor,” while in other instances may be referred to as a plurality of “sensors” where appropriate. The computing device further includes one or more lens(es) and a camera processor. As shown, a camera may refer to a collective device including one or more image sensor(s), one or more lens(es), and at least one camera processor. In any event, multiple cameras may be included with a single computing device (e.g., a mobile phone having one or more front facing cameras and one or more rear facing cameras). In a non-limiting example, one computing device may include a first camera comprising a 16MP image sensor, a second camera comprising a 108MP image sensor, a third camera having a 12MP image sensor, etc., and dual “front-facing” cameras, etc. It should be noted that while some example techniques herein may be discussed in reference to so-called “rear” cameras or multiple rear-facing cameras, the techniques of this disclosure are not so limited, and a person of skill in the art will appreciate that the techniques of this disclosure may be implemented for any type of cameras and for any transitions between cameras that are included with the computing device.
In some instances, a camera may include multiple camera processors. In some instances, multiple camera processors may refer to an image signal processor (ISP) that uses various processing algorithms under various circumstances. In some examples, a camera processor may include an image front end (IFE) and/or an image processing engine (IPE) as part of a processing pipeline. In addition, a camera may include a single one of the sensor(s) or a single one of the lens(es).
As illustrated, the computing device may further include a central processing unit (CPU), an encoder/decoder, a graphics processing unit (GPU), local memory of the GPU, a user interface, a memory controller that provides access to system memory, and a display interface that outputs signals that cause graphical data to be displayed on a display.
While some example techniques are described herein with respect to a single sensor, the example techniques are not so limited, and may be applicable to various camera types used for capturing images/videos, including devices that include multiple image sensors, multiple lens types, and/or multiple camera processors. For example, the computing device may include dual lens devices, triple lens devices, etc. As such, each lens and image sensor combination may provide various optical zoom levels, angles of view (AOV), focal lengths, FOVs, etc. In some examples, one image sensor may be allocated for each lens. That is, multiple image sensors may be each allocated to different lens types (e.g., wide lens, ultra-wide lens, telephoto lens, and/or periscope lens, etc.). For example, a wide lens may correspond to a first image sensor of a first size (e.g., 108MP), whereas an ultra-wide lens may correspond to a second image sensor of a different size (e.g., 16MP). In another example, a telephoto lens may correspond to an image sensor of a third size (e.g., 12MP). In an illustrative example, a single computing device may include two or more cameras, where at least two of the cameras correspond to image sensors having a same size (e.g., two 12MP sensors, three 108MP sensors, three 12MP sensors, two 12MP sensors and a 108MP sensor, etc.). In any event, the image sensors may correspond to different lenses so as to provide multiple cameras for the computing device.
In some examples, a single image sensor may correspond to multiple lenses. In such examples, light guides may be used to direct incident light on lenses to respective image sensor(s). An example light guide may include a prism, a moving prism, mirrors, etc. In this way, light received from a single lens may be redirected to a particular sensor, such as away from one sensor and toward another sensor. For example, camera processor may cause a prism to move and redirect light incident on one of lenses in order to effectively change the focal lengths for the received light. In any event, computing device may include multiple lenses corresponding to a single image sensor. In addition, computing device may include multiple lenses corresponding to separate image sensors. In such instances, separate image sensors may be of different sizes (e.g., a 12MP sensor and a 108MP sensor) or in some examples, at least two of the separate image sensors may be of the same size (e.g., two 12MP sensors, three 108MP sensors, three 12MP sensors, two 12MP sensors and a 108MP sensor, etc.).
In some examples, a single camera processor may be allocated to one or more sensors. In some instances, however, multiple camera processors may be allocated to one or more sensors. For example, camera processor may use multiple processing algorithms under various circumstances to perform digital zoom operations or other processing operations. In examples including multiple camera processors, camera processors may share sensors, where each camera processor may interface with each sensor regardless of any processor-to-image sensor allocation rules. For example, the camera processors may coordinate with one another to efficiently allocate processing resources to the sensor(s).
In addition, while camera may be described as comprising one sensor and one camera processor, camera may include multiple sensors and/or multiple camera processors. In any event, computing device may include multiple cameras that may include one or more sensor(s), one or more lens(es), and/or one or more camera processor(s). In some examples, camera may refer to sensor(s) as the camera device, such that camera is a sensor coupled to camera processor (e.g., via a communication link) and/or lens(es), for example.
Also, although the various components are illustrated as separate components, in some examples the components may be combined to form a system on chip (SoC). As an example, camera processor, CPU, GPU, and display interface may be formed on a common integrated circuit (IC) chip. In some examples, one or more of camera processor, CPU, GPU, and display interface may be in separate IC chips. Various other permutations and combinations are possible, and the techniques of this disclosure should not be considered limited to the illustrated example.
The various illustrated components (whether formed on one device or different devices), including sensor and camera processor, may be formed as at least one of fixed-function or programmable circuitry, or a combination of both, such as in one or more microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), digital signal processors (DSPs), or other equivalent integrated or discrete logic circuitry. Examples of local memory include one or more volatile or non-volatile memories or storage devices, such as random-access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, magnetic data media, or optical storage media.
The various illustrated structures may be configured to communicate with each other using bus. Bus may be any of a variety of bus structures, such as a third-generation bus (e.g., a HyperTransport bus or an InfiniBand bus), a second-generation bus (e.g., an Accelerated Graphics Port bus, a Peripheral Component Interconnect (PCI) Express bus, or an Advanced eXtensible Interface (AXI) bus) or another type of bus or device interconnect. It should be noted that the specific configuration of buses and communication interfaces between the different components shown is merely exemplary, and other configurations of computing devices and/or other image processing systems with the same or different components may be used to implement the techniques of this disclosure.
Camera processor is configured to receive image frames (e.g., pixel data) from sensor, and process the image frames to generate image and/or video content. For example, image sensor may be configured to capture individual frames, frame bursts, frame sequences for generating video content, photo stills captured while recording video, image previews, or motion photos from before and/or after capture of a still photograph. CPU, GPU, camera processors, or some other circuitry may be configured to process the image and/or video content captured by sensor into images or video for display on display. In an illustrative example, CPU may cause image sensor to capture image frames using pixel binning and/or may receive pixel data from image sensor. In the context of this disclosure, image frames may generally refer to frames of data for a still image or frames of video data or combinations thereof, such as with motion photos. Camera processor may receive pixel data of the image frames in any format. For example, the pixel data may include different color formats, such as RGB, YCbCr, YUV, and the like.
In some examples, camera processor may comprise an image signal processor (ISP). For instance, camera processor may include a camera interface that interfaces between sensor and camera processor. Camera processor may include additional circuitry to process the image content. Camera processor may be configured to perform various operations on image data captured by sensor, including auto white balance, color correction, or other post-processing operations.
In addition, camera processor may be configured to analyze pixel data and/or output the resulting images (e.g., pixel values for each of the image pixels) to system memory via memory controller. Each of the images may be further processed for generating a final image for display. For example, GPU or some other processing unit, including camera processor itself, may perform color correction, white balance, blending, compositing, rotation, or other operations to generate the final image content for display.
In addition, computing device may include a video encoder and/or video decoder, either of which may be integrated as part of a combined video encoder/decoder (CODEC). Encoder/decoder may include a video coder that encodes video captured by one or more camera(s) or a decoder that can decode compressed or encoded video data. In some instances, CPU may be configured to encode and/or decode video data, in which case, CPU may include encoder/decoder.
CPU may comprise a general-purpose or a special-purpose processor that controls operation of computing device. A user may provide input to computing device to cause CPU to execute one or more software applications. The software applications that execute on CPU may include, for example, a camera application, a graphics editing application, a media player application, a video game application, a graphical user interface application or another program. For example, a camera application may allow the user to control various settings of camera. The user may provide input to computing device via one or more input devices (not shown) such as a keyboard, a mouse, a microphone, a touch pad or another input device that is coupled to computing device via user interface. For example, user interface may receive input from the user to adjust desired digital zoom levels, alter aspect ratios of image data, record video, take a snapshot while recording video, apply filters to the image capture, select a region-of-interest for automatic-focus operations, record slow motion video or super slow motion video, apply night shot settings, capture panoramic image data, etc.
One example of the software application is a camera application. CPU executes the camera application, and in response, the camera application causes CPU to generate content that display outputs. For instance, display may output information such as light intensity, whether flash is enabled, and other such information. The user of computing device may interface with display (e.g., via user interface) to configure the manner in which the images are generated (e.g., with or without flash, focus settings, exposure settings, and other parameters). The camera application also causes CPU to instruct camera processor to process the images captured by sensor in the user-defined manner. For example, CPU may instruct camera processor to perform a zoom operation on the images captured by sensor. In some examples, CPU may receive a request to zoom from a user (e.g., a pinch-to-zoom command, a discrete input, such as operation of a 0.5× zoom button, 2× zoom button, 3× zoom button, 10× zoom button, etc., a slider input, or some combination thereof). In some examples, a zoom operation may include a digital zoom that comprises a zoom field. For instance, a digital zoom field may include a portion of less than the full FOV of sensor. CPU may then instruct camera processor to perform the digital zoom operation accordingly. In some examples, camera processor may receive the request to zoom from the user directly (e.g., by detecting certain input from a user, such as during a time when the camera application is active and/or running in the foreground of computing device).
In any event, camera processor may cause image sensor to capture frames of image data, such as video data, using various binning levels that either combine multiple pixels of the image sensor into fewer output pixels, as is the case with binning, or use one pixel for each output pixel, as is the case with non-binning. When camera processor receives a zoom request, camera processor may determine whether the zoom request satisfies a predefined binning transition threshold, at which point camera processor may cause image sensor to capture subsequent frames using an altered binning level. In a non-limiting example, image sensor may initially capture image data using a 2×2 binning level, where four pixels are combined into one output pixel. As such, a 108MP image sensor, such as a 108MP image sensor that corresponds to a wide-angle lens, may use four pixels for each output pixel, thereby transferring 27MP binned pixels to camera processor. In another example, image sensor may initially capture image data using a 3×3 binning level, where nine pixels are combined into one output pixel. As such, a 108MP image sensor may use nine pixels for each output pixel, thereby transferring 12MP binned pixels to camera processor.
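The binning arithmetic in the preceding paragraph can be checked with a minimal sketch. The function name `binned_output_megapixels` is hypothetical and is introduced only for illustration; it is not part of the disclosure.

```python
def binned_output_megapixels(sensor_mp: float, bin_factor: int) -> float:
    """Return the output size in MP when bin_factor x bin_factor sensor
    pixels are combined into one output pixel (bin_factor 1 = non-binning)."""
    if bin_factor < 1:
        raise ValueError("bin_factor must be at least 1")
    return sensor_mp / (bin_factor * bin_factor)


# Figures taken from the text: a 108MP sensor.
two_by_two = binned_output_megapixels(108, 2)      # four pixels per output pixel
three_by_three = binned_output_megapixels(108, 3)  # nine pixels per output pixel
non_binned = binned_output_megapixels(108, 1)      # one pixel per output pixel

print(two_by_two, three_by_three, non_binned)
```

With the values from the text, 2×2 binning yields 27MP and 3×3 binning yields 12MP, while non-binning leaves the full 108MP.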
Camera processor may receive a desired digital zoom level that exceeds a first predefined threshold. As such, camera processor may cause image sensor to transition from 2×2 binning to non-binning, where now image sensor uses one pixel of image sensor for each output pixel of the image data. As such, image sensor may potentially output 108MP to camera processor. In some instances, however, camera processor may additionally cause image sensor to maintain a constant or near constant throughput level. As such, camera processor may cause image sensor to output 25% of the potential 108MP to arrive at a constant or near constant throughput of, in this particular example, 27MP.
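One way to picture the transition described above is the following sketch. It assumes a single hypothetical threshold value passed in by the caller (the text does not give a number), and the names `CaptureMode`, `select_mode`, and `throughput_mp` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureMode:
    bin_factor: int         # 2 means 2x2 binning, 1 means non-binning
    output_fraction: float  # portion of the available output pixels that is read out


def select_mode(zoom_level: float, binning_threshold: float) -> CaptureMode:
    """Pick a capture mode for the requested digital zoom level.

    binning_threshold is a hypothetical value; the source only says that the
    zoom level "exceeds a first predefined threshold".
    """
    if zoom_level > binning_threshold:
        # Non-binning, but only 25% of the pixels are output so that the
        # throughput matches the 2x2 binned case.
        return CaptureMode(bin_factor=1, output_fraction=0.25)
    return CaptureMode(bin_factor=2, output_fraction=1.0)


def throughput_mp(sensor_mp: float, mode: CaptureMode) -> float:
    """Megapixels per frame sent from the sensor to the camera processor."""
    return sensor_mp / (mode.bin_factor ** 2) * mode.output_fraction


before = throughput_mp(108, select_mode(1.0, binning_threshold=2.0))
after = throughput_mp(108, select_mode(3.0, binning_threshold=2.0))
print(before, after)
```

Under these assumptions both modes deliver 27MP per frame for a 108MP sensor, which is the constant throughput the paragraph describes.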
In some examples, camera processor may set the various transition parameters (e.g., binning transition threshold, reduced output portion, camera transition threshold, etc.) in order to keep the data throughput or bit rate within a specific range that avoids spikes in the throughput or bit rate. For example, the range in the above scenario could be 27MP±1MP such that the throughput can vary while limiting large fluctuations in the data sent from image sensor to camera processor. This technique may be especially advantageous in the context of video, for which capture at high resolution and a high frames-per-second (FPS) rate may be desirable. If the processor is overburdened, such as by large fluctuations of incoming pixel data, then camera processor will likely be unable to achieve high FPS. In such instances, any captured movement in the video will appear choppy or less smooth. As such, the disclosed technology reduces distortion and potential choppiness while zooming, while also keeping throughput high.
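The 27MP±1MP range mentioned above can be expressed as a simple bound on the reduced output portion. The following is a sketch only; the helper names `output_fraction_bounds` and `within_band` are hypothetical.

```python
def output_fraction_bounds(available_mp: float, target_mp: float,
                           tolerance_mp: float) -> tuple[float, float]:
    """Return the (low, high) fraction of available pixels that keeps the
    per-frame throughput inside target_mp +/- tolerance_mp."""
    low = max(0.0, (target_mp - tolerance_mp) / available_mp)
    high = min(1.0, (target_mp + tolerance_mp) / available_mp)
    return low, high


def within_band(throughput_mp: float, target_mp: float,
                tolerance_mp: float) -> bool:
    """True when a given throughput stays inside the allowed range."""
    return abs(throughput_mp - target_mp) <= tolerance_mp


# Non-binned 108MP sensor with the 27MP +/- 1MP range from the text.
low, high = output_fraction_bounds(108, 27, 1)
print(low, high)
print(within_band(108 * 0.25, 27, 1))  # the 25% reduced portion
print(within_band(108, 27, 1))         # full non-binned output
```

For the non-binned 108MP case, the 25% portion from the earlier example falls inside the resulting bounds, whereas outputting all 108MP would not.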
In addition, camera processor may further use digital cropping, downsampling, upsampling or scaling to achieve the desired digital zoom level at the desired number of pixels for display (e.g., a desired resolution parameter). In one illustrative example, camera having a 48MP image sensor with one lens may capture a maximum pixel array of roughly 8000×6000 pixels. Camera, using the 48MP image sensor, may be configured to capture so-called 4K picture or video, where the resolution for such picture or video is approximately 4000×2000 pixels or ~8MP. Thus, in the case of a digital zoom operation, the 48MP image sensor can potentially output 48MP non-binned pixels to camera processor, where camera processor can perform digital cropping of the 8000×6000 pixels to achieve the desired digital zoom level before using other techniques, such as downsampling, for instances where the number of pixels remaining after the digital crop is greater than the desired resolution, or upsampling, for instances where the number of pixels remaining after the digital crop is less than the desired resolution.
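The crop-then-resample decision described above might be sketched as follows. The sketch assumes that a digital zoom level z keeps 1/z of the width and 1/z of the height; that mapping is an assumption made for illustration and is not stated in the text. The function name `plan_digital_zoom` is likewise hypothetical.

```python
def plan_digital_zoom(width: int, height: int, zoom: float,
                      target_pixels: int) -> tuple[int, int, str]:
    """Return the crop size and the resampling step needed afterwards.

    Assumption: a zoom level z crops each dimension to 1/z of its full size.
    """
    if zoom < 1.0:
        raise ValueError("digital zoom level must be at least 1.0")
    crop_w = int(width / zoom)
    crop_h = int(height / zoom)
    cropped_pixels = crop_w * crop_h
    if cropped_pixels > target_pixels:
        step = "downsample"  # more pixels than the desired resolution
    elif cropped_pixels < target_pixels:
        step = "upsample"    # fewer pixels than the desired resolution
    else:
        step = "none"
    return crop_w, crop_h, step


# Figures from the text: roughly 8000x6000 sensor, roughly 8MP (4000x2000) target.
target = 4000 * 2000
print(plan_digital_zoom(8000, 6000, 2.0, target))
print(plan_digital_zoom(8000, 6000, 4.0, target))
```

Under this assumption, a 2× zoom leaves 4000×3000 pixels, which still exceeds the ~8MP target and calls for downsampling, while a 4× zoom leaves 2000×1500 pixels and calls for upsampling.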
In an example where image sensor uses 2×2 binning, the 48MP image sensor can potentially output 12MP binned pixels to camera processor, where camera processor can perform digital cropping of the 12MP to achieve the desired digital zoom level before using downsampling, upsampling, or scaling techniques to achieve the desired digital zoom level at the desired resolution.
In some examples, the 48MP image sensor can output to camera processor less than all of the available pixels (binned or non-binned), such as a predetermined percentage of the available pixels to be output. That is, camera processor may perform any number of digital zoom techniques based on a reduced portion of a potential FOV of image sensor (e.g., all binned or non-binned pixels of image sensor). In one example, the 48MP image sensor can output a reduced portion of the 2×2 binned pixels, or a reduced portion of the non-binned pixels, to camera processor. For example, 48MP image sensor may output 25% of non-binned pixels, equaling 12MP non-binned pixels. Camera processor can perform digital cropping of the 12MP non-binned pixels to achieve the desired digital zoom level before using downsampling, upsampling, or scaling techniques to achieve the desired digital zoom level at the desired resolution.
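As a rough illustration of the reduced portion, the sketch below computes a readout window covering a given fraction of the sensor area. It assumes the window is centered and keeps the sensor's aspect ratio, neither of which is specified in the text, and the name `centered_window` is hypothetical.

```python
import math


def centered_window(width: int, height: int,
                    area_fraction: float) -> tuple[int, int, int, int]:
    """Return (x0, y0, w, h) of a centered window covering area_fraction of
    the full pixel array while preserving the aspect ratio (an assumption)."""
    if not 0.0 < area_fraction <= 1.0:
        raise ValueError("area_fraction must be in (0, 1]")
    scale = math.sqrt(area_fraction)  # per-dimension scale for the given area
    w = round(width * scale)
    h = round(height * scale)
    x0 = (width - w) // 2
    y0 = (height - h) // 2
    return x0, y0, w, h


# 25% of a roughly 8000x6000 (48MP) non-binned array, as in the text.
x0, y0, w, h = centered_window(8000, 6000, 0.25)
print((x0, y0, w, h), w * h)
```

For the 48MP example, 25% of the area corresponds to a 4000×3000 window, i.e., the 12MP of non-binned pixels mentioned above.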
In another example, 48MP image sensor may output a reduced portion of binned pixels to camera processor, such as in the case where camera processor is causing image sensor to perform multiple binning transitions. For example, 48MP image sensor may output a predetermined percentage of binned pixels to camera processor pursuant to a binning transition from a first binning level that uses multiple pixels of image sensor for each output pixel of the image data to a second binning level that uses fewer pixels of image sensor relative to the first binning level for each output pixel of the image data, but where the second binning level uses at least two pixels of image sensor for each output pixel of the image data. As such, camera processor can perform digital cropping using the reduced number of binned pixels to achieve the desired digital zoom level before using other digital zoom techniques. A person skilled in the art will understand that a 48MP image sensor is only used herein as an illustrative example and that the techniques of this disclosure may use any number of image sensors.
March 17, 2026