A head-wearable extended reality (XR) device includes an optical assembly. The optical assembly has a display and an optical element. The display is provided to display virtual content to a user of the XR device. The optical element is provided to direct the virtual content from the display along an optical path to an eye of the user. The optical element includes a first portion and a second portion. The first portion provides a first focus distance that corresponds to a first viewing zone of the display. The second portion provides a second focus distance that differs from the first focus distance and corresponds to a second viewing zone of the display.
Legal claims defining the scope of protection, as filed with the USPTO.
. A head-wearable extended reality (XR) device that includes an optical assembly, the optical assembly comprising:
. The XR device of, wherein the optical element comprises a lens.
. The XR device of, wherein the lens is arranged in a fixed position relative to the display.
. The XR device of, wherein the lens is a fixed-focus focusing lens.
. The XR device of, wherein the lens is a bifocal focusing lens.
. The XR device of, wherein the lens is a trifocal focusing lens, and the lens further comprises a third portion providing a third focus distance that corresponds to a third viewing zone of the display, the third focus distance differing from both the first focus distance and the second focus distance.
. The XR device of, wherein the lens is a progressive focusing lens comprising a plurality of portions in addition to the first portion and the second portion, and each of the plurality of portions provides a different focus distance that corresponds to a respective viewing zone of the display, thereby defining multiple focus distances distributed across a field of view.
. The XR device of, wherein the multiple focus distances are distributed according to a gradient.
. The XR device of, wherein the virtual content comprises first virtual content and second virtual content, wherein the display is to simultaneously display the first virtual content in the first viewing zone and the second virtual content in the second viewing zone, and the optical assembly is to direct, via the optical element, the first virtual content to be displayed at the first focus distance and the second virtual content to be displayed at the second focus distance.
. The XR device of, wherein the virtual content comprises a virtual object, and the XR device further comprises at least one processor to:
. The XR device of, wherein the virtual content comprises first virtual content from the first viewing zone and second virtual content from the second viewing zone, wherein the optical element is configured such that the first portion operatively directs the first virtual content such that the first virtual content is perceived at a first image plane at the first focus distance, and the second portion operatively directs the second virtual content such that the second virtual content is perceived at a second image plane at the second focus distance, the first image plane being located in front of the second image plane from a viewing perspective of the user.
. The XR device of, wherein the first virtual content comprises a first virtual object and the second virtual content comprises a second virtual object, and the XR device further comprises at least one processor to:
. The XR device of, wherein, from the viewing perspective of the user, the first viewing zone is located in a lower section of a field of view and the second viewing zone is located in an upper section of the field of view.
. The XR device of, wherein the first focus distance and the second focus distance are fixed distances.
. The XR device of, wherein the first focus distance is a first distance selected for hand-based interactions with the XR device, and the second focus distance is a second distance that is greater than the first distance.
. The XR device of, wherein the optical assembly forms part of an optical see-through (OST) display arrangement.
. The XR device of, wherein the display is offset from a gaze path of the XR device, and the OST display arrangement further comprises an optical combiner to direct light originating from the display from the optical path into the gaze path to enable the user to view the virtual content.
. The XR device of, wherein the optical assembly is a first optical assembly and the eye of the user is a first eye of the user, and the XR device further includes a second optical assembly for a second eye of the user.
. An optical assembly for a head-wearable extended reality (XR) device, the optical assembly comprising:
. A method performed by a head-wearable extended reality (XR) device that includes an optical assembly, the method comprising:
Complete technical specification and implementation details from the patent document.
Subject matter disclosed herein relates, generally, to extended reality (XR). More specifically, but not exclusively, the subject matter relates to a multifocal assembly for an XR device.
The field of XR continues to grow. Some XR devices are able to overlay virtual content onto, or mix virtual content into, a user's perception of reality, providing a user experience that can be entertaining, informative, or useful.
The description that follows describes systems, methods, techniques, instruction sequences, and computing machine program products that illustrate examples of the present subject matter. In the following description, for purposes of explanation, numerous specific details are set forth to provide an understanding of various examples of the present subject matter. It will be evident, however, to those skilled in the art, that examples of the present subject matter may be practiced without some or other of these specific details. Examples merely typify possible variations. Unless explicitly stated otherwise, structures (e.g., hardware structures) are optional and may be combined or subdivided, and operations (e.g., in a procedure, algorithm, or other function) may vary in sequence or be combined or subdivided.
The field of XR includes augmented reality (AR) and virtual reality (VR). AR may include an interactive experience of a real-world environment where physical objects or environments that reside in the real world are “augmented” or enhanced by computer-generated digital content (also referred to as virtual content). AR may include a system that enables a combination of real and virtual worlds, real-time interaction, and three-dimensional (3D) presentation of virtual and real objects. A user of an AR system may perceive virtual content that appears to be attached to, or to interact with, a real-world physical object. In some examples, AR overlays digital content on the real world. Alternatively, or additionally, AR combines real-world and digital elements. The term “AR” may thus include mixed reality experiences. The term “AR application” is used herein to refer to a computer-operated application that enables an AR experience.
VR may include a simulation experience of a virtual-world environment that is distinct from the real-world environment. Computer-generated digital content is displayed in the virtual-world environment. VR may also refer to a system that enables a user to be completely immersed in the virtual-world environment and to interact with virtual objects presented in the virtual-world environment. While examples described in the present disclosure focus primarily on XR devices that provide an AR experience, it will be appreciated that at least some aspects of the present disclosure may also be applied to VR.
A “user session” is used herein to refer to an operation of a device or application during periods of time. For example, a user session can include an operation of an AR application executing on a head-wearable XR device between the time the user puts on the XR device and the time the user takes off the head-wearable device. In some examples, a user session starts when an XR device is turned on or is woken up from sleep mode and stops when the XR device is turned off or placed in sleep mode. In other examples, the user session starts when the user runs or starts an AR application, or runs or starts a particular feature of the AR application, and stops when the user ends the AR application or stops the particular feature of the AR application.
A head-wearable XR device can display virtual content in different ways. For example, head-wearable AR devices can be categorized as having optical see-through (OST) displays or video pass-through (VPT) displays. In OST technologies, a user views the physical environment directly through transparent or semi-transparent display components, and virtual content can be rendered to appear as part of, or overlaid upon, the physical environment. In VPT technologies, a view of the physical environment is captured by one or more cameras and then presented to the user on an opaque display (e.g., in combination with virtual content). While examples described in the present disclosure focus primarily on OST displays, it will be appreciated that aspects of the present disclosure may also be applied to other types of displays, such as VPT displays.
Vergence and accommodation are two separate visual processes. Vergence refers to the movement of the eyes to maintain binocular vision, while accommodation refers to adjustment of an eye's lens to focus on objects at different distances. In natural viewing conditions, vergence and accommodation work together to enable a human to see objects clearly and without discomfort.
Vergence-accommodation conflict (VAC) is a problem experienced by users of at least some XR devices. Conventional XR devices may use a single image plane that is located at a predetermined distance in front of the user. As used herein, the term “image plane” refers to the virtual surface upon which digital content is projected or rendered from the user's viewing perspective. For example, if a user is viewing a virtual apple through a head-wearable XR device, the image plane associated with the virtual apple is the location in space where the apple appears sharp and clear.
The fixed focus distance of a conventional XR device may cause a mismatch between vergence and accommodation, leading to issues such as discomfort, visual strain, blurred perception, cybersickness, or visual fatigue. The VAC may also be associated with technical inconsistencies in the appearance of virtual content. For example, in an AR device, if the image plane is located at two meters from the user's eyes and the XR device renders a virtual apple on the user's outstretched hand (which is closer than two meters from the user's eyes), the apple may appear blurred when the user focuses on the hand, but sharp when the user focuses on the image plane, in conflict with the appearance of the hand.
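The size of the mismatch in the apple example can be expressed in diopters (the reciprocal of a distance in meters). The following is an illustrative calculation only: the 0.5 m hand distance is an assumed value, since the text states only that the hand is closer than two meters.

```latex
\[
A_{\text{plane}} = \frac{1}{2\,\text{m}} = 0.5\,\text{D}, \qquad
A_{\text{hand}} = \frac{1}{0.5\,\text{m}} = 2.0\,\text{D}, \qquad
\Delta A = A_{\text{hand}} - A_{\text{plane}} = 1.5\,\text{D}
\]
```

Under that assumption, the eye would need to change its accommodation by about 1.5 D when shifting focus between the hand and the virtual apple, even though both appear at the same location.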
One approach to mitigating the VAC involves dynamically adjusting the image plane. For example, an XR device can identify, measure, or estimate the focus distance or a desired focus distance, and then shift the image plane accordingly using a varifocal mechanism. The varifocal mechanism can, for example, dynamically adjust the power of a focusing lens or shift the position of the lens relative to a display to shift the image plane. However, this typically requires precise focus measurements or estimations, which can be difficult to obtain. For example, eye tracking errors can cause the XR device to shift the image plane to an incorrect position, leading to a loss of contrast and failing to address the VAC. Moreover, users may experience visual inconsistencies as a result of dynamic image plane shifting. The use of a varifocal mechanism in an XR device may also necessitate additional parts or more computations.
Another approach involves using a multilayer configuration to generate multiple image planes at different focus distances. However, this approach necessitates additional components and can involve a complex setup, such as a plurality of display layers operating together with a plurality of optical combiners. Furthermore, the use of multiple optical combiners can result in discoloration and distortions.
Multilayer light field displays have also been proposed. Multilayer light field displays emit directional light and thus show virtual content at different depths. However, such displays may exhibit poor brightness and/or poor contrast, and their working volume may be limited by diffraction effects. A multilayer light field configuration also involves additional components, such as stacked liquid-crystal display (LCD) layers that define the working volume.
Many XR devices are portable, making it desirable to conserve resources. Additional components used to combat the VAC, such as those referred to above, can increase power consumption and reduce battery life.
Examples described herein address or alleviate the VAC by providing a fixed optical assembly configured to present virtual content at multiple focus distances. Virtual content rendered by an XR device is displayed at different focus distances depending on a viewing zone of a display of the XR device.
In some examples, an accommodation-supporting device is provided through the incorporation of a multifocal assembly. The term “multifocal,” as used herein, may include an optical component or system designed to provide multiple distinct focus distances. In the context of an XR device, a multifocal optical element enables the presentation of virtual content at multiple perceived distances, enhancing the user's visual experience by allowing for clear focus across different depths. A multifocal lens may be a bifocal lens, a trifocal lens, a progressive lens providing a gradual transition or shift between focal powers, or another type of lens that provides multiple distinct focus distances.
Examples in the present disclosure leverage the observation that, when a user is experiencing AR, virtual content presented in a lower section of a field of view will likely be viewed with nearby real-world objects (e.g., when the user of a head-wearable XR device is performing hand-based interactions). Virtual content rendered in the lower section may thus be rendered at an image plane that is closer to the user than an image plane used to render virtual content in an upper section of the field of view, thereby addressing discrepancies between depth of content and background location.
An optical assembly of an XR device may include a display and an optical element. The display is provided to display virtual content for presentation to a user of the XR device. The optical element is provided to direct the virtual content from the display along an optical path towards an eye of the user. The optical assembly may be for one or both eyes of the user. Accordingly, in some cases, the optical assembly is a first optical assembly and the eye of the user is a first eye of the user, and the XR device further includes a second optical assembly for a second eye of the user. The second optical assembly may be similar to the first optical assembly.
The optical element may comprise multiple portions. In some examples, the optical element is a single, fixed element comprising at least a first portion and a second portion, with the first portion providing a first focus distance that corresponds to a first viewing zone of the display and the second portion providing a second focus distance that differs from the first focus distance and corresponds to a second viewing zone of the display.
The first portion may direct and focus light originating from the display such that the virtual content directed thereby is presented at a first image plane (e.g., located at the first focus distance relative to the XR device), while the second portion directs and focuses light originating from the display such that the virtual content directed thereby is presented at a second image plane (e.g., located at the second focus distance relative to the XR device). In some examples, the first focus distance and the second focus distance are fixed distances. Accordingly, a fixed multifocal assembly may be provided that does not require image plane adjustment during operation.
The optical element may be a focusing lens. Accordingly, the focusing lens may have multiple areas providing varying focus distances. In this way, content rendered on the display of the XR device is presented at different focus distances for the user based on the area or part of the display they are rendered in. This area or part may be referred to as the “viewing zone.”
As used herein, a “focus distance” refers to the distance over which light rays converge to form a sharp image of an object after passing through an optical element, such as a lens. In the context of displaying virtual images, the focus distance is the distance at which virtual images appear sharp and clear. A “viewing zone,” as used herein, may include a specific region of a display designated for presenting virtual content at a specific focus distance. For example, in a bifocal lens setup, the viewing zone at a bottom region of the display of the XR device may be aligned with a near-focus area of the lens for displaying content that should appear close to the user, while the viewing zone at the top of the display may be aligned with a far-focus area for content that should appear at a greater distance.
Where the optical element is a lens, examples described herein provide for the use of different types of multifocal lenses, such as a bifocal focusing lens, a trifocal focusing lens, or a progressive focusing lens.
In the case of a trifocal focusing lens, for example, in addition to the first portion and the second portion mentioned above, the lens can further include a third portion providing a third focus distance that corresponds to a third viewing zone of the display, with the third focus distance differing from both the first focus distance and the second focus distance. In the case of a progressive focusing lens, for example, the lens may further include a plurality of portions, each providing a different focus distance that corresponds to a respective viewing zone of the display.
In this way, an XR device may provide multiple focus distances distributed (e.g., mapped) across a field of view according to a predetermined configuration. In some examples, focus distances are spread across the field of view according to a gradient. In some examples, more distinct steps are provided between viewing zones (e.g., in a bifocal setup).
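The difference between a stepped (bifocal) distribution and a gradient (progressive) distribution can be sketched as a mapping from vertical display position to focus distance. This is a minimal sketch under stated assumptions: the function names, the 0.5 m and 2.0 m distances, the zone boundary, and the choice to interpolate linearly in diopters are all hypothetical and are not taken from the disclosure.

```python
def bifocal_focus_m(y: float, near_m: float = 0.5, far_m: float = 2.0,
                    split: float = 0.5) -> float:
    """Stepped profile: one focus distance per viewing zone.

    y is a normalized vertical display position (0.0 = bottom, 1.0 = top).
    All default values are illustrative assumptions.
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must be in [0, 1]")
    # Lower zone maps to the near distance, upper zone to the far distance.
    return near_m if y < split else far_m


def progressive_focus_m(y: float, near_m: float = 0.5,
                        far_m: float = 2.0) -> float:
    """Gradient profile: focus distance varies smoothly with y.

    Interpolation is done in diopters (1 / meters), an assumed design
    choice; a real lens could follow any predetermined profile.
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must be in [0, 1]")
    near_d = 1.0 / near_m
    far_d = 1.0 / far_m
    power_d = near_d + (far_d - near_d) * y
    return 1.0 / power_d
```

With these assumed values, the stepped profile jumps from 0.5 m to 2.0 m at the zone boundary, while the gradient profile passes through intermediate distances (0.8 m at mid-height).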
In some examples, the lens is arranged in a fixed position relative to the display. The lens may be a fixed-focus focusing lens (e.g., the focusing power of the lens is not adjustable during operation).
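To illustrate how a lens in a fixed position can still yield different focus distances, consider an idealized thin-lens model in which the display sits inside the focal length of each lens portion, so that a virtual image forms at distance d_i. The thin-lens model, the 45 mm display-to-lens spacing s, and the two image distances are assumptions for illustration; the disclosure does not specify the optical prescription.

```latex
\[
\frac{1}{f} = \frac{1}{s} - \frac{1}{d_i}
\]
\[
s = 0.045\,\text{m},\; d_i = 2\,\text{m}:\quad
\frac{1}{f_{\text{far}}} \approx 22.22 - 0.50 = 21.72\,\text{D}
\;\Rightarrow\; f_{\text{far}} \approx 46.0\,\text{mm}
\]
\[
s = 0.045\,\text{m},\; d_i = 0.5\,\text{m}:\quad
\frac{1}{f_{\text{near}}} \approx 22.22 - 2.00 = 20.22\,\text{D}
\;\Rightarrow\; f_{\text{near}} \approx 49.5\,\text{mm}
\]
```

In this simplified model, the two portions differ by about 1.5 D of optical power while the display and the lens stay in the same positions.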
The virtual content rendered on the display may include first virtual content and second virtual content. For example, the display can simultaneously display the first virtual content in the first viewing zone and the second virtual content in the second viewing zone. The optical assembly of the XR device is configured to direct, via the optical element, the first virtual content to be displayed at the first focus distance and the second virtual content to be displayed at the second focus distance. In other words, light representing the first virtual content is directed through the first portion of the optical element and light representing the second virtual content is directed through the second portion of the optical element.
The virtual content may include at least one virtual object. The XR device may include a processor to determine a presentation distance associated with the virtual object, and assign, based on the presentation distance, a matching focus distance to the virtual object. In response to the assignment of the focus distance to the virtual object, the XR device may cause the virtual object to be rendered in a viewing zone of the display that is associated with the focus distance for the virtual object. In this context, the term “presentation distance” may include the perceived or intended distance from the user at which virtual content is to be displayed within an XR environment (e.g., according to instructions received via an AR application and/or data describing the positions of objects in a real-world environment).
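The assignment step described above can be sketched as follows. This is a hedged illustration, not the method of the disclosure: the `ViewingZone` type, the `ZONES` table, the `assign_zone` function, the specific distances, and the rule of picking the nearest focus distance in diopters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewingZone:
    """A display region tied to one fixed focus distance (hypothetical type)."""
    name: str
    focus_distance_m: float
    y_min: float  # normalized bottom edge of the zone (0.0 = display bottom)
    y_max: float  # normalized top edge of the zone (1.0 = display top)


# Assumed bifocal layout: near content in the lower half, far in the upper.
ZONES = (
    ViewingZone("lower", 0.5, 0.0, 0.5),
    ViewingZone("upper", 2.0, 0.5, 1.0),
)


def assign_zone(presentation_distance_m: float,
                zones: tuple = ZONES) -> ViewingZone:
    """Pick the zone whose focus distance best matches the object.

    Matching is done in diopters (1 / meters), an assumed criterion that
    treats equal dioptric error as equally noticeable.
    """
    if presentation_distance_m <= 0:
        raise ValueError("presentation distance must be positive")
    target_d = 1.0 / presentation_distance_m
    return min(zones, key=lambda z: abs(1.0 / z.focus_distance_m - target_d))
```

For example, under these assumptions a virtual object intended to appear at 0.4 m would be rendered in the lower zone, while one intended to appear at 3 m would be rendered in the upper zone.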
As mentioned above, from the viewing perspective of the user, the first viewing zone may be located in a lower section of a field of view, while the second viewing zone is located in an upper section of the field of view. The first viewing zone may thus provide virtual content at the first focus distance (e.g., a closer distance selected for hand-based interactions with the XR device, such as the presentation of a virtual apple on the hand), and the second viewing zone may provide virtual content at the second focus distance (e.g., a distance that is greater than the first distance, such as for presentation of a virtual scoreboard at a sporting event).
An XR device, according to some examples, is designed to enhance the visual experience of the user by providing the ability to perceive virtual content at multiple focus distances without requiring real-time image plane adjustment during a user session. A multifocal approach may address certain VAC-related challenges by allowing for natural focus transitions between virtual objects positioned at different depths within the field of view, closely mimicking the dynamic focusing capability of the human eye.
In some examples, a focusing lens of the optical assembly features regions with varying focal characteristics (e.g., powers), enabling the presentation of virtual content at near, intermediate, and/or far distances. This design may allow users to engage with intricate virtual details up close, interact with content at arm's length, and/or observe distant virtual landscapes, all with focus and without the need for varifocal adjustment or multilayer displays. Viewing zones of a display can be strategically designed to coincide with the user's natural gaze direction for different types of content, such as reading text at a lower angle versus viewing the environment at eye level, and virtual content can be rendered accordingly. As a result, reduced visual fatigue with prolonged use of XR devices may be facilitated.
Examples described herein may obviate the need for complex varifocal mechanisms that require precise focus measurements or estimations. For example, a fixed multifocal assembly may reduce the need to use eye tracking systems and avoid a significant loss of contrast and visual inconsistencies that can arise from shifting an image plane during operation. In some examples of the present disclosure, the need to perform eye tracking (e.g., gaze estimation) for image plane or content depth estimation is therefore obviated or reduced.
An optical assembly as described herein may also provide a simplified and more lightweight design, both from a mechanical and a computational perspective. As a result, manufacturing and computational costs can be reduced. For example, by avoiding the use of multiple, stacked display layers, an XR device may be produced at a lower cost while also improving its computational efficiency and/or battery life.
Examples described herein may enable an XR device to provide clear, high-quality, and high-contrast imagery without having to integrate complex optical arrangements, such as light field displays. Virtual content may be rendered with clarity across a range of depths while also providing a simplified optical design. For example, by utilizing a multifocal focusing lens instead of a conventional focusing lens, benefits can be obtained without adding mechanical components to an XR device and without necessitating relative displacement between optical assembly elements.
According to some examples, the presently described devices, systems, or methodologies provide an improvement to an operation of the functioning of a computer by providing an XR device that can better address the VAC and/or provide improved display features. One or more of the methodologies described herein may obviate a need for certain efforts or computing resources. Examples of such computing resources include processor cycles, network traffic, memory usage, data storage capacity, power consumption, network bandwidth, and cooling capacity.
The accompanying figure is a network diagram illustrating a network environment suitable for operating an XR device, according to some examples. The network environment includes an XR device and a server, communicatively coupled to each other via a network. The server may be part of a network-based system. For example, the network-based system can be or include a cloud-based server system that provides additional information, such as virtual content (e.g., 3D models of virtual objects, or augmentations to be applied as virtual overlays onto images depicting real-world scenes) to the XR device.
A user operates the XR device. The user may be a human user (e.g., a human being), a machine user (e.g., a computer configured by a software program to interact with the XR device), or any suitable combination thereof (e.g., a human assisted by a machine or a machine supervised by a human).
The user is not part of the network environment, but is associated with the XR device. For example, where the XR device is a head-wearable apparatus, the user wears the XR device during a user session.
The XR device may have different display arrangements. In some examples, the display arrangement may include a screen that displays virtual content and/or what is captured with a camera of the XR device. In some examples, the display arrangement is an OST arrangement. The screen may be positioned in the gaze path of the user or offset from the gaze path of the user.
In some examples, the user operates an application of the XR device, referred to herein as an AR application. The AR application may be configured to provide the user with an experience triggered or enhanced by a physical object, such as a two-dimensional (2D) physical object (e.g., a picture), a 3D physical object (e.g., a statue), a location (e.g., a factory), or references (e.g., perceived corners of walls or furniture, or digital codes) in a real-world environment. For example, the user can point a camera of the XR device to capture an image of the physical object, and a virtual overlay may be presented over the physical object via the display.
Experiences may also be triggered or enhanced by a hand or other body part of the user. For example, the XR device may detect and respond to hand gestures or signals. When using some XR devices, such as head-wearable devices (also referred to as head-mounted devices, or “HMDs”), the hand of the user serves as an interaction tool. As a result, the hand is often “visible” to the XR device, with virtual content being rendered to appear on or close to the hand.
The XR device includes tracking components (not shown). The tracking components track the pose (e.g., position and orientation) of the XR device relative to the real-world environment using image sensors (e.g., a depth-enabled 3D camera and an image camera), inertial sensors (e.g., a gyroscope, an accelerometer, or the like), wireless sensors (e.g., Bluetooth™ or Wi-Fi™), a Global Positioning System (GPS) sensor, and/or an audio sensor to determine the location of the XR device within the real-world environment. In some examples, the tracking components track the pose of the hand (or hands) of the user or some other physical object in the real-world environment.
In some examples, the server is used to detect and identify the physical object based on sensor data (e.g., image and depth data) from the XR device, and to determine a pose of the XR device, the physical object, and/or the hand of the user based on the sensor data. The server can also generate virtual content based on the pose of the XR device, the physical object, and/or the hand.
In some examples, the server communicates virtual content (e.g., a virtual object) to the XR device. The XR device or the server, or both, can perform image processing, object detection, and object tracking functions based on images captured by the XR device and one or more parameters internal or external to the XR device.
The object recognition, tracking, and content rendering can be performed on the XR device, on the server, or on a combination of the XR device and the server. Accordingly, while certain functions are described herein as being performed by either an XR device or a server, the location of certain functionality may be a design choice (unless specifically indicated to the contrary). For example, it might be technically preferable to deploy particular technology and functionality within a server system initially, but later to migrate this technology and functionality to a client installed locally at the XR device where the XR device has sufficient processing capacity.
One or more of the machines, components, or devices shown in the figures may be implemented in a general-purpose computer modified (e.g., configured or programmed) by software to be a special-purpose computer to perform one or more of the functions described herein for that machine, database, or device. For example, a computer system able to implement any one or more of the methodologies described herein is discussed below. Moreover, two or more of the machines, components, or devices illustrated in the figures may be combined into a single machine, component, or device, and the functions described herein for any single machine, component, or device may be subdivided among multiple machines, components, or devices.
The network may be any network that enables communication between or among machines (e.g., server), databases, or devices (e.g., XR device). Accordingly, the network may be a wired network, a wireless network (e.g., a mobile or cellular network), or any suitable combination thereof. The network may include one or more portions that constitute a private network, a public network (e.g., the Internet), or any suitable combination thereof.
The accompanying figure is a block diagram illustrating components (e.g., modules, parts, or systems) of the XR device, according to some examples. The XR device is shown to include sensors, a processor, a display arrangement, a storage component, and a communication component. It will be appreciated that the block diagram is not intended to provide an exhaustive indication of components of the XR device.
The sensors include one or more image sensors, one or more inertial sensors, one or more depth sensors, and one or more eye tracking sensors. The image sensor can include, for example, a combination of a color camera, a thermal camera, a depth sensor, and one or multiple grayscale, global shutter tracking cameras.
In some examples, the inertial sensor includes a combination of a gyroscope, accelerometer, and a magnetometer. In some examples, the inertial sensor includes one or more Inertial Measurement Units (IMUs). An IMU enables tracking of movement of a body by integrating the acceleration and the angular velocity measured by the IMU. An IMU can include a combination of accelerometers and gyroscopes that can determine and quantify linear acceleration and angular velocity, respectively. The values obtained can be processed to obtain the pitch, roll, and heading of the IMU and, therefore, of the body with which the IMU is associated. Signals from the accelerometers of the IMU also can be processed to obtain velocity and displacement. The IMU may also include one or more magnetometers.
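The integration of IMU signals mentioned above can be sketched for a single axis. This is a deliberately simplified illustration under stated assumptions: the function name and sample format are hypothetical, simple Euler integration is used, and gravity compensation, sensor bias, and drift correction (which a practical tracker would need) are ignored.

```python
from typing import Iterable, Tuple


def integrate_imu_1d(samples: Iterable[Tuple[float, float]],
                     dt: float) -> Tuple[float, float, float]:
    """Integrate single-axis IMU samples taken at a fixed interval.

    samples: (linear acceleration in m/s^2, angular velocity in rad/s)
    dt: sampling interval in seconds
    Returns (angle in rad, velocity in m/s, displacement in m).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    angle = 0.0
    velocity = 0.0
    displacement = 0.0
    for accel, angular_rate in samples:
        # Angular velocity integrates once to an orientation angle.
        angle += angular_rate * dt
        # Acceleration integrates once to velocity and again to displacement.
        velocity += accel * dt
        displacement += velocity * dt
    return angle, velocity, displacement
```

Because each step accumulates measurement error, estimates obtained this way drift over time, which is one reason IMU data is commonly combined with other sensors such as cameras.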
October 16, 2025