US-11257276

Appearance synthesis of digital faces

Published: February 22, 2022

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

Techniques are disclosed for generating digital faces. In some examples, a style-based generator receives as inputs initial tensor(s) and style vector(s) corresponding to user-selected semantic attribute styles, such as the desired expression, gender, age, identity, and/or ethnicity of a digital face. The style-based generator is trained to process such inputs and output low-resolution appearance map(s) for the digital face, such as a texture map, a normal map, and/or a specular roughness map. The low-resolution appearance map(s) are further processed using a super-resolution generator that is trained to take the low-resolution appearance map(s) and low-resolution 3D geometry of the digital face as inputs and output high-resolution appearance map(s) that align with high-resolution 3D geometry of the digital face. Such high-resolution appearance map(s) and high-resolution 3D geometry can then be used to render standalone images or the frames of a video that include the digital face.
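The two-stage data flow described in the abstract can be sketched as follows. This is only an illustration of the interfaces, not the patented implementation: both "models" are untrained placeholders, the STYLES table and every function name are hypothetical, nearest-neighbour upsampling stands in for the trained super-resolution generator, and the 3D geometry is assumed to be stored as a 2D grid so it can be upsampled alongside the maps.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]  # one 2D map (an appearance channel, or geometry laid out as a grid)

# Hypothetical style table: attribute -> style name -> style vector.
# The abstract says style vectors feed the generator; these numbers are made up.
STYLES: Dict[str, Dict[str, List[float]]] = {
    "expression": {"neutral": [0.0, 0.1], "smile": [0.4, 0.2]},
    "age": {"young": [0.1, 0.0], "old": [0.3, 0.5]},
}

MAP_NAMES = ("texture", "normal", "specular_roughness")


def style_based_generator(selection: Dict[str, str], size: int = 4) -> Dict[str, Grid]:
    """Placeholder for the first model: user-selected styles -> low-resolution maps."""
    vectors = [STYLES[attr][name] for attr, name in selection.items()]
    base = sum(sum(v) for v in vectors)  # toy dependence on the selected styles
    return {
        name: [[base + i + 0.01 * (r * size + c) for c in range(size)] for r in range(size)]
        for i, name in enumerate(MAP_NAMES)
    }


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling, standing in for learned super-resolution."""
    rows, cols = len(grid), len(grid[0])
    return [[grid[r // factor][c // factor] for c in range(cols * factor)]
            for r in range(rows * factor)]


def super_resolution_generator(
    low_maps: Dict[str, Grid], low_geometry: Grid, factor: int = 4
) -> Tuple[Dict[str, Grid], Grid]:
    """Placeholder for the second model: upsamples maps and geometry together,
    so the outputs share one resolution and stay aligned."""
    high_maps = {name: upsample_nearest(g, factor) for name, g in low_maps.items()}
    high_geometry = upsample_nearest(low_geometry, factor)
    return high_maps, high_geometry


selection = {"expression": "smile", "age": "old"}
low_maps = style_based_generator(selection)
low_geometry = [[0.0] * 4 for _ in range(4)]
high_maps, high_geometry = super_resolution_generator(low_maps, low_geometry)
```

The high-resolution maps and geometry returned by the second stage are what a renderer would consume; rendering itself is outside this sketch.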

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method for rendering one or more images of a digital face, the method comprising: generating, via a first machine learning model, one or more first appearance maps based on a user selection of one or more styles associated with one or more attributes of a digital face; generating, via a second machine learning model, one or more second appearance maps and a first three-dimensional (3D) geometry associated with the digital face based on jointly upsampling the one or more first appearance maps and a second 3D geometry associated with the digital face, wherein the one or more second appearance maps are aligned with the first 3D geometry; and rendering one or more images including the digital face based on the one or more second appearance maps and the first 3D geometry.

2. The computer-implemented method of claim 1, wherein the one or more first appearance maps have a lower resolution than the one or more second appearance maps, and the second 3D geometry has a lower resolution than the first 3D geometry.

3. The computer-implemented method of claim 1, wherein generating the one or more first appearance maps based on the user selection of one or more styles comprises controlling one or more adaptive instance normalization (AdaIN) operations based on the user selection of one or more styles.

4. The computer-implemented method of claim 3, wherein the one or more AdaIN operations are performed in conjunction with one or more convolution operations.

5. The computer-implemented method of claim 4, wherein the first machine learning model comprises a plurality of semantics transfer blocks, and performing the one or more AdaIN operations in conjunction with the one or more convolution operations comprises, for each semantics transfer block included in the plurality of semantics transfer blocks: performing multiple sets of convolution operations and AdaIN operations in parallel; determining a weighted sum of outputs of the multiple sets of convolution operations and AdaIN operations; and upscaling the weighted sum of outputs.

6. The computer-implemented method of claim 1, wherein the one or more second appearance maps include at least one of a texture map, a normal map, or a specular roughness map.

7. The computer-implemented method of claim 1, wherein the one or more attributes of the digital face include at least one of expression, gender, age, identity, or ethnicity.

8. The computer-implemented method of claim 1, wherein the second machine learning model comprises a super-resolution generator.

9. The computer-implemented method of claim 1, wherein the first 3D geometry associated with the digital face conveys a non-neutral facial expression.

10. A non-transitory computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to perform steps for rendering one or more images of a digital face, the steps comprising: generating, via a first machine learning model, one or more first appearance maps based on a user selection of one or more styles associated with one or more attributes of a digital face; generating, via a second machine learning model, one or more second appearance maps and a first three-dimensional (3D) geometry associated with the digital face based on jointly upsampling the one or more first appearance maps and a second 3D geometry associated with the digital face, wherein the one or more second appearance maps are aligned with the first 3D geometry; and rendering one or more images including the digital face based on the one or more second appearance maps and the first 3D geometry.

11. The computer-readable storage medium of claim 10, wherein generating the one or more first appearance maps based on the user selection of one or more styles comprises controlling one or more adaptive instance normalization (AdaIN) operations based on the user selection of one or more styles.

12. The computer-readable storage medium of claim 11, wherein the first machine learning model comprises a plurality of semantics transfer blocks, and generating the one or more first appearance maps based on the user selection of one or more styles includes, for each semantics transfer block included in the plurality of semantics transfer blocks: performing multiple sets of convolution operations and AdaIN operations in parallel; determining a weighted sum of outputs of the multiple sets of convolution operations and AdaIN operations; and upscaling the weighted sum of outputs.

13. The computer-readable storage medium of claim 12, wherein one or more weights used in the weighted sum of outputs, the multiple sets of convolution operations and AdaIN operations, one or more initial tensors, and one or more style vectors associated with the one or more attributes of the digital face are determined while training the first machine learning model.

14. The computer-readable storage medium of claim 10, wherein the first machine learning model is trained using a progressive training technique.

15. The computer-readable storage medium of claim 10, wherein the second machine learning model is trained using ground truth and adversarial learning techniques.

16. The computer-readable storage medium of claim 10, wherein the one or more first appearance maps have a lower resolution than the one or more second appearance maps, and the second 3D geometry has a lower resolution than the first 3D geometry.

17. The computer-readable storage medium of claim 10, wherein the one or more second appearance maps include at least one of a texture map, a normal map, or a specular roughness map.

18. The computer-readable storage medium of claim 10, wherein the one or more attributes of the digital face include at least one of expression, gender, age, identity, or ethnicity.

19. The computer-readable storage medium of claim 10, wherein the second machine learning model comprises a super-resolution generator.

20. A computing device comprising: a memory storing an application; and a processor coupled to the memory, wherein when executed by the processor, the application causes the processor to: generate, via a first machine learning model, one or more first appearance maps based on a user selection of one or more styles associated with one or more attributes of a digital face; generate, via a second machine learning model, one or more second appearance maps and a first three-dimensional (3D) geometry associated with the digital face based on jointly upsampling the one or more first appearance maps and a second 3D geometry associated with the digital face, wherein the one or more second appearance maps are aligned with the first 3D geometry; and render one or more images including the digital face based on the one or more second appearance maps and the first 3D geometry.
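The semantics transfer block recited in claims 5 and 12 (parallel sets of convolution and AdaIN operations, a weighted sum of their outputs, then upscaling) can be illustrated with a small pure-Python sketch. Several details below are assumptions rather than anything the claims specify: the convolution is a 1x1 channel mix, AdaIN is applied after the convolution in each branch, upscaling is 2x nearest-neighbour, and all weights and style values are made-up numbers rather than learned parameters. The function and type names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Channel = List[List[float]]  # H x W
Features = List[Channel]     # C x H x W
# One parallel branch: (1x1 conv weights, AdaIN style scale, AdaIN style bias).
Branch = Tuple[Sequence[Sequence[float]], Sequence[float], Sequence[float]]


def conv1x1(x: Features, weights: Sequence[Sequence[float]]) -> Features:
    """Assumed convolution: per-pixel channel mix, out[o] = sum_i weights[o][i] * x[i]."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(row_w[i] * x[i][r][c] for i in range(len(x))) for c in range(w)]
             for r in range(h)] for row_w in weights]


def adain(x: Features, scale: Sequence[float], bias: Sequence[float],
          eps: float = 1e-5) -> Features:
    """Adaptive instance normalization: normalize each channel to zero mean and
    unit variance, then apply the style-derived scale and bias for that channel."""
    out = []
    for channel, s, b in zip(x, scale, bias):
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var + eps)
        out.append([[s * (v - mean) / std + b for v in row] for row in channel])
    return out


def upscale2x(x: Features) -> Features:
    """Assumed upscaling: 2x nearest-neighbour in both spatial dimensions."""
    return [[[ch[r // 2][c // 2] for c in range(2 * len(ch[0]))]
             for r in range(2 * len(ch))] for ch in x]


def semantics_transfer_block(x: Features, branches: Sequence[Branch],
                             mix: Sequence[float]) -> Features:
    # 1. Run every (convolution, AdaIN) set on the same input; the branches are
    #    independent of one another, so they could execute in parallel.
    outs = [adain(conv1x1(x, w), s, b) for w, s, b in branches]
    # 2. Weighted sum of the branch outputs.
    channels, h, wd = len(outs[0]), len(outs[0][0]), len(outs[0][0][0])
    summed = [[[sum(m * o[k][r][c] for m, o in zip(mix, outs)) for c in range(wd)]
               for r in range(h)] for k in range(channels)]
    # 3. Upscale the weighted sum.
    return upscale2x(summed)


x = [[[0.0, 1.0], [2.0, 3.0]], [[1.0, 1.0], [0.0, 2.0]]]  # 2 channels, 2x2
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
branches = [(identity, [1.0, 1.0], [0.0, 0.0]), (swap, [2.0, 0.5], [0.1, -0.1])]
y = semantics_transfer_block(x, branches, mix=[0.7, 0.3])
```

Here each branch carries its own style scale and bias, which is one plausible reading of how a user-selected style could control the AdaIN operations; claim 13 states that the mixing weights and style vectors are determined during training, whereas this sketch simply hard-codes them.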

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06T, G06N

Patent Metadata

Filing Date: June 8, 2020

Publication Date: February 22, 2022
