An automatic teeth whitening system analyzes digital content and detects at least one teeth region in the digital content. A teeth region refers to a region or portion of the digital content that includes teeth (e.g., human teeth), and the teeth region detection includes identifying each pixel that displays part of the teeth using instance segmentation. The automatic teeth whitening system also finds the visual structure of each tooth in the teeth region using instance contours specific to the tooth. After the teeth region and the visual structure of each tooth in the teeth region are found, a whitening process is applied to the teeth. The whitening is performed automatically, so the user does not need to manually select teeth regions or color the teeth in those regions.
Legal claims defining the scope of protection, as filed with the USPTO.
1. In a digital medium environment to edit digital content, a method implemented by at least one computing device, the method comprising: obtaining first digital content; identifying a teeth region mask that identifies a teeth region of the first digital content that includes teeth of a person; determining, using the first digital content, where each individual tooth is located in the teeth region; creating, using the first digital content, an intensity map that identifies an intensity of each individual tooth in the teeth region, the intensity map indicating that a first individual tooth located in the teeth region has a different intensity than a second individual tooth located in the teeth region; generating, using a first machine learning system, second digital content that includes a whitened version of the teeth of the person, the generating second digital content including generating, for at least one individual tooth in the teeth region, colors of pixels of the at least one individual tooth based on a color of the at least one individual tooth in the first digital content as well as the intensity of the at least one individual tooth in the intensity map; and displaying the second digital content.
2. The method as recited in claim 1, the identifying the teeth region mask comprising using a second machine learning system that is trained at least in part by minimizing a teeth pixel gradient loss, the teeth pixel gradient loss being a difference between a first teeth pixel gradient value calculated over a first teeth region mask generated by the second machine learning system for a training image and a second teeth pixel gradient value calculated over a ground truth teeth region mask of the training image, the first teeth pixel gradient value measuring a gradient of probability and a Hessian at multiple points in the first teeth region mask, and the second teeth pixel gradient value measuring the gradient of probability and the Hessian at multiple points in the ground truth teeth region mask.
3. The method as recited in claim 2, the second machine learning system being further trained by minimizing a mask loss, the mask loss comprising a binary cross-entropy loss that compares, for the training image, how close the first teeth region mask generated by the second machine learning system is to the ground truth teeth region mask of the training image.
4. The method as recited in claim 1, wherein determining where each individual tooth is located in the teeth region comprises: identifying one or more pixels that have at least a threshold probability of being at a location on an edge of a tooth in the teeth region; and reducing pixel RGB values of each of the identified one or more pixels.
5. The method as recited in claim 1, wherein generating the second digital content comprises using a generative adversarial network to generate the second digital content.
6. The method as recited in claim 1, wherein creating the intensity map comprises averaging, for each individual tooth in the teeth region, pixel color values across the individual tooth.
7. The method as recited in claim 1, wherein generating the second digital content comprises: generating, using the first machine learning system, intermediate whitened teeth for the teeth in the teeth region, the first machine learning system having been trained to transform the teeth region from a domain of non-white teeth to a domain of white teeth; and combining, using the intensity map, the intermediate whitened teeth and the teeth located in the teeth region of the first digital content.
8. The method as recited in claim 1, further comprising repeating the identifying, determining, creating, and generating for each of one or more additional teeth regions in the first digital content.
9. The method as recited in claim 1, wherein the first digital content comprises a first digital still image of a collection of multiple digital still images, the method further comprising automatically repeating the obtaining, identifying, determining, creating, and generating for each of multiple additional digital still images in the collection of multiple digital still images.
10. The method as recited in claim 1, wherein the first digital content comprises a first frame of a digital video, the method further comprising automatically repeating the obtaining, identifying, determining, creating, and generating for each of multiple additional frames in the digital video.
11. The method as recited in claim 1, the at least one individual tooth being a single user-selected tooth.
12. In a digital medium environment to edit digital content, a computing device comprising: a processor; and computer-readable storage media having stored thereon multiple instructions that, responsive to execution by the processor, cause the processor to perform operations comprising: obtaining first digital content; identifying both a teeth region mask that identifies a teeth region of the first digital content that includes teeth of a person and a cropped portion of the first digital content that includes the teeth of the person; determining, using the cropped portion of the first digital content and the teeth region mask, a tooth segmentation drawing that identifies where each individual tooth is located in the teeth region; creating, using the cropped portion of the first digital content and the tooth segmentation drawing, an intensity map that identifies an intensity of each individual tooth in the tooth segmentation drawing, the intensity map indicating that a first individual tooth located in the teeth region has a different intensity than a second individual tooth located in the teeth region; generating, using a first machine learning system, second digital content that includes a whitened version of the teeth of the person, the generating second digital content including generating, for at least one individual tooth in the tooth segmentation drawing, colors of pixels of the at least one individual tooth based on a color of each individual tooth in the first digital content as well as the intensity of the at least one individual tooth in the intensity map; and displaying the second digital content.
13. The computing device as recited in claim 12, the generating the teeth region mask and the cropped portion of the first digital content comprising using a second machine learning system that is trained at least in part by minimizing a teeth pixel gradient loss, the teeth pixel gradient loss being a difference between a first teeth pixel gradient value calculated over a first teeth region mask generated by the second machine learning system for a training image and a second teeth pixel gradient value calculated over a ground truth teeth region mask of the training image, the first teeth pixel gradient value measuring a gradient of probability and a Hessian at multiple points in the first teeth region mask, and the second teeth pixel gradient value measuring the gradient of probability and the Hessian at multiple points in the ground truth teeth region mask.
14. The computing device as recited in claim 12, wherein determining the tooth segmentation drawing comprises: identifying one or more pixels that have at least a threshold probability of being at a location on an edge of a tooth in the teeth region; and reducing pixel RGB values of each of the identified one or more pixels.
15. The computing device as recited in claim 12, wherein generating the second digital content comprises using a generative adversarial network to generate the second digital content.
16. The computing device as recited in claim 12, wherein creating the intensity map comprises averaging, for each individual tooth in the tooth segmentation drawing, pixel color values across the individual tooth.
17. The computing device as recited in claim 12, wherein generating the second digital content comprises: generating, using the first machine learning system, intermediate whitened teeth for the teeth in the teeth region; and combining, using the intensity map, the intermediate whitened teeth and the teeth located in the teeth region of the cropped portion of the first digital content, the intensity map indicating a ratio at which values of the teeth in the cropped portion of the first digital content are used relative to values of teeth in the intermediate whitened teeth.
18. A system comprising: an input module to obtain first digital content; a teeth region detection module to identify both a teeth region mask that identifies a teeth region of the first digital content that includes teeth and a cropped portion of the first digital content that includes the teeth; a tooth edge detection module to determine, using the cropped portion of the first digital content and the teeth region mask, a tooth segmentation drawing that identifies where each individual tooth is located in the teeth region; means for creating an intensity map that identifies an intensity of each individual tooth in the tooth segmentation drawing, the intensity map indicating that a first individual tooth located in the teeth region has a different intensity than a second individual tooth located in the teeth region; means for generating second digital content including generating, for at least one individual tooth in the tooth segmentation drawing, colors of pixels of the at least one individual tooth based on a color of each individual tooth in the first digital content as well as the intensity of the at least one individual tooth in the intensity map; and an output module to display the second digital content.
19. The system as recited in claim 18, wherein the means for generating the second digital content comprises using a generative adversarial network to generate the second digital content.
20. The system as recited in claim 18, wherein the means for generating the second digital content comprises: means for generating intermediate whitened teeth for the teeth in the teeth region, and combining, based on the intensity map, the intermediate whitened teeth and the teeth located in the teeth region of the cropped portion of the first digital content.
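Claims 2, 3, and 13 describe two training losses for the teeth region detector but give no formulas. The following is a minimal Python sketch of one possible reading, not the patented implementation. The function names, the central-difference stencils, and the choice to summarize the gradient and Hessian as a mean of their norms over interior pixels are all assumptions made for illustration. Masks are plain nested lists of probabilities in [0, 1].

```python
import math


def bce_mask_loss(pred, truth, eps=1e-7):
    """Mean binary cross-entropy between a predicted and a ground truth mask."""
    total, n = 0.0, 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            n += 1
    return total / n


def gradient_value(mask):
    """Assumed summary statistic: mean over interior pixels of
    |gradient| + Frobenius norm of the Hessian, via finite differences."""
    h, w = len(mask), len(mask[0])
    if h < 3 or w < 3:
        raise ValueError("mask must be at least 3x3")
    total, n = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (mask[y][x + 1] - mask[y][x - 1]) / 2
            gy = (mask[y + 1][x] - mask[y - 1][x]) / 2
            hxx = mask[y][x + 1] - 2 * mask[y][x] + mask[y][x - 1]
            hyy = mask[y + 1][x] - 2 * mask[y][x] + mask[y - 1][x]
            hxy = (mask[y + 1][x + 1] - mask[y + 1][x - 1]
                   - mask[y - 1][x + 1] + mask[y - 1][x - 1]) / 4
            total += math.hypot(gx, gy)
            total += math.sqrt(hxx ** 2 + 2 * hxy ** 2 + hyy ** 2)
            n += 1
    return total / n


def teeth_pixel_gradient_loss(pred, truth):
    """Difference between the gradient values of the two masks."""
    return abs(gradient_value(pred) - gradient_value(truth))


# Toy 3x3 example: a sharp vertical edge versus a flat, uncertain prediction.
truth = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
pred = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
total_loss = bce_mask_loss(pred, truth) + teeth_pixel_gradient_loss(pred, truth)
```

Summing the two terms with equal weight is also an assumption; the claims only state that both losses are minimized during training.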
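Claims 4 and 14 locate individual teeth by finding pixels that are likely to lie on a tooth edge and reducing their RGB values, which, on our reading, leaves darker boundary lines between adjacent teeth. A minimal sketch of that step follows. The function name `darken_tooth_edges`, the default threshold of 0.5, and the scaling factor of 0.5 are hypothetical; the claims specify only "a threshold probability" and a reduction of RGB values. The edge probability map is assumed to come from some upstream edge detector that is not shown here.

```python
def darken_tooth_edges(image, edge_prob, threshold=0.5, factor=0.5):
    """Return a copy of `image` with likely edge pixels darkened.

    image:     rows of (r, g, b) tuples, each channel in 0..255
    edge_prob: rows of per-pixel probabilities of lying on a tooth edge
    threshold: minimum probability for a pixel to count as an edge (assumed)
    factor:    multiplier in [0, 1] applied to edge-pixel RGB values (assumed)
    """
    out = []
    for img_row, prob_row in zip(image, edge_prob):
        out_row = []
        for (r, g, b), p in zip(img_row, prob_row):
            if p >= threshold:
                # Reduce all three channels for pixels at or above the threshold.
                out_row.append((int(r * factor), int(g * factor), int(b * factor)))
            else:
                out_row.append((r, g, b))
        out.append(out_row)
    return out


# One-row toy image: the first pixel is a probable edge, the second is not.
image = [[(200, 180, 160), (200, 180, 160)]]
edge_prob = [[0.9, 0.1]]
drawing = darken_tooth_edges(image, edge_prob)
```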
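Claims 6 and 16 build the intensity map by averaging pixel color values across each tooth, and claims 7, 17, and 20 combine the intermediate whitened teeth with the original teeth using that map, which claim 17 describes as a ratio between original and whitened values. The sketch below illustrates both steps under stated assumptions: the tooth segmentation is represented as an integer label map (0 for non-tooth pixels), intensity is the mean of the three channels scaled to [0, 1], and that intensity is used directly as the weight given to the original pixel. None of these choices is stated in the claims, and the function names are hypothetical.

```python
from collections import defaultdict


def tooth_intensity_map(image, labels):
    """Return {tooth_id: mean intensity in [0, 1]} for every labeled tooth."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for img_row, lab_row in zip(image, labels):
        for (r, g, b), tooth_id in zip(img_row, lab_row):
            if tooth_id == 0:  # label 0 marks non-tooth pixels (assumed)
                continue
            sums[tooth_id] += (r + g + b) / (3 * 255)
            counts[tooth_id] += 1
    return {t: sums[t] / counts[t] for t in sums}


def blend_whitened(original, whitened, labels, intensity):
    """Mix original and whitened pixels per tooth.

    The tooth's intensity is the weight of the original pixel, so teeth that
    are already bright change less (an assumed mapping from intensity to ratio).
    """
    out = []
    for o_row, w_row, l_row in zip(original, whitened, labels):
        row = []
        for o, w, tooth_id in zip(o_row, w_row, l_row):
            if tooth_id == 0:
                row.append(o)  # leave non-tooth pixels untouched
                continue
            a = intensity[tooth_id]
            row.append(tuple(round(a * oc + (1 - a) * wc) for oc, wc in zip(o, w)))
        out.append(row)
    return out


# Toy example: tooth 1 is already white, tooth 2 is dull, last pixel is not a tooth.
original = [[(255, 255, 255), (102, 102, 102), (10, 20, 30)]]
whitened = [[(255, 255, 255), (202, 202, 202), (0, 0, 0)]]
labels = [[1, 2, 0]]
intensity = tooth_intensity_map(original, labels)
result = blend_whitened(original, whitened, labels, intensity)
```

Here `whitened` stands in for the output of the first machine learning system; the generator itself (for example, the generative adversarial network of claims 5, 15, and 19) is outside the scope of this sketch.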
April 23, 2019
December 29, 2020