Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for quantitatively tracking food intake using smart glasses, the method comprising: capturing at least one image of food, and motion data indicative of motion by a user of the smart glasses; identifying a type of the food by performing object recognition on the at least one image of food; determining a volume of the food by performing volume estimation on the at least one image of food; obtaining nutritional data associated with the type of the food and the volume of the food; generating nutritional performance data by comparing the nutritional data to a nutritional benchmark for the user; identifying a plurality of hand-to-mouth motions by the user by analyzing the motion data; identifying a plurality of chewing motions by the user using the smart glasses; calculating a weighted average of a number of the plurality of hand-to-mouth motions and a number of the plurality of chewing motions, wherein calculating the weighted average includes weighing the number of the plurality of hand-to-mouth motions more heavily than the number of the plurality of chewing motions; generating food intake frequency data by comparing the weighted average of the number of the plurality of hand-to-mouth motions and the number of the plurality of chewing motions to baseline metrics; and displaying the nutritional performance data and the food intake frequency data to the user on the smart glasses.
2. The method of claim 1, wherein identifying the plurality of hand-to-mouth motions includes determining that a gaze of the user is focused on food being brought to a mouth of the user.
3. The method of claim 1, wherein the plurality of chewing motions are identified, at least in part, using an audio signal captured by a microphone on the smart glasses.
4. The method of claim 1, further comprising: receiving feedback from the user explicitly identifying the type of the food; and updating, based on the feedback, a machine learning model trained to perform the object recognition for the type of the food.
5. The method of claim 1, wherein the volume estimation is performed by analyzing a plurality of images of the food captured at different angles.
6. The method of claim 1, wherein the volume estimation is performed by applying a machine learning model trained to predict depth of the food from one image of the food.
7. The method of claim 1, wherein identifying the plurality of hand-to-mouth motions includes: applying a machine learning model trained to receive the motion data and categorize the motion data as being or not being indicative of a hand-to-mouth motion.
8. The method of claim 1, wherein the motion data is received from at least one of an inertial measurement unit (IMU) or an image capture device, or a combination thereof.
9. A computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform a process for quantitatively tracking food intake using smart glasses, the process comprising: capturing motion data indicative of motion by a user of the smart glasses; identifying a plurality of hand-to-mouth motions by the user by analyzing the motion data; identifying a plurality of chewing motions by the user using the smart glasses; calculating a weighted average of a number of the plurality of hand-to-mouth motions and a number of the plurality of chewing motions, wherein calculating the weighted average includes weighing the number of the plurality of hand-to-mouth motions more heavily than the number of the plurality of chewing motions; generating food intake frequency data by comparing the weighted average of the number of the plurality of hand-to-mouth motions and the number of the plurality of chewing motions to baseline metrics; and displaying the food intake frequency data to the user on the smart glasses.
10. The computer-readable storage medium of claim 9, wherein the process further comprises: capturing at least one image of food; identifying a type of the food by performing object recognition on the at least one image of food; determining a volume of the food by performing volume estimation on the at least one image of food; obtaining nutritional data associated with the type of the food and the volume of the food; generating nutritional performance data by comparing the nutritional data to a nutritional benchmark for the user; and displaying the nutritional performance data to the user on the smart glasses.
11. The computer-readable storage medium of claim 9, wherein identifying the plurality of hand-to-mouth motions includes determining that a gaze of the user is focused on food being brought to a mouth of the user.
12. The computer-readable storage medium of claim 9, wherein the plurality of chewing motions are identified, at least in part, using an audio signal captured by a microphone on the smart glasses.
13. The computer-readable storage medium of claim 9, wherein identifying the plurality of hand-to-mouth motions includes: applying a machine learning model trained to receive the motion data and categorize the motion data as being or not being indicative of a hand-to-mouth motion.
14. The computer-readable storage medium of claim 9, wherein the motion data is received from at least one of an inertial measurement unit (IMU) or an image capture device, or a combination thereof.
15. A computing system for quantitatively tracking food intake using smart glasses, the computing system comprising: one or more processors; and one or more memories storing instructions that, when executed by the one or more processors, cause the computing system to perform a process comprising: capturing at least one image of food; identifying a type of the food by performing object recognition on the at least one image of food; determining a volume of the food by performing volume estimation on the at least one image of food; obtaining nutritional data associated with the type of the food and the volume of the food; generating nutritional performance data by comparing the nutritional data to a nutritional benchmark for a user of the smart glasses; identifying a plurality of hand-to-mouth motions by the user by analyzing the motion data; identifying a plurality of chewing motions by the user using the smart glasses; calculating a weighted average of a number of the plurality of hand-to-mouth motions and a number of the plurality of chewing motions, wherein calculating the weighted average includes weighing the number of the plurality of hand-to-mouth motions more heavily than the number of the plurality of chewing motions; generating food intake frequency data by comparing the weighted average of the number of the plurality of hand-to-mouth motions and the number of the plurality of chewing motions to baseline metrics; and displaying the nutritional performance data and the food intake frequency data to the user on the smart glasses.
16. The computing system of claim 15, wherein the volume estimation is performed by analyzing a plurality of images of the food captured at different angles.
17. The computing system of claim 15, wherein the volume estimation is performed by applying a machine learning model trained to predict depth of the food from one image of the food.
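The weighted-average step recited in claims 1, 9, and 15 can be illustrated with a minimal sketch. The claims only require that the number of hand-to-mouth motions be weighted more heavily than the number of chewing motions, and that the result be compared to baseline metrics. They do not state weight values, the form of the baseline, or how the comparison is made. The weights (0.7 and 0.3), the single numeric baseline, the ratio-based comparison, and all function and field names below are therefore hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class IntakeFrequency:
    """Illustrative container for food intake frequency data (hypothetical)."""
    weighted_average: float
    baseline: float
    ratio: float  # weighted average divided by baseline


def weighted_intake_average(hand_to_mouth_count, chewing_count,
                            hand_weight=0.7, chew_weight=0.3):
    """Weighted average of the two motion counts.

    The default weights are assumptions; the claims only require that
    hand-to-mouth motions weigh more than chewing motions.
    """
    if hand_weight <= chew_weight:
        raise ValueError("hand-to-mouth weight must exceed chewing weight")
    if hand_to_mouth_count < 0 or chewing_count < 0:
        raise ValueError("motion counts must be non-negative")
    total_weight = hand_weight + chew_weight
    weighted_sum = hand_weight * hand_to_mouth_count + chew_weight * chewing_count
    return weighted_sum / total_weight


def intake_frequency(hand_to_mouth_count, chewing_count, baseline):
    """Compare the weighted average to a baseline metric.

    Using a single positive number as the baseline and a ratio as the
    comparison are assumptions made for this sketch.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    average = weighted_intake_average(hand_to_mouth_count, chewing_count)
    return IntakeFrequency(average, baseline, average / baseline)
```

Under these assumptions, a ratio above 1 would indicate more intake activity than the baseline and a ratio below 1 would indicate less. A real system could use a different comparison, such as a difference or a per-meal threshold.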
February 4, 2025