12236652

Event Trigger Based on Region-Of-Interest Near Hand-Shelf Interaction

Published: February 25, 2025
Assignee: not available in the USPTO data
Patent Claims
18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A system, comprising: an object configured to store items; an image sensor positioned such that a field-of-view of the image sensor encompasses at least a portion of the object, wherein the image sensor is configured to generate images of the stored items; and at least one processor configured to: determine a first pixel position of a body part of a person based at least in part upon a first image generated by the image sensor; determine a second pixel position of the body part of the person based at least in part upon a second image generated by the image sensor; determine a set of pixel positions of the body part during a timeframe associated with the images, wherein the set of pixel positions comprises the first pixel position of the body part and the second pixel position of the body part; determine an aggregated body part position based on the set of pixel positions determined for the timeframe; determine that the aggregated body part position corresponds to a position associated with the object; in response to determining that the aggregated body part position corresponds to a position associated with the object, provide a trigger signal indicating an interaction event has occurred; and in response to providing the trigger signal: determine at least one item-selection image associated with the person placing a first item on the object; determine, in the at least one item-selection image, a region-of-interest based on the aggregated body part position, wherein the region-of-interest includes a subset of the pixels of the item-selection image; identify, using an object detection algorithm, the first item in the selected region-of-interest; and assign the identified first item to the person.
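As an illustration only, the aggregation and region-of-interest steps recited in claim 1 could be sketched as follows. The claim does not say how the pixel positions are aggregated or how large the region-of-interest is, so the mean, the square window, and every name below (`aggregate_position`, `region_of_interest`, `crop`, `half_size`) are assumptions made for this sketch, not the patented implementation.

```python
from statistics import fmean
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float]  # (x, y) in image coordinates


def aggregate_position(positions: Sequence[Pixel]) -> Pixel:
    """Collapse the per-image pixel positions of a timeframe into one point.

    Using the arithmetic mean is an assumption; the claim only requires
    that the result is based on the set of positions.
    """
    if not positions:
        raise ValueError("need at least one pixel position")
    return (fmean(p[0] for p in positions), fmean(p[1] for p in positions))


def region_of_interest(center: Pixel, width: int, height: int,
                       half_size: int = 32) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a square window around `center`,
    clamped to the image bounds, so it covers only a subset of the pixels."""
    cx, cy = round(center[0]), round(center[1])
    left = max(0, cx - half_size)
    top = max(0, cy - half_size)
    right = min(width, cx + half_size)
    bottom = min(height, cy + half_size)
    return (left, top, right, bottom)


def crop(image: List[List[int]], box: Tuple[int, int, int, int]) -> List[List[int]]:
    """Cut the region-of-interest out of a row-major image (list of rows)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

In this sketch the cropped patch, rather than the full item-selection image, is what would be handed to an object detection algorithm; the detector itself is not shown because the claim does not name one.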

2. The system of claim 1, wherein the processor determines that the aggregated body part position corresponds to the position associated with the object by: comparing the aggregated body part position to a set of one or more predefined object positions; and determining, based on the comparison of the aggregated body part position to the set of one or more predefined object positions, that the aggregated body part position is within a threshold distance of at least one of the set of predefined object positions.
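The comparison in claim 2 can be pictured with a short, hedged sketch. Euclidean distance in pixel space and an inclusive threshold are assumptions here, since the claim only says "within a threshold distance"; the function name `matches_object_position` is hypothetical.

```python
import math
from typing import Iterable, Tuple

Pixel = Tuple[float, float]


def matches_object_position(aggregated: Pixel,
                            object_positions: Iterable[Pixel],
                            threshold: float) -> bool:
    """True if the aggregated body part position lies within `threshold`
    pixels (Euclidean, inclusive; both are assumptions) of at least one
    predefined object position."""
    return any(math.dist(aggregated, pos) <= threshold
               for pos in object_positions)
```

A caller would raise the trigger signal of claim 1 when this returns `True`.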

3. The system of claim 1, further comprising a second image sensor positioned such that a field-of-view of the image sensor encompasses at least a portion of the object, wherein the second image sensor is configured to generate top-down images of a region around the object; and wherein the processor is communicatively coupled to the second image sensor and further configured to: receive a top-view image feed comprising top-view images from the second image sensor; determine, based on the received top-view image feed, that the person is within a threshold distance from the object; and in response to determining that the person is within the threshold distance of the object, begin receiving an image feed comprising the set of images generated by the image sensor.
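One way to read claim 3 is as a gate: the top-view feed is watched until the person comes close enough to the object, and only then is the other image feed consumed. The sketch below assumes the top-view tracker already yields one (x, y) position per frame and that distance is Euclidean; `first_frame_within_range` is a hypothetical helper, not something named in the patent.

```python
import math
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float]


def first_frame_within_range(person_track: Iterable[Point],
                             object_xy: Point,
                             threshold: float) -> Optional[int]:
    """Return the index of the first top-view frame in which the person is
    within `threshold` of the object, or None if that never happens."""
    for index, person_xy in enumerate(person_track):
        if math.dist(person_xy, object_xy) <= threshold:
            return index
    return None
```

The returned index marks the point from which a caller would begin receiving images from the first image sensor.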

4. The system of claim 1, wherein the processor is further configured to determine the aggregated body part position by determining a maximum depth associated with the object to which the pixel position body part position extends in the set of images.
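Claim 4 describes aggregating by maximum depth rather than, say, averaging. A minimal sketch follows, under a strong assumption that the claim itself does not state: the camera geometry is such that depth into the object grows along the image x-axis, measured from a known front-edge column `shelf_front_x`. Both that parameter and the function name are hypothetical.

```python
from typing import Sequence, Tuple

Pixel = Tuple[float, float]


def deepest_position(positions: Sequence[Pixel], shelf_front_x: float) -> Pixel:
    """Pick the pixel position that reaches furthest past the front edge of
    the object. Depth is modeled as x - shelf_front_x, which is an assumption
    about how the image sensor is mounted."""
    if not positions:
        raise ValueError("need at least one pixel position")
    return max(positions, key=lambda p: p[0] - shelf_front_x)
```

With a different mounting, the depth expression inside `key` would change while the "take the maximum over the timeframe" structure stays the same.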

5. The system of claim 1, wherein: the object includes a visible marker located at a predefined location; and the processor is further configured to: detect the visible marker; and determine the predefined position associated with the object based on the detected markers.

6. The system of claim 1, wherein the processor is further configured to: determine, based on the aggregated body part position, candidate items that may have been placed on the object by the person, wherein the candidate items include a subset of all stored items, wherein the subset comprises the items located within a threshold distance of the aggregated body part position; for each candidate item, determine, based on a comparison of a predefined position associated with the candidate items to the aggregated body part position, a probability value that the candidate item was interacted with by the person; and identify the first item as the candidate item with the largest probability value.
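The candidate-ranking logic of claim 6 can be illustrated with a small sketch. The claim requires a probability per candidate derived from comparing positions but does not give a formula, so the normalized inverse-distance weighting below is purely an assumption, as are the names `candidate_probabilities` and `most_likely_item`.

```python
import math
from typing import Dict, Optional, Tuple

Pixel = Tuple[float, float]


def candidate_probabilities(item_positions: Dict[str, Pixel],
                            aggregated: Pixel,
                            threshold: float) -> Dict[str, float]:
    """Keep items within `threshold` of the aggregated body part position and
    score each one. The weight 1 / (1 + distance), normalized to sum to 1,
    is an illustrative choice, not the formula used in the patent."""
    distances = {name: math.dist(pos, aggregated)
                 for name, pos in item_positions.items()}
    weights = {name: 1.0 / (1.0 + d)
               for name, d in distances.items() if d <= threshold}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()} if weights else {}


def most_likely_item(probabilities: Dict[str, float]) -> Optional[str]:
    """Return the candidate with the largest probability, or None if empty."""
    return max(probabilities, key=probabilities.get) if probabilities else None
```

For example, with an aggregated position of (0, 0), an item at distance 1 and another at distance 5 receive weights 1/2 and 1/6, which normalize to 0.75 and 0.25, so the nearer item is selected.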

7. A method, comprising: determining a first pixel position of a body part of a person based at least in part upon a first image generated by the image sensor; determining a second pixel position of the body part of the person based at least in part upon a second image generated by the image sensor; determining a set of pixel positions of the body part during a timeframe associated with the set of images, wherein the set of pixel positions comprises the first pixel position of the body part and the second pixel position of the body part; determining an aggregated body part position based on the set of pixel positions determined for the timeframe; determining that the aggregated body part position corresponds to a position associated with the object; in response to determining that the aggregated body part position corresponds to a position associated with the object, providing a trigger signal indicating an interaction event has occurred; and in response to providing the trigger signal: determining at least one item-selection image associated with the person placing a first item on the object; determining, in the at least one item-selection image, a region-of-interest based on the aggregated body part position, wherein the region-of-interest includes a subset of the pixels of the item-selection image; identifying, using an object detection algorithm, the first item in the selected region-of-interest; and assigning the identified first item to the person.

8. The method of claim 7, wherein determining that the aggregated body part position corresponds to the position associated with the object by: comparing the aggregated body part position to a set of one or more predefined object positions; and determining, based on the comparison of the aggregated body part position to the set of one or more predefined object positions, that the aggregated body part position is within a threshold distance of at least one of the set of predefined object positions.

9. The method of claim 7, further comprising: receiving a top-view image feed comprising top-view images from a second image sensor, wherein the second image sensor is positioned such that a field-of-view of the second image sensor encompasses at least a portion of the object, wherein the second image sensor is configured to generate top-down images of a region around the object; determining, based on the received top-view image feed, that the person is within a threshold distance from the object; and in response to determining that the person is within the threshold distance of the object, beginning to receive an image feed comprising the set of images generated by the image sensor.

10. The method of claim 7, further comprising determining the aggregated body part position by determining a maximum depth associated with the object to which the pixel position body part position extends in the set of images.

11. The method of claim 7, wherein: the object includes a visible marker located at a predefined location; and the method further comprises: detecting the visible marker; and determining the predefined position associated with the object based on the detected markers.

12. The method of claim 7, further comprising: determining, based on the aggregated body part position, candidate items that may have been placed on the object by the person, wherein the candidate items include a subset of all stored items, wherein the subset comprises the items located within a threshold distance of the aggregated body part position; for each candidate item, determining, based on a comparison of a predefined position associated with the candidate items to the aggregated body part position, a probability value that the candidate item was interacted with by the person; and identifying the first item as the candidate item with the largest probability value.

13. A tracking subsystem comprising at least one processor configured to: determine a first pixel position of a body part of a person based at least in part upon a first image generated by the image sensor; determining a second pixel position of the body part of the person based at least in part upon a second image generated by the image sensor; determine a set of pixel positions of the body part during a timeframe associated with the set of images, wherein the set of pixel positions comprises the first pixel position of the body part and the second pixel position of the body part; determine an aggregated body part position based on the set of pixel positions determined for the timeframe; determine that the aggregated body part position corresponds to a position associated with the object; in response to determining that the aggregated body part position corresponds to a position associated with the object, provide a trigger signal indicating an interaction event has occurred; and in response to providing the trigger signal: determine at least one item-selection image associated with the person placing a first item on the object; determine, in the at least one item-selection image, a region-of-interest based on the aggregated body part position, wherein the region-of-interest includes a subset of the pixels of the item-selection image; identify, using an object detection algorithm, the first item in the selected region-of-interest; and assign the identified first item to the person.

14. The tracking subsystem of claim 13, wherein the processor determines that the aggregated body part position corresponds to the position associated with the object by: comparing the aggregated body part position to a set of one or more predefined object positions; and determining, based on the comparison of the aggregated body part position to the set of one or more predefined object positions, that the aggregated body part position is within a threshold distance of at least one of the set of predefined object positions.

15. The tracking subsystem of claim 13, wherein the processor is further configured to: receive a top-view image feed comprising top-view images from a second image sensor, wherein the second image sensor is positioned such that a field-of-view of the second image sensor encompasses at least a portion of the object, wherein the second image sensor is configured to generate the top-down images of a region around the object; determine, based on the received top-view image feed, that the person is within a threshold distance from the object; and in response to determining that the person is within the threshold distance of the object, begin receiving an image feed comprising the set of images generated by the image sensor.

16. The tracking subsystem of claim 13, wherein the processor is further configured to determine the aggregated body part position by determining a maximum depth associated with the object to which the pixel position body part position extends in the set of images.

17. The tracking subsystem of claim 13, wherein: the object includes a visible marker located at a predefined location; and the processor is further configured to: detect the visible marker; and determine the predefined position associated with the object based on the detected markers.

18. The tracking subsystem of claim 13, wherein the processor is further configured to: determine, based on the aggregated body part position, candidate items that may have been placed on the object by the person, wherein the candidate items include a subset of all stored items, wherein the subset comprises the items located within a threshold distance of the aggregated body part position; for each candidate item, determine, based on a comparison of a predefined position associated with the candidate items to the aggregated body part position, a probability value that the candidate item was interacted with by the person; and identify the first item as the candidate item with the largest probability value.

Patent Metadata

Filing Date

Unknown

Publication Date

February 25, 2025

Inventors

Sumedh Vilas Datar
Sailesh Bharathwaaj Krishnamurthy
Shahmeer Ali Mirza

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “EVENT TRIGGER BASED ON REGION-OF-INTEREST NEAR HAND-SHELF INTERACTION” (12236652). https://patentable.app/patents/12236652
