US-8411149

Method and device for identifying and extracting images of multiple users, and for recognizing user gestures

Published: April 2, 2013

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

The invention relates to a method for identifying and extracting images of one or more users in an interactive environment, comprising the steps of: obtaining a depth map (7) of a scene in the form of an array of depth values, and an image (8) of said scene in the form of a corresponding array of pixel values, said depth map (7) and said image (8) being registered; applying a coordinate transformation to said depth map (7) and said image (8) to obtain a corresponding array (15) of points containing the 3D positions in a real-world coordinate system and the pixel values at those positions; grouping said points according to their relative positions, using a clustering process (18), so that each group contains points that are in the same region of space and correspond to a user location (19); defining individual volumes of interest (20), each corresponding to one of said user locations (19); selecting, from said array (15) containing the 3D positions and pixel values, the points located in said volumes of interest to obtain segmentation masks (35) for each user; and applying said segmentation masks (35) to said image (8) to extract images of said users. The invention also relates to a method for recognizing gestures of said users.
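The first steps of the abstract (coordinate transformation, then clustering into user locations) can be illustrated with a minimal sketch. The abstract does not name a camera model or a clustering algorithm, so everything specific here is an assumption made for illustration: a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, a depth value of 0 meaning "no reading", and a simple distance-threshold clustering with a hypothetical `radius` parameter.

```python
from collections import deque


def back_project(depth, image, fx, fy, cx, cy):
    """Turn a registered depth map and image into a list of points.

    Each point is (x, y, z, pixel, row, col). The pinhole model and the
    intrinsics fx, fy, cx, cy are assumptions, not taken from the patent.
    """
    points = []
    for v, depth_row in enumerate(depth):
        for u, z in enumerate(depth_row):
            if z <= 0:  # assumed convention: 0 means no depth reading
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z, image[v][u], v, u))
    return points


def cluster_points(points, radius):
    """Group point indices that chain together within `radius` in 3D.

    A plain region-growing scheme, chosen for brevity; the patent only
    says "a clustering process".
    """
    unvisited = set(range(len(points)))
    groups = []
    while unvisited:
        seed = unvisited.pop()
        group, queue = [seed], deque([seed])
        while queue:
            i = queue.popleft()
            xi, yi, zi = points[i][:3]
            near = [
                j for j in unvisited
                if (points[j][0] - xi) ** 2
                + (points[j][1] - yi) ** 2
                + (points[j][2] - zi) ** 2 <= radius ** 2
            ]
            for j in near:
                unvisited.remove(j)
                group.append(j)
                queue.append(j)
        groups.append(sorted(group))
    return groups


# One-row toy scene: two near pixels, a hole, two far pixels.
depth = [[2.0, 2.0, 0.0, 5.0, 5.0]]
image = [[11, 12, 13, 14, 15]]
points = back_project(depth, image, fx=10, fy=10, cx=2, cy=0)
groups = cluster_points(points, radius=0.6)
```

In this toy scene the two near pixels and the two far pixels end up in separate groups, each standing in for one user location. The region-growing loop is quadratic in the number of points, so a real depth map would need a spatial index or a coarser grid.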

Patent Claims

2 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A device for identifying and extracting images of multiple users in an interactive environment scene comprising:
a video camera for capturing an image from the scene;
a depth perception device for providing depth information about said scene;
at least one computer processor for processing said depth information and said image information, wherein:
said device comprises means for using individual volume of interest from said scene for each user;
said video camera comprises means for obtaining an image of said scene in a form of a corresponding array of pixel values;
said depth perception device comprises means for obtaining a depth map of said scene, said depth map and said image being registered to one another;
said at least one computer processor comprises means for coordinate transforming said depth map and said image to obtain a corresponding array of points containing 3D positions in a real-world coordinates system and pixel values at said 3D positions;
said at least one computer processor also comprises means for grouping said points according to their relative positions, by using a clustering process so that each group contains points that are in a same region of space and correspond to a respective user location;
said at least one computer processor further comprises means for defining individual volumes of interest, each volume of interest corresponding to one of said user locations;
said means for using said individual volumes of interest comprises means for selecting, from said array of points containing the 3D positions and pixel values, the points located in said volumes of interest for obtaining segmentation masks for each user; and
said means for using said individual volumes of interest further comprises means for applying said segmentation masks to said image for extracting images of said users.

2. A device for identifying and extracting images of multiple users in an interactive environment scene comprising:
means for obtaining a depth map of a scene in a form of an array of depth values, and an image of said scene in a form of a corresponding array of pixel values, said depth map and said image being registered to one another,
means for coordinate transforming said depth map and said image to obtain a corresponding array of points containing 3D positions in a real-world coordinates system and pixel values at said 3D positions;
means for grouping said points according to their relative positions, by using a clustering process so that each group contains points that are in a same region of space and correspond to a respective user location;
means for defining individual volumes of interest, each volume of interest corresponding to one of said user locations;
means for selecting, from said array of points containing the 3D positions and pixel values, the points located in said volumes of interest for obtaining segmentation masks for each user, and
means for applying said segmentation masks to said image for extracting images of said users.
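The last three elements of claim 2 (defining a volume of interest per user location, selecting the points inside it to form a segmentation mask, and applying the mask to the image) might look roughly like the sketch below. The claims do not specify the shape of a volume of interest, so an axis-aligned box with a hypothetical `margin` is assumed here, along with a hypothetical point layout of (x, y, z, pixel, row, col) and a hypothetical `background` fill value.

```python
def volume_of_interest(points, group, margin):
    """Axis-aligned box around one group's points, grown by `margin`.

    The box shape is an assumption; the claims only require a volume
    corresponding to a user location.
    """
    bounds = []
    for axis in range(3):  # x, y, z
        values = [points[i][axis] for i in group]
        bounds.append((min(values) - margin, max(values) + margin))
    return tuple(bounds)


def segmentation_mask(points, volume, height, width):
    """Mark the image cells whose 3D point lies inside `volume`."""
    mask = [[False] * width for _ in range(height)]
    for x, y, z, _pixel, row, col in points:
        inside = all(lo <= c <= hi for c, (lo, hi) in zip((x, y, z), volume))
        if inside:
            mask[row][col] = True
    return mask


def apply_mask(image, mask, background=0):
    """Keep masked pixels and replace the rest with `background`."""
    return [
        [pixel if keep else background for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy data: four 3D points from a 1x5 image; indices 0 and 1 form one user.
image = [[11, 12, 13, 14, 15]]
points = [
    (-0.4, 0.0, 2.0, 11, 0, 0),
    (-0.2, 0.0, 2.0, 12, 0, 1),
    (0.5, 0.0, 5.0, 14, 0, 3),
    (1.0, 0.0, 5.0, 15, 0, 4),
]
volume = volume_of_interest(points, group=[0, 1], margin=0.1)
mask = segmentation_mask(points, volume, height=1, width=5)
user_image = apply_mask(image, mask)
```

Because the mask is built in image coordinates from points that carry their original row and column, it can be applied directly to the registered image without any further resampling.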

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06V A63F G06F G06T

Patent Metadata

Filing Date

August 3, 2006

Publication Date

April 2, 2013
