An information processing apparatus of the present disclosure includes: a reception unit configured to receive a first image in which one or more persons are captured and designation of a specific region in the first image, from a user; and a registration unit configured to perform a registration process of registering feature information of a specific person included in the specific region among the one or more persons captured in the first image, and consent to registration of the feature information for the specific person is obtained from the user.
Legal claims defining the scope of protection, as filed with the USPTO.
1. An information processing apparatus comprising: a reception unit configured to receive a first image in which one or more persons are captured and designation of a specific region in the first image, from a user; and a registration unit configured to perform a registration process of registering feature information of a specific person included in the specific region among the one or more persons captured in the first image, wherein consent to registration of the feature information for the specific person is obtained from the user.
2. The information processing apparatus according to claim 1, wherein the reception unit receives a second image corresponding to the first image before reception of the designation of the specific region, the information processing apparatus further comprises a preliminary process unit configured to detect one or more person regions from the received second image and send the one or more person regions to the user as a reply, and the specific region is designated from among the one or more person regions detected by the preliminary process unit.
3. The information processing apparatus according to claim 2, wherein the second image is an image identical to the first image.
4. The information processing apparatus according to claim 1, wherein the registration process includes a process of detecting one or more person regions from the first image and determining whether or not each of the detected one or more person regions matches the specific region, after reception of the first image and the designation by the reception unit.
5. The information processing apparatus according to claim 4, wherein, in a case where the person region detected from the first image matches the specific region in the registration process, the feature information of the person included in the specific region is extracted.
6. The information processing apparatus according to claim 2, wherein the reception unit receives the second image corresponding to the first image before reception of the designation, the preliminary process unit further stores the one or more person regions detected from the second image and a second hash value determined from the second image in a storage unit in association with each other, and the registration process includes a process of determining a first hash value from the first image by using an algorithm identical to an algorithm used for determination of the second hash value and obtaining the one or more person regions detected from the second image from the storage unit by using the first hash value, and a process of determining whether or not each of the one or more detected person regions matches the specific region.
7. The information processing apparatus according to claim 6, wherein, in a case where the person region detected from the first image and the specific region match each other in the registration process, the feature information of the person included in the specific region is extracted.
8. The information processing apparatus according to claim 2, wherein the reception unit receives the first image and the designation from a client terminal communicably connected to the information processing apparatus, and the preliminary process unit receives the second image from the client terminal, and sends the one or more person regions detected from the second image to the client terminal as the reply.
9. The information processing apparatus according to claim 8, wherein the preliminary process unit causes the client terminal to display a screen depicting one or more persons included in the one or more person regions detected from the second image.
10. An information processing apparatus comprising: a preliminary process unit configured to receive an image from a user and send one or more person regions included in the image and one or more pieces of feature information in association with each other as a reply, the one or more pieces of feature information extracted, respectively, from the one or more person regions and encrypted; a reception unit configured to receive the feature information associated with a specific region including a specific person among the one or more person regions included in the reply; and a registration unit configured to decrypt and register the feature information received by the reception unit, wherein consent to registration of the feature information for the specific person is obtained from the user.
11. The information processing apparatus according to claim 10, wherein the preliminary process unit receives the image from a client terminal communicably connected to the information processing apparatus, and sends the reply to the client terminal, and the reception unit receives the feature information associated with the specific region designated in the client terminal.
12. The information processing apparatus according to claim 11, wherein the preliminary process unit causes the client terminal to display a screen depicting one or more persons included in the one or more person regions detected from the image.
13. The information processing apparatus according to claim 9, wherein the screen includes an inquiry to the user about the consent to the registration of the feature information by the registration unit for each of the one or more persons displayed on the screen.
14. The information processing apparatus according to claim 13, wherein the one or more persons being targets of the inquiry are sequentially changed and displayed on the screen.
15. The information processing apparatus according to claim 13, wherein a designation operation of the one or more persons being targets of the inquiry is received from the user on the screen.
16. The information processing apparatus according to claim 1, wherein the specific region includes a face region, and the feature information is a feature amount of a face.
17. An information processing apparatus capable of being communicably connected to a client terminal, the information processing apparatus comprising: a reception unit configured to receive an image transmitted from the client terminal; a reply unit configured to send information on one or more persons detected from the image as a reply such that the information is visually recognizable by a user of the client terminal; and a registration unit configured to register feature information of a specific person for which consent of the user is obtained, among the one or more persons detected from the image.
18. An information processing system in which an information processing apparatus and a client terminal are communicably connected to each other via a network, wherein the information processing apparatus includes: a reception unit configured to receive a first image in which one or more persons are captured and designation of a specific region in the first image, from the client terminal; and a registration unit configured to perform a registration process of registering feature information of a specific person included in the specific region among the one or more persons captured in the first image, and the client terminal includes a transmission unit configured to transmit the first image and the designation of the specific region including the specific person in the first image to the information processing apparatus in a case where consent to registration of the feature information for the specific person is obtained from a user.
19. An information processing system in which an information processing apparatus and a client terminal are communicably connected to each other via a network, wherein the information processing apparatus includes: a preliminary process unit configured to receive an image from the client terminal and send one or more person regions included in the image and one or more pieces of feature information in association with each other to the client terminal as a reply, the one or more pieces of feature information extracted, respectively, from the one or more person regions and encrypted; a reception unit configured to receive the feature information associated with a specific region including a specific person among the one or more person regions included in the reply, from the client terminal; and a registration unit configured to decrypt and register the feature information received by the reception unit, and the client terminal includes a transmission unit configured to transmit the feature information associated with the specific region to the information processing apparatus in a case where consent to registration of the feature information for the specific person designated from the one or more person regions included in the reply is obtained from the user.
20. An information processing system in which an information processing apparatus and a client terminal are communicably connected to each other via a network, wherein the information processing apparatus includes: a reception unit configured to receive an image transmitted from the client terminal; a reply unit configured to send information on one or more persons detected from the image to the client terminal as a reply; and a registration unit configured to register feature information of a specific person for which consent of a user is obtained, among the one or more persons detected from the image, and the client terminal includes: an inquiry unit configured to present the one or more persons detected from the image based on the reply such that the one or more persons are visually recognizable by the user, and inquire about the consent of the user to the registration of the feature information for each of the one or more presented persons; and a request unit configured to request the information processing apparatus to register the feature information of the specific person for which the consent is obtained by the inquiry unit.
21. An information processing method executed by an information processing apparatus, the information processing method comprising: receiving a first image in which one or more persons are captured and designation of a specific region in the first image, from a user; and performing a registration process of registering feature information of a specific person included in the specific region among the one or more persons captured in the first image, wherein consent to registration of the feature information for the specific person included in the specific region is obtained from the user.
22. An information processing method executed by an information processing apparatus, the information processing method comprising: receiving an image from a user and sending one or more person regions included in the image and one or more pieces of feature information in association with each other as a reply, the one or more pieces of feature information extracted, respectively, from the one or more person regions and encrypted; receiving the feature information associated with a specific region including a specific person among the one or more person regions included in the reply; and decrypting and registering the received feature information, wherein consent to registration of the feature information for the specific person is obtained from the user.
23. An information processing method executed by an information processing apparatus capable of being communicably connected to a client terminal, the information processing method comprising: receiving an image transmitted from the client terminal; sending information on one or more persons detected from the image as a reply such that the information is visually recognizable by a user of the client terminal; and registering feature information of a specific person for which consent of the user is obtained, among the one or more persons detected from the image.
24. A non-transitory computer readable storage medium storing a program which causes a computer to execute an image processing method comprising: receiving a first image in which one or more persons are captured and designation of a specific region in the first image, from a user; and performing a registration process of registering feature information of a specific person included in the specific region among the one or more persons captured in the first image, wherein consent to registration of the feature information for the specific person included in the specific region is obtained from the user.
25. A non-transitory computer readable storage medium storing a program which causes a computer to execute an information processing method comprising: receiving an image from a user and sending one or more person regions included in the image and one or more pieces of feature information in association with each other as a reply, the one or more pieces of feature information extracted, respectively, from the one or more person regions and encrypted; receiving the feature information associated with a specific region including a specific person among the one or more person regions included in the reply; and decrypting and registering the received feature information, wherein consent to registration of the feature information for the specific person is obtained from the user.
26. A non-transitory computer readable storage medium storing a program which causes a computer to execute an information processing method executed by a computer capable of being communicably connected to a client terminal, the information processing method comprising: receiving an image transmitted from the client terminal; sending information on one or more persons detected from the image as a reply such that the information is visually recognizable by a user of the client terminal; and registering feature information of a specific person among the one or more persons detected from the image.
Complete technical specification and implementation details from the patent document.
The present disclosure relates to a service in which feature information of a person captured in an image is registered.
A service that receives input of an image from a client service or an application and that recognizes whose face is captured in the inputted image (hereinafter, face recognition service) is widely used. A user of the service needs to perform in advance work of registering a face of a person to be a recognition target in the face recognition service.
Patent Literature 1 (Japanese Patent Laid-Open No. 2023-157932) discloses a procedure of registering a face image. In this procedure, explanation relating to handling of personal information in a system is first displayed on an initial screen in which a user performs an operation of registering a face image, and the user is requested to give consent to the handling of the personal information. Then, in the case where the consent of the user is obtained, the face image of the user himself/herself is captured with a camera provided in a registration device, is presented to the user, and is transmitted to a face management server with consent.
An information processing apparatus of the present disclosure includes: a reception unit configured to receive a first image in which one or more persons are captured and designation of a specific region in the first image, from a user; and a registration unit configured to perform a registration process of registering feature information of a specific person included in the specific region among the one or more persons captured in the first image, and consent to registration of the feature information for the specific person is obtained from the user.
Features of the present disclosure will become apparent from the following description of embodiments with reference to the attached drawings. The following embodiments are described by way of example.
A server of a face recognition service outputs a face recognition result based on a degree of similarity between a feature amount extracted from a face captured in an inputted image and a feature amount of a face of a person registered in advance. Accordingly, a user of the service needs to perform work of registering a face of a person to be a recognition target in the face recognition service in advance.
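The similarity-based recognition described above can be illustrated with a minimal sketch. The description does not specify the similarity measure; cosine similarity, the threshold value, and the names `cosine_similarity` and `recognize` are assumptions introduced only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(query_feature, registered, threshold=0.8):
    """Return the registered name most similar to the query, or None.

    `registered` maps a person name to a feature amount registered in
    advance. The threshold of 0.8 is a hypothetical value.
    """
    best_name, best_score = None, threshold
    for name, feature in registered.items():
        score = cosine_similarity(query_feature, feature)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

In this sketch a face that was never registered yields no match, which is why the registration work must precede recognition.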
However, in the method described in Patent Literature 1, it is assumed that a captured image obtained by capturing a user himself/herself with a camera provided in a registration apparatus is set as a target of a registration process, and consent is obtained from the user himself/herself who is the subject of the captured image. Accordingly, the target of the process cannot be an arbitrary image, such as a snapshot in which multiple faces are captured. In recent years, services that manage, in a cloud, various images captured by a user are also provided, and achieving both the provision of such varied services and compliance with the AI ethics and the legal restraints described above has become a challenge.
In the present embodiment, explanation is given of an information processing system capable of providing a service complying with the AI ethics and the legal restraints in a service in which feature information of a person captured in an image is registered.
The present invention is explained below in detail based on preferable embodiments of the present invention with reference to the attached drawings. Note that configurations illustrated in the following embodiments are merely examples, and the present invention is not limited to the illustrated configurations.
A system configuration of an information processing system according to the present embodiment is explained. FIG. 1 is a diagram illustrating an example of the system configuration of the information processing system 1 according to the present embodiment. As illustrated in FIG. 1, in the information processing system 1, an information processing apparatus 100 and a client terminal 110 are communicably connected to each other via a network 120 to form a server-client system. The information processing apparatus 100 functions as a server apparatus of the information processing system 1, and the client terminal 110 functions as a client.
Note that the information processing apparatus 100 that is the server may be implemented by one computer or may have a configuration including multiple computers. In the information processing system 1 illustrated in FIG. 1, a communication method used for connection between the apparatuses is, for example, communication standards of IEEE 802.11 series (Wi-Fi (registered trademark)) or Bluetooth (registered trademark). The communication between the apparatuses may be executed by Internet communication via a wireless LAN router, or the apparatuses may communicate with each other by mobile communication (3G, 4G, or 5G). The client terminal 110 is an information processing terminal used by the user, and has a web browser function for implementing browsing of a web site on the Internet and executing a web application provided by the server. The client terminal 110 is formed of, for example, a personal computer (PC), a smartphone, a tablet, or any other information processing terminal, or a camera or the like having a network communication function.
FIG. 2A is a diagram illustrating an example of a hardware configuration of the information processing apparatus 100. The information processing apparatus 100 is a computer, and includes a CPU 201, a RAM 202, a ROM 203, a network interface (I/F) 204, a storage device 205, a display device 206, and an input device 207. These units are connected to one another via a bus 209.
The CPU 201 performs operation control of the units forming the information processing apparatus 100, and is a subject that executes later-mentioned various processes performed by the information processing apparatus 100. The RAM 202 is a memory that temporarily stores data and control information, and is a work area used in the execution of the various processes by the CPU 201. Operation parameters, operation programs, and the like fixedly used by the information processing apparatus 100 are stored in the ROM 203.
The network I/F 204 provides a function of connecting to and communicating with the network 120. The information processing apparatus 100 exchanges data with an external apparatus via the network I/F 204. The storage device 205 is a device that stores data, and has an interface that receives I/O commands for reading and writing data. The storage device 205 may be a hard disk drive (HDD), a solid-state drive (SSD), an optical disc drive, a semiconductor storage device, or any other storage device. The storage device 205 stores computer programs and data that cause the CPU 201 to execute the later-mentioned processes executed by the information processing apparatus 100.
The display device 206 is, for example, a liquid crystal display (LCD), and displays information outputted from the CPU 201 in a state where the user can visually recognize the information. The input device 207 includes, for example, a keyboard, a mouse, a touch panel, and the like, receives input of information depending on an operation of the user, and inputs the information into the CPU 201.
FIG. 2B is a diagram illustrating an example of a hardware configuration of the client terminal 110. The client terminal 110 includes a CPU 211, a RAM 212, a ROM 213, a network I/F 214, a storage device 215, a display device 216, an input device 217, an imaging device 218, and the like. These units are connected to one another via a bus 219. Although a configuration of a smartphone having an imaging function is illustrated as the client terminal 110 in the example of FIG. 2B, the client terminal 110 is not limited to this configuration, and may have a different configuration. Since the CPU 211, the RAM 212, the ROM 213, the network I/F 214, the storage device 215, the display device 216, and the input device 217 are similar to the CPU 201, the RAM 202, the ROM 203, the network I/F 204, the storage device 205, the display device 206, and the input device 207 described above, explanation thereof is omitted. The imaging device 218 includes a lens and an imaging element such as charge coupled devices (CCD) or a complementary metal-oxide-semiconductor (CMOS), and inputs data of a captured image into the CPU 211.
In the present embodiment, the various functions provided by the information processing apparatus 100 are provided to the client terminal 110 as a web service. Specifically, the information processing apparatus 100 provides a user interface (UI) screen through a web browser of the client terminal 110. The information processing apparatus 100 executes processes corresponding to the various functions provided by the information processing apparatus 100 while displaying various pieces of data on the client terminal 110 through the UI screen and receiving input of data from the client terminal 110. Alternatively, a dedicated application may be installed in the client terminal 110. The client terminal 110 may execute this application to execute the processes corresponding to the various functions with the information processing apparatus 100 while exchanging data with the information processing apparatus 100. Moreover, the present disclosure is not limited to these forms, and the functions of the information processing apparatus 100 may be implemented by any method. Furthermore, various pieces of hardware forming the information processing apparatus 100 may be virtual hardware resources on a cloud. In this case, the information processing apparatus 100 transmits requests for executing the functions to the hardware resources via the network I/F 204, and obtains processing results via the network I/F 204.
Next, a functional configuration of the information processing system 1 according to the present embodiment is explained. FIG. 3 is a diagram illustrating a functional configuration example of the information processing system 1 in the first embodiment. As illustrated in FIG. 3, the information processing apparatus 100 includes a preliminary process unit 300, a reception unit 305, and a registration unit 310. The preliminary process unit 300 includes a preliminary reception unit 301, a preliminary detection unit 302, and a reply unit 303. The registration unit 310 includes a detection unit 306, a determination unit 307, an extraction unit 308, and a storage unit 309. The client terminal 110 includes a preliminary transmission unit 311, a consent obtaining unit 312, and a transmission unit 313. The CPU implements functions of these functional units by invoking programs stored in the ROM or the storage device and executing processes according to the programs.
The reception unit 305 receives a first image in which a person is captured and designation of a specific region in the first image as an execution request of registration. The first image and the designation of the specific region in the first image are transmitted from the transmission unit 313 of the client terminal 110. The first image is data of an arbitrarily-captured image, and may be a still image or a moving image. Moreover, the first image may be an image captured by any imaging device. Furthermore, the first image is assumed to include one or multiple persons as a subject. The specific region is a region including a specific person who is the target of registration of the feature information. The designation of the specific region is expressed by information by which the position and the size of the specific region in the first image can be identified, and is expressed by, for example, coordinate values of two points in the first image. Specifically, in the case where the shape of the specific region is a rectangle on an image plane, the upper left coordinates and the lower right coordinates of this rectangular region can be used as the information relating to the designation of the specific region. Note that the information designating the specific region is not limited to this, and may be other information. For example, the information designating the specific region may be information indicating the upper left coordinates of the region, the size of the region, and the shape of the region.
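The two ways of designating a rectangular specific region mentioned above (two corner points, or an upper left point plus a size) can be sketched as follows. The class name `Region` and its field and method names are hypothetical and are not taken from the disclosure; the sketch only shows that both designations identify the same position and size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular region on the image plane, in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int

    def __post_init__(self):
        # The lower right corner must lie below and to the right of the
        # upper left corner, otherwise no size can be identified.
        if self.right <= self.left or self.bottom <= self.top:
            raise ValueError("invalid corner ordering")

    @classmethod
    def from_corners(cls, upper_left, lower_right):
        """Designation by the coordinate values of two points."""
        return cls(upper_left[0], upper_left[1], lower_right[0], lower_right[1])

    @classmethod
    def from_origin_and_size(cls, upper_left, width, height):
        """Designation by the upper left coordinates and the size."""
        x, y = upper_left
        return cls(x, y, x + width, y + height)

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.bottom - self.top
```

For example, `Region.from_corners((10, 20), (110, 220))` and `Region.from_origin_and_size((10, 20), 100, 200)` describe the same rectangle.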
In the first embodiment, the reception unit 305 receives the first image and the designation of the specific region in the first image after a preliminary process by the preliminary process unit 300. The preliminary process unit 300 receives a second image corresponding to the first image before the reception of the designation of the specific region by the reception unit 305. The preliminary process unit 300 detects one or multiple person regions from the second image, and sends the person regions to the client terminal as a reply. The specific region in the first image received by the reception unit 305 is designated from among the person regions sent by the preliminary process unit 300 as the reply. Each of the person regions is assumed to be detected as a region including a face. In the following explanation, the person regions are also referred to as face regions.
The preliminary reception unit 301 of the preliminary process unit 300 receives the second image from the preliminary transmission unit 311 of the client terminal 110. In the present embodiment, the second image is assumed to be an image identical to the first image. However, the second image does not have to be an image completely identical to the first image. For example, the first image and the second image may vary from each other by an image process or changes in some of color values in the images.
The preliminary detection unit 302 detects one or multiple person regions from the second image received by the preliminary reception unit 301. In the following explanation, one or multiple person regions detected from the second image are referred to as second person regions, and one or multiple person regions detected from the first image are referred to as first person regions. The reply unit 303 sends (transmits) information on the second person regions detected by the preliminary detection unit 302 to the client terminal 110. The information on the second person regions transmitted to the client terminal 110 includes at least information designating the position and the size of each person region in the second image. For example, in the case where each second person region is designated as the upper left coordinates and the lower right coordinates in the second image, a rectangular region whose diagonal line is a line connecting the upper left coordinates and the lower right coordinates is the second person region in the second image.
The client terminal 110 receives the information on the second person regions from the information processing apparatus 100 as the reply of the preliminary process. The consent obtaining unit 312 of the client terminal 110 presents the second person regions such that the user can visually recognize the second person regions. The client terminal 110 displays a screen depicting the second person regions obtained from the information processing apparatus 100, on the display device 216 of the client terminal 110. For example, the consent obtaining unit 312 cuts out regions corresponding to the second person regions from the second image transmitted by the preliminary transmission unit 311 to the information processing apparatus 100, and displays person images enlarged or shrunk to a predetermined size, on the screen. The screen is displayed on the display device 216 of the client terminal 110. Examples of the screen are described later.
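The cut-out and resizing step on the client side can be sketched as below. This is an illustration under stated assumptions, not the disclosed implementation: the image is modeled as a list of pixel rows, the lower right coordinates are treated as exclusive, nearest-neighbor sampling is chosen only for brevity, and the function names `crop` and `resize_nearest` are hypothetical.

```python
def crop(image, left, top, right, bottom):
    """Cut out the rectangle [left, right) x [top, bottom) from the image.

    `image` is a list of rows, each row a list of pixel values.
    """
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image, out_width, out_height):
    """Enlarge or shrink an image to a predetermined size.

    Uses nearest-neighbor sampling (an assumption made for brevity).
    """
    in_height = len(image)
    in_width = len(image[0])
    return [
        [image[y * in_height // out_height][x * in_width // out_width]
         for x in range(out_width)]
        for y in range(out_height)
    ]


def person_thumbnails(image, regions, size):
    """Return one fixed-size person image per received person region."""
    width, height = size
    return [resize_nearest(crop(image, *region), width, height)
            for region in regions]
```

Each resulting thumbnail would then be drawn on the screen next to the corresponding consent inquiry.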
The consent obtaining unit 312 inquires of the user about designation of one of the second person regions included in the reply of the preliminary process and consent to the registration of the feature information by the information processing apparatus 100 for the person included in the designated region. This inquiry is performed for each person. For example, on the screen described above, the inquiry about consent is performed for each of the persons displayed on the screen. Note that the designation of the second person region may be designation by a user operation or automatic designation by a process of the client terminal 110. For example, in the case where there is one second face region obtained as the reply of the preliminary process, the consent obtaining unit 312 may designate this one second person region, and inquire of the user about consent to the registration for a person in this region. Moreover, in the case where there are multiple second face regions obtained as the reply of the preliminary process, the consent obtaining unit 312 may designate and display the multiple second person regions one by one on the screen, and inquire of the user about consent to the registration for the person in the designated region together with the display of the designated region. In the following explanation, an example in which the region is designated by the user operation is explained.
In the case where the consent obtaining unit 312 obtains the consent of the user for the person included in the designated region, the transmission unit 313 of the client terminal 110 sets this region as the specific region, and transmits designation information of the specific region to the information processing apparatus 100 together with the first image. As described above, the first image is normally the same image as the second image. Note that the transmission unit 313 may also transmit information indicating that the consent for the specific person included in the specific region is obtained, together with the designation information of the specific region and the first image, to the information processing apparatus 100.
Note that the user who answers the inquiry about the consent does not have to be the person captured in the region. The user who answers the inquiry about the consent is a user of the face registration service, and is assumed to be a person who captures the first image (second image), a person who owns the first image (second image), an assistant who assists the registration, or the like, in addition to the person captured in the region. Specifically, the user who responds to the inquiry about the consent is a person such as a family member or a friend of the person captured in the region or a person who receives a permission from the person captured in the region. Note that, in the case where no consent of the user is obtained, the transmission unit 313 of the client terminal 110 does not transmit the first image and the designation of the specific region.
The registration unit 310 of the information processing apparatus 100 registers the feature information of the specific person included in a region matching the designated specific region, the region being one of the first person regions that are the one or multiple person regions included in the first image received by the reception unit 305. Note that “match” in the present specification means that the positions, the sizes, and the shapes of compared regions are identical or similar to one another. “Similar” means that a value of the overlapping degree to be described later is equal to or more than a predetermined value.
The registration unit 310 first detects one or multiple person regions (first person regions) included in the first image with the detection unit 306. For example, the registration unit 310 detects regions including faces of persons. Next, the determination unit 307 determines whether the first person regions detected by the detection unit 306 include a region matching the region (specific region) corresponding to the designation received by the reception unit 305. In the case where there is a matching region, this matching region is identified as the specific region. The extraction unit 308 extracts a feature amount of a person, in this case a feature amount of a face, for the specific region. The determination of matching is performed based on, for example, an overlapping degree of the regions. The overlapping degree is described later. The storage unit 309 stores the feature amount of the face extracted from the specific region in the storage device 205 as the feature information of the specific person.
Next, explanation is given of a feature registration process executed by the information processing system 1 in the first embodiment. FIG. 4 is a flowchart illustrating a flow of the feature registration process in the first embodiment, and registration of a feature amount of a face of a person is explained as an example. Processes illustrated in S402 to S404 and S409 to S413 of the present flowchart are described in a program of a web application stored in the ROM 203 or the storage device 205 of the information processing apparatus 100. The program is invoked by the CPU 201, is expanded on the RAM 202, and is executed by the CPU 201. Moreover, processes illustrated in S401 and S405 to S408 of the present flowchart are described in a program of a web application expanded on the RAM 212 of the client terminal 110, and are executed by the CPU 211 of the client terminal 110.
The client terminal 110 accesses a web site provided by the information processing apparatus 100 through the web browser. In the case where a login process is completed, the information processing apparatus 100 transmits a top screen including a menu list to the client terminal 110. In the case where a face registration menu in the top screen is selected from the client terminal 110, the CPU 201 of the information processing apparatus 100 starts the process of the present flowchart. Sign “S” in the following explanation means step. In the feature registration process of FIG. 4, S401 to S404 among the processes executed by the information processing apparatus 100 are referred to as the preliminary process, and S409 to S413 are referred to as the registration process.
In S401, the CPU 211 of the client terminal 110 transmits an image to the information processing apparatus 100 as an execution request of the preliminary process of the face registration. The image transmitted in this case is the second image described above. The second image is an image stored in the storage device 215 of the client terminal 110 or an image captured with the imaging device 218 of the client terminal 110. The second image may be a still image or a moving image. The user selects the second image, and transmits the second image to the information processing apparatus 100. Note that the information processing apparatus 100 may display a dialog screen for selecting the second image on the client terminal in the case where the face registration menu is selected.
In S402, the CPU 201 of the information processing apparatus 100 receives the second image transmitted from the client terminal 110, as the execution request of the preliminary process of the face registration.
In S403, the CPU 201 detects regions of faces of persons from the image received in S402. FIG. 5 is a diagram illustrating an example of face regions detected from the second image. Two persons 501 and 502 are captured in a second image 500, and face regions 503 and 504 are detected for these persons 501 and 502, respectively. The CPU 201 obtains coordinate information 506 of the face regions 503 and 504 in an image coordinate system of the second image 500. The image coordinate system is a two-dimensional coordinate system including coordinate axes in two directions orthogonal to each other. In the example of FIG. 5, an upper left point of the image is the origin (0, 0), a horizontal direction in FIG. 5 is an X axis, and a vertical direction in FIG. 5 is a Y axis. For example, the position and the size of each of the face regions 503 and 504 are identified as a rectangular region identified by the upper left coordinates and the lower right coordinates of the region. In the example of the coordinate information 506, the face region 503 with the upper left coordinates (206, 205) and the lower right coordinates (684, 416) is detected as region ID “1”. Moreover, the face region 504 with the upper left coordinates (1176, 310) and the lower right coordinates (1458, 483) is detected as region ID “2”.
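As an illustration only, the coordinate information described above could be held in a structure such as the following. This is a minimal sketch; the class name FaceRegion and its field names are assumptions made for this example and do not appear in the present disclosure. The coordinate values are those given for the face regions in the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceRegion:
    """Rectangular face region in the image coordinate system.

    The origin (0, 0) is the upper left point of the image, X grows to the
    right and Y grows downward. All names here are hypothetical.
    """
    region_id: str
    left: int    # X of the upper left corner
    top: int     # Y of the upper left corner
    right: int   # X of the lower right corner
    bottom: int  # Y of the lower right corner

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


# Coordinate information for the two face regions of the example.
coordinate_info = [
    FaceRegion("1", 206, 205, 684, 416),
    FaceRegion("2", 1176, 310, 1458, 483),
]
```

Identifying each region by two corner points keeps the reply small, because only the coordinates, and not the image, need to be sent back to the client terminal.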
In the present embodiment, the CPU 201 detects regions estimated to be faces from an inputted image (second image) by using a publicly-known technique such as an inference machine trained by deep learning, and outputs image information of the detected face regions. Moreover, in the case where the inputted image is a moving image, the CPU only needs to detect regions estimated to be faces by using the inference machine as in the case of the still image for one representative frame in the moving image. Note that the detection method of the face regions is not limited to the method using deep learning, and any other method may be used.
In S404, the CPU 201 of the information processing apparatus 100 sends the information on the face regions detected in S403 to the client terminal 110 as a reply. Specifically, the CPU 201 transmits the coordinate information 506 of FIG. 5 to the client terminal 110. In this stage, the second image is not held in the information processing apparatus 100. This is to reduce data holding cost and comply with legal restraints.
In S405, the CPU 211 of the client terminal 110 displays face images corresponding to the face regions 503 and 504 received as the reply from the information processing apparatus 100, on the display device 216 of the client terminal 110. FIGS. 6A to 6C are diagrams illustrating examples of screens displayed in the client terminal 110. First, in S405, the CPU 211 of the client terminal 110 displays a face selection screen 600 illustrated in FIG. 6A, on the display device 216. In FIG. 6A, face images 601 and 602 corresponding to the face regions 503 and 504 detected in the preliminary detection process of S403 are displayed side by side. The CPU 211 cuts out the face regions from the second image transmitted in the preliminary process based on the coordinate information of the face regions 503 and 504 received as the reply from the information processing apparatus 100, and displays the face regions on the screen. Note that the face images 601 and 602 do not have to be displayed in a state where the sizes thereof are the same as the sizes of the second face regions in the second image. On the face selection screen 600, the face images 601 and 602 may be displayed while being enlarged or shrunk to an appropriate size such that the user can visually recognize each face. Moreover, although the example in which the face images obtained by cutting out the face regions from the second image are displayed is illustrated in the example of FIG. 6A, the present disclosure is not limited to this. The configuration may be such that a frame line of an appropriate size is displayed at an appropriate position on the second image to indicate each of the face regions included in the reply.
In the case where one of the images of the face regions is designated by a user operation and an OK button 604 is operated in S406, the CPU 211 causes the process to proceed to S407, and the screen transitions to a consent screen 610 of FIG. 6B. In the case where a cancel button 603 is operated, the present flowchart is terminated, or the process returns to the face region detection (S403).
In S407, the CPU 211 displays the consent screen 610 of FIG. 6B on the display device 216. Then, the CPU 211 obtains the consent of the user for the face included in the specific face region designated by the user in S405. In the consent screen 610 of FIG. 6B, the face image 601 designated in FIG. 6A and a message 611 requesting consent are displayed. Contents of the message 611 are, for example, “This face will be registered. Do you agree with extraction of feature amount of face and saving of feature amount in system?” or the like. As illustrated in this example, the message 611 includes a text requesting consent to extraction and registration of the feature amount of the face by the information processing apparatus 100. Moreover, the consent screen 610 is provided with a “yes” button 612 pressed in the case where the user gives the consent and a “no” button 613 pressed in the case where the user does not give the consent. In the case where the user operates the “yes” button 612, the CPU 211 causes the process to proceed to S407, and the screen transitions to a confirmation screen 620 in FIG. 6C. In the case where the user operates the “no” button 613, the present flowchart is terminated, or the process returns to the face region detection (S403).
A link 626 to terms of use and a check box 623 are displayed on the confirmation screen 620 of FIG. 6C, in addition to the information displayed on the consent screen 610 illustrated in FIG. 6B. Moreover, the confirmation screen 620 is provided with a “next” button 624 and a “return” button 625. In the case where the user operates the link 626 to terms of use, the screen transitions to a page of terms of use relating to the face recognition service. In the case where the check box 623 is pressed once, a check mark is displayed. In the case where the check box 623 is pressed again, the displayed check mark disappears. In S407, in the case where the user operates the “next” button 624 in a state where the check mark is displayed in the check box 623, the CPU 211 assumes that the consent is obtained, and saves the coordinate information of the face region corresponding to the designated face image 601, in the RAM 212. Then, the CPU 211 closes the confirmation screen 620 of FIG. 6C, and the process proceeds to S408. In the case where the user operates the “return” button 625, the screen returns to the face selection screen 600 illustrated in FIG. 6A.
Note that, in the example of FIG. 6A, description is given of the example as follows: the face images 601 and 602 corresponding to the multiple face regions detected from the second image are displayed side by side; the user selects one of the images of the face regions; and the screen transitions to the subsequent consent screen 610. However, the screen configuration may be a different screen configuration. The consent screens 610 on which the face images corresponding to the multiple detected face regions are displayed, respectively, may be sequentially displayed. For example, in the case where the two face regions 503 and 504 are detected as illustrated in FIG. 5, first, an inquiry about the consent for the face image 601 corresponding to one face region 503 is made on the consent screen 610 of FIG. 6B. In the case where the user operates the “yes” button 612, the screen transitions to the confirmation screen 620 of FIG. 6C. In the case where the user operates the “no” button 613, the confirmation screen 620 of FIG. 6C is skipped, and the consent screen 610 displaying the face image 602 corresponding to the next face region 504 is displayed.
In the case where the user operates the “next” button 624 in the state where the checkmark is inputted in the check box 623 in the confirmation screen 620, the CPU 211 assumes that the consent is obtained, and stores the coordinate information of the face region corresponding to the face image, in the RAM 212. Then, the consent screen 610 displaying the face image 602 corresponding to the next face region 504 is displayed. In the case where the user operates the “next” button 624 in a state where no checkmark is inputted in the check box 623 in the confirmation screen 620, the CPU 211 assumes that the consent is not obtained, and displays the consent screen 610 displaying the face image 602 corresponding to the next face region 504. In the case where the inquiry about consent for all face regions included in the reply is performed and answers to the inquiry are obtained, the process proceeds to S408.
Note that the display form of each of the screens 600, 610, and 620 may be any display form as long as the face images being the targets of consent are displayed such that the user can visually recognize the face images. For example, frames illustrating the detected face regions may be displayed on the second image. Moreover, selection of the face to be the target of consent may be received from the face regions by selecting any of these frames. Furthermore, the configuration may be such that, on the face selection screen 600, checkmarks are displayed for the faces to be the targets of consent in response to designation operations by the user, and consent is performed in a batch for the faces for which the checkmarks are displayed. Moreover, confirmation items necessary for the consent other than the terms of use may be added to the screen.
In the case where the consent is obtained for none of the face regions in S407, the CPU 211 does not execute the processes of S408 and beyond, and terminates the present flowchart. In the case where the consent is obtained for at least one of the face regions in S407, the process proceeds to S408.
In S408, the CPU 211 of the client terminal 110 transmits the first image and the designation information of the face region for which the consent is obtained from the user, to the information processing apparatus 100 as an execution request of the face registration. The first image transmitted in this case is assumed to be the same image as the second image preliminarily transmitted in S401. The designation information of the face region is the coordinate information of the face region saved in the RAM 212 as the coordinate information for which the consent is obtained.
In S409, the CPU 201 of the information processing apparatus 100 receives the first image and the designation information of the face region for which the consent is obtained from the user, that is, the specific region, as the execution request of the face registration.
In S410, the CPU 201 of the information processing apparatus 100 detects one or multiple face regions from the first image received in S409. The one or multiple face regions detected from the first image by the information processing apparatus 100 after the reception of the first image and the specific region are also referred to as first face regions.
In S411, the CPU 201 of the information processing apparatus 100 determines whether or not the first face regions detected in S410 include a region matching the region relating to the designation received in S409. In the determination process, the CPU 201 calculates the overlapping degree between the region relating to the designation received in S409 and each of the first face regions detected by the CPU 201 in S410. The CPU 201 determines whether the regions match or not based on a value of the overlapping degree. The overlapping degree is an index expressing a ratio of overlapping of images, and for example, an evaluation index referred to as intersection over union (IoU) can be used. The larger the IoU is, the more the images overlap each other. The CPU 201 assumes that the first face region whose value of the overlapping degree is the largest and is larger than a predetermined threshold is the region “matching” the region relating to the designation. The first face region whose value of the overlapping degree is not the largest or is not larger than the predetermined threshold is assumed to be a region “not matching” the region relating to the designation.
A method of calculating the overlapping degree is explained. The CPU 201 calculates the overlapping degree as a ratio of an “AND region of the target face region that is one of the detected first face regions and the region relating to the designation received from the client terminal 110” to an “OR region of the target face region and the region relating to the designation”. The CPU 201 calculates the overlapping degree while setting each of the first face regions detected from the first image as the target. Then, the first face region whose value of the overlapping degree is the largest and is larger than the predetermined threshold is set as the specific region as described above. The overlapping degree is expressed by the following formula 1.
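Formula 1 itself is not reproduced in this text. Based on the description above, it can be written as follows, where the symbols A (the target face region), B (the region relating to the designation), and S(·) (the area of a region) are notation assumed here for illustration.

```latex
\text{overlapping degree} = \frac{S(A \cap B)}{S(A \cup B)}
\qquad \text{(formula 1)}
```

The value ranges from 0, in the case where the two regions do not overlap at all, to 1, in the case where the two regions are identical.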
Moreover, in the case where there are multiple target face regions whose overlapping degrees are the largest, the target face region whose likelihood is the largest is preferentially assumed to be the “matching” region (specific region). The likelihood is a score calculated in the course of the face region detection process in S410. The higher the value of the likelihood is, the higher the trustworthiness of the target face region being a face, in other words, the higher the likelihood of the detected region being a “face”. In the case where there are multiple face regions with the same overlapping degree for the positions, the sizes, and the shapes of the face regions, the region that is most trustworthy as being a face is prioritized based on the likelihood.
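The matching determination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the present disclosure: the names Box, Detection, iou, and find_matching_region are hypothetical, regions are assumed to be axis-aligned rectangles, and the threshold value 0.5 is an arbitrary placeholder for the predetermined threshold.

```python
from typing import NamedTuple, Optional, Sequence


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


class Detection(NamedTuple):
    box: Box           # a detected first face region
    likelihood: float  # detector score for the region being a face


def area(b: Box) -> float:
    # A box with non-positive width or height has zero area.
    return max(0.0, b.right - b.left) * max(0.0, b.bottom - b.top)


def iou(a: Box, b: Box) -> float:
    """Overlapping degree: area of the AND region over area of the OR region."""
    inter = Box(max(a.left, b.left), max(a.top, b.top),
                min(a.right, b.right), min(a.bottom, b.bottom))
    inter_area = area(inter)
    union_area = area(a) + area(b) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


def find_matching_region(designated: Box,
                         detections: Sequence[Detection],
                         threshold: float = 0.5) -> Optional[Detection]:
    """Return the detection with the largest overlapping degree above the
    threshold; ties are broken by the larger likelihood. None if no match."""
    best: Optional[Detection] = None
    best_key = None
    for det in detections:
        score = iou(designated, det.box)
        if score <= threshold:  # must be strictly larger than the threshold
            continue
        key = (score, det.likelihood)
        if best_key is None or key > best_key:
            best, best_key = det, key
    return best
```

Comparing the pair (overlapping degree, likelihood) means that the likelihood is consulted only in the case where the overlapping degrees are equal, which corresponds to the priority described above.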
In the case where the CPU 201 determines that the detected first face regions do not include the region matching the region relating to the designation in S411 (S411; NO), the present flowchart is terminated, and the feature registration process is terminated. Meanwhile, in the case where the CPU 201 determines that the detected first face regions include the region matching the region relating to the designation in S411 (S411; YES), the process proceeds to S412.
In S412, the CPU 201 extracts the feature amount of the face while setting the region (specific region) matching the region relating to the designation in the first image as the target. In the present embodiment, the CPU 201 extracts the feature amount of the face as an N-dimensional vector by using an inference model trained in advance by using a publicly-known technique such as deep learning. Note that the extraction process of the feature amount of the face is not limited to this method, and any method may be used as long as the extraction of the feature amount is possible.
In S413, the CPU 201 stores the feature amount of the face extracted in S412 in the storage device 205 as the feature information. FIG. 7 is a diagram illustrating an example of a feature amount table 700 stored in the storage device 205. As illustrated in FIG. 7, records including an ID column 701, a feature amount column 702, a label information column 703, and a registration time and date column 704 are accumulated and stored in the feature amount table 700. IDs for uniquely identifying the registered faces are stored in the ID column 701. The feature amounts of the faces corresponding to the face regions identified with the ID column 701 are stored in the feature amount column 702. Additional information for identifying the face regions is stored in the label information column 703. For example, names of persons are stored. Times and dates at which the feature amounts of the faces are registered are stored in the registration time and date column 704. Note that the information stored in the feature amount table 700 is an example, and is not limited to this.
As explained above, in the case where an image is transmitted from the client terminal 110, the information processing apparatus 100 of the first embodiment presents face regions detected from the transmitted image on the client terminal 110 as the reply to the transmission such that the user can visually recognize the face regions. In the client terminal 110, the consent to the registration of the face feature amount in the face recognition service is obtained from the user individually for each of the presented face regions. The client terminal 110 transmits the designation of the face region and the image to the information processing apparatus 100 for the face region for which the consent is obtained from the user. The information processing apparatus 100 detects face regions from the received image, identifies a region matching the region relating to the designation among the detected face regions, and extracts and registers the feature information of the face. The information processing apparatus 100 can thereby set any image as a target and register the feature information of a specific person captured in the image. Moreover, even in the case where the image includes multiple faces, the feature information of the face can be registered for a specific face for which the user has given consent. Accordingly, it is possible to provide a service complying with AI ethics and legal restraints in a service in which the feature information of a person captured in an image is registered.
Executing the feature registration process in the above-mentioned procedure allows the feature amount of the face that is personal information to be safely registered also in the case where data is exchanged via a network. Specifically, since the feature information of the face is not directly exchanged between the client terminal 110 and the information processing apparatus 100, leakage of the feature information does not occur during the exchange of data. Moreover, since the information processing apparatus 100 does not have to save the face image, it is possible to reduce data holding cost, and comply with legal restraints. Furthermore, there is a time lag between the reception of the image in the preliminary process and the reception of the image in the registration process. If the image is altered or changed to a different image in this time lag, determination of no match is likely to be given in the determination of the matching of the regions, and continuance of the process is inhibited.
Note that, although the method of detecting the regions estimated to be the faces by using a publicly-known technique such as the inference machine trained by deep learning is explained as the detection process of the face regions in the first embodiment, the present disclosure is not limited to this method. For example, the detection of the face regions may be implemented by an algorithm of a form other than the form of the inference model. Moreover, although the method of extracting the feature amount of the face as the N-dimensional vector by using the inference model trained in advance by a publicly-known technique such as deep learning is explained as the extraction method of the feature amount of the face in the above-mentioned embodiment, the present disclosure is not limited to this method. For example, the extraction of the feature amount of the face may be implemented by an algorithm of a form other than the form of the inference model. Alternatively, the feature amount of the face may be extracted as information that is not a vector. Moreover, although the method in which the overlapping degree is calculated as the ratio of the “AND region of the target face region and the region relating to the designation” to the “OR region of the target face region and the region relating to the designation” is explained in the present embodiment, the present disclosure is not limited to this calculation method. For example, the area of the “AND region of the target face region and the region relating to the designation” may be calculated as the overlapping degree. Moreover, a value of the “AND region of the target face region and the region relating to the designation” with respect to the “target face region” may be calculated as the overlapping degree. In addition, the configurations of the screens and the procedure of the processes in the flowchart are examples, and the present disclosure is not limited to the above-mentioned examples.
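The two alternative calculation methods of the overlapping degree mentioned above can be sketched as follows. This is an illustration only; the function names are hypothetical, and each region is assumed to be an axis-aligned rectangle given as a tuple (left, top, right, bottom).

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def intersection_area(target: Rect, designated: Rect) -> float:
    """Area of the AND region of the target face region and the region
    relating to the designation, used directly as the overlapping degree."""
    width = min(target[2], designated[2]) - max(target[0], designated[0])
    height = min(target[3], designated[3]) - max(target[1], designated[1])
    return max(0.0, width) * max(0.0, height)


def overlap_ratio_to_target(target: Rect, designated: Rect) -> float:
    """Ratio of the AND region to the target face region."""
    target_area = (target[2] - target[0]) * (target[3] - target[1])
    if target_area <= 0:
        return 0.0
    return intersection_area(target, designated) / target_area
```

Note that the first measure is an absolute area that depends on the image resolution, whereas the second measure is a ratio from 0 to 1, so a predetermined threshold would have to be chosen separately for each measure.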
In the first embodiment, description is given of an example of the procedure in which the detection of the person regions is executed twice in the series of processes of feature registration from the preliminary process to the registration process. In a second embodiment, explanation is given of a processing procedure in which the detection of the person regions is performed once. Note that, also in the second embodiment, as in the first embodiment, the information processing apparatus 100 presents the faces of the registration targets to the user to obtain the consent individually for each face, and then registers the feature information of the face in the face recognition service. Note that, since a system configuration and a hardware configuration of the information processing system in the second embodiment are similar to those in the first embodiment, explanation thereof is omitted, and the second embodiment is explained with the same units denoted by the same reference numerals. Moreover, points different from the first embodiment are mainly explained.
A functional configuration of the information processing apparatus 100A according to the second embodiment is explained. FIG. 8 is a block diagram illustrating the functional configuration of the information processing apparatus 100A according to the second embodiment. Note that units in FIG. 8 that are identical to the units in the functional configuration of the first embodiment are denoted by reference numerals identical to those in the first embodiment. The CPU implements functions of the functional units described below by invoking a program stored in the ROM or the storage device and executing processes according to the program.
The information processing apparatus 100A of the second embodiment includes a preliminary process unit 800, a hash value determination unit 801, the reception unit 305, and a registration unit 810. The client terminal 110 includes the preliminary transmission unit 311, the consent obtaining unit 312, and the transmission unit 313 as in the first embodiment.
The preliminary process unit 800 includes the preliminary reception unit 301, the preliminary detection unit 302, a temporary storage unit 802, and the reply unit 303. The registration unit 810 includes a reading unit 803, the determination unit 307, the extraction unit 308, and the storage unit 309. The hash value determination unit 801 is used by both of the preliminary process unit 800 and the registration unit 810.
The hash value determination unit 801 calculates hash values of images. The hash value determination unit 801 calculates the hash value for the image (second image) received by the preliminary reception unit 301. Moreover, the hash value determination unit 801 calculates the hash value for the image (first image) received by the reception unit 305. The hash value determination unit 801 calculates the hash values of the images by using the same algorithm for the case where the hash value determination unit 801 is executed in the preliminary process unit 800 and for the case where the hash value determination unit 801 is executed in the registration unit 810. In the present embodiment, for example, the hash values are assumed to be calculated by using the SHA-256 algorithm. However, the present disclosure is not limited to this, and any other algorithm can be used. If the first image and the second image are the same image, the same hash value is obtained. If not, different hash values are obtained.
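The hash value calculation described above can be sketched as follows by using the hashlib module of the Python standard library. The function name image_hash is hypothetical, and the sketch assumes that the hash value is calculated over the raw bytes of the received image file, which the present disclosure does not specify.

```python
import hashlib


def image_hash(image_bytes: bytes) -> str:
    """Return the SHA-256 hash value of the image data as a hex string.

    Assumption: the hash is taken over the received bytes as-is, so the same
    picture re-encoded or resized would yield a different hash value.
    """
    return hashlib.sha256(image_bytes).hexdigest()
```

Because the same function is called in the preliminary process and in the registration process, the two hash values coincide only in the case where the first image and the second image are byte-for-byte identical.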
The temporary storage unit 802 stores the hash value calculated for the second image in the preliminary process unit 800 and one or multiple person regions detected from the second image by the preliminary detection unit 302 in association with each other. In the following description, the person regions detected from the second images are assumed to be regions of faces of persons. The one or multiple face regions detected from the second image are referred to as second face regions.
The reply unit 303 transmits information on the second face regions detected from the second image by the preliminary detection unit 302, to the client terminal 110.
In the registration unit 810, the hash value determination unit 801 calculates the hash value for the image (first image) received by the reception unit 305, and passes the hash value to the reading unit 803. The reading unit 803 makes a query about (reads) the second face regions from the temporary storage unit 802 by using the hash value calculated for the image received by the reception unit 305. The determination unit 307 determines whether or not the second face regions read by the reading unit 803 include a region matching the region relating to the designation received by the reception unit 305. The determination of matching is the same as that in the first embodiment. In the case where there is a matching region, the extraction unit 308 extracts a feature amount of a face for this specific region. The storage unit 309 stores the feature amount of the face extracted from the specific region in the storage device 205.
Next, a feature registration process executed by the information processing system 1A in the second embodiment is explained. FIG. 9 is a flowchart illustrating a flow of the feature registration process in the second embodiment. Processes illustrated in S402, S403, S901, S902, S404, S409, S903 to S905, S412, and S413 of the present flowchart are described in a program of a web application stored in the ROM 203 or the storage device 205 of the information processing apparatus 100A. The program is invoked by the CPU 201, expanded on the RAM 202, and executed by the CPU 201. Moreover, the processes illustrated in S401 and S405 to S408 of the present flowchart are described in a program of a web application expanded on the RAM 212 of the client terminal 110, and are executed by the CPU 211 of the client terminal 110. In FIG. 9, the same processes as those in the feature registration process of the first embodiment are denoted by the same reference numerals. Processes different from the first embodiment are mainly explained below.
The processes of S401 to S403 are the same as those in the first embodiment. Specifically, the CPU 211 of the client terminal 110 transmits an image (second image) to the information processing apparatus 100A as the execution request of the preliminary process of the face registration. The CPU 201 of the information processing apparatus 100A receives the second image transmitted from the client terminal 110, and detects face regions of persons. Next, the process proceeds to S901.
901 201 100 402 902 201 100 901 403 802 In S, the CPUof the information processing apparatusA calculates the hash value of the second image received in S. In S, the CPUof the information processing apparatusA stores the hash value calculated in Sand one or multiple face regions (second face regions) detected in Sin the temporary storage unit, in association with each other.
FIG. 10 is a diagram illustrating an example of a hash value table 1000 stored in the temporary storage unit 802. As illustrated in FIG. 10, records including an image hash value column 1001, a face region column 1002, a detection time and date column 1003, and an automatic deletion time and date column 1004 are stored in the hash value table 1000. Note that each record corresponds to a single target image. The hash value calculated from the image is stored in the image hash value column 1001. A list of one or multiple face regions detected from the target image is stored in the face region column 1002. The time and date of execution of the detection of the face regions for the target image is stored in the detection time and date column 1003. The time and date of automatic deletion of the record is stored in the automatic deletion time and date column 1004. Note that the information stored in the hash value table 1000 is an example, and is not limited to that described above.
In the first record of FIG. 10, the image hash value “98abf72408 . . . ” is stored in association with regions [{(260, 205), (684, 416)}, {(1176, 310), (1458, 483)}] detected from the image. Two regions of a coordinate range indicated by {(260, 205), (684, 416)} and a coordinate range indicated by {(1176, 310), (1458, 483)} are detected as the regions.
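The storing and querying of the hash value table can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the class name `HashValueTable`, the in-memory dictionary, and the ten-minute retention period are all assumptions, and SHA-256 is used only because it is the algorithm named for this embodiment.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

# A face region as ((left, top), (right, bottom)), mirroring the coordinate
# pairs in the example record.
Region = Tuple[Tuple[int, int], Tuple[int, int]]


@dataclass
class HashRecord:
    face_regions: List[Region]  # face region column
    detected_at: datetime       # detection time and date column
    delete_at: datetime         # automatic deletion time and date column


class HashValueTable:
    """Hypothetical in-memory stand-in for the hash value table."""

    def __init__(self, ttl: timedelta = timedelta(minutes=10)) -> None:
        self._records: Dict[str, HashRecord] = {}
        self._ttl = ttl  # retention period; the value is an assumption

    @staticmethod
    def image_hash(image_bytes: bytes) -> str:
        # Image hash value column: hex digest of the raw image bytes.
        return hashlib.sha256(image_bytes).hexdigest()

    def store(self, image_bytes: bytes, face_regions: List[Region],
              now: datetime) -> str:
        """Preliminary process: keep the detected regions under the hash."""
        key = self.image_hash(image_bytes)
        self._records[key] = HashRecord(list(face_regions), now,
                                        now + self._ttl)
        return key

    def query(self, image_bytes: bytes,
              now: datetime) -> Optional[List[Region]]:
        """Registration process: read the regions back by hash value."""
        key = self.image_hash(image_bytes)
        record = self._records.get(key)
        if record is None:
            return None
        if now >= record.delete_at:
            # The record has passed its automatic deletion time.
            del self._records[key]
            return None
        return record.face_regions
```

With a cryptographic hash such as SHA-256, `query` finds a record only when the image bytes are identical to those stored in the preliminary process; an image that differs by even one byte produces a different key.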
Then, in S404, the CPU 201 of the information processing apparatus 100A sends the face regions detected in S403 to the client terminal 110 as a reply.
The client terminal 110 executes the processes of S405 to S408 as in the first embodiment. Specifically, the CPU 211 of the client terminal 110 displays the face regions received as the reply, on the display device 216 of the client terminal 110. The CPU 211 designates a specific face region, and obtains the consent to the registration of the feature information for this face region, from the user.
The CPU 211 of the client terminal 110 transmits the first image and the designation information of the region for which the consent is obtained from the user, to the information processing apparatus 100A as the execution request of the face registration. The first image transmitted in this case is assumed to be an image identical to the second image transmitted in S401. Note that the first image and the second image do not have to be completely identical.
In S409, the CPU 201 of the information processing apparatus 100A receives the first image and the designation information of the region of the face for which the consent is obtained, as the execution request of the face registration. Next, the process proceeds to S903.
In S903, the CPU 201 of the information processing apparatus 100A calculates the hash value of the first image received in S409 by using the same algorithm as that in S901.
In S904, the CPU 201 makes a query about the information on the face regions stored in the temporary storage unit 802 by using the hash value calculated in S903. Specifically, the CPU 201 reads, based on the hash value calculated in S903, the information on the face regions stored in association with this hash value, from the hash value table 1000.
In S905, the CPU 201 determines whether the face regions for which the query is made in S904 include a region matching the region relating to the designation received in S409. In the determination process, the CPU 201 calculates the overlapping degree between the region relating to the designation received in S409 and each of the one or multiple face regions for which the query is made in S904. The CPU 201 determines whether the region matches or not based on the value of the overlapping degree. Specifically, the CPU 201 determines that the second face region whose value of the overlapping degree is the largest and is larger than a predetermined threshold is the region “matching” the region relating to the designation. A second face region whose value of the overlapping degree is not the largest or is not larger than the predetermined threshold is determined to be a region “not matching” the region relating to the designation. Moreover, in the case where there are multiple second face regions whose overlapping degrees are the largest, the second face region whose likelihood is the largest is preferentially assumed to be the “matching” region. The overlapping degree and the likelihood are the same as those defined in the first embodiment.
In the case where the CPU 201 determines in S905 that the second face regions for which the query is made do not include the region matching the region relating to the designation, the present flowchart is terminated, and the feature registration process is terminated. Meanwhile, in the case where the CPU 201 determines in S905 that the second face regions for which the query is made include the region matching the region relating to the designation, the process proceeds to S412.
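The matching determination described above can be sketched as follows. The overlapping degree and the likelihood are defined in the first embodiment and are not reproduced in this passage, so the sketch assumes intersection over union for the overlapping degree and a detector confidence score for the likelihood. The function names, the box layout `(left, top, right, bottom)`, and the threshold value 0.5 are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def overlap_degree(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (assumed overlap measure)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def find_matching_region(designated: Box,
                         candidates: List[Tuple[Box, float]],
                         threshold: float = 0.5) -> Optional[Box]:
    """Return the candidate that matches the designated region, or None.

    candidates holds (box, likelihood) pairs read back by the query.
    The match is the box with the largest overlap above the threshold;
    ties on overlap are broken by the larger likelihood.
    """
    best: Optional[Tuple[float, float, Box]] = None
    for box, likelihood in candidates:
        score = overlap_degree(designated, box)
        if score <= threshold:
            continue  # not larger than the threshold: "not matching"
        if best is None or (score, likelihood) > (best[0], best[1]):
            best = (score, likelihood, box)
    return None if best is None else best[2]
```

A `None` result corresponds to the branch in which the flowchart terminates without registration; any other result is the specific region passed on to feature extraction.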
In S412, the CPU 201 extracts the feature amount of the face while setting the region (specific region) matching the region relating to the designation as the target.
In S413, the CPU 201 stores the feature amount of the face extracted in S412 in the storage device 205 as the feature information. The extraction method of the feature amount of the face in S412 and the method of storing the feature information in S413 are the same as those in the first embodiment. Then, the present flowchart is terminated.
As explained above, the information processing apparatus 100A of the second embodiment can set any image as the target and register the feature information for a specific person captured in the image while performing the detection of face regions once. Moreover, even in the case where multiple faces are included in the image, the feature information of the face can be registered for a specific face for which the user has given the consent.
Note that, although the example in which the SHA-256 algorithm is used for the calculation of the hash values of the images is explained in the second embodiment, the present disclosure is not limited to this. For example, other hashing algorithms such as SHA-512, MD5, perceptual hash, and average hash may be used to implement the calculation.
In the first embodiment, explanation is given of an example in which the detection of the person regions is executed twice in the series of processes from the preliminary process to the registration process. Moreover, in the second embodiment, description is given of an example in which the image handled in the preliminary process and the image handled in the registration process are both hashed by using the same algorithm, and the query is made about the person regions detected in the preliminary process by using the hash value to implement the registration of the feature information with the number of times of detection of the person regions suppressed to one. Next, another example of the processing procedure in which the number of times of detection of the person regions is one is explained as a third embodiment. Note that, as in the first embodiment, an information processing apparatus 100B of the third embodiment presents the faces of the persons that are the registration targets to the user to obtain the consent individually for each person, and then registers the feature information of the person into the face recognition service. Note that, since a system configuration and a hardware configuration of the information processing system in the third embodiment are similar to those in the first embodiment, explanation thereof is omitted, and the third embodiment is explained with the same units denoted by the same reference numerals. Moreover, points different from the first embodiment are mainly explained.
A functional configuration of the information processing apparatus 100B according to the third embodiment is explained. FIG. 11 is a block diagram illustrating the functional configuration of the information processing apparatus 100B according to the third embodiment. Note that units in FIG. 11 that are the same as the units in the functional configuration of the first embodiment are denoted by the same reference numerals as those in the first embodiment. The CPU implements functions of the functional units described below by invoking a program stored in the ROM or the storage device and executing processes according to the program.
The information processing apparatus 100B of the third embodiment includes a preliminary process unit 1100, a reception unit 1103, and a registration unit 1110. The client terminal 110 includes the preliminary transmission unit 311, the consent obtaining unit 312, and a transmission unit 1111.
The preliminary process unit 1100 includes the preliminary reception unit 301, the preliminary detection unit 302, an extraction unit 1101, an encryption unit 1102, and the reply unit 303. The registration unit 1110 includes a decryption unit 1104 and the storage unit 309.
The preliminary process unit 1100 includes the extraction unit 1101 and the encryption unit 1102 unlike in the first embodiment. Specifically, in the third embodiment, the feature information is extracted in the preliminary process. The extraction unit 1101 extracts the feature amount of the face for each of one or multiple face regions (second face regions) detected by the preliminary detection unit 302, from the image (second image) received by the preliminary reception unit 301. The encryption unit 1102 encrypts the feature amounts of the faces extracted by the extraction unit 1101.
The reply unit 303 sends one or multiple second face regions detected by the preliminary detection unit 302 and the feature amounts of the faces corresponding to the second face regions and encrypted by the encryption unit 1102, to the client terminal 110 as a reply.
The client terminal 110 receives information on the second face regions and the encrypted feature amounts of the respective face regions, from the information processing apparatus 100B. The preliminary transmission unit 311 and the consent obtaining unit 312 of the client terminal 110 are the same as those in the first embodiment. In the third embodiment, the transmission unit 1111 transmits an encrypted feature amount of a face to the information processing apparatus 100B as the execution request of the face registration. Note that the transmission unit 1111 transmits the feature amount encrypted and associated with the face region for which the consent is obtained from the user, to the information processing apparatus 100B.
The reception unit 1103 of the information processing apparatus 100B receives the encrypted feature amount of the face, from the client terminal 110 as the execution request of the face registration. The decryption unit 1104 decrypts the encrypted feature amount of the face received by the reception unit 1103. The storage unit 309 stores the feature amount of the face decrypted by the decryption unit 1104, in the storage device 205.
Next, the feature registration process executed by the information processing system 1B in the third embodiment is explained. FIG. 12 is a flowchart illustrating a flow of the feature registration process in the third embodiment. Processes illustrated in S402, S403, S1201 to S1203, S1205 to S1206, and S413 of the present flowchart are described in a program of a web application stored in the ROM 203 or the storage device 205 of the information processing apparatus 100B. The program is invoked by the CPU 201, expanded on the RAM 202, and executed by the CPU 201. Moreover, processes illustrated in S401, S405 to S407, and S1204 of the present flowchart are described in a program of a web application expanded on the RAM 212 of the client terminal 110, and are executed by the CPU 211 of the client terminal 110. In FIG. 12, the same processes as those in the feature registration process of the first embodiment are denoted by the same reference numerals. Processes different from the first embodiment are mainly explained below.
The processes of S401 to S403 are the same as those in the first embodiment. Specifically, the CPU 211 of the client terminal 110 transmits an image (second image) to the information processing apparatus 100B as the execution request of the preliminary process of the face registration. The CPU 201 of the information processing apparatus 100B receives the second image transmitted from the client terminal 110, and detects face regions of persons. Next, the process proceeds to S1201.
In S1201, the CPU 201 of the information processing apparatus 100B extracts the feature amount of the face for each of the one or multiple face regions detected in S403. The extraction method of the feature amount of the face is the same as that in the first embodiment.
In S1202, the CPU 201 encrypts each of the one or multiple feature amounts of the faces extracted in S1201. In the present embodiment, the feature amounts are encrypted by using the AES-256 algorithm.
In S1203, the CPU 201 associates the regions of the faces detected in S403 and the feature amounts of the faces encrypted in S1202 with one another, and sends the regions and the feature amounts to the client terminal 110 as a reply.
The client terminal 110 executes the processes of S405 to S407 as in the first embodiment. Specifically, the CPU 211 of the client terminal 110 displays the face regions received as the reply, on the display device 216 of the client terminal 110. The CPU 211 designates the specific face region, and obtains the consent to the registration of the feature information for this face region, from the user. Next, the process proceeds to S1204.
In S1204, the CPU 211 of the client terminal 110 transmits the feature amount encrypted and associated with the face region for which the consent is obtained from the user, to the information processing apparatus 100B as the execution request of the face registration.
In S1205, the CPU 201 of the information processing apparatus 100B receives the encrypted feature amount of the face, from the client terminal 110 as the execution request of the face registration.
In S1206, the CPU 201 decrypts the encrypted feature amount of the face received in S1205. The decryption is assumed to use the decryption algorithm and the key corresponding to the encryption algorithm and the key used in S1202.
In S413, the CPU 201 stores the feature amount of the face decrypted in S1206 in the storage device 205 as the feature information. Then, the present flowchart is terminated.
As explained above, the information processing apparatus 100B of the third embodiment can set any image as the target and register the feature information for a specific person captured in an image while performing the detection of the face regions once. Moreover, even in the case where multiple faces are included in the image, the feature information can be registered for a specific face for which the user has given the consent.
Moreover, since the information processing apparatus 100B does not have to save the image and the feature information of the face is encrypted and exchanged, it is possible to reduce data holding cost and comply with legal restraints, and the service can be implemented without leakage of information. Furthermore, in the third embodiment, the image transmission from the client terminal 110 to the information processing apparatus 100B only needs to be performed once, and the communication load can be reduced compared with the first and second embodiments.
Note that, although the example in which AES-256 algorithm is used in the encryption of the feature amounts is explained in the present embodiment, the present disclosure is not limited to this example. For example, the encryption may be implemented by using other encryption algorithms such as RSA and ECC.
Although the preferable embodiments of the present disclosure are explained above with reference to the attached drawings, the present disclosure is not limited to the above-mentioned examples. For example, although the feature registration process is executed with the information processing apparatus functioning as the server and exchanging data with the client terminal in the above-mentioned embodiments, the present disclosure is not limited to this. An information processing apparatus having a server function and a client function may execute the feature registration process of each of the embodiments described above. Moreover, although the registration of the feature information is performed with the face of the person being the target, the feature information is not limited to the feature amount of the face, and a feature amount of a part of the body other than the face or other portions may be extracted and registered. In addition, it is apparent that those skilled in the art can come up with various change examples and modified examples within the scope of the disclosed technical idea, and these examples are understood to also belong to the technical scope of the present disclosure as a matter of course.
The information processing apparatus of the present disclosure can provide a service complying with AI ethics and legal restraints in a service in which feature information of a person captured in an image is registered.
Embodiment(s) of the present disclosure can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a ‘non-transitory computer-readable storage medium’) to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
While the present disclosure has been described with reference to embodiments, it is to be understood that the present disclosure is not limited to the disclosed embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2024-139059, filed Aug. 20, 2024, which is hereby incorporated by reference herein in its entirety.
July 21, 2025
February 26, 2026