A system for sensitive data protection in a video call/conference environment. In response to initiating a video call/conference and placing a video call participant on mute, the system uses Artificial Intelligence (AI), specifically computer vision, to monitor for mouth movements by the muted video call participant that indicate speech. In response to the monitoring detecting mouth movements that indicate speech, one or more actions are performed that prevent other video call/conference participants from viewing the mouth movement indicating speech by the muted video call participant. The actions may include stopping/pausing the video feed or the capturing of video, obfuscating the region in the video feed that includes the muted video call participant's mouth, or using AI to replace the mouth movements with images of the video call participant's mouth being stationary. The actions may be performed once Non-Public Information (NPI) is identified in the speech.
Legal claims defining the scope of protection, as filed with the USPTO.
A system for sensitive data leakage prevention, the system comprising: a computing platform including: a memory; at least one computing processor device in communication with the memory; an image-capturing device in communication with one or more of the at least one computing processor device; and a video call application in communication with the image-capturing device and including Artificial Intelligence (AI) comprising computer vision, the video call application is stored in the memory, executable by one or more of the at least one computing processor device and configured to: initiate a video call amongst a plurality of call participants, receive, during the video call, an input that is configured to place a first call participant from amongst the plurality of call participants on mute, in response to placing the first call participant on mute, implement the AI comprising the computer vision to monitor for mouth movement that indicates speech by the first call participant, and in response to the monitoring detecting mouth movement indicating speech by the first call participant, perform one or more actions that prevent other call participants from amongst the plurality of call participants from viewing the mouth movement indicating speech by the first call participant.
The system of claim 1, wherein the video call application is further configured to perform the one or more actions that prevent the other call participants from viewing the mouth movement indicating speech by the first call participant, wherein the one or more actions includes pausing or stopping at least one of (i) capture of video by the image-capturing device or (ii) transmission of a video feed of the first call participant to the other call participants.
The system of claim 1, wherein the video call application is further configured to perform the one or more actions that prevent the other call participants from viewing the mouth movement indicating speech by the first call participant, wherein the one or more actions includes continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating the region within the video feed.
The system of claim 1, wherein the video call application is further configured to perform the one or more actions that prevent the other call participants from viewing the mouth movement indicating speech by the first call participant, wherein the one or more actions includes generating or retrieving one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant.
The system of claim 1, wherein the video call application further comprises Natural Language Processing (NLP) and wherein the video call application is further configured to implement the AI comprising the computer vision and the NLP to determine that the speech by the first call participant includes Non-Public Information (NPI).
The system of claim 5, wherein the video call application is further configured to perform the one or more actions in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) determination that the speech includes NPI.
The system of claim 1, wherein the video call application is further configured to, in response to the monitoring detecting mouth movement indicating speech by the first call participant, receive a second input from the first call participant that indicates that the first call participant desires to remain on mute.
The system of claim 7, wherein the video call application is further configured to perform the one or more actions in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) receiving the second input from the first call participant that indicates that the first call participant desires to remain on mute.
A computer-implemented method for sensitive data leakage prevention, the computer-implemented method executed by one or more computing processor devices and comprising: initiating a video call amongst a plurality of call participants; receiving, during the video call, an input that is configured to place a first call participant from amongst the plurality of call participants on mute; in response to placing the first call participant on mute, implementing Artificial Intelligence (AI) comprising computer vision to monitor for mouth movement that indicates speech by the first call participant; and in response to the monitoring detecting mouth movement indicating speech by the first call participant, performing one or more actions that prevent other call participants from amongst the plurality of call participants from viewing the mouth movement indicating speech by the first call participant.
The computer-implemented method of claim 9, wherein performing the one or more actions further defines the one or more actions as pausing or stopping at least one of (i) capture of video or (ii) transmission of a video feed of the first call participant to the other call participants.
The computer-implemented method of claim 9, wherein performing the one or more actions further defines the one or more actions as continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating the region within the video feed.
The computer-implemented method of claim 9, wherein the performing the one or more actions further defines the one or more actions as generating or retrieving one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant.
The computer-implemented method of claim 9, further comprising implementing the AI comprising the computer vision and Natural Language Processing (NLP) to determine that the speech by the first call participant includes Non-Public Information (NPI) and wherein performing the one or more actions further comprises in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) determination that the speech includes NPI, performing the one or more actions.
The computer-implemented method of claim 9, further comprising, in response to the monitoring detecting mouth movement indicating speech by the first call participant, receiving a second input from the first call participant that indicates that the first call participant desires to remain on mute and wherein performing the one or more actions further comprises in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) receiving the second input from the first call participant that indicates that the first call participant desires to remain on mute, performing the one or more actions.
A computer program product including a non-transitory computer-readable medium, the non-transitory computer-readable medium comprising sets of codes for causing one or more computing devices to: initiate a video call amongst a plurality of call participants, receive, during the video call, an input that is configured to place a first call participant from amongst the plurality of call participants on mute, in response to placing the first call participant on mute, implement Artificial Intelligence (AI) comprising computer vision to monitor for mouth movement that indicates speech by the first call participant, and in response to the monitoring detecting mouth movement indicating speech by the first call participant, perform one or more actions that prevent other call participants from amongst the plurality of call participants from viewing the mouth movement indicating speech by the first call participant.
The computer program product of claim 15, wherein the set of codes for causing the one or more computing devices to perform the one or more actions further defines the one or more actions as pausing or stopping at least one of (i) capture of video or (ii) transmission of a video feed of the first call participant to the other call participants.
The computer program product of claim 15, wherein the set of codes for causing the one or more computing devices to perform the one or more actions further defines the one or more actions as continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating the region within the video feed.
The computer program product of claim 15, wherein the set of codes for causing the one or more computing devices to perform the one or more actions further defines the one or more actions as generating or retrieving one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant.
The computer program product of claim 15, wherein the sets of codes further comprise a set of codes for causing the one or more computing devices to implement the AI comprising the computer vision and Natural Language Processing (NLP) to determine that the speech by the first call participant includes Non-Public Information (NPI) and wherein the set of codes for causing the one or more computing devices to perform the one or more actions further comprises the set of codes for causing the one or more computing devices to, in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) determination that the speech includes NPI, perform the one or more actions.
The computer program product of claim 15, wherein the sets of codes further comprise a set of codes for causing the one or more computing devices to, in response to the monitoring detecting mouth movement indicating speech by the first call participant, receive a second input from the first call participant that indicates that the first call participant desires to remain on mute and wherein the set of codes for causing the one or more computing devices to perform the one or more actions further comprises the set of codes for causing the one or more computing devices to, in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) receiving the second input from the first call participant that indicates that the first call participant desires to remain on mute, perform the one or more actions.
Complete technical specification and implementation details from the patent document.
The present invention is generally directed to data security and, more specifically, to preventing leakage of sensitive data, such as non-public information (NPI) or the like, while a video call/conference participant is placed on mute and is speaking to someone external to the video call/conference.
Video calls and video conferences have become a preferred means for communication between two or more call participants. In the instance of large video conferences with hundreds of call participants, many of the call participants may be unknown to one or more of the other call participants. It is also not outside the realm of possibility that large video conferences may include uninvited call participants, so-called “intruders” or “gate crashers,” who may attend the video conferences with bad intentions, such as acquiring non-public information (NPI) from the call participants or the like.
While call participants may intentionally or unintentionally disclose NPI during a video conference when they are the designated speaker, there are other means by which NPI is disclosed or otherwise becomes available during a video conference. For example, a call participant may desire to place themselves on mute (i.e., temporarily disable the microphone, so that any sound/speech coming from the call participant or the call participant's proximate environment is not transmitted to other call participants). In some instances, a call participant may desire to place themselves on mute in order to conduct a conversation external to the video conference (e.g., participate in a secondary voice call, speak with someone who has entered the room or the like). In many instances, the external conversation being conducted by the muted call participant may include information, such as NPI or the like, that the call participant desires to keep private.
In most instances in which the call participant is on mute and conducting a conversation external to the video conference, the call participant will still be within view of the camera/video capturing device and, as such, other video call participants can still view the muted call participant. This opens the possibility that other video call participants who have speechreading (e.g., lip reading, gesture interpretation and the like) capabilities may be able to discern what the muted call participant is saying as part of the offline/external conversation. This issue becomes problematic if uninvited call participants attending with hopes of acquiring NPI also possess speechreading capabilities. Even if the call participants do not possess actual speechreading capabilities, they may have access to or otherwise implement lip-reading Artificial Intelligence (AI) or another automatic lip-reading system to discern the inaudible information being conveyed by the muted call participant during the offline/external conversation.
Therefore, a need exists to develop systems, computer-implemented methods, computer program products or the like that serve to effectively and efficiently protect information, such as NPI, during a video call/conference. Specifically, the desired systems and the like should serve to prevent a muted video call/conference participant from conveying information, such as NPI or the like, while on mute (i.e., while conducting an offline/external conversation). In this regard, the muted video call/conference participant should be able to conduct offline/external conversations without the risk that the information in the offline/external conversation is being discerned by other video call participants.
The following presents a simplified summary of one or more embodiments of the invention in order to provide a basic understanding of such embodiments. This summary is not an extensive overview of all contemplated embodiments and is intended to neither identify key or critical elements of all embodiments, nor delineate the scope of any or all embodiments. Its sole purpose is to present some concepts of one or more embodiments in a simplified form as a prelude to the more detailed description that is presented later.
Embodiments of the present invention address the above needs and/or achieve other advantages by providing for protection of sensitive data, such as Non-Public Information (NPI), within a video call/conference environment when a call participant is on mute (i.e., when the call participant's microphone is temporarily disabled). The invention recognizes when a call participant is placed on mute and, in response, implements Artificial Intelligence (AI), specifically computer vision, to monitor for mouth movements by the muted call participant that indicate speech. In this regard, the muted call participant may be conducting an offline/external conversation, such as an offline/external voice call or a conversation with someone in the room (i.e., the call participant's video call environment). In response to the AI/computer vision detecting mouth movements by the muted call participant that indicate speech, the invention performs one or more actions that prevent other video call participants from viewing the mouth movement indicating speech by the muted call participant.
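For illustration only, the mute-triggered monitoring described above might be sketched as follows. The sketch assumes that some facial-landmark model (not specified in this disclosure) already supplies a per-frame mouth-opening ratio; the class name `MouthMovementMonitor`, the window size, and the threshold are hypothetical choices, and a simple min-max spread test stands in for whatever computer vision technique an actual embodiment would use.

```python
from collections import deque
from typing import Deque


class MouthMovementMonitor:
    """Flags likely speech from a stream of per-frame mouth-opening ratios.

    The ratio (for example, lip gap divided by mouth width) is assumed to be
    produced by a separate landmark model that is outside this sketch.
    """

    def __init__(self, window: int = 10, threshold: float = 0.05) -> None:
        self.window = window        # number of recent frames considered
        self.threshold = threshold  # hypothetical spread that counts as speech
        self._ratios: Deque[float] = deque(maxlen=window)
        self.muted = False

    def set_muted(self, muted: bool) -> None:
        # Monitoring only runs while muted; history restarts on every change.
        self.muted = muted
        self._ratios.clear()

    def observe(self, mouth_open_ratio: float) -> bool:
        """Record one frame; return True when movement indicates speech."""
        if not self.muted:
            return False
        self._ratios.append(mouth_open_ratio)
        if len(self._ratios) < self.window:
            return False  # not enough frames yet to judge
        return max(self._ratios) - min(self._ratios) >= self.threshold
```

A caller would invoke `set_muted(True)` when the mute input is received and feed `observe()` once per captured frame; a `True` result is the point at which one of the protective actions would be triggered.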
In specific embodiments of the invention, the one or more actions may include, but are not limited to, (1) pausing or stopping at least one of (i) capture of video by the image-capturing device or (ii) transmission of a video feed of the first call participant (captured by the image-capturing device) to the other call participants, (2) continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating the region within the video feed (e.g., blurring/pixelating the first call participant's mouth or the like), and (3) generating, using AI or the like, or retrieving one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant (i.e., not letting other video call participants know that the muted call participant is conducting an offline/external conversation).
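As a non-limiting illustration of the obfuscation action (2), the following sketch pixelates a rectangular mouth region in a single frame. It assumes a grayscale frame held as rows of integer pixel values and a bounding box that lies within the frame and is supplied by a separate mouth-tracking step; the names `pixelate_region`, `Frame`, and `Box` and the block size are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale frame: rows of 0-255 pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def pixelate_region(frame: Frame, box: Box, block: int = 4) -> Frame:
    """Return a copy of the frame with the boxed region replaced by block means."""
    top, left, bottom, right = box
    out = [row[:] for row in frame]  # leave the input frame untouched
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            # Integer mean of the original pixels inside this block.
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

Because the region is identified continually, a caller would recompute the box for every frame and apply the function before the frame is transmitted.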
In other specific embodiments of the invention, Natural Language Processing (NLP) is implemented to determine that the speech by the muted call participant includes Non-Public Information (i.e., “read” the speech and identify NPI in the speech). In such embodiments of the invention, the determination that the speech includes NPI is a prerequisite to performing at least one of the one or more actions that prevent other video call participants from viewing the mouth movement indicating speech by the first call participant.
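By way of example only, the NPI prerequisite might be approximated as shown below. The sketch assumes that a lip-reading step has already produced a text transcript, and it substitutes a few hypothetical regular-expression patterns for the NLP described above; the pattern set, the names `NPI_PATTERNS`, `find_npi`, and `contains_npi`, and the choice of keywords are all illustrative assumptions rather than the disclosed technique.

```python
import re
from typing import List

# Hypothetical screening patterns; a real embodiment would rely on NLP.
NPI_PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "long_number": re.compile(r"\b\d{8,}\b"),
    "keyword": re.compile(
        r"\b(account number|password|routing number)\b", re.IGNORECASE
    ),
}


def find_npi(transcript: str) -> List[str]:
    """Return the sorted names of all patterns that match the transcript."""
    return sorted(
        name for name, pattern in NPI_PATTERNS.items() if pattern.search(transcript)
    )


def contains_npi(transcript: str) -> bool:
    """True when at least one screening pattern matches."""
    return bool(find_npi(transcript))
```

In such a sketch, a protective action would be performed only when both the mouth-movement detection and `contains_npi()` return true.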
In still other specific embodiments of the invention, in response to the AI/computer vision detecting mouth movements by the muted call participant that indicate speech, the muted call participant is reminded that they are on mute and given the opportunity to unmute themselves prior to the one or more actions being performed. In other words, the one or more actions are performed only after the muted call participant provides an input that indicates their desire to remain on mute.
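The reminder-and-confirm behavior described above can be expressed, purely as a sketch, by a small decision function. The enumeration `Decision`, the function `decide`, and the three-valued `stay_muted` argument are hypothetical names introduced for illustration; how the prompt is presented to the participant is left open.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    NO_ACTION = "no_action"      # nothing to do
    PROMPT_USER = "prompt_user"  # remind the participant they are muted
    UNMUTE = "unmute"            # participant chose to rejoin the call audio
    PROTECT = "protect"          # perform the protective action(s)


def decide(muted: bool, speech_detected: bool, stay_muted: Optional[bool]) -> Decision:
    """Choose the next step.

    stay_muted is None until the participant answers the reminder, then
    True (remain on mute) or False (unmute).
    """
    if not muted or not speech_detected:
        return Decision.NO_ACTION
    if stay_muted is None:
        return Decision.PROMPT_USER
    return Decision.PROTECT if stay_muted else Decision.UNMUTE
```

The protective actions are reached only through the `PROTECT` branch, which mirrors the requirement that the participant first indicate a desire to remain on mute.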
A system for sensitive data leakage prevention defines first embodiments of the invention. The system includes a computing platform having a memory, at least one computing processor device in communication with the memory and an image-capturing device in communication with one or more of the at least one computing processor device. The computing platform additionally includes a video call/conference application that is in communication with the image-capturing device and includes Artificial Intelligence (AI) having computer vision capabilities. The video call/conference application is stored in the memory and executable by one or more of the at least one computing processor device. The video call/conference application is configured to initiate a video call/conference amongst a plurality of call participants and receive, during the video call, an input that requests for a first call participant from amongst the plurality of call participants to be placed on mute (i.e., for the first call participant's microphone to be temporarily disabled). In response to placing the first call participant on mute, the video call/conference application is further configured to implement the AI, specifically the computer vision capabilities, to monitor for mouth movement that indicates speech by the first call participant, and, in response to the monitoring detecting mouth movement indicating speech by the first call participant, perform one or more actions that prevent other call participants from amongst the plurality of call participants from viewing the mouth movement indicating speech by the first call participant.
In specific embodiments of the system, the video call/conference application is further configured to perform the one or more actions which includes, at least, pausing or stopping at least one of (i) capture of video by the image-capturing device or (ii) transmission of a video feed of the first call participant (captured by the image-capturing device) to the other call participants. In other related embodiments of the system, the video call/conference application is further configured to perform the one or more actions which includes, at least, continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating the region within the video feed (e.g., blurring/pixelating the first call participant's mouth or the like). In still further related embodiments of the system, the video call/conference application is further configured to perform the one or more actions which includes, at least, generating, using AI or the like, or retrieving one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant (i.e., not letting other video call participants know that the muted call participant is conducting an offline/external conversation).
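As one hedged illustration of the superimposing action, the sketch below pastes a stored image of a closed mouth over the mouth region of a frame. It assumes a grayscale frame represented as rows of integers and a previously generated or retrieved patch; the function name `superimpose_patch` and the clipping behavior at the frame edges are assumptions made for this example.

```python
from typing import List

Frame = List[List[int]]  # grayscale image: rows of 0-255 pixel values


def superimpose_patch(frame: Frame, patch: Frame, top: int, left: int) -> Frame:
    """Return a copy of frame with patch pasted so its corner sits at (top, left).

    Patch pixels that would fall outside the frame are skipped.
    """
    out = [row[:] for row in frame]  # leave the input frame untouched
    for dy, patch_row in enumerate(patch):
        y = top + dy
        if not 0 <= y < len(out):
            continue
        for dx, value in enumerate(patch_row):
            x = left + dx
            if 0 <= x < len(out[y]):
                out[y][x] = value
    return out
```

Unlike blurring or pixelating, this approach leaves no visible sign in the transmitted feed that the mouth region was altered, provided the patch is aligned to the tracked mouth position in each frame.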
In further embodiments of the system, the video call application further comprises Natural Language Processing (NLP), which is implemented to determine that the speech by the first call participant includes Non-Public Information (i.e., “read” the speech coming from mouth movements and identify NPI in the speech). In such embodiments of the system, the video call application is further configured to perform at least one of the one or more actions in response to both (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) determination that the speech includes NPI.
In further embodiments of the system, the video call application is further configured to, in response to the monitoring detecting mouth movement indicating speech by the first call participant, receive a second input from the first call participant that indicates that the first call participant desires to remain on mute. For example, the video call application may be configured to present the first call participant a pop-up window or the like that asks the first call participant if they desire to remain on mute (i.e., the first call participant may be unaware or may have forgotten that they are on mute and the speech is intended for inclusion in the video call/conference). In such embodiments of the system, the video call application is further configured to perform at least one of the one or more actions in response to both (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) receiving the second input from the first call participant that indicates that the first call participant desires to remain on mute.
A computer-implemented method for sensitive data leakage prevention defines second embodiments of the invention. The computer-implemented method is executed by one or more computing processor devices. The method includes initiating a video call amongst a plurality of call participants, and receiving, during the video call, an input that requests for a first call participant from amongst the plurality of call participants to be placed on mute. In response to placing the first call participant on mute, the computer-implemented method further includes implementing Artificial Intelligence (AI), specifically computer vision capabilities, to monitor for mouth movement that indicates speech by the first call participant. In response to the monitoring detecting mouth movement indicating speech by the first call participant, the computer-implemented method includes performing one or more actions that prevent other call participants from amongst the plurality of call participants from viewing the mouth movement indicating speech by the first call participant.
In specific embodiments of the computer-implemented method, performing the one or more actions further defines the one or more actions as pausing or stopping at least one of (i) capture of video or (ii) transmission of a video feed of the first call participant to the other call participants. In other related embodiments of the computer-implemented method, performing the one or more actions further defines the one or more actions as continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating (e.g., blurring, pixelating or the like) the region within the video feed. In still further embodiments of the computer-implemented method, performing the one or more actions further defines the one or more actions as generating, using AI or the like, or retrieving from memory one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant.
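The pausing of transmission might, as a minimal sketch, be realized by a gate placed between frame capture and the outgoing video feed. The class `FeedGate`, its `send` callback, and the `dropped` counter are hypothetical; an actual embodiment could equally stop the image-capturing device itself.

```python
from typing import Callable


class FeedGate:
    """Forwards frames to a send callback unless transmission is paused."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send    # assumed to deliver a frame to other participants
        self.paused = False
        self.dropped = 0     # frames withheld while paused

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def submit(self, frame: bytes) -> bool:
        """Send the frame if allowed; return True when it was transmitted."""
        if self.paused:
            self.dropped += 1
            return False
        self._send(frame)
        return True
```

In such a sketch, `pause()` would be called when mouth movement indicating speech is detected and `resume()` once the movement stops or the participant unmutes.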
In specific embodiments, the computer-implemented method further includes implementing the AI, specifically the computer vision, and Natural Language Processing (NLP) to determine that the speech by the first call participant includes Non-Public Information (NPI). In such embodiments of the computer-implemented method, performing the one or more actions further includes, in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) determination that the speech includes NPI, performing the one or more actions.
In further specific embodiments the computer-implemented method includes, in response to the monitoring detecting mouth movement indicating speech by the first call participant, receiving a second input from the first call participant that indicates that the first call participant desires to remain on mute. In such embodiments of the computer-implemented method, performing the one or more actions further includes in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) receiving the second input from the first call participant that indicates that the first call participant desires to remain on mute, performing the one or more actions.
A computer program product including a non-transitory computer-readable medium defines third embodiments of the invention. The non-transitory computer-readable medium includes sets of codes for causing one or more computing devices to initiate a video call amongst a plurality of call participants, and receive, during the video call, an input that requests for a first call participant from amongst the plurality of call participants to be placed on mute. In response to placing the first call participant on mute, the sets of codes further cause the one or more computing devices to implement Artificial Intelligence (AI) comprising computer vision to monitor for mouth movement that indicates speech by the first call participant. In response to the monitoring detecting mouth movement indicating speech by the first call participant, the sets of codes further cause the one or more computing devices to perform one or more actions that prevent other call participants from amongst the plurality of call participants from viewing the mouth movement indicating speech by the first call participant.
In specific embodiments of the computer program product, the set of codes for causing the one or more computing devices to perform the one or more actions further defines the one or more actions as pausing or stopping at least one of (i) capture of video or (ii) transmission of a video feed of the first call participant to the other call participants. In related embodiments of the computer program product, the set of codes for causing the one or more computing devices to perform the one or more actions further defines the one or more actions as continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating the region within the video feed. In still other related embodiments of the computer program product, the set of codes for causing the one or more computing devices to perform the one or more actions further defines the one or more actions as generating, via AI or the like, or retrieving from memory one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant.
In other specific embodiments of the computer program product, the sets of codes further comprise a set of codes for causing the one or more computing devices to implement the AI comprising the computer vision and Natural Language Processing (NLP) to determine that the speech by the first call participant includes Non-Public Information (NPI). In such embodiments of the computer program product, the set of codes for causing the one or more computing devices to perform the one or more actions further comprises the set of codes for causing the one or more computing devices to, in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) determination that the speech includes NPI, perform the one or more actions.
In other specific embodiments of the computer program product, the sets of codes further comprise a set of codes for causing the one or more computing devices to, in response to the monitoring detecting mouth movement indicating speech by the first call participant, receive a second input from the first call participant that indicates that the first call participant desires to remain on mute. In such embodiments of the computer program product, the set of codes for causing the one or more computing devices to perform the one or more actions further comprises the set of codes for causing the one or more computing devices to, in response to (i) the monitoring detecting mouth movement indicating speech by the first call participant and (ii) receiving the second input from the first call participant that indicates that the first call participant desires to remain on mute, perform the one or more actions.
Thus, as described in detail above, present embodiments of the invention include systems, methods, computer program products and/or the like that protect sensitive data, such as Non-Public Information (NPI) or the like, in a video call/conference environment. Specifically, in response to initiating a video call/conference and placing a video call participant on mute, the system uses Artificial Intelligence (AI), specifically computer vision, to monitor for mouth movements by the muted video call participant that indicate speech. In response to the monitoring detecting mouth movements that indicate speech, one or more actions are performed that prevent other video call/conference participants from viewing the mouth movement indicating speech by the muted video call participant. The actions may include, but are not limited to, stopping/pausing the video feed or the capturing of video, obfuscating the region in the video feed that includes the muted video call participant's mouth, or using AI to replace the mouth movements with images of the video call participant's mouth being stationary. In specific embodiments, one or more of the actions may be performed once Non-Public Information (NPI) is identified in the speech and/or after the muted video call participant has indicated a desire to remain on mute.
The features, functions, and advantages that have been discussed may be achieved independently in various embodiments of the present invention or may be combined with yet other embodiments, further details of which can be seen with reference to the following description and drawings.
Embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. Indeed, the invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.
As will be appreciated by one of skill in the art in view of this disclosure, the present invention may be embodied as a system, a method, a computer program product, or a combination of the foregoing. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may be referred to herein as a “system.” Furthermore, embodiments of the present invention may take the form of a computer program product comprising a computer-usable storage medium having computer-usable program code/computer-readable instructions embodied in the medium.
Any suitable computer-usable or computer-readable medium may be utilized. The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (e.g., a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection having one or more wires; a tangible medium such as a portable computer diskette, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a compact disc read-only memory (CD-ROM), or other tangible optical or magnetic storage device.
Computer program code/computer-readable instructions for conducting operations of embodiments of the present invention may be written in an object oriented, scripted, or unscripted programming language such as JAVA, PERL, SMALLTALK, C++, PYTHON, or the like. However, the computer program code/computer-readable instructions for conducting operations of the invention may also be written in conventional procedural programming languages, such as the “C” programming language or similar programming languages.
Embodiments of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods or systems. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a particular machine, such that the instructions, which execute by the processor of the computer or other programmable data processing apparatus, create mechanisms for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instructions, which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational events to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions, which execute on the computer or other programmable apparatus, provide events for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. Alternatively, computer program implemented events or acts may be combined with operator or human implemented events or acts in order to conduct an embodiment of the invention.
As the phrase is used herein, a processor may be “configured to” perform or “configured for” performing a certain function in a variety of ways, including, for example, by having one or more general-purpose circuits perform the function by executing particular computer-executable program code embodied in computer-readable medium, and/or by having one or more application-specific circuits perform the function.
“Computing platform” or “computing device” as used herein refers to a networked computing device within the computing system. The computing platform includes a processor, a non-transitory storage medium (i.e., memory), a communications device, and a display. The computing platform may be configured to support user logins and inputs from any combination of similar or disparate devices. Accordingly, the computing platform includes servers, personal desktop computers, laptop computers, mobile computing devices and the like.
Thus, systems, apparatus, and methods are described in detail below that provide for protection of sensitive data, such as Non-Public Information (NPI) within a video call/conference environment when a call participant is on mute (i.e., when the call participant's microphone is temporarily disabled). The invention recognizes when a call participant is placed on mute and, in response, implements Artificial Intelligence (AI) specifically computer vision to monitor for mouth movements by the muted call participant that indicate speech. In this regard, the muted call participant may be conducting an offline/external conversation, such as offline/external voice call or a conversation with someone in the room (i.e., the call participant's video call environment). In response to the AI/computer vision detecting mouth movements by the muted call participant that indicate speech, the invention performs one or more actions that prevent other video call participants from viewing the mouth movement indicating speech by the first call participant.
In specific embodiments of the invention, the one or more actions may include, but are not limited to, (1) pausing or stopping at least one of (i) capture of video by the image-capturing device or (ii) transmission of a video feed of the first call participant (captured by the image-capturing device) to the other call participants, (2) continually identifying a region within a video feed that includes a mouth of the first call participant and obfuscating the region within the video feed (e.g., blurring/pixelating the first call participant's mouth or the like), and (3) generating, using AI or the like, or retrieving one or more images that depict a mouth of the first call participant in a stationary position and superimposing the one or more images over the mouth movement in a video feed of the first call participant (i.e., not letting other video call participants know that the muted call participant is conducting an offline/external conversation).
In other specific embodiments of the invention, Natural Language Processing (NLP) is implemented to determine that the speech by the muted call participant includes Non-Public Information (i.e., “read” the speech and identify NPI in the speech). In such embodiments of the invention, the determination that the speech includes NPI is a prerequisite to performing at least one of the one or more actions that prevent other video call participants from viewing the mouth movement indicating speech by the first call participant.
In still other specific embodiments of the invention, in response to the AI/computer vision detecting mouth movements by the muted call participant that indicate speech, the muted call participant is reminded that they are on mute and given the opportunity to unmute themselves prior to the one or more actions being performed. In other words, the one or more actions are performed only after the muted call participant provides an input that indicates their desire to remain on mute.
Referring to FIG. 1, a schematic/block diagram is presented of a system 100-1 for prevention of sensitive data leakage, such as non-public information (NPI), during a video call/conference when a call participant is placed on mute, in accordance with embodiments of the present invention. Sensitive data, as used herein, may include, but is not limited to, private data and/or confidential data, including personal data such as name, address, biometric data, legal data, employment data, intellectual property data and the like. Non-Public Information is a special type of sensitive data protected under various laws, regulations and/or industry standards, such as, but not limited to, financial information (e.g., account numbers, transaction records, personal identification numbers and the like), health data and the like.
The system 100-1 is implemented amongst a distributed communication network 110, which may include the Internet, one or more intranets, cellular network(s) or the like. The system includes computing platforms 200-1 through 200-5, each associated with a corresponding video call participant 120-1 through 120-5. The computing platform 200 includes memory 202 and one or more computing processor devices 204 in communication with memory 202. Memory 202 stores video call/conference application 210, which is executable by at least one of the computing processor device(s) 204. Video call/conference application 210 may be a commercial application, such as WEBEX, ZOOM, Microsoft TEAMS or the like, or may be a non-commercial custom application.
Video call/conference application 210 includes artificial intelligence 220, specifically computer vision 220-1, which is used to replicate the complexity of human vision and the human mind's ability to recognize objects, analyze scenes and understand visual cues. Video call/conference application 210 is configured to initiate a video call/conference 230 amongst a plurality of video call participants 120-1 through 120-5. In response to initiating the video call/conference 230, video call/conference application 210 is configured to receive an input 240 that is configured to place a first video call participant 120-1 on mute 242 (i.e., temporarily disable the sound-capturing device 208 (e.g., microphone)).
In response to placing the first video call participant 120-1 on mute 242, the video call/conference application is further configured to implement the AI 220, specifically the computer vision 220-1, to monitor 250 for mouth movement 252 by the first video call participant 120-1 that indicates speech 254. In this regard, the invention realizes that certain mouth movements will not indicate speech or otherwise rise to the level of speech and, thus, is able to discern between mouth movements 252 that indicate speech 254 and mouth movements 252 that do not indicate speech 254. The speech 254 may be the reason why the first video call participant 120-1 requested to be placed on mute 242. For example, the first video call participant 120-1 may need to conduct an offline conversation with a family member or conduct a secondary voice call with someone external from the video call/conference 230.
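The distinction between mouth movement that indicates speech and mouth movement that does not can be illustrated with a minimal sketch. The disclosure does not specify how the computer vision makes this distinction; the heuristic below (counting open/closed transitions of a normalized lip gap over a short window of frames) is only an assumption for illustration, and the names `mouth_open_ratio`, `indicates_speech`, `open_threshold` and `min_transitions` are hypothetical. Lip landmark coordinates are assumed to come from some upstream face-landmark detector that is not shown.

```python
from typing import Sequence


def mouth_open_ratio(upper_lip_y: float, lower_lip_y: float, mouth_width: float) -> float:
    """Vertical lip gap normalized by mouth width (a hypothetical per-frame feature)."""
    if mouth_width <= 0:
        raise ValueError("mouth_width must be positive")
    return abs(lower_lip_y - upper_lip_y) / mouth_width


def indicates_speech(
    ratios: Sequence[float],
    open_threshold: float = 0.15,  # assumed value, not taken from the disclosure
    min_transitions: int = 4,      # assumed value, not taken from the disclosure
) -> bool:
    """Return True when the mouth switches between open and closed often enough
    within the window to suggest speech rather than a yawn, a smile or chewing."""
    states = [r > open_threshold for r in ratios]
    transitions = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
    return transitions >= min_transitions


# A single slow open/close cycle (e.g., a yawn) versus rapid alternation (e.g., talking).
yawn = [0.05, 0.10, 0.40, 0.50, 0.50, 0.40, 0.10, 0.05]
talking = [0.05, 0.30, 0.08, 0.25, 0.05, 0.30, 0.10, 0.28]
print(indicates_speech(yawn), indicates_speech(talking))
```

A production system would more likely rely on a trained model than on fixed thresholds; the sketch only shows where such a classifier sits between per-frame features and the monitoring decision.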
In response to the monitoring 250 detecting mouth movement 252 indicating speech 254 by the first video call participant 120-1, video call/conference application 210 is configured to perform one or more actions 260 that prevent viewing 262 of the mouth movements 252 by the other video call participants (120-2 through 120-5). In this regard, the present invention takes into account that the other video call participants (120-2 through 120-5) may be capable of speechreading (e.g., lip reading) or employ automated means, such as Artificial Intelligence, for speechreading, and, thus, discern what the first video call participant 120-1 is saying. Examples of the action(s) 260 that may be taken are discussed in more detail infra, in relation to FIG. 3.
Referring to FIG. 2, a schematic/block diagram is presented of a system 100-2 for prevention of sensitive data leakage in a video call/conference environment, in accordance with embodiments of the present invention. Sensitive data, as used herein, may include, but is not limited to, private data and/or confidential data, including personal data such as name, address, biometric data, legal data, employment data, intellectual property data and the like. Non-Public Information is a special type of sensitive data protected under various laws, regulations and/or industry standards, such as, but not limited to, financial information (e.g., account numbers, transaction records, personal identification numbers and the like), health data and the like.
The system 100-2 is implemented amongst a distributed communication network 110, which may include the Internet, one or more intranets, cellular network(s) or the like. The system includes computing platforms 200-1 through 200-5, each associated with a corresponding video call participant 120-1 through 120-5. The computing platform 200 includes memory 202 and one or more computing processor devices 204 in communication with memory 202. Memory 202 stores video call/conference application 210, which is executable by at least one of the computing processor device(s) 204. Video call/conference application 210 may be a commercial application, such as FACETIME, WEBEX, ZOOM, TEAMS or the like, or may be a non-commercial custom application. Video call/conference application 210 includes artificial intelligence 220.
Video call/conference application 210 is configured to initiate a video call/conference 230 amongst a plurality of video call participants 120-1 through 120-5. In response to initiating the video call/conference 230, video call/conference application 210 is configured to implement the AI 220 to detect 300 sensitive data 310 in at least one of a video feed 320 or an audio feed 330 being transmitted by a first call participant 120-1 from amongst the plurality of call participants 120.
In response to detecting 300 the sensitive data 310 in at least one of the video feed 320 or the audio feed 330 of the first call participant 120-1, the video call/conference application 210 is configured to perform one or more actions 340 that prevent viewing or hearing 342 of the sensitive data 310 by one or more other video call participants 120 (e.g., 120-2 through 120-5). Examples of the action(s) 260 that may be taken are discussed in more detail infra, in relation to FIG. 4.
Referring to FIG. 3, a block diagram is depicted of computing platform 200 highlighting various alternate embodiments of the system shown and described in relation to FIG. 1, in accordance with embodiments of the present invention. Computing platform 200 may comprise one or multiple computing devices, such as personal computers (PCs), laptops, mobile communication devices (e.g., smart phones), tablet devices or the like. As previously discussed in relation to FIG. 1, computing platform 200 includes memory 202, which may comprise volatile and/or non-volatile memory, such as read-only memory (ROM) and/or random-access memory (RAM), EPROM, EEPROM, flash cards, or any memory common to computing platforms. Moreover, memory 202 may comprise cloud storage, such as provided by a cloud storage service and/or a cloud connection service.
Further, computing platform 200 includes one or more computing processor devices 204, which may be an application-specific integrated circuit (“ASIC”), or other chipset, logic circuit, or other data processing device. Computing processor device(s) 204 may execute one or more application programming interfaces (APIs) 205 that interface with any resident programs, such as video call/conference application 210 or the like, stored in memory 202 of computing platform 200 and any external programs. Computing platform 200 includes various processing sub-systems (not shown in FIG. 3) embodied in hardware, firmware, software, and combinations thereof, that enable the functionality of computing platform 200 and the operability of computing platform 200 on a distributed communication network, such as distributed communication network 110 shown in FIG. 1. For example, processing sub-systems allow for initiating and maintaining communications and exchanging data with other networked devices. For the disclosed aspects, processing sub-systems of computing platform 200 include any processing sub-system portion used in conjunction with video call/conference application 210 and engines, tools, routines, sub-routines, applications, sub-applications, sub-modules thereof.
In specific embodiments of the present invention, computing platform 200 additionally includes a communications module (not shown in FIG. 3) embodied in hardware, firmware, software, and combinations thereof, that enables electronic communications between components of computing platform 200 and other networks and network devices. Thus, the communications module includes the requisite hardware, firmware, software and/or combinations thereof for establishing and maintaining a network communication connection with one or more devices and/or networks.
As previously discussed in relation to FIG. 1, memory 202 stores video call/conference application 210 that is executable by one or more of the computing processor device(s) 204. Video call/conference application 210 includes Artificial Intelligence (AI) 220, specifically computer vision 220-1, which is used to replicate the complexity of human vision and the human mind's ability to recognize objects, analyze scenes and understand visual cues. In additional embodiments of the invention, AI 220 includes Natural Language Processing 220-2, which is capable of understanding and interpreting human language.
As previously discussed in relation to FIG. 1, video call/conference application 210 is configured to initiate a video call/conference 230 amongst a plurality of video call participants 120 (e.g., 120-1 through 120-5). In response to initiating the video call/conference 230, video call/conference application 210 is configured to receive an input, herein first input 240, that is configured to place a first video call participant 120-1 on mute 242 (i.e., temporarily disable the sound-capturing device 208 (e.g., microphone)).
In response to placing the first video call participant 120-1 on mute 242, the video call/conference application is further configured to implement the AI 220, specifically the computer vision 220-1, to monitor 250 for mouth movement 252 by the first video call participant 120-1 that indicates speech 254. In this regard, the invention realizes that certain mouth movements will not indicate speech or otherwise rise to the level of speech and, thus, is able to discern between mouth movements 252 that indicate speech 254 and mouth movements 252 that do not indicate speech 254. The speech 254 may be the reason why the first video call participant 120-1 requested to be placed on mute 242. For example, the first video call participant 120-1 may need to conduct an offline conversation with a family member or conduct a secondary voice call with someone external from the video call/conference 230.
In specific embodiments of the invention, the video call/conference application is further configured to implement the AI 220, specifically the NLP 220-2, to determine whether the speech 254 includes sensitive data 310, specifically Non-Public Information (NPI) 310-1 or the like.
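As a rough illustration of the NPI check, the sketch below scans a text transcript for a few patterns that resemble financial identifiers. This is plain pattern matching swapped in as a stand-in for the NLP named above, not the NLP itself. It assumes a transcript of the muted participant's speech is already available (how that transcript is produced is outside the sketch), and the pattern set, `NPI_PATTERNS` and `find_npi` are hypothetical names chosen for the example.

```python
import re
from typing import List

# Hypothetical, deliberately small pattern set; a real deployment would need a
# far broader and locale-aware notion of Non-Public Information.
NPI_PATTERNS = {
    "account_number": re.compile(r"\b\d{8,17}\b"),
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "pin_mention": re.compile(r"\bpin\s+(?:is\s+)?\d{4,6}\b", re.IGNORECASE),
}


def find_npi(transcript: str) -> List[str]:
    """Return the sorted names of the NPI patterns found in the transcript."""
    return sorted(name for name, pattern in NPI_PATTERNS.items() if pattern.search(transcript))


def speech_includes_npi(transcript: str) -> bool:
    """Prerequisite check: True when at least one NPI pattern matches."""
    return bool(find_npi(transcript))


print(find_npi("my account is 123456789012 and my PIN is 4821"))
print(speech_includes_npi("see you at lunch"))
```

In embodiments where NPI detection is a prerequisite, the boolean result would gate whether any of the protective actions run.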
In further specific embodiments of the invention, in response to the monitoring 250 detecting mouth movements 252 that indicate speech 254, the video call/conference application 210 is configured to receive a second input 280 that indicates that the first call participant 120-1 desires to remain muted 282. The present invention realizes that detection of mouth movement 252 that indicates speech 254 may be indicative of the video call participant 120-1 failing to realize that they are on mute 282 (i.e., the speech 254 may be intended for the video call/conference 230 but is not being transmitted because the video call participant 120-1 is on mute 242). As such, second input 280 may result from the video call/conference application 210 presenting the first video call participant 120-1, via a pop-up window or the like, an option to disable mute 242 (i.e., enable the sound-capturing device 208) or remain muted 282.
In response to the monitoring 250 detecting mouth movement 252 indicating speech 254 by the first video call participant 120-1 and, in some embodiments of the invention, determining that the speech 254 includes sensitive data 310, specifically NPI 310-1, video call/conference application 210 is configured to perform one or more actions 260 that prevent viewing 262 of the mouth movements 252 by the other video call participants (120-2 through 120-5). In this regard, the present invention takes into account that the other video call participants (120-2 through 120-5) may be capable of speechreading (e.g., lip reading) or employ automated means, such as Artificial Intelligence, for speechreading, and, thus, discern what the first video call participant 120-1 is saying.
The actions 260 may include pausing/stopping 264 at least one of (i) video capture 266 by the image-capturing device 206 and (ii) video transmission 268 of video captured by the image-capturing device 206. Further actions 260 may include continually identifying the region of the video that includes the mouth 122 of the first video call participant 120-1 and obfuscating 270 (e.g., blurring, pixelating or the like) the identified region. In additional embodiments of the invention, actions 260 may include generating, using AI 220 or the like, or retrieving from memory 202 one or more images 274 that depict a mouth of the first call participant in a stationary position and superimposing 272 the one or more images 274 over the mouth movements 252 in a video feed of the first call participant 120-1. In this regard, since the other video call participants see/view the mouth of the first video call participant 120-1 in a stationary position, the other video call participants 120 (e.g., 120-2 through 120-5) are led to believe that the first video call participant 120-1 is not engaged in speech.
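The obfuscation action can be sketched as pixelation of a rectangular mouth region. The example below is a minimal illustration only: it assumes a grayscale frame held as a list of rows of integer intensities and a mouth bounding box supplied by an upstream detector (not shown). `pixelate_region` and its parameters are hypothetical names, and the disclosure does not prescribe this or any particular obfuscation algorithm.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities


def pixelate_region(frame: Frame, top: int, left: int, height: int, width: int, block: int = 4) -> Frame:
    """Return a copy of the frame in which each block-by-block tile inside the
    given rectangle is replaced by the tile's mean intensity."""
    if block <= 0:
        raise ValueError("block must be positive")
    out = [row[:] for row in frame]  # leave the input frame untouched
    bottom = min(top + height, len(frame))
    right = min(left + width, len(frame[0]) if frame else 0)
    for y0 in range(max(top, 0), bottom, block):
        for x0 in range(max(left, 0), right, block):
            ys = range(y0, min(y0 + block, bottom))
            xs = range(x0, min(x0 + block, right))
            total = sum(frame[y][x] for y in ys for x in xs)
            mean = total // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
# Pixelate the 2x2 region whose top-left corner is at row 1, column 1.
blurred = pixelate_region(frame, top=1, left=1, height=2, width=2, block=2)
for row in blurred:
    print(row)
```

Because the region is identified continually, a function of this kind would be applied to every outgoing frame, with the bounding box refreshed as the participant moves.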
Referring to FIG. 4, a block diagram is depicted of computing platform 200 highlighting various alternate embodiments of the system shown and described in relation to FIG. 2, in accordance with embodiments of the present invention. Computing platform 200 may comprise one or multiple computing devices, such as personal computers (PCs), laptops, mobile communication devices (e.g., smart phones), tablet devices or the like. As previously discussed in relation to FIG. 2, computing platform 200 includes memory 202, which may comprise volatile and/or non-volatile memory, such as read-only memory (ROM) and/or random-access memory (RAM), EPROM, EEPROM, flash cards, or any memory common to computing platforms. Moreover, memory 202 may comprise cloud storage, such as provided by a cloud storage service and/or a cloud connection service.
Further, computing platform 200 includes one or more computing processor devices 204, which may be an application-specific integrated circuit (“ASIC”), or other chipset, logic circuit, or other data processing device. Computing processor device(s) 204 may execute one or more application programming interfaces (APIs) 205 that interface with any resident programs, such as video call/conference application 210 or the like, stored in memory 202 of computing platform 200 and any external programs. Computing platform 200 includes various processing sub-systems (not shown in FIG. 4) embodied in hardware, firmware, software, and combinations thereof, that enable the functionality of computing platform 200 and the operability of computing platform 200 on a distributed communication network, such as distributed communication network 110 shown in FIG. 2. For example, processing sub-systems allow for initiating and maintaining communications and exchanging data with other networked devices. For the disclosed aspects, processing sub-systems of computing platform 200 include any processing sub-system portion used in conjunction with video call/conference application 210 and engines, tools, routines, sub-routines, applications, sub-applications, sub-modules thereof.
In specific embodiments of the present invention, computing platform 200 additionally includes a communications module (not shown in FIG. 4) embodied in hardware, firmware, software, and combinations thereof, that enables electronic communications between components of computing platform 200 and other networks and network devices. Thus, the communications module includes the requisite hardware, firmware, software and/or combinations thereof for establishing and maintaining a network communication connection with one or more devices and/or networks.
As previously discussed in relation to FIG. 2, memory 202 stores video call/conference application 210 that is executable by one or more of the computing processor device(s) 204. Video call/conference application 210 includes artificial intelligence 220, which may include computer vision 220-1, NLP 220-2 and facial recognition 220-3. In addition, video conference application 210 may include optical character recognition (OCR) 222, which may or may not use AI techniques.
Video call/conference application 210 is configured to initiate a video call/conference 230 amongst a plurality of video call participants 120-1 through 120-5. In response to initiating the video call/conference 230, video call/conference application 210 is configured to implement the AI 220 to detect 300 sensitive data 310 in at least one of a video feed 320 or an audio feed 330 being transmitted by a first call participant 120-1 from amongst the plurality of call participants 120.
The sensitive data 310 in the video feed 320 may include, but is not limited to, text/indicia 322 that may show up in the background of the video feed 320 or is in the possession of first video call participant 120-1. For example, the text/indicia 322 may be on a whiteboard/chalkboard or the like or may be printed on materials held by the first video call participant 120-1. In other instances, the sensitive data 310 in the video feed 320 may be images of individuals 324 other than the first video call participant 120-1. For example, the images of individuals 324 may be photographs of family members or friends in the background or may be actual individuals that come within the field of view of the image-capturing device 206. In specific embodiments of the invention, facial recognition 220-3 and/or optical character recognition 222 may be implemented to determine whether the text/indicia 322 or the images 324 include sensitive data, or more specifically, non-public information (NPI).
The sensitive data 310 in the audio feed 330 may be voices of other individuals/speakers 332, such as family members, work colleagues or the like, either addressing the first video call participant 120-1 or conducting conversations with other individuals. In specific embodiments of the invention, NLP 220-2 may be implemented to determine whether the output of the voices of the other individuals includes sensitive data, or more specifically non-public information (NPI).
In response to detecting 300 the sensitive data 310 in at least one of the video feed 320 or the audio feed 330 of the first call participant 120-1, the video call/conference application 210 is configured to perform one or more actions 340 that prevent viewing or hearing 342 of the sensitive data 310 by one or more other video call participants 120 (e.g., 120-2 through 120-5). In specific embodiments of the invention, prior to performing at least one of the one or more actions 340, the first video call participant 120-1 may be presented with a dialog box or the like that requests that the first video call participant 120-1 provide permission 370 for performing that at least one of the one or more actions 340.
The actions 340 may include, but are not limited to, obfuscation 350 (e.g., blurring, pixelating or the like) of the text/indicia 322 and/or the images of the individuals 324. In other embodiments of the invention, the actions 340 may include, but are not limited to, implementing noise reduction 360 techniques to reduce or in some instances eliminate the background voices of other individuals/speakers 332.
Referring to FIG. 5, a flow diagram is presented of a method 500 for prevention of sensitive data leakage, such as non-public information (NPI), during a video call/conference when a call participant is placed on mute, in accordance with embodiments of the present invention. At Event 510, a video call/conference is initiated amongst multiple video call participants and, at Event 520, an input is received from one of the video call participants that is configured to place the video call participant on mute (i.e., disable the microphone).
At Decision 530, a determination is made as to whether mouth movement by the muted video call participant that indicates speech is detected. If mouth movement indicating speech is not detected, the monitoring of the muted video call participant for mouth movement indicating speech continues as long as the video call participant remains on mute.
If mouth movement indicating speech is detected, at Decision 540, a determination is made as to whether the muted video call participant desires to remain on mute. In specific embodiments of the method, detection of mouth movement indicating speech may prompt a pop-up window or the like to be displayed which asks the video call participant if they desire to remain on mute. If the video call participant indicates that they no longer desire to remain on mute, at Event 550, the video call participant is unmuted (i.e., the microphone is activated, such that other video call participants can receive an audio feed from the video call participant).
If the video call participant indicates that they desire to remain on mute, at Decision 560, a determination is made as to whether the speech includes non-public information, i.e., personal information that the video call participant would not want made public. Such a determination may be made via use of NLP or the like. If a determination is made that the speech does not include non-public information, the method returns to visually monitoring the muted video call participant for mouth movements that indicate speech. If a determination is made that the speech does include NPI, at Event 570, one or more actions are performed that prevent other video call participants from viewing the mouth movements of the muted video call participant. Examples of such actions include pausing or stopping the video feed (or the capture of video), obfuscating regions of the video feed that include the mouth of the muted video call participant, or superimposing stationary images of the muted video call participant's mouth over the moving images.
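The decision sequence of this method can be summarized in a short sketch. The function below is an illustrative reading of the flow only; the three boolean inputs are assumed to be supplied by the mouth-movement monitoring, the participant's response to the prompt, and the NPI determination respectively, and `Outcome` and `mute_flow_step` are hypothetical names.

```python
from enum import Enum


class Outcome(Enum):
    KEEP_MONITORING = "keep_monitoring"
    UNMUTE = "unmute"
    PROTECT_VIDEO = "protect_video"


def mute_flow_step(speech_detected: bool, wants_to_stay_muted: bool, speech_has_npi: bool) -> Outcome:
    """One pass through the decisions made while a participant is on mute."""
    if not speech_detected:          # Decision 530: no speech-like mouth movement
        return Outcome.KEEP_MONITORING
    if not wants_to_stay_muted:      # Decision 540 -> Event 550: participant unmutes
        return Outcome.UNMUTE
    if not speech_has_npi:           # Decision 560: speech carries no NPI
        return Outcome.KEEP_MONITORING
    return Outcome.PROTECT_VIDEO     # Event 570: pause, obfuscate or superimpose


print(mute_flow_step(True, True, True).value)
print(mute_flow_step(True, False, True).value)
```

Embodiments that omit the remain-on-mute prompt or the NPI prerequisite would simply skip the corresponding check.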
6 FIG. 600 610 620 Referring toa flow diagram is presented of a methodfor prevention of sensitive data leakage, such as non-public information (NPI) during a video call/conference when a call participant is placed on mute, in accordance with embodiments of the present invention. At Event, a video call/conference is initiated amongst multiple video call participants and, at Event, the audio and video feed of the video call is monitored for sensitive data.
630 620 At Decision, a determination is made as to whether the video or audio feed includes sensitive data. If the video or audio feed does not include sensitive data, the method returns to Eventfor further monitoring of the video call for sensitive data in the audio or video feed.
If sensitive data is detected in the audio or video feed, at Decision 640, a determination is made as to whether the sensitive data includes non-public information, i.e., personal information that the video call participant would not want made public. Such a determination may be made via use of NLP, OCR or the like. If a determination is made that the sensitive data does not include non-public information, the method returns to Event 620 for further monitoring of the video call for sensitive data in the audio or video feed.
If the sensitive data does include non-public information, at Decision 650, a determination is made as to whether the video call participant has provided permission for further actions. In specific embodiments of the method, detection of sensitive data or NPI may prompt a pop-up window or the like to be displayed which asks the video call participant if they desire for further actions to be taken on the sensitive data/NPI. If the video call participant does not provide permission, the method returns to Event 620 for further monitoring of the video call for sensitive data in the audio or video feed.
If the video call participant provides permission, at Event 660, one or more actions are performed that prevent other video call participants from viewing or hearing the sensitive data. Examples of such actions include, but are not limited to, obfuscating regions of the video feed that include the sensitive data or implementing noise reduction in the audio feed to mask audible sensitive data/NPI.
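As a simplified illustration of the determination at Decision 640, the sketch below scans text (for example, a speech transcript or OCR output) for patterns that resemble non-public information. Regular expressions are used here only as a stand-in for the NLP/OCR techniques named above. The pattern set, the names `NPI_PATTERNS`, `find_npi` and `redact_npi`, and the mask string are assumptions made for this example.

```python
import re

# Hypothetical patterns; a real deployment would likely rely on NLP models.
NPI_PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_like": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
}


def find_npi(text):
    """Return (pattern_name, matched_text) pairs for every NPI-like match."""
    return [
        (name, match.group())
        for name, pattern in NPI_PATTERNS.items()
        for match in pattern.finditer(text)
    ]


def redact_npi(text, mask="[REDACTED]"):
    """Replace every NPI-like match in the text with a mask string."""
    for pattern in NPI_PATTERNS.values():
        text = pattern.sub(mask, text)
    return text
```

An empty result from `find_npi` corresponds to the branch in which the method returns to Event 620; a non-empty result corresponds to proceeding to Decision 650.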
Referring to FIG. 7, a flow diagram is presented of a method 700 for prevention of sensitive data leakage, such as non-public information (NPI), during a video call/conference, in accordance with embodiments of the present invention. At Event 710, a video call/conference is initiated amongst a plurality of video call participants and, at Event 720, an input is received, during the video call/conference, which is configured to place one of the video call participants on mute (i.e., disable the video call participant's microphone).
In response to placing the video call participant on mute, at Event 730, artificial intelligence including computer vision is implemented to monitor for detection of mouth movement by the muted video call participant that indicates speech by the muted video call participant. In response to the monitoring detecting mouth movements that indicate speech by the muted video call participant, at Event 740, one or more actions are performed that prevent the other video call participants from viewing the mouth movements by the muted video call participant. The actions may include, but are not limited to, pausing or stopping the video feed (or the capture of video), obfuscating regions of the video feed that include the mouth of the muted video call participant, or superimposing stationary images of the muted video call participant's mouth over the moving images.
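One possible way to approximate the monitoring at Event 730 is sketched below. It assumes that an upstream face-landmark detector (not shown, and not specified by the present description) supplies four mouth points per frame. The mouth-opening ratio and the variation threshold `min_range` are illustrative assumptions, not a described implementation.

```python
import math


def mouth_open_ratio(top, bottom, left, right):
    """Lip gap divided by mouth width, each point given as an (x, y) pair."""
    width = math.dist(left, right)
    if width == 0:
        return 0.0
    return math.dist(top, bottom) / width


def indicates_speech(ratios, min_range=0.1):
    """Flag speech when the ratio varies enough across a window of frames.

    A mouth that stays closed, or stays open, shows little variation;
    speech tends to open and close the mouth repeatedly.
    """
    if len(ratios) < 2:
        return False
    return max(ratios) - min(ratios) >= min_range
```

In such a sketch, a window of ratios such as `[0.05, 0.30, 0.10, 0.35]` would be flagged, whereas `[0.05, 0.05, 0.06]` would not.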
Referring to FIG. 8, a flow diagram is presented of a method 800 for prevention of sensitive data leakage in a video call/conference environment, in accordance with embodiments of the present invention. At Event 810, a video call/conference is initiated amongst a plurality of video call participants and, in response to initiating the video call/conference, at Event 820, artificial intelligence is implemented to monitor for detection of sensitive data in the captured images or sounds (i.e., in the video feed or the audio feed being transmitted by the first call participant to other video call participants). Sensitive data may include indicia/text displayed in the background of the video feed, images/photographs of individuals displayed in the background of the video feed, actual individuals that come within the field of view of the image-capturing device, and voices of other individuals that are picked up in the audio feed.
In response to detecting sensitive data in the video and/or audio feeds being transmitted by the video call participant, at Event 830, one or more actions are performed that prevent the other video call participants from viewing or hearing the sensitive data. The actions may include, but are not limited to, obfuscating regions of the video feed that include the sensitive data (such as text/indicia, images or the like) or suppressing the background audio to lessen or eliminate speech of other individuals.
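The obfuscation of a region of the video feed may be illustrated, in highly simplified form, as follows. The sketch assumes that a frame is a list of rows of grayscale pixel values and that the bounding box of the sensitive region has already been located. The function name `obfuscate_region` and the choice of a mean-value fill are assumptions for this example; blurring or pixelation would serve equally.

```python
def obfuscate_region(frame, top, left, height, width):
    """Return a copy of the frame with one rectangle flattened to its mean.

    The rectangle is clipped to the frame bounds and the input frame is
    left unmodified.
    """
    out = [row[:] for row in frame]
    n_cols = len(frame[0]) if frame else 0
    rows = range(max(top, 0), min(top + height, len(frame)))
    cols = range(max(left, 0), min(left + width, n_cols))
    pixels = [frame[r][c] for r in rows for c in cols]
    if not pixels:
        return out
    mean = sum(pixels) // len(pixels)
    for r in rows:
        for c in cols:
            out[r][c] = mean
    return out
```

Returning a new frame rather than modifying the captured frame in place keeps the original available locally, so that only the obfuscated copy need be transmitted to the other participants.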
FIG. 9 illustrates an exemplary machine learning (ML) subsystem architecture 900, in accordance with an embodiment of the invention. The machine learning subsystem 900 includes a data acquisition engine 902, data ingestion engine 910, data pre-processing engine 916, ML model tuning engine 922, and inference engine 936.
The data acquisition engine 902 identifies various internal and/or external data sources to generate, test, and/or integrate new features for training the machine learning model 924. These internal and/or external data sources 904, 906, and 908 may be initial locations where the data originates or where physical information is first digitized. The data acquisition engine 902 identifies the location of the data and describes connection characteristics for access and retrieval of data. In some embodiments, data is transported from each data source 904, 906, or 908 using any applicable network protocols, such as the File Transfer Protocol (FTP), Hyper-Text Transfer Protocol (HTTP), or any of the myriad Application Programming Interfaces (APIs) provided by websites, networked applications, and other services. In some embodiments, these data sources include Enterprise Resource Planning (ERP) database(s) 904 that host data related to day-to-day business activities such as accounting, procurement, project management, exposure management, supply chain operations, and/or the like, mainframe 906 that is often the entity's central data processing center, edge device(s) 908 that may be any piece of hardware, such as sensors, actuators, gadgets, appliances, or machines, that are programmed for certain applications and can transmit data over the internet or other networks, and/or the like. The data acquired by the data acquisition engine 902 from these data sources 904, 906, and 908 is transported to the data ingestion engine 910 for further processing.
Depending on the nature of the data imported from the data acquisition engine 902, the data ingestion engine 910 may move the data to a destination for storage or further analysis. Typically, the data imported from the data acquisition engine 902 is in varying formats as the data comes from different sources, including Relational Database Management Systems (RDBMS), other types of databases, Simple Storage Service (S3) buckets, Comma-Separated Values (CSV) files, or from streams. Since the data comes from different entities, the data needs to be cleansed and transformed so that it can be analyzed together with data from other sources. At the data ingestion engine 910, the data may be ingested in real-time, using the stream processing engine 912, in batches using the batch data warehouse 914, or a combination of both. The stream processing engine 912 may be used to process continuous data streams (e.g., data from edge devices), i.e., computing on data directly as it is received, and filter the incoming data to retain specific portions that are deemed useful by aggregating, analyzing, transforming, and ingesting the data. On the other hand, the batch data warehouse 914 collects and transfers data in batches according to scheduled intervals, trigger events, or any other logical ordering.
In machine learning, the quality of data and the useful information that can be derived therefrom directly affects the ability of the machine learning model 924 to learn. The data pre-processing engine 916 implements advanced integration and processing steps needed to prepare the data for machine learning execution. This includes modules to perform any upfront data transformation to consolidate the data into alternate forms by changing the value, structure, or format of the data using generalization, normalization, attribute selection, and aggregation; data cleaning by filling missing values, smoothing noisy data, resolving inconsistencies, and removing outliers; and/or any other encoding steps as needed.
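Two of the pre-processing steps named above, filling missing values and normalization, may be sketched as follows. Mean imputation and min-max scaling are chosen here only as common examples; the description does not prescribe particular techniques, and the function names are hypothetical.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no observed values to compute a mean from")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly so that they span the range 0.0 to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Applied in sequence, `min_max_normalize(fill_missing([1.0, None, 3.0]))` first fills the gap with the mean of the observed values and then rescales the completed column.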
In addition to improving the quality of the data, the data pre-processing engine 916 implements feature extraction and/or selection techniques to generate training data 918. Feature extraction and/or selection is a process of dimensionality reduction by which an initial set of data is reduced to more manageable groups for processing. A characteristic of these large data sets is a large number of variables that require sizeable computing resources to process. Feature extraction and/or selection may be used to select and/or combine variables into features, effectively reducing the amount of data that must be processed, while still accurately and completely describing the original data set. Depending on the type of machine learning algorithm being used, training data 918 may require further enrichment. For example, in supervised learning, the training data is enriched using one or more meaningful and informative labels to provide context so a machine learning model can learn from it. For example, labels might indicate whether a photo contains a bird or car, which words were uttered in an audio recording, or if an x-ray contains a tumor. Data labeling is required for a variety of use cases including computer vision, natural language processing, and speech recognition. In contrast, unsupervised learning uses unlabeled data to find patterns in the data, such as inferences or clustering of data points.
The ML model tuning engine 922 may be used to train a machine learning model 924 using the training data 918 to make predictions or decisions without explicitly being programmed to do so. The machine learning model 924 represents what was learned by the selected machine learning algorithm 920 and represents the rules, numbers, and any other algorithm-specific data structures required for classification. Selecting the right machine learning algorithm may depend on a number of different factors, such as the problem statement and the kind of output needed, type and size of the data, the available computational time, number of features and observations in the data, and/or the like. Machine learning algorithms may refer to programs (math and logic) that are configured to self-adjust and perform better as they are exposed to more data. To this extent, machine learning algorithms are capable of adjusting their own parameters, given feedback on previous performance in making predictions about a dataset.
The machine learning algorithms contemplated, described, and/or used herein include supervised learning (e.g., using logistic regression, using back propagation neural networks, using random forests, decision trees, or the like), unsupervised learning (e.g., using an Apriori algorithm, using K-means clustering), semi-supervised learning, reinforcement learning (e.g., using a Q-learning algorithm, using temporal difference learning), and/or any other suitable machine learning model type. Each of these types of machine learning algorithms can implement any of one or more of a regression algorithm (e.g., ordinary least squares, logistic regression, stepwise regression, multivariate adaptive regression splines, locally estimated scatterplot smoothing, or the like), an instance-based method (e.g., k-nearest neighbor, learning vector quantization, self-organizing map, or the like), a regularization method (e.g., ridge regression, least absolute shrinkage and selection operator, elastic net, or the like), a decision tree learning method (e.g., classification and regression tree, iterative dichotomiser 3, C4.5, chi-squared automatic interaction detection, decision stump, random forest, multivariate adaptive regression splines, gradient boosting machines, or the like), a Bayesian method (e.g., naïve Bayes, averaged one-dependence estimators, Bayesian belief network, or the like), a kernel method (e.g., a support vector machine, a radial basis function, or the like), a clustering method (e.g., k-means clustering, expectation maximization, or the like), an association rule learning algorithm (e.g., an Apriori algorithm, an Eclat algorithm, or the like), an artificial neural network model (e.g., a Perceptron method, a back-propagation method, a Hopfield network method, a self-organizing map method, a learning vector quantization method, or the like), a deep learning algorithm (e.g., a restricted Boltzmann machine, a deep belief network method, a convolution network method, a stacked auto-encoder method, or the like), a dimensionality reduction method (e.g., principal component analysis, partial least squares regression, Sammon mapping, multidimensional scaling, projection pursuit, or the like), an ensemble method (e.g., boosting, bootstrapped aggregation, AdaBoost, stacked generalization, gradient boosting machine method, random forest method, or the like), and/or the like.
To tune the machine learning model, the ML model tuning engine 922 repeatedly executes cycles of initialization/experimentation 926, testing 928, and tuning 930 to optimize the performance of the machine learning model 924 and refine the results in preparation for deployment of those results for consumption or decision making. To this end, the ML model tuning engine 922 may dynamically vary hyperparameters each iteration (e.g., number of trees in a tree-based algorithm or the value of alpha in a linear algorithm), run the algorithm on the data 918 again, then compare its performance on a validation set to determine which set of hyperparameters results in the most accurate model. The accuracy of the model is the measurement used to determine which set of hyperparameters is best at identifying relationships and patterns between variables in a dataset based on the input, or training data. A fully trained machine learning model 932 is one whose hyperparameters are tuned and model accuracy maximized.
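The tuning cycle described above (vary a hyperparameter, re-run the algorithm, compare accuracy on a validation set) may be illustrated with a deliberately small example. A one-dimensional k-nearest-neighbor classifier is substituted here for the tree-based and linear algorithms mentioned in the text, with the neighbor count `k` as the hyperparameter. The data sets and all function names are invented for this sketch.

```python
def knn_predict(train, x, k):
    """Majority vote over the k training points nearest to x (labels 0 or 1)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train, validation, k):
    """Fraction of validation points classified correctly for a given k."""
    correct = sum(knn_predict(train, x, k) == y for x, y in validation)
    return correct / len(validation)


def tune_k(train, validation, candidates):
    """Return the candidate k with the highest validation accuracy."""
    return max(candidates, key=lambda k: accuracy(train, validation, k))


# Illustrative data: (feature, label) pairs; (1.1, 1) is a noisy label.
TRAIN = [(1.0, 0), (1.2, 0), (0.8, 0), (3.0, 1), (3.2, 1), (2.8, 1), (1.1, 1)]
VALIDATION = [(0.9, 0), (1.12, 0), (3.1, 1), (2.9, 1)]

best_k = tune_k(TRAIN, VALIDATION, [1, 3, 5])
```

With this data, `k = 1` is misled by the noisy training point near 1.1, whereas a larger neighborhood outvotes it, so the selected `best_k` is greater than 1.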
The trained machine learning model 932, similar to any other software application output, can be persisted to storage, file, memory, or application, or looped back into the processing component to be reprocessed. More often, the trained machine learning model 932 is deployed into an existing production environment to make practical decisions based on live data 934 (such as, in accordance with the present invention, signals from beacons, data derived from beacon signals, movement/route maps and the like). To this end, the machine learning subsystem 900 uses the inference engine 936 to make such decisions. The type of decision-making may depend upon the type of machine learning algorithm used. For example, machine learning models trained using supervised learning algorithms may be used to structure computations in terms of categorized outputs (e.g., C_1, C_2 . . . C_n 938) or observations based on defined classifications, represent possible solutions to a decision based on certain conditions, model complex relationships between inputs and outputs to find patterns in data or capture a statistical structure among variables with unknown relationships, and/or the like. On the other hand, machine learning models trained using unsupervised learning algorithms may be used to group (e.g., C_1, C_2 . . . C_n 938) live data 934 based on how similar they are to one another to solve exploratory challenges where little is known about the data, provide a description or label (e.g., C_1, C_2 . . . C_n 938) to live data 934, such as in classification, and/or the like. These categorized outputs, groups (clusters), or labels are then presented to the user input system 200. In still other cases, machine learning models that perform regression techniques may use live data 934 to predict or forecast continuous outcomes.
It will be understood that the embodiment of the machine learning subsystem 900 illustrated in FIG. 9 is exemplary and that other embodiments may vary. As another example, in some embodiments, the machine learning subsystem 900 includes more, fewer, or different components.
Thus, as described in detail above, present embodiments of the invention include systems, methods, computer program products and/or the like that provide protection of sensitive data, such as Non-Public Information (NPI) or the like, in a video call/conference environment. Specifically, in response to initiating a video call/conference and placing a video call participant on mute, the system uses Artificial Intelligence (AI), specifically computer vision, to monitor for mouth movements by the muted video call participant that indicate speech. In response to the monitoring detecting mouth movements that indicate speech, one or more actions are performed that prevent other video call/conference participants from viewing the mouth movement indicating speech by the first call participant. The actions may include, but are not limited to, stopping/pausing the video feed or the capturing of video, obfuscating the region in the video feed that includes the muted video call participant's mouth, or using AI to replace the mouth movements with images of the video call participant's mouth being stationary. In specific embodiments, one or more of the actions may be performed once Non-Public Information (NPI) is identified in the speech and/or after the muted video call participant has indicated a desire to remain on mute.
While certain exemplary embodiments have been described and shown in the accompanying drawings, it is to be understood that such embodiments are merely illustrative of and not restrictive on the broad invention, and that this invention not be limited to the specific constructions and arrangements shown and described, since various other changes, combinations, omissions, modifications and substitutions, in addition to those set forth in the above paragraphs, are possible.
Those skilled in the art may appreciate that various adaptations and modifications of the just described embodiments can be configured without departing from the scope and spirit of the invention. Therefore, it is to be understood that, within the scope of the appended claims, the invention may be practiced other than as specifically described herein.
September 6, 2024
March 12, 2026