Method and System for Predicting in Real-Time One or More Potential Threats in Video Surveillance

Published: August 4, 2020

Assignee: not available

Inventors: Gopichand Agnihotram, Manjunath Ramachandra Iyer

Patent Claims

11 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method of predicting in real-time one or more potential threats in video surveillance, the method comprising: receiving, by a threat prediction system, a real-time video feed from a video surveillance system, wherein the video feed comprises a plurality of frames associated with a scene captured at a location of the video surveillance system; identifying, by the threat prediction system, one or more objects in each of the plurality of frames, wherein each of the one or more objects is sequenced with respect to the received plurality of frames; generating, by the threat prediction system, a scene description for each of the plurality of frames based on the one or more objects and context associated with corresponding frames, wherein the scene description comprises sentences describing the scene and gestures, and the context comprises sentences describing the scene along with emotions associated with a user, wherein the user is associated with the one or more objects in the corresponding frames; determining, by the threat prediction system, one or more real-time actions for the scene based on the scene description, wherein the one or more real-time actions are determined using a trained action prediction model which is trained based on a conditional probability of possible state action change from one state to another state from a sequence of possible states; predicting, by the threat prediction system, one or more potential threats to the user associated with the video feed based on the one or more real-time actions; and alerting, by the threat prediction system, the user of the one or more potential threats based on the prediction.
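The steps recited in claim 1 can be read as a per-frame pipeline. The following is a minimal sketch of that flow, not the patented implementation: `run_pipeline`, `FrameResult`, and every stage callable are hypothetical names, and the trained models the claim refers to are stood in for by plain functions supplied by the caller. As a simplification, the context (including emotions) is assumed to be folded into `describe_scene`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class FrameResult:
    """Outputs collected for one frame (hypothetical container)."""
    frame_index: int
    objects: List[str]
    description: str
    actions: List[str]
    threats: List[str]


def run_pipeline(
    frames: Iterable[Any],
    detect_objects: Callable[[Any], List[str]],
    describe_scene: Callable[[Any, List[str]], str],
    predict_actions: Callable[[str], List[str]],
    predict_threats: Callable[[List[str]], List[str]],
    alert: Callable[[int, List[str]], None],
) -> List[FrameResult]:
    """Run the claimed stages in order for every frame of the feed."""
    results = []
    for index, frame in enumerate(frames):
        objects = detect_objects(frame)               # identify objects
        description = describe_scene(frame, objects)  # scene description
        actions = predict_actions(description)        # real-time actions
        threats = predict_threats(actions)            # potential threats
        if threats:
            alert(index, threats)                     # alert the user
        results.append(
            FrameResult(index, objects, description, actions, threats)
        )
    return results
```

Passing the stages in as callables keeps the sketch independent of any particular detection or language model; each one could be replaced by a trained model of the kind described in claims 2 to 5.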

2. The method as claimed in claim 1, wherein the one or more objects are identified using a trained object detection model, wherein the object detection model is trained using a plurality of video training feeds using a convolutional neural network technique.

3. The method as claimed in claim 1, wherein the scene description is generated using a trained scene description model, and wherein the scene description model is trained using a plurality of training objects identified for a plurality of video training feeds.

4. The method as claimed in claim 1, wherein the action prediction model is trained using a plurality of actions identified from a plurality of video training feeds.
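Claim 1 describes the action prediction model as trained on a conditional probability of changing from one state to another, and claim 4 says it is trained from actions identified in training feeds. One simple way to realize that idea is a first-order transition table estimated by counting. The sketch below assumes this reading; the function names and the example action labels are hypothetical, and the claims do not state that counting is the method used.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Sequence


def estimate_transitions(
    sequences: Sequence[Sequence[str]],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(next state | current state) from observed sequences."""
    counts = defaultdict(Counter)
    for sequence in sequences:
        # Count each adjacent (current, next) pair in the sequence.
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    # Normalize each row of counts into conditional probabilities.
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


def most_likely_next(
    transitions: Dict[str, Dict[str, float]], state: str
) -> Optional[str]:
    """Return the most probable next state, or None if state is unseen."""
    followers = transitions.get(state)
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical action sequences extracted from training feeds.
training: List[List[str]] = [
    ["walking", "approaching", "reaching"],
    ["walking", "approaching", "passing"],
    ["walking", "approaching", "reaching"],
]
transitions = estimate_transitions(training)
```

With these three sequences, "approaching" is followed by "reaching" twice and "passing" once, so the estimated probabilities are 2/3 and 1/3 and `most_likely_next(transitions, "approaching")` returns "reaching".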

5. The method as claimed in claim 1, wherein the one or more potential threats are predicted by mapping each of the one or more real-time actions with a plurality of predefined threats using a trained threat prediction model, wherein the threat prediction model is trained using a plurality of training actions.
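The mapping in claim 5 from real-time actions to predefined threats can be illustrated with a lookup learned from labelled training actions. This is a sketch under assumptions: the claim does not specify the model type, so a majority-vote table stands in for the trained threat prediction model, and `train_threat_map`, `predict_threats`, and the `"no_threat"` label are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

NO_THREAT = "no_threat"  # hypothetical label for benign actions


def train_threat_map(
    labelled_actions: Iterable[Tuple[str, str]],
) -> Dict[str, str]:
    """Map each training action to its most frequently labelled threat."""
    counts = defaultdict(Counter)
    for action, threat in labelled_actions:
        counts[action][threat] += 1
    return {
        action: row.most_common(1)[0][0] for action, row in counts.items()
    }


def predict_threats(threat_map: Dict[str, str], actions: List[str]) -> List[str]:
    """Return the distinct threats mapped from the given actions."""
    threats = []
    for action in actions:
        # Actions never seen in training are treated as benign here.
        threat = threat_map.get(action, NO_THREAT)
        if threat != NO_THREAT and threat not in threats:
            threats.append(threat)
    return threats


# Hypothetical labelled training actions.
threat_map = train_threat_map([
    ("reaching into bag", "theft"),
    ("reaching into bag", "theft"),
    ("reaching into bag", NO_THREAT),
    ("walking", NO_THREAT),
])
```

Treating unseen actions as benign is a design choice made for brevity; a deployed system might instead flag them for review.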

6. A threat prediction system for predicting in real-time one or more potential threats in video surveillance, comprising: a processor; and a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which, on execution, cause the processor to: receive a real-time video feed from a video surveillance system, wherein the video feed comprises a plurality of frames associated with a scene captured at a location of the video surveillance system; identify one or more objects in each of the plurality of frames, wherein each of the one or more objects is sequenced with respect to the received plurality of frames; generate a scene description for each of the plurality of frames based on the one or more objects and context associated with corresponding frames, wherein the scene description comprises sentences describing the scene and gestures, and the context comprises sentences describing the scene along with emotions associated with a user, wherein the user is associated with the one or more objects in the corresponding frames; determine one or more real-time actions for the scene based on the scene description, wherein the one or more real-time actions are determined using a trained action prediction model which is trained based on a conditional probability of possible state action change from one state to another state from a sequence of possible states; predict one or more potential threats to the user associated with the video feed based on the one or more real-time actions; and alert the user of the one or more potential threats based on the prediction.

7. The threat prediction system as claimed in claim 6, wherein the processor identifies the one or more objects using a trained object detection model, wherein the object detection model is trained using a plurality of video training feeds using a convolutional neural network technique.

8. The threat prediction system as claimed in claim 6, wherein the processor generates the scene description using a trained scene description model, and wherein the scene description model is trained using a plurality of training objects identified for a plurality of video training feeds.

9. The threat prediction system as claimed in claim 6, wherein the action prediction model is trained using a plurality of actions identified from a plurality of video training feeds.

10. The threat prediction system as claimed in claim 6, wherein the processor predicts the one or more potential threats by mapping each of the one or more real-time actions with a plurality of predefined threats using a trained threat prediction model, and wherein the threat prediction model is trained using a plurality of training actions.

11. A non-transitory computer readable medium including instructions stored thereon that, when processed by at least one processor, cause a threat prediction system to perform operations comprising: receiving a real-time video feed from a video surveillance system, wherein the video feed comprises a plurality of frames associated with a scene captured at a location of the video surveillance system; identifying one or more objects in each of the plurality of frames, wherein each of the one or more objects is sequenced with respect to the received plurality of frames; generating a scene description for each of the plurality of frames based on the one or more objects and context associated with corresponding frames, wherein the scene description comprises sentences describing the scene and gestures, and the context comprises sentences describing the scene along with emotions associated with a user, wherein the user is associated with the one or more objects in the corresponding frames; determining one or more real-time actions for the scene based on the scene description, wherein the one or more real-time actions are determined using a trained action prediction model which is trained based on a conditional probability of possible state action change from one state to another state from a sequence of possible states; predicting one or more potential threats to the user associated with the video feed based on the one or more real-time actions; and alerting the user of the one or more potential threats based on the prediction.

Patent Metadata

Filing Date

Unknown

Publication Date

August 4, 2020

Inventors

Gopichand Agnihotram

Manjunath Ramachandra Iyer
