Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for identifying suspicious network entity groups from a dataset of entity information, the method comprising: selecting, by a processor, a multi-view sub-graph within a multi-view graph corresponding to a subset of network entities and a subset of views, the multi-view graph representing the dataset of entity information, each node of the multi-view graph corresponding to a network entity identifier, each view of the multi-view graph corresponding to an attribute identifier, and each edge between the nodes of a respective view having an edge weight corresponding to attribute value overlap between those nodes in that view; updating, by the processor, the selected multi-view sub-graph by alternating between a first state in which the subset of network entities is fixed and the subset of views is updated and a second state in which the subset of views is fixed and the subset of network entities is updated; determining, by the processor, a suspiciousness value for the updated multi-view subgraph; repeating the updating and the determining the suspiciousness value until a current suspiciousness value for an updated multi-view sub-graph does not exceed a previously determined suspiciousness value for a preceding multi-view sub-graph; and recording, by the processor, the previously determined suspiciousness value and the subset of network entities corresponding to the preceding multi-view sub-graph.
2. The method of claim 1, further comprising: receiving, by the processor, the network entity identifiers and the attribute identifiers; and generating, by the processor, the multi-view graph from the dataset of entity information using the entity identifiers and the attribute identifiers.
3. The method of claim 1, wherein the updating comprises: updating the selected multi-view sub-graph by updating the subset of views with the subset of network entities fixed; determining the suspiciousness value for the updated multi-view subgraph with the subset of network entities fixed; updating the selected multi-view sub-graph by updating the subset of network entities with the subset of views fixed; and determining the suspiciousness value for the updated multi-view subgraph with the subset of views fixed.
4. The method of claim 3, wherein the updating the selected multi-view sub-graph comprises maintaining a number of value frequencies in a view-specific hashmap.
5. The method of claim 1, further comprising: running multiple instances of the method simultaneously in a multi-thread computing system.
6. The method of claim 1, wherein the selecting comprises: identifying a constraint; selecting, by the processor, a view within the multi-view graph; initializing a candidate seed with nodes having similarity in the selected view; adding a node to the candidate seed; determining if the candidate seed with the added node meets the constraint; and selecting the candidate seed with the added node as the multi-view sub-graph if the constraint is met.
7. The method of claim 6, wherein the constraint is a ratio of a sum of the edge weights to a possible number of edges between the nodes.
8. The method of claim 6, wherein the selecting a view comprises: sampling views within the multi-view graph by weight based on an inverse of a qth frequency percentile across views, wherein q is 95 or greater.
9. The method of claim 1, further comprising the step of: presenting the recorded values to a user.
10. A system for identifying suspicious network entity groups from a dataset of entity information, the system comprising: a memory that stores instructions; and a processor configured by the instructions to perform operations comprising: selecting a multi-view sub-graph within a multi-view graph corresponding to a subset of network entities and a subset of views, the multi-view graph representing the dataset of entity information, each node of the multi-view graph corresponding to a network entity identifier, each view of the multi-view graph corresponding to an attribute identifier, and each edge between the nodes of a respective view having an edge weight corresponding to attribute value overlap between those nodes in that view; updating the selected multi-view sub-graph by alternating between a first state in which the subset of network entities is fixed and the subset of views is updated and a second state in which the subset of views is fixed and the subset of network entities is updated; determining a suspiciousness value for the updated multi-view subgraph; repeating the updating and the determining the suspiciousness value until a current suspiciousness value for an updated multi-view sub-graph does not exceed a previously determined suspiciousness value for a preceding multi-view sub-graph; and recording the previously determined suspiciousness value and the subset of network entities corresponding to the preceding multi-view sub-graph.
11. The system of claim 10, the processor further configured by the instructions to perform operations comprising: receiving the network entity identifiers and the attribute identifiers; and generating the multi-view graph from the dataset of network entity information using the entity identifiers and the attribute identifiers.
12. The system of claim 10, wherein the updating comprises: updating the selected multi-view sub-graph by updating the subset of views with the subset of network entities fixed; determining the suspiciousness value for the updated multi-view subgraph with the subset of network entities fixed; updating the selected multi-view sub-graph by updating the subset of network entities with the subset of views fixed; and determining the suspiciousness value for the updated multi-view subgraph with the subset of views fixed.
13. The system of claim 12, wherein the updating the selected multi-view sub-graph comprises maintaining a number of value frequencies in a view-specific hashmap.
14. The system of claim 10, the processor further configured by the instructions to perform operations comprising: running multiple instances of the system simultaneously in a multi-thread computing system.
15. The system of claim 10, wherein the selecting comprises: identifying a constraint; selecting a view within the multi-view graph; initializing a candidate seed with nodes having similarity in the selected view; adding a node to the candidate seed; determining if the candidate seed with the added node meets the constraint; and selecting the candidate seed with the added node as the multi-view sub-graph if the constraint is met.
16. The system of claim 15, wherein the constraint is a ratio of a sum of the edge weights to a possible number of edges between the nodes.
17. The system of claim 15, wherein the selecting a view comprises: sampling views within the multi-view graph by weight based on an inverse of a qth frequency percentile across views, wherein q is 95 or greater.
18. The system of claim 10, the processor further configured by the instructions to perform operations comprising: presenting the recorded values to a user.
19. A non-transitory processor-readable storage medium storing processor-executable instructions that, when executed by a processor of a machine, cause the machine to perform operations comprising: selecting, by the processor, a multi-view sub-graph within a multi-view graph corresponding to a subset of network entities and a subset of views, the multi-view graph representing the dataset of entity information, each node of the multi-view graph corresponding to a network entity identifier, each view of the multi-view graph corresponding to an attribute identifier, and each edge between the nodes of a respective view having an edge weight corresponding to attribute value overlap between those nodes in that view; updating, by the processor, the selected multi-view sub-graph by alternating between a first state in which the subset of network entities is fixed and the subset of views is updated and a second state in which the subset of views is fixed and the subset of network entities is updated; determining, by the processor, a suspiciousness value for the updated multi-view subgraph; repeating the updating and the determining the suspiciousness value until a current suspiciousness value for an updated multi-view sub-graph does not exceed a previously determined suspiciousness value for a preceding multi-view sub-graph; and recording, by the processor, the previously determined suspiciousness value and the subset of network entities corresponding to the preceding multi-view sub-graph.
20. The non-transitory processor-readable storage medium of claim 19, wherein identifying a constraint comprises: selecting a view within the multi-view graph; initializing a candidate seed with nodes having similarity in the selected view; adding a node to the candidate seed; determining if the candidate seed with the added node meets the constraint; and selecting the candidate seed with the added node as the multi-view sub-graph if the constraint is met.
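The seed selection of claims 6 to 8 (and 15 to 17) can be sketched in the same way. The sketch below is hypothetical throughout: it reads "qth frequency percentile" as the nearest-rank q-th percentile of value frequencies within each view, treats the constraint as a minimum threshold on the ratio of summed edge weights to possible edges in the selected view, and seeds the candidate with the entities sharing the most frequent value. None of these choices is specified by the claims, and the names are illustrative.

```python
import math
import random
from collections import Counter
from itertools import combinations

# Hypothetical toy dataset: entity id -> {attribute (view) -> set of values}.
DATA = {
    "a": {"ip": {"1.1.1.1"}, "phone": {"111"}},
    "b": {"ip": {"1.1.1.1"}, "phone": {"222"}},
    "c": {"ip": {"1.1.1.1"}, "phone": {"333"}},
    "d": {"ip": {"9.9.9.9"}, "phone": {"111"}},
}


def value_frequencies(data, view):
    """View-specific hashmap: attribute value -> number of entities carrying it."""
    return Counter(val for attrs in data.values() for val in attrs.get(view, ()))


def percentile(values, q):
    """Nearest-rank q-th percentile of a non-empty list, with 0 < q <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def view_weights(data, views, q=95):
    """Sampling weight per view: inverse of its q-th value-frequency percentile."""
    return [
        1.0 / percentile(list(value_frequencies(data, v).values()), q)
        for v in views
    ]


def density(data, nodes, view):
    """Sum of edge weights divided by the number of possible edges (needs >= 2 nodes)."""
    pairs = list(combinations(sorted(nodes), 2))
    total = sum(
        len(data[u].get(view, set()) & data[v].get(view, set())) for u, v in pairs
    )
    return total / len(pairs)


def pick_seed(data, views, threshold, q=95, rng=random):
    """Sample a view, seed with similar nodes, add a node, check the constraint."""
    view = rng.choices(list(views), weights=view_weights(data, views, q))[0]
    value, _ = value_frequencies(data, view).most_common(1)[0]
    seed = {e for e, attrs in data.items() if value in attrs.get(view, ())}
    for node in sorted(set(data) - seed):
        candidate = seed | {node}
        if density(data, candidate, view) >= threshold:
            return view, candidate
    return None  # no single added node satisfied the constraint
```

Weighting by an inverse percentile makes views whose values are rarely shared more likely to be sampled, so that overlap found in them is less likely to be coincidental. A call such as `pick_seed(DATA, ("ip", "phone"), threshold=0.5)` returns the sampled view and the accepted candidate, or `None` when no added node meets the threshold.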
May 31, 2022