Legal claims defining the scope of protection, as filed with the USPTO.
1. A cyber security system for digital artifact genetic modeling and forensic analysis, comprising: one or more processors and a memory, the memory having instructions encoded thereon such that upon execution of the instructions, the one or more processors perform operations of: receiving a plurality of digital artifacts, each digital artifact possessing features; extracting the features from the digital artifacts; classifying the features into descriptive genotype-phenotype structures, such that genotype features are those that carry inheritance information, and phenotype features are those that carry behavioral information; and determining a lineage, heredity, and provenance of the digital artifacts based on mapping of the genotype-phenotype structures.
2. The system of claim 1, wherein in extracting the features from digital artifacts, the system uses a spatial-temporal vocabulary tree to index a set of features and identify potential relationships between the features.
3. The system of claim 2, wherein in classifying the features into descriptive genotype-phenotype structures, the system identifies correlations between the extracted features and clusters correlated features into motifs.
4. The system of claim 3, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the system further models relations between the digital artifacts as an artifact relation network (ARN), the ARN having nodes and links, such that the nodes represent the digital artifacts, annotated with genotype and phenotype feature vectors, and the links represent similarity relationships between the nodes.
5. The system of claim 4, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the system further generates a hierarchical artifact network (HAN), the HAN having HAN clusters that provide a hierarchical organization of multiple, dependent relations between the digital artifacts, the HAN clusters having leaf nodes that represent digital artifacts and intermediate nodes that represent a degree of relatedness among clusters of digital artifacts.
6. The system of claim 5, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the system further determines lineage between each pair of digital artifacts in the ARN by computing a Kullback-Leibler (KL) divergence of the digital artifacts to estimate an evolution transition, such that if the KL divergence is below a predetermined threshold, the pair of digital artifacts are determined to be of an established ARN lineage relation.
7. The system of claim 6, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the system further determines heredity relations for each established ARN lineage relation by comparing common features of their respective HAN clusters to identify unique overlapping shared features.
8. The system of claim 7, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the system further determines provenance by combining HAN clusters with the ARN lineage relations and heredity relations to estimate missing provenance values.
9. A computer implemented method for digital artifact genetic modeling and forensic analysis, comprising an act of causing a data processor to execute instructions stored on a non-transitory computer-readable medium such that upon execution, the data processor performs operations of: receiving a plurality of digital artifacts, each digital artifact possessing features; extracting the features from the digital artifacts; classifying the features into descriptive genotype-phenotype structures, such that genotype features are those that carry inheritance information, and phenotype features are those that carry behavioral information; and determining a lineage, heredity, and provenance of the digital artifacts based on mapping of the genotype-phenotype structures.
10. The method of claim 9, wherein in extracting the features from digital artifacts, the data processor uses a spatial-temporal vocabulary tree to index a set of features and identify potential relationships between the features.
11. The method of claim 10, wherein in classifying the features into descriptive genotype-phenotype structures, the data processor identifies correlations between the extracted features and clusters correlated features into motifs.
12. The method of claim 11, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the data processor further models relations between the digital artifacts as an artifact relation network (ARN), the ARN having nodes and links, such that the nodes represent the digital artifacts, annotated with genotype and phenotype feature vectors, and the links represent similarity relationships between the nodes.
13. The method of claim 12, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the data processor further generates a hierarchical artifact network (HAN), the HAN having HAN clusters that provide a hierarchical organization of multiple, dependent relations between the digital artifacts, the HAN clusters having leaf nodes that represent digital artifacts and intermediate nodes that represent a degree of relatedness among clusters of digital artifacts.
14. The method of claim 13, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the data processor further determines lineage between each pair of digital artifacts in the ARN by computing a Kullback-Leibler (KL) divergence of the digital artifacts to estimate an evolution transition, such that if the KL divergence is below a predetermined threshold, the pair of digital artifacts are determined to be of an established ARN lineage relation.
15. The method of claim 14, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the data processor further determines heredity relations for each established ARN lineage relation by comparing common features of their respective HAN clusters to identify unique overlapping shared features.
16. The method of claim 15, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the data processor further determines provenance by combining HAN clusters with the ARN lineage relations and heredity relations to estimate missing provenance values.
17. A computer program product for digital artifact genetic modeling and forensic analysis, the computer program product comprising computer-readable instructions stored on a non-transitory computer-readable medium that are executable by a computer having a processor for causing the processor to perform operations of: receiving a plurality of digital artifacts, each digital artifact possessing features; extracting the features from the digital artifacts; classifying the features into descriptive genotype-phenotype structures, such that genotype features are those that carry inheritance information, and phenotype features are those that carry behavioral information; and determining a lineage, heredity, and provenance of the digital artifacts based on mapping of the genotype-phenotype structures.
18. The computer program product of claim 17, wherein in extracting the features from digital artifacts, the computer program product further includes instructions for causing the processor to use a spatial-temporal vocabulary tree to index a set of features and identify potential relationships between the features.
19. The system of claim 18, wherein in classifying the features into descriptive genotype-phenotype structures, the computer program product further includes instructions for causing the processor to identify correlations between the extracted features and cluster correlated features into motifs.
20. The system of claim 19, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the computer program product further includes instructions for causing the processor to model relations between the digital artifacts as an artifact relation network (ARN), the ARN having nodes and links, such that the nodes represent the digital artifacts, annotated with genotype and phenotype feature vectors, and the links represent similarity relationships between the nodes.
21. The system of claim 20, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the computer program product further includes instructions for causing the processor to generate a hierarchical artifact network (HAN), the HAN having HAN clusters that provide a hierarchical organization of multiple, dependent relations between the digital artifacts, the HAN clusters having leaf nodes that represent digital artifacts and intermediate nodes that represent a degree of relatedness among clusters of digital artifacts.
22. The system of claim 21, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the computer program product further includes instructions for causing the processor to determine lineage between each pair of digital artifacts in the ARN by computing a Kullback-Leibler (KL) divergence of the digital artifacts to estimate an evolution transition, such that if the KL divergence is below a predetermined threshold, the pair of digital artifacts are determined to be of an established ARN lineage relation.
23. The system of claim 22, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the computer program product further includes instructions for causing the processor to determine heredity relations for each established ARN lineage relation by comparing common features of their respective HAN clusters to identify unique overlapping shared features.
24. The system of claim 23, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the computer program product further includes instructions for causing the processor to determine provenance by combining HAN clusters with the ARN lineage relations and heredity relations to estimate missing provenance values.
25. The system of claim 1, wherein in determining a lineage, heredity, and provenance of the digital artifacts, the system further models relations between the digital artifacts as an artifact relation network (ARN), the ARN having nodes and links, such that the nodes represent the digital artifacts, annotated with genotype and phenotype feature vectors, and the links represent similarity relationships between the nodes, and the system further determines lineage between each pair of digital artifacts in the ARN by computing a Kullback-Leibler (KL) divergence of the digital artifacts to estimate an evolution transition, such that if the KL divergence is below a predetermined threshold, the pair of digital artifacts are determined to be of an established ARN lineage relation.
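Illustrative sketch (not part of the claims). The lineage test recited in claims 6, 14, 22, and 25 can be pictured with a short Python example. This is a minimal sketch under stated assumptions, not the patented implementation: the claims do not say how feature vectors become probability distributions, which direction the KL divergence is taken in, or what threshold is used. The helper names (normalize, kl_divergence, lineage_links), the additive smoothing constant, the sample feature counts, and the threshold of 0.1 are all hypothetical.

```python
import math
from itertools import combinations


def normalize(counts, eps=1e-9):
    """Turn non-negative feature counts into a smoothed probability distribution.

    The smoothing constant eps is an assumption; it keeps every probability
    strictly positive so the logarithm below is always defined.
    """
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def lineage_links(artifacts, threshold):
    """Return (a, b, divergence) for every pair whose divergence is below threshold.

    artifacts maps a node name to its feature-count vector, standing in for an
    ARN node annotated with feature vectors. KL divergence is asymmetric; this
    sketch arbitrarily computes D(a || b) with a before b in sorted name order.
    """
    dists = {name: normalize(vec) for name, vec in artifacts.items()}
    links = []
    for a, b in combinations(sorted(dists), 2):
        d = kl_divergence(dists[a], dists[b])
        if d < threshold:
            links.append((a, b, d))
    return links


# Hypothetical feature counts for three artifacts.
artifacts = {
    "sample_a": [10, 5, 0, 1],
    "sample_b": [9, 6, 0, 1],
    "sample_c": [0, 1, 12, 7],
}

links = lineage_links(artifacts, threshold=0.1)
for a, b, d in links:
    print(f"{a} -> {b}: KL divergence {d:.4f} (below threshold)")
```

With these sample values only sample_a and sample_b have similar enough distributions to fall below the threshold; sample_c diverges strongly from both and gets no lineage link. A symmetric measure could be substituted if link direction should not depend on the order of the pair.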
December 29, 2015