US-10929762

Distributable event prediction and machine learning recognition system

Published: February 23, 2021

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

Data is classified using corrected semi-supervised data. Cluster centers are defined for unclassified observations. A class is determined for each cluster. A distance value is computed between a classified observation and each cluster center. When the class of the classified observation is not the class determined for the cluster center having a minimum distance, a first distance value is selected as the minimum distance, a second distance value is selected as the distance value computed to the cluster center having the class of the classified observation, a ratio value is computed between the second distance value and the first distance value, and the class of the classified observation is changed to the class determined for the cluster center having the minimum distance value when the computed ratio value satisfies a label correction threshold. A classification matrix is defined using corrected observations to determine the class for the unclassified observations.

Patent Claims

30 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a computing device cause the computing device to: access a plurality of observation vectors, wherein the plurality of observation vectors includes a plurality of unclassified observation vectors and a plurality of classified observation vectors; define cluster centers for the plurality of unclassified observation vectors using a clustering algorithm, wherein a number of the cluster centers is a number of unique values of a target variable value of the plurality of classified observation vectors, wherein the target variable value is defined to represent a class of each respective observation vector of the plurality of classified observation vectors, wherein the target variable value is not defined to represent the class of each respective observation vector of the plurality of unclassified observation vectors; determine a unique class for each cluster of the defined cluster centers, wherein the unique class is selected from a plurality of classes that each represent unique values of the target variable value of the plurality of classified observation vectors; (A) select a next classified observation vector from the plurality of classified observation vectors; (B) compute a distance value between the selected next classified observation vector and each cluster center of the defined cluster centers; (C) when the target variable value of the selected next classified observation vector is not the unique class determined for a cluster center having a minimum computed distance value, select a first distance value as the minimum computed distance value; select a second distance value as the computed distance value to the cluster center having the unique class of the target variable value of the selected next classified observation vector; compute a ratio value between the selected second distance value and the selected first distance value; and change the 
target variable value of the selected next classified observation vector to the unique class determined for the cluster center having the minimum computed distance value when the computed ratio value satisfies a predefined label correction threshold; (D) repeat (A) through (C) until each observation vector of the plurality of classified observation vectors is selected in (A); (E) define a classification matrix using the plurality of observation vectors; (F) determine the target variable value for each observation vector of the plurality of unclassified observation vectors based on the defined classification matrix; and (G) output the target variable value for each observation vector of the plurality of observation vectors, wherein the target variable value selected for each observation vector of the plurality of observation vectors is defined to represent the class for a respective observation vector.

2. The non-transitory computer-readable medium of claim 1, wherein the cluster centers are defined using a k-means clustering algorithm with the number of the cluster centers.

3. The non-transitory computer-readable medium of claim 1, wherein the distance value is a Euclidean distance.

4. The non-transitory computer-readable medium of claim 1, wherein the ratio value is computed using $r_v = \frac{e^{-d_1}}{\sum_{j=1}^{2} e^{-d_j}}$, where $r_v$ is the ratio value, $d_1$ is the selected first distance value, and $d_2$ is the selected second distance value.

5. The non-transitory computer-readable medium of claim 4, wherein the computed ratio value satisfies the predefined label correction threshold when the computed ratio value is greater than the predefined label correction threshold.
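The label correction step of claims 1, 4, and 5 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the tuple-based data layout, and the use of Euclidean distance (claim 3) are assumptions made for the example. The ratio is evaluated in the algebraically equivalent form 1 / (1 + e^(d1 - d2)) so that large distances do not underflow.

```python
import math

def ratio_value(d1, d2):
    """r_v = exp(-d1) / (exp(-d1) + exp(-d2)), where d1 is the minimum distance.

    Rewritten as 1 / (1 + exp(d1 - d2)); because d1 <= d2 the exponent is
    never positive, so the result always lies in [0.5, 1).
    """
    return 1.0 / (1.0 + math.exp(d1 - d2))

def correct_label(x, label, centers, center_classes, threshold):
    """Return the (possibly corrected) label of one classified observation.

    centers        -- cluster-center coordinates (hypothetical layout)
    center_classes -- the unique class determined for each cluster center
    threshold      -- the predefined label correction threshold
    """
    dists = [math.dist(x, c) for c in centers]            # step (B)
    nearest = min(range(len(centers)), key=dists.__getitem__)
    if center_classes[nearest] == label:
        return label                                      # nothing to correct
    d1 = dists[nearest]                                   # first distance value
    d2 = dists[center_classes.index(label)]               # second distance value
    # Claim 5: the threshold is satisfied when the ratio exceeds it.
    if ratio_value(d1, d2) > threshold:
        return center_classes[nearest]
    return label
```

With centers at (0, 0) and (10, 0) for classes "a" and "b", an observation at (1, 0) labeled "b" gives a ratio close to 1 and is relabeled "a" under a threshold of 0.9, while an observation at (4.9, 0) labeled "b" gives a ratio near 0.55 and keeps its label.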

6. The non-transitory computer-readable medium of claim 1, wherein defining the cluster centers comprises: executing the clustering algorithm to assign each observation vector of the plurality of unclassified observation vectors to a cluster of the number of the cluster centers; and computing an average of each observation vector assigned to a common cluster to define a cluster center for the common cluster.

7. The non-transitory computer-readable medium of claim 6, wherein determining the unique class for each cluster of the defined cluster centers comprises: defining a class center for each unique value of the target variable value of the plurality of classified observation vectors by computing an average of each observation vector of the plurality of classified observation vectors having a common unique value of the target variable value; and computing a cluster distance value between each cluster center of the defined cluster centers and each defined class center to define cluster distance values for each respective cluster center; wherein the unique class determined for each cluster center of the defined cluster centers is the unique value of the target variable value associated with the defined class center that has a minimum value of the cluster distance values defined for each respective cluster center.
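The class assignment described in claims 6 and 7 can be illustrated with a short sketch. The helper names and the list-of-tuples layout are hypothetical, and Euclidean distance is assumed as in claim 8. A plain nearest-class-center rule like this one does not by itself prevent two clusters from receiving the same class, so it should be read as an illustration of the distance comparison only.

```python
import math

def mean_vector(rows):
    """Component-wise average of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def class_centers(X, y):
    """Average the classified observations that share a target variable value."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {label: mean_vector(rows) for label, rows in groups.items()}

def assign_cluster_classes(cluster_centers, centers_by_class):
    """For each cluster center, pick the class whose class center is nearest."""
    return [
        min(centers_by_class, key=lambda lab: math.dist(c, centers_by_class[lab]))
        for c in cluster_centers
    ]
```

For example, with classified points (0, 0) and (0, 2) in class "a" and (10, 0) and (10, 2) in class "b", the class centers are (0, 1) and (10, 1), so cluster centers at (9, 1) and (1, 1) are assigned "b" and "a" respectively.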

8. The non-transitory computer-readable medium of claim 7, wherein the computed cluster distance value is a Euclidean distance.

9. The non-transitory computer-readable medium of claim 1, wherein after (D) and before (E), the computer-readable instructions further cause the computing device to: compute a consistency measure based on a number of times the computed ratio value did not satisfy the label correction threshold; compute a within class covariance matrix from the plurality of classified observation vectors; compute a between class covariance matrix from the plurality of classified observation vectors; compute an eigenvector of a quotient of the computed between class covariance matrix and the computed within class covariance matrix; define a noise score value as an eigenvalue of the computed eigenvector; and compute a weighted score value from the computed consistency measure and the defined noise score value, wherein an algorithm used to define the classification matrix in (E) is selected based on the computed weighted score value.

10. The non-transitory computer-readable medium of claim 9, wherein the consistency measure is computed as the number of times the computed ratio value satisfies the predefined label correction threshold divided by a number of the plurality of classified observation vectors.

11. The non-transitory computer-readable medium of claim 9, wherein the within class covariance matrix is computed using $S_w = \sum_{i=1}^{n_l} (x_i - \mu_{y_i})(x_i - \mu_{y_i})^{\top}$, where $S_w$ is the within class covariance matrix, $n_l$ is a number of the plurality of classified observation vectors, $x_i$ is an $i$th observation vector of the plurality of classified observation vectors, $y_i$ is the target variable value of $x_i$, $\mu_{y_i}$ is a sample mean vector computed for a subset of the plurality of classified observation vectors having $y_i$ as the target variable value, and $\top$ indicates a transpose.

12. The non-transitory computer-readable medium of claim 9, wherein the between class covariance matrix is computed using $S_b = \sum_{k=1}^{c} n_k (\mu_k - \mu)(\mu_k - \mu)^{\top}$, where $S_b$ is the between class covariance matrix, $c$ is the number of unique values of the target variable value of the plurality of classified observation vectors, $n_k$ is a number of observation vectors of the plurality of classified observation vectors having the target variable value associated with a $k$th class, $\mu_k$ is a sample mean vector computed for a subset of the plurality of classified observation vectors having the target variable value associated with the $k$th class, $\mu$ is a sample mean vector computed for the plurality of classified observation vectors, and $\top$ indicates a transpose.

13. The non-transitory computer-readable medium of claim 9, wherein the eigenvector is computed using $\underset{w}{\operatorname{argmax}} \frac{w^{\top} S_b w}{w^{\top} S_w w}$, where $w$ is the eigenvector, $S_b$ is the between class covariance matrix, $S_w$ is the within class covariance matrix, and $\top$ indicates a transpose.
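The quantities in claims 11 through 13 can be sketched with NumPy as below. The function names are hypothetical. The maximizer of the ratio in claim 13 is taken here as the leading eigenvector of $S_w^{-1} S_b$, which is the standard solution of that generalized Rayleigh quotient when $S_w$ is invertible; treating the matching eigenvalue as the noise score of claim 9 is this example's reading of the claim, not a confirmed detail of the patented system.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (S_w) and between-class (S_b) matrices of claims 11 and 12."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mu = X.mean(axis=0)                      # overall sample mean
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for k in np.unique(y):
        X_k = X[y == k]
        mu_k = X_k.mean(axis=0)              # class sample mean
        centered = X_k - mu_k
        S_w += centered.T @ centered         # sum of (x_i - mu_k)(x_i - mu_k)^T
        diff = (mu_k - mu)[:, None]
        S_b += len(X_k) * (diff @ diff.T)    # n_k (mu_k - mu)(mu_k - mu)^T
    return S_w, S_b

def noise_score(S_w, S_b):
    """Largest eigenvalue of S_w^{-1} S_b and its eigenvector.

    Assumes S_w is non-singular; a regularized or pseudo-inverse variant
    would be needed otherwise.
    """
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    top = int(np.argmax(eigvals.real))
    return float(eigvals.real[top]), eigvecs[:, top].real
```

For two well-separated square clusters centered at (1, 1) and (11, 1) with four points each, this yields $S_w = 8I$, $S_b = \mathrm{diag}(200, 0)$, and a leading eigenvalue of 25 along the first axis.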

14. The non-transitory computer-readable medium of claim 9, wherein the weighted score value is computed using $W_s = N_{sv} + \alpha C_m$, where $W_s$ is the weighted score value, $N_{sv}$ is the defined noise score value, $C_m$ is the computed consistency measure, and $\alpha$ is a predefined weight value.

15. The non-transitory computer-readable medium of claim 9, wherein a first algorithm to define the classification matrix in (E) is selected when the computed weight score is greater than a predefined algorithm selection threshold and a second algorithm to define the classification matrix in (E) is selected when the computed weight score is less than the predefined algorithm selection threshold.
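The selection logic of claims 10, 14, and 15 reduces to a few lines. The sketch below uses hypothetical function names and string tags for the two algorithms. The claims do not say what happens when the score equals the threshold, so routing that case to the second algorithm is an assumption of this example.

```python
def consistency_measure(n_satisfied, n_classified):
    """Claim 10: threshold-satisfying ratio count over the classified count."""
    return n_satisfied / n_classified

def weighted_score(noise_score_value, consistency, alpha):
    """Claim 14: W_s = N_sv + alpha * C_m."""
    return noise_score_value + alpha * consistency

def select_algorithm(score, selection_threshold):
    """Claim 15: first algorithm above the threshold, second below it.

    Equality is not covered by the claim; it falls to "second" here
    purely as an assumption.
    """
    return "first" if score > selection_threshold else "second"
```

For instance, 5 corrections among 20 classified observations give a consistency measure of 0.25; with a noise score of 1.0 and a weight of 2.0 the weighted score is 1.5, which selects the first algorithm under a threshold of 1.0.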

16. The non-transitory computer-readable medium of claim 15, wherein defining the classification matrix in (E) using the first algorithm comprises: (H) computing a weight matrix using a kernel function applied to the plurality of observation vectors; (I) performing a decomposition of the computed weight matrix to define a decomposition matrix; (J) selecting a predefined number of eigenvectors from the defined decomposition matrix to define a second decomposition matrix, wherein the predefined number of eigenvectors have smallest eigenvalues relative to other eigenvectors not selected from the decomposed weight matrix; (K) computing a gradient value as a function of the defined second decomposition matrix, a plurality of sparse coefficients, and a label vector defined from the plurality of observation vectors based on the target variable value; (L) updating a value of each coefficient of the plurality of sparse coefficients based on the computed gradient value; (M) repeating (K) and (L) until a convergence parameter value indicates the plurality of sparse coefficients have converged, wherein the classification matrix is defined using the converged plurality of sparse coefficients.

17. The non-transitory computer-readable medium of claim 16, wherein the gradient value is computed using $\nabla_{a_i} = C_{3,i} + L_2$, where $\nabla_{a_i}$ is the gradient value for $a_i$, $a_i$ is an $i$th coefficient of the plurality of sparse coefficients, $C_3 = V_m^{\top}(V_m a - Y)$ is an $m \times 1$ vector, $m$ is the predefined number of eigenvectors, $C_{3,i}$ is an $i$th element of $C_3$, $V_m$ is the defined second decomposition matrix, $a$ is the plurality of sparse coefficients, $Y$ is the label vector, $\top$ indicates a transpose, and $L_2$ is an $L_2$-norm term.

18. The non-transitory computer-readable medium of claim 17, wherein $L_2 = 2 \lambda_2 \Sigma_{ii}^{1/2} \sum_{i=1}^{m} \Sigma_{ii}^{1/2} a_i$, where $\lambda_2$ is a first predefined regularization parameter value.

19. The non-transitory computer-readable medium of claim 16, wherein the classification matrix is defined using $F = V_m a$, where $F$ is the classification matrix, $V_m$ is the defined second decomposition matrix, $m$ is the predefined number of eigenvectors, and $a$ is the plurality of sparse coefficients.

20. The non-transitory computer-readable medium of claim 16, wherein updating the value of each coefficient of the plurality of sparse coefficients comprises: computing a difference value using the defined decomposition matrix and the computed gradient value; wherein the value is updated using $a_k = \max\left\{ |\Delta_k| - \frac{\lambda_1}{\|V_m\|_s},\ 0 \right\}$, where $a_k$ is a $k$th coefficient of the plurality of sparse coefficients, $\Delta_k$ is the computed difference value for the $k$th coefficient of the plurality of sparse coefficients, $\lambda_1$ is a predefined regularization parameter value, $V_m$ is the defined second decomposition matrix, $m$ is the predefined number of eigenvectors, $|\Delta_k|$ indicates an absolute value of $\Delta_k$, and $\|V_m\|_s$ indicates a spectral norm of $V_m$.

21. The non-transitory computer-readable medium of claim 20, wherein the difference value is computed using $\Delta_k = a_k - \frac{\nabla_{a_k}(C_1(a_k))}{\|V_m\|_s^{2}}$, where $\nabla_{a_k}(C_1(a_k))$ is the gradient value for the $k$th coefficient of the plurality of sparse coefficients.
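The coefficient update of claims 20 and 21 is a gradient step followed by a thresholding step, and can be sketched as below. The function and parameter names are hypothetical, and the gradient is passed in as a precomputed number. The sketch follows the claim text literally: the textbook soft-thresholding operator also multiplies the result by the sign of $\Delta_k$, but no sign factor appears in the claim as reproduced here, so none is applied.

```python
def difference_value(a_k, grad_k, spectral_norm):
    """Claim 21: Delta_k = a_k - grad_k / ||V_m||_s^2 (a gradient step)."""
    return a_k - grad_k / spectral_norm ** 2

def update_coefficient(a_k, grad_k, lambda1, spectral_norm):
    """Claim 20: a_k = max(|Delta_k| - lambda1 / ||V_m||_s, 0).

    Coefficients whose stepped magnitude falls below the shrinkage amount
    are set to exactly zero, which is what makes the solution sparse.
    """
    delta_k = difference_value(a_k, grad_k, spectral_norm)
    return max(abs(delta_k) - lambda1 / spectral_norm, 0.0)
```

With $a_k = 1$, a gradient of 2, $\lambda_1 = 0.5$, and a spectral norm of 2, the difference value is 0.5 and the updated coefficient is 0.25. A small coefficient such as 0.1 with zero gradient, $\lambda_1 = 1$, and unit spectral norm is shrunk to 0.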

22. The non-transitory computer-readable medium of claim 16, wherein computing the weight matrix comprises: computing a first matrix using the kernel function and the plurality of observation vectors; and computing a diagonal matrix by summing each row of the computed first matrix, wherein the sum of each row is stored in a diagonal of a respective row with zeroes in remaining positions of the respective row; wherein the weight matrix is computed using the first matrix and the diagonal matrix.

23. The non-transitory computer-readable medium of claim 22, wherein the weight matrix is computed using $W = Z D^{-1/2} D^{-1/2} Z$, where $W$ is the weight matrix, $D$ is the computed diagonal matrix, and $Z$ is the computed first matrix.

24. The non-transitory computer-readable medium of claim 23, wherein the decomposition of the computed weight matrix is a singular value decomposition.

25. The non-transitory computer-readable medium of claim 22, wherein the first matrix is computed using $z_{ij} = \frac{\exp\left(-\frac{\|x_i - u_j\|^2}{2 s^2}\right)}{\sum_{k \in N_r(i)} \exp\left(-\frac{\|x_i - u_k\|^2}{2 s^2}\right)}$, $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, t$, where $z_{ij}$ is an $i,j$th entry of the first matrix, $s$ is a predefined kernel parameter value, $x_i$ is an $i$th observation vector of the plurality of observation vectors, $r$ is a predefined number of nearest cluster centers, $N_r(i)$ is an index to a nearest cluster center to $x_i$ of the predefined number of nearest cluster centers, $t$ is the number of unique values of the target variable value of the plurality of classified observation vectors, $u_j$ is a $j$th cluster center selected from the $t$ cluster centers, and $N$ is a number of the plurality of observation vectors.
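The first matrix of claim 25 can be computed row by row, as in the sketch below. The function name and list-based layout are hypothetical. The sketch reads the claim literally: every entry $z_{ij}$ uses the Gaussian kernel in the numerator, and only the denominator is restricted to the $r$ nearest cluster centers, so rows sum to 1 only when $r = t$.

```python
import math

def first_matrix(X, centers, s, r):
    """Return Z as a list of rows, one per observation vector.

    X       -- observation vectors (N of them)
    centers -- the t cluster centers u_1..u_t
    s       -- predefined kernel parameter value
    r       -- predefined number of nearest cluster centers
    """
    Z = []
    for x in X:
        sq_dists = [math.dist(x, u) ** 2 for u in centers]
        kernel = [math.exp(-d / (2.0 * s * s)) for d in sq_dists]
        # N_r(i): indices of the r cluster centers closest to x
        nearest = sorted(range(len(centers)), key=sq_dists.__getitem__)[:r]
        denom = sum(kernel[k] for k in nearest)
        Z.append([k_j / denom for k_j in kernel])
    return Z
```

For one observation at (0, 0) with centers at (0, 0) and (2, 0) and $s = 1$, the kernel values are 1 and $e^{-2}$. Using $r = 2$ normalizes the row to sum to 1, while $r = 1$ leaves it as $[1, e^{-2}]$.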

26. The non-transitory computer-readable medium of claim 25, wherein the t cluster centers are determined using a k-means clustering algorithm to cluster the plurality of observation vectors.

27. The non-transitory computer-readable medium of claim 15, wherein defining the classification matrix in (E) using the second algorithm comprises: computing an affinity matrix using a kernel function with a predefined kernel parameter value and the plurality of observation vectors; computing a diagonal matrix by summing each row of the computed affinity matrix, wherein the sum of each row is stored in a diagonal of the row with zeroes in remaining positions of the row; computing a normalized distance matrix using the computed affinity matrix and the computed diagonal matrix; and defining a label matrix using the target variable value of each observation vector of the plurality of classified observation vectors; and computing a converged classification matrix by updating the defined label matrix, wherein the converged classification matrix defines a class probability for each unique class of the plurality of classes, wherein the classification matrix is defined using the converged classification matrix.

28. The non-transitory computer-readable medium of claim 27, wherein the classification matrix is converged using $F(t+1) = \alpha S F(t) + (1 - \alpha) Y$, where $F(t+1)$ is a next classification matrix, $\alpha$ is a predefined weight value, $S$ is the computed normalized distance matrix, $F(t)$ is the classification matrix, $Y$ is the defined label matrix, and $t$ is an iteration number of computing the converged classification matrix.
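The iteration in claims 27 and 28 can be sketched with NumPy as follows. Several details are assumptions of this example, because the claims leave them open: the Gaussian form of the kernel, the zeroed diagonal of the affinity matrix, the symmetric normalization $S = D^{-1/2} W D^{-1/2}$, the convergence test, and the convention that unclassified observations carry the label `None`.

```python
import numpy as np

def label_spreading(X, labels, n_classes, s=1.0, alpha=0.5,
                    tol=1e-9, max_iter=1000):
    """Iterate F(t+1) = alpha * S @ F(t) + (1 - alpha) * Y until it settles.

    labels -- class index per observation, or None when unclassified
    """
    X = np.asarray(X, dtype=float)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq / (2.0 * s ** 2))        # affinity matrix (assumed Gaussian)
    np.fill_diagonal(W, 0.0)                # assumption: no self-affinity
    d = W.sum(axis=1)                       # row sums, the diagonal of D
    S = W / np.sqrt(np.outer(d, d))         # assumed D^{-1/2} W D^{-1/2}

    Y = np.zeros((len(X), n_classes))       # label matrix
    for i, lab in enumerate(labels):
        if lab is not None:
            Y[i, lab] = 1.0

    F = Y.copy()
    for _ in range(max_iter):
        F_next = alpha * (S @ F) + (1.0 - alpha) * Y
        converged = np.abs(F_next - F).max() < tol
        F = F_next
        if converged:
            break
    return F
```

Taking the row-wise argmax of the returned matrix gives a class for every observation. With points at 0, 0.1, 5, and 5.1 on a line and only the first and third labeled (classes 0 and 1), the two unlabeled points inherit the class of their close neighbor.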

29. A computing device comprising: a processor; and a computer-readable medium operably coupled to the processor, the computer-readable medium having computer-readable instructions stored thereon that, when executed by the processor, cause the computing device to access a plurality of observation vectors, wherein the plurality of observation vectors includes a plurality of unclassified observation vectors and a plurality of classified observation vectors; define cluster centers for the plurality of unclassified observation vectors using a clustering algorithm, wherein a number of the cluster centers is a number of unique values of a target variable value of the plurality of classified observation vectors, wherein the target variable value is defined to represent a class of each respective observation vector of the plurality of classified observation vectors, wherein the target variable value is not defined to represent the class of each respective observation vector of the plurality of unclassified observation vectors; determine a unique class for each cluster of the defined cluster centers, wherein the unique class is selected from a plurality of classes that each represent unique values of the target variable value of the plurality of classified observation vectors; (A) select a next classified observation vector from the plurality of classified observation vectors; (B) compute a distance value between the selected next classified observation vector and each cluster center of the defined cluster centers; (C) when the target variable value of the selected next classified observation vector is not the unique class determined for a cluster center having a minimum computed distance value, select a first distance value as the minimum computed distance value; select a second distance value as the computed distance value to the cluster center having the unique class of the target variable value of the selected next classified observation vector; compute a ratio value 
between the selected second distance value and the selected first distance value; and change the target variable value of the selected next classified observation vector to the unique class determined for the cluster center having the minimum computed distance value when the computed ratio value satisfies a predefined label correction threshold; (D) repeat (A) through (C) until each observation vector of the plurality of classified observation vectors is selected in (A); (E) define a classification matrix using the plurality of observation vectors; (F) determine the target variable value for each observation vector of the plurality of unclassified observation vectors based on the defined classification matrix; and (G) output the target variable value for each observation vector of the plurality of observation vectors, wherein the target variable value selected for each observation vector of the plurality of observation vectors is defined to represent the class for a respective observation vector.

30. A method of classifying data using semi-supervised data, the method comprising: accessing, by a computing device, a plurality of observation vectors, wherein the plurality of observation vectors includes a plurality of unclassified observation vectors and a plurality of classified observation vectors; defining, by the computing device, cluster centers for the plurality of unclassified observation vectors using a clustering algorithm, wherein a number of the cluster centers is a number of unique values of a target variable value of the plurality of classified observation vectors, wherein the target variable value is defined to represent a class of each respective observation vector of the plurality of classified observation vectors, wherein the target variable value is not defined to represent the class of each respective observation vector of the plurality of unclassified observation vectors; determining, by the computing device, a unique class for each cluster of the defined cluster centers, wherein the unique class is selected from a plurality of classes that each represent unique values of the target variable value of the plurality of classified observation vectors; (A) selecting, by the computing device, a next classified observation vector from the plurality of classified observation vectors; (B) computing, by the computing device, a distance value between the selected next classified observation vector and each cluster center of the defined cluster centers; (C) when the target variable value of the selected next classified observation vector is not the unique class determined for a cluster center having a minimum computed distance value, selecting, by the computing device, a first distance value as the minimum computed distance value; selecting, by the computing device, a second distance value as the computed distance value to the cluster center having the unique class of the target variable value of the selected next classified observation vector; 
computing, by the computing device, a ratio value between the selected second distance value and the selected first distance value; and changing, by the computing device, the target variable value of the selected next classified observation vector to the unique class determined for the cluster center having the minimum computed distance value when the computed ratio value satisfies a predefined label correction threshold; (D) repeating, by the computing device, (A) through (C) until each observation vector of the plurality of classified observation vectors is selected in (A); (E) defining, by the computing device, a classification matrix using the plurality of observation vectors; (F) determining, by the computing device, the target variable value for each observation vector of the plurality of unclassified observation vectors based on the defined classification matrix; and (G) outputting, by the computing device, the target variable value for each observation vector of the plurality of observation vectors, wherein the target variable value selected for each observation vector of the plurality of observation vectors is defined to represent the class for a respective observation vector.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06N

Patent Metadata

Filing Date

July 28, 2020

Publication Date

February 23, 2021
