Re-Training A Machine Learning Model

Published: May 10, 2016

Assignee: not available in USPTO data

Inventors: Stephen Purpura, James E. Walsh, Dustin Lundring Rigg Hillard

Technical Abstract

Patent Claims

22 claims


1. A method performed by one or more computers, the method comprising:
    receiving an ordered sequence of feature vectors;
    for each feature vector of a plurality of feature vectors in the ordered sequence:
        using a predictive model having a plurality of parameters to generate a predicted output for the feature vector, wherein the predictive model has been trained on a plurality of old feature vectors using a model training process that generates respective first parameter values for each of the plurality of parameters of the predictive model,
        identifying recent feature vectors in the ordered sequence, wherein each recent feature vector is within a window of predetermined size preceding the feature vector in the ordered sequence, and
        computing a measure of the quality of the output of the predictive model on the recent feature vectors;
    determining, for a first feature vector, that the quality of the output of the predictive model on first recent feature vectors within a first window of the predetermined size preceding the first feature vector in the ordered sequence has become unacceptable as of the first feature vector, and in response:
        selecting retraining data for retraining the predictive model from a collection of feature vectors consisting of the first recent feature vectors and the plurality of old feature vectors, wherein the ratio of first recent feature vectors to old feature vectors in the retraining data is greater than the corresponding ratio in the collection by an amount based on how unacceptable the quality of the output has become as of the first feature vector, whereby a more unacceptable quality of the output results in the retraining data having a greater ratio of first recent feature vectors to old feature vectors; and
        retraining the predictive model on the retraining data.
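The retraining-data selection step of claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: the function name `select_retraining_data`, the `severity` scale in [0, 1], the linear boost rule, and sampling with replacement are all assumptions chosen for illustration. The only property it aims to reproduce is that the share of recent feature vectors in the retraining data exceeds their share in the full collection, and grows as the quality becomes more unacceptable.

```python
import random


def select_retraining_data(old, recent, severity, size, seed=0):
    """Sample `size` vectors, over-representing `recent` as severity grows.

    `severity` is a hypothetical score in [0, 1]: 0 means the quality is only
    just unacceptable, 1 means it is as bad as it can be.
    """
    if not 0.0 <= severity <= 1.0:
        raise ValueError("severity must be in [0, 1]")
    if not old or not recent:
        raise ValueError("both old and recent vectors are required")
    rng = random.Random(seed)
    # Share of recent vectors in the full collection (the baseline).
    base_share = len(recent) / (len(recent) + len(old))
    # Assumed rule: move the recent share from the baseline toward 1.0.
    # The boost stays within [0.05, 0.95], so recent vectors are always
    # over-represented and some old vectors are always kept.
    boost = 0.05 + 0.9 * severity
    recent_share = base_share + boost * (1.0 - base_share)
    n_recent = round(size * recent_share)
    n_old = size - n_recent
    # Sampling with replacement lets a small window be over-represented.
    return rng.choices(recent, k=n_recent) + rng.choices(old, k=n_old)
```

With 900 old and 100 recent vectors, for example, the baseline share of recent vectors is 10%; under the assumed rule a severity of 0.2 raises it to roughly 31% and a severity of 0.8 to roughly 79%.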

2. The method of claim 1, further comprising: determining that, as of a second feature vector preceding the first feature vector in the ordered sequence, the measure of the quality of the output of the predictive model remains acceptable; and refraining from retraining the predictive model.
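The monitoring behavior described in claims 1 and 2 (keep scoring a fixed-size window of preceding vectors, refrain from retraining while quality is acceptable, and act at the first vector where it is not) might be sketched as follows. This is an illustrative assumption, not the patented implementation: accuracy is used as the quality measure, and `first_unacceptable_index`, `correct_flags`, and `min_accuracy` are hypothetical names.

```python
from collections import deque


def first_unacceptable_index(correct_flags, window, min_accuracy):
    """Return the index of the first vector whose preceding window of
    `window` outcomes has accuracy below `min_accuracy`, or None if the
    quality stays acceptable for the whole sequence.

    `correct_flags[i]` is 1 if the model's prediction for vector i was
    right and 0 otherwise (an assumed way of scoring output quality).
    """
    recent = deque(maxlen=window)
    for i, flag in enumerate(correct_flags):
        # At this point `recent` holds outcomes for vectors preceding i.
        if len(recent) == window:
            accuracy = sum(recent) / window
            if accuracy < min_accuracy:
                return i  # quality became unacceptable as of vector i
        # Quality is still acceptable: refrain from retraining, keep going.
        recent.append(flag)
    return None
```

A caller would retrain only when an index is returned; for every earlier vector the function simply advances the window, which corresponds to the "refraining from retraining" step.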

3. The method of claim 1, wherein the model training process is a gradient descent process.

4. The method of claim 3, wherein the gradient descent process uses a limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) optimization process.
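As a rough illustration of a training process that produces a value for each model parameter, the sketch below fits a logistic regression by plain batch gradient descent. The choice of logistic regression, the learning rate, and the step count are assumptions, and the sketch deliberately does not implement L-BFGS: that optimizer would replace the simple update rule with one that uses a limited history of gradients to approximate curvature.

```python
import math


def train_logistic(features, labels, lr=0.5, steps=2000):
    """Fit weights and a bias by batch gradient descent on mean log loss."""
    n = len(features)
    n_params = len(features[0])
    weights = [0.0] * n_params
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_params
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # derivative of the log loss with respect to z
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Step against the averaged gradient.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return class 1 when the linear score is non-negative, else 0."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if z >= 0 else 0
```

Retraining in this setting would mean calling `train_logistic` again on the newly selected data, yielding a second set of parameter values.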

5. The method of claim 1, wherein determining that the measure of the quality of the output of the predictive model has become unacceptable comprises: computing a measure of the quality of the output of the predictive model on one or more sets of old feature vectors; and determining that the measure of the quality of the output of the predictive model on the first recent feature vectors is unacceptable based on a comparison of the measure of the quality of the output of the predictive model on the first recent feature vectors and the measure of the quality of the output of the predictive model on the one or more sets of old feature vectors.
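The comparison in claim 5 could look something like the sketch below, which judges the recent window against a baseline computed on old data. The claim does not say how the comparison is made; averaging the old-set scores, the `tolerance` margin, and the relative `severity` score are all assumptions introduced here for illustration.

```python
def degradation(recent_quality, old_qualities, tolerance=0.05):
    """Compare quality on recent vectors with quality on old sets.

    Returns (unacceptable, severity). `unacceptable` is True when the
    recent quality falls more than `tolerance` below the mean quality on
    the old sets. `severity` is the relative shortfall clipped to [0, 1],
    a hypothetical measure of "how unacceptable" the quality has become.
    """
    baseline = sum(old_qualities) / len(old_qualities)
    shortfall = baseline - recent_quality
    unacceptable = shortfall > tolerance
    if baseline > 0:
        severity = min(max(shortfall / baseline, 0.0), 1.0)
    else:
        severity = 0.0
    return unacceptable, severity
```

For instance, a recent accuracy of 0.60 against old-set accuracies of 0.90 gives a shortfall of about 0.30, which exceeds the assumed tolerance and yields a severity of about one third; a recent accuracy of 0.88 stays within tolerance.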

6. The method of claim 1, wherein: the plurality of feature vectors comprises vectors representing completed financial product transactions including transaction prices; and the predictive model is a model predicting a next transaction price or a next transaction price direction for one or more financial products.

7. The method of claim 6, wherein the financial products comprise one or more of common stock shares, exchange traded fund shares, options contracts, commodity futures contracts, or financial derivative contracts.

8. The method of claim 6, wherein the predictive model is a model predicting whether a next transaction price for a particular financial product is likely to be at a higher price or at a lower price than a most recent completed transaction.

9. The method of claim 6, wherein the next transaction is a next trade on an electronic exchange.

10. The method of claim 1, wherein: the plurality of feature vectors comprises vectors representing completed credit card transactions or debit card transactions or both; and the predictive model is a model classifying particular transactions as likely being anomalous or not.

11. The method of claim 1, wherein: the plurality of feature vectors comprises vectors representing financial claims processing transactions; and the predictive model is a model classifying particular transactions as likely being anomalous or not.

12. The method of claim 1, wherein: the plurality of feature vectors comprises vectors representing prices for products or services or both at particular times or places or both; and the predictive model is a model predicting prices for products or services or both in particular places or on particular dates or both.

13. The method of claim 1, wherein: the plurality of feature vectors comprises vectors representing purchase transactions representing purchases of products or services or both and including respective prices paid for the products or services or both; and the predictive model is a model predicting prices for products or services in particular places or on particular dates or both.

14. The method of claim 13, wherein: the predictive model is further a model classifying particular prices for particular products or services as likely being anomalous or not.

15. The method of claim 14, wherein: the predictive model is further a model classifying particular prices for particular products or services as likely being fraudulent or not.

16. The method of claim 1, wherein: the plurality of feature vectors comprises vectors representing user actions on an interactive computer-based system; and the predictive model is a model predicting user actions on the interactive computer-based system.

17. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
    receiving an ordered sequence of feature vectors;
    for each feature vector of a plurality of feature vectors in the ordered sequence:
        using a predictive model having a plurality of parameters to generate a predicted output for the feature vector, wherein the predictive model has been trained on a plurality of old feature vectors using a model training process that generates respective first parameter values for each of the plurality of parameters of the predictive model,
        identifying recent feature vectors in the ordered sequence, wherein each recent feature vector is within a window of predetermined size preceding the feature vector in the ordered sequence, and
        computing a measure of the quality of the output of the predictive model on the recent feature vectors;
    determining, for a first feature vector, that the quality of the output of the predictive model on first recent feature vectors within a first window of the predetermined size preceding the first feature vector in the ordered sequence has become unacceptable as of the first feature vector, and in response:
        selecting retraining data for retraining the predictive model from a collection of feature vectors consisting of the first recent feature vectors and the plurality of old feature vectors, wherein the ratio of first recent feature vectors to old feature vectors in the retraining data is greater than the corresponding ratio in the collection by an amount based on how unacceptable the quality of the output has become as of the first feature vector, whereby a more unacceptable quality of the output results in the retraining data having a greater ratio of first recent feature vectors to old feature vectors; and
        retraining the predictive model on the retraining data.

18. The system of claim 17, the operations further comprising: determining that, as of a second feature vector preceding the first feature vector in the ordered sequence, the measure of the quality of the output of the predictive model remains acceptable; and refraining from retraining the predictive model.

19. The system of claim 17, wherein determining that the measure of the quality of the output of the predictive model has become unacceptable comprises: computing a measure of the quality of the output of the predictive model on one or more sets of old feature vectors; and determining that the measure of the quality of the output of the predictive model on the first recent feature vectors is unacceptable based on a comparison of the measure of the quality of the output of the predictive model on the first recent feature vectors and the measure of the quality of the output of the predictive model on the one or more sets of old feature vectors.

20. A non-transitory computer storage medium encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
    receiving an ordered sequence of feature vectors;
    for each feature vector of a plurality of feature vectors in the ordered sequence:
        using a predictive model having a plurality of parameters to generate a predicted output for the feature vector, wherein the predictive model has been trained on a plurality of old feature vectors using a model training process that generates respective first parameter values for each of the plurality of parameters of the predictive model,
        identifying recent feature vectors in the ordered sequence, wherein each recent feature vector is within a window of predetermined size preceding the feature vector in the ordered sequence, and
        computing a measure of the quality of the output of the predictive model on the recent feature vectors;
    determining, for a first feature vector, that the quality of the output of the predictive model on first recent feature vectors within a first window of the predetermined size preceding the first feature vector in the ordered sequence has become unacceptable as of the first feature vector, and in response:
        selecting retraining data for retraining the predictive model from a collection of feature vectors consisting of the first recent feature vectors and the plurality of old feature vectors, wherein the ratio of first recent feature vectors to old feature vectors in the retraining data is greater than the corresponding ratio in the collection by an amount based on how unacceptable the quality of the output has become as of the first feature vector, whereby a more unacceptable quality of the output results in the retraining data having a greater ratio of first recent feature vectors to old feature vectors; and
        retraining the predictive model on the retraining data.

21. The non-transitory computer storage medium of claim 20, the operations further comprising: determining that, as of a second feature vector preceding the first feature vector in the ordered sequence, the measure of the quality of the output of the predictive model remains acceptable; and refraining from retraining the predictive model.

22. The non-transitory computer storage medium of claim 20, wherein determining that the measure of the quality of the output of the predictive model has become unacceptable comprises: computing a measure of the quality of the output of the predictive model on one or more sets of old feature vectors; and determining that the measure of the quality of the output of the predictive model on the first recent feature vectors is unacceptable based on a comparison of the measure of the quality of the output of the predictive model on the first recent feature vectors and the measure of the quality of the output of the predictive model on the one or more sets of old feature vectors.

Patent Metadata

Filing Date: Unknown

Publication Date: May 10, 2016

Inventors: Stephen Purpura, James E. Walsh, Dustin Lundring Rigg Hillard
