Legal claims defining the scope of protection, as filed with the USPTO.
1. A computer-implemented method comprising: calculating one or more types of suggestion scores for each of a plurality of training examples, wherein each type of suggestion score is based at least in part on a plurality of computed predictions for each training example generated by a plurality of different trained models, including weighting each type of suggestion score by an accuracy of the trained model that generated the prediction; calculating an overall suggestion score for each training example based at least in part on a combination of the one or more types of suggestion scores for each training example; ranking the training examples by the corresponding overall suggestion scores; and providing one or more highest-ranked training examples as a set of suggested training examples.
2. The method of claim 1, further comprising providing one or more of highest-ranked training examples in response to a request.
3. The method of claim 1, wherein one of the one or more types of suggestion scores is an ambiguity score, wherein the ambiguity score for a particular training example in the training examples is based on an answer distribution of a training example between two or more categories.
4. The method of claim 1, wherein one of the one or more types of suggestion scores is a difficulty score, wherein the difficulty score for a particular training example in the training examples is based on comparing a confidence associated with an incorrectly predicted category for the training example to a threshold.
5. The method of claim 1, wherein one of the one or more types of suggestion scores is a sparseness score, wherein the sparseness score for a particular training example in the training examples is based on comparing a count of training examples for a particular category or feature space of each training example to a threshold.
6. The method of claim 1, wherein one of the one or more types of suggestion scores is a sparseness score, wherein the sparseness score for a particular training example in the training examples is based on comparing a distribution of training example answers to the answer of a particular training example.
7. The method of claim 1, further comprising: obtaining a user-defined utility for each of one or more predicted categories, wherein utility is a measure of importance for the category, wherein calculating one or more types of suggestion scores for a particular training example comprises calculating each of the one or more types of suggestion scores weighted by the user-defined utility of a predicted category of the particular training example.
8. The method of claim 1, further comprising: requesting one or more additional training examples based on one or more of the highest-ranked training examples; receiving one or more additional training examples in response to the request; updating each trained model using the one or more received additional training examples; recalculating suggestion scores for each of the plurality of training examples and the one or more received additional training examples; and providing one or more highest-ranked training examples based on the recalculated suggestion scores.
9. A system comprising: one or more data processing apparatus; and a computer-readable storage device having stored thereon instructions that, when executed by the one or more data processing apparatus, cause the one or more data processing apparatus to perform operations comprising: calculating one or more types of suggestion scores for each of a plurality of training examples, wherein each type of suggestion score is based at least in part on a plurality of computed predictions for each training example generated by a plurality of different trained models, including weighting each type of suggestion score by an accuracy of the trained model that generated the prediction; calculating an overall suggestion score for each training example based at least in part on a combination of the one or more types of suggestion scores for each training example; ranking the training examples by the corresponding overall suggestion scores; and providing one or more highest-ranked training examples as a set of suggested training examples.
10. The system of claim 9, wherein the operations further comprise providing one or more of highest-ranked training examples in response to a request.
11. The system of claim 9, wherein one of the one or more types of suggestion scores is an ambiguity score, wherein the ambiguity score for a particular training example in the training examples is based on an answer distribution of a training example between two or more categories.
12. The system of claim 9, wherein one of the one or more types of suggestion scores is a difficulty score, wherein the difficulty score for a particular training example in the training examples is based on comparing a confidence associated with an incorrectly predicted category for the training example to a threshold.
13. The system of claim 9, wherein one of the one or more types of suggestion scores is a sparseness score, wherein the sparseness score for a particular training example in the training examples is based on comparing a count of training examples for a particular category or feature space of each training example to a threshold.
14. The system of claim 9, wherein one of the one or more types of suggestion scores is a sparseness score, wherein the sparseness score for a particular training example in the training examples is based on comparing a distribution of training example answers to the answer of a particular training example.
15. The system of claim 9, wherein the operations further comprise: obtaining a user-defined utility for each of one or more predicted categories, wherein utility is a measure of importance for the category, wherein calculating one or more types of suggestion scores for a particular training example comprises calculating each of the one or more types of suggestion scores weighted by the user-defined utility of a predicted category of the particular training example.
16. The system of claim 9, wherein the operations further comprise: requesting one or more additional training examples based on one or more of the highest-ranked training examples; receiving one or more additional training examples in response to the request; updating each trained model using the one or more received additional training examples; recalculating suggestion scores for each of the plurality of training examples and the one or more received additional training examples; and providing one or more highest-ranked training examples based on the recalculated suggestion scores.
17. A computer-readable storage device having stored thereon instructions, which, when executed by data processing apparatus, cause the data processing apparatus to perform operations comprising: calculating one or more types of suggestion scores for each of a plurality of training examples, wherein each type of suggestion score is based at least in part on a plurality of computed predictions for each training example generated by a plurality of different trained models, including weighting each type of suggestion score by an accuracy of the trained model that generated the prediction; calculating an overall suggestion score for each training example based at least in part on a combination of the one or more types of suggestion scores for each training example; ranking the training examples by the corresponding overall suggestion scores; and providing one or more highest-ranked training examples as a set of suggested training examples.
18. The storage device of claim 17, wherein the operations further comprise providing one or more of highest-ranked training examples in response to a request.
19. The storage device of claim 17, wherein one of the one or more types of suggestion scores is an ambiguity score, wherein the ambiguity score for a particular training example in the training examples is based on an answer distribution of a training example between two or more categories.
20. The storage device of claim 17, wherein one of the one or more types of suggestion scores is a difficulty score, wherein the difficulty score for a particular training example in the training examples is based on comparing a confidence associated with an incorrectly predicted category for the training example to a threshold.
21. The storage device of claim 17, wherein one of the one or more types of suggestion scores is a sparseness score, wherein the sparseness score for a particular training example in the training examples is based on comparing a count of training examples for a particular category or feature space of each training example to a threshold.
22. The storage device of claim 17, wherein one of the one or more types of suggestion scores is a sparseness score, wherein the sparseness score for a particular training example in the training examples is based on comparing a distribution of training example answers to the answer of a particular training example.
23. The storage device of claim 17, wherein the operations further comprise: obtaining a user-defined utility for each of one or more predicted categories, wherein utility is a measure of importance for the category, wherein calculating one or more types of suggestion scores for a particular training example comprises calculating each of the one or more types of suggestion scores weighted by the user-defined utility of a predicted category of the particular training example.
24. The storage device of claim 17, wherein the operations further comprise: requesting one or more additional training examples based on one or more of the highest-ranked training examples; receiving one or more additional training examples in response to the request; updating each trained model using the one or more received additional training examples; recalculating suggestion scores for each of the plurality of training examples and the one or more received additional training examples; and providing one or more highest-ranked training examples based on the recalculated suggestion scores.
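For illustration only, the scoring and ranking steps recited in the independent claims can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the claims do not fix any formula, so the names `Prediction`, `ambiguity_score`, `difficulty_score`, and `suggest`, the use of normalized entropy for ambiguity, the 0.8 confidence threshold, and the equal-weight combination are all hypothetical choices. The sparseness scores and the user-defined utility weighting of the dependent claims are omitted.

```python
from collections import Counter
from dataclasses import dataclass
from math import log


@dataclass
class Prediction:
    category: str      # category predicted by one trained model
    confidence: float  # that model's confidence in its prediction, in [0, 1]
    accuracy: float    # accuracy of the model that made the prediction, in [0, 1]


def ambiguity_score(preds):
    """Normalized entropy of the accuracy-weighted answer distribution (assumed formula)."""
    votes = Counter()
    for p in preds:
        votes[p.category] += p.accuracy  # weight each vote by model accuracy
    total = sum(votes.values())
    if total == 0 or len(votes) < 2:
        return 0.0  # all models agree, so there is no ambiguity
    entropy = -sum((v / total) * log(v / total) for v in votes.values() if v > 0)
    return entropy / log(len(votes))


def difficulty_score(preds, label, threshold=0.8):
    """Accuracy-weighted share of models that are confidently wrong (assumed formula)."""
    total = sum(p.accuracy for p in preds)
    if total == 0:
        return 0.0
    wrong = sum(p.accuracy for p in preds
                if p.category != label and p.confidence >= threshold)
    return wrong / total


def suggest(examples, k=1):
    """Rank examples by overall score and return the ids of the top k.

    `examples` maps an example id to (true label, list of Prediction).
    The overall score is an equal-weight average, which is an assumption.
    """
    overall = {
        ex_id: 0.5 * ambiguity_score(preds) + 0.5 * difficulty_score(preds, label)
        for ex_id, (label, preds) in examples.items()
    }
    ranked = sorted(overall, key=overall.get, reverse=True)
    return ranked[:k]


# Two hypothetical models with accuracies 0.9 and 0.7 score three examples.
examples = {
    "a": ("cat", [Prediction("cat", 0.9, 0.9), Prediction("cat", 0.8, 0.7)]),
    "b": ("cat", [Prediction("cat", 0.6, 0.9), Prediction("dog", 0.9, 0.7)]),
    "c": ("dog", [Prediction("dog", 0.7, 0.9), Prediction("cat", 0.5, 0.7)]),
}
suggested = suggest(examples, k=2)
print(suggested)  # ['b', 'c']
```

In this toy data, example "b" ranks first because the models disagree and one of them is confidently wrong; "c" shows the same disagreement but the wrong model's confidence is below the threshold; "a" scores zero because both models agree with the label.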
December 10, 2013