Probability Density Ratio Estimation

PublishedAugust 17, 2021

Assigneenot available in USPTO data we have

Technical Abstract

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method comprising: obtaining a first sample set including a plurality of first samples and a second sample set including a plurality of second samples, wherein each of the first samples and the second samples is represented as a vector including a plurality of parameters, constructing at least one decision tree for estimating a ratio of probability density p(x)/q(x) based on the first sample set and the second sample set, wherein p(x) is a probability density of the first samples corresponding to an input vector x and q(x) is a probability density of the second samples corresponding to the input vector x such that computational resources needed for estimating the ratio of the probability density is reduced without impairing accuracy of estimation, the constructing at least one decision tree including using a single branching condition existing at the particular leaf node to generate a plurality of three or more new nodes therefrom and determining a parameter among the plurality of parameters and a threshold for a target node of the at least one decision tree that minimize an output of an objective function corresponding to a determined branching condition, the branching condition being determined using a greedy search.

2. The method of claim 1 , wherein the constructing at least one decision tree includes constructing a plurality of decision trees, each decision tree outputting a corresponding ratio of probability density p(x)/q(x).

3. The method of claim 2 , further comprising: sampling one or more first samples from the plurality of first samples to generate a first subset of the first sample set, and sampling one or more second samples from the plurality of second samples to generate a second subset of the second sample set, wherein at least one of the plurality of decision trees outputs a ratio of probability density p(x)/q(x) based on the first subset and the second subset.

4. The method of claim 3 , wherein the sampling the one or more first samples and the sampling the one or more second samples is performed based on a bootstrap algorithm.

5. The method of claim 1 , wherein the decision tree outputs a ratio of a rate of the first samples allocated to a leaf node of the decision tree and a rate of the second samples allocated to the leaf node of the decision tree.

6. The method of claim 1 , wherein the constructing at least one decision tree includes: determining a branching condition at a target node of the at least one decision tree, and branching the target node according to the branching condition to generate a plurality of new nodes by dividing the first samples of the target node according to the branching condition, and dividing the second samples of the target node according to the branching condition.

7. The method of claim 6 , wherein the objective function approximates a squared error of an estimation of the ratio of probability density.

8. The method of claim 7 , wherein the output of the objective function is positively correlated with the output of the probability density q(x).

9. The method of claim 8 , wherein the objective function is based on: rates of the first samples at nodes branched from the target node according to the branching condition to the first samples of the decision tree and, rates of the second samples at nodes branched from the target node according to the branching condition to the second samples of the decision tree.

10. The method of claim 9 , wherein the objective function is at least partially represented by: 1 M l ⁢ ∑ m = 1 M l ⁢ ⁢ r ^ ⁡ ( x q ( m ) ) 2 - 2 ⁢ 1 N l ⁢ ∑ n = 1 N l ⁢ ⁢ r ^ ⁡ ( x p ( n ) ) where N 1 is a number of the first samples at the target node of the decision tree, M 1 is a number of the second samples at the target node of the decision tree, x p (n) represents a vector corresponding to (n)-th sample in the N 1 first samples, x q (m) represents a vector corresponding to (m)-th sample in the M 1 second samples, and r{circumflex over ( )}(x) represents a density ratio function inputting the vector x and outputting the ratio of probability density at the target node corresponding to the vector x based on: the rates of the first samples at the nodes branched from the target nodes according to the branching condition to the first samples of the decision tree and, the rates of the second samples at the nodes branched from the target node according to the branching condition to the second samples of the decision tree.

11. The method of claim 6 , wherein the determining a branching condition includes determining a parameter and a threshold for the target node randomly.

12. The method of claim 2 , further comprising: obtaining a target vector x T , inputting the target vector x T into the plurality of decision trees, obtaining a plurality of outputs from the plurality of decision trees, and estimating a ratio of probability density p(x T )/q(x T ) based on the plurality of outputs from the plurality of decision trees.

13. The method of claim 12 , wherein the estimating the ratio of probability density p(x T )/q(x T ) includes calculating an average of the plurality of outputs from the plurality of decision trees.

14. A computer-implemented method, comprising: obtaining a first sample set including a plurality of first samples and a second sample set including a plurality of second samples, wherein each of the first samples and the second samples is represented as a vector including a plurality of parameters, constructing at least one decision tree for estimating a ratio of probability density p(x)/q(x) for a particular leaf node thereof for a machine learning application based on the first sample set and the second sample set, wherein p(x) is a probability density of the first samples corresponding to an input vector x and q(x) is a probability density of the second samples corresponding to the input vector x such that the decision tree outputs a ratio of a rate of the first samples allocated to the particular leaf node by the input vector x over a ratio of a rate of the second samples allocated to the particular leaf node by the input vector x as the ratio of probability density p(x)/q(x) when the input vector x allocated to the particular leaf node has reached the particular leaf node, such that computational resources needed for estimating the ratio of the probability density is reduced without impairing accuracy of estimation, wherein the constructing at least one decision tree includes using a single branching condition existing at the particular leaf node to generate a plurality of three or more new nodes therefrom, and determining a parameter among the plurality of parameters and a threshold for a target node of the at least one decision tree that minimize an output of an objective function corresponding to a determined branching condition, the branching condition being determined using a greedy search and the objective function being an approximate of squared error of a density ratio estimation.

15. The method of claim 14 , wherein the constructing at least one decision tree includes constructing a plurality of decision trees, each decision tree outputting a corresponding ratio of probability density p(x)/q(x).

16. The method of claim 15 , further comprising: sampling one or more first samples from the plurality of first samples to generate a first subset of the first sample set, and sampling one or more second samples from the plurality of second samples to generate a second subset of the second sample set, wherein at least one of the plurality of decision trees outputs a ratio of probability density p(x)/q(x) based on the first subset and the second subset.

17. The method of claim 15 , wherein the ratio of probability density p(x)/q(x) for the particular leaf node is estimated by each respective one of the plurality of decision trees as a first fraction over a second fraction.

18. The method of claim 17 , wherein the first fraction specifies a number of allocated ones of the first samples to the particular node over an overall number of the first samples, and wherein the second fraction specifies a number of allocated ones of the second samples to the particular node over an overall number of the second samples.

19. The method of claim 14 , wherein the at least one decision tree is comprised in a random forest.

20. The method of claim 1 , wherein the decision tree outputs the estimate of the ratio of probability density of the particular leaf node in response to the input vector, where the decision tree output R{circumflex over ( )}(x) of the decision tree ({circumflex over (R)}: X→ ) with I leaf nodes is represented by {circumflex over (R)}(x)=Σ i=1 I {circumflex over (r)} i II(x∈L i ), where I is a number of leaf nodes, L i , is a region of an i-th leaf node in the decision tree, L 1 ∪L 2 ∪ . . . ∪L I =X, L i ∩L j =Ø for i≠j, and II(B) is an indicator function that returns 1 if argument B is true and else returns 0.

Patent Metadata

Filing Date

Unknown

Publication Date

August 17, 2021

Inventors

Satoshi Hara

Tetsuro Morimura

Want to explore more patents?

Browse 5M+ US patents with plain-English claim translations and AI-generated analysis.

Browse All Patents Try Prior Art Search