Systems and methods for model validation include generating a first and a second time series of segmentation states for a data set representative of a simulated population, e.g., a collection of membership counts corresponding to respective segments of the simulated population. The first and second time series of segmentation states are generated by respectively processing the data set through a first and a second simulation, each comprising iterative application of a plurality of event functions. The first and the second simulation differ in at least one respect, e.g., the first includes a first event function configured with a first parameter, and the second does not. An analysis of differences between the first and second time series may be compared to an analysis of one of the time series using a subject model. The comparison is then used to validate the model or to demonstrate accuracies, inaccuracies, and/or model bias with respect to a performance metric.
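The two-simulation comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the segment names (`unaware`, `aware`), the helper names (`move`, `natural_migration`, `media_event`, `simulate`, `true_lift`), the integer percentage rates, and the starting counts are all hypothetical and chosen only to make the example deterministic.

```python
from typing import Callable, Dict, List

State = Dict[str, int]            # segment name -> membership count
Event = Callable[[State], State]  # an event function maps a state to a new state


def move(state: State, src: str, dst: str, percent: int) -> State:
    """Return a new state with percent% of src's members (rounded down) moved to dst."""
    moved = state[src] * percent // 100
    new_state = dict(state)
    new_state[src] -= moved
    new_state[dst] += moved
    return new_state


def natural_migration(percent: int) -> Event:
    """Hypothetical event: members drift from 'aware' back to 'unaware'."""
    return lambda s: move(s, "aware", "unaware", percent)


def media_event(percent: int) -> Event:
    """Hypothetical event: a media exposure moves members from 'unaware' to 'aware'."""
    return lambda s: move(s, "unaware", "aware", percent)


def simulate(initial: State, events: List[Event], steps: int) -> List[State]:
    """Apply every event function once per time step and record the resulting states."""
    series: List[State] = []
    state = dict(initial)
    for _ in range(steps):
        for event in events:
            state = event(state)
        series.append(state)
    return series


def true_lift(first: List[State], second: List[State], segment: str) -> int:
    """Ground-truth metric: summed per-step difference in one segment's count."""
    return sum(a[segment] - b[segment] for a, b in zip(first, second))


# Assumed starting data set: 1000 members, none yet aware.
initial = {"unaware": 1000, "aware": 0}

# First simulation includes the media event; the second omits it.
first_series = simulate(initial, [natural_migration(10), media_event(20)], steps=3)
second_series = simulate(initial, [natural_migration(10)], steps=3)

true_value = true_lift(first_series, second_series, "aware")  # 992 for these assumed inputs

# A subject model's estimate of the same metric would come from fitting the
# model to one of the series; here it is a made-up placeholder number.
model_value = 900
score = abs(model_value - true_value) / abs(true_value)  # relative error, lower is better
```

Because the two simulations differ only in the media event, the difference between the series isolates that event's effect, which gives a known reference value against which the subject model's estimate can be scored.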
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for model validation, the method comprising: generating a data set comprising a collection of membership counts corresponding to segments; generating a first time series of states including a plurality of first membership counts corresponding to the segments at a plurality of points in time by processing the data set through a first simulation comprising application of a first plurality of functions including a first function configured with a first parameter, wherein processing the data set through the first simulation causes the collection of membership counts to change to one of the plurality of first membership counts at each of the plurality of points in time; generating a second time series of states including a plurality of second membership counts corresponding to the segments at the plurality of points in time by processing the data set through a second simulation comprising application of a second plurality of functions, wherein the second plurality of functions does not include the first function configured with the first parameter, wherein processing the data set through the second simulation causes the collection of membership counts to change to one of the plurality of second membership counts at each of the plurality of points in time; identifying a first value for a metric by analyzing differences in the plurality of first membership counts and the plurality of second membership counts; identifying, for a subject model, a second value for the metric, the second value representative of an output from application of the subject model to one of the first time series or the second time series; and determining, by comparison of the first value to the second value, a score for the subject model.
2. The method of claim 1, wherein the second plurality of functions includes the first function configured with a second parameter different from the first parameter.
3. The method of claim 1, wherein the second plurality of functions does not include the first function.
4. The method of claim 1, wherein the first plurality of functions includes a natural migration event.
5. The method of claim 1, wherein the subject model is a media mix model.
6. The method of claim 5, wherein the media mix model includes a time series multivariate ordinary least square (“OLS”) regression.
7. The method of claim 1, comprising generating the data set at random.
8. A system for model validation, the system comprising: a computer-readable memory storing instructions; and a processor configured to execute instructions from the memory to: generate a data set comprising a collection of membership counts corresponding to segments; generate a first time series of states including a plurality of first membership counts corresponding to the segments at a plurality of points in time by processing the data set through a first simulation comprising application of a first plurality of functions including a first function configured with a first parameter, wherein processing the data set through the first simulation causes the collection of membership counts to change to one of the plurality of first membership counts at each of the plurality of points in time; generate a second time series of states including a plurality of second membership counts corresponding to the segments at the plurality of points in time by processing the data set through a second simulation comprising application of a second plurality of functions, wherein the second plurality of functions does not include the first function configured with the first parameter, wherein processing the data set through the second simulation causes the collection of membership counts to change to one of the plurality of second membership counts at each of the plurality of points in time; identify a first value for a metric by analyzing differences in the plurality of first membership counts and the plurality of second membership counts; identify, for a subject model, a second value for the metric, the second value representative of an output from application of the subject model to one of the first time series or the second time series; and determine, by comparison of the first value to the second value, a score for the subject model.
9. The system of claim 8, wherein the second plurality of functions includes the first function configured with a second parameter different from the first parameter.
10. The system of claim 8, wherein the second plurality of functions does not include the first function.
11. The system of claim 8, wherein the first plurality of functions includes a natural migration event.
12. The system of claim 8, wherein the subject model is a media mix model.
13. The system of claim 12, wherein the media mix model includes a time series multivariate ordinary least square (“OLS”) regression.
14. The system of claim 8, wherein the processor is configured to generate the data set at random.
15. A non-transitory computer-readable memory storing instructions that cause a processor executing the instructions to: generate a data set comprising a collection of membership counts corresponding to segments; generate a first time series of states including a plurality of first membership counts corresponding to the segments at a plurality of points in time by processing the data set through a first simulation with a first parameter, wherein processing the data set through the first simulation causes the collection of membership counts to change to one of the plurality of first membership counts at each of the plurality of points in time; generate a second time series of states including a plurality of second membership counts corresponding to the segments at the plurality of points in time by processing the data set through a second simulation with a second parameter, wherein processing the data set through the second simulation causes the collection of membership counts to change to one of the plurality of second membership counts at each of the plurality of points in time; identify a first value for a metric by analyzing differences in the plurality of first membership counts and the plurality of second membership counts; identify, for a subject model, a second value for the metric, the second value representative of an output from application of the subject model to one of the first time series or the second time series; and determine, by comparison of the first value to the second value, a score for the subject model.
16. The non-transitory computer-readable memory of claim 15, wherein the first simulation comprises applying a first plurality of functions and the second simulation comprises applying a second plurality of functions, wherein applying the first plurality of functions comprises applying a first function of the plurality of functions configured with the first parameter; wherein the second plurality of functions includes the first function of the first plurality of functions configured with the second parameter, wherein the second parameter is different than the first parameter.
17. The non-transitory computer-readable memory of claim 15, wherein the first simulation comprises applying a first plurality of functions and the second simulation comprises applying a second plurality of functions, wherein applying the first plurality of functions comprises applying a first function of the plurality of functions configured with the first parameter; wherein the second plurality of functions does not include the first function.
18. The non-transitory computer-readable memory of claim 16, wherein the first plurality of functions includes a natural migration event.
19. The non-transitory computer-readable memory of claim 15, wherein the subject model is a media mix model.
20. The non-transitory computer-readable memory of claim 15, wherein the processor is configured to generate the data set at random.
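Claims 6 and 13 recite a media mix model that includes a time series multivariate OLS regression. As a rough illustration of how such a subject model could yield the "second value" that is compared against the simulation-derived "first value", the sketch below fits an OLS regression by solving the normal equations in plain Python. Everything specific here is an assumption made for the example: the regressors (intercept, media spend, linear trend), the toy outcome series, the reference value, and the use of relative error as the score.

```python
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows: List[List[float]], y: List[float]) -> List[float]:
    """Ordinary least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Toy inputs (assumed): per-period media spend and an observed segment count.
spend = [0.0, 10.0, 0.0, 20.0, 5.0, 15.0]
outcome = [50.0, 82.0, 54.0, 116.0, 73.0, 105.0]

# Design matrix columns: intercept, media spend, linear time trend.
design = [[1.0, s, float(t)] for t, s in enumerate(spend)]
beta = ols(design, outcome)

# The model's estimate of the media effect over the whole period.
model_value = beta[1] * sum(spend)

# Reference value that would come from differencing the two simulated series
# (a placeholder number here, not a result from any real simulation).
true_value = 160.0
score = abs(model_value - true_value) / abs(true_value)
```

The normal-equations approach is used only to keep the example dependency-free; a numerical library's least-squares routine would be the more robust choice for ill-conditioned regressors.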
September 18, 2017
July 21, 2020