Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
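As a minimal sketch of the event identification described above, the following Python function applies the baseline and climax criteria recited in claim 1 to a list of per-interval frequencies. The function name `find_events`, the use of fixed numeric thresholds, and the choice to treat each maximal run of above-baseline intervals as one candidate event are assumptions made for illustration. They are not taken from the patent text.

```python
from typing import List, Sequence, Tuple


def find_events(freqs: Sequence[float], baseline: float, climax: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, one per detected event.

    Assumption: an event is a maximal run of consecutive intervals whose
    frequency is at least `baseline` (baseline criterion) and in which at
    least one interval reaches `climax` (climax criterion).
    """
    events: List[Tuple[int, int]] = []
    start = None
    # A trailing sentinel below any baseline closes a run that reaches the end.
    for i, f in enumerate(list(freqs) + [float("-inf")]):
        if f >= baseline:
            if start is None:
                start = i  # a new run of above-baseline intervals begins
        elif start is not None:
            # The run ended at i; keep it only if it satisfies the climax criterion.
            if max(freqs[start:i]) >= climax:
                events.append((start, i))
            start = None
    return events


if __name__ == "__main__":
    daily = [1, 5, 9, 6, 1, 5, 5, 1, 12]
    print(find_events(daily, baseline=4, climax=8))
```

In this sketch the run of values 5, 5 is above the baseline but never reaches the climax threshold, so it is not reported as an event.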
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method in a computing device for determining a causal relationship between a first query and a second query, the method comprising: for each of a plurality of intervals, providing the frequency of the first query and the frequency of the second query during the interval, each frequency indicating number of times a query is submitted to a search engine; identifying first events of the first query by analyzing the frequencies of the first query and second events of the second query by analyzing the frequencies of the second query, each event representing frequencies over a sequence of intervals that satisfy both a baseline criterion and a climax criterion, the baseline criterion indicating a minimum frequency for each interval in the sequence and the climax criterion indicating a minimum frequency for at least one interval in the sequence; for combinations of first events and second events, generating an event causal score indicating the causal relationship between the first event of the combination and the second event of the combination; and generating a query causal score indicating the causal relationship between the first query and the second query by aggregating the event causal scores of the combinations of events of the first query and the second query; wherein the generating of a query causal score includes generating event/query causal scores between first events and the second query indicating causal relationship between a first event and the second query and aggregating the event/query causal scores to generate the query causal score.
2. The method of claim 1 wherein each event has an event period and the event causal score for a combination of a first event and a second event is based on overlap between the event period of the first event and the event period of the second event.
3. The method of claim 1 wherein each event has an event area and the event causal score for a combination of a first event and a second event is based on intersection of the event area of the first event and the event area of the second event.
4. The method of claim 3 wherein the event area of an event is based on an event period of an event and frequency of the query during the time period.
5. The method of claim 1 wherein the event causal score between a first event and a second event is represented by the following:
$$P(e_1 \prec e_2) = \begin{cases} 0 & \text{if } t_b(e_1) \text{ after } t_b(e_2) \\ \dfrac{S(e_1) \cap S(e_2)}{\max\left(S(e_1), S(e_2)\right)} & \text{else.} \end{cases}$$
6. The method of claim 1 wherein the event/query causal score is a maximum of the event causal scores between a first event and each of the second events.
7. The method of claim 1 wherein the query score is a summation of an event area of each first event times the event/query causal score for the first event divided by a summation of the event area of each first event.
8. The method of claim 1 wherein the generating of the query causal score is represented as follows:
$$P(A \prec B) = \frac{\sum_{i=1}^{M_A} S(e_i^A)\, P(e_i^A \prec B)}{\sum_{i=1}^{M_A} S(e_i^A)} = \frac{\sum_{i=1}^{M_A} S(e_i^A)\, \max_{1 \le j \le M_B} P(e_i^A \prec e_j^B)}{\sum_{i=1}^{M_A} S(e_i^A)}.$$
9. The method of claim 1 including generating a query causal score for each of a plurality of queries indicating causal relationship between each of the plurality of queries and the second query and predicting events of the second query based on the query causal scores for the plurality of queries.
10. A computer-readable storage medium encoded with instructions for causing a computing device to predict frequency of a target query, by a method comprising: for each of a plurality of queries, generating a query causal score indicating causal relationship between the query and the target query, a causal relationship between the query and the target query indicating that a change in frequency of the query precedes a change in frequency of the target query within a time period, each frequency indicating number of times a query is submitted; selecting candidate queries for predicting events of the target query, the candidate queries being selected based on their query causal scores; learning a query model using frequencies of the candidate queries for predicting frequency of the target query; and calculating a predicted frequency for the target query by applying the query model to frequencies of the candidate query; wherein the generating of a query causal score between a first query and the target query includes generating, for each first event of the first query, an event/query causal score between the first event and the target query and aggregating the event/query causal scores into the query causal score.
11. The computer-readable storage medium of claim 10 wherein a multiple regression algorithm is used to learn the query model.
13. The computer-readable storage medium of claim 12 wherein the learning of the query model estimates a parameter for each query based on a solution to the following:
$$\begin{cases} q_1 = \beta_0 + \beta_1 c_{11} + \beta_2 c_{21} + \dots + \beta_n c_{n1} \\ \quad\vdots \\ q_N = \beta_0 + \beta_1 c_{1N} + \beta_2 c_{2N} + \dots + \beta_n c_{nN}. \end{cases}$$
14. The computer-readable storage medium of claim 10 wherein an event/query causal score for a first event and the target query is based on a maximum of event causal scores between the first event and each event of the target query.
15. A computing device for predicting frequency of a target query, comprising: a query frequency store having frequencies of each of a plurality of queries at intervals; a memory storing computer-executable instructions of a component that identifies events within the queries by analyzing the frequencies of the queries during the intervals, each frequency of an interval indicating number of times a query is submitted during that interval; a component that selects as candidate queries those queries whose events have a likely causal relationship with events of the target query; a component that learns a query model using frequencies of the candidate queries for predicting a frequency of the target query, wherein the component that learns the query model learns parameters for combining frequencies of the candidate queries to predict the frequency of the target query; and a component that predicts the frequency of the target query by applying the query model to frequencies of the candidate queries; and a processor for executing the computer-executable instructions stored in the memory.
16. The computing device of claim 15 wherein the component that selects candidate queries makes the selection based on a query causal score indicating causal relationship between a candidate query and the target query, the query causal score based on an event/query causal score for each event of a candidate query indicating causal relationship between the event and the target query.
17. The computing device of claim 15 wherein the component that predicts the frequency generates a linear combination of the frequency of the candidate queries weighted by the learned parameters.
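The scoring recited in claims 5, 6 and 8 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the `Event` class, the reading of t_b as the index of an event's first interval, the event area S(e) as the sum of frequencies over the event's intervals, and the intersection as the sum of the pointwise minimum over shared intervals are all hypothetical choices. The claims only say that the area is based on the event period and the query frequency during it.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Event:
    start: int          # index of the event's first interval (assumed meaning of t_b)
    freqs: List[float]  # query frequency in each interval of the event

    @property
    def area(self) -> float:
        # Assumed event area S(e): total frequency over the event period.
        return float(sum(self.freqs))


def intersection_area(e1: Event, e2: Event) -> float:
    """Assumed S(e1) ∩ S(e2): pointwise minimum summed over shared intervals."""
    lo = max(e1.start, e2.start)
    hi = min(e1.start + len(e1.freqs), e2.start + len(e2.freqs))
    return float(sum(min(e1.freqs[t - e1.start], e2.freqs[t - e2.start])
                     for t in range(lo, hi)))


def event_causal_score(e1: Event, e2: Event) -> float:
    """P(e1 ≺ e2) as in claim 5: zero if e1 begins after e2."""
    if e1.start > e2.start:
        return 0.0
    denom = max(e1.area, e2.area)
    return intersection_area(e1, e2) / denom if denom > 0 else 0.0


def query_causal_score(events_a: Sequence[Event], events_b: Sequence[Event]) -> float:
    """P(A ≺ B) as in claim 8: area-weighted mean of event/query scores,
    where each event/query score is the maximum over B's events (claim 6)."""
    total = sum(e.area for e in events_a)
    if total == 0 or not events_b:
        return 0.0
    weighted = sum(e.area * max(event_causal_score(e, eb) for eb in events_b)
                   for e in events_a)
    return weighted / total


if __name__ == "__main__":
    a = [Event(2, [4, 6, 4]), Event(10, [2, 2])]
    b = [Event(3, [5, 5, 5])]
    print(event_causal_score(a[0], b[0]), query_causal_score(a, b))
```

Because the score is zero whenever the first event begins after the second, it is asymmetric: swapping the two queries generally gives a different value, which is what lets it indicate which query tends to precede the other.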
June 28, 2007
March 23, 2010