{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-9836697","patent":{"patent_number":"US-9836697","title":"Determining variable optimal policy for an MDP paradigm","assignee":null,"inventors":[],"filing_date":"2014-10-06T00:00:00.000Z","publication_date":"2017-12-05T00:00:00.000Z","cpc_codes":["G06N"],"num_claims":20,"abstract":"A method for determining a variable near-optimal policy for a problem formulated as Markov Decision Process, the problem comprising at least one limited action entry, the limited action entry being an entry of an action of a finite set of actions limited in the number of times its value may be changed, the method comprising using at least one hardware processor for: receiving data elements with respect to the problem, the data elements comprising: (a) a finite set of states, (b) the finite set of actions, (c) a transition probabilities matrix determining transition probabilities between states of the finite set of states, once actions of the set of actions are performed; (d) an immediate cost function, wherein the value of the immediate cost function is determined for a pair of a state of the finite set of states and an action of the finite set of actions, and (e) a discount factor; updating one or more data elements of the received data elements relating to the at least one limited action entry, wherein the one or more data elements are selected from the group consisting of: the transition probabilities matrix, the immediate cost function and the discount factor, and wherein the updating is triggered by a change of a value of a limited action entry of the at least one limited action entry; and following the updating of the one or more data elements, calculating a current near-optimal policy for the problem based on the updated one or more data elements."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Determining variable optimal policy for an MDP paradigm","description":"A method for determining a variable near-optimal policy for a problem formulated as Markov Decision Process, the problem comprising at least one limited action entry, the limited action entry being an","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-9836697","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-9836697","citation_suggestion":"Patentable. \"Determining variable optimal policy for an MDP paradigm\" (US-9836697). https://patentable.app/patents/US-9836697","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-9836697","json":"https://patentable.app/api/llm-context/US-9836697","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-06-06T09:27:50.513Z"}