Legal claims defining the scope of protection, as filed with the USPTO.
1. A robotic apparatus, comprising: a platform comprising a controllable actuator; a sensor module configured to provide environmental information associated with an environment of the platform; and a controller configured to: provide a control instruction for the controllable actuator, the control instruction configured to cause the platform to execute an action to accomplish a target task in accordance with the environmental information; determine a predicted outcome of the action; determine a discrepancy signal based on an actual outcome of the action and the predicted outcome; and determine a repeat indication responsive to the discrepancy being within a range of a target value associated with the target task; wherein the repeat indication is configured to cause the robot to execute a second action to achieve the target task.
2. The apparatus of claim 1, wherein: the target task is associated with an object within the environment; and the environmental information comprises sensory input characterizing one or more of a size, position, shape, or color of the object.
3. The apparatus of claim 1, wherein: the predicted outcome and the actual outcome comprise a characteristic of at least one of the platform and the environment; and the actual outcome is determined based on an output of the sensor module obtained subsequent to the execution of the action.
4. The apparatus of claim 3, wherein the characteristic is selected from a group consisting of a position of the platform, a position of an object within the environment, and a distance measure between the object and the platform.
5. The apparatus of claim 3, wherein the characteristic comprises a parameter associated with the controllable actuator, the parameter being selected from a group consisting of an actuator displacement, a torque, a force, a rotation rate, and a current draw.
6. A method of training an adaptive robotic apparatus, the method comprising: for a given training trial: causing the apparatus to execute an action based on a context; determining a current discrepancy between a target outcome of the action and a predicted outcome of the action; comparing the current discrepancy to a prior discrepancy, the prior discrepancy being determined based on a prior observed outcome of the action and a prior predicted outcome of the action determined at a prior trial; and providing an indication responsive to the current discrepancy being smaller than the prior discrepancy, the indication being configured to cause the apparatus to execute the action based on the context during a trial subsequent to the given trial.
7. The method of claim 6, wherein: the discrepancy is configured based on a difference between the actual outcome and the predicted outcome; a repeat indication is determined based on the discrepancy being greater than zero.
8. The method of claim 7, wherein the controller is further configured to determine a stop indication based on the discrepancy being no greater than zero, the stop indication being configured to cause the adaptive robotic apparatus to execute another task.
9. The method of claim 6, wherein: the determination of the current discrepancy is effectuated by a supervised learning process based on a teaching input; and the teaching input comprises the target outcome.
10. The method of claim 6, wherein: the context is determined at a first time instance associated with the given trial; and the predicted outcome of the action is determined based on a delayed context obtained during another trial at a second time instance prior to the first time instance.
11. A non-transitory computer-readable storage medium having instructions embodied thereon, the instructions being executable by one or more processors to perform a method of adapting training of a learning apparatus, the method comprising: determining a discrepancy between a predicted outcome and an observed outcome of an action of the learning apparatus; determining an expected error associated with the determination of the discrepancy; comparing the expected error to a target error associated with the determination of the discrepancy; and providing a continue-training indication based on the expected error being smaller than the target error.
12. The storage medium of claim 11, wherein: the observed outcome is associated with execution of the action during a trial at a first time instance; and the continue-training indication is configured to cause execution of the action at another trial at a second time instance subsequent to the first time instance.
13. The storage medium of claim 12, wherein: the determination of the discrepancy is effectuated based on a first supervised learning process configured based on a first teaching input; and the first teaching input is configured to convey information related to the observed outcome of the action.
14. The storage medium of claim 13, wherein: the determination of the expected error is effectuated based on a second supervised learning process configured based on a second teaching input; and the second teaching input is configured to convey information related to the target error.
15. The storage medium of claim 14, wherein: the target error is determined based on one or more trials preceding the trial; and the method further comprises providing a cease-training indication based on the expected error being greater than or equal to the target error.
16. The storage medium of claim 15, wherein: the execution of the action during the another trial is characterized by another expected error determination; and the method further comprises: adjusting the target error based on the comparison of the expected error to the target error, the adjusted target error being configured to be compared against the another expected error during the another trial.
17. The storage medium of claim 15, wherein: the execution of the action at the first time instance is configured based on an output of a random number generator; and the method further comprises: determining one or more target error components associated with the one or more trials preceding the trial; and determining the target error based on a weighted sum of the one or more target error components.
18. The storage medium of claim 13, wherein: the first supervised learning process is further configured based on a neuron network comprising a plurality of neurons communicating via a plurality of connections; individual connections provide an input into a given neuron, the plurality of neurons being characterized by a connection efficacy configured to affect operation of the given neuron; and the determination of the discrepancy comprises adjusting the efficacy of one or more connections based on the first teaching signal.
19. The storage medium of claim 13, wherein: the action is configured based on a sensory context; the first supervised learning process is configured based on a look-up table, the look-up table comprising one or more entries, individual entries thereof corresponding to an occurrence of the sensory context, the action, and the predicted outcome; and an association development comprises adjusting at least one of the one or more entries based on the first teaching signal.
20. The storage medium of claim 11, wherein: the action is configured based on a sensory context; the storage medium is embodied in a controller apparatus of a robot; responsive to the sensory context comprising a representation of an obstacle, the action comprises an avoidance maneuver executed by the robot; and responsive to the sensory context comprising a representation of a target, the action comprises an approach maneuver executed by the robot.
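For illustration only, the decision logic recited in claims 11, 15, and 17 might be sketched as follows. This is a minimal sketch and not the patented implementation: the function names, the use of an absolute difference as the discrepancy measure, and the choice to pass the expected error in as a plain number (rather than obtaining it from a second supervised learning process) are all assumptions introduced here.

```python
from typing import Sequence


def discrepancy(predicted: float, observed: float) -> float:
    """Discrepancy between predicted and observed outcome.

    Assumption: absolute difference; the claims do not fix a metric.
    """
    return abs(observed - predicted)


def weighted_target_error(components: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Target error as a weighted sum of components from preceding trials
    (claim 17). The weights are hypothetical and supplied by the caller."""
    if len(components) != len(weights):
        raise ValueError("components and weights must have equal length")
    return sum(w * c for w, c in zip(weights, components))


def training_indication(expected_error: float, target_error: float) -> str:
    """Return 'continue' when the expected error is smaller than the target
    error (claim 11), otherwise 'cease' (claim 15)."""
    return "continue" if expected_error < target_error else "cease"


if __name__ == "__main__":
    # Hypothetical target error components from two preceding trials.
    target = weighted_target_error([0.4, 0.2], [0.5, 0.5])
    # Hypothetical expected error for the current trial.
    print(training_indication(0.1, target))
```

In this sketch the comparison is strict, so an expected error exactly equal to the target error yields the cease-training indication, matching the "greater than or equal to" wording of claim 15.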
February 2, 2016