US-11410558

Traffic control with reinforcement learning

Published: August 9, 2022

Assignee: not available in the USPTO data

Inventors: not available in the USPTO data

Technical Abstract

An action recommendation system uses reinforcement learning to provide a next action recommendation that a traffic controller can give to a vehicle pilot, such as an aircraft pilot. The system uses data of past human actions to create a reinforcement learning model, then applies that model to current ADS-B (Automatic Dependent Surveillance Broadcast) data to provide the next action recommendation to the traffic controller. The system may use an anisotropic reward function and may also include an expanding state space module that uses a non-uniform granularity of the state space.
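The abstract names two ideas without defining them: an anisotropic reward function and a state space with non-uniform granularity. The Python sketch below shows one possible reading of each, and is not taken from the patent. It assumes that "anisotropic" means the reward depends on the direction of motion relative to the known destination, and that "non-uniform granularity" means finer grid cells near the destination. The function names, weights, cell sizes, and the 2D position representation are all hypothetical.

```python
import math

def anisotropic_reward(pos, next_pos, dest, along_weight=1.0, cross_weight=0.25):
    """Hypothetical direction-dependent reward.

    Motion along the line toward the known destination is rewarded;
    motion perpendicular to that line is penalized with a smaller weight.
    """
    to_dest_x, to_dest_y = dest[0] - pos[0], dest[1] - pos[1]
    dist = math.hypot(to_dest_x, to_dest_y)
    if dist == 0:
        return 0.0  # already at the destination
    ux, uy = to_dest_x / dist, to_dest_y / dist      # unit vector toward dest
    dx, dy = next_pos[0] - pos[0], next_pos[1] - pos[1]
    along = dx * ux + dy * uy                        # progress toward dest
    cross = abs(dx * uy - dy * ux)                   # sideways displacement
    return along_weight * along - cross_weight * cross

def state_cell(pos, dest, fine=1.0, coarse=16.0, near=20.0):
    """Hypothetical non-uniform discretization.

    Positions within `near` units of the destination map to small cells,
    positions farther away map to large cells.
    """
    d = math.hypot(pos[0] - dest[0], pos[1] - dest[1])
    size = fine if d <= near else coarse
    return (size, math.floor(pos[0] / size), math.floor(pos[1] / size))

# One step straight toward a destination at (10, 0), then one step sideways.
print(anisotropic_reward((0, 0), (1, 0), (10, 0)))   # 1.0
print(anisotropic_reward((0, 0), (0, 1), (10, 0)))   # -0.25
print(state_cell((5.5, 3.2), (0.0, 0.0)))            # (1.0, 5, 3)
print(state_cell((100.0, 50.0), (0.0, 0.0)))         # (16.0, 6, 3)
```

Under these assumptions, the same step length earns a different reward depending on its heading, and the learner tracks position more precisely close to the destination than far from it.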

Patent Claims

12 claims

Legal claims defining the scope of protection, as filed with the USPTO.

2. The apparatus of claim 1 wherein the action recommendation system further comprises an anisotropic reward function to exploit that the destination is known.

3. The apparatus of claim 2 wherein the anisotropic reward function uses flight criteria comprising: vehicle type, length of the trip, lateness of the arrival, weather conditions, and remaining fuel.

4. The apparatus of claim 1 wherein the past human actions are sourced from Automatic Dependent Surveillance Broadcast data.

5. The apparatus of claim 1 wherein the current vehicle data is derived from Automatic Dependent Surveillance Broadcast data for the current vehicle.

6. The apparatus of claim 1 wherein the reinforcement learning model comprises a neural network with deep Q reinforcement learning.

8. The method of claim 7 wherein the reward function is an anisotropic reward function that exploits that the destination is known.

9. The method of claim 8 wherein the reward function uses flight criteria comprising: the aircraft type, length of the trip, lateness of the arrival, weather conditions, FAA regulations, and remaining fuel.

10. The method of claim 7 wherein the past human actions are derived from Automatic Dependent Surveillance Broadcast data.

11. The method of claim 7 wherein the current aircraft data is derived from Automatic Dependent Surveillance Broadcast data of the aircraft.

12. The method of claim 7 wherein the reinforcement learning model is a neural network used with deep Q reinforcement learning.

14. The method of claim 13 wherein the past human actions are derived from Automatic Dependent Surveillance Broadcast data.

15. The method of claim 13 wherein the reinforcement learning model comprises a neural network with deep Q reinforcement learning.
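Several claims above describe learning from past human actions and then recommending a next action. The Python sketch below illustrates that loop in a much simplified form. The claims name a neural network with deep Q reinforcement learning; this sketch swaps in a plain lookup table (tabular Q-learning) so it stays short and dependency-free. The action names, state labels, rewards, and logged transitions are invented for illustration and do not come from the patent or from real ADS-B data.

```python
from collections import defaultdict

# Hypothetical action set a controller might relay to a pilot.
ACTIONS = ("climb", "descend", "turn_left", "turn_right", "hold")

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Standard Q-learning update applied to one logged transition."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])

def recommend(q, state):
    """Return the action with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

# Made-up log of (state, human action, reward, next state) tuples,
# standing in for transitions reconstructed from historical data.
logged = [
    ("s0", "turn_left", 1.0, "s1"),
    ("s0", "hold", -0.5, "s1"),
    ("s1", "descend", 2.0, "s2"),
]

q = defaultdict(float)      # unseen (state, action) pairs default to 0.0
for _ in range(50):         # replay the fixed log several times
    for s, a, r, s2 in logged:
        q_update(q, s, a, r, s2)

print(recommend(q, "s0"))   # turn_left
print(recommend(q, "s1"))   # descend
```

A deep Q variant would replace the table `q` with a neural network that maps a state vector to one value per action, which matters when the state space is too large to enumerate.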

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G08G G06N

Patent Metadata

Filing Date

May 21, 2019

Publication Date

August 9, 2022
