The present disclosure provides systems and methods that enable adaptive training of a channel coding model including an encoder model, a channel model positioned structurally after the encoder model, and a decoder model positioned structurally after the channel model. The channel model can have been trained to emulate a communication channel, for example, by training the channel model on example data that has been transmitted via the communication channel. The channel coding model can be trained on a loss function that describes a difference between input data input into the encoder model and output data received from the decoder model. In particular, such a loss function can be backpropagated through the decoder model while modifying the decoder model, backpropagated through the channel model while the channel model is held constant, and then backpropagated through the encoder model while modifying the encoder model.
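The training order described above can be illustrated with a minimal NumPy sketch. It is not the disclosed implementation: it assumes purely linear encoder, channel, and decoder models, a mean-squared-error loss, and plain gradient descent. The function name `train_codec`, all variable names, and the hyperparameters are hypothetical choices made for illustration only.

```python
import numpy as np

def train_codec(steps=500, lr=0.1, k=4, n=8, batch=64, seed=0):
    """Train a linear encoder/decoder pair around a frozen linear channel model."""
    rng = np.random.default_rng(seed)
    enc = 0.3 * rng.standard_normal((k, n))   # encoder weights (trainable)
    dec = 0.3 * rng.standard_normal((n, k))   # decoder weights (trainable)
    # Assumption: a stand-in for an already-trained channel model.
    chan = np.eye(n) + 0.1 * rng.standard_normal((n, n))
    chan_before = chan.copy()
    losses = []
    for _ in range(steps):
        x = rng.standard_normal((batch, k))   # first set of inputs
        z = x @ enc                           # first set of outputs (encoder)
        y = z @ chan                          # second set of outputs (channel model)
        x_hat = y @ dec                       # third set of outputs (decoder)
        err = x_hat - x
        losses.append(float(np.mean(err ** 2)))   # loss: input vs. decoder output

        # Backward pass, in the order decoder -> channel -> encoder.
        g_x_hat = 2.0 * err / err.size
        g_dec = y.T @ g_x_hat                 # gradient for decoder weights
        g_y = g_x_hat @ dec.T                 # gradient entering the channel model
        g_z = g_y @ chan.T                    # passed through the channel, no update
        g_enc = x.T @ g_z                     # gradient for encoder weights

        dec -= lr * g_dec                     # decoder is modified
        enc -= lr * g_enc                     # encoder is modified
        # chan is deliberately left untouched.
    return {"losses": losses, "chan_before": chan_before, "chan_after": chan}

result = train_codec()
print(result["losses"][0], result["losses"][-1])
```

The channel weights contribute to the gradient that reaches the encoder (`g_z`), but no update is ever applied to them, which is the sense in which the channel model is held constant.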
1. A computing system to perform machine learning, the computing system comprising: at least one processor; and at least one tangible, non-transitory computer-readable medium that stores instructions that, when executed by the at least one processor, cause the computing system to: obtain data descriptive of a model that comprises an encoder model, a channel model and a decoder model, wherein the encoder model is configured to receive a first set of inputs and, in response to receipt of the first set of inputs, generate a first set of outputs, wherein the channel model is configured to receive the first set of outputs and, in response to receipt of the first set of outputs, generate a second set of outputs, and wherein the decoder model is configured to receive the second set of outputs and, in response to receipt of the second set of outputs, generate a third set of outputs; determine a first loss function that describes a difference between the first set of inputs and the third set of outputs; backpropagate the first loss function through the decoder model while modifying the decoder model to train the decoder model; after backpropagating the first loss function through the decoder model, continue to backpropagate the first loss function through the channel model without modifying the channel model; after backpropagating the first loss function through the channel model, continue to backpropagate the first loss function through the encoder model while modifying the encoder model to train the encoder model; send a second set of inputs over a communication channel; receive a fourth set of outputs from the communication channel; and train the channel model based at least in part from the second set of inputs and the fourth set of outputs.
2. The computing system of claim 1, wherein training the channel model based at least in part from the second set of inputs and the fourth set of outputs comprises: providing the second set of inputs to the channel model and receiving a fifth set of outputs from the channel model in response to receipt of the second set of inputs; determining a second loss function between the fourth set of outputs and the fifth set of outputs; and backpropagating the second loss function through the channel model while modifying the channel model to train the channel model.
3. The computing system of claim 1, wherein the decoder model is further configured to receive the fourth set of outputs and, in response to receipt of the fourth set of outputs, generate a sixth set of outputs.
4. The computing system of claim 3, wherein the instructions further cause the computing system to: determine a third loss function that describes a difference between the second set of inputs and the sixth set of outputs; and backpropagate the third loss function through the decoder model while modifying the decoder model to train the decoder model.
5. The computing system of claim 3, wherein the second set of inputs sent over the communication channel comprises an output of the encoder model generated by the encoder model in response to receipt of a third set of inputs.
6. The computing system of claim 5, wherein the instructions further cause the computing system to: determine a fourth loss function that describes a difference between the third set of inputs and the sixth set of outputs; and backpropagate the fourth loss function through the decoder model while modifying the decoder model to train the decoder model.
7. The computing system of claim 1, wherein one or more of the encoder model, the channel model and the decoder model comprises a neural network.
8. The computing system of claim 1, wherein the model comprises a wireless communication model that includes a channel model trained to emulate a wireless communication channel.
9. The computing system of claim 1, wherein the model comprises a wired communication model that includes a channel model trained to emulate a wired communication channel.
10. The computing system of claim 1, wherein: the encoder model is configured to receive the first set of inputs expressed according to a first set of dimensions; the encoder model is configured to output the first set of outputs expressed according to a second set of dimensions in response to receipt of the first set of inputs; and the decoder model is configured to output the third set of outputs expressed according to the first set of dimensions.
11. A computing device that determines an encoding scheme for a communication channel, comprising: at least one processor; a machine-learned encoder model that is configured to receive a first set of inputs and output a first set of outputs, the encoder model having been trained by sequentially backpropagating a loss function through a decoder model to modify at least one weight of the decoder model, and then through a channel model without modifying the channel model, and then through the encoder model to modify at least one weight of the encoder model, the channel model configured to receive the first set of outputs and output a second set of outputs, the decoder model configured to receive the second set of outputs and output a third set of outputs, the loss function descriptive of a difference between the first set of inputs and the third set of outputs, wherein the channel model used to train the encoder model is trained by: sending a second set of inputs over a communication channel; receiving a fourth set of outputs from the communication channel; providing the second set of inputs to the channel model and receiving a fifth set of outputs from the channel model in response to receipt of the second set of inputs; determining a second loss function between the fourth set of outputs and the fifth set of outputs; and backpropagating the second loss function through the channel model while modifying the channel model to train the channel model; and at least one tangible, non-transitory computer-readable medium that stores instructions that, when executed by the at least one processor, cause the at least one processor to: obtain a first set of transmit data for transmitting over a communication channel; input the first set of transmit data into the machine-learned encoder model; receive, as an output of the machine-learned channel encoder model, an encoded version of the transmit data; and transmit the encoded version of the transmit data over the communication channel.
12. The computing device of claim 11, wherein the encoder model comprises a deep neural network.
13. The computing device of claim 11, wherein the decoder model used to train the encoder model is further trained by: receiving the fourth set of outputs and, in response to receipt of the fourth set of outputs, generating a sixth set of outputs; determining a third loss function that describes a difference between the second set of inputs and the sixth set of outputs; and backpropagating the third loss function through the decoder model while modifying the decoder model to train the decoder model.
14. One or more tangible, non-transitory computer-readable media storing computer-readable instructions that when executed by one or more processors cause the one or more processors to perform operations, the operations comprising: obtaining data descriptive of a machine-learned decoder model, the decoder model having been trained by sequentially backpropagating a loss function through the decoder model to train the decoder model, and then through a channel model without modifying the channel model, and then through an encoder model to train the encoder model, the encoder model configured to receive a first set of inputs and output a first set of outputs, the channel model configured to receive the first set of outputs and output a second set of outputs, the decoder model configured to receive the second set of outputs and output a third set of outputs, the loss function descriptive of a difference between the first set of inputs and the third set of outputs, wherein the channel model used to train the decoder model is trained by: sending a second set of inputs over a communication channel; receiving a fourth set of outputs from the communication channel; providing the second set of inputs to the channel model and receiving a fifth set of outputs from the channel model in response to receipt of the second set of inputs; determining a second loss function between the fourth set of outputs and the fifth set of outputs; and backpropagating the second loss function through the channel model while modifying the channel model to train the channel model; obtaining a first set of communication data received from a communication channel; inputting the first set of communication data into the machine-learned decoder model; and receiving as an output of the machine-learned decoder model, in response to receipt of the first set of communication data, a decoded version of the communication data.
15. The one or more tangible, non-transitory computer-readable media of claim 14, wherein the machine-learned decoder model comprises a deep neural network.
16. The one or more tangible, non-transitory computer-readable media of claim 14, the decoder model having been further trained by: receiving the fourth set of outputs and, in response to receipt of the fourth set of outputs, generating a sixth set of outputs; determining a third loss function that describes a difference between the second set of inputs and the sixth set of outputs; and backpropagating the third loss function through the decoder model while modifying the decoder model to train the decoder model.
17. The one or more tangible, non-transitory computer-readable media of claim 14, wherein: backpropagating the loss function through the decoder model to train the decoder model comprises modifying at least one weight included in the decoder model; and backpropagating the loss function through the encoder model to train the encoder model comprises modifying at least one weight included in the encoder model.
18. The one or more tangible, non-transitory computer-readable media of claim 14, wherein: the encoder model is configured to receive the first set of inputs expressed according to a first set of dimensions; the encoder model is configured to output the first set of outputs expressed according to a second set of dimensions in response to receipt of the first set of inputs; and the decoder model is configured to output the third set of outputs expressed according to the first set of dimensions.
19. The computing device of claim 11, wherein: the encoder model is configured to receive the first set of inputs expressed according to a first set of dimensions; the encoder model is configured to output the first set of outputs expressed according to a second set of dimensions in response to receipt of the first set of inputs; and the decoder model is configured to output the third set of outputs expressed according to the first set of dimensions.
20. The computing device of claim 11, wherein one or more of the channel model or the decoder model comprises a neural network.
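The channel-model training steps recited in claims 2, 11 and 14 (send inputs over the channel, receive its outputs, compare them with the channel model's outputs, and update the channel model) can be sketched as follows. This is an illustrative sketch only, not the claimed implementation: the physical channel is simulated by a hypothetical `real_channel` function (an unknown linear map plus small noise), the channel model is assumed to be a single linear layer, and the names `fit_channel_model` and `h_true` and all hyperparameters are assumptions.

```python
import numpy as np

def fit_channel_model(steps=300, lr=0.5, n=8, batch=256, seed=0):
    """Fit a linear channel model to samples observed from a simulated channel."""
    rng = np.random.default_rng(seed)
    # Assumption: the "physical" channel is an unknown linear map with mild noise.
    h_true = np.eye(n) + 0.2 * rng.standard_normal((n, n))

    def real_channel(x):
        # Hypothetical stand-in for transmitting x over the communication channel.
        return x @ h_true + 0.01 * rng.standard_normal(x.shape)

    chan = np.zeros((n, n))                   # channel model weights (trainable)
    losses = []
    for _ in range(steps):
        x = rng.standard_normal((batch, n))   # second set of inputs
        y_real = real_channel(x)              # fourth set of outputs (from the channel)
        y_model = x @ chan                    # fifth set of outputs (from the model)
        err = y_model - y_real
        losses.append(float(np.mean(err ** 2)))   # second loss function (MSE here)
        # Backpropagate the second loss through the channel model and update it.
        chan -= lr * (2.0 * x.T @ err / err.size)
    return chan, h_true, losses

chan, h_true, losses = fit_channel_model()
print(losses[0], losses[-1])
```

In this sketch the only weights that change are those of the channel model, the reverse of the encoder/decoder training phase, where the channel model is held constant.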
December 15, 2016
February 4, 2020