System and Method for Accelerating Network Applications Using an Enhanced Network Interface and Massively Parallel Distributed Processing

Published: March 21, 2017

Assignee: not available in the USPTO data we have

Patent Claims

48 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A system, comprising: at least one network interface comprising at least one first processor to: receive a plurality of packets from a network; for each packet in the plurality of packets, analyze packet contents to determine a specific data type to which the respective packet corresponds; filter the plurality of packets into a plurality of groups, each group based on the specific data type; insert the plurality of packets of a first group into a corresponding first buffer in memory of at least one graphics processing unit using direct memory access; assign each of the packets of the first group an index representing an offset indicating a location in the memory of the at least one graphics processing unit; determine that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit; and transmit an interrupt to the at least one graphics processing unit corresponding to the pre-configured buffer flow capacity regarding the first buffer in the least one graphics processing unit; and the at least one graphics processing unit connected to the at least one network interface over a bus and comprising at least one second processor to: start a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.
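The data path recited in claim 1 can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions: the `GpuBuffer` class, the fixed `SLOT_SIZE`, the one-byte type tag read by `classify`, and the packet-count threshold `FLOW_CAPACITY` are all hypothetical and do not come from the claims. Real direct memory access, interrupts, and GPU kernels are replaced by ordinary Python calls.

```python
SLOT_SIZE = 64      # assumed fixed slot size in bytes (not from the claims)
FLOW_CAPACITY = 4   # assumed threshold: packets per buffer before the "interrupt"


class GpuBuffer:
    """Stands in for one per-data-type buffer in GPU memory."""

    def __init__(self, data_type):
        self.data_type = data_type
        self.memory = bytearray()
        self.offsets = []  # index assigned to each packet: its byte offset

    def dma_insert(self, payload):
        """Simulated DMA write; returns the offset assigned to the packet."""
        offset = len(self.memory)
        self.memory += payload.ljust(SLOT_SIZE, b"\x00")[:SLOT_SIZE]
        self.offsets.append(offset)
        return offset


def classify(packet):
    """Hypothetical classifier: the first byte tags the data type."""
    return {0x01: "radius", 0x02: "sip"}.get(packet[0], "other")


def run_kernel(buf, handler):
    """Apply the same handler to every packet, located by its offset."""
    results = [handler(bytes(buf.memory[o:o + SLOT_SIZE])) for o in buf.offsets]
    buf.memory.clear()
    buf.offsets.clear()
    return results


def network_interface(packets, handlers):
    """Filter packets into per-type buffers and launch a kernel when one fills."""
    buffers = {}
    launches = []
    for pkt in packets:
        dtype = classify(pkt)
        buf = buffers.setdefault(dtype, GpuBuffer(dtype))
        buf.dma_insert(pkt)
        if len(buf.offsets) >= FLOW_CAPACITY:  # capacity reached: "interrupt"
            launches.append((dtype, run_kernel(buf, handlers[dtype])))
    return launches
```

In this sketch a buffer that never reaches the threshold is never processed; a time-based criterion would be needed to flush it.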

2. The system of claim 1, wherein the at least one network interface comprising the at least one first processor determines that a second pre-configured buffer flow capacity has been reached regarding a second buffer in the at least one graphics processing unit, and transmits a second interrupt to the at least one graphics processing unit corresponding to the second pre-configured buffer flow capacity regarding the second buffer in the least one graphics processing unit; and the at least one graphics processing unit comprising the at least one second processor starts a second kernel preconfigured with packet handling code adapted to process packets of the specific data type in the at least one graphics processing unit in response to the second interrupt to process second packets in the second buffer.

3. The system of claim 1, further comprising a central processing unit comprising at least one third processor to map each buffer in the memory of the at least one graphics processing unit to at least one interrupt.
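One way to picture the buffer-to-interrupt mapping of claim 3 is a lookup table built once at setup time. The sketch below is an assumption-laden illustration: the function names, the starting vector number 32, and the idea of dispatching through a dictionary of kernels are hypothetical and are not stated in the claims.

```python
def build_interrupt_map(buffer_names, first_vector=32):
    """Assign one interrupt vector to each buffer (vector numbers are assumed)."""
    return {first_vector + i: name for i, name in enumerate(buffer_names)}


def dispatch(vector, interrupt_map, kernels):
    """Look up the buffer tied to an interrupt and start that buffer's kernel."""
    buffer_name = interrupt_map[vector]
    return kernels[buffer_name](buffer_name)
```

With such a table, the second interrupt of claim 2 simply resolves to the second buffer and its own kernel.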

4. The system of claim 1, wherein the at least one graphics processing unit locates each packet in the first buffer in the memory of the at least one graphics processing unit using the index and executes the first kernel to process each packet in the first buffer by at least one thread simultaneously executing first kernel code in lockstep.
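The index-based lookup and lockstep execution of claim 4 can be mimicked on a CPU by giving each simulated thread one packet and advancing all threads one instruction at a time. This is a rough sketch, not GPU code: `lockstep_kernel`, the `lengths` list, and the representation of the kernel as a list of `steps` are illustrative assumptions.

```python
def lockstep_kernel(memory, offsets, lengths, steps):
    """Run the same sequence of steps over every packet, one step at a time."""
    # Each simulated thread locates its packet via the offset it was assigned.
    state = [memory[o:o + n] for o, n in zip(offsets, lengths)]
    for step in steps:
        # All threads finish the current step before any thread starts the next.
        state = [step(s) for s in state]
    return state
```

For example, with `memory = b"abcXYZ12"`, offsets `[0, 3, 6]`, and lengths `[3, 3, 2]`, the three threads operate on `b"abc"`, `b"XYZ"`, and `b"12"` respectively.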

5. The system of claim 1, wherein the at least one graphics processing unit generates output packets into an output buffer in the memory of the at least one graphics processing unit.

6. The system of claim 5, further comprising a storage device connected to the bus to store the output packets using a non-volatile memory host controller interface.

7. The system of claim 5, wherein the at least one graphics processing unit notifies the at least one network interface that the output packets in the output buffer in the memory of the at least one graphics processing unit are ready to transmit.

8. The system of claim 1, wherein the first group of packets is associated with at least one member of a group consisting of RADIUS authentication, Diameter processing, SSL processing, TCP/HTTP requests, SMS messaging, and SIP messaging.

9. The system of claim 1, wherein the pre-configured buffer flow capacity regarding the first buffer in the least one graphics processing unit corresponds to at least one member of a group consisting of percentage of buffer memory used in the first buffer, buffer memory remaining in the first buffer, a number of packets currently in the first buffer, an elapsed time since a first packet was received in the first buffer, and an elapsed time since a last packet was received in the first buffer.
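The five flow-capacity criteria listed in claim 9 can be expressed as a single predicate that fires when any configured limit is met. The sketch below is an assumption: the `BufferState` and `FlowCapacity` field names, the units (bytes and seconds), and the choice to combine the criteria with a logical OR are hypothetical and are not prescribed by the claim.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BufferState:
    capacity_bytes: int
    used_bytes: int
    packet_count: int
    first_arrival: Optional[float]  # seconds; None while the buffer is empty
    last_arrival: Optional[float]


@dataclass
class FlowCapacity:
    """Each limit is optional; None means the criterion is not configured."""
    max_used_fraction: Optional[float] = None
    min_remaining_bytes: Optional[int] = None
    max_packets: Optional[int] = None
    max_age_since_first: Optional[float] = None
    max_idle_since_last: Optional[float] = None


def capacity_reached(state, cfg, now):
    """Return True when any configured criterion has been met."""
    used_fraction = state.used_bytes / state.capacity_bytes
    remaining = state.capacity_bytes - state.used_bytes
    checks = [
        cfg.max_used_fraction is not None
        and used_fraction >= cfg.max_used_fraction,
        cfg.min_remaining_bytes is not None
        and remaining <= cfg.min_remaining_bytes,
        cfg.max_packets is not None
        and state.packet_count >= cfg.max_packets,
        cfg.max_age_since_first is not None
        and state.first_arrival is not None
        and now - state.first_arrival >= cfg.max_age_since_first,
        cfg.max_idle_since_last is not None
        and state.last_arrival is not None
        and now - state.last_arrival >= cfg.max_idle_since_last,
    ]
    return any(checks)
```

The two time-based limits matter for lightly loaded buffers, which might otherwise wait indefinitely for a size-based threshold.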

10. The system of claim 1, wherein the bus comprises a Peripheral Component Interface Express bus.

11. The system of claim 1, wherein the at least one network interface assigns a classification to each of the packets and inserts each of the packets into the first buffer in the memory of the at least one graphics processing unit responsive to the classification using direct memory access.
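The classification step could, for instance, key on the transport-layer destination port. That choice is an assumption made here for illustration; the claims only say that packet contents are analyzed. The sketch parses a raw IPv4 packet and uses the commonly registered ports for RADIUS authentication (1812), Diameter (3868), HTTPS (443), HTTP (80), and SIP (5060). `PORT_TO_TYPE`, `classify_ipv4`, `group_by_type`, and `make_test_packet` are hypothetical names.

```python
import struct

# Assumed mapping from destination port to data type (illustrative only).
PORT_TO_TYPE = {1812: "radius", 3868: "diameter", 443: "ssl", 80: "http", 5060: "sip"}


def classify_ipv4(packet):
    """Classify a raw IPv4 packet by its TCP or UDP destination port."""
    header_len = (packet[0] & 0x0F) * 4   # IHL field, in 32-bit words
    protocol = packet[9]                  # 6 = TCP, 17 = UDP
    if protocol not in (6, 17):
        return "other"
    # In both TCP and UDP the destination port is bytes 2-3 of the header.
    (dst_port,) = struct.unpack_from("!H", packet, header_len + 2)
    return PORT_TO_TYPE.get(dst_port, "other")


def group_by_type(packets):
    """Filter packets into groups keyed by their classification."""
    groups = {}
    for pkt in packets:
        groups.setdefault(classify_ipv4(pkt), []).append(pkt)
    return groups


def make_test_packet(protocol, dst_port):
    """Build a minimal, mostly zeroed IPv4 packet for experimentation."""
    header = bytearray(20)
    header[0] = 0x45                      # version 4, IHL 5 (20 bytes)
    header[9] = protocol
    return bytes(header) + struct.pack("!HH", 40000, dst_port) + bytes(4)
```

Each resulting group would then be written to the buffer reserved for its data type.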

12. A system, comprising: at least one network interface comprising at least one first processor to: receive a stream of packets from a network; split the stream of packets into at least one packet stream subset; for each packet in the stream of packets, analyze packet contents to determine a specific data type to which the respective packet corresponds; filter the stream of packets into a plurality of groups, each group based on the specific data type; insert each packet of a first group into a corresponding first buffer in memory of at least one graphics processing unit using direct memory access; assign each of the packets of the first group an index representing an offset indicating a location in the memory of the at least one graphics processing unit; determine that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit; and transmit an interrupt to the at least one graphics processing unit corresponding to the pre-configured buffer flow capacity regarding the first buffer in the least one graphics processing unit; and the at least one graphics processing unit connected to the at least one network interface over a bus and comprising at least one second processor to: start a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.
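Claim 12 adds a step that splits the incoming stream into one or more subsets before filtering. The claim does not say how the split is made; the sketch below assumes a hash of a per-flow key so that packets of the same flow stay together. `split_stream` and the `flow_key` callback are hypothetical.

```python
import zlib


def split_stream(packets, n_subsets, flow_key):
    """Partition packets into n_subsets by hashing a per-flow key (bytes)."""
    subsets = [[] for _ in range(n_subsets)]
    for pkt in packets:
        # CRC32 is used only as a cheap, deterministic hash for this sketch.
        subsets[zlib.crc32(flow_key(pkt)) % n_subsets].append(pkt)
    return subsets
```

Because the hash is deterministic, every packet carrying the same key lands in the same subset, and arrival order within a subset is preserved.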

13. The system of claim 12, wherein the at least one network interface assigns a classification to each packet in each packet stream subset and inserts each packet in each packet stream subset into the first buffer in memory of the at least one graphics processing unit responsive to the classification using direct memory access.

14. A server, comprising: at least one network interface comprising at least one first processor to: receive a plurality of packets from a network; for each packet in the plurality of packets, analyze packet contents to determine a specific data type to which the respective packet corresponds; filter the plurality of packets into a plurality of groups, each group based on the specific data type; insert the plurality of packets of a first group into a corresponding first buffer in memory of at least one graphics processing unit using direct memory access; assign each of the packets of the first group an index representing an offset indicating a location in the memory of the at least one graphics processing unit; determine that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit; and transmit an interrupt to the at least one graphics processing unit corresponding to the pre-configured buffer flow capacity regarding the first buffer in the least one graphics processing unit; and the at least one graphics processing unit connected to the at least one network interface over a bus and comprising at least one second processor to: start a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.
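The failover clause that closes this claim can be sketched as a routing decision made before the packets are delivered. How a failure is detected is not stated in the claims; the `healthy` flag, the `Gpu` class, and the `deliver` function below are assumptions used only to show the control flow.

```python
class Gpu:
    """Minimal stand-in for a GPU with one buffer and an interrupt counter."""

    def __init__(self, name):
        self.name = name
        self.healthy = True   # assumed health flag; detection is out of scope
        self.buffer = []
        self.interrupts = 0

    def interrupt(self):
        self.interrupts += 1


def deliver(packets, primary, backups):
    """Send packets to the primary GPU, or to the first healthy backup."""
    if primary.healthy:
        target = primary
    else:
        target = next((gpu for gpu in backups if gpu.healthy), None)
    if target is None:
        raise RuntimeError("no healthy GPU available")
    target.buffer.extend(packets)
    target.interrupt()
    return target.name
```

Marking `primary.healthy = False` makes the next call place the packets in the backup GPU's buffer and raise the interrupt there instead.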

15. The server of claim 14, wherein the at least one network interface assigns a classification to each of the plurality of packets and inserts each of the packets into a second buffer in the memory of the at least one graphics processing unit responsive to the classification using direct memory access.

16. A method, comprising: receiving, by at least one network interface comprising at least one first processor, a plurality of packets from a network; analyzing, by the at least one network interface comprising at least one first processor, packet contents of each packet in the plurality of packets to determine a specific data type to which the respective packet corresponds; filtering, by the at least one network interface comprising at least one first processor, the plurality of packets into a plurality of groups, each group based on the specific data type; inserting, by the at least one network interface comprising the at least one first processor, each of the plurality of packets of a first group into a corresponding first buffer in memory of at least one graphics processing unit using direct memory access; assigning, by the at least one network interface comprising the at least one first processor, each of the packets of the first group an index representing an offset indicating a location in the memory of the at least one graphics processing unit; determining, by the at least one network interface comprising the at least one first processor, that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit; transmitting, by the at least one network interface comprising the at least one first processor, an interrupt to the at least one graphics processing unit corresponding to the pre-configured buffer flow capacity regarding the first buffer in the at least one graphics processing unit; and starting, by the at least one graphics processing unit comprising at least one second processor, a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.

17. The method of claim 16, wherein the at least one graphics processing unit is connected to the at least one network interface over a bus.

18. The method of claim 17, further comprising: generating, by the at least one graphics processing unit, output packets into an output buffer in the memory of the at least one graphics processing unit.

19. The method of claim 18, further comprising: storing the output packets in a storage device connected to the bus using a non-volatile memory host controller interface.

20. The method of claim 18, further comprising: notifying, by the at least one graphics processing unit, the at least one network interface that the output packets in the output buffer in the memory of the at least one graphics processing unit are ready to transmit.

21. The method of claim 17, wherein the bus comprises a Peripheral Component Interface Express bus.

22. The method of claim 16, further comprising: inserting, by the at least one network interface comprising the at least one first processor, each of the plurality of packets of a first group into a corresponding second buffer in memory of at least one graphics processing unit using direct memory access; assigning, by the at least one network interface comprising the at least one first processor, each of the packets of the second group an index representing an offset indicating a location in the memory of the at least one graphics processing unit; determining, by the at least one network interface comprising the at least one first processor, that a pre-configured second buffer flow capacity has been reached regarding the second buffer in the at least one graphics processing unit, and transmitting a second interrupt to the at least one graphics processing unit corresponding to the second pre-configured buffer flow capacity regarding the second buffer in the least one graphics processing unit; and starting, by the at least one graphics processing unit comprising the at least one second processor, a second kernel preconfigured with packet handling code adapted to process packets of the specific data type in the at least one graphics processing unit in response to the second interrupt to process second packets in the second buffer.

23. The method of claim 16, further comprising: mapping, by a central processing unit comprising at least one third processor, each buffer in the memory of the at least one graphics processing unit to at least one interrupt.

24. The method of claim 16, further comprising: locating, by the at least one graphics processing unit, each packet in the first buffer in the memory of the at least one graphics processing unit using the index and executing the first kernel to process each packet in the first buffer by at least one thread simultaneously executing first kernel code in lockstep.

25. The method of claim 16, wherein the first group of packets is associated with at least one member of a group consisting of RADIUS authentication, Diameter processing, SSL processing, TCP/HTTP requests, SMS messaging, and SIP messaging.

26. The method of claim 16, wherein the pre-configured buffer flow capacity regarding the first buffer in the least one graphics processing unit corresponds to at least one member of a group consisting of one of percentage of buffer memory used in the first buffer, buffer memory remaining in the first buffer, a number of packets currently in the first buffer, an elapsed time since a first packet was received in the first buffer, and an elapsed time since a last packet was received in the first buffer.

27. The method of claim 16, further comprising: assigning, by the at least one network interface, a classification to each of the packets and inserting each of the packets into the first buffer in the memory of the at least one graphics processing unit responsive to the classification using direct memory access.

28. A method, comprising: receiving, by at least one network interface comprising at least one first processor, a stream of packets from a network; splitting, by the at least one network interface comprising the at least one first processor, the stream of packets into at least one packet stream subset; analyzing, by the at least one network interface comprising at least one first processor, packet contents of each packet in the at least one packet stream subset to determine a specific data type to which the respective packet corresponds; filtering, by the at least one network interface comprising at least one first processor, the at least one packet stream subset into a plurality of groups, each group based on the specific data type; inserting, by the at least one network interface comprising the at least one first processor, each packet of a first group into a corresponding first buffer in memory of at least one graphics processing unit using direct memory access; assigning, by the at least one network interface comprising the at least one first processor, each of the packets of the first group an index representing an offset indicating a location in the memory of the at least one graphics processing unit; determining, by the at least one network interface comprising the at least one first processor, that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit; transmitting, by the at least one network interface comprising the at least one first processor, an interrupt to the at least one graphics processing unit corresponding to the pre-configured buffer flow capacity regarding the first buffer in the at least one graphics processing unit; and starting, by the at least one graphics processing unit comprising at least one second processor, a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.

29. The method of claim 28, wherein the at least one graphics processing unit is connected to the at least one network interface over a bus.

30. The method of claim 28, further comprising: assigning, by the at least one network interface, a classification to each packet in each packet stream subset; and inserting, by the at least one network interface, each packet in each packet stream subset into the first buffer in the memory of at least one graphics processing unit using direct memory access.

31. A system, comprising: at least one network interface and at least one graphics processing unit communicating over a bus to execute computer-executable instructions to: receive a plurality of packets from a network by the at least one network interface; for each packet in the plurality of packets, analyze packet contents to determine a specific data type to which the respective packet corresponds; filter the plurality of packets into a plurality of groups, each group based on the specific data type; insert the plurality packets of a first group into a corresponding first buffer in memory of the at least one graphics processing unit using direct memory access; assign each of the packets of the first group an index by the at least one network interface representing an offset indicating a location in memory of the at least one graphics processing unit; transmit an interrupt to the at least one graphics processing unit regarding the first buffer in the least one graphics processing unit; and start a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in the at least one graphics processing unit in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.

32. The system of claim 31, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: determine that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit.

33. The system of claim 31, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: insert the plurality packets of a first group into a corresponding second buffer in memory of the at least one graphics processing unit using direct memory access; determine that a second pre-configured buffer flow capacity has been reached regarding the second buffer in the at least one graphics processing unit; transmit a second interrupt to the at least one graphics processing unit corresponding to the second pre-configured buffer flow capacity regarding the second buffer in the least one graphics processing unit; and start a second kernel preconfigured with packet handling code adapted to process packets of the specific data type in the at least one graphics processing unit in response to the second interrupt to process second packets in the second buffer; wherein a substantially identical set of computer-readable instructions associated with the second kernel is executed on each of the packets in the second buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet.

34. The system of claim 31, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: map each buffer in the memory of the at least one graphics processing unit to at least one interrupt.

35. The system of claim 31, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: locate each packet in the first buffer in the memory of the at least one graphics processing unit using the index and execute the first kernel to process each packet in the first buffer by at least one thread simultaneously executing first kernel code in lockstep in the at least one graphics processing unit.

36. The system of claim 31, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: generate by the at least one graphics processing unit output packets into an output buffer in the memory of the at least one graphics processing unit.

37. The system of claim 36, further comprising: a storage device connected to the bus to store the output packets using a non-volatile memory host controller interface.

38. The system of claim 36, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: notify the at least one network interface that the output packets in the output buffer in the memory of the at least one graphics processing unit are ready to transmit.

39. The system of claim 31, wherein the at least one packet is associated with at least one member of a group consisting of RADIUS authentication, Diameter processing, SSL processing, TCP/HTTP requests, SMS messaging, and SIP messaging.

40. The system of claim 31, wherein the pre-configured buffer flow capacity regarding the first buffer in the least one graphics processing unit corresponds to at least one member of a group consisting of percentage of buffer memory used in the first buffer, buffer memory remaining in the first buffer, a number of packets currently in the first buffer, an elapsed time since a first packet was received in the first buffer, and an elapsed time since a last packet was received in the first buffer.

41. The system of claim 31, wherein the bus comprises a Peripheral Component Interconnect Express bus.

42. The system of claim 31, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: assign a classification to each of the packets; and insert each of the packets into the first buffer in memory of the at least one graphics processing unit using direct memory access.

43. A system, comprising: at least one network interface and at least one graphics processing unit communicating over a bus to execute computer-executable instructions to: receive a stream of packets from a network by the at least one network interface; split the stream of packets into at least one packet stream subset by the at least one network interface; for each packet in the stream of packets, analyze packet contents to determine a specific data type to which the respective packet corresponds; filter the stream of packets into a plurality of groups, each group based on the specific data type; insert each packet of a first group into a corresponding first buffer in memory of the at least one graphics processing unit using direct memory access; assign each packet of the first group an index by the at least one network interface representing an offset indicating a location in the memory of the at least one graphics processing unit; transmit an interrupt to the at least one graphics processing unit regarding the first buffer in the least one graphics processing unit; and start a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in the least one graphics processing unit in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.

44. The system of claim 43, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: determine that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit.

45. The system of claim 44, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: assign a classification to each packet in each packet stream subset by the at least one network interface; and insert each packet in each packet stream subset into the first buffer in the memory of the at least one graphics processing unit using responsive to the classification using direct memory access.

46. A server comprising: at least one network interface and at least one graphics processing unit communicating over a bus to execute computer-executable instructions to: receive a stream of packets from a network by the at least one network interface; for each packet in the stream of packets, analyze packet contents to determine a specific data type to which the respective packet corresponds; filter the stream of packets into a plurality of groups, each group based on the specific data type; insert each of the packets of a first group into a corresponding first buffer in memory of the at least one graphics processing unit using direct memory access; assign each of the packets of the first group an index by the at least one network interface representing an offset indicating a location in the memory of the at least one graphics processing unit; transmit an interrupt to the at least one graphics processing unit regarding the first buffer in the at least one graphics processing unit; and start a first kernel preconfigured with packet handling code adapted to process packets of the specific data type in the at least one graphics processing unit in response to the interrupt to process the packets in the first buffer; wherein a substantially identical set of computer-readable instructions associated with the first kernel is executed on each of the packets in the first buffer; wherein a plurality of threads are executed on the at least one graphics processing units to process the packets in the first buffer at the index location assigned to each corresponding packet; wherein upon a failure of one or more graphics processing units, send packets to backup GPU buffers and transmit an interrupt to one or more backup graphics processing units.

47. The server of claim 46, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: determine that a pre-configured buffer flow capacity has been reached regarding the first buffer in the at least one graphics processing unit.

48. The server of claim 46, wherein the at least one network interface and the at least one graphics processing unit execute computer-executable instructions to: assign a classification to each of the packets; and insert each of the packets into the first buffer in the memory of the at least one graphics processing unit responsive to the classification using direct memory access.

Patent Metadata

Filing Date: Unknown

Publication Date: March 21, 2017

Inventors: Tracey M. Bernath
