Systems and Methods for Performing 16-Bit Floating-Point Vector Dot Product Instructions

Published: June 11, 2024

Assignee: not available in the USPTO data we have

Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH

Patent Claims

17 claims

Legal claims defining the scope of protection, as filed with the USPTO.

2. The processor of claim 1, wherein the locations of each of the specified source and destination vectors are either in registers or in memory.

3. The processor of claim 1, wherein N is specified by the single instruction and has a value of one of 4, 8, 16, and 32.

4. The processor of claim 1, wherein the execution circuitry is to perform the multiplications without saturation and to saturate the result of the accumulation to plus or minus infinity in case of an overflow and to zero in case of any underflow.

5. The processor of claim 1, wherein the 16-bit floating-point format is bfloat16.

6. The processor of claim 1, wherein the execution circuitry is to generate all N elements of the specified destination in parallel.

8. The method of claim 7, wherein the locations of each of the specified source and destination vectors are either in registers or in memory.

9. The method of claim 7, wherein N is specified by the single instruction and has a value of one of 4, 8, 16, and 32.

10. The method of claim 7, wherein the execution circuitry is to perform the multiplications without saturation and to saturate the result of the accumulation to plus or minus infinity in case of an overflow and to zero in case of any underflow.

11. The method of claim 7, wherein the 16-bit floating-point format is bfloat16.

12. The method of claim 7, wherein the execution circuitry is to generate all N elements of the specified destination in parallel.

14. The system of claim 13, wherein the locations of each of the specified source and destination vectors are either in registers in the processor or in the memory.

15. The system of claim 13, wherein the 16-bit floating-point format is bfloat16.

16. The system of claim 13, wherein the execution circuitry is to generate all N elements of the specified destination in parallel.

18. The non-transitory machine-readable medium of claim 17, wherein the single instruction further includes a field to specify N, wherein N is an even number larger than 4.

19. The non-transitory machine-readable medium of claim 17, wherein the execution circuitry is to perform the multiplications without saturation and to saturate the result of the accumulation to plus or minus infinity in case of an overflow and to zero in case of any underflow.

20. The non-transitory machine-readable medium of claim 17, wherein the 16-bit floating-point format is bfloat16.

21. The non-transitory machine-readable medium of claim 17, wherein the execution circuitry is to generate all N elements of the specified destination in parallel.
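The behavior recited in the dependent claims above (bfloat16 sources, unsaturated multiplications, and an accumulation that saturates to plus or minus infinity on overflow and to zero on underflow) can be illustrated with a minimal software sketch. The independent claims are not reproduced in this listing, so several details here are assumptions rather than the claimed design: that each of the N destination elements is a 32-bit float accumulating the products of one adjacent pair of 16-bit source elements, that bfloat16 conversion truncates rather than rounds, and that "underflow" means any nonzero result smaller than the smallest normal 32-bit float. All function names (`to_bf16_bits`, `from_bf16_bits`, `saturate_f32`, `dp_bf16_ps`) are hypothetical.

```python
import math
import struct

FLT_MAX = 3.4028234663852886e38          # largest finite float32
FLT_MIN_NORMAL = 1.1754943508222875e-38  # smallest normal float32 (2**-126)


def to_bf16_bits(x: float) -> int:
    """Encode x as a bfloat16 bit pattern: the upper 16 bits of its float32
    encoding. Truncation (no rounding) is an assumption of this sketch."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def from_bf16_bits(b: int) -> float:
    """Decode a bfloat16 bit pattern by padding it with 16 zero bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


def saturate_f32(x: float) -> float:
    """Assumed saturation rule: overflow -> +/-infinity, underflow -> zero."""
    if math.isnan(x):
        return x
    if abs(x) > FLT_MAX:
        return math.copysign(math.inf, x)
    if x != 0.0 and abs(x) < FLT_MIN_NORMAL:
        return math.copysign(0.0, x)
    # Round the in-range value to float32 precision.
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dp_bf16_ps(dst, src1, src2):
    """Software model: dst holds N float32 accumulators; src1 and src2 hold
    2*N bfloat16 bit patterns. The adjacent-pair layout is an assumption."""
    n = len(dst)
    if len(src1) != 2 * n or len(src2) != 2 * n:
        raise ValueError("each source must hold 2*N bfloat16 elements")
    out = []
    for i in range(n):  # hardware would produce all N lanes in parallel
        acc = dst[i]
        for j in (2 * i, 2 * i + 1):
            # Products are formed in double precision, i.e. without saturation.
            acc += from_bf16_bits(src1[j]) * from_bf16_bits(src2[j])
        out.append(saturate_f32(acc))  # only the accumulated result saturates
    return out
```

For example, with N = 4, `src1` holding the bfloat16 encodings of 1.0 through 8.0, `src2` holding eight copies of 0.5, and `dst` initialized to `[1.0, 1.0, 1.0, 1.0]`, the sketch returns `[2.5, 4.5, 6.5, 8.5]`. The function does not restrict N to the values named in the claims; a caller would need to enforce that separately.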

Patent Metadata

Filing Date

Unknown

Publication Date

June 11, 2024

Inventors

Alexander F. HEINECKE

Robert VALENTINE

Mark J. CHARNEY

Raanan SADE

Menachem ADELMAN

Zeev SPERBER

Amit GRADSTEIN

Simon RUBANOVICH
