9069545

Relaxation of Synchronization for Iterative Convergent Computations

Published: June 30, 2015
Assignee: not available in the USPTO data we have
Patent Claims
30 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method of relaxing synchronization in a parallel computing system, said method comprising: providing a system including a plurality of processors, said plurality of processors including at least one processor containing relaxation data registers and decoders; generating an application program from a source program, said application running multiple processing threads in parallel, at least one thread performing iterative convergent computations in an iterative loop, wherein said application program includes an instruction for storing data representing a mode of synchronization relaxation in one of said relaxation data registers; and running said application program in said system to store said data in said relaxation data registers, wherein said system decodes at least one command in said application program for synchronized data access between processors in said system as at least one command for unsynchronized data access based on said stored data, said at least one command for unsynchronized data access employed to substitute an atomic operation applied to a variable in an iterative loop with a non-atomic operation.

2. The method of claim 1, further comprising: storing data indicating activation of said mode of synchronization relaxation in a first relaxation data register among said relaxation data registers; and storing data indicating frequency of synchronization relaxation in a second relaxation data register among said relaxation data registers.

3. The method of claim 1, further comprising generating said application program by compiling said source program employing a compiler module including a compiler that runs in at least one computing means, said compiler module including instruction to store said data in said relaxation data registers, and said source program includes at least one instance of a compiler directive that enables said mode for synchronization relaxation.

4. The method of claim 3, further comprising profiling, employing said compiler, combinations of synchronization commands and data classes to be synchronized according to said source program, wherein at least one combination of a data class and a synchronization command is identified during said profiling, wherein a frequency of synchronization in said at least one combination is reducible to a level less than 100% of occurrences specified in said source program without projected violation of a quality condition for a solution to be generated from said application program.

5. The method of claim 4, further comprising marking a plurality of synchronization points in said source program with a hint, wherein said compiler assigns priority for identification as said at least one combination to commands for synchronized data access in said marked plurality of synchronization points over commands for synchronized data access in unmarked synchronization points.

6. The method of claim 4, wherein said data stored in said relaxation data registers identify said at least one combination, and at least one command for synchronized data access among said at least one combination in said application program is decoded as at least one command for unsynchronized data access.

7. The method of claim 6, further comprising: generating for one of said at least one combination, employing said compiler, a reduced frequency of synchronization that is not less than a minimum frequency of synchronization that avoids projected violation of said quality condition and less than corresponding frequency of synchronization specified in said source program; and transmitting, employing said system, to said relaxation data registers parameters for said reduced frequency of synchronization.

8. The method of claim 7, wherein a fraction of commands for synchronized data access corresponding to said one of said at least one combination in said source program is executed as commands for unsynchronized data access during said running of said application program, wherein said fraction is greater than a ratio between said reduced frequency of synchronization to said corresponding frequency of synchronization and is less than 1.

9. The method of claim 1, wherein said at least one command for unsynchronized data access in said application program includes at least one of an unsynchronized read command and an unsynchronized write command.
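As an illustration of why an iterative convergent loop can tolerate the substitution recited in claim 1, the following single-threaded Python sketch compares a sweep that reads a snapshot (standing in for synchronized access) with a sweep that reads whatever values are currently stored (standing in for unsynchronized access). It does not model hardware atomics, registers, or decoders. The function name `solve`, the matrix `A`, and the vector `B` are assumptions made for this example and do not come from the patent.

```python
def solve(a, b, relaxed, tol=1e-10, max_iters=10_000):
    """Iterate x_i = (b_i - sum_{j != i} a_ij * x_j) / a_ii until updates fall below tol.

    relaxed=False: each sweep reads a snapshot taken at the start of the sweep,
    a stand-in for synchronizing between sweeps.
    relaxed=True: each update reads the current contents of x, so a sweep mixes
    old and new values, a stand-in for unsynchronized access.
    """
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_iters + 1):
        source = x if relaxed else list(x)
        delta = 0.0
        for i in range(n):
            s = sum(a[i][j] * source[j] for j in range(n) if j != i)
            new = (b[i] - s) / a[i][i]
            delta = max(delta, abs(new - x[i]))
            x[i] = new
        if delta < tol:
            return x, sweep
    raise RuntimeError("did not converge")


# Hypothetical strictly diagonally dominant system whose exact solution is [1, 1, 1].
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
B = [5.0, 6.0, 5.0]

x_sync, sweeps_sync = solve(A, B, relaxed=False)
x_relaxed, sweeps_relaxed = solve(A, B, relaxed=True)
```

Both variants reach the same fixed point for this system. Whether a real workload tolerates relaxation depends on its own quality condition, which the claims leave to profiling.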

10. A method of relaxing synchronization in a parallel computing system, said method comprising: providing a system including a plurality of processors, said plurality of processors including at least one processor containing relaxation data registers and decoders; providing an application program and a set of parameters, said application running multiple processing threads in parallel, at least one thread performing iterative convergent computations in an iterative loop, and wherein the set of parameters include values for storing data representing a mode of synchronization relaxation in one of the relaxation data registers; and running said application program in said system to store said data in said relaxation data registers, wherein said system decodes at least one command in said application program for synchronized data access between processors in said system as at least one command for unsynchronized data access based on said stored data, said at least one command for unsynchronized data access substituting an atomic operation applied to a variable in an iterative loop with a non-atomic operation.

11. The method of claim 10, further comprising: storing data indicating activation of said mode of synchronization relaxation in a first relaxation data register among said relaxation data registers; and storing data indicating frequency of synchronization relaxation in a second relaxation data register among said relaxation data registers.

12. The method of claim 10, wherein said data stored in said relaxation data registers identify at least one combination of synchronization commands and data classes to be synchronized, and at least one command for synchronized data access, which is present among said at least one combination and is provided in said application program, is decoded as at least one command for unsynchronized data access.
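The parameter-driven flow of claims 10 and 12 can be sketched in software as follows. This is a minimal model, not the patented hardware: a parameter set is loaded into a register object, and a decode step relaxes only the (synchronization command, data class) combinations that the registers identify. The opcode names, the parameter keys `mode_active` and `combinations`, and the class and function names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from synchronized opcodes to unsynchronized counterparts.
UNSYNCHRONIZED_FORM = {
    "atomic_add": "add",
    "locked_load": "load",
    "locked_store": "store",
}


@dataclass
class RelaxationRegisters:
    """Software stand-in for the relaxation data registers."""
    mode_active: bool = False
    relaxed_combinations: set = field(default_factory=set)  # {(command, data_class)}


def load_parameters(params):
    """Store a parameter set (assumed to be a dict) into the registers."""
    return RelaxationRegisters(
        mode_active=bool(params.get("mode_active", False)),
        relaxed_combinations={tuple(c) for c in params.get("combinations", [])},
    )


def decode(command, data_class, regs):
    """Return the unsynchronized form only for combinations the registers identify."""
    if regs.mode_active and (command, data_class) in regs.relaxed_combinations:
        return UNSYNCHRONIZED_FORM.get(command, command)
    return command


regs = load_parameters(
    {"mode_active": True, "combinations": [["atomic_add", "residual"]]}
)
relaxed = decode("atomic_add", "residual", regs)   # combination is listed
untouched = decode("atomic_add", "counter", regs)  # combination is not listed
```

Keying the lookup on the pair rather than on the opcode alone keeps unrelated data, such as a loop counter, under full synchronization.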

13. A method of relaxing synchronization in parallel computing, said method comprising: providing a compiler module that enables recognition of a compiler directive for selective relaxation of synchronization during compilation; providing a source program for an application, said source program including at least one instance of said compiler directive; generating, employing a compiler program that runs in at least one computing means, an application program from said source program by compiling said source program, said application running multiple processing threads in parallel, at least one thread performing iterative convergent computations in an iterative loop wherein at least one command for synchronized data access in said source program is compiled as at least one command for unsynchronized data access in said application program, said at least one command for unsynchronized data access substituting an atomic operation applied to a variable in an iterative loop with a non-atomic operation; and running said application program in a system including a plurality of processors and configured for parallel computing.

14. The method of claim 13, further comprising marking a plurality of synchronization points in said source program with a hint, wherein said compiler assigns priority for compilation as commands for unsynchronized data access to commands for synchronized data access in said marked plurality of synchronization points over commands for synchronized data access in unmarked synchronization points.

15. The method of claim 13, further comprising profiling, employing said compiler, combinations of synchronization commands and data classes to be synchronized according to a code in said source program, wherein at least one combination of a data class and a synchronization command is identified during said profiling, wherein a frequency of synchronization is reducible to a level less than 100% of the occurrences specified in said source program without projected violation of a quality condition for a solution for said application program in said at least one combination.

16. The method of claim 15, further comprising generating for one of said at least one combination, employing said compiler, a reduced frequency of synchronization that is not less than a minimum frequency of synchronization that avoids projected violation of said quality condition and less than corresponding frequency of synchronization specified in said code in said source program.

17. The method of claim 16, wherein a fraction of commands for synchronized data access corresponding to said one of said at least one combination in said source program is compiled as commands for unsynchronized data access in said application program, wherein said fraction is greater than a ratio between said reduced frequency of synchronization to said corresponding frequency of synchronization and is less than 1.

18. The method of claim 13, wherein said at least one command for unsynchronized data access in said application program includes at least one of an unsynchronized read command and an unsynchronized write command.

19. The method of claim 13, further comprising specifying at least one quality condition for a solution to be generated in said source program.
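The compile-time substitution of claim 13 can be pictured with a toy lowering pass. The sketch below treats a source program as a list of instruction strings and relaxes the synchronized command that immediately follows a directive. The directive spelling `#pragma relax_sync`, the opcode names, and the function name are assumptions for illustration; the claims do not specify any concrete syntax.

```python
RELAX_DIRECTIVE = "#pragma relax_sync"  # hypothetical directive spelling

# Hypothetical mapping from synchronized opcodes to unsynchronized counterparts.
UNSYNCHRONIZED_FORM = {"atomic_add": "add", "atomic_store": "store"}


def compile_with_relaxation(source_lines):
    """Lower a toy instruction list, relaxing the command after each directive.

    The directive itself is consumed and does not appear in the output.
    Commands not preceded by a directive are emitted unchanged.
    """
    output = []
    relax_next = False
    for line in source_lines:
        stripped = line.strip()
        if stripped == RELAX_DIRECTIVE:
            relax_next = True
            continue
        opcode, _, operands = stripped.partition(" ")
        if relax_next and opcode in UNSYNCHRONIZED_FORM:
            opcode = UNSYNCHRONIZED_FORM[opcode]
        relax_next = False
        output.append(f"{opcode} {operands}".strip())
    return output


source = [
    "atomic_add total, x",
    "#pragma relax_sync",
    "atomic_add residual, r",
    "atomic_store done, 1",
]
application = compile_with_relaxation(source)
```

Only the update to `residual` is lowered to a plain `add` here; the unmarked commands keep their synchronized form, which mirrors the selective nature of the directive.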

20. A system for parallel computing, said system comprising a plurality of processors configured to run an application in a parallel computing mode, said application running multiple processing threads in parallel, at least one thread performing iterative convergent computations in an iterative loop, wherein at least one of said plurality of processors includes relaxation data registers and a decoder that is configured to either convert a command for synchronized data access in the application program into a command for unsynchronized data access or transmit said command for synchronized data access unmodified based on contents of data stored in said relaxation data registers, wherein said at least one command for unsynchronized data access substitutes an atomic operation applied to a variable in an iterative loop with a non-atomic operation.

21. The system of claim 20, wherein said relaxation data registers include: a first relaxation data register configured to store data indicating activation of a mode of synchronization relaxation; and a second relaxation data register configured to store data indicating frequency of synchronization relaxation.

22. The system of claim 20, wherein said decoder includes an opcode mapper table which, when looked up with a command for synchronized data access in an application program, returns a command for unsynchronized data access or transmit said command for synchronized data access unmodified based on contents of data stored in said relaxation data registers.

23. The system of claim 20, wherein said at least one of said plurality of processors is configured to decode an application program to determine presence of a code that allows selective relaxation of synchronization on a combination of a data class and a synchronization command, to detect parameters for selective relaxation of synchronization on said combination, and to store said parameters to said relaxation data registers as data.

24. The system of claim 20, wherein said at least one of said plurality of processors is configured to convert every k-th command for synchronized data access within a combination of a data class and a synchronization command in said application program into said command for unsynchronized data access, and transmit unmodified other commands for synchronized data access within the combination, wherein k is an integer greater than 1.

25. The system of claim 20, wherein said at least one of said plurality of processors is configured to transmit unmodified every k-th command for synchronized data access within a combination of a data class and a synchronization command in said application program, and convert other commands for synchronized data access within the combination into said command for unsynchronized data access, wherein k is an integer greater than 1.
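The decoder behavior in claims 21, 22, 24, and 25 can be modeled in software as a small state machine. In this sketch, `OPCODE_MAP` stands in for the opcode mapper table, `mode_active` for the first relaxation data register, and `k` for the frequency held in the second. A single counter represents one combination of data class and synchronization command. All names and opcodes are hypothetical, and the flag `relax_kth` is an assumption introduced here to switch between the two claimed patterns.

```python
from dataclasses import dataclass

# Hypothetical opcode mapper table: synchronized opcode -> unsynchronized opcode.
OPCODE_MAP = {"atomic_add": "add", "locked_load": "load", "locked_store": "store"}


@dataclass
class Decoder:
    mode_active: bool        # stand-in for the first relaxation data register
    k: int                   # stand-in for the second register; must exceed 1
    relax_kth: bool = True   # True: convert every k-th; False: keep every k-th
    seen: int = 0            # synchronized commands seen for this combination

    def __post_init__(self):
        if self.k <= 1:
            raise ValueError("k must be an integer greater than 1")

    def decode(self, command):
        """Return the command to issue: converted or transmitted unmodified."""
        if not self.mode_active or command not in OPCODE_MAP:
            return command
        self.seen += 1
        is_kth = self.seen % self.k == 0
        if is_kth == self.relax_kth:
            return OPCODE_MAP[command]
        return command


convert_every_third = Decoder(mode_active=True, k=3)
pattern_a = [convert_every_third.decode("atomic_add") for _ in range(6)]

keep_every_third = Decoder(mode_active=True, k=3, relax_kth=False)
pattern_b = [keep_every_third.decode("atomic_add") for _ in range(6)]
```

With k = 3, the first pattern relaxes one command in three, while the second relaxes two in three and keeps every third command synchronized.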

26. At least one non-transitory machine readable data storage medium embodying a plurality of programs, said plurality of programs comprising: a compiler module including a code for enabling recognition of a compiler directive for selective relaxation of synchronization during compilation; and a compiler configured to use said compiler module to recognize said compiler directive and to compile at least one command for synchronized data access in a source program as at least one command for unsynchronized data access in an application program upon detection of said compiler directive for selective relaxation of synchronization, wherein said application program runs multiple processing threads in parallel, at least one thread performing iterative convergent computations in an iterative loop, and said at least one command for unsynchronized data access employed substitutes an atomic operation applied to a variable in an iterative loop with a non-atomic operation.

27. The at least one non-transitory machine readable data storage medium of claim 26, wherein said compiler is configured to perform the steps of: profiling combinations of synchronization commands and data classes to be synchronized according to a code in said source program; and identifying at least one combination of a data class and a synchronization command, wherein a frequency of synchronization is reducible to a level less than 100% of the occurrences specified in said source program without projected violation of a quality condition for a solution for said application program in said at least one combination.

28. The at least one non-transitory machine readable data storage medium of claim 27, wherein said compiler is configured to perform a step of generating, for one of said at least one combination, a reduced frequency of synchronization that is not less than a minimum frequency of synchronization that avoids projected violation of said quality condition and less than corresponding frequency of synchronization specified in said code in said source program.

29. The at least one non-transitory machine readable data storage medium of claim 26, wherein said source program is stored in said at least one non-transitory machine readable data storage medium and includes at least one of said compiler directive.

30. The at least one non-transitory machine readable data storage medium of claim 26, wherein said application program is stored in said at least one non-transitory machine readable data storage medium and includes: at least one command for unsynchronized data access corresponding to at least one command for synchronized data access within a combination of a data class and a synchronization command in said source program; and at least another command for synchronized data access corresponding to at least another command for synchronized data access within said combination.
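The frequency selection described in claims 27 and 28 can be sketched as a search over candidate synchronization frequencies. The claims only require a reduced frequency that stays at or above the minimum that avoids a projected quality violation and below the frequency specified in the source. Picking the lowest passing candidate, as done here, is one possible policy and an assumption of this example. The function name, the candidate list, and the toy quality check are hypothetical.

```python
def choose_reduced_frequency(specified, candidates, meets_quality):
    """Pick the lowest candidate frequency below `specified` that passes the check.

    specified:     synchronization frequency written in the source program
                   (1.0 means every occurrence is synchronized).
    candidates:    frequencies the profiler is willing to try.
    meets_quality: callable returning True when the projected solution quality
                   is acceptable at the given frequency.
    Returns `specified` unchanged when no candidate qualifies.
    """
    passing = [f for f in candidates if 0 < f < specified and meets_quality(f)]
    return min(passing) if passing else specified


def toy_quality_check(frequency):
    """Hypothetical projection: quality holds when at least 25% of syncs remain."""
    return frequency >= 0.25


reduced = choose_reduced_frequency(
    specified=1.0,
    candidates=[0.1, 0.25, 0.5, 0.75],
    meets_quality=toy_quality_check,
)
```

In a real profiler the quality check would come from running or modeling the application, not from a fixed threshold.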

Patent Metadata

Filing Date

Unknown

Publication Date

June 30, 2015

Inventors

Lakshminarayanan Renganarayana
Vijayalakshmi Srinivasan

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “RELAXATION OF SYNCHRONIZATION FOR ITERATIVE CONVERGENT COMPUTATIONS” (9069545). https://patentable.app/patents/9069545