A computer-implemented method for creating all-atom, real-space protein conformers. The method involves constructing a backbone structure of α-carbons of a protein from the amino acid sequence of the protein by adding and removing carbon atoms through chain elongation and backtracking. An atom is positioned based on a predicted two-dimensional space, and backtracking removes an atom if it is closer to its neighbour than allowed by van der Waals radii. The method also involves positioning β-carbons, C, N, and O atoms to provide favourable bond lengths and bond angles, and positioning sidechain rotamers.
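The elongation-and-backtracking idea in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: directions are drawn uniformly on the sphere as a stand-in for the trajectory distributions, the collision test is a brute-force scan rather than a hierarchical data structure, and the names `build_ca_trace`, `clashes`, `tries_per_step`, and `max_steps` are hypothetical. The 3.8 Å spacing is the typical distance between consecutive α-carbons; the 4.0 Å clash threshold is an assumed value chosen only for illustration.

```python
import math
import random

CA_CA_BOND = 3.8      # Å, typical distance between consecutive α-carbons
MIN_SEPARATION = 4.0  # Å, assumed clash threshold for non-bonded α-carbons

def random_unit_vector(rng):
    """Uniform direction on the sphere (stand-in for a trajectory distribution)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def clashes(candidate, chain):
    """True if the candidate is too close to any atom except the one it bonds to."""
    return any(math.dist(candidate, atom) < MIN_SEPARATION for atom in chain[:-1])

def build_ca_trace(n_residues, seed=0, tries_per_step=20, max_steps=100_000):
    """Grow a self-avoiding α-carbon trace, backtracking when elongation is blocked."""
    rng = random.Random(seed)
    chain = [(0.0, 0.0, 0.0)]
    steps = 0
    while len(chain) < n_residues:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("no clash-free trace found within max_steps")
        last = chain[-1]
        for _ in range(tries_per_step):
            d = random_unit_vector(rng)
            candidate = tuple(last[i] + CA_CA_BOND * d[i] for i in range(3))
            if not clashes(candidate, chain):
                chain.append(candidate)  # chain elongation
                break
        else:
            # Every attempt clashed: backtrack by removing the last atom
            # (the starting atom is never removed).
            if len(chain) > 1:
                chain.pop()
    return chain

trace = build_ca_trace(30, seed=1)
print(len(trace), "α-carbons placed")
```

The brute-force `clashes` scan costs O(N) per placement; a spatial index such as a tree or grid would be needed to approach the sub-linear lookups that a hierarchical data structure implies.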
Legal claims defining the scope of protection, as filed with the USPTO.
1. A computer-implemented method for identifying at least one conformation of a protein of known or unknown structure which comprises the steps of: (a) providing an amino acid sequence of the protein; (b) constructing a backbone structure of α-carbons of the amino acid sequence of the protein by adding and removing carbon atoms through chain elongation and backtracking, which comprises randomly sampling trajectory distributions representing a statistical sampling from each amino acid of the conformational space it is observed to visit in known proteins, and backtracking using a hierarchical data structure to remove an atom if it is closer to its neighbor than allowed by van der Waals radii; (c) positioning β-carbons, C, N, and O atoms about the constructed backbone structure of step (b) wherein the atoms are positioned to provide favourable bond lengths and bond angles; (d) positioning sidechain rotamers including adding hydrogen atoms; and (e) displaying the results of steps (a)-(d); thereby outputting at least one conformation of the protein.
2. A method as claimed in claim 1 wherein in step (c) the conformation of the protein is outputted as an all atom protein structure including hydrogen atoms.
3. A method as claimed in claim 1, wherein the conformation of the protein is constructed in O(N log N) time, wherein N is the number of residues in the protein and O means the order of the algorithm.
4. A method as claimed in claim 1, wherein in steps (a) to (d) the conformation of the protein is constructed in three dimensional unconstrained space.
5. A method as claimed in claim 1, wherein the trajectory distributions are resolved into α, β, and coil secondary structure components for each amino acid residue.
6. A method as claimed in claim 5 wherein the trajectory distributions are recombined in predicted α, β, and coil secondary structure proportions for each amino acid to form a starting backbone conformation graph.
7. A method as claimed in claim 1, wherein in step (b) a carbon atom is positioned based on a two-dimensional probability distribution function.
8. A method as claimed in claim 1 wherein the conformation of the protein is stored in memory or on a computer storage medium.
9. A method as claimed in claim 1, comprising reiterating steps (a)-(d) an arbitrary number of times before displaying the results of step (e) to provide a plurality of different conformations of the protein.
10. A method as claimed in claim 9, comprising assembling the plurality of different conformations to provide an ensemble of conformations of the protein.
11. A method as claimed in claim 10 wherein the ensemble of conformations of the protein are incorporated in a database.
12. A method as claimed in claim 11 wherein the database comprises about 50,000 to 500,000 different conformations of the protein.
13. A method as claimed in claim 9 wherein the different conformations of the protein are stored in memory or on a computer storage medium.
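Claims 5 to 7 describe per-component distributions that are recombined in predicted proportions and then sampled as a two-dimensional probability distribution. A minimal sketch of that recombination follows, assuming each component is stored as a small grid of counts. The claims do not say what the two axes represent, so they are treated here as abstract bin indices; the counts, the proportions, and the helper names `normalize`, `mix`, and `sample_cell` are all hypothetical.

```python
import random

def normalize(grid):
    """Scale a 2D grid of non-negative counts so its cells sum to 1."""
    total = sum(sum(row) for row in grid)
    return [[value / total for value in row] for row in grid]

def mix(components, proportions):
    """Weighted sum of normalized component grids; proportions should sum to 1."""
    names = list(components)
    norm = {name: normalize(components[name]) for name in names}
    rows = len(norm[names[0]])
    cols = len(norm[names[0]][0])
    return [
        [sum(proportions[name] * norm[name][r][c] for name in names) for c in range(cols)]
        for r in range(rows)
    ]

def sample_cell(grid, rng):
    """Draw one (row, col) bin with probability proportional to its cell value."""
    cells = [(r, c) for r, row in enumerate(grid) for c in range(len(row))]
    weights = [grid[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]

# Hypothetical counts for one residue type, one grid per secondary structure class.
components = {
    "alpha": [[8, 2], [0, 0]],
    "beta": [[0, 0], [1, 9]],
    "coil": [[1, 1], [1, 1]],
}
# Hypothetical predicted proportions for this residue position.
proportions = {"alpha": 0.5, "beta": 0.3, "coil": 0.2}

mixed = mix(components, proportions)
print(sample_cell(mixed, random.Random(0)))
```

Normalizing each component before mixing keeps the predicted proportions from being skewed by how many observations each class happens to contain.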
January 25, 2000
December 3, 2002