Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for analyzing the values of cellular constituents in a biological sample comprising converting a first profile comprising measurements of a plurality of cellular constituents in said biological sample into a projected profile comprising a plurality of cellular constituent set values according to a definition of co-varying basis cellular constituent sets; wherein said definition is based upon co-variation of measurements of cellular constituents that comprise at least a portion of said plurality of cellular constituents, under a plurality of different perturbations; and wherein said converting comprises projecting said first profile onto said co-varying basis cellular constituent sets, thereby analyzing the values of cellular constituents in said biological sample.
2. The method of claim 1 further comprising the step of indicating the state of said biological sample with said projected profile.
3. The method of claim 1 further comprising the steps of comparing said projected profile with a reference projected profile; and indicating similarity or difference between said projected profile and said reference profile.
4. The method of claim 1 wherein said definition is defined by a similarity tree derived by a cluster analysis of said measurements of cellular constituents comprising at least a portion of said plurality of cellular constituents, under said plurality of perturbations.
5. The method of claim 4 wherein said cellular constituent sets are defined as branches of said similarity tree.
6. The method of claim 5 wherein said branches are selected by applying a cutting level across said tree, wherein said cutting level is determined by expected number of biological pathways represented by said cellular constituents.
7. The method of claim 5 wherein distinction among said branches achieves a statistical significance at 95% confidence level.
8. The method of claim 7 wherein said statistical significance is evaluated with a test using Monte Carlo randomization of an index of said perturbations.
9. The method of claim 5, 6, 7, or 8 wherein said defined co-varying basis cellular constituent sets are refined based upon biological relationships among said cellular constituents comprising at least a portion of said plurality of cellular constituents.
10. The method of claim 1 wherein said definition is: ##EQU6## wherein $V^{(n)}_k$ is the contribution of cellular constituent k to cellular constituent set n.
11. The method of claim 10 wherein said step of converting comprises the execution of the operation: $P = [P_1, \ldots, P_i, \ldots, P_n] = p \cdot V$ wherein $P_i$ is cellular constituent set value i and vector p is a profile of cellular constituents.
12. The method of claim 1 wherein each of said cellular constituent set values is the average value of the level of cellular constituents within a corresponding co-varying basis cellular constituent set.
13. The method of claim 1 wherein each of said cellular set values is a weighted average of the level of cellular constituents within a corresponding co-varying basis cellular constituent set.
14. The method of claim 1 wherein said plurality of measurements is normalized to a unity vector size.
15. The method of claim 1 wherein said measurements of cellular constituents are measurements of responses of said biological sample to a perturbation.
16. A method for defining co-varying cellular constituent sets comprising defining co-varying cellular constituent sets based upon the co-variation of a plurality of measurements of cellular constituents in a biological sample under a plurality of perturbations.
17. The method of claim 16 comprising the step of forming a clustering tree derived by a cluster analysis of similarity of said cellular constituents' behaviors under said plurality of perturbations.
18. The method of claim 17 wherein said cellular constituent sets are defined as branches of said clustering tree.
19. The method of claim 18, wherein said branches are selected by applying a cutting level across said tree, wherein said cutting level is determined by expected number of biological pathways represented by said cellular constituents.
20. The method of claim 19 wherein distinction among said branches achieves a statistical significance at 95% confidence level.
21. The method of claim 20 wherein said statistical significance is evaluated with a test using Monte Carlo randomization of an index of said perturbations.
22. A method for analyzing expression data from a biological sample comprising converting a first expression profile of a plurality of genes in said sample into a projected expression profile containing a plurality of geneset expression values according to a definition of basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, and wherein said converting comprises projecting said first expression profile onto said basis genesets, thereby analyzing said expression data.
23. The method of claim 22 further comprising the step of indicating the state of said biological sample with said projected profile.
24. The method of claim 22 further comprising the steps of comparing said projected expression profile with a reference projected profile; and indicating similarity between said projected expression profile and reference profile.
25. The method of claim 22 wherein said definition is defined by a similarity tree derived by a cluster analysis of the expression of said genes under said plurality of perturbations.
26. The method of claim 25 wherein said basis genesets are defined as branches of said similarity tree.
27. The method of claim 26 wherein said branches are selected by applying a cutting level across said tree, wherein said cutting level is determined by expected number of biological pathways represented by said genes.
28. The method of claim 26 wherein distinction among said branches achieves a statistical significance at 95% confidence level.
29. The method of claim 22 wherein said definition is: ##EQU7## wherein $V^{(n)}_k$ is the contribution of gene k to geneset n.
30. The method of claim 22 wherein said step of converting comprises the execution of the operation: $P = [P_1, \ldots, P_i, \ldots, P_n] = p \cdot V$ wherein $P_i$ is geneset value n; P is said projected profile; and vector p is said expression profile.
31. The method of claim 22 wherein each of said geneset expression values is the average value of the level of expression of said genes within a corresponding basis geneset.
32. The method of claim 22 wherein each of said geneset expression values is a weighted average of the expression of said genes within a corresponding basis geneset.
33. The method of claim 22 wherein said expression profile is a profile of responses of said biological sample to a perturbation.
34. A method for comparing two expression profiles comprising (a) converting a first expression profile into a first projected expression profile according to a definition of basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, and wherein said converting comprises projecting said first expression profile onto said basis genesets; (b) converting a second expression profile into a second projected expression profile according to said definition of basis genesets, wherein said converting comprises projecting said expression profile onto said basis genesets; and (c) determining the generalized angle cosine between the vector of said first projected expression profile and the vector of said second projected expression profile as a similarity metric, thereby comparing said first and second expression profiles.
35. The method of claim 34 further comprising the step of determining the statistical significance of said similarity metric.
36. The method of claim 35 wherein said statistical significance is assessed using an empirical probability distribution generated under the null hypothesis of no correlation.
37. A method for determining the type of an unknown perturbation comprising: (a) converting a first expression profile from a biological sample subjected to an unknown perturbation into a first projected expression profile according to a definition of basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, and wherein said converting comprises projecting said first expression profile onto said basis genesets; and (b) comparing said first projected expression profile with a plurality of reference projected expression profiles to determine the type of said perturbation, wherein each said reference projected expression profile is the product of a method comprising projecting a second expression profile from a biological sample subjected to a different, known perturbation onto basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, wherein similarity of said first projected expression profile to one or more reference projected expression profiles indicates that said unknown perturbation is similar to the known one or more perturbations giving rise to said one or more reference projected profiles.
38. A method for analyzing expression data from a biological sample comprising: (a) selecting a geneset definition from a plurality of geneset definitions, wherein said geneset definitions are stored in a database management system, wherein said geneset definitions are based upon genes having co-varying transcript levels under a plurality of different perturbations; (b) inputting an expression profile of said biological sample; (c) converting said expression profile into a projected expression profile according to said selected geneset definition, wherein said converting comprises projecting said expression profile onto said geneset definition; and (d) displaying said projected expression profile, thereby analyzing expression data from said biological sample.
39. A computer system comprising (a) a processor, (b) storage media for storing a database, and (c) a program module, executable by said processor, said program module comprising computer readable program code for effecting the following steps within said computer system: (i) retrieving a definition of basis genesets from said database, wherein said definition is based upon genes with co-varying transcript levels under a plurality of different perturbations; and (ii) converting a gene expression profile into a projected expression profile according to said definition, wherein said converting comprises projecting said gene expression profile onto said basis genesets.
40. A computer-usable medium having computer readable program code embodied thereon for effecting the following steps within a computer system: (a) retrieving a definition of basis genesets from a database, wherein said definition is based upon genes with co-varying transcript levels under a plurality of different perturbations; and (b) converting a gene expression profile into a projected expression profile according to said definition, wherein said converting comprises projecting said gene expression profile onto said basis genesets.
41. A method for analyzing expression data from a biological sample comprising causing a computer system to execute the following steps: (a) retrieving a definition of genesets from a database, wherein said definition is based upon genes with co-varying transcript levels under a plurality of different perturbations; and (b) converting a gene expression profile into a projected expression profile according to said definition, wherein said converting comprises projecting said gene expression profile onto said basis genesets.
42. The method of claim 1, wherein the plurality of different perturbations are independently selected from the group consisting of expression of a gene under the control of an exogenous titratable promoter, expression of a gene introduced into a cell by transfection or transduction, exposure to ribozymes or antisense nucleic acids or neutralizing antibodies, expression of a dominant negative mutant, and exposure to an exogenous drug.
43. A method for analyzing the values of cellular constituents in a biological sample comprising: (a) measuring a plurality of cellular constituents in said biological sample to produce a first profile; (b) converting said first profile into a projected profile comprising a plurality of cellular constituent set values according to a definition of co-varying basis cellular constituent sets; wherein said definition is based upon co-variation of measurements of cellular constituents that comprise at least a portion of said plurality of cellular constituents, under a plurality of different perturbations; and wherein said converting comprises projecting said first profile onto said co-varying basis cellular constituent sets, thereby analyzing the values of cellular constituents in said biological sample.
44. A method for analyzing the values of cellular constituents in a biological sample comprising: (a) selecting a definition of co-varying basis cellular constituent sets from a plurality of definitions of co-varying basis cellular constituent sets, wherein said definitions are stored in a database management system, and wherein said definitions are based upon co-variation of cellular constituents under a plurality of different perturbations; (b) inputting a profile comprising measurements of a plurality of cellular constituents in said biological sample; (c) converting said profile into a projected profile according to said selected definition of co-varying basis cellular constituent sets; wherein said converting comprises projecting said profile onto said co-varying basis cellular constituent sets; and (d) displaying said projected profile.
45. A computer system comprising (a) a processor, (b) storage media for storing a database, and (c) a program module, executable by said processor, said program module comprising computer readable program code for effecting the following steps within said computer system: (i) retrieving a definition of co-varying basis cellular constituent sets from said database, wherein said definition is based upon co-variation of measurements of cellular constituents under a plurality of different perturbations; and (ii) converting a first profile into a projected profile according to said definition, wherein said first profile comprises measurements of a plurality of cellular constituents in a biological sample, and wherein said converting comprises projecting said first profile onto said co-varying basis cellular constituent sets.
46. A computer-usable medium having computer readable program code embodied thereon for effecting the following steps within a computer system: (a) retrieving a definition of co-varying basis cellular constituent sets from a database, wherein said definition is based upon co-variation of measurements of cellular constituents under a plurality of different perturbations; and (b) converting a first profile into a projected profile according to said definition, wherein said first profile comprises measurements of a plurality of cellular constituents in a biological sample, and wherein said converting comprises projecting said first profile onto said co-varying basis cellular constituent sets.
47. A method for analyzing the values of cellular constituents in a biological sample comprising causing a computer system to execute the following steps: (a) retrieving a definition of co-varying basis cellular constituent sets from a database, wherein said definition is based upon co-variation of measurements of cellular constituents under a plurality of different perturbations; and (b) converting a first profile into a projected expression profile according to said definition, wherein said first profile comprises measurements of a plurality of cellular constituents in a biological sample, and wherein said converting comprises projecting said first profile onto said co-varying basis cellular constituent sets.
48. A method for identifying the cellular pathway affected by a drug comprising (a) converting a first expression profile from a biological sample subjected to said drug into a first projected expression profile according to a definition of basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, and wherein said converting comprises projecting said first expression profile onto said basis genesets; and (b) comparing said first projected expression profile with a plurality of reference projected expression profiles to determine the cellular pathway affected by said drug, wherein each said reference projected expression profile is the product of a method comprising projecting a second expression profile from a biological sample subjected to a different perturbation known to be affecting a particular cellular pathway onto basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, wherein similarity of said first projected expression profile to one or more reference projected expression profiles indicates that said drug affects a cellular pathway known to be affected by the one or more perturbations giving rise to said one or more reference projected profiles.
49. A method for determining whether a drug candidate has an activity similar to a known drug comprising: (a) converting a first expression profile from a biological sample subjected to said drug candidate into a first projected expression profile according to a definition of basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, and wherein said converting comprises projecting said first expression profile onto said basis genesets; and (b) comparing said first projected expression profile with a plurality of reference projected expression profiles to determine whether said drug candidate has an activity similar to a known drug, wherein each said reference projected expression profile is the product of a method comprising projecting a second expression profile from a biological sample subjected to a different known drug onto basis genesets, wherein each of said basis genesets comprises genes having co-varying transcript levels under a plurality of different perturbations, wherein similarity of said first projected expression profile to one or more reference projected expression profiles indicates that said drug candidate has an activity similar to the known one or more drugs giving rise to said one or more reference projected profiles.
50. A method for identifying the cellular pathway affected by a drug comprising (a) converting a first profile from a biological sample subjected to said drug into a first projected profile according to a first definition of co-varying basis cellular constituent sets, wherein said first profile comprises measurements of a plurality of cellular constituents in said biological sample, and wherein said first definition is based upon co-variation of measurements of cellular constituents under a plurality of different perturbations, and wherein said converting comprises projecting said first profile onto said co-varying basis cellular constituent sets; and (b) comparing said first projected profile with a plurality of reference projected profiles to determine the cellular pathway affected by said drug, wherein each said reference projected profile is the product of a method comprising converting a second profile into said reference projected profile according to a second definition of co-varying basis cellular constituent sets, wherein said second profile comprises measurements of a plurality of cellular constituents from a biological sample subjected to a different perturbation known to be affecting a particular cellular pathway, and wherein said second definition is based upon co-variation of measurements of cellular constituents under a plurality of different perturbations, and wherein said converting comprises projecting said second profile onto said co-varying basis cellular constituent sets, wherein similarity of said first projected profile to one or more reference projected profiles indicates that said drug affects a cellular pathway known to be affected by the one or more perturbations giving rise to said one or more reference projected profiles.
51. A method for determining whether a drug candidate has an activity similar to a known drug comprising: (a) converting a first profile from a biological sample subjected to said drug candidate into a first projected profile according to a first definition of co-varying basis cellular constituent sets, wherein said first profile comprises measurements of a plurality of cellular constituents in said biological sample, and wherein said first definition is based upon co-variation of measurements of cellular constituents under a plurality of different perturbations, and wherein said converting comprises projecting said first profile onto said co-varying basis cellular constituent sets; and (b) comparing said first projected profile with a plurality of reference projected profiles to determine whether said drug candidate has an activity similar to a known drug, wherein each said reference projected profile is the product of a method comprising converting a second profile into said reference projected profile according to a second definition of co-varying basis cellular constituent sets, wherein said second profile comprises measurements of a plurality of cellular constituents from a biological sample subjected to a different known drug, and wherein said second definition is based upon co-variation of measurements of cellular constituents under a plurality of different perturbations, and wherein said converting comprises projecting said second profile onto said co-varying basis cellular constituent sets, wherein similarity of said first projected profile to one or more reference projected profiles indicates that said drug candidate has an activity similar to the known one or more drugs giving rise to said one or more reference projected profiles.
52. The method of claim 1, wherein the plurality of different perturbations is at least five different perturbations.
53. The method of any one of claims 1, 22, 34, 37, 38, 41-44, 45-48, or 47-51, wherein the plurality of different perturbations is more than ten different perturbations.
54. The method of any one of claims 1, 22, 34, 37, 38, 41-44, or 47-51, wherein the plurality of different perturbations is more than 50 different perturbations.
55. The method of any one of claims 1, 22, 34, 37, 38, 41-44 or 47-51, wherein the plurality of different perturbations is more than 100 different perturbations.
56. The method of claim 1, 43, 44, 47, 50, or 51, wherein said measurements of a plurality of cellular constituents are measurements of abundances of a plurality of RNA species.
57. The method of claim 1, 43, 44, 47, 50, or 51, wherein said measurements of a plurality of cellular constituents are measurements of abundances of a plurality of protein species.
58. The method of claim 1 wherein said measurements of a plurality of cellular constituents are measurements of activities of a plurality of protein species.
59. The computer system of claim 39 or 45 wherein the plurality of different perturbations is more than 50 different perturbations.
60. The computer-usable medium of claim 40 or 46 wherein the plurality of different perturbations is more than 50 different perturbations.
61. The method of claim 28 wherein said statistical significance is evaluated with a test using Monte Carlo randomization of an index of said perturbations.
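The projection recited in claims 11 and 30 and the similarity metric of claim 34 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each column of V holds equal weights 1/|set| for the members of one basis geneset (the plain average of claims 12 and 31), and it uses the ordinary cosine between two vectors as a stand-in for the "generalized angle cosine". The function names, the example genesets, and the profile values are all hypothetical.

```python
import math


def build_basis(genesets, n_genes):
    """Build V as n_genes rows by len(genesets) columns.

    Assumption: V[k][n] = 1/|set n| if gene k belongs to set n, else 0,
    so that p . V yields the plain average of each set.
    """
    V = [[0.0] * len(genesets) for _ in range(n_genes)]
    for n, members in enumerate(genesets):
        for k in members:
            V[k][n] = 1.0 / len(members)
    return V


def project(p, V):
    """Return P = p . V, one value per basis geneset."""
    n_sets = len(V[0])
    return [sum(p[k] * V[k][n] for k in range(len(p))) for n in range(n_sets)]


def cosine(a, b):
    """Ordinary cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical data: five genes grouped into two basis genesets.
genesets = [[0, 1], [2, 3, 4]]
V = build_basis(genesets, n_genes=5)

p1 = [2.0, 4.0, 1.0, 1.0, 1.0]   # first expression profile
p2 = [1.0, 2.0, 0.5, 0.5, 0.5]   # second profile, half the first

P1 = project(p1, V)              # per-set averages of p1
P2 = project(p2, V)              # per-set averages of p2
similarity = cosine(P1, P2)      # close to 1: same direction
```

Because the second profile is a scaled copy of the first, the two projected profiles point in the same direction and the cosine is close to 1. A weighted average (claims 13 and 32) would only change the entries placed in V.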
March 20, 2001