9047272

System and Methods for Index Selection in Collections of Data

Published: June 2, 2015
Assignee: not available in the USPTO data on hand
Inventors: Wai-Yip To
Patent Claims
19 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for finding an index configuration for a database, the method comprising: under control of a computing system comprising one or more physical computing devices, determining, from a plurality of structured query language (SQL) statements for a database, a set of index candidates for the database, the index candidates comprising at least one of: (1) a column in a table in the database, (2) disabling an index associated with a table in the database, and (3) a composite index for the database, the composite index comprising a plurality of columns from at least one table in the database; forming a gene pool comprising a plurality of genes associated with the index candidates for the database, wherein each gene in the gene pool is a representation of an index candidate from the set of index candidates, the representation comprising an alphanumeric identifier for the index candidate; determining a participation probability for each gene in the gene pool, the participation probability for a gene based at least in part on a total cost of the plurality of SQL statements for the index candidate associated with the gene; generating a parent population of chromosomes, each chromosome comprising one or more genes from the gene pool, wherein the generating comprises filling a gene in a chromosome based at least in part on the participation probability for that gene; evolving the parent population of chromosomes to form an offspring population of chromosomes, wherein the evolving comprises applying at least one of a crossover operator and a mutation operator to at least some of the chromosomes in the parent population; determining a fitness of each of the chromosomes in the offspring population; evaluating whether to terminate the evolving based at least in part on a termination criterion; and providing information associated with one or more of the chromosomes in the offspring population.
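The gene pool, participation probability, and parent-population steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the candidate identifiers, the cost figures, the inverse-cost weighting, and the function names are all assumptions made for illustration, and the claim only requires that the probability be based at least in part on total statement cost.

```python
import random

# Hypothetical total workload cost per index candidate (lower is better).
# Identifiers use an assumed "TABLE.COLUMN" convention; "+" joins composite
# columns and a leading "-" marks a "disable this index" candidate.
CANDIDATE_COSTS = {
    "ORDERS.CUSTOMER_ID": 120.0,
    "ORDERS.STATUS": 300.0,
    "ITEMS.ORDER_ID+ITEMS.SKU": 90.0,
    "-ORDERS.LEGACY_IDX": 250.0,
}


def participation_probabilities(costs):
    """Assumed weighting: inverse of total cost, normalized to sum to 1."""
    weights = {gene: 1.0 / cost for gene, cost in costs.items()}
    total = sum(weights.values())
    return {gene: weight / total for gene, weight in weights.items()}


def make_parent_population(probs, size, length, rng):
    """Fill every slot of every chromosome by sampling from the gene pool."""
    genes = list(probs)
    weights = [probs[gene] for gene in genes]
    # rng.choices samples with replacement, so a gene may repeat in a chromosome.
    return [rng.choices(genes, weights=weights, k=length) for _ in range(size)]


probs = participation_probabilities(CANDIDATE_COSTS)
population = make_parent_population(probs, size=6, length=3, rng=random.Random(0))
```

Cheaper candidates are drawn more often, so the initial population starts biased toward promising indexes while still leaving room for the others.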

2. The method of claim 1, wherein the database comprises a relational database.

3. The method of claim 1, wherein the alphanumeric identifier comprises alphanumeric information identifying a table of the database and a column of the table.

4. A system for managing index configurations for a database, the system comprising: an index selection subsystem configured to communicate with a data repository configured to store a database, the index selection subsystem configured to execute one or more modules on a computing device, the index selection subsystem comprising: a population module configured to provide a population of chromosomes that represent possible index configurations for the database, wherein a chromosome comprises at least one gene associated with a possible index candidate for the database; wherein the population module is configured to provide the population of chromosomes by selecting genes for the chromosomes based at least in part on a participation probability for the gene, the participation probability based at least in part on a cost of SQL statements for the index candidate associated with the gene; a fitness module configured to provide a fitness for each of the chromosomes in the population; an evolution module configured to evolve the population of chromosomes based at least in part on one or more genetic operators, the evolution module further configured to (1) increase the number of chromosomes in the population in response to a first criterion or (2) increase the length of at least some of the chromosomes in the population in response to a second criterion; and an interface module configured to provide information associated with one or more of the chromosomes in the population.

5. The system of claim 4, wherein the population module is configured to represent an index candidate in a gene based at least in part on an alphanumeric identifier for the index candidate.

6. The system of claim 4, wherein the population module is configured to provide the population of chromosomes by retrieving from the data repository information related to a prior evolution of chromosomes for the database.

7. The system of claim 4, wherein the fitness module is configured to provide the fitness of a chromosome based at least in part on a cost of a plurality of structured query language (SQL) statements for an index configuration comprising the index candidates associated with the genes in the chromosome.
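A cost-based fitness of the kind claim 7 describes might look like the sketch below. The workload, the per-index saving fractions, and the choice of "baseline cost minus configured cost" as the fitness value are hypothetical; a real system would presumably obtain statement costs from the database optimizer rather than from a hard-coded table.

```python
# Hypothetical workload: each statement has a baseline cost and a map of
# index candidates to the fraction of that cost each one would save.
WORKLOAD = [
    {
        "sql": "SELECT * FROM ORDERS WHERE CUSTOMER_ID = ?",
        "cost": 100.0,
        "helped_by": {"ORDERS.CUSTOMER_ID": 0.8},
    },
    {
        "sql": "SELECT * FROM ITEMS WHERE ORDER_ID = ? AND SKU = ?",
        "cost": 60.0,
        "helped_by": {"ITEMS.ORDER_ID+ITEMS.SKU": 0.9, "ITEMS.ORDER_ID": 0.5},
    },
]


def statement_cost(statement, chromosome):
    """Cost of one statement when the chromosome's index candidates exist."""
    savings = (statement["helped_by"].get(gene, 0.0) for gene in set(chromosome))
    best_saving = max(savings, default=0.0)  # assume only the best index is used
    return statement["cost"] * (1.0 - best_saving)


def workload_cost(chromosome):
    """Total cost of all statements under the given index configuration."""
    return sum(statement_cost(statement, chromosome) for statement in WORKLOAD)


def fitness(chromosome):
    """Assumed fitness: cost saved relative to running with no candidate indexes."""
    return workload_cost([]) - workload_cost(chromosome)
```

Under this toy model a chromosome that carries the composite candidate scores higher than one carrying only the single-column `ITEMS.ORDER_ID` candidate, because the second statement benefits more from it.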

8. The system of claim 4, wherein the evolution module is further configured to terminate the evolution of the population based on one or more termination criteria, the termination criteria comprising one or more of: (1) a number of generations of the population, (2) an elapsed time, (3) a time-of-day, (4) an amount of computer processing time, (5) an amount of computer resources used by the system, and (6) a threshold level of fitness.
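As a rough illustration of how several of the termination criteria in claim 8 could be combined, the sketch below checks three of the six listed criteria (generation count, elapsed time, and a fitness threshold). The class name, field names, and the "stop when any enabled criterion is met" rule are assumptions, not details taken from the patent.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TerminationCriteria:
    """Hypothetical container; a criterion left as None is disabled."""

    max_generations: Optional[int] = None
    max_elapsed_seconds: Optional[float] = None
    fitness_threshold: Optional[float] = None

    def should_stop(self, generation, started_at, best_fitness, now=None):
        """Return True as soon as any enabled criterion is satisfied."""
        now = time.monotonic() if now is None else now
        if self.max_generations is not None and generation >= self.max_generations:
            return True
        if (self.max_elapsed_seconds is not None
                and now - started_at >= self.max_elapsed_seconds):
            return True
        if self.fitness_threshold is not None and best_fitness >= self.fitness_threshold:
            return True
        return False


criteria = TerminationCriteria(max_generations=50, fitness_threshold=0.9)
```

An evolution loop would call `should_stop` once per generation and, per claim 9, report the fittest chromosome when it returns `True`.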

9. The system of claim 8, wherein the interface module is configured to provide information associated with the chromosome having the greatest fitness in the population when the evolution is terminated.

10. The system of claim 4, wherein the first criterion comprises determining whether increasing the length of at least some of the chromosomes in a prior population led to a significant fitness improvement after a threshold number of generations of evolution.

11. The system of claim 4, wherein the second criterion comprises determining whether increasing the number of chromosomes in a prior population led to a significant fitness improvement after a threshold number of generations of evolution.
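One possible reading of the two growth criteria in claims 10 and 11 is an alternating policy: after a growth step has had its threshold number of generations, keep growing along the same axis if it paid off, and otherwise switch to the other axis. The sketch below encodes only that assumed policy. The claims do not state which outcome triggers which action, and the 1% relative-gain threshold and function names are hypothetical.

```python
def significant_improvement(fitness_before, fitness_after, min_relative_gain=0.01):
    """Assumed test: relative fitness gain of at least min_relative_gain."""
    if fitness_before <= 0:
        return fitness_after > fitness_before
    return (fitness_after - fitness_before) / fitness_before >= min_relative_gain


def choose_growth(last_growth, fitness_before, fitness_after):
    """Pick the next growth step: 'population' (more chromosomes) or 'length'.

    last_growth is the step taken previously; the two fitness values are the
    best fitness before that step and after the threshold number of generations.
    """
    improved = significant_improvement(fitness_before, fitness_after)
    if last_growth == "length":
        return "length" if improved else "population"
    if last_growth == "population":
        return "population" if improved else "length"
    raise ValueError(f"unknown growth step: {last_growth!r}")
```

For example, if lengthening chromosomes moved the best fitness from 100 to 100.2, this policy would next enlarge the population instead.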

12. The system of claim 4, wherein the database comprises a plurality of tables, each of the tables comprising one or more columns of data, and the index candidates comprise (1) a column in one of the plurality of tables in the database, (2) disabling an index from one of the plurality of tables in the database, and (3) a composite index for the database, the composite index comprising a plurality of columns from one or more of the plurality of tables in the database.

13. The system of claim 12, wherein, for an index candidate comprising a composite index for the database, the population module is configured to determine a length for the composite index based at least in part on a probability distribution for lengths of composite index candidates.

14. The system of claim 13, wherein the population module is further configured to select columns for the composite index based at least in part on a participation probability for the column.
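Claims 13 and 14 together suggest a two-stage draw for composite indexes: first a length, then the columns. A minimal sketch follows, assuming a made-up length distribution, made-up per-column participation probabilities, and sampling without replacement so that no column repeats within one index.

```python
import random

# Hypothetical distribution over composite index lengths (number of columns).
LENGTH_DISTRIBUTION = {2: 0.6, 3: 0.3, 4: 0.1}

# Hypothetical participation probability per column.
COLUMN_PROBABILITIES = {
    "ITEMS.ORDER_ID": 0.4,
    "ITEMS.SKU": 0.3,
    "ITEMS.QTY": 0.2,
    "ITEMS.PRICE": 0.1,
}


def sample_composite_index(rng):
    """Draw a length, then draw that many distinct columns by probability."""
    lengths = list(LENGTH_DISTRIBUTION)
    length = rng.choices(
        lengths, weights=[LENGTH_DISTRIBUTION[n] for n in lengths], k=1
    )[0]

    remaining = dict(COLUMN_PROBABILITIES)
    columns = []
    for _ in range(min(length, len(remaining))):
        names = list(remaining)
        pick = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        columns.append(pick)
        del remaining[pick]  # without replacement: a column appears at most once
    return columns


composite = sample_composite_index(random.Random(1))
```

The order in which columns are drawn becomes the column order of the composite index, which matters for most database engines.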

15. A method for recommending an index configuration for a collection of data, the method comprising: analyzing a plurality of transactions for a collection of data to determine a set of index candidates for the collection of data; generating a group of index configurations for the collection of data, each index configuration associated with a plurality of index candidates from the set of index candidates, the index candidates represented in the index configuration as a non-bitmapped data structure; evaluating a fitness of each index configuration in the group of index configurations, the fitness based at least in part on a change in computer resources associated with using the index configuration when executing the plurality of transactions on the collection of data; for each index configuration in the group of index configurations, changing a first index candidate of the plurality of index candidates in the index configuration, wherein changing the first index candidate comprises: replacing the first index candidate with a second index candidate selected from the set of index candidates based at least in part on a mutation probability, the second index candidate different from the first index candidate; or replacing the first index candidate with a second index candidate selected from a second index configuration from the group of index configurations, the second index configuration selected based at least in part on a recombination probability, the recombination probability based at least in part on the fitness associated with the second index configuration; modifying the group of index configurations by (1) adding new index configurations to the group of index configurations, the new index configurations comprising index candidates selected from the set of index candidates, (2) adding new index candidates from the set of index candidates to each of the index configurations present in the group of index configurations, or both (1) and (2); repeating, one or more times, evaluating the fitness, changing the first index candidate, and modifying the group of index configurations; and recommending, from the group of index configurations, the index configuration having the greatest fitness, wherein the method is performed by a computing system comprising one or more physical computing devices.
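The two ways of changing an index candidate in claim 15 (mutation from the candidate set, or recombination from a second configuration) can be sketched as below. The function names, the single-position replacement, and the fitness-proportional choice of the donor configuration are illustrative assumptions, and the sketch requires non-negative fitness values that are not all zero.

```python
import random


def mutate(config, candidates, mutation_probability, rng):
    """With the given probability, swap one candidate for a different one."""
    result = list(config)
    if rng.random() < mutation_probability:
        position = rng.randrange(len(result))
        # The replacement must differ from the candidate it replaces.
        alternatives = [c for c in candidates if c != result[position]]
        result[position] = rng.choice(alternatives)
    return result


def recombine(config, group, fitnesses, rng):
    """Copy one candidate from a donor configuration chosen by fitness."""
    # Assumed recombination probability: proportional to the donor's fitness.
    donor = rng.choices(group, weights=fitnesses, k=1)[0]
    result = list(config)
    position = rng.randrange(len(result))
    result[position] = rng.choice(donor)
    return result


candidates = ["A.X", "A.Y", "B.Z", "B.W"]  # hypothetical identifiers
group = [["A.X", "A.Y"], ["B.Z", "B.W"], ["A.X", "B.Z"]]
fitnesses = [10.0, 30.0, 20.0]

rng = random.Random(0)
mutated = mutate(group[0], candidates, mutation_probability=1.0, rng=rng)
child = recombine(group[0], group, fitnesses, rng)
```

Both functions return a new list and leave the input configuration untouched, which keeps the parent group available for comparison when fitness is re-evaluated.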

16. The method of claim 15, wherein the collection of data comprises a database.

17. The method of claim 16, wherein the transaction comprises a structured query language (SQL) transaction on the database.

18. The method of claim 15, wherein the non-bitmapped data structure comprises alphanumeric characters.

19. The method of claim 15, wherein generating the group of index configurations for the collection of data comprises: determining, for each of the index candidates in the set of index candidates, a change in the computer resources associated with using an index comprising the index candidate when executing the plurality of transactions on the collection of data; determining, for each index candidate in the set of index candidates, a participation probability based at least in part on the change in the computer resources associated with the index candidate; and selecting, for each index configuration in the group of index configurations, index candidates based at least in part on the participation probability associated with the index candidate.

Patent Metadata

Filing Date: Unknown
Publication Date: June 2, 2015
Inventors: Wai-Yip To
