9817612

High-Performance Hash Joins Using Memory with Extensive Internal Parallelism

Published: November 14, 2017
Assignee: not available in the USPTO data we have
Patent Claims
12 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A system comprising: a memory; and one or more processor cores, communicatively coupled to the memory, the one or more processor cores configured to: issue, to a dynamic random access memory with extensive internal parallelism (DRAM with EIP), a first group of two or more load requests to load data from a hash table comprising one or more hash buckets, wherein the hash table is constructed from hashed join-key values of a dimension table for a hash-join procedure, and wherein each load request in the first group corresponds to an entry in a fact table of the hash-join procedure and seeks a hash bucket matching a hashed join-key value for the corresponding entry in the fact table; issue, to the DRAM with EIP, a second group of two or more load requests to load data from the hash table; receive, from the DRAM with EIP, first response data that is responsive to the first group of load requests, wherein the first response data comprises one or more hash buckets from the hash table; and process the first response data while awaiting second response data that is responsive to the second group of load requests, wherein processing the first response data comprises: identifying matches between the join-key values corresponding to entries in the two or more load requests of the first group and the one or more hash buckets in the first response data; wherein the size of the second group of two or more load requests is selected such that a time for processing the first response data is based on the latency in receiving the second response data.

Plain English Translation

A system performs a hash join by issuing grouped load requests to a DRAM with extensive internal parallelism (DRAM with EIP). The hash table is constructed from the hashed join-key values of a dimension table. The processor cores first issue a group of two or more load requests; each request corresponds to one fact-table entry and seeks the hash bucket that matches that entry's hashed join-key value. The cores then issue a second group of two or more load requests against the same hash table. When the response to the first group arrives, containing one or more hash buckets, the cores process it while the second group is still outstanding. Processing means identifying matches between the join-key values of the first group's fact-table entries and the returned hash buckets. The size of the second group is selected so that the time spent processing the first response is based on the latency of receiving the second response; the claim does not require the two times to be equal.
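The overlap described in claim 1 can be illustrated with a small software model. This is a minimal sketch, not the patented implementation: the function names (`build_hash_table`, `issue_group`, `process_group`, `pipelined_probe`) are hypothetical, the "memory" is an ordinary Python dictionary, and the loads complete immediately rather than asynchronously. It only shows the ordering of steps: issue the next group, then process the previous group's response.

```python
from collections import defaultdict

def build_hash_table(dimension_rows, num_buckets):
    """Build hash buckets from the join keys of the dimension table."""
    buckets = defaultdict(list)
    for key, payload in dimension_rows:
        buckets[hash(key) % num_buckets].append((key, payload))
    return buckets

def issue_group(table, group, num_buckets):
    """Stand-in for issuing a group of load requests (hypothetical helper).
    On real hardware the loads would stay in flight while the core works."""
    return [table.get(hash(key) % num_buckets, []) for key, _ in group]

def process_group(group, response):
    """Identify matches between fact-table join keys and the fetched buckets."""
    matches = []
    for (fact_key, fact_payload), bucket in zip(group, response):
        for dim_key, dim_payload in bucket:
            if dim_key == fact_key:
                matches.append((fact_key, fact_payload, dim_payload))
    return matches

def pipelined_probe(table, fact_rows, num_buckets, group_size):
    """Issue group k+1 before processing the response to group k."""
    groups = [fact_rows[i:i + group_size]
              for i in range(0, len(fact_rows), group_size)]
    results = []
    in_flight = None  # (group, response) of the previously issued group
    for group in groups:
        response = issue_group(table, group, num_buckets)  # next group goes out
        if in_flight is not None:
            results.extend(process_group(*in_flight))  # work on the earlier one
        in_flight = (group, response)
    if in_flight is not None:
        results.extend(process_group(*in_flight))  # drain the last group
    return results
```

For example, with a dimension table `[(1, "a"), (2, "b"), (3, "c")]` and fact rows `[(1, "x"), (4, "y"), (3, "z"), (1, "w")]`, a group size of 2 yields two groups, and the fact row with key 4 finds no match.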

Claim 2

Original Legal Text

2. The system of claim 1, wherein issuing the first group of two or more load requests and issuing the second group of two or more load requests are performed on back-to-back processor cycles.

Plain English Translation

Building on claim 1, the system issues the first and second groups of load requests on back-to-back processor cycles, that is, with no idle cycle between the two groups. The claim does not state a benefit; a plausible reading is that the short gap keeps more requests in flight at once, so the memory's internal parallelism is better used.

Claim 3

Original Legal Text

3. The system of claim 1, wherein the one or more processor cores are further configured to: read two or more entries of the fact table; hash a join-key value of each entry of the fact table; and add the hashed join-key value of each entry of the fact table, along with associated data, to a work queue; wherein issuing the first group of two or more load requests comprises issuing load requests corresponding to two or more entries of the work queue.

Plain English Translation

Building on claim 1, the system reads two or more entries from the fact table, hashes the join-key value of each entry, and adds each hashed value, along with its associated data, to a work queue. The first group of load requests is then issued for two or more entries taken from that work queue.
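The work queue of claim 3 might look like the following. This is a sketch under stated assumptions: the names `fill_work_queue` and `next_group` are hypothetical, the queue entry layout (bucket index, join key, original row) is one possible choice of "associated data", and Python's built-in `hash` stands in for whatever hash function the real system uses.

```python
from collections import deque

def fill_work_queue(fact_rows, num_buckets, key_index=0):
    """Read fact-table rows, hash each join key, and queue the result.
    Each entry is (bucket_index, join_key, row); the layout is an assumption."""
    queue = deque()
    for row in fact_rows:
        key = row[key_index]
        queue.append((hash(key) % num_buckets, key, row))
    return queue

def next_group(queue, group_size):
    """Take up to group_size entries from the front of the queue; a group of
    load requests would then be issued for these entries."""
    group = []
    while queue and len(group) < group_size:
        group.append(queue.popleft())
    return group
```

Separating the hashing step from the issue step means a group of load requests can be formed from entries whose bucket addresses are already known.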

Claim 4

Original Legal Text

4. The system of claim 3, wherein the one or more processor cores are further configured to sort the work queue to dynamically reduce differential latencies for receiving response data that is responsive to two or more groups of load requests issued.

Plain English Translation

Building on the work queue of claim 3, the system sorts the work queue to dynamically reduce differential latencies, meaning the differences in how long response data takes to arrive across two or more issued groups of load requests. The claim does not specify a sort key; ordering by memory locality or expected access time is one possible approach, not a requirement of the claim.
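Because the claim leaves the ordering criterion open, the following is only one possible reordering, chosen for illustration. It assumes a hypothetical mapping from bucket index to memory bank (`bucket_index % num_banks`) and spreads queue entries round-robin across banks, so that consecutive groups touch a similar mix of banks instead of one group hitting a single bank repeatedly. Neither the bank mapping nor the function name `interleave_by_bank` comes from the patent.

```python
from collections import defaultdict
from itertools import zip_longest

def interleave_by_bank(work_queue, num_banks):
    """Reorder queue entries round-robin across banks.
    Assumes entry[0] is the bucket index and that the bank is
    bucket_index % num_banks (a hypothetical address mapping)."""
    per_bank = defaultdict(list)
    for entry in work_queue:
        per_bank[entry[0] % num_banks].append(entry)
    ordered = []
    # Take one entry from each bank per round until all banks are drained.
    for round_entries in zip_longest(*(per_bank[b] for b in sorted(per_bank))):
        ordered.extend(e for e in round_entries if e is not None)
    return ordered
```

With four banks, the queue `[(0, "a"), (4, "b"), (8, "c"), (1, "d"), (2, "e")]` has three entries on bank 0; after reordering, the first three entries come from three different banks.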

Claim 5

Original Legal Text

5. The system of claim 1, wherein the one or more processor cores are further configured to dynamically modify the size of the second group of two or more load requests.

Plain English Translation

Building on claim 1, the system dynamically modifies the size of the second group of load requests, so the group size can change at run time rather than staying fixed. The claim gives no rationale; since claim 1 ties the group size to the relation between processing time and response latency, adapting to changing memory access patterns and processing loads is a plausible motivation.
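The claim does not say how the size is modified. One simple possibility, shown here purely as an assumption, is a feedback rule that compares the measured time to process one response with the measured wait for the next one. The function name `adjust_group_size`, the doubling and halving steps, the tolerance, and the size limits are all hypothetical.

```python
def adjust_group_size(size, process_time, wait_latency,
                      min_size=2, max_size=64, tolerance=0.1):
    """Return a new group size from the last measured times (same units).
    If processing finished well before the next response arrived, the core
    was idle, so grow the group; if processing took well over the latency,
    shrink it. The doubling/halving policy is an illustrative assumption."""
    if process_time < wait_latency * (1 - tolerance):
        return min(size * 2, max_size)
    if process_time > wait_latency * (1 + tolerance):
        return max(size // 2, min_size)
    return size
```

The lower limit of 2 mirrors the claim language, which always refers to groups of two or more load requests.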

Claim 6

Original Legal Text

6. The system of claim 1, wherein the one or more processor cores are further configured to select the size of the second group of two or more load requests, wherein selecting the size of the second group comprises: calculating an aggregate latency of a third group of two or more load requests issued by a single thread, wherein the aggregate latency is the time between issuing the third group of two or more load requests and receiving a response; identifying the dependence of the aggregate latency on the number of requests in the third group; and determining an optimum number of load requests in the second group based at least in part on the aggregate latency and the dependence of the aggregate latency on the number of requests in the third group.

Plain English Translation

Building on claim 1, the system selects the size of the second group in three steps. First, it calculates the "aggregate latency" of a third group of two or more load requests issued by a single thread; aggregate latency is the time between issuing that group and receiving a response. Second, it identifies how the aggregate latency depends on the number of requests in the third group. Third, it determines an optimum number of load requests for the second group, based at least in part on the aggregate latency and on that dependence.
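The three steps can be sketched numerically. The model below is an assumption, not something the claim specifies: it treats aggregate latency as a straight line in the group size, latency(n) = base + slope * n, fitted by least squares to measured (group size, latency) samples, and it picks the smallest n for which processing n entries takes at least as long as that latency. The names `fit_latency` and `choose_group_size`, the per-entry processing time parameter, and the size limits are hypothetical.

```python
import math

def fit_latency(samples):
    """Least-squares line through (group_size, aggregate_latency) samples.
    Returns (base, slope) for the assumed model latency(n) = base + slope * n."""
    count = len(samples)
    mean_x = sum(x for x, _ in samples) / count
    mean_y = sum(y for _, y in samples) / count
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def choose_group_size(samples, per_entry_process_time, min_size=2, max_size=64):
    """Smallest n with n * per_entry_process_time >= base + slope * n,
    clamped to [min_size, max_size]."""
    base, slope = fit_latency(samples)
    if per_entry_process_time <= slope:
        # Latency grows at least as fast as processing time; no crossover.
        return max_size
    n = math.ceil(base / (per_entry_process_time - slope))
    return max(min_size, min(n, max_size))
```

With samples `[(2, 104.0), (4, 108.0), (6, 112.0)]` the fit gives a base of 100 and a slope of 2 time units per request. At 12.5 time units of processing per entry, the crossover is at 100 / 10.5, about 9.5, so the sketch selects a group of 10.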

Claim 7

Original Legal Text

7. A computer program product for managing a hash-join procedure, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method comprising: issuing, to a dynamic random access memory with extensive internal parallelism (DRAM with EIP), a first group of two or more load requests to load data from a hash table comprising one or more hash buckets, wherein the hash table is constructed from hashed join-key values of a dimension table for a hash-join procedure, and wherein each load request in the first group corresponds to an entry in a fact table of the hash-join procedure and seeks a hash bucket matching a hashed join-key value for the corresponding entry in the fact table; issuing, to the DRAM with EIP, a second group of two or more load requests to load data from the hash table; receiving, from the DRAM with EIP, first response data that is responsive to the first group of load requests, wherein the first response data comprises one or more hash buckets from the hash table; and processing the first response data while awaiting second response data that is responsive to the second group of load requests, wherein processing the first response data comprises: identifying matches between the join-key values corresponding to entries in the two or more load requests of the first group and the one or more hash buckets in the first response data; wherein the size of the second group of two or more load requests is selected such that a time for processing the first response data is based on the latency in receiving the second response data.

Plain English Translation

A computer program product, stored on a computer-readable storage medium, causes a processor to manage a hash join by issuing grouped load requests to a DRAM with extensive internal parallelism (DRAM with EIP). The hash table is constructed from the hashed join-key values of a dimension table. The program first issues a group of two or more load requests; each request corresponds to one fact-table entry and seeks the hash bucket that matches that entry's hashed join-key value. It then issues a second group of two or more load requests against the same hash table. When the response to the first group arrives, containing one or more hash buckets, the program processes it while the second group is still outstanding. Processing means identifying matches between the join-key values of the first group's fact-table entries and the returned hash buckets. The size of the second group is selected so that the time spent processing the first response is based on the latency of receiving the second response; the claim does not require the two times to be equal.

Claim 8

Original Legal Text

8. The computer program product of claim 7, wherein issuing the first group of two or more load requests and issuing the second group of two or more load requests are performed on back-to-back processor cycles.

Plain English Translation

Building on claim 7, the program issues the first and second groups of load requests on back-to-back processor cycles, that is, with no idle cycle between the two groups. The claim does not state a benefit; a plausible reading is that the short gap keeps more requests in flight at once, so the memory's internal parallelism is better used.

Claim 9

Original Legal Text

9. The computer program product of claim 7, the method further comprising: reading two or more entries of the fact table; hashing a join-key value of each entry of the fact table; and adding the hashed join-key value of each entry of the fact table, along with associated data, to a work queue; wherein issuing the first group of two or more load requests comprises issuing load requests corresponding to two or more entries of the work queue.

Plain English Translation

Building on claim 7, the program reads two or more entries from the fact table, hashes the join-key value of each entry, and adds each hashed value, along with its associated data, to a work queue. The first group of load requests is then issued for two or more entries taken from that work queue.

Claim 10

Original Legal Text

10. The computer program product of claim 9, the method further comprising sorting the work queue to dynamically reduce differential latencies for receiving response data that is responsive to two or more groups of load requests issued.

Plain English Translation

Building on the work queue of claim 9, the program sorts the work queue to dynamically reduce differential latencies, meaning the differences in how long response data takes to arrive across two or more issued groups of load requests. The claim does not specify a sort key; ordering by memory locality or expected access time is one possible approach, not a requirement of the claim.

Claim 11

Original Legal Text

11. The computer program product of claim 7, the method further comprising dynamically modifying the size of the second group of two or more load requests.

Plain English Translation

Building on claim 7, the program dynamically modifies the size of the second group of load requests, so the group size can change at run time rather than staying fixed. The claim gives no rationale; since claim 7 ties the group size to the relation between processing time and response latency, adapting to changing memory access patterns and processing loads is a plausible motivation.

Claim 12

Original Legal Text

12. The computer program product of claim 7, the method further comprising selecting the size of the second group of two or more load requests, wherein the selecting comprises: calculating an aggregate latency of a third group of two or more load requests issued by a single thread, wherein the aggregate latency is the time between issuing the third group of two or more load requests and receiving a response; identifying the dependence of the aggregate latency on the number of requests in the third group; and determining an optimum number of load requests in the second group based at least in part on the aggregate latency and the dependence of the aggregate latency on the number of requests in the third group.

Plain English Translation

Building on claim 7, the program selects the size of the second group in three steps. First, it calculates the "aggregate latency" of a third group of two or more load requests issued by a single thread; aggregate latency is the time between issuing that group and receiving a response. Second, it identifies how the aggregate latency depends on the number of requests in the third group. Third, it determines an optimum number of load requests for the second group, based at least in part on the aggregate latency and on that dependence.

Patent Metadata

Filing Date

Unknown

Publication Date

November 14, 2017

Inventors

Jeffrey H. Derby
Charles Johnson
Robert K. Montoye
Dheeraj Sreedhar
Steven P. VanderWiel

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “HIGH-PERFORMANCE HASH JOINS USING MEMORY WITH EXTENSIVE INTERNAL PARALLELISM” (9817612). https://patentable.app/patents/9817612