US-11468062

Order-independent multi-record hash generation and data filtering

Published: October 11, 2022
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

A process is provided for independently hashing and filtering a data set, such as during preprocessing. For the data set, one or more records, separately having one or more fields, may be identified. A record hash value set, containing one or more record hash values for the respective one or more records, may be generated. Generating a given record hash value may be accomplished as follows. For a given record, a hash value set may be generated, having one or more field hash values for the respective one or more fields of the given record. The record hash value for the given record may be generated based on the hash value set. A total hash value for the data set may be generated based on the record hash value set. The records of the data set may be filtered based on classification of the query that generated the records.
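The layering described in the abstract (field hash values, then a record hash value per record, then a total hash value for the data set) can be sketched as follows. This is a minimal illustration, not the patented implementation: the choice of SHA-256, the use of modular addition as the order-independent combining step, and the function names `field_hash`, `record_hash`, and `total_hash` are all assumptions made for the example.

```python
import hashlib

MOD = 2 ** 256  # keep combined values within the SHA-256 range (assumption)


def _h(text: str) -> int:
    """Hash a string to an integer with SHA-256."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def field_hash(name: str, value) -> int:
    """Hash one field; the field name is included so values cannot swap columns unnoticed."""
    return _h(repr((name, value)))


def record_hash(record: dict) -> int:
    """Add the field hashes (order-independent), then hash the sum again."""
    combined = sum(field_hash(n, v) for n, v in record.items()) % MOD
    return _h(str(combined))


def total_hash(records) -> int:
    """Add the record hashes (order-independent), then hash the sum again."""
    combined = sum(record_hash(r) for r in records) % MOD
    return _h(str(combined))


rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Lima"},
]
# Same records, with both the record order and the field order changed.
shuffled = [
    {"city": "Lima", "id": 2},
    {"id": 1, "city": "Oslo"},
]
print(total_hash(rows) == total_hash(shuffled))  # True
```

Addition is used rather than XOR in this sketch so that two identical records do not cancel each other out. Re-hashing each record's sum before combining records keeps a value that moves from one record to another from going undetected.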

Patent Claims
14 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 2

Original Legal Text

2. The method of claim 1, wherein the first grouping criterion comprises rows of a table and the plurality of second grouping criteria correspond to table columns.

Claim 3

Original Legal Text

3. The method of claim 1, wherein the first grouping criterion comprises columns of a table and the plurality of second grouping criteria correspond to table rows.

Plain English Translation

This dependent claim narrows the method of claim 1 by fixing which table axis plays which role: the first grouping criterion is the table's columns, and the second grouping criteria correspond to its rows. Read against the abstract, in which field hash values are combined into a record hash value and record hash values into a total hash value for the data set, this appears to mean that each column is treated as a group and the row entries within that column are the elements hashed to produce that group's hash value. It is the transpose of claim 2, which groups by rows first and treats the columns as the second-level criteria.
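The difference between grouping by rows first (claim 2) and by columns first (claim 3) can be illustrated with a small table. This is a sketch under stated assumptions: the helper `group_hashes`, the `first` parameter, SHA-256, and modular addition are all hypothetical choices, and for simplicity the cells inside a group are treated as an unordered collection.

```python
import hashlib

MOD = 2 ** 256


def _h(text: str) -> int:
    """Hash a string to an integer with SHA-256."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def group_hashes(table, first="rows"):
    """Return one hash per group along the first grouping criterion."""
    # Rows-first uses the table as given; columns-first transposes it.
    groups = table if first == "rows" else list(zip(*table))
    hashes = []
    for group in groups:
        # Cells are combined by addition, so their order within the group is ignored.
        combined = sum(_h(repr(cell)) for cell in group) % MOD
        hashes.append(_h(str(combined)))
    return hashes


table = [
    [1, "Oslo"],
    [2, "Lima"],
    [3, "Pune"],
]
by_rows = group_hashes(table, first="rows")        # one hash per row
by_columns = group_hashes(table, first="columns")  # one hash per column
print(len(by_rows), len(by_columns))  # 3 2
```

With columns as the first criterion, reordering the rows leaves every column hash unchanged, because each column still contains the same set of cells.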

Claim 4

Original Legal Text

4. The method of claim 1, wherein at least a first portion of the plurality of second grouping criteria correspond to deterministic data and at least a second portion of the plurality of second grouping criteria correspond to non-deterministic data and the calculating a first group hash value uses deterministic data and excludes non-deterministic data.

Claim 5

Original Legal Text

5. The method of claim 1, wherein the first data set comprises results from a executing a set of queries at a first database system and the second data set comprises results from executing the set of queries at a second database system.
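One way to picture this claim is to run the same set of queries against two databases and compare a single order-independent hash of the results. The sketch below is illustrative only: two in-memory SQLite databases stand in for the first and second database systems, and the table name `orders`, the helper names, and the additive combining step are assumptions.

```python
import hashlib
import sqlite3

MOD = 2 ** 256


def row_hash(row) -> int:
    """Hash one result row (tagged with its query index) to an integer."""
    return int.from_bytes(hashlib.sha256(repr(row).encode("utf-8")).digest(), "big")


def result_hash(rows) -> int:
    """Combine row hashes by addition so the row order does not matter."""
    return sum(row_hash(r) for r in rows) % MOD


def make_db(rows):
    """Create an in-memory database holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def run_all(conn, queries):
    """Run every query and tag each row with the index of its query."""
    tagged = []
    for i, query in enumerate(queries):
        tagged.extend((i, row) for row in conn.execute(query).fetchall())
    return tagged


queries = ["SELECT id, item FROM orders", "SELECT COUNT(*) FROM orders"]
data = [(1, "pen"), (2, "ink"), (3, "pad")]

first_db = make_db(data)
second_db = make_db(list(reversed(data)))  # same rows, different physical order

match = result_hash(run_all(first_db, queries)) == result_hash(run_all(second_db, queries))
print(match)  # True
```

Because neither query has an `ORDER BY`, the two systems may return rows in different orders; an order-independent hash lets the results be compared without sorting them first.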

Claim 8

Original Legal Text

8. The computing system of claim 7, wherein the first grouping criterion comprises rows of a table and the plurality of second grouping criteria correspond to table columns.

Plain English Translation

This dependent claim narrows the computing system of claim 7: the first grouping criterion is the rows of a table, and the second grouping criteria correspond to the table's columns. In the abstract's terms, each row is a record and each column is a field, so a field hash value is generated for each column entry in a row and those values are combined into one hash value for that row. The row hash values can then be combined into a total hash value for the data set. This is the system counterpart of method claim 2.

Claim 9

Original Legal Text

9. The computing system of claim 7, wherein the first grouping criterion comprises columns of a table and the plurality of second grouping criteria correspond to table rows.

Plain English Translation

This dependent claim narrows the computing system of claim 7 in the opposite direction from claim 8: the first grouping criterion is the columns of a table, and the second grouping criteria correspond to the table's rows. Read against the abstract, this appears to mean that each column is treated as a group and the row entries within it are hashed and combined into one hash value for that column. The column hash values can then be combined into a total hash value for the data set. This is the system counterpart of method claim 3.

Claim 10

Original Legal Text

10. The computing system of claim 7, wherein at least a first portion of the plurality of second grouping criteria correspond to deterministic data and at least a second portion of the plurality of second grouping criteria correspond to non-deterministic data and the calculating a first group hash value uses deterministic data and excludes non-deterministic data.

Plain English Translation

This dependent claim narrows the computing system of claim 7 to data that mixes deterministic and non-deterministic values. Some of the second grouping criteria correspond to deterministic data, meaning values expected to be the same each time the data is produced, such as predefined categories or static identifiers. Others correspond to non-deterministic data, meaning values that can legitimately differ between runs, such as timestamps or randomly generated identifiers. When the system calculates the first group hash value, it uses only the deterministic data and excludes the non-deterministic data. As a result, two groups that differ only in their non-deterministic values produce the same hash value, so hash-based comparisons are not thrown off by values that were never expected to match. This is the system counterpart of method claim 4.
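Excluding non-deterministic data from the group hash can be sketched as below. The column names `fetched_at` and `session_id`, the helper `group_hash`, and the use of SHA-256 with modular addition are hypothetical choices for illustration; the claim does not say how a system decides which data is non-deterministic.

```python
import hashlib

MOD = 2 ** 256

# Hypothetical columns whose values are expected to differ between runs.
NON_DETERMINISTIC = frozenset({"fetched_at", "session_id"})


def _h(text: str) -> int:
    """Hash a string to an integer with SHA-256."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def group_hash(record: dict, skip=NON_DETERMINISTIC) -> int:
    """Hash only the deterministic fields of a record."""
    kept = ((name, value) for name, value in record.items() if name not in skip)
    return sum(_h(repr(pair)) for pair in kept) % MOD


run_a = {"id": 7, "total": 19.5, "fetched_at": "2020-02-25T10:00:00"}
run_b = {"id": 7, "total": 19.5, "fetched_at": "2020-02-25T10:00:03"}
print(group_hash(run_a) == group_hash(run_b))  # True
```

The two records differ only in their timestamp, so they hash to the same value. A change to `id` or `total` would still change the hash.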

Claim 11

Original Legal Text

11. The computing system of claim 7, wherein the first data set comprises results from a executing a set of queries at a first database system and the second data set comprises results from executing the set of queries at a second database system.

Claim 13

Original Legal Text

13. The system of claim 7, wherein a first portion of the data elements of the plurality of first data elements are in irregular fields, wherein the first portion of the data elements of the plurality of first data elements in the irregular fields are hashed to generate irregular field hash values, and wherein the generated irregular field hash values are further combined and hashed to generate a combined irregular field hash value.

Plain English Translation

This dependent claim narrows the system of claim 7 to the case where some of the first data elements are in irregular fields, taken here to mean fields that lack consistent formatting or structure. The data elements in those fields are hashed to generate irregular field hash values. Those hash values are then combined and hashed again, producing a single combined irregular field hash value. The result is one fixed-size value that stands in for all of the irregular fields, so irregularly structured data can take part in the same hash-based comparison as the rest of the record.
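The two-step treatment of irregular fields (hash each one, then combine and hash again) might look like the following. Which fields count as irregular is supplied by the caller here, since the claim text does not define it; the field names, the helper `combined_irregular_hash`, and the sort-then-concatenate combining step are assumptions.

```python
import hashlib


def sha(text: str) -> str:
    """Return the SHA-256 hex digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def combined_irregular_hash(record: dict, irregular: set) -> str:
    """Hash each irregular field, then combine those hashes and hash again."""
    # Step 1: one irregular field hash value per irregular field present.
    field_hashes = [
        sha(repr((name, record[name]))) for name in irregular if name in record
    ]
    # Step 2: sort so the result does not depend on iteration order, then hash.
    return sha("".join(sorted(field_hashes)))


record = {"id": 1, "notes": "call back", "extras": {"vip": True}}
irregular_fields = {"notes", "extras"}  # hypothetical field names

digest = combined_irregular_hash(record, irregular_fields)
print(len(digest))  # 64
```

Only the fields named as irregular feed into this digest, so changing `id` leaves it unchanged while changing `notes` does not.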

Claim 14

Original Legal Text

14. The system of claim 13, wherein the first portion of the data elements of the plurality of first data elements in the irregular fields comprise deterministic data elements and non-determinative data elements, and wherein the deterministic data elements of the first portion of the data elements of the plurality of first data elements in the irregular fields are hashed to generate irregular field hash values, and the non-determinative data elements of the first portion of the data elements of the plurality of first data elements in the irregular fields are excluded from being hashed to generate irregular field hash values.

Claim 15

Original Legal Text

15. The system of claim 7, wherein a first portion of the data elements of the plurality of first data elements are in a first plurality of fields, and a second portion of the data elements of the plurality of first data elements are in a second plurality of fields, and wherein the first portion of the data elements of the plurality of first data elements in the first plurality of fields are separately hashed to generate individual first field hash values, and the second portion of the data elements of the plurality of first data elements in the second plurality of fields are combined and hashed to generate a combined second filed hash value, and wherein the first field hash values and the combined second field hash value are further combined and hashed to generate the first hash value.
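The structure in this claim (some fields hashed individually, other fields combined and hashed once, then everything combined and hashed into the first hash value) can be sketched as follows. The field names, the helper `first_hash_value`, and the specific combining steps are hypothetical; the claim does not prescribe them.

```python
import hashlib


def sha(text: str) -> str:
    """Return the SHA-256 hex digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def first_hash_value(record: dict, separate_fields, combined_fields) -> str:
    """Mix per-field hashes with one combined-field hash, then hash the result."""
    # First plurality of fields: each data element is hashed separately.
    individual = [sha(repr((f, record[f]))) for f in separate_fields]
    # Second plurality of fields: data elements are combined, then hashed once.
    combined = sha(repr([(f, record[f]) for f in sorted(combined_fields)]))
    # All hash values are combined (sorted for order independence) and hashed.
    return sha("".join(sorted(individual + [combined])))


record = {"id": 1, "name": "Ada", "street": "1 Main St", "city": "Oslo"}
value = first_hash_value(record, ["id", "name"], ["street", "city"])
same = first_hash_value(record, ["name", "id"], ["city", "street"])
print(value == same)  # True
```

Hashing the second group of fields as a unit produces one intermediate hash instead of several, while the fields in the first group each keep their own hash value.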

Claim 17

Original Legal Text

17. The one or more non-transitory computer-readable storage media of claim 16, wherein the first grouping criterion comprises rows of a table and the plurality of second grouping criteria correspond to table columns.

Claim 18

Original Legal Text

18. The one or more non-transitory computer-readable storage media of claim 16, wherein the grouping criterion comprises columns of a table and the plurality of second grouping criteria correspond to table rows.

Plain English Translation

This dependent claim narrows the computer-readable storage media of claim 16: the grouping criterion is the columns of a table, and the second grouping criteria correspond to the table's rows. Read against the abstract, in which field hash values are combined into a record hash value and record hash values into a total hash value for the data set, this appears to mean that each column is treated as a group and the row entries within it are hashed and combined into one hash value for that column. It is the transpose of claim 17, which groups by rows first, and the storage-media counterpart of method claim 3 and system claim 9.

Claim 19

Original Legal Text

19. The one or more non-transitory computer-readable storage media of claim 16, wherein the first data set comprises results from executing a set of queries at a first database system and the second data set comprises results from executing the set of queries at a second database system.

Patent Metadata

Filing Date

February 25, 2020

Publication Date

October 11, 2022

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Order-independent multi-record hash generation and data filtering” (US-11468062). https://patentable.app/patents/US-11468062

© 2026 Nomic Interactive Technology LLC. Machine-readable context available at /api/llm-context/US-11468062. See llms.txt for full attribution policy.