US-11468062

Order-independent multi-record hash generation and data filtering

Published: October 11, 2022

Assignee: not available

Inventors: not available

Technical Abstract

A process is provided for independently hashing and filtering a data set, such as during preprocessing. For the data set, one or more records, separately having one or more fields, may be identified. A record hash value set, containing one or more record hash values for the respective one or more records, may be generated. Generating a given record hash value may be accomplished as follows. For a given record, a hash value set may be generated, having one or more field hash values for the respective one or more fields of the given record. The record hash value for the given record may be generated based on the hash value set. A total hash value for the data set may be generated based on the record hash value set. The records of the data set may be filtered based on classification of the query that generated the records.
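The field, record, and data-set hashing layers described above can be illustrated with a minimal sketch. The abstract does not say which hash function is used or how order independence is achieved, so the choice of SHA-256, the use of `repr` to serialize field values, and the sorting of record hashes before combining them are all assumptions made for illustration. The function names `hash_field`, `hash_record`, and `hash_data_set` are hypothetical.

```python
import hashlib


def hash_field(value):
    """Hash one field value (assumption: SHA-256 over repr of the value)."""
    # repr keeps 1 and "1" distinct, which a plain str() would not.
    return hashlib.sha256(repr(value).encode("utf-8")).digest()


def hash_record(record):
    """Combine the field hashes of one record into a record hash."""
    h = hashlib.sha256()
    for field in record:  # field order inside a record is significant here
        h.update(hash_field(field))
    return h.digest()


def hash_data_set(records):
    """Combine record hashes into a total hash for the data set.

    Assumption: sorting the record hashes first makes the total
    independent of the order in which records arrive.
    """
    h = hashlib.sha256()
    for record_hash in sorted(hash_record(r) for r in records):
        h.update(record_hash)
    return h.hexdigest()


rows = [(1, "alice"), (2, "bob"), (3, "carol")]
shuffled = [rows[2], rows[0], rows[1]]
same_total = hash_data_set(rows) == hash_data_set(shuffled)
```

Because duplicates survive the sort, two data sets that differ only in how many times a record repeats still produce different totals under this sketch.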

Patent Claims

14 claims

Legal claims defining the scope of protection, as filed with the USPTO.

2. The method of claim 1, wherein the first grouping criterion comprises rows of a table and the plurality of second grouping criteria correspond to table columns.

3. The method of claim 1, wherein the first grouping criterion comprises columns of a table and the plurality of second grouping criteria correspond to table rows.

4. The method of claim 1, wherein at least a first portion of the plurality of second grouping criteria correspond to deterministic data and at least a second portion of the plurality of second grouping criteria correspond to non-deterministic data and the calculating a first group hash value uses deterministic data and excludes non-deterministic data.

5. The method of claim 1, wherein the first data set comprises results from executing a set of queries at a first database system and the second data set comprises results from executing the set of queries at a second database system.

8. The computing system of claim 7, wherein the first grouping criterion comprises rows of a table and the plurality of second grouping criteria correspond to table columns.

9. The computing system of claim 7, wherein the first grouping criterion comprises columns of a table and the plurality of second grouping criteria correspond to table rows.

10. The computing system of claim 7, wherein at least a first portion of the plurality of second grouping criteria correspond to deterministic data and at least a second portion of the plurality of second grouping criteria correspond to non-deterministic data and the calculating a first group hash value uses deterministic data and excludes non-deterministic data.

11. The computing system of claim 7, wherein the first data set comprises results from executing a set of queries at a first database system and the second data set comprises results from executing the set of queries at a second database system.

13. The system of claim 7, wherein a first portion of the data elements of the plurality of first data elements are in irregular fields, wherein the first portion of the data elements of the plurality of first data elements in the irregular fields are hashed to generate irregular field hash values, and wherein the generated irregular field hash values are further combined and hashed to generate a combined irregular field hash value.

14. The system of claim 13, wherein the first portion of the data elements of the plurality of first data elements in the irregular fields comprise deterministic data elements and non-determinative data elements, and wherein the deterministic data elements of the first portion of the data elements of the plurality of first data elements in the irregular fields are hashed to generate irregular field hash values, and the non-determinative data elements of the first portion of the data elements of the plurality of first data elements in the irregular fields are excluded from being hashed to generate irregular field hash values.

15. The system of claim 7, wherein a first portion of the data elements of the plurality of first data elements are in a first plurality of fields, and a second portion of the data elements of the plurality of first data elements are in a second plurality of fields, and wherein the first portion of the data elements of the plurality of first data elements in the first plurality of fields are separately hashed to generate individual first field hash values, and the second portion of the data elements of the plurality of first data elements in the second plurality of fields are combined and hashed to generate a combined second field hash value, and wherein the first field hash values and the combined second field hash value are further combined and hashed to generate the first hash value.

17. The one or more non-transitory computer-readable storage media of claim 16, wherein the first grouping criterion comprises rows of a table and the plurality of second grouping criteria correspond to table columns.

18. The one or more non-transitory computer-readable storage media of claim 16, wherein the grouping criterion comprises columns of a table and the plurality of second grouping criteria correspond to table rows.

19. The one or more non-transitory computer-readable storage media of claim 16, wherein the first data set comprises results from executing a set of queries at a first database system and the second data set comprises results from executing the set of queries at a second database system.
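Several of the dependent claims above can be pictured together: hashing only deterministic columns, hashing some columns individually while combining others into one hash, and comparing results of the same queries run on two database systems. The following is a rough sketch under stated assumptions, not the claimed implementation. SHA-256, the `\x1f` separator, the column names, the sample rows, and the helpers `row_hash` and `result_set_hash` are all hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for whatever hash the system uses.
    return hashlib.sha256(data).digest()


def row_hash(row, separate, combined=()):
    """Hash one row given as a dict of column name to value.

    separate: columns hashed one at a time.
    combined: columns joined and hashed once as a group.
    Columns named in neither argument (for example non-deterministic
    ones) are simply excluded from the hash.
    """
    parts = [_h(repr(row[c]).encode("utf-8")) for c in separate]
    if combined:
        joined = "\x1f".join(repr(row[c]) for c in combined)
        parts.append(_h(joined.encode("utf-8")))
    return _h(b"".join(parts))


def result_set_hash(rows, separate, combined=()):
    # Sorting row hashes (an assumption) removes dependence on row order.
    hashes = sorted(row_hash(r, separate, combined) for r in rows)
    return _h(b"".join(hashes)).hex()


# Hypothetical results of the same query on two database systems.
first_system = [
    {"id": 1, "name": "a", "note": "x", "fetched_at": "10:00:01"},
    {"id": 2, "name": "b", "note": "y", "fetched_at": "10:00:02"},
]
second_system = [
    {"id": 2, "name": "b", "note": "y", "fetched_at": "11:30:07"},
    {"id": 1, "name": "a", "note": "x", "fetched_at": "11:30:09"},
]

# fetched_at is treated as non-deterministic and left out of the hash.
results_match = (
    result_set_hash(first_system, ["id"], ["name", "note"])
    == result_set_hash(second_system, ["id"], ["name", "note"])
)
```

With the timestamp column excluded, the two result sets hash to the same value even though the rows arrive in a different order; adding `fetched_at` to either column list would make the comparison fail.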

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F

Patent Metadata

Filing Date

February 25, 2020

Publication Date

October 11, 2022
