US-10740489

System and method for prediction preserving data obfuscation

Published: August 11, 2020
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

The invention relates to obfuscating data while maintaining local predictive relationships. An embodiment of the present invention is directed to cryptographically obfuscating a data set in a manner that hides personally identifiable information (PII) while allowing third parties to train classes of machine learning algorithms effectively. According to an embodiment of the present invention, the obfuscation acts as a symmetric encryption so that the original obfuscating party may relate the predictions on the obfuscated data to the original PII. The various features of the present invention enable third party prediction services to safely interact with PII.
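The "symmetric" property described in the abstract can be illustrated with a small sketch. It is not the patented method: it only shows how a party holding a secret key could reorder records before release and later map a third party's predictions back to the original record positions. The function names, the use of an integer `key`, and the choice of a row permutation are all assumptions made for illustration. Python's `random` module is not cryptographically secure, so a real system would need a keyed cryptographic generator instead.

```python
import random


def keyed_permutation(n, key):
    """Return a permutation of range(n) fully determined by the secret key.

    Assumption: random.Random stands in for a keyed cryptographic generator.
    """
    perm = list(range(n))
    random.Random(key).shuffle(perm)
    return perm


def obfuscate_rows(rows, key):
    """Place original row i at position perm[i] of the released list."""
    perm = keyed_permutation(len(rows), key)
    released = [None] * len(rows)
    for i, row in enumerate(rows):
        released[perm[i]] = row
    return released


def relate_predictions(predictions, key):
    """Reorder predictions so that entry i refers to original row i.

    Only a party that knows the key can rebuild perm and undo the shuffle.
    """
    perm = keyed_permutation(len(predictions), key)
    return [predictions[perm[i]] for i in range(len(predictions))]
```

In use, the data owner would call `obfuscate_rows(rows, key)`, send the result to the prediction service, and pass the returned predictions through `relate_predictions(predictions, key)` with the same key.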

Patent Claims
18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A system for data obfuscation comprising: a memory component that stores personally identifiable information; a communication interface; and a computer processor, coupled to the memory component and the communication interface, configured to perform the steps of: retrieving a dataset of the personally identifiable information where the personally identifiable information is to be obfuscated; identifying a set of security parameters for the dataset, wherein the set of security parameters comprises hyper parameters and an epsilon parameter that represents a probability of a shuffle corruption for the dataset; identifying a random covering for the dataset; applying a random permutation to the dataset; and generating obfuscated data representing the dataset.

2. The system of claim 1, wherein the dataset comprises N points in a D dimensional feature space.

3. The system of claim 1, wherein the random covering is identified using D dimensional hyper rectangles of convex hull of N points.
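One way to picture a random covering by D dimensional hyper rectangles is sketched below. The claim does not say how the rectangles are chosen, so everything here is an assumption: the sketch takes the axis-aligned bounding box of the N points (which contains their convex hull), draws random cut points along each axis, and uses the resulting grid cells as the covering. The names `random_covering`, `cuts_per_axis`, and `seed` are hypothetical.

```python
import itertools
import random


def bounding_box(points):
    """Return per-axis minima and maxima of a list of D dimensional points."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    return lows, highs


def random_covering(points, cuts_per_axis, seed):
    """Cover the bounding box with a grid of randomly sized hyper rectangles.

    Each rectangle is a pair (low_corner, high_corner) of D-tuples.
    """
    rng = random.Random(seed)
    lows, highs = bounding_box(points)
    intervals_per_axis = []
    for lo, hi in zip(lows, highs):
        # Random interior cut points split [lo, hi] into adjacent intervals.
        cuts = sorted(rng.uniform(lo, hi) for _ in range(cuts_per_axis))
        edges = [lo] + cuts + [hi]
        intervals_per_axis.append(list(zip(edges[:-1], edges[1:])))
    # The Cartesian product of per-axis intervals gives the hyper rectangles.
    return [
        (tuple(iv[0] for iv in cell), tuple(iv[1] for iv in cell))
        for cell in itertools.product(*intervals_per_axis)
    ]


def contains(rect, point):
    """Check whether a point lies inside a closed hyper rectangle."""
    low, high = rect
    return all(l <= x <= h for l, x, h in zip(low, point, high))
```

With `cuts_per_axis = 2` in two dimensions this produces a 3 by 3 grid of nine rectangles, and every input point falls inside at least one of them.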

4. The system of claim 3, wherein the computer processor is further configured to perform the step of: applying a geometric mapping between K hyper rectangles that rescales and shifts a point from a corresponding coordinate in a first region to a second region.
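A mapping that "rescales and shifts" a point from one hyper rectangle to another can be written as a per-coordinate affine map. The sketch below is one plausible reading of that step, not the claimed implementation. It assumes each rectangle is given as a pair `(low_corner, high_corner)` of D-tuples, and the function name is hypothetical.

```python
def map_between_rectangles(point, src, dst):
    """Map a point from rectangle src to the corresponding position in dst.

    For each axis, the offset from src's low corner is rescaled by the ratio
    of side lengths and then shifted to start at dst's low corner.
    """
    (src_lo, src_hi), (dst_lo, dst_hi) = src, dst
    mapped = []
    for x, a_lo, a_hi, b_lo, b_hi in zip(point, src_lo, src_hi, dst_lo, dst_hi):
        if a_hi == a_lo:
            raise ValueError("source rectangle has zero width along an axis")
        scale = (b_hi - b_lo) / (a_hi - a_lo)
        mapped.append(b_lo + (x - a_lo) * scale)
    return tuple(mapped)
```

Because the map is affine on each axis, swapping `src` and `dst` gives its inverse, which is what would let the obfuscating party translate a point in the second region back to the first.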

5. The system of claim 1, wherein the computer processor is further configured to perform the step of: applying a transformation to the data set.

6. The system of claim 1, wherein the obfuscated data is used to train a machine learning algorithm.

7. A system for data obfuscation comprising: a memory component that stores personally identifiable information; a communication interface; and a computer processor, coupled to the memory component and the communication interface, configured to perform the steps of: retrieving a dataset of the personally identifiable information where the personally identifiable information is to be obfuscated; identifying a security parameter for the dataset, wherein the security parameter comprises hyper parameters and an epsilon parameter that represents a probability of a shuffle corruption for the dataset; dividing the dataset into a plurality of bins based on the security parameter; shuffling the bins based on a random permutation; and composing obfuscated data, wherein the obfuscated data is used to train a machine learning algorithm.
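The bin, shuffle, and compose steps of this claim can be sketched for a single numeric feature. This is a simplified illustration under several assumptions: equal-width, non-overlapping bins; a hypothetical `num_bins` hyper parameter; and a seeded `random.Random` in place of a cryptographic source. The epsilon parameter is not modeled, because the claim text does not specify how the shuffle corruption probability is applied.

```python
import random


def bin_shuffle(values, labels, num_bins, seed):
    """Shuffle equal-width bins of a 1-D feature, carrying labels along.

    Each value keeps its offset inside its bin, so neighbours within a bin
    stay neighbours, while the bins themselves are moved to new positions.
    Returns the obfuscated (value, label) pairs and the bin permutation.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    if width == 0:
        raise ValueError("all values are identical; nothing to bin")
    perm = list(range(num_bins))
    random.Random(seed).shuffle(perm)
    obfuscated = []
    for x, y in zip(values, labels):
        # Clamp so that the maximum value falls into the last bin.
        b = min(int((x - lo) / width), num_bins - 1)
        offset = x - (lo + b * width)
        obfuscated.append((lo + perm[b] * width + offset, y))
    return obfuscated, perm
```

The returned permutation plays the role of the secret: a party that keeps it can tell which obfuscated bin corresponds to which original bin, while a third party only sees the composed output.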

8. The system of claim 7, wherein the plurality of bins are overlapping bins.

9. The system of claim 7, wherein the plurality of bins are non-overlapping bins.

10. The system of claim 7, wherein the computer processor is further configured to perform the step of: adjusting each bin based on a ratio of data associated with each bin.

11. The system of claim 7, wherein within each bin, observed data is also randomly re-labeled.

12. The system of claim 7, wherein one or more dependent variables are also shuffled in a random manner.

13. A method for data obfuscation comprising the steps of: retrieving, via a communication interface, a dataset of the personally identifiable information where the personally identifiable information is to be obfuscated; identifying, via a computer processor, a set of security parameters for the dataset, wherein the set of security parameters comprises hyper parameters and an epsilon parameter that represents a probability of a shuffle corruption for the dataset; identifying, via the computer processor, a random covering for the dataset; applying, via the computer processor, a random permutation to the dataset; and generating, via the computer processor, obfuscated data representing the dataset.

14. The method of claim 13, wherein the dataset comprises N points in a D dimensional feature space.

15. The method of claim 13, wherein the random covering is identified using D dimensional hyper rectangles of convex hull of N points.

16. The method of claim 15, further comprising the step of: applying a geometric mapping between K hyper rectangles that rescales and shifts a point from a corresponding coordinate in a first region to a second region.

17. The method of claim 13, further comprising the step of: applying a transformation to the data set.

18. The method of claim 13, wherein the obfuscated data is used to train a machine learning algorithm.

Patent Metadata

Filing Date: May 17, 2018
Publication Date: August 11, 2020

Cite as: Patentable. “System and method for prediction preserving data obfuscation” (US-10740489). https://patentable.app/patents/US-10740489