A mechanism is provided that aggregates data in a way that permits data to be deleted efficiently, while minimizing the overhead necessary to support bulk deletion of data. A request is received for automatic deletion of segments in a container and a waterline is determined for the container. A determination is made if at least one segment in the container falls below the waterline. Finally, in response to one segment falling below the waterline, the segment from the container is deleted. Each object has an associated creation time, initial retention value, and retention decay curve (also known as a retention curve). At any point, based on these values and the current time, the object's current retention value may be computed. The container system continually maintains a time-varying waterline: at any point, objects with a retention value below the waterline may be deleted.
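The retention computation described above can be illustrated with a minimal sketch. It assumes an exponential half-life curve as one possible decreasing retention decay curve, and it takes the waterline as a plain number supplied by the caller; the names `StoredObject`, `exponential_decay`, and `deletable` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Callable, List

# A decay curve maps (initial retention value, age in seconds) to the
# current retention value. The patent only requires a decreasing function;
# the exponential form below is an illustrative assumption.
DecayCurve = Callable[[float, float], float]


def exponential_decay(half_life: float) -> DecayCurve:
    """Return a curve that halves the retention value every `half_life` seconds."""
    def curve(initial: float, age: float) -> float:
        return initial * 0.5 ** (age / half_life)
    return curve


@dataclass
class StoredObject:
    name: str
    created_at: float          # creation time, seconds
    initial_retention: float   # initial retention value
    decay: DecayCurve          # retention decay curve

    def retention_at(self, now: float) -> float:
        """Current retention value from creation time, initial value, and curve."""
        age = max(0.0, now - self.created_at)
        return self.decay(self.initial_retention, age)


def deletable(objects: List[StoredObject], waterline: float, now: float) -> List[str]:
    """Names of objects whose current retention value is below the waterline."""
    return [o.name for o in objects if o.retention_at(now) < waterline]


curve = exponential_decay(half_life=100.0)
objects = [
    StoredObject("a", created_at=0.0, initial_retention=8.0, decay=curve),
    StoredObject("b", created_at=200.0, initial_retention=8.0, decay=curve),
]
candidates = deletable(objects, waterline=2.0, now=300.0)
```

In this sketch the older object has decayed below the waterline while the newer one has not, so only the older one becomes a deletion candidate. How the waterline itself varies over time is left open here, as the abstract does not specify it.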
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for bulk deletion through segmented files, the method comprising: receiving a request for automatic deletion of segments in a container, wherein the container is exactly one file; determining a waterline for the container, wherein the waterline is a value, and wherein the value is based on a first retention decay curve of a given segment, and wherein the first retention decay curve is a decreasing mathematical function of the value over time, and wherein the value is a minimum value to retain a segment within a plurality of segments in the container; determining if at least one segment value within a plurality of segment values in the container falls below the waterline; and in response to the at least one segment value falling below the waterline, automatically deleting at least one segment associated with the at least one segment value from the container, wherein: the value is selected from a first range of numbers between and including a first real number and a second real number; the at least one segment value within the plurality of segment values is selected from a second range of numbers between and including a third real number and a fourth real number; the first real number is a lowest number of the first range of numbers; the second real number is a highest number of the first range of numbers; the third real number is a lowest number of the second range of numbers; the fourth real number is a highest number of the second range of numbers; the first real number, the second real number, the third real number, and the fourth real number are real numbers other than dates; the first real number and the second real number included in the range of numbers are on the first retention decay curve; the third real number and the fourth real number included in the second range of numbers are on a second retention decay curve; the second range of numbers is selected from the group consisting of the first range of numbers and another range of numbers different from the first range of numbers; and the second retention decay curve is selected from the group consisting of the first retention decay curve and another retention decay curve different from the first retention decay curve.
2. The method of claim 1, wherein: the value is set to a segment retention value; the segment retention value is a function of information within the given segment; and the segment retention value is the minimum value to retain the given segment within the plurality of segments in the container.
3. The method of claim 2, wherein the minimum value is determined by at least one of a creation date of the given segment, an initial retention value of the given segment, a current time, or a date for deletion of the given segment.
4. The method of claim 1, wherein the value is determined by the decreasing mathematical function, and wherein the decreasing mathematical function is determined by the first retention decay curve of the given segment, and wherein determining if the at least one segment value within the plurality of segment values in the container falls below the waterline further comprises: identifying at least one segment within the plurality of segments in the container whose value is below the waterline to form an at least one identified segment; and deleting the at least one identified segment from the container.
5. The method of claim 4, wherein segments that are not identified for deletion are not contiguous.
6. The method of claim 4, wherein segments that are not identified for deletion are contiguous.
7. The method of claim 1, wherein the value is determined by a function that converts a creation date of the given segment to the value and wherein determining if the at least one segment within the plurality of segments in the container falls below the waterline further comprises: scanning the plurality of segments in the container from a beginning of the container in ascending date order for a segment whose value is above the waterline; and deleting the segments from the beginning of the container up to the segment whose value is above the waterline.
8. A data processing system comprising: a bus system; a communications system connected to the bus system; a memory connected to the bus system, wherein the memory includes a set of instructions; and a processing unit connected to the bus system, wherein the processing unit executes the set of instructions to receive a request for automatic deletion of segments in a container, wherein the container is exactly one file; determine a waterline for the container, wherein the waterline is a value, and wherein the value is based on a first retention decay curve of a given segment, and wherein the first retention decay curve is a decreasing mathematical function of the value over time, and wherein the value is a minimum value to retain a segment within a plurality of segments in the container; determine if at least one segment value within a plurality of segment values in the container falls below the waterline; and automatically delete at least one segment associated with the at least one segment value from the container in response to the at least one segment value falling below the waterline, wherein: the value is selected from a first range of numbers between and including a first real number and a second real number; the at least one segment value within the plurality of segment values is selected from a second range of numbers between and including a third real number and a fourth real number; the first real number is a lowest number of the first range of numbers; the second real number is a highest number of the first range of numbers; the third real number is a lowest number of the second range of numbers; the fourth real number is a highest number of the second range of numbers; the first real number, the second real number, the third real number, and the fourth real number are real numbers other than dates; the first real number and the second real number included in the range of numbers are on the first retention decay curve; the third real number and the fourth real number included in the second range of numbers are on a second retention decay curve; the second range of numbers is selected from the group consisting of the first range of numbers and another range of numbers different from the first range of numbers; and the second retention decay curve is selected from the group consisting of the first retention decay curve and another retention decay curve different from the first retention decay curve.
9. The data processing system of claim 8, wherein: the value is set to a segment retention value; the segment retention value is a function of information within the given segment; and the segment retention value is the minimum value to retain the given segment within the plurality of segments in the container.
10. The data processing system of claim 9, wherein the minimum value is determined by at least one of a creation date of the given segment, an initial retention value of the given segment, a current time, or a date for deletion of the given segment.
11. The data processing system of claim 8, wherein the value is determined by the decreasing mathematical function, and wherein the decreasing mathematical function is determined by the first retention decay curve of the given segment, and wherein the set of instructions to determine if the at least one segment within the plurality of segments in the container falls below the waterline further comprises: a set of instructions to identify at least one segment within the plurality of segments in the container whose value is below the waterline to form at least one identified segment; and delete the at least one identified segment from the container.
12. The data processing system of claim 11, wherein segments that are not identified for deletion are not contiguous.
13. The data processing system of claim 11, wherein segments that are not identified for deletion are contiguous.
14. The data processing system of claim 8, wherein the value is determined by a function that converts a creation date of the given segment to the value and wherein the set of instructions to determine if the at least one segment within the plurality of segments in the container falls below the waterline further comprises: a set of instructions to scan the plurality of segments in the container from a beginning of the container in ascending date order for a segment whose value is above the waterline; and delete segments from the beginning of the container up to the segment whose value is above the waterline.
15. A computer program product comprising: a non-transitory computer readable medium including computer usable program code for bulk deletion through segmented files, the computer usable program code including: computer usable program code for receiving a request for automatic deletion of segments in a container, wherein the container is exactly one file; computer usable program code for determining a waterline for the container, wherein the waterline is a value, and wherein the value is based on a first retention decay curve of a given segment, and wherein the first retention decay curve is a decreasing mathematical function of the value over time, and wherein the value is a minimum value to retain a segment within a plurality of segments in the container; computer usable program code for determining if at least one segment value within a plurality of segment values in the container falls below the waterline; and computer usable program code for automatically deleting at least one segment associated with the at least one segment value from the container in response to the at least one segment value falling below the waterline, wherein: the value is selected from a first range of numbers between and including a first real number and a second real number; the at least one segment value within the plurality of segment values is selected from a second range of numbers between and including a third real number and a fourth real number; the first real number is a lowest number of the first range of numbers; the second real number is a highest number of the first range of numbers; the third real number is a lowest number of the second range of numbers; the fourth real number is a highest number of the second range of numbers; the first real number, the second real number, the third real number, and the fourth real number are real numbers other than dates; the first real number and the second real number included in the range of numbers are on the first retention decay curve; the third real number and the fourth real number included in the second range of numbers are on a second retention decay curve; the second range of numbers is selected from the group consisting of the first range of numbers and another range of numbers different from the first range of numbers; and the second retention decay curve is selected from the group consisting of the first retention decay curve and another retention decay curve different from the first retention decay curve.
16. The computer program product of claim 15, wherein: the value is set to a segment retention value; the segment retention value is a function of information within the given segment; and the segment retention value is the minimum value to retain the given segment within the plurality of segments in the container.
17. The computer program product of claim 16, wherein the minimum value is determined by at least one of a creation date of the given segment, an initial retention value of the given segment, a current time, or a date for deletion of the given segment.
18. The computer program product of claim 15, wherein the value is determined by the decreasing mathematical function, and wherein the decreasing mathematical function is determined by the first retention decay curve of the given segment, and wherein the computer usable program code for determining if the at least one segment within the plurality of segments in the container falls below the waterline further comprises: computer usable program code for identifying at least one segment within the plurality of segments in the container whose value is below the waterline to form at least one identified segment; and computer usable program code for deleting the at least one identified segment from the container.
19. The computer program product of claim 18, wherein segments that are not identified for deletion are not contiguous.
20. The computer program product of claim 18, wherein segments that are not identified for deletion are contiguous.
21. The computer program product of claim 15, wherein the value is determined by a function that converts a creation date of the given segment to the value and wherein the computer usable program code for determining if the at least one segment within the plurality of segments in the container falls below the waterline further comprises: computer usable program code for scanning the plurality of segments in the container from a beginning of the container in ascending date order for a segment whose value is above the waterline; and computer usable program code for deleting segments from the beginning of the container up to the segment whose value is above the waterline.
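The two deletion strategies recited in the dependent claims (identifying below-waterline segments anywhere in the container, and scanning from the beginning of a date-ordered container) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the container is modeled as an in-memory list rather than a single file, each segment carries a precomputed retention value, and the names `Segment`, `delete_below_waterline`, and `delete_prefix` are hypothetical. A segment whose value equals the waterline is assumed to be retained.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    created_at: float   # creation date, used only for ordering
    value: float        # current retention value (assumed precomputed)


def delete_below_waterline(
    container: List[Segment], waterline: float
) -> Tuple[List[Segment], List[Segment]]:
    """General case: check every segment against the waterline.

    Returns (kept, deleted). The kept segments need not have been
    contiguous in the original container.
    """
    kept = [s for s in container if s.value >= waterline]
    deleted = [s for s in container if s.value < waterline]
    return kept, deleted


def delete_prefix(
    container: List[Segment], waterline: float
) -> Tuple[List[Segment], List[Segment]]:
    """Date-ordered case: scan from the beginning and cut off a prefix.

    Assumes segments are stored in ascending creation-date order and that
    the value is derived from the creation date, so values never decrease
    along the container. The scan stops at the first segment that is not
    below the waterline; everything before it is deleted.
    """
    cut = 0
    while cut < len(container) and container[cut].value < waterline:
        cut += 1
    return container[cut:], container[:cut]


# Mixed values: survivors end up non-contiguous in the original layout.
mixed = [Segment(1, 0.5), Segment(2, 3.0), Segment(3, 1.0), Segment(4, 4.0)]
kept, deleted = delete_below_waterline(mixed, waterline=2.0)

# Date-ordered values: a single cut point separates deleted from kept.
ordered = [Segment(1, 0.5), Segment(2, 1.0), Segment(3, 3.0), Segment(4, 4.0)]
kept_tail, deleted_head = delete_prefix(ordered, waterline=2.0)
```

The prefix scan does less work per deletion pass because it stops at the first surviving segment, but it is only correct when segment values increase with creation date; the general filter makes no such assumption.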
June 20, 2005
December 16, 2014