10621143

Methods and Systems of a Dedupe File-System Garbage Collection

Published: April 14, 2020
Assignee: not available in the USPTO data

Patent Claims
2 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A computerized system of concurrently synchronizing a garbage collection thread and a writer threads in a dedupe file system in a data gathering state comprising: a processor configured to execute instructions; a memory containing instructions when executed on the processor, causes the processor to perform operations that: while the dedupe file system is in a data gathering state, a garbage collector thread concurrently working with writer threads, generate, with at least one processor, a garbage list of data chunks that are candidates for deletion by a garbage collector thread, wherein the garbage collector thread enumerates all backups on the data store, and wherein the garbage collector thread traverses a list of valid backups and removes any data chunks of the list of valid backups from an eraser database of the dedupe file system; and with the writer threads, referring to the garbage list of data chunks while ingesting data by: matching the data chunks with those present in the garbage list; filtering out the matched data chunks from garbage list of data chunks; setting, with the garbage collector thread, the dedupe file system to a data deletion state; and setting the writer threads to ingest data into the dedupe file system in synchronization with garbage collector thread.

Plain English Translation

The invention is a computerized system that synchronizes a garbage collector thread with writer threads in a deduplicating file system while the file system is in a data gathering state. In such a file system, data chunks are shared across multiple backups, so garbage collection and data ingestion must be coordinated to avoid deleting a chunk that is still needed. The system includes a processor and memory containing instructions that, when executed, run a garbage collector thread concurrently with writer threads. The garbage collector thread builds a garbage list of data chunks that are candidates for deletion: it enumerates all backups in the data store, traverses the list of valid backups, and removes from the eraser database every chunk that a valid backup still references. Meanwhile, the writer threads consult the garbage list while ingesting new data. When an incoming chunk matches an entry in the list, that entry is filtered out of the list, which prevents the chunk from being deleted. The garbage collector thread then sets the file system to a data deletion state, and the writer threads continue to ingest data in synchronization with the garbage collector thread. This coordination lets the system reclaim unreferenced chunks without losing data that concurrent writers depend on.
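
The data gathering state described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `GarbageList` class, the `gather_garbage` and `ingest` functions, and the chunk and backup names are all hypothetical, and an in-memory set stands in for the eraser database.

```python
import threading


class GarbageList:
    """Hypothetical thread-safe set of chunk IDs that are deletion candidates."""

    def __init__(self, all_chunks):
        self._lock = threading.Lock()
        self._candidates = set(all_chunks)

    def discard(self, chunk_id):
        # Filter a chunk out of the garbage list (no-op if it is not listed).
        with self._lock:
            self._candidates.discard(chunk_id)

    def snapshot(self):
        with self._lock:
            return set(self._candidates)


def gather_garbage(garbage, valid_backups):
    """Collector side: drop every chunk that a valid backup still references."""
    for chunk_ids in valid_backups.values():
        for chunk_id in chunk_ids:
            garbage.discard(chunk_id)


def ingest(garbage, incoming_chunk_ids):
    """Writer side: an incoming chunk that matches the list is filtered out."""
    for chunk_id in incoming_chunk_ids:
        garbage.discard(chunk_id)


# Assumed example data: five stored chunks and two valid backups.
stored_chunks = {"a", "b", "c", "d", "e"}
valid_backups = {"backup1": ["a", "b"], "backup2": ["b", "c"]}

garbage = GarbageList(stored_chunks)
collector = threading.Thread(target=gather_garbage, args=(garbage, valid_backups))
writer = threading.Thread(target=ingest, args=(garbage, ["d", "x"]))

collector.start()
writer.start()
collector.join()
writer.join()

remaining = garbage.snapshot()
print(remaining)  # only chunk "e" is still a deletion candidate
```

Because both threads only ever remove entries, the order in which they run does not change the result: chunks `a`, `b`, and `c` are kept by the valid backups, `d` is rescued by the writer, and `e` alone remains on the list.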

Claim 2

Original Legal Text

2. A computerized system of synchronizing a garbage collection thread and a writer threads in a dedupe file system in Data deletion state comprising: a processor configured to execute instructions; a memory containing instructions when executed on the processor, causes the processor to perform operations that: provide one or more writer threads concurrently working with garbage collector thread, referring to the garbage list of data chunks while ingesting data by: matching the data chunks with those present in the garbage list; for the matched data chunk add one or more hard links to the data chunk file in a temporary location, and wherein the hard links lock the data chunk file from deletion, wherein the hard links are a directory entry that associates a name with a data chunk file on a file system of the computerized system; with a garbage collector thread: iterate through the garbage list; and obtain an exclusive access of each data chunk and delete any data chunk that is not marked by the one or more writer threads as having two hard links.

Plain English Translation

This invention is a computerized system that synchronizes a garbage collector thread with writer threads in a deduplicating file system while the file system is in the data deletion state. The goal is to delete unreferenced data chunks safely without removing a chunk that an active writer thread still needs. The system includes a processor and memory storing instructions that run one or more writer threads concurrently with the garbage collector thread. While ingesting data, the writer threads match incoming data chunks against the garbage list of deletion candidates. When a match is found, the writer thread adds one or more hard links to the chunk's file in a temporary location. A hard link is a directory entry that associates a name with a file, so several names can refer to the same data, and here the extra link locks the chunk file against deletion. The garbage collector thread iterates through the garbage list, obtains exclusive access to each chunk, and deletes any chunk that the writer threads have not marked as having two hard links. A chunk without the second link is one that no active writer is using. This mechanism preserves data integrity by sparing chunks that are still in use while reclaiming the space held by chunks that are no longer referenced.
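
The hard-link check described above can be illustrated with standard file-system calls. The following is a minimal sketch under stated assumptions, not the patented implementation: the function names `writer_protect` and `collect` and the directory layout are hypothetical, a single `threading.Lock` stands in for the per-chunk exclusive access, and the file system is assumed to support hard links.

```python
import os
import tempfile
import threading

# Assumption: one process-wide lock stands in for exclusive access to a chunk.
chunk_lock = threading.Lock()


def writer_protect(chunk_path, temp_dir):
    """Writer side: add a second hard link in a temporary location."""
    link_path = os.path.join(temp_dir, os.path.basename(chunk_path))
    with chunk_lock:
        if os.path.exists(chunk_path):
            os.link(chunk_path, link_path)  # link count becomes 2
            return True
    return False  # the chunk was already deleted and must be written again


def collect(garbage_paths):
    """Collector side: delete chunks that do not have two hard links."""
    deleted = []
    for path in garbage_paths:
        with chunk_lock:
            if os.stat(path).st_nlink < 2:
                os.remove(path)
                deleted.append(os.path.basename(path))
    return deleted


with tempfile.TemporaryDirectory() as root:
    store = os.path.join(root, "store")
    temp = os.path.join(root, "tmp")
    os.makedirs(store)
    os.makedirs(temp)

    # Two chunks on the garbage list; a writer matches only the first one.
    paths = []
    for name in ("chunk1", "chunk2"):
        path = os.path.join(store, name)
        with open(path, "wb") as handle:
            handle.write(b"data")
        paths.append(path)

    writer_protect(paths[0], temp)
    deleted = collect(paths)
    survivors = sorted(os.listdir(store))

print(deleted, survivors)
```

In this sketch the link count reported by `os.stat` serves as the "marked by a writer" signal, so the collector needs no separate bookkeeping: a chunk with a second link is skipped, and a chunk with only its original link is removed.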

Patent Metadata

Filing Date

Unknown

Publication Date

April 14, 2020

Inventors

Ashish Govind Khurange
Kulangara Kuriakose George
Sachin Baban Durge
Kuldeep Sureshrao Nagarkar
Ravender Goyal

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “METHODS AND SYSTEMS OF A DEDUPE FILE-SYSTEM GARBAGE COLLECTION” (10621143). https://patentable.app/patents/10621143