10929613

Automated Document Cluster Merging for Topic-Based Digital Assistant Interpretation

Published: February 23, 2021
Assignee: not available in the USPTO data we have

Patent Claims
18 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A non-transitory computer storage medium storing computer-useable instructions that, when used by at least one computing device, cause the at least one computing device to: obtain a set of determined representative phrases for each electronic document cluster in a generated plurality of electronic document clusters, wherein each electronic document cluster in the generated plurality of electronic document clusters includes a portion of electronic documents of a plurality of electronic documents, each electronic document of the plurality of electronic documents being associated with one of a plurality of stored command templates; define a plurality of logical relationships amongst the generated plurality of electronic document clusters; determine a plurality of contextually similar electronic document groups from the generated plurality of electronic document clusters based on the defined plurality of logical relationships, each contextually similar electronic document group including a corresponding portion of the generated plurality of electronic document clusters; determine a set of cluster tags for each contextually similar electronic document group of the determined plurality of contextually similar electronic document groups; for each contextually similar electronic document group of the determined plurality of contextually similar electronic document groups, extract a set of topics and corresponding sub-topics from the determined corresponding set of cluster tags; and store the extracted sets of topics and corresponding sub-topics to a data store, each stored set of topics and corresponding sub-topics being associated with one of the determined plurality of contextually similar electronic document groups.

Plain English Translation

This claim covers software, stored on a non-transitory medium, that merges document clusters and extracts topics from the merged groups. The starting point is a plurality of electronic document clusters that has already been generated; each cluster holds a portion of a larger document collection, and every document in that collection is associated with one of a plurality of stored command templates. For each cluster, the system obtains a set of representative phrases. It then defines logical relationships among the clusters and uses those relationships to determine contextually similar groups, each made up of a portion of the clusters. For each group, the system determines a set of cluster tags and extracts a set of topics and corresponding sub-topics from those tags. Finally, it stores each set of topics and sub-topics in a data store, associated with the group it came from. The claim does not specify how the representative phrases, logical relationships, or cluster tags are computed.
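
The claim leaves the merging and extraction methods open, so the following Python is only a minimal sketch under stated assumptions: two clusters are treated as logically related when they share at least one representative phrase, a group's cluster tags are the phrases of its member clusters, and the most frequent tag becomes the topic while the remaining tags become sub-topics. The function names (`merge_clusters`, `extract_topics`) and the sample data are hypothetical and do not come from the patent.

```python
from collections import Counter

def merge_clusters(phrases_by_cluster):
    """Group clusters that are transitively linked by a shared phrase (union-find)."""
    parent = {c: c for c in phrases_by_cluster}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    first_seen = {}  # phrase -> first cluster that used it
    for cluster, phrases in phrases_by_cluster.items():
        for phrase in phrases:
            if phrase in first_seen:
                # Assumed relationship: a shared phrase links two clusters.
                parent[find(cluster)] = find(first_seen[phrase])
            else:
                first_seen[phrase] = cluster

    groups = {}
    for cluster in phrases_by_cluster:
        groups.setdefault(find(cluster), []).append(cluster)
    return sorted(sorted(g) for g in groups.values())

def extract_topics(group, phrases_by_cluster):
    """Assumed heuristic: most frequent tag is the topic, the rest are sub-topics."""
    tags = Counter(p for c in group for p in phrases_by_cluster[c])
    ranked = sorted(tags, key=lambda t: (-tags[t], t))
    return {"topic": ranked[0], "sub_topics": ranked[1:]}

# Illustrative input: representative phrases already determined per cluster.
phrases_by_cluster = {
    "c1": ["play music", "jazz"],
    "c2": ["play music", "rock"],
    "c3": ["order pizza"],
}
groups = merge_clusters(phrases_by_cluster)
# The "data store" here is just a dict keyed by the group's member clusters.
data_store = {tuple(g): extract_topics(g, phrases_by_cluster) for g in groups}
```

With this sample input, clusters `c1` and `c2` merge because they share "play music", and `c3` stays in a group of its own. A real system would likely use a richer similarity measure than exact phrase overlap.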

Claim 2

Original Legal Text

2. The medium of claim 1, the instructions further cause the at least one computing device to: generate a searchable index based on the stored sets of topics and corresponding sub-topics; determine that a portion of the generated plurality of document clusters is relevant to a command received from a remote computing device based on the generated searchable index; and provide a determined result to the remote computing device based on the determined relevant portion of the generated plurality of document clusters as a response to the received command.
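
The claim does not specify the index structure or how relevance to a command is decided. As a rough illustration only, the Python sketch below assumes an inverted index from the words of each topic and sub-topic to group identifiers, and ranks groups by how many words of the received command hit the index. The names `build_index` and `find_relevant_groups`, the group identifiers, and the sample data are hypothetical.

```python
from collections import defaultdict

def build_index(stored_topics):
    """Map each lowercase word of a topic or sub-topic to the groups that carry it."""
    index = defaultdict(set)
    for group_id, entry in stored_topics.items():
        for label in [entry["topic"], *entry["sub_topics"]]:
            for word in label.lower().split():
                index[word].add(group_id)
    return index

def find_relevant_groups(command, index):
    """Rank groups by how many distinct command words hit the index (assumed scoring)."""
    scores = defaultdict(int)
    for word in set(command.lower().split()):
        for group_id in index.get(word, ()):
            scores[group_id] += 1
    # Highest score first; ties broken by group identifier for determinism.
    return sorted(scores, key=lambda g: (-scores[g], g))

# Illustrative stored topics and sub-topics, one entry per merged group.
stored_topics = {
    "group-1": {"topic": "play music", "sub_topics": ["jazz", "rock"]},
    "group-2": {"topic": "order pizza", "sub_topics": []},
}
index = build_index(stored_topics)
relevant = find_relevant_groups("play some jazz", index)
```

The list returned by `find_relevant_groups` stands in for the "relevant portion" on which the response to the remote device would be based; a command with no matching words yields an empty list.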

Claim 3

Original Legal Text

3. The medium of claim 2, wherein the generated result corresponds to the determined relevant portion of the generated plurality of document clusters.

Plain English Translation

This claim depends on claim 2, in which the system builds a searchable index from the stored topics and sub-topics, uses that index to determine which portion of the document clusters is relevant to a command received from a remote computing device, and returns a result to that device. Claim 3 adds one limitation: the result corresponds to the portion of the document clusters that was determined to be relevant. The claim does not say how relevance is scored or what form the result takes.

Claim 4

Original Legal Text

4. The medium of claim 3, wherein the generated result includes at least one of a plurality of action datasets mapped to the determined relevant portion of the generated plurality of document clusters.

Plain English Translation

This claim depends on claim 3 and narrows the result returned in response to a received command. A plurality of action datasets is mapped to the portion of the document clusters determined to be relevant to the command, and the result includes at least one of those action datasets. In effect, the response carries at least one action dataset tied to the matching clusters. The claim does not define the contents of an action dataset.
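
Because the claim does not define an action dataset, the Python sketch below is an illustration under assumptions: each action dataset is modeled as a small record describing something an assistant could execute, and the mapping is a plain dictionary keyed by cluster identifier. The `ActionDataset` fields, the `build_result` function, and all sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionDataset:
    # Hypothetical fields; the claim text does not list them.
    name: str
    target_app: str

# Assumed mapping from cluster identifier to its action datasets.
ACTIONS_BY_CLUSTER = {
    "c1": [ActionDataset("play_jazz_playlist", "music_app")],
    "c2": [ActionDataset("play_rock_playlist", "music_app")],
    "c3": [ActionDataset("reorder_last_pizza", "food_app")],
}

def build_result(relevant_clusters, actions_by_cluster=ACTIONS_BY_CLUSTER):
    """Collect the action datasets mapped to the relevant clusters, in order."""
    actions = []
    for cluster_id in relevant_clusters:
        # Clusters with no mapped action dataset contribute nothing.
        actions.extend(actions_by_cluster.get(cluster_id, []))
    return {"clusters": list(relevant_clusters), "actions": actions}

result = build_result(["c1", "c2"])
```

Here `result["actions"]` holds the action datasets for `c1` and `c2`, which is one way a response could "include at least one" action dataset mapped to the relevant clusters.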

Claim 5

Original Legal Text

5. The medium of claim 1, wherein each electronic document of the plurality of electronic documents is generated based on other electronic documents retrieved from at least one remote data store.

Claim 6

Original Legal Text

6. The medium of claim 5, wherein each other electronic document is retrieved based on a query, the query being generated based on one of the plurality of stored command templates.

Plain English Translation

This claim depends on claim 5, under which each electronic document in the clustered collection is generated from other electronic documents retrieved from at least one remote data store. Claim 6 specifies how those other documents are retrieved: each is retrieved based on a query, and the query is generated from one of the stored command templates. This is consistent with claim 1's requirement that each document be associated with a command template. The claim does not describe the query format or the retrieval mechanism.
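
The claim does not describe what a command template looks like or how a query is derived from it. As a sketch only, the Python below assumes a template is a text pattern with named parameter slots, such as "play {song} by {artist}", and that a keyword query is formed from the literal words plus the slot names. The function `query_from_template` and the sample templates are hypothetical, and the step that sends the query to a remote data store is deliberately left out.

```python
from string import Formatter

def query_from_template(template):
    """Turn a slotted command template into a keyword query.

    Assumption: slot names such as {artist} are useful search keywords,
    so both the literal words and the slot names are kept.
    """
    words = []
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    for literal, field, _spec, _conv in Formatter().parse(template):
        words.extend(literal.split())
        if field:
            words.append(field)
    return " ".join(words)

# Illustrative stored command templates.
templates = ["play {song} by {artist}", "order {food} from {restaurant}"]
queries = [query_from_template(t) for t in templates]
```

Each resulting query keeps a link back to the template that produced it, so any document retrieved with that query can be associated with the same template.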

Claim 7

Original Legal Text

7. A computer-implemented method for extracting topics and/or sub-topics from merged document clusters, the method comprising: obtain, by a computing device, a set of determined representative phrases for each electronic document cluster in a generated plurality of electronic document clusters, wherein each electronic document cluster in the generated plurality of electronic document clusters includes a portion of electronic documents of a plurality of electronic documents, each electronic document of the plurality of electronic documents being associated with one of a plurality of stored command templates; define, by the computing device, a plurality of logical relationships amongst the generated plurality of electronic document clusters; determine, by the computing device, a plurality of contextually similar electronic document groups from the generated plurality of electronic document clusters based on the defined plurality of logical relationships, each contextually similar electronic document group including a corresponding portion of the generated plurality of electronic document clusters; determine, by the computing device, a set of cluster tags for each contextually similar electronic document group of the determined plurality of contextually similar electronic document groups; for each contextually similar electronic document group of the determined plurality of contextually similar electronic document groups, extract, by the computing device, a set of topics and corresponding sub-topics from the determined corresponding set of cluster tags; and store, by the computing device, the extracted sets of topics and corresponding sub-topics to a data store, each stored set of topics and corresponding sub-topics being associated with one of the determined plurality of contextually similar electronic document groups.

Claim 8

Original Legal Text

8. The method of claim 7, the instructions further cause the at least one computing device to: generate, by the computing device, a searchable index based on the stored sets of topics and corresponding sub-topics; determine, by the computing device, that a portion of the generated plurality of document clusters is relevant to a command received from a remote computing device based on the generated searchable index; and providing, by the computing device, a determined result to the remote computing device based on the determined relevant portion of the generated plurality of document clusters as a response to the received command.

Claim 9

Original Legal Text

9. The method of claim 8, wherein the generated result corresponds to the determined relevant portion of the generated plurality of document clusters.

Claim 10

Original Legal Text

10. The method of claim 9, wherein the generated result includes at least one of a plurality of action datasets mapped to the determined relevant portion of the generated plurality of document clusters.

Claim 11

Original Legal Text

11. The method of claim 7, wherein each electronic document of the plurality of electronic documents is generated based on other electronic documents retrieved from at least one remote data store.

Claim 12

Original Legal Text

12. The method of claim 11, wherein each other electronic document is retrieved based on a query, the query being generated based on one of the plurality of stored command templates.

Plain English Translation

This claim is the method counterpart of claim 6. It depends on claim 11, under which each electronic document in the clustered collection is generated from other electronic documents retrieved from at least one remote data store. Claim 12 specifies that each of those other documents is retrieved based on a query, and that the query is generated from one of the stored command templates. The claim does not describe the query format or the retrieval mechanism.

Claim 13

Original Legal Text

13. A system comprising: at least one processor; and at least one storage device storing computer-useable instructions that, when used by the at least one processor, cause the at least one processor to: obtain a set of determined representative phrases for each electronic document cluster in a generated plurality of electronic document clusters, wherein each electronic document cluster in the generated plurality of electronic document clusters includes a portion of electronic documents of a plurality of electronic documents, each electronic document of the plurality of electronic documents being associated with one of a plurality of stored command templates; define a plurality of logical relationships amongst the generated plurality of electronic document clusters; determine a plurality of contextually similar electronic document groups from the generated plurality of electronic document clusters based on the defined plurality of logical relationships, each contextually similar electronic document group including a corresponding portion of the generated plurality of electronic document clusters; determine a set of cluster tags for each contextually similar electronic document group of the determined plurality of contextually similar electronic document groups; for each contextually similar electronic document group of the determined plurality of contextually similar electronic document groups, extract a set of topics and corresponding sub-topics from the determined corresponding set of cluster tags; and store the extracted sets of topics and corresponding sub-topics to a data store, each stored set of topics and corresponding sub-topics being associated with one of the determined plurality of contextually similar electronic document groups.

Claim 14

Original Legal Text

14. The system of claim 13, wherein the instructions further cause the at least one computing device to: generate a searchable index based on the stored sets of topics and corresponding sub-topics; determine that a portion of the generated plurality of document clusters is relevant to a command received from a remote computing device based on the generated searchable index; and provide a determined result to the remote computing device based on the determined relevant portion of the generated plurality of document clusters as a response to the received command.

Claim 15

Original Legal Text

15. The system of claim 14, wherein the generated result corresponds to the determined relevant portion of the generated plurality of document clusters.

Claim 16

Original Legal Text

16. The system of claim 15, wherein the generated result includes at least one of a plurality of action datasets mapped to the determined relevant portion of the generated plurality of document clusters.

Plain English Translation

This claim is the system counterpart of claim 4. It depends on claim 15, under which the result returned in response to a received command corresponds to the portion of the document clusters determined to be relevant to that command. Claim 16 adds that a plurality of action datasets is mapped to that relevant portion, and that the result includes at least one of those action datasets. The claim does not define the contents of an action dataset or how the result is presented.

Claim 17

Original Legal Text

17. The system of claim 13, wherein each electronic document of the plurality of electronic documents is generated based on other electronic documents retrieved from at least one remote data store.

Plain English Translation

This claim is the system counterpart of claim 5. It depends on claim 13 and specifies where the clustered documents come from: each electronic document in the collection is generated based on other electronic documents retrieved from at least one remote data store. The claim does not say how the retrieved documents are combined or transformed into the generated document.

Claim 18

Original Legal Text

18. The system of claim 17, wherein each other electronic document is retrieved based on a query, the query being generated based on one of the plurality of stored command templates.

Patent Metadata

Filing Date

Unknown

Publication Date

February 23, 2021

Inventors

Vladimir Dobrynin
David Patterson
Niall Rooney

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Automated Document Cluster Merging for Topic-Based Digital Assistant Interpretation” (10929613). https://patentable.app/patents/10929613
