10679134

Automated Ontology Development

Published: June 9, 2020
Assignee: not available in the USPTO data we have

Patent Claims
20 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method of automated ontology development for processing communication data via a computer system, wherein the ontology is a structural representation of language elements and relationships between those language elements within a domain stored in the memory of the computer system the method comprising: processing a corpus of communication data, the corpus comprising communication data from a plurality of interactions; extracting a plurality of terms from the corpus, wherein each term of the plurality is a plurality of words that identify a single concept within the corpus; automatedly generating an ontology from the extracted term by at least creating two context vectors for each of the plurality of terms and comparing the context vectors for each of the plurality of terms to one another to categorize the terms into a plurality of relations, wherein a first of the two context vectors of a given term predicts terms that will appear to the left of the given term based on a calculated score for terms to the left of the given term, wherein a second of the two context vectors predicts terms that will appear to the right of the given term based on a calculated score for terms to the right of the given term; and storing the automatedly generated ontology in an ontology database in the memory of the computer system.

Plain English Translation

This invention relates to natural language processing and specifically to automated ontology development for understanding communication data. The problem addressed is the manual effort and time required to create structured representations of language for analyzing large volumes of communication. The method involves processing a collection of communication data from multiple interactions. From this data, a set of terms is extracted, where each term represents a single concept and can consist of multiple words. An ontology, which is a structured representation of language elements and their relationships within a specific domain, is then automatically generated from these extracted terms. This generation process involves creating two context vectors for each term. One context vector predicts terms that are likely to appear before a given term, based on a calculated score of preceding terms. The second context vector predicts terms that are likely to appear after a given term, based on a calculated score of succeeding terms. By comparing these context vectors for each term, the terms are categorized into various relationships. Finally, the resulting automated ontology is stored in a database within the computer system's memory.
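A minimal Python sketch of the two context vectors described above, under stated assumptions. The claim does not say how the score is calculated, so this example uses a plain adjacency count as the score. It compares vectors with cosine similarity, which is a swapped-in measure and not one named in the claim. The function names `build_context_vectors` and `cosine` and the sample terms are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_context_vectors(units):
    """Return (left, right) maps: term -> Counter of neighbouring-term scores.

    `units` is a list of term sequences, one sequence per segment of text.
    The score is a simple adjacency count (an assumption of this sketch).
    """
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for terms in units:
        for prev, cur in zip(terms, terms[1:]):
            left[cur][prev] += 1    # prev was seen to the left of cur
            right[prev][cur] += 1   # cur was seen to the right of prev
    return left, right


def cosine(a, b):
    """Cosine similarity between two sparse score vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical multi-word terms already extracted from three segments.
units = [
    ["would like", "credit card", "annual fee"],
    ["would like", "debit card", "annual fee"],
    ["lost my", "credit card", "last week"],
]
left, right = build_context_vectors(units)

# Terms with similar left and right contexts are candidates for the same relation.
print(round(cosine(left["credit card"], left["debit card"]), 3))    # 0.707
print(round(cosine(right["credit card"], right["debit card"]), 3))  # 0.707
```

How the similarity scores are turned into named relations is left open here, because the claim only requires that the vectors be compared to categorize the terms.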

Claim 2

Original Legal Text

2. The method of claim 1, wherein processing the corpus further comprises: receiving raw communication data; and applying a rank filter to select a portion of the raw communication data as the corpus of communication data.

Plain English Translation

This invention relates to processing communication data to improve analysis or decision-making. The problem addressed is the challenge of efficiently selecting relevant communication data from a large volume of raw inputs, such as messages, logs, or other unstructured or semi-structured data, to create a focused corpus for further analysis. The method involves receiving raw communication data, which may include text, metadata, or other structured or unstructured information from various sources. A rank filter is then applied to this raw data to select a portion of it as the corpus of communication data. The rank filter evaluates the raw data based on predefined criteria, such as relevance, importance, or quality, to filter out irrelevant or low-value data. This filtering step ensures that only the most pertinent data is included in the corpus, improving the efficiency and accuracy of subsequent analysis. The filtered corpus can then be used for various applications, such as sentiment analysis, trend detection, or decision support, by focusing on the most relevant subset of the original data. This approach reduces computational overhead and enhances the reliability of insights derived from the communication data. The rank filter may be based on statistical methods, machine learning models, or rule-based systems, depending on the specific requirements of the application.

Claim 3

Original Legal Text

3. The method of claim 2, wherein the raw communication data comprises transcriptions of interactions, agent scripts, service manuals, and product manuals.

Plain English Translation

This claim specifies what the raw communication data of claim 2 contains. The raw data comprises transcriptions of interactions, agent scripts, service manuals, and product manuals. The rank filter of claim 2 is applied to this material to select the corpus from which terms are extracted and the ontology is generated. The claim does not add any analysis or recommendation step; it only identifies the sources. Drawing on scripts and manuals as well as transcribed interactions means the corpus can reflect both how people actually speak in interactions and how the domain is formally documented.

Claim 4

Original Legal Text

4. The method of claim 1, wherein processing the corpus further comprises: identifying scripts within the corpus, wherein scripts are recurring patterns of three or more words.

Plain English Translation

The invention relates to natural language processing (NLP) and text analysis, specifically addressing the challenge of identifying recurring patterns or scripts within a text corpus. Scripts are defined as recurring sequences of three or more words that appear in a predictable order, often representing common linguistic or behavioral patterns. The method involves analyzing a text corpus to detect these scripts, which can be used for applications such as sentiment analysis, dialogue systems, or automated content generation. The process includes preprocessing the corpus to standardize text, followed by pattern recognition techniques to extract multi-word sequences that repeat frequently. The identified scripts are then stored or used for further analysis, enabling insights into language use or behavior. This approach improves the accuracy of NLP tasks by leveraging structured, repetitive language patterns. The method may also include filtering scripts based on frequency or context to enhance relevance. By automating script detection, the invention reduces manual effort in linguistic analysis and supports scalable text processing.
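One way to picture script identification is as counting repeated word sequences. The sketch below is a hedged illustration, not the patented procedure: it treats a script as a fixed-length word n-gram with at least three words and calls it "recurring" when it appears at least twice, and both the threshold and the function name `find_scripts` are assumptions.

```python
from collections import Counter


def find_scripts(utterances, n=3, min_count=2):
    """Return word n-grams (n >= 3) that occur at least `min_count` times."""
    if n < 3:
        raise ValueError("a script is a pattern of three or more words")
    counts = Counter()
    for text in utterances:
        words = text.lower().split()
        # Slide a window of n words across the utterance.
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_count}


utterances = [
    "thank you for calling acme",
    "hello thank you for calling",
    "please hold the line",
]
print(find_scripts(utterances))
# {'thank you for': 2, 'you for calling': 2}
```

A fuller version would merge overlapping n-grams into the longest recurring sequence; this sketch keeps them separate for brevity.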

Claim 5

Original Legal Text

5. The method of claim 4, wherein processing the corpus further comprises: zoning the communication data to segment the communication data into meaning units.

Plain English Translation

This invention relates to processing communication data for automated ontology development. Building on claim 4, in which recurring scripts are identified within the corpus, this claim adds a zoning step: the communication data is segmented into meaning units. Each meaning unit is a segment of the data that conveys a coherent unit of meaning, so later steps can work on one unit at a time instead of on an undivided transcript. The claim does not state how unit boundaries are found. In text data, they might follow sentence boundaries, paragraphs, or topic shifts; in speech data, they might follow pauses, intonation changes, or other acoustic cues that mark the end of one meaning unit and the start of another.

Claim 6

Original Legal Text

6. The method of claim 5, wherein the plurality of terms are extracted from the corpus on a meaning unit-by-meaning unit basis.

Plain English Translation

This invention relates to natural language processing and information extraction, specifically addressing the challenge of accurately extracting meaningful terms from a corpus of text. The method involves analyzing a text corpus to identify and extract terms on a meaning unit-by-meaning unit basis, ensuring that each extracted term retains its contextual and semantic relevance. The approach improves upon traditional term extraction techniques by focusing on discrete meaning units rather than arbitrary segments, which enhances the precision and coherence of the extracted terms. This method is particularly useful in applications such as document summarization, semantic search, and knowledge graph construction, where maintaining the integrity of meaning is critical. The extracted terms can be further processed or used to generate structured data representations, enabling more accurate and context-aware information retrieval and analysis. By operating at the level of meaning units, the method avoids the pitfalls of fragmented or misleading term extraction, leading to more reliable and interpretable results. The technique can be applied across various domains, including legal, medical, and technical documentation, where precise term extraction is essential for downstream tasks.

Claim 7

Original Legal Text

7. The method of claim 1, wherein the plurality of interactions are customer service interactions and the ontology is tailored for use in analyzing customer service interactions.

Plain English Translation

This invention relates to a method for analyzing customer service interactions using a specialized ontology. The method involves processing a plurality of customer service interactions, such as calls, chats, or emails, to extract and analyze data. The ontology is specifically designed for customer service applications, enabling structured categorization and interpretation of interaction content. The method includes identifying key elements within the interactions, such as customer queries, agent responses, and resolution outcomes, and mapping these elements to predefined categories within the ontology. This structured analysis allows for improved tracking of common issues, agent performance, and customer satisfaction trends. The ontology may include hierarchical relationships between terms, enabling deeper insights into interaction patterns. The method may also involve natural language processing (NLP) techniques to parse unstructured text or speech data, ensuring accurate classification. By tailoring the ontology to customer service contexts, the method enhances the efficiency and accuracy of interaction analysis, supporting better decision-making in service operations. The invention addresses the challenge of extracting meaningful insights from large volumes of unstructured customer service data, providing a systematic approach to understanding and improving service quality.

Claim 8

Original Legal Text

8. The method of claim 1, wherein the ontology comprises a plurality of terms, a plurality of relations, and a plurality of themes identified from the corpus.

Plain English Translation

This invention relates to a method for constructing an ontology from a corpus of text data. The method addresses the challenge of automatically extracting structured knowledge from unstructured text by identifying key terms, relationships between those terms, and broader themes present in the corpus. The ontology serves as a knowledge representation framework that organizes information in a way that supports semantic search, data analysis, and machine learning applications. The method involves processing the corpus to identify a plurality of terms, which are significant words or phrases that represent concepts within the text. It also identifies a plurality of relations, which are connections or associations between the terms, such as hierarchical, causal, or contextual relationships. Additionally, the method identifies a plurality of themes, which are higher-level groupings of related terms and relations that represent broader topics or subjects in the corpus. These themes help in organizing the ontology into meaningful clusters. The ontology is constructed by integrating these extracted elements—terms, relations, and themes—into a structured format that can be queried and analyzed. This structured representation enables more efficient information retrieval and supports applications like natural language processing, knowledge graphs, and semantic reasoning. The method ensures that the ontology dynamically adapts to the content of the corpus, making it useful for evolving datasets.

Claim 9

Original Legal Text

9. The method of claim 1, wherein the plurality of interactions is from multiple platforms.

Plain English Translation

This claim narrows claim 1 by specifying that the interactions supplying the corpus come from multiple platforms. The ontology is therefore developed from communication data gathered across more than one channel, not from a single source. The claim itself does not list the platforms. Claim 15 gives examples for the method of claim 12, including phone, email, internet chat, text message, web page comment, and social media interactions.

Claim 10

Original Legal Text

10. The method of claim 1, wherein the first of the two context vectors of a given term is a list of terms that predicts terms that will appear to the left of a given term, the second of the two context vectors is a second list of terms that predicts terms that will appear to the right of the given term, and each of the context vectors includes up to a predetermined number of potential terms in the first or second list of terms.

Plain English Translation

This invention relates to natural language processing (NLP) and text analysis, specifically improving the representation of terms in a document by capturing contextual relationships. The problem addressed is the limitation of traditional term representation methods, which often fail to account for the directional context of words in a sentence or document. For example, a word's meaning can vary based on whether it appears before or after another word, but existing models may not distinguish between left and right contextual influences. The solution involves generating two distinct context vectors for each term in a document. The first context vector is a list of terms that predict or are likely to appear to the left of the given term, while the second context vector is a list of terms that predict or are likely to appear to the right of the given term. Each list is constrained to a predetermined number of potential terms, ensuring computational efficiency and relevance. This bidirectional approach allows the system to better model the directional dependencies between words, improving tasks such as semantic analysis, machine translation, and text generation. The method enhances term representation by explicitly capturing asymmetric contextual relationships, leading to more accurate and nuanced language understanding.
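The cap on list length can be sketched as a top-k selection over scored neighbours. This is an illustrative assumption, since the claim says only that each list holds up to a predetermined number of terms and does not say how they are chosen. The function name `top_k_context`, the alphabetical tie-break, and the sample scores are all hypothetical.

```python
from collections import Counter


def top_k_context(scores, k):
    """Keep at most k neighbouring terms, highest score first.

    Ties are broken alphabetically so the result is deterministic
    (a choice made for this sketch only).
    """
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical scores for terms seen to the left of "credit card".
left_scores = Counter({"would like": 5, "lost my": 2, "cancel my": 2, "need a": 1})
print(top_k_context(left_scores, 3))
# ['would like', 'cancel my', 'lost my']
```

The same function would be applied separately to the right-hand scores to obtain the second list.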

Claim 11

Original Legal Text

11. The method of claim 1, wherein automatedly generating the ontology further comprises: comparing the plurality of relations to one another to categorize the relations into a plurality of themes.

Plain English Translation

The invention relates to automated ontology generation, specifically improving the organization of relational data within a knowledge graph or semantic network. The core problem addressed is the lack of structured thematic categorization in automatically generated ontologies, which can lead to disorganized or redundant relationships between entities. The method involves analyzing a plurality of relations extracted from data sources to identify and categorize them into distinct themes. This is achieved by comparing the relations to one another based on shared attributes, contextual similarities, or semantic proximity. The categorized themes are then used to refine the ontology, ensuring that related concepts are grouped logically, improving clarity and usability. This step enhances the ontology's ability to represent complex knowledge structures efficiently, making it more suitable for applications like natural language processing, data integration, or expert systems. The method may also include preprocessing steps to extract and normalize relations from unstructured or semi-structured data, ensuring consistency before thematic categorization. The resulting ontology can be dynamically updated as new relations are identified, maintaining relevance over time. The approach reduces manual effort in ontology curation while improving the accuracy and coherence of the generated knowledge structure.

Claim 12

Original Legal Text

12. A method of automated ontology development, the method comprising: processing a corpus of communication data, the corpus comprising communication data from a plurality of interactions, by zoning the communication data to segment the communication data into a plurality of meaning units; extracting a plurality of terms from each of the plurality of meaning units, wherein each term of the plurality is a plurality of words that identify a single concept within the corpus; automatedly generating an ontology that comprises the extracted terms by at least creating two context vectors for each of the plurality of terms and comparing the context vectors for each of the plurality of terms to one another to categorize the terms into a plurality of relations, wherein a first of the two context vectors of a given term predicts terms that will appear to the left of the given term based on a calculated score for terms to the left of the given term, wherein a second of the two context vectors predicts terms that will appear to the right of the given term based on a calculated score for terms to the right of the given term; and storing the automatedly generated ontology in an ontology database.

Plain English Translation

The field of automated ontology development involves creating structured knowledge representations from unstructured communication data. A challenge in this domain is efficiently extracting meaningful concepts and relationships from large datasets to build accurate ontologies without manual intervention. This method processes a corpus of communication data from multiple interactions by segmenting it into meaning units, which are smaller portions of text that convey distinct concepts. From these segments, multi-word terms representing single concepts are extracted. The method then generates an ontology by creating two context vectors for each term. The first vector predicts terms that typically appear to the left of a given term based on a calculated score, while the second predicts terms that appear to the right. By comparing these vectors, the terms are categorized into relational groups, forming the ontology. The resulting ontology is stored in a database for further use. This approach automates the extraction of semantic relationships from communication data, enabling the creation of structured knowledge models without manual input. The use of bidirectional context vectors improves the accuracy of term categorization by considering both preceding and succeeding terms in the corpus.

Claim 13

Original Legal Text

13. The method of claim 12, wherein processing the corpus further comprises: receiving raw communication data; and applying a rank filter to select a portion of the raw communication data as the corpus of communication data.

Plain English Translation

This invention relates to processing communication data to improve information retrieval or analysis. The problem addressed is the challenge of efficiently selecting relevant data from large volumes of raw communication data, such as messages, emails, or other digital exchanges, to form a meaningful corpus for further analysis. The method involves receiving raw communication data, which may include unstructured or semi-structured text-based interactions. A rank filter is applied to this raw data to select a portion of it as the corpus of communication data. The rank filter evaluates the raw data based on predefined criteria, such as relevance, importance, or quality, to filter out irrelevant or low-value content. This ensures that only the most significant or useful data is retained for subsequent processing, improving efficiency and accuracy in downstream tasks like sentiment analysis, topic modeling, or search indexing. The rank filter may use techniques such as keyword matching, statistical scoring, or machine learning models to assess the data. The filtered corpus is then used for further analysis, such as extracting insights, detecting trends, or generating reports. This approach reduces computational overhead and enhances the reliability of results by focusing on high-quality data. The method is applicable in fields like customer support, social media monitoring, and business intelligence, where processing large datasets efficiently is critical.

Claim 14

Original Legal Text

14. The method of claim 13, wherein the rank filter selects data files from the raw communication data that include a threshold of identified related terms to the domain of the ontology that is to be developed.

Plain English Translation

This invention relates to a method for filtering and ranking data files from raw communication data to develop an ontology. The method addresses the challenge of efficiently identifying and selecting relevant data files that contain a sufficient number of terms related to a specific domain, thereby improving the accuracy and relevance of the ontology development process. The method involves analyzing raw communication data to identify terms that are related to the domain of the ontology being developed. A rank filter is then applied to select data files that include a threshold number of these identified related terms. This ensures that only the most relevant data files are used in the ontology development process, enhancing the quality and specificity of the resulting ontology. The method may also include preprocessing the raw communication data to extract and normalize terms, as well as applying additional filters to further refine the selection of data files. By focusing on data files with a high concentration of relevant terms, the method improves the efficiency and effectiveness of ontology development in various domains.
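The threshold-based selection can be sketched as follows. This is a simplified illustration under assumptions: the claim does not say whether the threshold counts distinct terms or total occurrences, so the sketch counts distinct domain terms, and it matches them by case-insensitive substring search. The function name `rank_filter`, the file names, and the term list are hypothetical.

```python
def rank_filter(files, domain_terms, threshold):
    """Return names of files that contain at least `threshold` distinct domain terms.

    `files` maps a file name to its text. Matching is a case-insensitive
    substring test (an assumption of this sketch).
    """
    selected = []
    for name, text in files.items():
        lowered = text.lower()
        hits = sum(1 for term in domain_terms if term in lowered)
        if hits >= threshold:
            selected.append(name)
    return selected


files = {
    "call_001.txt": "I want to cancel my credit card and dispute the annual fee",
    "call_002.txt": "What time does the store open on Sunday",
}
domain_terms = {"credit card", "annual fee", "interest rate"}
print(rank_filter(files, domain_terms, threshold=2))
# ['call_001.txt']
```

Only the files returned by the filter would form the corpus that is passed to term extraction.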

Claim 15

Original Legal Text

15. The method of claim 14, wherein the raw communication data comprises interaction data from the interactions from multiple platforms including interactions made via one or more of by phone, email, internet chat, text message, web page comment, social media interaction, customer surveys, an audio recording, streaming audio, a transcription of spoken content, or written correspondence.

Plain English Translation

This invention relates to analyzing raw communication data from multiple interaction platforms to derive insights. The problem addressed is the fragmentation of customer or user interactions across diverse communication channels, making it difficult to gain a unified understanding of engagement patterns. The solution involves collecting and processing interaction data from various sources, including phone calls, emails, internet chat, text messages, web page comments, social media interactions, customer surveys, audio recordings, streaming audio, transcriptions of spoken content, and written correspondence. The method aggregates this data to identify trends, sentiments, or other actionable insights. The system may use natural language processing or other analytical techniques to extract meaningful information from the raw data. By integrating interactions from multiple platforms, the invention enables businesses or organizations to assess customer behavior, improve service quality, or enhance decision-making based on comprehensive communication data. The approach ensures that no single interaction channel is analyzed in isolation, providing a holistic view of user engagement.

Claim 16

Original Legal Text

16. The method of claim 12, wherein automatedly generating the ontology further comprises: comparing the plurality of relations to one another to categorize the relations into a plurality of themes.

Plain English Translation

This invention relates to automated ontology generation, specifically improving the organization of relational data within a knowledge graph or semantic network. The problem addressed is the lack of structured thematic categorization in automatically generated ontologies, which can lead to disorganized or redundant relationships between entities. The method involves analyzing a plurality of relations extracted from data sources to identify and categorize them into a plurality of themes. These relations represent connections between entities in a knowledge graph, such as subject-predicate-object triples. By comparing the relations to one another, the system groups them into broader thematic categories, ensuring that semantically similar relations are clustered together. This thematic categorization enhances the ontology's coherence and usability by reducing redundancy and improving logical consistency. The method may also include preprocessing the relations to standardize formats, resolving ambiguities, and filtering irrelevant or low-confidence connections. The thematic categorization step ensures that the ontology reflects meaningful, high-level groupings of relationships, making it easier for users or downstream applications to navigate and query the knowledge structure. This approach is particularly useful in large-scale knowledge graphs where manual organization would be impractical.

Claim 17

Original Legal Text

17. The method of claim 16, wherein the ontology further comprises the plurality of relations and the plurality of themes.

Plain English Translation

This claim narrows claim 16 by stating what the stored ontology contains. In addition to the extracted terms, the ontology comprises the plurality of relations into which the terms were categorized and the plurality of themes into which those relations were grouped. The ontology therefore has three levels: terms, relations between terms, and themes that group related relations. The claim does not require dynamic updating or any particular retrieval application.

Claim 18

Original Legal Text

18. A system for automated ontology development, the system comprising: a communication data database populated with communication data; a processor communicatively connected to the database of communication data and communicatively connected to a computer readable medium programmed with computer readable code that upon execution by the processor causes the processor to: process a corpus of communication data received from the database; extract a plurality of terms from the corpus, wherein each term of the plurality is a plurality of words that identify a single concept within the corpus; and automatedly generate an ontology from the extracted terms by at least creating two context vectors for each of the plurality of terms and comparing the context vectors for each of the plurality of terms to one another to categorize the terms into a plurality of relations, wherein a first of the two context vectors of a given term predicts terms that will appear to the left of the given term based on a calculated score for terms to the left of the given term, wherein a second of the two context vectors predicts terms that will appear to the right of the given term based on a calculated score for terms to the right of the given term; and an ontology database upon which the processor stores the automatedly generated ontology.

Plain English Translation

The system automates the development of ontologies by analyzing communication data to extract and categorize terms into relational structures. The system includes a database storing communication data, a processor, and a computer-readable medium with code that processes a corpus of communication data to extract multi-word terms representing single concepts. The processor generates two context vectors for each term: one predicting terms that appear to its left and another predicting terms that appear to its right, based on calculated scores. These vectors are compared to categorize terms into relational groupings, forming an ontology. The generated ontology is stored in a dedicated database. This approach automates the creation of structured knowledge representations from unstructured communication data, improving efficiency in ontology development by leveraging contextual term relationships. The system eliminates manual term extraction and categorization, reducing human effort and potential bias in ontology construction. The use of bidirectional context vectors enhances accuracy in term relationship mapping, ensuring robust ontology generation.

Claim 19

Original Legal Text

19. The system of claim 18, wherein the communication data comprises transcriptions of interactions, agent scripts, service manuals, and product manuals.

Plain English Translation

This system automates ontology development by processing various communication data. It comprises a database storing this communication data, a processor running specific code, and an ontology database for storing the results. The processor is configured to: process a selected corpus of communication data from its database; extract multi-word terms, each identifying a single concept; and automatically generate an ontology from these terms. This generation involves creating two context vectors for each term: a first vector predicting terms appearing to its left and a second predicting terms appearing to its right, both based on calculated scores. These context vectors are compared to categorize terms into relations. The resulting ontology is then stored in the ontology database. Specifically, the communication data processed by this system includes transcriptions of interactions, agent scripts, service manuals, and product manuals.

Claim 20

Original Legal Text

20. The system of claim 18, further comprising: a script database communicatively connected to the processor; and wherein execution of the computer readable code by the processor further causes the processor to: surface a plurality of scripts from the communication data; store the plurality of scripts at the script database; and apply the plurality of scripts from the script database to the corpus of communication data to identify scripts within the corpus of communication data.

Plain English Translation

This claim adds script handling to the ontology development system of claim 18. Scripts are recurring interaction patterns in the communication data, such as frequently used responses or workflows in customer service interactions. The system includes a script database communicatively connected to the processor. When the processor executes the computer-readable code, it surfaces a plurality of scripts from the communication data, stores them in the script database, and applies the stored scripts to the corpus of communication data to identify where those scripts occur within the corpus. This lets the system recognize recurring interactions across the corpus and reduces the manual effort of finding and applying scripts, which in turn can support more consistent handling of communication workflows.
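The surface, store, and apply steps of this claim can be sketched in Python as follows. The claim does not say how scripts are surfaced or matched, so everything here is an assumption for illustration: a script is modeled as a fixed-length sequence of terms, surfacing keeps sequences found in at least min_count interactions, and applying is an exact contiguous match. The names surface_scripts and find_scripts are hypothetical, and a plain list stands in for the script database.

```python
from collections import Counter


def surface_scripts(interactions, length=3, min_count=2):
    """Return term sequences of a fixed length found in at least min_count interactions."""
    counts = Counter()
    for terms in interactions:
        # Use a set so a sequence is counted once per interaction.
        seen = {tuple(terms[i:i + length]) for i in range(len(terms) - length + 1)}
        counts.update(seen)
    return [seq for seq, n in counts.items() if n >= min_count]


def find_scripts(interaction, scripts):
    """Return (position, script) pairs where a stored script occurs in the interaction."""
    hits = []
    for script in scripts:
        n = len(script)
        for i in range(len(interaction) - n + 1):
            if tuple(interaction[i:i + n]) == script:
                hits.append((i, script))
    return hits


# Toy corpus: each interaction is a list of already-extracted terms.
interactions = [
    ["thank you", "for calling", "how can I help", "billing"],
    ["thank you", "for calling", "how can I help", "refund"],
    ["hello", "refund", "please"],
]
script_db = surface_scripts(interactions)           # stand-in for the script database
matches = find_scripts(interactions[1], script_db)  # apply stored scripts to the corpus
```

Here the only sequence shared by two interactions is the greeting, so it is the single script surfaced and it is found at position 0 of the second interaction. A real system would likely need variable-length scripts and tolerant matching, which this sketch leaves out.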

Patent Metadata

Filing Date

Unknown

Publication Date

June 9, 2020

Inventors

Roni Romano
Yair Horesh
Jeremie Dreyfuss

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “AUTOMATED ONTOLOGY DEVELOPMENT” (10679134). https://patentable.app/patents/10679134

