10938779

Guided Word Association Based Domain Name Detection

Published: March 2, 2021
Assignee: not available in the USPTO data we have
Technical Abstract

Patent Claims
20 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A computer program product including one or more computer readable storage mediums collectively storing program instructions that are executable by a computer to cause the computer to perform operations comprising: obtaining an original domain name; constructing a feature space from a corpus of text, wherein each word appearing in the corpus is represented as a vector in the feature space; detecting whether a domain name registration exists for each combination of the original domain name and each of a plurality of seed words from the feature space; determining, for each seed word included in an existing domain name registration, a plurality of nearest neighbor candidate words, based on vector distance in the feature space; and repeating, for at least one repetition, the detecting and the determining, wherein the plurality of nearest neighbor candidate words are utilized as the plurality of seed words.

Plain English Translation

This invention relates to domain name detection: finding registered domain names that combine an original domain name with related words. It is claimed as a computer program product, that is, program instructions stored on one or more computer-readable storage media. When executed, the instructions first obtain an original domain name. They then construct a feature space from a corpus of text in which each word appearing in the corpus is represented as a vector. Next, the system checks whether a domain name registration exists for each combination of the original domain name with each of a set of "seed words" taken from the feature space. For every seed word that turns out to be part of an existing registration, the system determines a set of "nearest neighbor" candidate words based on vector distance in the feature space. The detecting and determining steps are then repeated at least once, with the nearest neighbor candidate words serving as the new seed words. Each repetition therefore steers the search toward words close to those that already produced existing registrations.
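The loop described in claim 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `guided_search`, `is_registered`, and `nearest_neighbors` are hypothetical names, the registration check is a lookup in a toy set rather than a real registry query, and skipping seeds that were already tried is a convenience of the sketch, not something the claim requires.

```python
from typing import Callable, Dict, Iterable, List, Set


def guided_search(
    original: str,
    seeds: Iterable[str],
    is_registered: Callable[[str], bool],
    nearest_neighbors: Callable[[str], List[str]],
    repetitions: int = 1,
) -> Set[str]:
    """Return the combined names found registered across all rounds."""
    found: Set[str] = set()
    tried: Set[str] = set()
    current = list(dict.fromkeys(seeds))  # de-duplicate, keep order

    # One initial pass plus `repetitions` repeats of detect + determine.
    for _ in range(repetitions + 1):
        next_seeds: List[str] = []
        for seed in current:
            if seed in tried:  # assumption of this sketch: test each word once
                continue
            tried.add(seed)
            name = original + seed  # here the seed follows the original name
            if is_registered(name):
                found.add(name)
                # Only seeds that produced a hit are expanded.
                next_seeds.extend(nearest_neighbors(seed))
        current = list(dict.fromkeys(next_seeds))
        if not current:
            break
    return found


# Toy stand-ins (hypothetical data, no network access).
REGISTERED = {"acmeshop", "acmestore", "acmemarket"}
NEIGHBORS: Dict[str, List[str]] = {
    "shop": ["store", "mall"],
    "store": ["market", "shop"],
    "news": ["press"],
}

hits = guided_search(
    "acme",
    ["shop", "news"],
    is_registered=lambda name: name in REGISTERED,
    nearest_neighbors=lambda word: NEIGHBORS.get(word, []),
    repetitions=2,
)
```

In this toy run, "news" produces no registration and is never expanded, while "shop" leads to "store" and then to "market". Passing the registration check and the neighbor lookup as callables keeps the loop independent of any particular registry or embedding model.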

Claim 2

Original Legal Text

2. The computer program product of claim 1, further comprising disqualifying, from the plurality of nearest neighbor candidate words of each seed word included in an existing domain name registration, each candidate word having a vector distance within a threshold distance of any seed word included in a non-existent domain name registration; wherein the repeating further includes the disqualifying.
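A minimal sketch of the disqualifying step in claim 2 follows. It assumes Euclidean distance and treats "within a threshold distance" as less than or equal to the threshold; the claim itself names neither. The function name `disqualify` and the two-dimensional vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def disqualify(
    candidates: List[str],
    failed_seeds: List[str],
    vectors: Dict[str, Vector],
    threshold: float,
) -> List[str]:
    """Drop candidates lying within `threshold` of any failed seed.

    A "failed seed" is a seed word whose combination with the original
    domain name had no existing registration.
    """
    kept = []
    for word in candidates:
        too_close = any(
            math.dist(vectors[word], vectors[seed]) <= threshold
            for seed in failed_seeds
        )
        if not too_close:
            kept.append(word)
    return kept


# Hypothetical 2-D vectors, chosen only for illustration.
VECTORS = {
    "store": (1.0, 0.0),
    "mall": (0.9, 0.1),
    "press": (0.0, 1.0),
    "media": (0.1, 0.9),
}

# "press" is assumed to be a seed whose combined name was not registered.
survivors = disqualify(["store", "mall", "media"], ["press"], VECTORS, threshold=0.5)
```

Here "media" sits close to the failed seed "press" and is dropped, while "store" and "mall" survive to the next repetition.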

Claim 3

Original Legal Text

3. The computer program product of claim 2, wherein the disqualifying includes maintaining, as a candidate word, every nth candidate word having a vector distance within a threshold distance of any seed word included in a non-existent domain name registration.

Claim 4

Original Legal Text

4. The computer program product of claim 1, wherein prior to the repeating, the plurality of seed words include words from the corpus that appear most frequently.
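Choosing the initial seed words by frequency, as claim 4 describes, could look like the sketch below. The tokenizer (lowercase runs of letters and digits) and the cutoff `k` are assumptions of this example, and `initial_seeds` is a hypothetical name; the claim says only that the seeds include the most frequent words of the corpus.

```python
import re
from collections import Counter
from typing import Iterable, List


def initial_seeds(corpus: Iterable[str], k: int) -> List[str]:
    """Return the k most frequent words in the corpus as starting seeds."""
    counts: Counter = Counter()
    for text in corpus:
        # Assumed tokenization: lowercase alphanumeric runs.
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return [word for word, _ in counts.most_common(k)]


# Hypothetical three-document corpus.
corpus = ["shop online shop", "online shop deals", "news"]
seeds = initial_seeds(corpus, k=2)
```

No stop-word filtering is applied here, so on real text very common function words would dominate unless they are removed first.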

Claim 5

Original Legal Text

5. The computer program product of claim 1, wherein each combination of the original domain name and each of the plurality of seed words is one of a combination in which the seed word follows the original domain name and a combination in which the seed word precedes the original domain name.
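The two orderings named in claim 5 can be illustrated as follows. The sketch assumes the original domain name is handled as a bare label with the top-level domain appended afterwards; that split, the default `tld`, and the name `combine_names` are hypothetical and not stated in the claim.

```python
from typing import List


def combine_names(original_label: str, seed: str, tld: str = "com") -> List[str]:
    """Build both orderings of the original label and a seed word."""
    return [
        f"{original_label}{seed}.{tld}",  # seed follows the original name
        f"{seed}{original_label}.{tld}",  # seed precedes the original name
    ]


names = combine_names("acme", "shop")
```

The claim requires each combination to be one of these two forms; generating both for every seed is simply the broadest way to cover them.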

Claim 6

Original Legal Text

6. The computer program product of claim 1, wherein the constructing includes cataloging the unique words in the corpus, and generating a vector for each unique word in the feature space.

Plain English Translation

Claim 6 narrows the constructing step of claim 1. To build the feature space, the system catalogs the unique words in the corpus and generates a vector for each unique word. Every distinct word in the corpus therefore has its own vector, which is what allows the later steps to measure vector distance between words and pick nearest neighbor candidates. The claim does not say how the vectors are computed; it requires only that each unique word receives one in the feature space.
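As a rough illustration of cataloging unique words and giving each one a vector, the sketch below uses plain co-occurrence counts within a small window. This is a stand-in chosen for simplicity: the claim does not specify any vectorization technique, and `build_feature_space`, the tokenizer, and the `window` parameter are all assumptions of the example.

```python
import re
from typing import Dict, List


def build_feature_space(corpus: List[str], window: int = 2) -> Dict[str, List[float]]:
    """Map every unique word to a co-occurrence count vector."""
    docs = [re.findall(r"[a-z0-9]+", text.lower()) for text in corpus]

    # Catalog the unique words; sorting fixes one dimension per word.
    vocab = sorted({word for doc in docs for word in doc})
    index = {word: i for i, word in enumerate(vocab)}

    # Generate one vector per unique word, initially all zeros.
    space = {word: [0.0] * len(vocab) for word in vocab}
    for doc in docs:
        for pos, word in enumerate(doc):
            lo = max(0, pos - window)
            hi = min(len(doc), pos + window + 1)
            # Count each neighbor inside the window, excluding the word itself.
            for other in doc[lo:pos] + doc[pos + 1:hi]:
                space[word][index[other]] += 1.0
    return space


space = build_feature_space(["buy shoes online", "buy boots online"])
```

In this tiny corpus "shoes" and "boots" appear in identical contexts, so they receive identical vectors and would be each other's nearest neighbor. A learned embedding model could replace the counting step without changing the surrounding interface.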

Claim 7

Original Legal Text

7. The computer program product of claim 1, further comprising selecting a corpus of text according to one of location and time.

Plain English Translation

Claim 7 adds one step to the computer program product of claim 1: selecting the corpus of text according to location or time. Because the feature space is constructed from the corpus, the choice of corpus determines which words receive vectors and which words end up as nearest neighbors. Selecting text tied to a particular location or a particular time period therefore ties the seed words and candidate words to the vocabulary of that place or period. The claim does not specify how location or time is recorded or matched.
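One way to picture corpus selection by location or time is a simple metadata filter, sketched below. The `Document` record, its `location` and `published` fields, and the `select_corpus` function are hypothetical; the claim does not describe how documents carry location or time information.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Document:
    text: str
    location: str    # assumed: a country code attached to the text
    published: date  # assumed: the date the text was published


def select_corpus(
    docs: List[Document],
    location: Optional[str] = None,
    since: Optional[date] = None,
    until: Optional[date] = None,
) -> List[str]:
    """Keep the text of documents matching the given location and/or dates."""
    selected = []
    for doc in docs:
        if location is not None and doc.location != location:
            continue
        if since is not None and doc.published < since:
            continue
        if until is not None and doc.published > until:
            continue
        selected.append(doc.text)
    return selected


docs = [
    Document("tokyo ramen guide", "JP", date(2020, 5, 1)),
    Document("berlin club night", "DE", date(2020, 6, 1)),
    Document("tokyo olympics delay", "JP", date(2021, 1, 15)),
]
by_location = select_corpus(docs, location="JP")
by_time = select_corpus(docs, since=date(2020, 6, 1), until=date(2020, 12, 31))
```

The sketch accepts either criterion or both together, which is broader than the claim wording of selecting according to one of location and time.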

Claim 8

Original Legal Text

8. A computer-implemented method comprising: obtaining an original domain name; constructing a feature space from a corpus of text, wherein each word appearing in the corpus is represented as a vector in the feature space; detecting whether a domain name registration exists for each combination of the original domain name and each of a plurality of seed words from the feature space; determining, for each seed word included in an existing domain name registration, a plurality of nearest neighbor candidate words, based on vector distance in the feature space; and repeating, for at least one repetition, the detecting and the determining, wherein the plurality of nearest neighbor candidate words are utilized as the plurality of seed words.
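The determining step, which picks nearest neighbor candidate words by vector distance, might be sketched as below. Euclidean distance and a fixed neighbor count `k` are assumptions; the claim refers only to "vector distance" and "a plurality" of candidates. The name `nearest_neighbors` and the toy vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def nearest_neighbors(word: str, vectors: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the k words whose vectors lie closest to `word`."""
    target = vectors[word]
    # Pair each other word with its distance; ties break alphabetically.
    ranked = sorted(
        (math.dist(target, vec), other)
        for other, vec in vectors.items()
        if other != word
    )
    return [other for _, other in ranked[:k]]


# Hypothetical 2-D vectors, chosen only for illustration.
VECTORS = {
    "shop": (1.0, 0.0),
    "store": (0.9, 0.1),
    "mall": (0.7, 0.3),
    "news": (0.0, 1.0),
}
candidates = nearest_neighbors("shop", VECTORS, k=2)
```

A full sort is fine for a toy vocabulary; with a large feature space an approximate nearest neighbor index would be the more practical choice.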

Claim 9

Original Legal Text

9. The computer-implemented method of claim 8, further comprising disqualifying, from the plurality of nearest neighbor candidate words of each seed word included in an existing domain name registration, each candidate word having a vector distance within a threshold distance of any seed word included in a non-existent domain name registration; wherein the repeating further includes the disqualifying.

Claim 10

Original Legal Text

10. The computer-implemented method of claim 9, wherein the disqualifying includes maintaining, as a candidate word, every nth candidate word having a vector distance within a threshold distance of any seed word included in a non-existent domain name registration.

Plain English Translation

Claim 10 refines the disqualifying step that claim 9 adds to the method of claim 8. Under claim 9, a nearest neighbor candidate word is disqualified when its vector distance falls within a threshold distance of any seed word whose combination with the original domain name had no existing registration. Claim 10 relaxes this rule: every nth candidate word that meets the disqualifying condition is maintained as a candidate anyway. The remaining candidates that meet the condition are still dropped. As a result, the next repetition of the detecting and determining steps still tests a sample of words from regions of the feature space near seed words that did not yield a registration, rather than excluding those regions entirely.
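The "every nth" rule can be sketched as a counter over the candidates that would otherwise be disqualified. As before, Euclidean distance and the less-than-or-equal comparison are assumptions, and `disqualify_keep_every_nth` with its one-dimensional toy vectors is purely illustrative.

```python
import math
from typing import Dict, List, Sequence


def disqualify_keep_every_nth(
    candidates: List[str],
    failed_seeds: List[str],
    vectors: Dict[str, Sequence[float]],
    threshold: float,
    n: int,
) -> List[str]:
    """Drop candidates near failed seeds, but maintain every nth of them."""
    if n < 1:
        raise ValueError("n must be at least 1")
    kept = []
    near_count = 0  # how many near-a-failed-seed candidates seen so far
    for word in candidates:
        near = any(
            math.dist(vectors[word], vectors[seed]) <= threshold
            for seed in failed_seeds
        )
        if not near:
            kept.append(word)
            continue
        near_count += 1
        if near_count % n == 0:  # the nth, 2nth, ... near candidate survives
            kept.append(word)
    return kept


# Hypothetical 1-D vectors; "x" is a seed that yielded no registration.
VECS = {"x": (0.0,), "a": (0.1,), "b": (0.2,), "c": (5.0,), "d": (0.3,), "e": (0.4,)}
kept_words = disqualify_keep_every_nth(["a", "b", "c", "d", "e"], ["x"], VECS, 1.0, n=2)
```

With `n=2`, the second and fourth candidates near the failed seed ("b" and "e") are maintained, "a" and "d" are dropped, and "c" is kept because it was never within the threshold.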

Claim 11

Original Legal Text

11. The computer-implemented method of claim 8, wherein prior to the repeating, the plurality of seed words include words from the corpus that appear most frequently.

Claim 12

Original Legal Text

12. The computer-implemented method of claim 8, wherein each combination of the original domain name and each of the plurality of seed words is one of a combination in which the seed word follows the original domain name and a combination in which the seed word precedes the original domain name.

Claim 13

Original Legal Text

13. The computer-implemented method of claim 8, wherein the constructing includes cataloging the unique words in the corpus, and generating a vector for each unique word in the feature space.

Claim 14

Original Legal Text

14. The computer-implemented method of claim 8, further comprising selecting a corpus of text according to one of location and time.

Plain English Translation

Claim 14 adds one step to the method of claim 8: selecting the corpus of text according to location or time. For example, the corpus might be limited to text associated with a particular geographic area or to text from a specified time range. Because the feature space is constructed from the selected corpus, this choice determines which words are available as seed words and nearest neighbor candidates. The claim does not specify the source of the text or how location or time is determined.

Claim 15

Original Legal Text

15. An apparatus comprising: an obtaining section configured to obtain an original domain name; a constructing section configured to construct a feature space from a corpus of text, wherein each word appearing in the corpus is represented as a vector in the feature space; a detecting section configured to detect whether a domain name registration exists for each combination of the original domain name and each of a plurality of seed words from the feature space; a determining section configured to determine, for each seed word included in an existing domain name registration, a plurality of nearest neighbor candidate words, based on vector distance in the feature space; and a repeating section configured to cause the detecting section and the determining section to repeat, for at least one repetition, their respective functions utilizing the plurality of nearest neighbor candidate words as the plurality of seed words, wherein the obtaining, constructing, detecting, determining, and repeating sections are implemented by a memory device for storing program code and a hardware processor for executing the program code.

Claim 16

Original Legal Text

16. The apparatus of claim 15, further comprising a disqualifying section configured to disqualify, from the plurality of nearest neighbor candidate words of each seed word included in an existing domain name registration, each candidate word having a vector distance within a threshold distance of any seed word included in a non-existent domain name registration; wherein the repeating section further causes the disqualifying section to repeat its function.

Claim 17

Original Legal Text

17. The apparatus of claim 16, wherein the disqualifying section includes a maintaining section configured to maintain, as a candidate word, every nth candidate word having a vector distance within a threshold distance of any seed word included in a non-existent domain name registration.

Claim 18

Original Legal Text

18. The apparatus of claim 15, wherein prior to the repeating section causing the detecting section and the determining section to repeat, the plurality of seed words include words from the corpus that appear most frequently.

Claim 19

Original Legal Text

19. The apparatus of claim 15, wherein each combination of the original domain name and each of the plurality of seed words is one of a combination in which the seed word follows the original domain name and a combination in which the seed word precedes the original domain name.

Claim 20

Original Legal Text

20. The apparatus of claim 15, wherein the constructing includes a cataloging section configured to catalog the unique words in the corpus, and generating section configure to generate a vector for each unique word in the feature space.

Plain English Translation

Claim 20 narrows the constructing section of the apparatus of claim 15. The constructing section includes a cataloging section, which catalogs the unique words in the corpus, and a generating section, which generates a vector for each unique word in the feature space. These vectors are what the determining section compares when it selects nearest neighbor candidate words by vector distance. The claim does not specify what the dimensions of the feature space represent or how the vectors are computed.

Patent Metadata

Filing Date

Unknown

Publication Date

March 2, 2021

Inventors

Pablo Loyola
Kugamoorthy Gajananan
Yuji Watanabe
Fumiko Akiyama

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “GUIDED WORD ASSOCIATION BASED DOMAIN NAME DETECTION” (10938779). https://patentable.app/patents/10938779