{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-9672278","patent":{"patent_number":"US-9672278","title":"Category-based lemmatizing of a phrase in a document","assignee":null,"inventors":[],"filing_date":"2015-08-07T00:00:00.000Z","publication_date":"2017-06-06T00:00:00.000Z","cpc_codes":["G06F","G06F","G06F","G06F"],"num_claims":14,"abstract":"A processor receives a string of binary data that represents an initial phrase that includes multiple words and is associated with a specific category. The processor removes one or more letters from an end of a word in the initial phrase to form an initial truncated version of the phrase. The processor runs a TF-IDF algorithm on the initial truncated version of the phrase, and lemmatizes subsequent truncated versions of the initial phrase by recursively removing remaining letters from the end of the word. The processor runs the TF-IDF algorithm on subsequent truncated versions of the initial truncated version of the initial phrase until a highest TF-IDF value is identified. The processor defines a breadth of a lemma for a lexeme based on the specific category of the phrase, and assigns the specific truncated version having the highest TF-IDF value to the specific category."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Category-based lemmatizing of a phrase in a document","description":"A processor receives a string of binary data that represents an initial phrase that includes multiple words and is associated with a specific category. The processor removes one or more letters from a","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-9672278","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-9672278","citation_suggestion":"Patentable. \"Category-based lemmatizing of a phrase in a document\" (US-9672278). https://patentable.app/patents/US-9672278","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-9672278","json":"https://patentable.app/api/llm-context/US-9672278","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-06-06T11:41:45.138Z"}