{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-11532378","patent":{"patent_number":"US-11532378","title":"Protein database search using learned representations","assignee":null,"inventors":[],"filing_date":"2021-11-23T00:00:00.000Z","publication_date":"2022-12-20T00:00:00.000Z","cpc_codes":["G06N","G16B","G06F","G06F","G06N","G06N","G16B","G06N","G06N"],"num_claims":17,"abstract":"A method for efficient search of protein sequence databases for proteins that have sequence, structural, and/or functional homology with respect to information derived from a search query. The method involves transforming the protein sequences into vector representations and searching in a vector space. Given a database of protein sequences and a learned embedding model, the embedding model is applied to each amino acid sequence to transform it into a sequence of vector representations. A query sequence is also transformed into a sequence of vector representations, preferably using the same learned embedding model. Once the query has been embedded in this manner, proteins are retrieved from the database based on distance between the query embedding and the protein embeddings contained within the database. Rapid and accurate search of the vector space is carried out using exact search using metric data structures, or approximate search using locality sensitive hashing."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Protein database search using learned representations","description":"A method for efficient search of protein sequence databases for proteins that have sequence, structural, and/or functional homology with respect to information derived from a search query. \nThe method ","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-11532378","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-11532378","citation_suggestion":"Patentable. \"Protein database search using learned representations\" (US-11532378). https://patentable.app/patents/US-11532378","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-11532378","json":"https://patentable.app/api/llm-context/US-11532378","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-31T01:19:41.597Z"}