{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-10546575","patent":{"patent_number":"US-10546575","title":"Using recurrent neural network for partitioning of audio data into segments that each correspond to a speech feature cluster identifier","assignee":null,"inventors":[],"filing_date":"2016-12-14T00:00:00.000Z","publication_date":"2020-01-28T00:00:00.000Z","cpc_codes":["G10L","G10L","G10L","G10L"],"num_claims":20,"abstract":"Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Using recurrent neural network for partitioning of audio data into segments that each correspond to a speech feature cluster identifier","description":"Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, s","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-10546575","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-10546575","citation_suggestion":"Patentable. \"Using recurrent neural network for partitioning of audio data into segments that each correspond to a speech feature cluster identifier\" (US-10546575). https://patentable.app/patents/US-10546575","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-10546575","json":"https://patentable.app/api/llm-context/US-10546575","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-31T20:42:04.141Z"}