Method and Device for Information Processing

Published: March 23, 2021

Assignee: not available in the USPTO data we have

Patent Claims

18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method, comprising: obtaining, using a processor, audio data collected by a slave device; obtaining, using the processor, contextual data characterizing a voice environment where the audio data is collected by the slave device, the contextual data including data characterizing a space where the slave device is located and including contextual parameters generated using historical audio data collected by the slave device according to a frequency of occurrence of a voice input entry in the historical audio data that has a relevance with a context in a historical collection period; and obtaining, using the processor, a recognition result of recognizing the audio data based on the contextual data, including: determining a function of the space where the slave device is located according to the contextual data, the function indicating an intended use of the space; and recognizing the audio data based on the function of the space to obtain the recognition result, the recognition result belonging to a topic associated with the intended use of the space, wherein obtaining the contextual data further comprises: determining the contextual data at (n+1)th moment according to topic information mapped by audio data collected at (n)th moment, n being a positive integer, wherein a pause time between the (n)th moment of collecting the audio data and the (n+1)th moment is less than a pre-set pause time.
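The final limitation of claim 1 (context at the (n+1)th moment derived from the topic of the (n)th moment when the pause is short) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `Utterance` type, the `context_for_next` function, and the 5-second threshold do not come from the patent, which gives no concrete pause value.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold; the claim only says "a pre-set pause time".
PRESET_PAUSE_SECONDS = 5.0


@dataclass
class Utterance:
    collected_at: float          # collection time of the (n)th audio data, in seconds
    topic: Optional[str] = None  # topic information mapped by that audio data


def context_for_next(previous: Utterance,
                     next_collected_at: float,
                     default_context: Optional[str] = None,
                     preset_pause: float = PRESET_PAUSE_SECONDS) -> Optional[str]:
    """Pick the contextual data to use at the (n+1)th moment.

    If the pause since the (n)th moment is shorter than the pre-set pause
    time, reuse the topic mapped by the (n)th audio data; otherwise fall
    back to a default context (for example, one derived from the space).
    """
    pause = next_collected_at - previous.collected_at
    if previous.topic is not None and 0 <= pause < preset_pause:
        return previous.topic
    return default_context
```

In this sketch a follow-up utterance spoken 2 seconds after a chemistry-related one keeps the chemistry context, while one spoken 20 seconds later falls back to the default.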

2. The method according to claim 1, wherein: obtaining the audio data collected by the slave device includes: receiving, by a master device, the audio data sent from the slave device via a first connection mode; and obtaining the contextual data corresponding to the slave device includes: sending, via a second connection mode, the audio data and the contextual data to a server; and receiving, via the second connection mode, the recognition result returned from the server after the server recognizes the audio data and the contextual data, wherein a maximum communication distance of the first connection mode is less than a maximum communication distance of the second connection mode, the first connection mode being a local network transmission mode, and the second connection mode being a mobile data signal transmission mode.

3. The method according to claim 1, wherein: obtaining the contextual data corresponding to the slave device includes: receiving, from the slave device among at least two slave devices, attribute data characterizing device attributes of the slave device, and determining the contextual data based on the attribute data.

4. The method according to claim 3, wherein determining the contextual data based on the attribute data includes: determining the contextual data, based on the attribute data and a predetermined correspondence relationship between the attribute data and the contextual data.
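The "predetermined correspondence relationship" of claims 3 and 4 could be as simple as a lookup table. The sketch below assumes that form; the attribute names (`install_location`, `device_type`), their values, and the function name are hypothetical and are not specified by the claims.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical correspondence table: (attribute name, attribute value) -> contextual data.
CORRESPONDENCE: Mapping[Tuple[str, str], str] = {
    ("install_location", "kitchen"): "kitchen",
    ("device_type", "range-hood microphone"): "kitchen",
    ("install_location", "chemistry conference room"):
        "conference room of a chemical department",
}


def context_from_attributes(attribute_data: Mapping[str, str],
                            correspondence: Mapping[Tuple[str, str], str] = CORRESPONDENCE,
                            default: Optional[str] = None) -> Optional[str]:
    """Map a slave device's attribute data to contextual data via the table."""
    for item in attribute_data.items():  # each item is a (name, value) pair
        if item in correspondence:
            return correspondence[item]
    return default
```

A static table is only one possible reading; the claim language would also cover other predetermined mappings.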

5. The method according to claim 1, wherein obtaining the recognition result of recognizing the audio data based on the contextual data further includes: for the audio data containing one or more homophone entries corresponding to a plurality of recognition results, selecting a recognition result matched with the contextual data as a final recognition result of the one or more homophone entries.
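The homophone selection of claim 5 can be sketched as choosing, among candidate transcriptions that sound alike, the one that belongs to the vocabulary of the current context. The vocabulary sets, the example words, and the function name below are illustrative assumptions; the patent does not prescribe how "matched with the contextual data" is computed.

```python
from typing import Mapping, Sequence, Set

# Hypothetical per-topic vocabularies.
TOPIC_VOCABULARY: Mapping[str, Set[str]] = {
    "food and beverage": {"flour", "thyme", "steak"},
    "chemistry": {"mole", "base", "titration"},
}


def pick_homophone(candidates: Sequence[str],
                   context_topic: str,
                   topic_vocabulary: Mapping[str, Set[str]] = TOPIC_VOCABULARY) -> str:
    """Return the candidate matched with the context, else the first candidate.

    `candidates` is assumed to be ordered by the recognizer's own preference,
    so the fallback keeps the recognizer's top choice.
    """
    vocabulary = topic_vocabulary.get(context_topic, set())
    for candidate in candidates:
        if candidate in vocabulary:
            return candidate
    return candidates[0]
```

For audio collected in a kitchen, the candidates `["flower", "flour"]` resolve to `"flour"` under this sketch.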

6. The method according to claim 1, wherein obtaining the recognition result of recognizing the audio data based on the contextual data further includes: for correcting the recognition result of the audio data, selecting a correction result matched with the contextual data as a final recognition result of the audio data.

7. A device, comprising: a first device, configured to obtain audio data collected by a slave device; a second device, configured to obtain contextual data characterizing a voice environment where the audio data is collected by the slave device, the contextual data including data characterizing a space where the slave device is located and including contextual parameters generated using historical audio data collected by the slave device according to a frequency of occurrence of a voice input entry in the historical audio data that has a relevance with a context in a historical collection period; and a third device, configured to obtain a recognition result of recognizing the audio data based on the contextual data, by: determining a function of the space where the slave device is located according to the contextual data, the function indicating an intended use of the space; and recognizing the audio data based on the function of the space to obtain the recognition result, the recognition result belonging to a topic associated with the intended use of the space, wherein the second device is further configured to: determine the contextual data at (n+1)th moment according to topic information mapped by audio data collected at (n)th moment, n being a positive integer, wherein a pause time between the (n)th moment of collecting the audio data and the (n+1)th moment is less than a pre-set pause time.
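The independent claims refer to contextual parameters generated from historical audio data "according to a frequency of occurrence of a voice input entry ... that has a relevance with a context". One possible reading is a relative-frequency count per context, sketched below. The data shapes and the function name are assumptions, and the history is assumed to be already transcribed into text entries.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Set


def contextual_parameters(history: Iterable[str],
                          context_vocabulary: Mapping[str, Set[str]]) -> Dict[str, float]:
    """Estimate how strongly each context is represented in past voice input.

    history: entries recognized during a historical collection period.
    context_vocabulary: context name -> entries considered relevant to it.
    Returns context name -> share of relevant entries (shares sum to 1),
    or an empty dict when no entry is relevant to any context.
    """
    counts: Counter = Counter()
    for entry in history:
        for context, vocabulary in context_vocabulary.items():
            if entry in vocabulary:
                counts[context] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {context: n / total for context, n in counts.items()}
```

With a history of "boil", "beaker", "simmer", and "hello", and vocabularies for a food context and a chemistry context, the food context receives two thirds of the weight and the chemistry context one third.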

8. The device according to claim 7, wherein: the first device receives the audio data from the slave device among at least two slave devices; and the second device receives, from the slave device, attribute data characterizing device attributes of the slave device, and determines the contextual data based on the attribute data.

9. The device according to claim 8, wherein: the second device determines the contextual data, based on the attribute data and a predetermined correspondence relationship between the attribute data and the contextual data.

10. The device according to claim 8, wherein: for the audio data containing one or more homophone entries corresponding to a plurality of recognition results, the third device selects a recognition result matched with the contextual data as a final recognition result of the one or more homophone entries.

11. A device, comprising: a communication interface; and a processor, operatively coupled to the communication interface, wherein: the processor, under a predetermined execution instruction, uses the communication interface to: obtain audio data collected by a slave device; obtain contextual data characterizing a voice environment where the audio data is collected by the slave device, the contextual data including data characterizing a space where the slave device is located and including contextual parameters generated using historical audio data collected by the slave device according to a frequency of occurrence of a voice input entry in the historical audio data that has a relevance with a context in a historical collection period; and obtain a recognition result of recognizing the audio data based on the contextual data, by: determining a function of the space where the slave device is located according to the contextual data, the function indicating an intended use of the space; and recognizing the audio data based on the function of the space to obtain the recognition result, the recognition result belonging to a topic associated with the intended use of the space; wherein the processor is configured to: determine the contextual data at (n+1)th moment according to topic information mapped by audio data collected at (n)th moment, n being a positive integer, wherein a pause time between the (n)th moment of collecting the audio data and the (n+1)th moment is less than a pre-set pause time.

12. The device according to claim 11, wherein: the communication interface includes a first communication interface and a second communication interface different from the first communication interface, wherein: the first communication interface receives the audio data sent from the slave device via a first connection method; and the second communication interface sends, via a second connection mode, the audio data and the contextual data to a server, and receives, via the second connection mode, the recognition result returned from the server after the server recognizes the audio data and the contextual data, wherein a maximum communication distance of the first connection mode is less than a maximum communication distance of the second connection mode.

13. The device according to claim 11, wherein: the communication interface receives the audio data from the slave device among at least two slave devices; and receives, from the slave device, attribute data characterizing device attributes of the slave device; and the processor determines the contextual data based on the attribute data.

14. The device according to claim 13, wherein the processor further: determines the contextual data, based on the attribute data and a predetermined correspondence relationship between the attribute data and the contextual data.

15. The device according to claim 11, wherein the processor further: when the audio data contains one or more homophone entries corresponding to a plurality of recognition results, selects a recognition result matched with the contextual data as a final recognition result of the one or more homophone entries.

16. The device according to claim 11, wherein the processor further: when correcting the recognition result of the audio data, selects a correction result matched with the contextual data as a final recognition result of the audio data.

17. The method according to claim 1, wherein: the space is selected from at least one of a kitchen or conference rooms of different departments.

18. The method according to claim 1, wherein: the space is a kitchen and the topic associated with the intended use of the space is food and beverage; or the space is a conference room of a chemical department, and the topic associated with the intended use of the space is chemistry.
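The two space-to-topic pairings recited in claim 18 can be written down directly as a mapping from the space to the topic associated with its intended use. The pairings themselves come from the claim; the dictionary keys, the function name, and the `None` fallback for unlisted spaces are assumptions made for this sketch.

```python
from typing import Optional

# The two pairings recited in claim 18; the key strings are illustrative labels.
SPACE_TO_TOPIC = {
    "kitchen": "food and beverage",
    "conference room of a chemical department": "chemistry",
}


def topic_for_space(space: str) -> Optional[str]:
    """Return the topic associated with the intended use of the space.

    Spaces not covered by the table return None, leaving the caller to
    recognize the audio without a topic restriction.
    """
    return SPACE_TO_TOPIC.get(space)
```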

Patent Metadata

Filing Date

Unknown

Publication Date

March 23, 2021

Inventors

Weixing SHI
