A unified web-based voice messaging system provides voice application control between a proxy browser having a web browser, and an application server via a hypertext transport protocol (HTTP) connection on an Internet Protocol (IP) network. The proxy browser serves as an HTTP interface for a user device that lacks HTML and HTTP processing capabilities, such as an analog telephone, a cellular telephone, a voice over IP telephone, and the like. The web browser receives an HTML page from the application server having an XML element that defines data for an audio operation to be performed by an executable audio resource within the proxy browser. The audio resource, also referred to as a media resource, selectively executes the HTML tags and the audio operation based on the determined capabilities of the user device. If the user device does not have audio capabilities, the media resource ignores the audio operation, and merely presents the HTML information, assuming the user device has a display. If the media resource determines that the user device has at least a speaker and possibly a microphone, the media resource executes the audio operation based on enhanced audio control specified by the XML element. Similarly, if the media resource determines that the user device does not have a display, the HTML tag information is discarded by the media resource. Hence, a proxy browser can be used by user devices to access enhanced voice control for voice-enabled web applications.
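The capability-based selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `DeviceCapabilities`, `AudioOperation`, and `select_actions`, the operation kinds `"play"` and `"record"`, and the file name `welcome.wav` are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DeviceCapabilities:
    """Hypothetical record of what the user device can do."""
    has_display: bool
    has_speaker: bool
    has_microphone: bool


@dataclass(frozen=True)
class AudioOperation:
    """Hypothetical form of the directive carried by the XML element."""
    kind: str    # assumed values: "play" or "record"
    target: str  # audio file to play, or destination for a recording


def select_actions(html_text: str,
                   op: AudioOperation,
                   caps: DeviceCapabilities) -> List[Tuple[str, str]]:
    """Return the actions a media resource might take for this device."""
    actions: List[Tuple[str, str]] = []
    # HTML information is presented only if the device has a display;
    # otherwise it is discarded.
    if caps.has_display:
        actions.append(("present_html", html_text))
    # The audio operation is executed only if the device has the
    # matching audio capability; otherwise it is ignored.
    if op.kind == "play" and caps.has_speaker:
        actions.append(("play_audio", op.target))
    elif op.kind == "record" and caps.has_microphone:
        actions.append(("record_audio", op.target))
    return actions


analog_phone = DeviceCapabilities(has_display=False, has_speaker=True,
                                  has_microphone=True)
text_terminal = DeviceCapabilities(has_display=True, has_speaker=False,
                                   has_microphone=False)
prompt = AudioOperation(kind="play", target="welcome.wav")

print(select_actions("Welcome", prompt, analog_phone))
print(select_actions("Welcome", prompt, text_terminal))
```

In this sketch the analog telephone receives only the audio action and the display-only terminal receives only the HTML action, mirroring the two cases the abstract describes.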
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method in a proxy browser configured for executing a web browser according to hypertext transport (HTTP) protocol, the method comprising: receiving a hypertext markup language (HTML) page by the web browser, from an HTTP connection, having an HTML tag and at least one extensible markup language (XML) element specifying a directive for controlling an audio operation to be performed by an executable media resource; determining capabilities of a user device configured for receiving prescribed media information from the proxy browser; and selectively executing by the executable media resource at least one of the HTML tag and the audio operation for delivery of the prescribed media information to the user device, based on the determined capabilities of the user device.
2. The method of claim 1, wherein the selectively executing step includes playing an audio file for the user device based on the determined capabilities specifying that the user device includes an audio speaker.
3. The method of claim 2, wherein the selectively executing step includes supplying audio data generated during the playing of the audio file to a selected network interface card configured for communication with the user device according to a prescribed network protocol.
4. The method of claim 3, wherein the supplying step includes supplying the audio signals to one of a data network interface card configured for communication with the user device according to Internet Protocol and a telephony card configured for communication with the user device according to a public switched telephone network protocol.
5. The method of claim 3, wherein the playing step further includes interrupting the supplied audio data in response to a detected user key input, the method further including: posting the detected user key input via the HTTP connection to a Uniform Resource Locator (URL) specified in the HTML page; receiving a second HTML page from the HTTP connection having a second XML element specifying a directive for controlling a second audio operation to be performed by the executable media resource; and executing the second audio operation for delivery to the user device.
6. The method of claim 2, wherein the selectively executing step includes supplying data from the HTML tag to the user device based on the determined capabilities specifying that the user device includes a display.
7. The method of claim 6, wherein the selectively executing step includes initiating a recording operation for audio signals received from the user device based on the determined capabilities specifying that the user device includes a microphone.
8. The method of claim 7, wherein the step of initiating a recording operation further includes recording a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element.
9. The method of claim 8, further comprising posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
10. The method of claim 1, wherein the step of selectively executing the audio operation includes recording, by the executable media resource, a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element, based on the determined capabilities specifying that the user device includes a microphone.
11. The method of claim 10, further comprising posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
12. The method of claim 1, wherein the selectively executing step includes executing the audio operation and ignoring the HTML tag based on the determined capabilities specifying that the user device includes an audio speaker and a determined absence of a display in the user device.
13. A method in a proxy browser configured for executing a web browser according to a hypertext transport protocol (HTTP), the method comprising: detecting an incoming call from a user device configured for sending and receiving prescribed media information according to a prescribed device network protocol; sending an HTML request by the web browser to a destination server via an HTTP connection in response to the incoming call; receiving an HTML page by the web browser, from the HTTP connection, having an HTML tag and at least one extensible markup language (XML) element specifying a directive for controlling an audio operation to be performed by an executable media resource; and selectively executing by the executable media resource at least one of the HTML tag and the audio operation for delivery of the prescribed media information to the user device, based on the determined capabilities of the user device.
14. The method of claim 13, wherein the detecting step includes: detecting an off hook condition of the user device by a network interface card configured for interacting with the user device according to the prescribed network protocol; and sending a signal by the network interface card to the executable media resource indicating the user device is requesting service; and sending a message to the web browser by the executable media resource to send the HTML request.
15. The method of claim 14, wherein the step of sending a message to the web browser includes sending digits received from the user device.
16. The method of claim 15, wherein the step of sending an HTML request includes sending the digits received from the user device as part of a Uniform Resource Locator (URL) specifying a location of the destination server via the HTTP connection.
17. The method of claim 13, wherein the selecting step includes playing an audio file for the user device based on the determined capabilities specifying that the user device includes an audio speaker.
18. The method of claim 17, wherein the selectively executing step includes supplying audio data generated during the playing of the audio file to a selected network interface card configured for communication with the user device according to the prescribed device network protocol.
19. The method of claim 18, wherein the supplying step includes supplying the audio signals to one of a data network interface card configured for communication with the user device according to Internet Protocol and a telephony card configured for communication with the user device according to a public switched telephone network protocol.
20. The method of claim 18, wherein the playing step further includes interrupting the supplied audio data in response to a detected user key input, the method further including: posting the detected user key input via the HTTP connection to a Uniform Resource Locator (URL) specified in the HTML page; receiving a second HTML page from the HTTP connection having a second XML element specifying a directive for controlling a second audio operation to be performed by the executable media resource; and executing the second audio operation for delivery to the user device.
21. The method of claim 17, wherein the selectively executing step includes supplying data from the HTML tag to the user device based on the determined capabilities specifying that the user device includes a display.
22. The method of claim 13, wherein the selectively executing step includes initiating a recording operation for audio signals received from the user device based on the determined capabilities specifying that the user device includes a microphone.
23. The method of claim 22, wherein the step of initiating a recording operation further includes recording a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element.
24. The method of claim 23, further comprising posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
25. The method of claim 13, wherein the step of selectively executing the audio operation includes recording, by the executable media resource, a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element, based on the determined capabilities specifying that the user device includes a microphone.
26. The method of claim 25, further comprising posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
27. The method of claim 13, wherein the selectively executing step includes executing the audio operation and ignoring the HTML tag based on the determined capabilities specifying that the user device includes an audio speaker and a determined absence of a display in the user device.
28. A computer readable medium having stored thereon sequences of instructions for executing a web browser by a proxy browser according to hypertext transport (HTTP) protocol, the sequences of instructions including instructions for performing the steps of: receiving a hypertext markup language (HTML) page by the web browser, from an HTTP connection, having an HTML tag and at least one extensible markup language (XML) element specifying a directive for controlling an audio operation to be performed by an executable media resource; determining capabilities of a user device configured for receiving prescribed media information from the proxy browser; and selectively executing by the executable media resource at least one of the HTML tag and the audio operation for delivery of the prescribed media information to the user device, based on the determined capabilities of the user device.
29. The medium of claim 28, wherein the selectively executing step includes playing an audio file for the user device based on the determined capabilities specifying that the user device includes an audio speaker.
30. The medium of claim 29, wherein the selectively executing step includes supplying audio data generated during the playing of the audio file to a selected network interface card configured for communication with the user device according to a prescribed network protocol.
31. The medium of claim 30, wherein the supplying step includes supplying the audio signals to one of a data network interface card configured for communication with the user device according to Internet Protocol and a telephony card configured for communication with the user device according to a public switched telephone network protocol.
32. The medium of claim 30, wherein the playing step further includes interrupting the supplied audio data in response to a detected user key input, the medium further including instructions for performing the steps of: posting the detected user key input via the HTTP connection to a Uniform Resource Locator (URL) specified in the HTML page; receiving a second HTML page from the HTTP connection having a second XML element specifying a directive for controlling a second audio operation to be performed by the executable media resource; and executing the second audio operation for delivery to the user device.
33. The medium of claim 29, wherein the selectively executing step includes supplying data from the HTML tag to the user device based on the determined capabilities specifying that the user device includes a display.
34. The medium of claim 28, wherein the selectively executing step includes initiating a recording operation for audio signals received from the user device based on the determined capabilities specifying that the user device includes a microphone.
35. The medium of claim 34, wherein the step of initiating a recording operation further includes recording a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element.
36. The medium of claim 35, further comprising instructions for performing the step of posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
37. The medium of claim 28, wherein the step of selectively executing the audio operation includes recording, by the executable media resource, a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element, based on the determined capabilities specifying that the user device includes a microphone.
38. The medium of claim 37, further comprising instructions for performing the step of posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
39. The medium of claim 28, wherein the selectively executing step includes executing the audio operation and ignoring the HTML tag based on the determined capabilities specifying that the user device includes an audio speaker and a determined absence of a display in the user device.
40. A computer readable medium having stored thereon sequences of instructions for executing in a proxy browser a web browser according to a hypertext transport protocol (HTTP), the sequences of instructions including instructions for performing the steps of: detecting an incoming call from a user device configured for sending and receiving prescribed media information according to a prescribed device network protocol; sending an HTML request by the web browser to a destination server via an HTTP connection in response to the incoming call; receiving an HTML page by the web browser, from the HTTP connection, having an HTML tag and at least one extensible markup language (XML) element specifying a directive for controlling an audio operation to be performed by an executable media resource; and selectively executing by the executable media resource at least one of the HTML tag and the audio operation for delivery of the prescribed media information to the user device, based on the determined capabilities of the user device.
41. The medium of claim 40, wherein the detecting step includes: detecting an off hook condition of the user device by a network interface card configured for interacting with the user device according to the prescribed network protocol; and sending a signal by the network interface card to the executable media resource indicating the user device is requesting service; and sending a message to the web browser by the executable media resource to send the HTML request.
42. The medium of claim 41, wherein the step of sending a message to the web browser includes sending digits received from the user device.
43. The medium of claim 42, wherein the step of sending an HTML request includes sending the digits received from the user device as part of a Uniform Resource Locator (URL) specifying a location of the destination server via the HTTP connection.
44. The medium of claim 40, wherein the selecting step includes playing an audio file for the user device based on the determined capabilities specifying that the user device includes an audio speaker.
45. The medium of claim 44, wherein the selectively executing step includes supplying audio data generated during the playing of the audio file to a selected network interface card configured for communication with the user device according to the prescribed device network protocol.
46. The medium of claim 45, wherein the supplying step includes supplying the audio signals to one of a data network interface card configured for communication with the user device according to Internet Protocol and a telephony card configured for communication with the user device according to a public switched telephone network protocol.
47. The medium of claim 45, wherein the playing step further includes interrupting the supplied audio data in response to a detected user key input, the medium further including instructions for performing the steps of: posting the detected user key input via the HTTP connection to a Uniform Resource Locator (URL) specified in the HTML page; receiving a second HTML page from the HTTP connection having a second XML element specifying a directive for controlling a second audio operation to be performed by the executable media resource; and executing the second audio operation for delivery to the user device.
48. The medium of claim 44, wherein the selectively executing step includes supplying data from the HTML tag to the user device based on the determined capabilities specifying that the user device includes a display.
49. The medium of claim 40, wherein the selectively executing step includes initiating a recording operation for audio signals received from the user device based on the determined capabilities specifying that the user device includes a microphone.
50. The medium of claim 49, wherein the step of initiating a recording operation further includes recording a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element.
51. The medium of claim 50, further comprising posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
52. The medium of claim 40, wherein the step of selectively executing the audio operation includes recording, by the executable media resource, a voice input and sending the recorded voice input according to HTTP protocol to a destination specified in the XML element, based on the determined capabilities specifying that the user device includes a microphone.
53. The medium of claim 52, further comprising posting a user input, that describes the recorded voice input, by the web browser to a second destination specified by the HTML page in a prescribed sequence relative to the sending of the recorded voice input.
54. The medium of claim 40, wherein the selectively executing step includes executing the audio operation and ignoring the HTML tag based on the determined capabilities specifying that the user device includes an audio speaker and a determined absence of a display in the user device.
55. A processor-based device configured for executing audio operations based on a hypertext markup language (HTML) page received from a server according to hypertext transport protocol (HTTP), the device comprising: a web browser configured for selectively interpreting the HTML page, the HTML page including at least one of an HTML tag and an XML element that specifies a directive for controlling an audio operation to be performed for a user device; and a media resource configured for selectively executing at least one of the audio operation and the HTML tag based on determined capabilities of the user device.
56. The device of claim 55, further comprising an XML parser for extracting the XML element and supplying the extracted XML element to the media resource.
57. The device of claim 55, wherein the media resource includes a device capabilities table specifying the capabilities of at least the user device.
58. The device of claim 57, further comprising a device interface configured for communicating with the user device according to a selected one of a plurality of network access protocols based on the corresponding capabilities of the user device.
59. The device of claim 58, wherein the device interface includes: a voice over Internet Protocol (IP) network interface card for communication with the user device according to Internet Protocol; and a telephony card configured for communication with the user device according to a public switched telephone network protocol, the media resource selecting one of the voice over IP network interface card and the telephony card for communication with the user device based on the corresponding capabilities of the user device.
60. The method of claim 1, wherein the HTML page includes a plurality of XML elements that specify directives for controlling respective audio operations, the selectively executing including executing the audio operations based on fetching a plurality of audio files and playing the plurality of audio files in a prescribed sequence based on the respective XML elements.
61. The method of claim 13, wherein the HTML page includes a plurality of XML elements that specify directives for controlling respective audio operations, the selectively executing including executing the audio operations based on fetching a plurality of audio files and playing the plurality of audio files in a prescribed sequence based on the respective XML elements.
62. The medium of claim 28, wherein the HTML page includes a plurality of XML elements that specify directives for controlling respective audio operations, the selectively executing including executing the audio operations based on fetching a plurality of audio files and playing the plurality of audio files in a prescribed sequence based on the respective XML elements.
63. The medium of claim 40, wherein the HTML page includes a plurality of XML elements that specify directives for controlling respective audio operations, the selectively executing including executing the audio operations based on fetching a plurality of audio files and playing the plurality of audio files in a prescribed sequence based on the respective XML elements.
64. The device of claim 55, wherein the HTML page includes a plurality of XML elements that specify directives for controlling respective audio operations, the media resource configured for executing the audio operations based on fetching a plurality of audio files and playing the plurality of audio files in a prescribed sequence based on the respective XML elements.
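The multi-file playback recited in claims 60 through 64 (fetching several audio files named by XML elements and playing them in a prescribed sequence) can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the `<voice>` and `<audio>` element names, the `src` attribute, the use of document order as the prescribed sequence, and the helper names `audio_sources` and `play_in_sequence`. The claims do not define any of these.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# Hypothetical XML fragment extracted from an HTML page; the element and
# attribute names are assumptions, not taken from the patent.
PAGE_FRAGMENT = """
<voice>
  <audio src="greeting.wav"/>
  <audio src="menu.wav"/>
</voice>
"""


def audio_sources(xml_text: str) -> List[str]:
    """List the audio files named by the XML elements, in document order."""
    root = ET.fromstring(xml_text)
    return [element.attrib["src"] for element in root.iter("audio")]


def play_in_sequence(xml_text: str,
                     fetch: Callable[[str], bytes],
                     play: Callable[[bytes], None]) -> None:
    """Fetch every named audio file, then play the clips in order."""
    clips = [fetch(src) for src in audio_sources(xml_text)]
    for clip in clips:
        play(clip)


# Stand-ins for a real fetch (e.g. over HTTP) and a real audio output path.
FAKE_STORE = {"greeting.wav": b"hello", "menu.wav": b"menu"}
played: List[bytes] = []

play_in_sequence(PAGE_FRAGMENT, FAKE_STORE.__getitem__, played.append)
print(played)
```

Passing `fetch` and `play` as callables keeps the sketch independent of any particular network interface card or telephony card, which the claims leave to the device interface.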
December 14, 1999
May 18, 2004