An audio-based operation system includes an audio input-output device that receives an audio-based operation performed by a user, a server that receives an instruction corresponding to the audio-based operation received by the audio input-output device, and an image forming apparatus that executes a job transmitted from the server. The server includes circuitry configured to receive audio-based operation information indicating the audio-based operation received by the audio input-output device, convert the received audio-based operation information into a job interpretable by the image forming apparatus, and instruct the image forming apparatus to execute the job converted from the audio-based operation information.
Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. An audio-based operation system comprising: an audio input-output device that receives an audio-based operation performed by a user; a server that receives an instruction corresponding to the audio-based operation received by the audio input-output device; an image forming apparatus that executes a job transmitted from the server; wherein the server includes circuitry configured to: receive audio-based operation information indicating the audio-based operation received by the audio input-output device; convert the received audio-based operation information into a job interpretable by the image forming apparatus; instruct the image forming apparatus to execute the job converted from the audio-based operation information; and display, on a display provided for the image forming apparatus, a screen described in a language used for the audio-based operation, wherein a memory stores information associating device identification information identifying the audio input-output device with languages used for the audio-based operation, wherein the circuitry displays, on the display provided for the image forming apparatus, the screen described in a given language specified by the device identification information identifying the audio input-output device that receives the audio-based operation based on the information associating the device identification information with the languages.
This invention relates to an audio-based operation system for controlling an image forming apparatus, such as a printer or copier, using voice commands. The system addresses the challenge of enabling users to interact with imaging devices without physical input, particularly for users with mobility or accessibility limitations. The system includes an audio input-output device that captures voice commands from the user, a server that processes these commands, and an image forming apparatus that executes the corresponding tasks. The server converts each audio-based operation into a job that the image forming apparatus can interpret and instructs the apparatus to execute it. Additionally, the system displays a screen on the image forming apparatus in the language associated with the specific audio input-output device. A memory stores information associating device identification information with the languages used for the audio-based operation, so the display language is chosen according to which audio input-output device received the command. This ensures that the screen appears in the language associated with the device the user spoke to, improving usability and reducing errors. The invention streamlines voice-controlled imaging workflows while accommodating multilingual environments.
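The association between device identification information and language can be pictured as a simple lookup table. The following is a minimal sketch only: the device IDs, the language codes, and the fallback language are illustrative assumptions and are not taken from the patent.

```python
# Hypothetical association table: device IDs and language codes are
# made up for illustration and do not come from the patent text.
DEVICE_LANGUAGES = {
    "speaker-001": "ja",
    "speaker-002": "en",
}

# Assumed fallback; the claim does not say what happens for unknown devices.
DEFAULT_LANGUAGE = "en"


def screen_language_for_device(device_id: str) -> str:
    """Return the display language associated with an audio device ID."""
    return DEVICE_LANGUAGES.get(device_id, DEFAULT_LANGUAGE)
```

Under these assumptions, a command received by "speaker-001" would lead to a Japanese screen on the image forming apparatus, regardless of what was said.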
2. The audio-based operation system according to claim 1, wherein the memory stores information associating one or more phrases and one or more language types, wherein the circuitry displays, on the display provided for the image forming apparatus, the screen described in a specific language specified by any one of the phrases used for the audio-based operation.
This invention relates to an audio-based operation system for image forming apparatuses, such as printers or copiers, that enhances user interaction by allowing language selection through voice commands. The system addresses the challenge of users needing to manually navigate language settings, which can be cumbersome, especially for non-native speakers or users with limited mobility. The system includes a memory that stores associations between specific phrases and language types. When a user speaks a recognized phrase, the system identifies the corresponding language type and displays the apparatus's operation screen in that language. This enables seamless language switching without manual input, improving accessibility and user experience. The system leverages audio recognition to interpret spoken phrases and dynamically adjust the display language, ensuring intuitive and efficient operation. The circuitry processes the audio input, retrieves the associated language from memory, and updates the display accordingly, eliminating the need for physical interaction with language selection menus. This feature is particularly useful in multilingual environments where users may require quick language adjustments. The invention streamlines language customization, making the image forming apparatus more adaptable to diverse user needs.
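A phrase-to-language association of the kind described above might look like the sketch below. The phrases, the language codes, and the substring matching rule are all assumptions made for illustration; the claim only requires that stored phrases be associated with language types.

```python
from typing import Optional

# Hypothetical phrase table; the entries are illustrative only.
PHRASE_LANGUAGES = {
    "make a copy": "en",
    "コピーして": "ja",
    "faire une copie": "fr",
}


def language_for_utterance(utterance: str) -> Optional[str]:
    """Return the language type of the first stored phrase found in the utterance.

    Matching is a case-insensitive substring test, which is an assumption;
    a real system would rely on its speech recognition results instead.
    """
    normalized = utterance.casefold()
    for phrase, language in PHRASE_LANGUAGES.items():
        if phrase.casefold() in normalized:
            return language
    return None  # no stored phrase matched
```

Returning None when nothing matches leaves the caller free to fall back to another rule, such as the device-based language of claim 1.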
3. The audio-based operation system according to claim 2, wherein the memory further stores one or more particular phrases used for activating the audio input-output device and the one or more language types in association with each other, wherein the circuitry displays, on the display provided for the image forming apparatus, the screen described in a specific language specified by any one of the particular phrases used for activating the audio input-output device.
This invention relates to an audio-based operation system for an image forming apparatus, such as a printer or copier, that enhances user interaction through voice commands. The system addresses the challenge of providing intuitive, language-specific user interfaces in devices that may lack physical controls or require hands-free operation. The system includes an audio input-output device, such as a microphone and speaker, and circuitry that processes voice commands. The memory stores predefined phrases used to activate the audio input-output device, along with associated language types. When a user speaks a particular activation phrase, the system identifies the language of that phrase and displays an on-screen interface in the corresponding language. This ensures that the user interface adapts dynamically to the user's preferred language based on their spoken input, improving accessibility and ease of use. The system may also include a display for the image forming apparatus, which presents the language-specific screen after detecting the activation phrase. This eliminates the need for manual language selection, streamlining the user experience. The circuitry processes the audio input, matches it to stored phrases, and triggers the appropriate language display, ensuring seamless integration between voice commands and visual feedback. The invention thus provides a more efficient and user-friendly way to interact with image forming devices through natural language processing.
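One way to picture the activation-phrase lookup is to split an utterance into its activation phrase and the remaining command, then read the language from the activation phrase. This is a sketch under stated assumptions: the activation phrases, the language codes, and the prefix-matching rule are hypothetical and do not appear in the patent.

```python
from typing import Optional, Tuple

# Hypothetical activation phrases mapped to language types.
WAKE_PHRASES = {
    "hey printer": "en",
    "ねえプリンター": "ja",
}


def split_wake_phrase(utterance: str) -> Tuple[Optional[str], str]:
    """Return (language, remaining command) for an utterance.

    The utterance is case-folded first, so the remaining command is
    returned in case-folded form. If no activation phrase is found,
    the language is None.
    """
    lowered = utterance.strip().casefold()
    for phrase, language in WAKE_PHRASES.items():
        folded = phrase.casefold()
        if lowered.startswith(folded):
            # Drop the activation phrase and any separator that follows it.
            rest = lowered[len(folded):].lstrip(" ,、")
            return language, rest
    return None, lowered
```

The language found this way would then select the screen shown on the image forming apparatus, while the remaining command continues through job conversion.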
4. The audio-based operation system according to claim 2, wherein the server includes the memory.
This claim narrows the system of claim 2 by specifying where the memory resides: the server itself includes the memory that stores the associations between device identification information and languages, and between phrases and language types. Because the server already receives the audio-based operation information and converts it into a job, keeping these associations on the server lets the same circuitry look up the language for a given device or spoken phrase and direct the image forming apparatus to display the screen in that language. The claim adds no further hardware or recognition features; it only places the memory in the server.
5. A method of processing information using an audio-based operation, comprising: receiving audio-based operation information indicating an audio-based operation received by an audio input-output device; converting the received audio-based operation information into a job interpretable by an image forming apparatus; instructing the image forming apparatus to execute the job converted from the audio-based operation information; displaying, on a display provided for the image forming apparatus, a screen described in a language used for the audio-based operation; storing, in a memory, information associating device identification information identifying the audio input-output device with languages used for the audio-based operation; displaying, on the display provided for the image forming apparatus, the screen described in a given language specified by the device identification information identifying the audio input-output device that receives the audio-based operation based on the information associating the device identification information with the languages.
This invention relates to audio-based operations for controlling an image forming apparatus, such as a printer or copier. The problem addressed is the need for seamless integration between audio input-output devices (e.g., smart speakers or voice assistants) and image forming apparatuses, ensuring that user interactions via voice commands are accurately translated into executable jobs while maintaining language consistency. The method involves receiving audio-based operation information from an audio input-output device, such as a voice command to print a document. This audio input is converted into a job format interpretable by the image forming apparatus, which then executes the task. The system displays a user interface on the image forming apparatus's display, presented in the same language used in the audio-based operation. To ensure language consistency, the system stores associations between device identification information (e.g., a smart speaker's ID) and the languages supported by that device. When a job is initiated, the display screen is rendered in the language specified by the device's identification information, ensuring a cohesive user experience. This approach eliminates language mismatches and improves usability for multilingual environments.
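The four steps of the method (receive, convert, instruct, display) can be sketched end to end as follows. Everything named here is hypothetical: the Job fields, the FakeImageFormingApparatus stand-in, the operation dictionary format, and the fallback language are assumptions chosen to keep the example self-contained, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Job:
    """Hypothetical job format understood by the apparatus stand-in."""
    kind: str
    copies: int = 1


@dataclass
class FakeImageFormingApparatus:
    """In-memory stand-in that records executed jobs and the screen language."""
    executed: List[Job] = field(default_factory=list)
    screen_language: str = ""

    def execute(self, job: Job) -> None:
        self.executed.append(job)

    def show_screen(self, language: str) -> None:
        self.screen_language = language


# Hypothetical device-to-language associations stored in memory.
DEVICE_LANGUAGES: Dict[str, str] = {"speaker-001": "ja"}


def convert_to_job(operation: dict) -> Job:
    """Convert already-recognized operation information into a Job."""
    return Job(kind=str(operation["action"]), copies=int(operation.get("copies", 1)))


def handle_operation(device_id: str, operation: dict,
                     apparatus: FakeImageFormingApparatus) -> Job:
    job = convert_to_job(operation)      # convert into an interpretable job
    apparatus.execute(job)               # instruct the apparatus to execute it
    # Display the screen in the language tied to the device ID ("en" is an assumed fallback).
    apparatus.show_screen(DEVICE_LANGUAGES.get(device_id, "en"))
    return job
```

For instance, calling handle_operation("speaker-001", {"action": "copy", "copies": 2}, FakeImageFormingApparatus()) records one copy job and sets the screen language to "ja" under these assumptions.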
6. The method according to claim 5, further comprising: storing, in the memory, information associating one or more phrases and one or more language types; and displaying, on the display provided for the image forming apparatus, the screen described in a specific language specified by any one of the phrases used for the audio-based operation.
This claim extends the method of claim 5, addressing the challenge of providing user interfaces in multiple languages based on spoken commands. The method stores, in memory, information that associates one or more phrases with one or more language types. When a user speaks one of those phrases as part of an audio-based operation, the language type associated with that phrase is retrieved and the screen on the image forming apparatus's display is shown in that language. For example, if the phrase the user speaks is stored in association with French, the screen is rendered in French. This dynamic language selection lets users interact with the device in their preferred language without manual configuration. The claim does not specify how the audio is captured or how the phrase-language associations are updated. The invention enhances accessibility and usability by adapting the interface language based on spoken commands, eliminating the need for manual language selection.
7. The method according to claim 6, further comprising: storing, in the memory, one or more particular phrases used for activating the audio input-output device and the one or more language types in association with each other; and displaying, on the display provided for the image forming apparatus, the screen described in a specific language specified by any one of the particular phrases used for activating the audio input-output device.
This invention relates to an image forming apparatus with an audio input-output device and a display, addressing the challenge of providing language-specific user interfaces in a voice-activated system. The apparatus includes a processor, memory, and an audio input-output device for receiving voice commands. The system stores predefined phrases used to activate the audio input-output device, along with associated language types, in memory. When a user speaks one of these activation phrases, the system identifies the language type linked to that phrase and displays a user interface screen in the specified language on the apparatus's display. This ensures that the displayed content matches the user's preferred language, enhancing usability for multilingual environments. The method involves detecting voice input, processing it to determine the activation phrase, retrieving the corresponding language type, and rendering the interface in that language. The system may also include a network interface for communicating with external devices, allowing for remote updates or additional language support. The invention improves accessibility by dynamically adapting the display language based on voice commands, reducing the need for manual language selection.
8. A non-transitory computer readable storage medium storing one or more instructions that, when performed by one or more processors, cause the one or more processors to execute a method of processing information using an audio-based operation, the method comprising: receiving audio-based operation information indicating an audio-based operation received by an audio input-output device; converting the received audio-based operation information into a job interpretable by an image forming apparatus; instructing the image forming apparatus to execute the job converted from the audio-based operation information; displaying, on a display provided for the image forming apparatus, a screen described in a language used for the audio-based operation; storing, in a memory, information associating device identification information identifying the audio input-output device with languages used for the audio-based operation; displaying, on the display provided for the image forming apparatus, the screen described in a given language specified by the device identification information identifying the audio input-output device that receives the audio-based operation based on the information associating the device identification information with the languages.
This invention relates to audio-based control of image forming devices, such as printers or copiers, to improve accessibility and user interaction. The problem addressed is the lack of seamless integration between audio input-output devices (e.g., smart speakers or voice assistants) and image forming apparatuses, particularly in multilingual environments where users may need to interact with the device in their preferred language. The system includes a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, enable an audio-based operation to control an image forming apparatus. The method involves receiving audio-based operation information (e.g., voice commands) from an audio input-output device, converting this information into a job interpretable by the image forming apparatus, and instructing the apparatus to execute the job. The system also displays a user interface on the image forming device’s display, presented in the language used for the audio-based operation. Additionally, the system stores associations between device identification information (e.g., a unique identifier for the audio input-output device) and the languages used for audio operations. When an audio-based operation is received, the system retrieves the associated language and ensures the displayed screen matches that language, enhancing user experience by maintaining consistency between voice input and visual output. This approach simplifies interaction for users who rely on audio commands while ensuring proper language alignment.
9. The non-transitory computer readable storage medium according to claim 8, further comprising: storing, in the memory, information associating one or more phrases and one or more language types; and displaying, on the display provided for the image forming apparatus, the screen described in a specific language specified by any one of the phrases used for the audio-based operation.
This invention relates to a non-transitory computer-readable storage medium for an image forming apparatus, addressing the challenge of providing user-friendly language selection in audio-based operations. The system stores information that links specific phrases with corresponding language types, enabling the apparatus to display screens in a language determined by the phrases used during audio-based interactions. The apparatus includes a memory for storing the phrase-language associations and a display for presenting the user interface in the selected language. The system processes audio input to identify phrases, retrieves the associated language type from memory, and dynamically adjusts the display language accordingly. This ensures that users can interact with the apparatus in their preferred language through voice commands, improving accessibility and usability. The invention enhances existing image forming apparatuses by integrating language selection based on audio input, eliminating the need for manual language settings and streamlining the user experience. The solution is particularly useful in multilingual environments where users may prefer different languages for operation.
10. The non-transitory computer readable storage medium according to claim 9, further comprising: storing, in the memory, one or more particular phrases used for activating the audio input-output device and the one or more language types in association with each other; and displaying, on the display provided for the image forming apparatus, the screen described in a specific language specified by any one of the particular phrases used for activating the audio input-output device.
This invention relates to a system for an image forming apparatus that enhances user interaction through voice commands. The system addresses the challenge of providing intuitive, language-specific user interfaces in a multi-language environment. The apparatus includes a non-transitory computer-readable storage medium storing instructions for executing operations. These operations involve storing particular phrases used to activate an audio input-output device in association with specific language types. When a user speaks one of these phrases, the system identifies the language type linked to that phrase and displays a user interface screen in the corresponding language. This ensures that the displayed content matches the user's preferred language, improving accessibility and usability. The system may also include a memory for storing language data and a display for presenting the interface. The audio input-output device captures voice commands, and the system processes these commands to determine the appropriate language for the display. This approach eliminates the need for manual language selection, streamlining user interaction with the image forming apparatus. The invention is particularly useful in environments where multiple languages are spoken, ensuring seamless communication between users and the device.
November 7, 2019
April 12, 2022