US-20250315494-A1

Application Remote Operation System, Application Execution Control Device, and Application Execution Control Method

Published: October 9, 2025
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

The present invention includes: a device image providing unit configured to provide a user device with a device image generated by transparentizing and overlaying a UI component, which is an HTML element related to a UI, on a background image showing the appearance of an operation target device including the UI; and an application operation control unit configured to operate a native application of the operation target device when acquiring event information of a user operation performed on the UI component from the user device, and can thereby execute an application testing of the native application installed in the operation target device by performing a user operation on the UI component overlaid on the background image through a testing tool for a website.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An application remote operation system comprising:

2. The application remote operation system according to, wherein the application operation control unit acquires event information of the user operation detected by a web browser of the user device from the user device, generates virtual operation instruction information for the operation target device on a basis of the acquired event information, and supplies the generated operation instruction information to the operation target device to operate the native application.

3. The application remote operation system according to, wherein

4. The application remote operation system according to, wherein the device image providing unit includes:

5. The application remote operation system according to, wherein

6. The application remote operation system according to, wherein

7. The application remote operation system according to, wherein the UI component generation unit optimizes the intermediate hierarchy information by combining, with the intermediate hierarchy information, text information generated by optical character recognition of the background image generated by the background image generation unit, and converts the optimized intermediate hierarchy information into DOM hierarchy information of the HTML element.

8. The application remote operation system according to, wherein the UI component generation unit optimizes the intermediate hierarchy information by combining, with the intermediate hierarchy information, text information generated by optical character recognition of the background image generated by the background image generation unit, and converts the optimized intermediate hierarchy information into DOM hierarchy information of the HTML element.

9. An application execution control device comprising:

10. An application execution control method comprising:

Detailed Description

Complete technical specification and implementation details from the patent document.

This application claims priority to Application No. 2024-062070 filed in Japan on Apr. 8, 2024, under 35 U.S.C. § 119. The entire contents of each application are hereby incorporated by reference.

The present invention relates to an application remote operation system, an application execution control device, and an application execution control method, and is particularly suitable for use in a cloud-based platform that enables remote execution of real device testing of a native application installed in an operation target device.

A cloud service has conventionally been provided that enables remote execution of real device testing of a device such as a smartphone or a tablet (see, for example, “Remote TestKit” (retrieved on 2024.3.27 from the Internet <URL:https://appkitbox.com/>)). The cloud service allows a personal computer or the like used by a user to remotely operate an application installed in a device on the cloud service, enabling visual inspection and application testing in the same way as on a real machine.

Currently, various tools for testing an application program are provided. Roughly speaking, the tools include a testing tool for a native application (hereinafter, referred to as application) installed in a personal computer (PC) or a smartphone, and a testing tool for a website. The testing tool for an application executes a test by creating a test code using hierarchical structure information of an application UI on the basis of a user's operation on a button or the like displayed as an input element. On the other hand, the testing tool for a website executes a test by creating a test code using hierarchical structure information of a document object model (DOM) of an HTML element of a webpage.

Here, in the cloud service described above, the appearance of an entire smartphone including a display, a physical button, and the like is displayed as an image in an HTML canvas element, which does not have any hierarchical structure information. Therefore, no testing tool currently on the market can directly execute a test on an image provided by the cloud service.

Since the testing tool for a website and the testing tool for an application use different types of hierarchical structure information, a supported target is typically limited. That is, the testing tools are classified into three types, that is, a testing tool supporting only a website, a testing tool supporting only an application, and a testing tool supporting both the website and the application. A supported OS is also typically limited. A user needs not only to select and use an appropriate tool according to the content of a test, but also to learn a method of operating a plurality of tools. Moreover, since a new model or a new OS version of a smartphone is released every day, there is the problem that a test cannot be executed or a lot of time is required to construct an environment to make a test executable in a case where a user does not have a corresponding testing tool, for example.

A technique for transparently displaying a web image on a native image to enable an operation on a web image portion has been known (see, for example, JP 2015-95256 A and JP 2018-41240 A). In an information processing apparatus described in JP 2015-95256 A, at least part of a native image and at least part of a web image are changed to transparent or translucent and overlaid one above the other, so that element images other than the background, such as icons, UI elements, text information, graphics, and thumbnail images, i.e., components of each image, are displayed at the same time without being hidden, and when a user operation of updating at least part of the web image is performed on the web image, the native image is notified of the operation.

In a display device described in JP 2018-41240 A, a web application is displayed so as to make an area for displaying a native application transparent in a display area of the web application, and the display of the native application is overlaid on the transparent area. In a case where user input is received in the display area of the native application, the display area of the web application is not changed, and the display area of the web application is displayed according to the web application after execution of the native application.

The present invention has been made to solve the above problems, and an object thereof is to enable testing of a native application using a testing tool for a website.

In order to solve the above problems, in the present invention, a user device is provided with a device image generated by transparentizing and overlaying a UI component, which is an HTML element related to a UI, on a background image showing the appearance of an operation target device including the UI. When event information of a user operation performed on the UI component is acquired from the user device, a native application installed in the operation target device is operated on the basis of the acquired event information, and the user device is provided with a device image generated by transparentizing and overlaying the UI component on a background image in which the execution result is reflected in the appearance.

According to the present invention configured as described above, an application testing of the native application installed in the operation target device can be executed by the user operation performed on the UI component overlaid on the background image. This makes it possible to test the native application using a testing tool for a website.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. One drawing illustrates an overall configuration example of an application remote operation system according to the present embodiment. As illustrated there, the application remote operation system of the present embodiment includes a server device (corresponding to an application execution control device), one or more operation target devices (sometimes collectively referred to as the operation target device), and a user device.

The server device and the operation target device are connected by, for example, a wired connection such as USB or a wireless interface such as Bluetooth (registered trademark). The server device and the user device are connected by a communication network such as the Internet and/or a mobile phone network.

The operation target device is a device incorporating an OS, for example, a smartphone, a tablet, a wearable device, or a car navigation system. In the present embodiment, a plurality of operation target devices are connected to the server device. The operation target devices are different models from each other and run various OSs, including, but not limited to, Android (registered trademark), iOS (registered trademark), BlackBerry OS (BlackBerry is a registered trademark), and Windows Phone (registered trademark).

The user device is a device used by a user who executes application testing of the operation target device, and is, for example, a personal computer, a smartphone, or a tablet. A web browser is installed in the user device. The user can access the server device through the web browser and perform application testing by selecting one of the plurality of operation target devices through a test execution screen provided by the server device. At this time, the user can install any native application in the operation target device and execute application testing of the native application.

For example, the native application is actually operated in the operation target device. The server device provides the web browser with the same image as the screen displayed on the display of the operation target device at this time and causes the web browser to display the image. The user device remotely operates a UI provided on the screen by the native application of the operation target device through the image displayed on the web browser. The server device also provides the web browser with an image showing an execution result of the native application operating according to the remote operation and causes the web browser to display the image.

This makes it possible for the user device to remotely verify whether the native application operates normally in the operation target device. In addition, by operating the UI through a testing tool for a website, it is possible to execute the application testing of the native application according to a test code generated by recording the manual operation.

Another drawing is a block diagram illustrating a functional configuration example of the server device (application execution control device) according to the present embodiment. As illustrated there, the server device of the present embodiment includes an application operation control unit and a device image providing unit as a functional configuration. These functional blocks execute the processing described below through cooperation between hardware and software. For example, the processing of these functional blocks is executed by a program stored in a storage medium such as a RAM, a ROM, a hard disk, or a semiconductor memory, operating under the control of a microcomputer including a CPU, a RAM, a ROM, and the like.

The application operation control unit controls an operation of the native application on the basis of event information regarding an operation of the native application transmitted from the user device. For example, when acquiring event information of an operation of starting the native application from the user device, the application operation control unit executes a process of starting the native application on the basis of the acquired event information.

In addition, when acquiring, from the user device, event information of a user operation performed on a UI component (described in detail later) included in a device image provided to the web browser from the device image providing unit, the application operation control unit causes the native application to execute an operation according to the UI operation on the basis of the acquired event information. That is, when acquiring, from the user device, event information of a user operation detected by the web browser of the user device, the application operation control unit generates virtual operation instruction information for the operation target device on the basis of the acquired event information, and supplies the generated operation instruction information to the operation target device to operate the native application.

The device image providing unit provides the web browser of the user device with a device image generated by transparentizing and overlaying the UI component, which is an HTML element related to the UI, on a background image showing the appearance of the operation target device including the UI. The UI included in the operation target device is a UI provided by the native application installed in the operation target device. The background image showing the appearance of the operation target device is, for example, an image captured by a camera from a position where the operation target device is viewed from the front, and may include a physical button of the operation target device. Alternatively, an image directly acquired from the inside through an interface included in the operation target device, or a screenshot image of the screen displayed on the display of the operation target device, may be used as the background image.

In a case where the application operation control unit executes the operation of the native application, the device image providing unit provides the user device with a device image generated by transparentizing and overlaying the UI component on a background image in which the execution result is reflected in the appearance. In a case where the image captured by the camera is used as the background image, when the operation of the native application has been executed by the application operation control unit, the device image providing unit is notified of this and controls the camera to capture the appearance of the operation target device. The device image providing unit then provides the user device with the device image generated by overlaying the transparentized UI component on the background image acquired by capturing the appearance. In a case where the image acquired from the inside or the screenshot image of the operation target device is used as the background image, when the operation of the native application has been executed by the application operation control unit, the device image providing unit is notified of this and controls the operation target device to acquire the image. The device image providing unit then provides the user device with the device image generated by overlaying the transparentized UI component on the acquired background image.

As illustrated in the block diagram, the device image providing unit includes, as a more specific functional configuration, a background image generation unit that generates the background image, a UI component generation unit that generates the UI component, and a UI component overlay unit that transparentizes and overlays the UI component on the background image.

As described above, the background image generation unit generates the background image by capturing the operation target device with, for example, a camera installed at a position where the operation target device is viewed from the front. The background image generation unit transmits the generated background image to the web browser of the user device and causes the web browser to display the background image. A drawing illustrates an example of the background image generated by the background image generation unit. Here, an example of a background image in a state where a calculator native application has been started and its operation screen is displayed is illustrated.

The UI component generation unit converts UI hierarchy information of the native application acquired from the operation target device into DOM hierarchy information of an HTML element, and generates the UI component that is the HTML element related to the UI on the basis of the DOM hierarchy information. The UI hierarchy information is a UI library obtained by modeling a plurality of UIs included in the native application as a tree structure, and is created in a unique data format according to the OS of the operation target device. For example, in the case of Android, the UI hierarchy information is created in XML format. A drawing illustrates an example of the UI hierarchy information. On the other hand, the DOM hierarchy information is a UI library obtained by modeling a plurality of UIs included in the native application using HTML elements. A separate drawing illustrates an example of the DOM hierarchy information.
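The conversion described above can be sketched in a few lines. This is a minimal illustration only, not the patent's implementation: it assumes an Android-style XML dump in which each `node` carries `class`, `text`, and `bounds` attributes, and the class-to-tag mapping `TAG_MAP`, the `data-*` attribute names, and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from native widget classes to HTML tags (assumption).
TAG_MAP = {
    "android.widget.Button": "button",
    "android.widget.EditText": "input",
}

def parse_bounds(bounds):
    """Parse '[x1,y1][x2,y2]' into (left, top, width, height)."""
    x1, y1, x2, y2 = (int(v) for v in bounds.replace("][", ",").strip("[]").split(","))
    return x1, y1, x2 - x1, y2 - y1

def node_to_html(node, depth=0):
    """Convert one UI hierarchy node (and its children) into an HTML string."""
    pad = "  " * depth
    tag = TAG_MAP.get(node.get("class", ""), "div")
    left, top, width, height = parse_bounds(node.get("bounds", "[0,0][0,0]"))
    attrs = (f'data-left="{left}" data-top="{top}" '
             f'data-width="{width}" data-height="{height}"')
    text = escape(node.get("text", ""))
    if tag == "input":
        # <input> is a void element, so the text goes into the value attribute.
        return f'{pad}<input {attrs} value="{text}">'
    children = [node_to_html(child, depth + 1) for child in node.findall("node")]
    inner = "\n" + "\n".join(children) + "\n" + pad if children else text
    return f"{pad}<{tag} {attrs}>{inner}</{tag}>"

def convert(ui_xml):
    """Convert a whole UI hierarchy document into DOM-style HTML markup."""
    root = ET.fromstring(ui_xml)
    return "\n".join(node_to_html(n) for n in root.findall("node"))

# Made-up sample resembling a calculator screen.
SAMPLE = """
<hierarchy>
  <node class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">
    <node class="android.widget.EditText" text="0" bounds="[0,100][1080,300]"/>
    <node class="android.widget.Button" text="7" bounds="[0,400][270,600]"/>
  </node>
</hierarchy>
"""

if __name__ == "__main__":
    print(convert(SAMPLE))
```

The sketch keeps the native screen coordinates as `data-*` attributes so that a later overlay step can position each element over the background image.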

The UI component overlay unit transparentizes the UI component generated from the HTML element of the DOM hierarchy information by the UI component generation unit, and overlays the UI component on the background image displayed on the web browser by the background image generation unit so that the web browser displays the UI component. A drawing illustrates an example of the device image generated by overlaying the transparentized UI component on the background image. Here, an example is illustrated in which a div element, an input element, and a button element of HTML are overlaid on the background image of a state where the calculator native application has been started and the operation screen is displayed. In that drawing, the UI component is indicated by a thick frame for the sake of description, but it is actually transparent and invisible to the user. That is, the appearance is the same as that of the background image alone.
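One plausible way to realize such an overlay in a browser is to position each HTML element absolutely over the background image and set its opacity to zero, so that it stays invisible but still receives pointer events. The following sketch generates that markup; the `UIComponent` type, the `build_device_image` function, and the `scale` parameter (ratio of displayed image size to device pixels) are assumptions made for illustration.

```python
from dataclasses import dataclass
from html import escape

@dataclass
class UIComponent:
    """A UI component in device pixel coordinates (hypothetical structure)."""
    tag: str      # "button", "input", or "div"
    label: str
    left: int
    top: int
    width: int
    height: int

def build_device_image(background_url, components, scale=1.0):
    """Return HTML that overlays transparent elements on a background image."""
    parts = [
        '<div style="position:relative;display:inline-block">',
        f'  <img src="{escape(background_url)}" alt="device">',
    ]
    for c in components:
        box = (f"left:{round(c.left * scale)}px;top:{round(c.top * scale)}px;"
               f"width:{round(c.width * scale)}px;height:{round(c.height * scale)}px")
        # opacity:0 hides the element visually while keeping it clickable.
        style = f"position:absolute;{box};opacity:0"
        label = escape(c.label)
        if c.tag == "input":
            parts.append(f'  <input style="{style}" value="{label}">')
        else:
            parts.append(f'  <{c.tag} style="{style}">{label}</{c.tag}>')
    parts.append("</div>")
    return "\n".join(parts)

if __name__ == "__main__":
    seven = UIComponent("button", "7", 0, 400, 270, 200)
    print(build_device_image("background.png", [seven], scale=0.5))
```

Because the overlaid elements are ordinary DOM nodes, a website testing tool can locate and click them even though the user only sees the background image.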

The user of the user device can operate the input element or the button element individually through, for example, a testing tool for a website. The testing tool for a website is a tool capable of recording a manual operation on the UI component, adjusting a test script on the basis of the record, executing a test, and generating a report of the execution result.

For example, when the user operates any button element, the web browser detects the operation and transmits event information of the user operation, generated using JSON-RPC or the like, to the server device. When acquiring the event information of JSON-RPC or the like from the user device, the application operation control unit of the server device converts the event information into an agent event suitable for the system of an agent included in the server device. Subsequently, the application operation control unit generates virtual operation instruction information for the operation target device on the basis of the agent event through cooperation between the agent in the server device and a background service in the operation target device, and supplies the generated operation instruction information to the operation target device to operate the native application.
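The event path described above (browser event, agent event, operation instruction) might look roughly like the following. The JSON-RPC method name `ui.event`, the parameter layout, and the shape of the agent event and tap instruction are all hypothetical; the source does not specify any of them.

```python
import json

def to_agent_event(rpc_message):
    """Convert a JSON-RPC 2.0 message from the browser into an agent event.

    The method name and params layout are assumptions for this sketch.
    """
    msg = json.loads(rpc_message)
    if msg.get("jsonrpc") != "2.0" or msg.get("method") != "ui.event":
        raise ValueError("unsupported message")
    params = msg["params"]
    return {"kind": params["type"], "target": params["bounds"]}

def to_operation_instruction(agent_event):
    """Turn an agent event into a virtual operation for the target device."""
    b = agent_event["target"]
    # Tap the centre of the component, in device pixel coordinates.
    cx = b["left"] + b["width"] // 2
    cy = b["top"] + b["height"] // 2
    if agent_event["kind"] == "click":
        return {"action": "tap", "x": cx, "y": cy}
    raise ValueError(f"unsupported event kind: {agent_event['kind']}")

if __name__ == "__main__":
    message = json.dumps({
        "jsonrpc": "2.0",
        "method": "ui.event",
        "params": {
            "type": "click",
            "bounds": {"left": 0, "top": 400, "width": 270, "height": 200},
        },
    })
    print(to_operation_instruction(to_agent_event(message)))
```

Keeping the two conversions separate mirrors the description: the first normalizes whatever the browser sends, and the second produces something the device-side background service can act on.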

After the application operation control unit executes the operation of the native application, the background image generation unit generates the background image in which the execution result is reflected in the appearance, transmits the background image to the web browser of the user device, and causes the web browser to display the background image. The UI component generation unit generates the UI component in the same procedure as described above. The UI component overlay unit transparentizes the UI component generated by the UI component generation unit, and overlays the UI component on the background image displayed on the web browser by the background image generation unit so that the web browser displays the UI component.

Two flowcharts illustrate an operation example of the server device according to the present embodiment configured as described above. One illustrates an overall operation example of the server device, and the other illustrates an operation example of the device image providing unit.

In the overall flowchart, the application operation control unit first determines whether or not the event information of the operation of starting the native application has been acquired from the web browser of the user device. When the event information of the application starting operation has been acquired, the application operation control unit executes the process of starting the native application on the basis of the acquired event information.

The device image providing unit provides the web browser with the device image generated by transparentizing and overlaying the UI component, which is the HTML element related to the UI, on the background image showing the appearance including an initial screen (including the UI) displayed by starting the native application. Details of this processing are illustrated in the flowchart of the device image providing unit.

Thereafter, the application operation control unit determines whether or not the event information of the user operation performed on the UI component has been acquired from the user device. In a case where the event information of the UI operation has not been acquired, the process proceeds to the step of determining whether an application ending operation has been performed. On the other hand, in a case where the event information of the UI operation has been acquired, the application operation control unit executes the process of causing the native application to execute the operation according to the UI operation on the basis of the acquired event information.

When the application operation control unit executes the operation of the native application, the device image providing unit provides the web browser of the user device with the device image generated by transparentizing and overlaying the UI component on the background image in which the execution result is reflected in the appearance. The processing performed here is similar to the initial provision of the device image, and details thereof are illustrated in the flowchart of the device image providing unit. Thereafter, the process proceeds to the step of determining whether an application ending operation has been performed.

In that step, the application operation control unit determines whether or not event information of an operation of ending the native application has been acquired from the web browser of the user device. In a case where the event information of the application ending operation has not been acquired, the process returns to the determination of whether event information of a UI operation has been acquired. On the other hand, in a case where the event information of the application ending operation has been acquired, the application operation control unit executes a process of shutting down the native application on the basis of the acquired event information, and ends the process of the overall flowchart.
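The overall control flow described above can be summarized as a small event loop. The sketch below replays a list of events and records which actions the server device would take; the event shape (`{"type": "start" | "ui" | "end"}`) and the action names are assumptions chosen for illustration.

```python
def run_session(events):
    """Replay the server device's control flow over a sequence of events.

    Returns the ordered list of actions taken. Event dictionaries of the
    form {"type": "start" | "ui" | "end"} are an assumption of this sketch.
    """
    actions = []
    it = iter(events)

    # Wait until an application starting operation arrives.
    for event in it:
        if event["type"] == "start":
            actions += ["start_app", "provide_device_image"]
            break
    else:
        return actions  # no start event was ever received

    # Handle UI operations until an application ending operation arrives.
    for event in it:
        if event["type"] == "ui":
            actions += ["operate_app", "provide_device_image"]
        elif event["type"] == "end":
            actions.append("shutdown_app")
            break
    return actions

if __name__ == "__main__":
    demo = [{"type": "start"}, {"type": "ui"}, {"type": "ui"}, {"type": "end"}]
    print(run_session(demo))
```

Each UI operation is followed by a fresh device image, which matches the description that the background image and overlay are regenerated after the native application executes an operation.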

In the flowchart of the device image providing unit, the background image generation unit generates the background image by, for example, controlling the camera to capture the appearance of the operation target device, transmits the generated background image to the web browser, and causes the web browser to display the background image. Subsequently, the UI component generation unit acquires the UI hierarchy information of the native application from the operation target device, converts the acquired UI hierarchy information into the DOM hierarchy information of the HTML element, and generates the UI component on the basis of the converted DOM hierarchy information. Subsequently, the UI component overlay unit transparentizes the UI component generated by the UI component generation unit, and overlays the UI component on the background image so that the web browser displays the UI component.

As described above in detail, the server device of the present embodiment provides the user device of the test executor with the device image generated by transparentizing and overlaying the UI component, which is the HTML element related to the UI, on the background image showing the appearance of the operation target device including the UI. When the event information of the user operation performed on the UI component is acquired from the user device, the native application installed in the operation target device is operated on the basis of the acquired event information, and the user device is provided with the device image generated by transparentizing and overlaying the UI component on the background image in which the execution result is reflected in the appearance.

According to the present embodiment configured as described above, the application testing of the native application installed in the operation target device can be executed by performing the user operation on the UI component overlaid on the background image through the testing tool for a website. This makes it possible to test the native application using the testing tool for a website.

In the above embodiment, an example has been described in which the UI hierarchy information, which has a different structure depending on the OS of the operation target device, is directly converted into the DOM hierarchy information of the HTML element. However, the present invention is not limited to this example. For example, the UI component generation unit may convert the UI hierarchy information of the native application, which depends on the OS of the operation target device, into intermediate hierarchy information independent of the OS of the operation target device, and then convert the intermediate hierarchy information into the DOM hierarchy information of the HTML element. A drawing illustrates an example of the intermediate hierarchy information. Here, an example in which the intermediate hierarchy information is configured in JSON format is illustrated.
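To make the idea of an OS-independent intermediate representation concrete, the sketch below normalizes two differently shaped inputs into the same JSON-serializable node. The intermediate schema (`role`, `text`, `rect`, `children`), the role tables, and the iOS-style attribute names (`type`, `label`, `x`, `y`, `width`, `height`) are assumptions for illustration and are not taken from the source.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical role tables mapping OS-specific widget names to common roles.
ANDROID_ROLES = {"android.widget.Button": "button", "android.widget.EditText": "input"}
IOS_ROLES = {"XCUIElementTypeButton": "button", "XCUIElementTypeTextField": "input"}

def from_android(node):
    """Normalize an Android-style node with a '[x1,y1][x2,y2]' bounds string."""
    raw = node.get("bounds", "[0,0][0,0]")
    x1, y1, x2, y2 = (int(v) for v in raw.replace("][", ",").strip("[]").split(","))
    return {
        "role": ANDROID_ROLES.get(node.get("class", ""), "container"),
        "text": node.get("text", ""),
        "rect": {"x": x1, "y": y1, "w": x2 - x1, "h": y2 - y1},
        "children": [from_android(child) for child in node],
    }

def from_ios(node):
    """Normalize an iOS-style node with separate x/y/width/height attributes."""
    return {
        "role": IOS_ROLES.get(node.get("type", ""), "container"),
        "text": node.get("label", ""),
        "rect": {
            "x": int(node.get("x", 0)),
            "y": int(node.get("y", 0)),
            "w": int(node.get("width", 0)),
            "h": int(node.get("height", 0)),
        },
        "children": [from_ios(child) for child in node],
    }

if __name__ == "__main__":
    android = ET.fromstring(
        '<node class="android.widget.Button" text="7" bounds="[0,400][270,600]"/>'
    )
    print(json.dumps(from_android(android), indent=2))
```

With one adapter per OS feeding a shared schema, only a single converter from the intermediate form to DOM hierarchy information would be needed.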

In the case of conversion into the intermediate hierarchy information, the UI component generation unit may optimize the intermediate hierarchy information by combining, with the intermediate hierarchy information, text information generated by optical character recognition (OCR) of the background image (the captured image showing the appearance of the operation target device) generated by the background image generation unit, and convert the optimized intermediate hierarchy information into the DOM hierarchy information.

Depending on the native application to be tested, characters and numbers displayed on the UI may be represented by images instead of text information. In this case, information corresponding to characters and numbers displayed by images may be missing in the intermediate hierarchy information converted from the UI hierarchy information. The above-described optimization processing is a process of complementing the missing characters and numbers in the intermediate hierarchy information converted from the UI hierarchy information with the text information recognized by performing OCR processing on the captured image of the operation target device.

Such optimization processing can be executed for the UI hierarchy information. However, since the UI hierarchy information has various different structures depending on the OS of the operation target device, it is necessary to create an algorithm for performing the optimization processing for each OS. On the other hand, performing the optimization processing on the intermediate hierarchy information independent of the OS after the conversion into the intermediate hierarchy information provides an advantage that it is only necessary to prepare one algorithm for performing the optimization processing.
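A minimal sketch of this complementing step follows. It assumes that OCR has already been run elsewhere and that its output is a list of recognized strings with bounding rectangles in the same coordinate system as the intermediate hierarchy; the node schema (`text`, `rect`, `children`) and the matching rule (OCR box centre falls inside a leaf node's rectangle) are assumptions, not details given in the source.

```python
def center_inside(inner, outer):
    """Return True if the centre of rect `inner` lies within rect `outer`."""
    cx = inner["x"] + inner["w"] / 2
    cy = inner["y"] + inner["h"] / 2
    return (outer["x"] <= cx < outer["x"] + outer["w"]
            and outer["y"] <= cy < outer["y"] + outer["h"])

def fill_missing_text(node, ocr_results):
    """Fill empty text on leaf nodes using OCR results; modifies `node` in place."""
    for child in node.get("children", []):
        fill_missing_text(child, ocr_results)
    is_leaf = not node.get("children")
    if is_leaf and not node.get("text"):
        hits = [r["text"] for r in ocr_results if center_inside(r["rect"], node["rect"])]
        if hits:
            node["text"] = " ".join(hits)
    return node

if __name__ == "__main__":
    tree = {
        "role": "container", "text": "",
        "rect": {"x": 0, "y": 0, "w": 1080, "h": 1920},
        "children": [
            # A button drawn as an image: the hierarchy carries no text for it.
            {"role": "button", "text": "",
             "rect": {"x": 0, "y": 400, "w": 270, "h": 200}, "children": []},
        ],
    }
    ocr = [{"text": "7", "rect": {"x": 120, "y": 480, "w": 30, "h": 40}}]
    print(fill_missing_text(tree, ocr)["children"][0]["text"])
```

Because the function works only on the OS-independent schema, the same routine serves hierarchies that originated from any OS, which is the advantage described above.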

The application remote operation system of the present embodiment is suitably applied to a use case in which the application testing of the native application is performed by remotely operating the operation target device, but it can also be applied to use cases other than testing.

In the above embodiment, a configuration example has been described in which the server device has all the functions of the device image providing unit and the application operation control unit. However, the user device may have some or all of these functions. For example, as illustrated in one of the drawings, the web browser of the user device may have the function of the UI component overlay unit.

Alternatively, as illustrated in another drawing, the web browser of the user device may have all the functions of the device image providing unit and the application operation control unit. In this case, the server device is omitted, and the user device and the operation target device are connected via the communication network. For example, by providing the web browser with an application programming interface (API) for directly operating a USB device, the operation target device can be remotely operated directly from the web browser, and all the functions of the device image providing unit and the application operation control unit can be implemented in the web browser.

In addition, each of the above embodiments is merely an example of how the present invention may be implemented, and the technical scope of the present invention should not be interpreted in a limited manner on that basis. That is, the present invention can be implemented in various forms without departing from the gist or main features thereof.

