According to one embodiment, an information processing apparatus includes one or more processors. The processors determine weights by m determination methods (m is an integer greater than or equal to 2). Each determination method is a method using training data sets, each set including at least one explanatory variable and a visibility value as an objective variable, to determine the weights for the visibility values. The m determination methods include a first determination method of determining to assign a larger weight to the training data set including a first visibility value than to the training data sets including other visibility values. The first visibility value occurs with lower frequency than the other visibility values. The processors train m forecast models configured to forecast the visibility values by inputting the at least one explanatory variable. The m forecast models are trained by using the training data sets assigned with the weights determined by the m determination methods.
. An information processing apparatus comprising
. The information processing apparatus according to, wherein the hardware processors are configured to train a probability model configured to forecast a probability that the visibility value is the first visibility value, the probability model being trained by inputting the at least one explanatory variable by using the training data sets.
. The information processing apparatus according to, wherein the hardware processors are configured to
. The information processing apparatus according to, wherein the hardware processors are configured to select the at least one first candidate that enables a model configured to forecast the visibility value by using the candidates as an explanatory variable to perform forecasting with higher accuracy than the candidates other than the at least one first candidate.
. The information processing apparatus according to, wherein the hardware processors are configured to select the at least one first candidate having a higher cross-correlation with the visibility value than the candidates other than the at least one first candidate.
. The information processing apparatus according to, wherein the at least one explanatory variable includes at least one of relative humidity, temperature, or PM2.5.
. The information processing apparatus according to, wherein the m determination methods include a second determination method of determining the weights by using kernel density estimation.
. The information processing apparatus according to, wherein the hardware processors are configured to
. The information processing apparatus according to, wherein the hardware processors are configured to,
. An information processing apparatus comprising
. An information processing method implemented by a computer, the method comprising:
. A computer program product comprising a non-transitory computer-readable recording medium on which a computer program executable by a computer is recorded, the computer program instructing the computer to perform processing, the processing including:
This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2024-063340, filed on Apr. 10, 2024; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to an information processing apparatus, an information processing method, and a computer program product.
In a traffic control system, visibility forecasting is performed by utilizing weather data in order to secure the safety of road and air traffic.
In conventional visibility forecasting, through a machine learning technique that uses weather data including data indicating measured visibility as a training data set, a forecast model is trained to minimize the overall forecasting error.
However, the conventional techniques sometimes fail in training a forecast model capable of forecasting visibility with high accuracy.
An information processing apparatus according to one embodiment includes one or more hardware processors. The hardware processors are configured to determine weights by m mutually different determination methods (m is an integer greater than or equal to 2). Each determination method is a method using training data sets, each set including at least one explanatory variable and a visibility value as an objective variable, to determine the weights for the visibility values included in the training data sets. The m determination methods include a first determination method of determining to assign a larger weight to the training data set including a first visibility value than to other visibility values. The first visibility value occurs with lower frequency than the other visibility values. The hardware processors are configured to train m forecast models configured to forecast the visibility values by inputting the at least one explanatory variable. The m forecast models are trained by using the training data sets assigned with the weights determined by the m determination methods.
Hereinafter, embodiments of an information processing apparatus according to the present invention will be described in detail with reference to the accompanying drawings.
The conventional art sometimes failed in training a forecast model capable of forecasting visibility with high accuracy, as described above. One of the reasons is that data indicating visibility has the following features.
The visibility value is a standard of how far away an object can be seen and is expressed by, for example, the maximum distance at which an object can be seen with the naked eye. In the following, data indicating the visibility value are sometimes referred to as visibility data. A scale for the visibility data is sometimes changed in accordance with the use. For example, visibility at human eye level is several hundred meters, whereas visibility at aircraft level is several kilometers.
In the conventional art, a forecast model is trained with visibility data having the above-mentioned features so as to minimize the overall forecast error. Therefore, a high visibility value occurring with high frequency can be forecasted with high accuracy, whereas a low visibility value occurring with low frequency cannot be forecasted with high accuracy in some cases.
is a diagram illustrating an example of visibility data. In, visibility values are divided into actual values and forecast values (bold line), and how the visibility values change with time is illustrated. The forecast values are examples of visibility values forecasted by the conventional art.
In the illustrated example, visibility values exceeding the upper limit of 10 are excluded. Moreover, the number of pieces of data on low visibility values (for example, 3 or less) is smaller than the number of pieces of data on high visibility values. In other words, the frequency of occurrence of visibility values is biased. Therefore, in the forecast values obtained by the conventional art, high visibility values are forecasted with relatively high accuracy, whereas the forecasting of low visibility values has been unsuccessful.
Different phenomena (factors) that affect visibility sometimes occur at different places (regions). For example, PM2.5 in China, dense fog in India, and snowstorms in Japan often cause poor visibility (a decrease in visibility value). Therefore, in the case of forecasting visibility by using weather data, it is desirable to select a more appropriate factor for a forecast target region and use the selected factor as an explanatory variable for the forecasting of visibility (objective variable).
An information processing apparatus of a first embodiment trains (builds) two forecast models including a forecast model for more accurately forecasting a visibility value occurring with low frequency (for example, a low visibility value). The information processing apparatus of the present embodiment forecasts a visibility value in a forecast target period by applying data for forecasting (forecast data) to the trained forecast models. Moreover, the information processing apparatus of the present embodiment selects at least one factor (visibility factor) effective for forecasting of visibility from a plurality of factors (a plurality of types of weather data), and performs training of the forecast models by using the selected visibility factor and performs forecasting using the forecast models.
In the present embodiment, the two forecast models are trained as described above, and forecasting processing is executed using the two forecast models. The number m of the forecast models is not limited to 2, but may be 3 or more. A configuration in which m is 3 or more will be described in a second embodiment.
is a block diagram illustrating a configuration example of an information processing apparatus of the first embodiment. As illustrated, the information processing apparatus includes a training controller and a forecasting controller.
The training controller controls the training processing of the forecast models. The forecasting controller controls the forecasting processing using the trained forecast models.
At least one of the above-mentioned controllers (the training controller and the forecasting controller) may be implemented by one or more processing units. The above-mentioned controllers are implemented by, for example, one or more processors. The above-mentioned controllers may be implemented by causing a processor such as a central processing unit (CPU) or a graphics processing unit (GPU) to execute a computer program, namely, implemented by software. The above-mentioned controllers may be implemented by a processor such as a dedicated integrated circuit (IC), namely, implemented by hardware. The above-mentioned controllers may be implemented using a combination of software and hardware. In the case of using multiple processors, each of the processors may implement one of the controllers or may implement two or more of the controllers.
The information processing apparatus may be physically composed of a single device or may be physically composed of a plurality of devices. In one example, the information processing apparatus may be built on a cloud environment. The controllers of the information processing apparatus may be distributed among devices. In one example, the information processing apparatus (an information processing system) may include a device including the training controller (for example, a training device) and a device including the forecasting controller (for example, a forecast device).
At least one constituent of the information processing apparatus may be chip-based. At least one constituent of the information processing apparatus may be incorporated in a system-on-chip (SoC), such as an edge device. In this case, a memory unit configured to store data to be used for training (a memory unit to be described later) and a memory unit configured to store data to be used for forecasting (a memory unit to be described later) may be provided outside the SoC and accessible via an interface device.
The training controller includes the memory unit, a selection unit, a weight determination unit_, a weight determination unit_, a model training unit_, a model training unit_, and a probability training unit.
The memory unit stores various types of information to be used by the training controller. The memory unit stores, for example, input data for training TD and a visibility threshold TH that are used for training processing, and output data outputted through the training processing. The output data includes, for example, a visibility factor VF, a forecast model FM_, a forecast model FM_, and a probability model PRM. In the drawing, for convenience of description, only the input data for training TD are illustrated inside the memory unit, but the memory unit may store other types of data (for example, the output data) as well.
The input data for training TD are data to be inputted for training the forecast models (the forecast model FM_ and the forecast model FM_). In one example, the input data for training TD include a plurality of types of weather data for a plurality of dates and times (time stamps). The weather data may be any type of data, and may include an atmospheric pressure, a dew-point temperature, a wind velocity, a relative humidity, an amount of precipitation, an atmospheric temperature, and an air pollution index (for example, PM2.5, PM10, NOx, CO, or NMHC).
is a diagram illustrating an example of the input data for training TD. In the example in, the input data for training TD include a time stamp, an atmospheric pressure, a dew-point temperature, a wind velocity, a relative humidity, an amount of precipitation, PM10, an atmospheric temperature, PM2.5, and visibility.
A plurality of types of weather data (factors) included in the input data for training TD corresponds to a plurality of candidates for the visibility factor VF serving as an explanatory variable. In other words, at least one explanatory variable defined by the visibility factor VF and a visibility value (visibility data) serving as an objective variable are selected from the input data for training TD, whereby training data sets including the selected explanatory variable and the objective variable are generated. In the following, data selected as the visibility factor VF are sometimes referred to as visibility factor data for training.
The at least one explanatory variable may be any data, and may include some of or all the following data.
The visibility value serving as the objective variable corresponds to a visibility value to be used as correct data during training. In one example, when visibility values are forecasted in a forecast target period that is after the lapse of a certain period of time (for example, from one to six hours later at one-hour intervals), based on weather data observed in the past, visibility values observed after the lapse of the certain period of time following the observation of the weather data corresponding to the explanatory variables are used as the correct data.
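The pairing of explanatory variables with later-observed visibility values can be sketched as follows. This is a minimal illustration only: the function name make_training_sets, the record layout, and the sample values are assumptions introduced here and do not appear in the embodiment.

```python
from typing import List, Sequence, Tuple


def make_training_sets(
    features: Sequence[Sequence[float]],
    visibility: Sequence[float],
    horizon: int,
) -> List[Tuple[List[float], float]]:
    """Pair the explanatory variables observed at step t with the
    visibility value observed `horizon` steps later (the correct data)."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    if len(features) != len(visibility):
        raise ValueError("features and visibility must have equal length")
    pairs = []
    # The last `horizon` records have no later observation to pair with.
    for t in range(len(features) - horizon):
        pairs.append((list(features[t]), visibility[t + horizon]))
    return pairs


# Hypothetical hourly records: (relative humidity, PM2.5, current visibility).
features = [(60.0, 20.0, 9.0), (75.0, 35.0, 7.0), (90.0, 60.0, 3.0), (95.0, 80.0, 1.5)]
visibility = [9.0, 7.0, 3.0, 1.5]

pairs_1h = make_training_sets(features, visibility, horizon=1)
pairs_3h = make_training_sets(features, visibility, horizon=3)
```

With hourly data, one such list would be built per forecast horizon (for example, one to six hours), so that each horizon has its own correct data.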
In the illustrated example, data indicating relative humidity, data indicating PM2.5, and data indicating visibility are selected as visibility factors (explanatory variables). Thus, in this example, the visibility factor data for training include relative humidity, PM2.5, and visibility.
A visibility threshold TH is a threshold for dividing the training data set including the visibility factor data for training into plural groups. Since the range of visibility changes with the purpose of use, a value of the visibility threshold TH may be changed with the purpose of use.
The memory unitcan be composed of any commonly used storage medium, such as a flash memory, a memory card, a random access memory (RAM), a hard disk drive (HDD), or an optical disk. Some of or all the pieces of data to be stored in the memory unit(the input data for training TD, the visibility threshold TH, the visibility factor VF, the forecast model FM_, the forecast model FM_, and the probability model PRM) may be stored in physically different storage media, or may be stored in different storage areas of physically the same storage medium.
Return to the description of. The selection unitselects the visibility factor VF from the input data for training TD. In one example, from among candidates (weather data) for the explanatory variable included in the input data for training TD, the selection unitselects, as a visibility factor VF, at least one candidate C(a first candidate) that affects the forecasting of a visibility value. The selection unitstores the selected visibility factor VF in the memory unit. Moreover, the selection unitgenerates training data sets including the at least one visibility factor VF (the candidate C) selected as an explanatory variable.
The following methods can be applied as methods for selecting the visibility factor VF.
A description of (M2) is given below. It is assumed that the input data for training TD include an atmospheric pressure, a dew-point temperature, a wind velocity, a relative humidity, an amount of precipitation, PM10, an atmospheric temperature, and PM2.5 as candidates for the factors. The selection unit calculates a correlation coefficient between visibility and each of the candidates, and selects, as the visibility factor VF, the candidate for which a correlation coefficient indicating high correlation has been calculated. In the case of a correlation coefficient whose higher value indicates a higher correlation, the selection unit selects, as the visibility factor VF, a candidate having a correlation coefficient value larger than a threshold value, a certain number of candidates in descending order of correlation coefficient value, or a certain proportion of top-ranked candidates in descending order of correlation coefficient value.
In one example, when the threshold value is 0.5 and the correlation coefficient between each of the candidates and visibility is as follows, the selection unit selects, as the visibility factors VF, the relative humidity and PM2.5, whose correlation coefficient values are larger than the threshold value of 0.5.
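The threshold-based selection of (M2) can be sketched as follows. This is an illustration under stated assumptions: the Pearson coefficient is used as the correlation coefficient, its absolute value is compared with the threshold so that strongly negative relations also count as high correlation, and all data values and names (pearson, select_factors) are made up for this sketch.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_factors(
    candidates: Dict[str, Sequence[float]],
    visibility: Sequence[float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep candidates whose |correlation| with visibility exceeds the threshold.
    Using the absolute value is an assumption of this sketch."""
    return [
        name
        for name, values in candidates.items()
        if abs(pearson(values, visibility)) > threshold
    ]


# Hypothetical observations, not taken from the embodiment.
visibility = [9.0, 8.0, 6.0, 3.0, 2.0, 1.0]
candidates = {
    "atmospheric_pressure": [1012.0, 1008.0, 1013.0, 1009.0, 1012.0, 1010.0],
    "relative_humidity": [50.0, 55.0, 70.0, 85.0, 90.0, 95.0],
    "pm2_5": [10.0, 15.0, 30.0, 60.0, 70.0, 90.0],
}
selected = select_factors(candidates, visibility, threshold=0.5)
```

Selecting a fixed number or a fixed proportion of top-ranked candidates, as also mentioned above, would only require sorting the candidates by the same score instead of thresholding.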
A description of (M3) is given below. The selection unit divides the input data for training TD into teaching data and validation data. The selection unit generates various combinations of candidates included in the teaching data and trains a model M_VF by using the training data set including the combinations as explanatory variables. The model M_VF may be any model configured to forecast visibility, and may be, for example, a model having the same structure as that of the forecast model FM_ or the forecast model FM_ and being built by using the same training method as that of the forecast model FM_ or the forecast model FM_.
The selection unit evaluates the trained model M_VF by using the validation data and calculates an evaluation score. The evaluation score may be any index. Examples of the index that can be used include a root mean squared error (RMSE), a mean absolute error (MAE), and a coefficient of determination (R^2).
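The three evaluation indices named above can be computed as in the following sketch. The function names and the sample values are assumptions introduced for illustration; the definitions themselves are the standard ones.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean squared error (lower is better)."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error (lower is better)."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def r_squared(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Coefficient of determination (higher is better).
    Assumes the actual values are not all identical."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical validation data and forecasts.
actual = [2.0, 4.0, 6.0, 8.0]
forecast = [3.0, 4.0, 5.0, 8.0]

score_rmse = rmse(actual, forecast)
score_mae = mae(actual, forecast)
score_r2 = r_squared(actual, forecast)
```

Note that the "best" score means the smallest value for RMSE and MAE but the largest value for the coefficient of determination.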
The selection unit selects, as the visibility factors VF, candidates included in a combination exhibiting the best evaluation score. Similar to the above, when the input data for training TD include eight candidates, namely, an atmospheric pressure, a dew-point temperature, a wind velocity, a relative humidity, an amount of precipitation, PM10, an atmospheric temperature, and PM2.5, there are 255 (= 2^8 - 1) combinations of the candidates. Examples of the combinations are listed below.
{atmospheric pressure}, {dew-point temperature}, {wind velocity}, {relative humidity}, {amount of precipitation}, {PM10}, {atmospheric temperature}, {PM2.5}, . . . , {relative humidity, PM2.5}, {dew-point temperature, PM10}, . . . , {atmospheric pressure, dew-point temperature, wind velocity, relative humidity}, {amount of precipitation, PM10, atmospheric temperature, PM2.5}, . . . , {atmospheric pressure, dew-point temperature, wind velocity, relative humidity, amount of precipitation, PM10, atmospheric temperature, PM2.5}
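The exhaustive search of (M3) can be sketched as follows. This is a sketch under assumptions, not the embodiment's implementation: a one-nearest-neighbour forecaster stands in for the model M_VF, RMSE is used as the evaluation score, and the teaching and validation records are hypothetical.

```python
import itertools
import math

CANDIDATES = [
    "atmospheric_pressure", "dew_point_temperature", "wind_velocity",
    "relative_humidity", "precipitation", "pm10",
    "atmospheric_temperature", "pm2_5",
]


def all_combinations(candidates):
    """Yield every non-empty combination of the candidates."""
    for size in range(1, len(candidates) + 1):
        yield from itertools.combinations(candidates, size)


def nearest_neighbour_forecast(train_x, train_y, query):
    """Stand-in for M_VF: return the target of the closest teaching sample."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))
    best = min(range(len(train_x)), key=lambda i: squared_distance(train_x[i]))
    return train_y[best]


def rmse_for(combo, teaching, validation, target="visibility"):
    """Fit on the teaching data and score on the validation data."""
    train_x = [[row[name] for name in combo] for row in teaching]
    train_y = [row[target] for row in teaching]
    squared_errors = [
        (nearest_neighbour_forecast(train_x, train_y,
                                    [row[name] for name in combo]) - row[target]) ** 2
        for row in validation
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def select_visibility_factors(candidates, teaching, validation):
    """Return the combination with the lowest validation RMSE."""
    return min(all_combinations(candidates),
               key=lambda combo: rmse_for(combo, teaching, validation))


n_combinations = sum(1 for _ in all_combinations(CANDIDATES))

# Hypothetical records with two candidates only, to keep the example small.
teaching = [
    {"relative_humidity": 50.0, "wind_velocity": 50.0, "visibility": 9.0},
    {"relative_humidity": 70.0, "wind_velocity": 10.0, "visibility": 6.0},
    {"relative_humidity": 90.0, "wind_velocity": 30.0, "visibility": 2.0},
]
validation = [
    {"relative_humidity": 52.0, "wind_velocity": 10.0, "visibility": 9.0},
    {"relative_humidity": 88.0, "wind_velocity": 50.0, "visibility": 2.0},
]
best = select_visibility_factors(
    ["relative_humidity", "wind_velocity"], teaching, validation)
```

Because the number of combinations doubles with each added candidate, the exhaustive loop is practical only for a small number of candidates.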
In the above examples, visibility is excluded from candidate factors to be included in the combinations. In this case, the selection unit may further include visibility as the visibility factor VF. Alternatively, the selection unit may perform the above-mentioned processing while visibility is included as a candidate factor to be included in the combinations.
When generating combinations of candidates, the selection unit may use sequential forward selection (SFS), sequential backward selection (SBS), or a genetic algorithm.
Return to the description of. The weight determination unit_and the weight determination unit_determine weights in accordance with visibility values included in a plurality of training data sets by using mutually different determination methods. In the present embodiment, the number m of the determination methods is assumed to be 2.
The determination methods are methods using plural training data sets to determine weights in accordance with visibility values included in the training data sets. A weight is determined for each of the training data sets (each sample), and each of the determined weights is assigned to a corresponding one of the training data sets. The two determination methods include a determination method DT_A (a first determination method) to be used by the weight determination unit_ and a determination method DT_B to be used by the weight determination unit_.
The determination method DT_A is a method of determining to assign a larger weight W to the training data set including a visibility value VA (a first visibility value) than to the training data sets including other visibility values, the visibility value VA occurring with lower frequency than the other visibility values. The visibility value VA is, for example, a low visibility value. The determination method DT_B may be any method that differs from the determination method DT_A, and is, for example, a method (a second determination method) of determining a weight W by using kernel density estimation.
The weight determination unit_ determines a weight to be assigned to the training data set for training the forecast model FM_ in accordance with the determination method DT_A. First, the weight determination unit_ divides the training data sets into m groups by using the visibility threshold TH.
When m=2 as in the present embodiment, the weight determination unit_ classifies the training data sets including visibility values (low visibility values) equal to or smaller than the visibility threshold TH, as a group corresponding to the low visibility values (hereinafter referred to as a low visibility group). The visibility values equal to or smaller than the visibility threshold TH are examples of the visibility value VA that occurs with lower frequency than other visibility values. Moreover, the weight determination unit_ classifies the training data sets including visibility values (high visibility values) larger than the visibility threshold TH, as a group corresponding to the high visibility values (hereinafter referred to as a high visibility group).
The weight determination unit_ determines a predetermined value for each of the groups as a value of the weight W. A value β for a group corresponding to a visibility value to be focused on is defined to be larger than a value β for another group. In one example, when low visibility is focused on, the weight determination unit_ determines to assign β=9999 as a value of the weight W to the group corresponding to the low visibility values. Moreover, the weight determination unit_ determines to assign β=1 as a value of the weight W to the other group (the group corresponding to the high visibility values).
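The group-wise weighting of the determination method DT_A for m=2 can be sketched as follows. The values 9999 and 1 come from the example above; the threshold value 3.0, the sample data, and the weighted squared error used to show the effect of the weights are assumptions of this sketch, not details stated for the embodiment.

```python
from typing import List, Sequence


def threshold_weights(
    visibility: Sequence[float],
    threshold: float,
    low_weight: float = 9999.0,
    high_weight: float = 1.0,
) -> List[float]:
    """Assign the larger weight to samples in the low visibility group
    (visibility <= threshold) and the smaller weight to all others."""
    return [low_weight if v <= threshold else high_weight for v in visibility]


def weighted_mse(
    actual: Sequence[float],
    forecast: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted mean squared error, one common way of using sample weights
    during training (assumed here for illustration)."""
    total = sum(w * (a - f) ** 2 for a, f, w in zip(actual, forecast, weights))
    return total / sum(weights)


# Hypothetical visibility values and an assumed visibility threshold TH = 3.0.
visibility = [9.0, 8.0, 2.5, 10.0, 1.0]
weights = threshold_weights(visibility, threshold=3.0)

# A forecast that is exact except on one low-visibility sample.
forecast = [9.0, 8.0, 4.5, 10.0, 1.0]
plain_error = weighted_mse(visibility, forecast, [1.0] * len(visibility))
focused_error = weighted_mse(visibility, forecast, weights)
```

With these weights, an error on a low-visibility sample dominates the loss, so a model minimizing the weighted error is pushed to fit the rare low values.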
In accordance with the determination method DT_B, the weight determination unit_ determines a weight W to be assigned to the training data set for training the forecast model FM_. Hereinafter, an example of the determination method DT_B, which uses kernel density estimation to determine the weight W, will be described.
As the method that uses kernel density estimation, a method of determining a weight for a rare value based on kernel density estimation can be used (see, for example, Steininger, M., Kobs, K., Davidson, P. et al., “Density-based weighting for imbalanced regression”, Mach Learn 110, 2187-2211 (2021)). The weight determination unit_ determines the weight W(y) for visibility y by using the following formula (1), for example.
Note that determining the weight for the visibility y is equivalent to determining a weight for a training data set including the visibility y. The visibility y means a value of visibility included in one of training data sets. Hereinafter, Y represents a set of all visibility y included in the training data sets.
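A density-based weighting of this kind can be sketched as follows. The sketch loosely follows the idea of the cited paper (weights decrease with the estimated density of the visibility value) and is not a reproduction of formula (1). The Gaussian kernel, the normal-reference bandwidth rule, the parameters alpha and eps, and the sample data are all assumptions introduced here.

```python
import math
import statistics
from typing import Callable, List, Optional, Sequence


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a Gaussian kernel density estimate built from the samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y: float) -> float:
        return norm * sum(math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples)

    return density


def density_weights(
    visibility: Sequence[float],
    alpha: float = 1.0,
    eps: float = 1e-6,
    bandwidth: Optional[float] = None,
) -> List[float]:
    """Weight each sample by how rare its visibility value is in Y."""
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed default).
        bandwidth = 1.06 * statistics.stdev(visibility) * len(visibility) ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    density = gaussian_kde(visibility, bandwidth)
    p = [density(y) for y in visibility]
    lowest, highest = min(p), max(p)
    span = highest - lowest
    # Min-max normalise the density to the range [0, 1].
    p_norm = [(v - lowest) / span if span > 0 else 0.0 for v in p]
    # Rare values (low density) get weights near 1, frequent values near eps.
    raw = [max(1.0 - alpha * v, eps) for v in p_norm]
    mean_raw = sum(raw) / len(raw)
    # Rescale so that the weights average to 1.
    return [r / mean_raw for r in raw]


# Hypothetical set Y: mostly high visibility values and one rare low value.
Y = [10.0, 10.0, 9.5, 9.0, 10.0, 9.5, 2.0]
W = density_weights(Y)
```

Unlike the two-group weighting, this scheme needs no visibility threshold: the weight varies continuously with the estimated density, and alpha controls how strongly rare values are emphasised.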
October 16, 2025