The SVR method applies a support vector machine (SVM) to regression analysis and, like the SVM, can construct nonlinear models through the kernel trick. The OSVR method efficiently updates an SVR model so that the Karush–Kuhn–Tucker (KKT) conditions the model must fulfill remain satisfied when training data are added or deleted.
The primal form of SVR can be shown to be the following optimization problem.
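The display equation for Eq. A1 appears to have been lost in extraction; in the standard SVR formulation, the primal problem is presumably of the form

```latex
\min_{\mathbf{w},\, b} \;\; \frac{1}{2}\|\mathbf{w}\|^{2}
  \; + \; C \sum_{i=1}^{N} \left| y_i - f(\mathbf{x}_i) \right|_{\varepsilon}
```

where the second term is the ε-insensitive loss summed over the N training data.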
where yi and xi are the training data, f is the SVR model, w is a weight vector, ε is a threshold, and C is a penalty factor that controls the trade-off between model complexity and training error. The second term of Eq. A1 is the ε-insensitive loss function, which is given as follows
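The missing display equation here is presumably the standard ε-insensitive loss:

```latex
\left| y - f(\mathbf{x}) \right|_{\varepsilon} =
\begin{cases}
0, & \left| y - f(\mathbf{x}) \right| \le \varepsilon \\[4pt]
\left| y - f(\mathbf{x}) \right| - \varepsilon, & \text{otherwise}
\end{cases}
```

Errors smaller than the threshold ε are thus ignored; only excess error is penalized.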
By minimizing Eq. A1, we can construct a regression model that balances generalization capability against the ability to fit the training data. The y-value predicted for input data x is represented as follows
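The missing prediction equation is presumably the standard kernel expansion of an SVR model, with dual variables αi and αi*:

```latex
f(\mathbf{x}) \; = \; \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^{*} \right) K(\mathbf{x}, \mathbf{x}_i) \; + \; b
```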
where N is the number of training data, b is a constant term, and K is a kernel function. The kernel function in our application is the radial basis function
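The radial basis function kernel referenced here is presumably the usual Gaussian form:

```latex
K(\mathbf{x}, \mathbf{x}') \; = \; \exp\!\left( -\gamma \, \| \mathbf{x} - \mathbf{x}' \|^{2} \right)
```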
where γ is a tuning parameter controlling the width of the kernel function. From Eqs. A1 and A2, αi and αi* in Eq. A3 are obtained by minimizing the equation given as
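The display equation for the dual objective appears to be missing; in the standard SVR dual formulation it is presumably

```latex
W \; = \; \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}
      \left( \alpha_i - \alpha_i^{*} \right)\left( \alpha_j - \alpha_j^{*} \right) K_{ij}
  \; - \; \sum_{i=1}^{N} y_i \left( \alpha_i - \alpha_i^{*} \right)
  \; + \; \varepsilon \sum_{i=1}^{N} \left( \alpha_i + \alpha_i^{*} \right)
```

minimized subject to 0 ≤ αi, αi* ≤ C and Σi (αi − αi*) = 0.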
Kij in Eq. A5 is represented as follows
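The missing definition of Kij is presumably the kernel matrix entry:

```latex
K_{ij} \; = \; K(\mathbf{x}_i, \mathbf{x}_j)
```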
Now, we define θi as follows
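The missing definition of θi is presumably the combined dual coefficient used throughout the OSVR derivation:

```latex
\theta_i \; = \; \alpha_i - \alpha_i^{*}, \qquad -C \le \theta_i \le C
```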
From Eqs. A3, A4, and A8, a predicted y-value of data xi is given as
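The missing prediction equation in θ-form is presumably

```latex
f(\mathbf{x}_i) \; = \; \sum_{j=1}^{N} \theta_j \, K_{ij} \; + \; b
```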
where θi meets the following equation
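The missing constraint on θi is presumably the equality constraint inherited from the dual problem:

```latex
\sum_{i=1}^{N} \theta_i \; = \; 0
```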
The error function h is defined as
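The missing definition of the error function is presumably the prediction residual:

```latex
h(\mathbf{x}_i) \; = \; f(\mathbf{x}_i) - y_i \; = \; \sum_{j=1}^{N} \theta_j \, K_{ij} \; + \; b \; - \; y_i
```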
Each training datum must satisfy one of Eqs. A13–A17. All training data can therefore be divided into the following sets: the error support vectors E, which satisfy Eq. A13 or A17; the margin support vectors S, which satisfy Eq. A14 or A16; and the remaining vectors R, which satisfy Eq. A15.
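The five missing KKT conditions (Eqs. A13–A17) are presumably the standard OSVR conditions relating h(xi) and θi; the sign convention pairing h ≥ ε with θi = −C is an assumption consistent with h(xi) = f(xi) − yi:

```latex
h(\mathbf{x}_i) \ge  \varepsilon, \quad \theta_i = -C           \qquad (\mathbf{x}_i \in E) \\
h(\mathbf{x}_i) =    \varepsilon, \quad -C \le \theta_i \le 0   \qquad (\mathbf{x}_i \in S) \\
\left| h(\mathbf{x}_i) \right| \le \varepsilon, \quad \theta_i = 0 \qquad (\mathbf{x}_i \in R) \\
h(\mathbf{x}_i) =   -\varepsilon, \quad 0 \le \theta_i \le C    \qquad (\mathbf{x}_i \in S) \\
h(\mathbf{x}_i) \le -\varepsilon, \quad \theta_i = C            \qquad (\mathbf{x}_i \in E)
```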
When new data xc, yc are added, there is no need to update the SVR model (θi, b) if xc belongs to R. On the other hand, if xc belongs to E or S, the initial value of θc (the θi corresponding to xc) is set to 0, and then θc, θi, and b are gradually changed so as to satisfy the KKT conditions. During these changes, individual training data may move to another region. Assuming no such movements, however, the variations of h(xi), θc, θi, and b, denoted Δh(xi), Δθc, Δθi, and Δb, respectively, can be derived from Eqs. A11 and A12 as follows
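The missing variation equations, obtained by differencing the definition of h and the equality constraint on θ, are presumably

```latex
\Delta h(\mathbf{x}_i) \; = \; K_{ic}\,\Delta\theta_c \; + \; \sum_{j=1}^{N} K_{ij}\,\Delta\theta_j \; + \; \Delta b,
\qquad
\Delta\theta_c \; + \; \sum_{j=1}^{N} \Delta\theta_j \; = \; 0
```

where Kic = K(xi, xc).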
The θi-values of the training data belonging to E and R do not change, owing to Eqs. A13, A15, and A17; thus, Eq. A18 can be transformed as
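With Δθj = 0 for all data in E and R, the sums presumably reduce to the margin support vectors only:

```latex
\Delta h(\mathbf{x}_i) \; = \; K_{ic}\,\Delta\theta_c \; + \; \sum_{j \in S} K_{ij}\,\Delta\theta_j \; + \; \Delta b,
\qquad
\Delta\theta_c \; + \; \sum_{j \in S} \Delta\theta_j \; = \; 0
```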
The h(xi)-values of the training data belonging to S are fixed, owing to Eqs. A14 and A16. Thus, Eqs. A19 and A20 become
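Setting Δh(xi) = 0 for the margin support vectors s1, …, sM gives, in the standard incremental SVR derivation, a linear system whose solution is proportional to Δθc; the missing equations are presumably of the form

```latex
\begin{pmatrix}
0 & 1 & \cdots & 1 \\
1 & K_{s_1 s_1} & \cdots & K_{s_1 s_M} \\
\vdots & \vdots & \ddots & \vdots \\
1 & K_{s_M s_1} & \cdots & K_{s_M s_M}
\end{pmatrix}
\begin{pmatrix}
\Delta b \\ \Delta\theta_{s_1} \\ \vdots \\ \Delta\theta_{s_M}
\end{pmatrix}
= -
\begin{pmatrix}
1 \\ K_{s_1 c} \\ \vdots \\ K_{s_M c}
\end{pmatrix}
\Delta\theta_c ,
\qquad
\begin{pmatrix}
\Delta b \\ \Delta\theta_{s_1} \\ \vdots \\ \Delta\theta_{s_M}
\end{pmatrix}
= \boldsymbol{\beta}\,\Delta\theta_c
```

where β = −R (1, K_{s1 c}, …, K_{sM c})ᵀ and R denotes the inverse of the coefficient matrix on the left.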
Here, M is the number of training data belonging to S. From Eqs. A20, A23, and A24, the variation of h(xi) for the training data belonging to E and R can be expressed as
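The missing equations presumably express this variation as a scalar multiple of Δθc:

```latex
\Delta h(\mathbf{x}_i) \; = \; \gamma_i \,\Delta\theta_c ,
\qquad
\gamma_i \; = \; K_{ic} \; + \; \sum_{j \in S} K_{i s_j}\,\beta_{s_j} \; + \; \beta_0 ,
\qquad \mathbf{x}_i \in E \cup R
```

where β0 is the component of β corresponding to Δb.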
From Eqs. A24 and A27, the Δθc required for each training datum to move to another region is represented as
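The missing equations are presumably the per-datum step sizes, where Δθi and Δh(xi) denote the distance of datum i from the boundary of its current region:

```latex
\Delta\theta_c \; = \; \frac{\Delta\theta_i}{\beta_{s_i}} \quad (\mathbf{x}_i \in S),
\qquad
\Delta\theta_c \; = \; \frac{\Delta h(\mathbf{x}_i)}{\gamma_i} \quad (\mathbf{x}_i \in E \cup R)
```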
The absolute value of Δθc required for each training datum to move from its current region to another (i.e., from E to S, from S to E or R, or from R to S) is calculated using Eqs. A29 and A30. The minimum of these absolute values over all training data is selected, and the datum giving that minimum is actually moved to its new region. This calculation and movement are repeated until every training datum satisfies the KKT conditions, namely, one of Eqs. A13–A17. When a datum is deleted from the training data, the same iterative calculation is performed until all data satisfy the KKT conditions.
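As a concrete illustration of the set membership tests used by this iteration, the following Python sketch classifies one training sample into E, S, or R from its error h(xi) and coefficient θi. The function name, the tolerance handling, and the sign convention (h ≥ ε paired with θi = −C) are assumptions for illustration, not part of the original method.

```python
def classify_region(h_i, theta_i, eps, C, tol=1e-9):
    """Classify one training sample into the OSVR bookkeeping sets.

    Assumes h(x_i) = f(x_i) - y_i, with a positive error h >= eps
    paired with theta_i = -C (sign convention assumed).
    """
    # Error support vectors E: error at or beyond the eps-tube, theta at a bound.
    if h_i >= eps - tol and abs(theta_i + C) <= tol:
        return "E"
    if h_i <= -eps + tol and abs(theta_i - C) <= tol:
        return "E"
    # Margin support vectors S: error exactly on the tube boundary,
    # theta between 0 and a bound.
    if abs(h_i - eps) <= tol and -C <= theta_i <= 0:
        return "S"
    if abs(h_i + eps) <= tol and 0 <= theta_i <= C:
        return "S"
    # Remaining vectors R: error inside the tube, theta equal to zero.
    if abs(h_i) <= eps + tol and abs(theta_i) <= tol:
        return "R"
    return "KKT violated"


# Example with eps = 0.1 and C = 1.0:
print(classify_region(0.5, -1.0, 0.1, 1.0))   # theta at bound -C      -> "E"
print(classify_region(0.1, -0.3, 0.1, 1.0))   # on the tube boundary   -> "S"
print(classify_region(0.05, 0.0, 0.1, 1.0))   # inside the tube        -> "R"
```

During the incremental update, any sample for which none of the five conditions holds is the one whose migration (and minimal |Δθc| step) must be processed next.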