Multiobjective genetic programming for reinforced concrete beam modeling

This paper presents the application of multiobjective genetic programming (MOGP) in engineering issues. An evolutionary symbolic implementation was developed based on a case study on prediction of the shear strength of slender reinforced concrete beams without stirrups including 1942 set of published test results. In the implementation of the MOGP model, the nondominated sorting genetic algorithm II with adaptive regression by mixing algorithm with considering the optimization of mean-square error as the fitness measure and the subtree complexity was used. The developed MOGP model was compared to previously developed genetic programming models, different building codes, and additional machine learning based approaches. It is clearly shown that the MOGP model outperformed the other algorithms applied on this database and can be a general solution on any engineering problems with the main advantage of prediction equations without assuming prior form of the relevance among the input predictor variables.


| INTRODUCTION
Since its conception in 1992 by John Koza, 1 genetic programming (GP) has been an attractive tool for researchers to identify models and systems. GP and its close predecessor genetic algorithm (GA), 2 are two most significant members of a group of methods, called evolutionary algorithms (EA), which use natural selection principles to pursue better solutions to various types of engineering problems. One important difference between GP and GA is that GA generally does not allow for variation of the model structure during the computation, 3 while in GP the structure of the model is a part of the problem and will be a part of the solution as well. In better words, GA is more concerned with the model parameters and inputs and assumes the behavior of the to-be-optimized system is known, while GP does not make such assumption about the internals of the system beforehand, and there lies its true power. GP is specifically effective for the problems in which the structure of the solution-in terms of how the variables inside the model are related to each other (interaction among variables) and to the output-is unknown, or expected to be poorly known. 4 A good example for such problems is the synthesis of electronic circuits such as amplifiers, voltage-current converter, low-voltage balun circuit, and so forth 1,5,6 that was achieved by merely describing the higher level behavior of the circuit. Streeter et al 6 managed to duplicate the functionality of five patented circuits using a GP approach, which outputs the topology of the circuit and the numerical value of its components. Other than producing novel topologies for electronic solutions, GP has been used for synthesizing topologies of PID controllers. 7 Not only was the proposed GP approach able to produce with the controller topology, but also it managed to come up with tuning rules that would outperform Ziegler-Nichols and Astrom-Hagglund tuning rules, which have been extensively in use for the majority of the twentieth century. Another area of strength for GP based approaches is where there are powerful simulation tools for a given problem but no concrete approach, 4 especially when the problem is somehow related to mimicking a natural system or phenomenon. Evolutionary robotics (ER) is a field in which EA are applied to solve robot design problems. For instance, something as rudimentary as walking has been posed as a great challenge to robotics researchers for decades. 8 A multitree GP approach was adopted in Reference 9 to optimize the way a swarm of cubic shaped robots assemble themselves into a larger unit. Others used GP approach to create controllers for mobile robots. [8][9][10] Due to the complex nature of walking locomotion, GP is proven very effective in producing satisfactory results. GP has been utilized to model complex processes in other engineering issues where the internal parts of the model are unknown or time consuming to attain. 11,12 In Reference 12, the authors used GP to model the strength of the concrete cylinders after they are enhanced by carbon-fiber reinforced polymer composites. In this paper, an evolutionary symbolic implementation for the prediction of the shear strength of slender reinforced concrete (RC) beams based on ACI 318-08 (2008) 13 without stirrups was investigated. In this implementation, various models including different characteristics via the combination of multiobjective genetic programming (MOGP) and adaptive regression by mixing algorithm were proposed. The proposed models were compared to the benchmark machine learning algorithms along with the model proposed by Gandomi et al 14 (Data S1). Additionally, the proposed model has shown better accuracy than models based on several building codes such as ACI building code, EC2 building code, Canadian Standard Code, and New Zealand Standard. 13 (1) Select points on the lower front with higher crowding distance, (2) Create next generation, (3) Binary tournament selection, (4) Recombination and mutation.

| MULTIOBJECTIVE GENETIC PROGRAMMING
One area of use for GP 1 as a symbolic optimization technique is to find a model that best represents the dataset, in the simplest and most accurate form. This is done by searching the space of the mathematical expressions. GP was originally formulated based on functional programming language as an evolutionary method to use Darwin's theory of natural selection to create computer programs aimed at a specific problem. Instead of using one candidate, GP uses a group of individuals (known as population), formed by randomly combining mathematical building blocks such as constants, mathematical operators, analytic functions, state variables, and genetic operators, to make new individuals (generations) guided by fitness and complexity as objective functions that are meant to gauge the quality of each individual. The regression model 15,16 in Algorithm 1 (flowchart available in Reference 17) is implemented as an MOGP approach based on the work of Deb et al on nondominated sorting genetic algorithm II (NSGA-II). 18 Instead of fitness sharing (which was used in NSGA-I), NSGA-II uses the concept of crowding distance. In other words, in NSGA-II, parents and offspring are combined to form one set and a nondominated sorting is used to classify the entire population based on an estimate of the density of solutions using the crowding distance. There are two values that the algorithm will minimize; first, the error, which can be the mean squared error (MSE) or mean absolute error (MAE), and second, the subtree complexity measure. 15,16,19 GP has great potential in predicting complex patterns. [20][21][22] It is also a suitable approach to be implemented using common parallelization frameworks such as message passing interface (MPI) and open multiprocessing (OpenMP). 23 The proposed model forms a Pareto front of models based on the fitness and subtree complexity measures. In this study, the most accurate model, and the model at the knee of the Pareto front (the knee of the 2D space of the objectives: fitness-complexity space, which presents the maximal trade-offs between objectives) were presented. Additionally, a fused model of the Pareto front obtained with adaptive regression by mixing (ARM) algorithm (shown in Algorithm 2) introduced by Yang 24 was presented.
As an example of engineering problems, a case study on prediction of the shear strength of slender RC beams without stirrups including 1942 set of published test results was employed which are gathered from different studies by Gandomi et al. 14 The input predictor variables are (1) web width (b w in mm), (2) effective depth (d in mm), (3) shear span to depth ratio ( a d ), (4) concrete compressive strength (f c in MPa), and (5) the amount of longitudinal reinforcement (ρ l in %), as well as shear strength of the RC beam without stirrups (V in kN) as the output. Figure 1 illustrates the kernel density distribution of the predictor variables in the database. As seen, all of the variables have a Gaussian distribution. The database randomly splitted into training and testing sets including 1458 and 486 instances, respectively. An MOGP model based on NSGA-II employing the most important factors commonly used in the previous building codes was developed. In order to find the optimized mutation and crossover rates of the model, various rates ranging from 0.1 to 0.9 were employed for the developed MOGP model for 200 generations. The 3-dimensional surface plot of the mutation rates and the crossover rates based on the fitness and subtree complexity measures are illustrated in Figure 2A and B, respectively. The highest fitness value (0.9389) was obtained with a value of 0.6 as the crossover rate, and a value of 0.4 as the mutation rate. Additionally, the lowest subtree complexity (153) was obtained with a value of 0.2 for the crossover rate, and a value of 0.8 for the mutation rate. However, the crossover and the mutation rates of 0.1 and 0.9 showed a fitness value of 0.9337 and a subtree complexity value of 181, accordingly. Therefore, the aforementioned rates were considered for running the developed MOGP model with 2000 generations and 1000 population for the regression task. The details of the parameters setting for the MOGP model are presented in Table 1. The developed MOGP model was trained on the training dataset with 1458 instances and tested on the testing dataset with 486 instances. This would decrease the chance of over-fitting of the model. The fitness and the complexity values at each generation were reported for the best individuals. Figure 3A depicts the evolution of the best individuals in the Pareto front of the MOGP model through generations for both fitness and complexity measures. The left y-axis was set to fitness measure, the right y-axis to complexity, and x-axis to number of generations. As seen, after roughly about 100 generations the model reached the fitness with 94% accuracy. After 2000generations, the fitness reached a value of 96% with the complexity value of 2227 which is about three times more than the value that was obtained for the 94% fitness measure. By increasing the number of generations, both fitness and complexity values increased as well, however the rate of improvement of the fitness value was almost constant after 200 generations.
In the context of statistical modeling with the main purpose of prediction of future outcomes, coefficient of determination (R 2 ) provides a measure of how well observed outcomes are replicated by the model based on the proportion of total variation of outcomes explained by the model. Additionally, root-mean-square error (RMSE) is frequently used to measure the differences between values predicted by a model and the values actually observed. To explore the performance of the proposed models, both regression metrics were computed for all of the models in the Pareto front (shown in dark violet circles) as shown in Figure 3B. Furthermore, the specified models including the most MOGP-accurate model (dark orchid star), the MOGP-fused model (purple triangle), the MOGP-knee model (cyan square), and the GP model by Gandomi et al 14 (yellow pentagon) were indicated. It is clearly shown that the ARM algorithm found a model  To study the quality assurance, mean value (μ) and coefficient of variation (CV) of this ratio are also reported. Figure 4 presents the comparison of the histograms of the MOGP-fused model and the GP model presented by Gandomi et al. 14 As seen, the MOGP-fused model has a mean value of 1.0840 and a coefficient of variation of 0.309 which is better than the GP model with a mean value of 1.0679 and a coefficient of variation of 0.477. By using the coefficient of variation, we achieve more clear understanding of the SD in light of the data mean value. Figure 5 shows the measured vs predicted shear strength values resulted from the developed models. Multistage genetic programming (MSGP) can be served as a substitute for the NSGA-II based model that was developed to decrease error decomposition. 11,25 There are two main stages in the MSGP algorithm: 1. Incorporating the individual effect of the input variables; 2. Incorporating the interactions among the input variables.
The MSGP formulates these two terms in an efficient procedure to optimize the error among predicted and actual values. This procedure can be parallelized to decrease overall computation time. [MOGP] [GP] Here complex models (eg, deep neural networks) were not used, first, since this study looks for explicit models, not black box models. Second, this is not a large-scale problem. Therefore, complex model are not good candidates due to the high over-fitting potential. A receiver operating characteristic (ROC) curve for a binary classifier is a way to [R]  visualize how the trade-off between true positive rate and true negative rate can affect the performance of the system. The area under the curve (AUC) is a measure of expected performance for that classifier. 27 In the case of a regressor, regression error characteristic (REC) curve is used as a way to visualize the performance. It shows the percentage of the data instances predicted within tolerance (on the y-axis) vs the tolerance level (on the x-axis). The resulting curve estimates the cumulative distribution function of the error. The area over the ROC and REC curve (AOC), which is just 1 − AUC is a biased estimate of the expected error. The coefficient of determination R 2 can be also calculated with respect to the AOC. 27 In addition, The shape of the REC curve can also be used to provide insight for the analysts about the data modeling. The REC curve was implemented in Python 1 and the details of the error metrics and scaling of the residuals are also available. 28 Figure 6A compares