Reliability assessment of existing structures using results of nondestructive testing

Making optimal decisions about the reliability of existing structures requires that the information used in assessment adequately represents the properties and the condition of the structures. The knowledge gap regarding a structure to be assessed can be successively filled by individually purposeful observations on site. This paper gives an overview of an approach for utilizing nondestructively gathered measurement results in reliability assessment of existing structures. An essential part of measurement‐based stochastic modeling of basic variables is the calculation of measurement uncertainties, which serves to establish confidence in measurement, to ensure the comparability of unambiguously expressed measurement results, and to quantify the quality of the measured information. Regarding the current discourse on how to treat information collected on‐site in the context of assessment, the authors recommend that measurement uncertainty becomes an uncertainty component mandatorily to be represented in measurement‐based stochastic models. The main steps of the proposed concept are presented, and the advantages of its application are emphasized by means of a prestressed concrete bridge as case study. The bridge is assessed regarding the serviceability limit state decompression using ultrasonic and radar data measured at the structure.


1 | INTRODUCTION
The aging of structures, deteriorating conditions, and changing loads are only some of the reasons why the reliability assessment of existing structures is an ongoing key challenge both nationally and internationally, and a highly topical issue in standardization (cf. 1). Calculated values of reliability measures such as the failure probability are not to be understood as structural properties. They depend on the incorporated knowledge about the considered system, and can be interpreted as a measure of the quality of the information available about the parameters considered important for a decision. 2 Taking measurement data into account in the assessment of existing structures has the potential to extend remaining lifetimes of structures, to avoid closures or use restrictions, and to save resources, since initially insufficient computation models used for the assessment can be refined purposefully by individual and quality-assessed observations made on site. This way, both the covered uncertainty in and the bias of stochastic models of basic variables can be reduced, and the level of approximation 3 increased.
Besides the established regular inspections, additional advanced measurements on structures have proven suitable and useful in condition assessment. 4,5 Information on monitoring-supported reliability analyses can be found, for example, in Frangopol et al. 6,7 The purpose of this contribution is to propose an approach for using measurement data collected nondestructively on site in the stochastic modeling of characteristics to be treated as basic variables in the reliability assessment of existing structures (see Section 2). The concept is demonstrated by means of a case study (Section 3). The investigated prestressed concrete bridge and the structure scanner system mounted to conduct ultrasonic and ground penetrating radar measurements automatically are shown in Figure 1. The use of nondestructive testing (NDT) results is emphasized because inspections are in many cases performed when knowledge about a structure to be assessed is qualitatively or quantitatively insufficient, when doubts have arisen about the available information, or, for example, when visible damage becomes apparent. Additionally, in terms of bridge assessment, traffic loads are continually increasing, and changing climatic actions trigger material degradation. Thus, further damage (due to testing) should be avoided as far as possible. The utility of NDT in reliability assessment should be quantified, and the potential of the technical developments of the past decades should be leveraged to establish NDT as a reliable and valuable source of information for reliability assessments.
Compared to the rather scientific case study of a box girder bridge presented in Küttenbaum et al. 9,10 which has been verified in the ultimate limit states shear and bending using measured geometrical quantities, this paper deals with decompression, that is, a serviceability limit state that frequently appears decisive for prestressed concrete bridge assessment in practice. Another improvement is the calculation of measurement uncertainties attributed to the nondestructively measured mounting depths of tendons in relation to the measuring surface.

2 | CONCEPT AND SIGNIFICANCE OF MEASUREMENT UNCERTAINTY IN STOCHASTIC MODELING
The concept for the reliability assessment of existing structures using measured data is outlined in Figure 2. The strategy consists of four steps, on which the structure of the case study in Section 3 is based. The definition of the limit state(s) and the modeling of the initial basic variables (initial, as they are based on the information available prior to testing) serve as the starting point. Based on this, the preliminary investigations are performed and analyzed. This involves an extended, distribution parameter-specific sensitivity analysis. The result of the first step (Section 3.1) is the reliability-based, that is, individually purposeful definition of crucial basic variables to be measured since they significantly influence reliability. In addition, requirements on the measurements, such as a maximum permissible measurement uncertainty or limits of structural properties, can be derived from the preliminary reliability analysis. The provision of evidence that the application of a specific measurement procedure meets such specified requirements can be referred to as validation 12 and is demonstrated in Section 3.5.
The inspections to be performed to measure the quantity of interest (measurand) defined in the first step with the specified accuracy are planned, conducted, and analyzed in the second step. An essential component of the measurement evaluation is the measurement uncertainty calculation, which is discussed in more detail below and is based on the internationally harmonized and accepted framework of the Guide to the Expression of Uncertainty in Measurement (GUM). [13][14][15] The objective is to compute a measurement result consisting of a (representative) measured value and an uncertainty attributed to this value (Section 3.2). With regard to the ultrasonic and ground penetrating radar (GPR) inspections emphasized in this paper, it should be noted that quantifying the accuracy in locating construction elements inside the concrete, such as reinforcement or tendons, presupposes that the objects of interest have been reliably and objectively detected. The development of probability of detection (POD) curves can yield valuable conclusions in this context. [16][17][18] POD is beyond the scope of the present article.
In a third step, the NDT-supported basic variable is modeled using the measurement result(s). Principal challenges in stochastic modeling, such as the choice of a suitable distribution family, the tail-sensitivity problem, competing models, statistical uncertainties, and correlation, have to be addressed. Furthermore, a consistent interface between metrology and reliability analysis is needed. How can measured values and measurement uncertainties be linked to the distribution parameters of the basic variables? Which types of uncertainty have to be covered in addition to the measurement uncertainty? How can the comparability of the measurement data-based basic variables be ensured? The associated considerations can be found in this section and in Section 3.3. The measurement data-based basic variable is then incorporated into the reliability analysis instead of the corresponding initial stochastic model (fourth step acc. to Figure 2, Section 3.4). The assessment of an existing structure using measured data can in turn be the starting point for the definition of further measurands. The First Order Reliability Method (FORM) is applied both in the preliminary investigations and in the reliability analysis using measured data.
Stochastic modeling is considered a main issue in reliability assessment. The standardization of a measurement data-based stochastic modeling procedure appears necessary in order to provide the basis for a consistent and homogeneous modeling and decision-making process incorporating information measured on site. Up to this point, measurement uncertainty has not been decisively integrated into the probabilistic modeling recommendations.
From the metrological point of view, a measured value to which no measurement uncertainty has been assigned is useless. The calculation of measurement uncertainty serves to establish confidence in measurement, to ensure the comparability of measurement results, and to express the quality, that is, trueness and precision, of the information measured about a characteristic. In the context of modeling basic variables to be used in assessment, two central requirements on stochastic models can be met by adequate measurement uncertainty considerations: verifiability and comparability. Moreover, a measurement result is required to be unambiguously expressed and transparently documented. Thus, objectivity is assured in the sense that the calculated results as well as the models, input quantities, and assumptions underlying the measurement uncertainty considerations can be scrutinized and contested.
FIGURE 2 Concept for the reliability assessment of existing structures using measurement data; extracted from Küttenbaum 11, translated
With the Guide to the Expression of Uncertainty in Measurement, 13 its supplements, and further recommendations, such as those recently given in Joint Committee for Guides in Metrology, 19 metrology provides an internationally harmonized, flexibly applicable, and broadly accepted framework for measurement uncertainty calculation. The metrological terms are defined in the international vocabulary of metrology (VIM). 12 In principle, a model of the measurement has to be formulated, which consists of different input quantities that influence the outcome of the measurement or are necessary for calculating the measurement result. These (in most cases random) variables can be mathematically related to each other in the form of an explicit model equation. Inserting the best estimates of the input quantities into the model equation leads to the measured quantity value (of the measurand). The application of the error propagation law to the model equation yields the measurement uncertainty. The concept is discussed in more detail and applied to the specific case study in Section 3.2.
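For orientation, the two operations named above can be written compactly for the special case of uncorrelated input quantities. The notation is an assumption borrowed from the GUM rather than taken from this paper: $\hat{x}_i$ are the best estimates of the input quantities, $u(\hat{x}_i)$ their standard uncertainties, and $u_c(\hat{y})$ is the combined standard uncertainty of the measured value $\hat{y}$. The expression is a first-order approximation; correlated inputs would require additional covariance terms.

```latex
\hat{y} = f(\hat{x}_1, \ldots, \hat{x}_N), \qquad
u_c^2(\hat{y}) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^{2} u^2(\hat{x}_i)
```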
The calculation (and consideration) of measurement uncertainty should become an integral part of measured data-based stochastic modeling of basic variables. On the one hand, probabilistic models are required to cover all types of uncertainty relevant for the assessment. 20 In general, coverage of different types of uncertainties in basic variables may be necessary. These include aleatoric uncertainty, that is, the inherent natural variability of the characteristic, 21 and epistemic uncertainty. Their differentiation is not necessarily straightforward. However, model uncertainties, measurement uncertainties, as well as statistical uncertainties may be characterized as epistemic. 22 More detailed information on uncertainties that may have to be covered can be found, for example, in Kiureghian and Ditlevsen, 23 Kiureghian, 24 and Faber. 25 On the other hand, it has been found that the measurement uncertainty contributes significantly to the uncertainty to be represented in stochastic models of measured characteristics, at least in NDT on concrete with ultrasonic and GPR methods. 11 Even though the statistical uncertainty can take on significant values, 26 the number of observations in NDT is large in various cases. Thus, the statistical uncertainty may be considered negligible. This finding is consistent with the metrological view that statistical uncertainty is commonly insignificant. Further, in relation to the other uncertainty contributions captured in the model of a measurement, the definitional uncertainty arising from the limited level of detail of the measurand definition (which, as a type of modeling uncertainty, corresponds to the lower limit of measurement uncertainty) is considered negligible according to the GUM framework. 12
In conclusion, it should be mentioned that a good, or rather useful, measured data-based probabilistic model should cover the uncertainty associated with information acquisition and processing in addition to the uncertainty quantifying the inherent natural variability of the considered characteristic. The measurement uncertainty describes the limits of an interval containing the (generally unknown) true value of the measurand with a certain probability, and is epistemic, provided that an alternative exists to obtain the information (different testing methods, etc.). A stochastic model that has been created based on observations on site and that does not cover the uncertainty to be attributed to the information acquisition and processing appears to be just as useless as a measured value to which no measurement uncertainty has been attributed.
The reliability analyses in the present research work were performed using the First Order Reliability Method (FORM) proposed in Hasofer and Lind 27 and refined by, among others, Rackwitz and Fiessler 28 and Hohenbichler and Rackwitz. 29 The requirements for the application of this approximation method and information about the transformation between the original (x-)space and the standard (u-)space can be taken from Spaethe, 30 Rackwitz and Zilch, 31 Hohenbichler and Rackwitz, 29 and Der Kiureghian and Liu. 32 The procedure for probabilistic reliability analyses of cross-sections can be found, for example, in Faber. 22,33,34 The right-hand term in Equation (1) describes the approximate solution for the probability of failure $P_f$ according to FORM, where $g$ denotes the limit state function and $\Phi$ the standard normal distribution function.
$$P_f = \int_{g(\mathbf{x}) \le 0} f_{\mathbf{X}}(\mathbf{x}) \, \mathrm{d}\mathbf{x} \approx \Phi(-\beta) \qquad (1)$$
The solution is based on the search for the value of the (geometrical) reliability index $\beta$. Since the joint probability density function $f_{\mathbf{X}}(\mathbf{x})$ of the random vector $\mathbf{X}$ (and also the limit state function) cannot be known exactly in practice, the measures of structural reliability should be considered as estimators whose values depend on the accuracy of the parameters incorporated into the reliability analysis. Roughly speaking, both the result and the validity (in terms of trueness and precision) of a reliability analysis depend on the quantity and quality of the included relevant information about the analyzed system. Methods for calculating a predictive reliability index that incorporates the uncertainties attributed to the parameters of stochastic or physical models are presented in Der Kiureghian, 35 where the inclusion of additional information has been found to be more likely to increase than to decrease the value of the predictive reliability index. Furthermore, the uncertainties associated with the estimated values of $P_f$ or $\beta$ can be reduced by reducing the uncertainties in the parameters of the input quantities, 35 that is, also by incorporating relevant and accurate measurement results. The values reported in the present paper quantify the reliability index according to Hasofer and Lind 27:
$$\beta = \lVert \mathbf{u}^* \rVert = \min_{g(\mathbf{u}) \le 0} \lVert \mathbf{u} \rVert \qquad (2)$$
In Equation (2), $\mathbf{u}^*$ is the most likely failure point ($\beta$-point) and $\lVert \mathbf{u} \rVert$ the corresponding Euclidean norm. Search algorithms developed to determine this point can be found, among others, in 28,36. The sensitivity coefficients and the elasticities discussed in Sections 3.1 and 3.4 have been calculated computer-aided. 37 The sensitivity coefficients allow conclusions to be drawn about the stochastic significance of the considered basic variables. Further information can be found, for example, in Rackwitz and Zilch, 31 Hohenbichler and Rackwitz, 38 and Ditlevsen and Madsen. 2 The elasticities facilitate distribution parameter-specific conclusions.
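The search for the $\beta$-point can be illustrated with a minimal sketch of the Hasofer-Lind/Rackwitz-Fiessler iteration in standard normal space. This is not the software used in the paper: the function names `form_hlrf` and `std_normal_cdf`, the linear limit state, and all numbers are hypothetical and chosen only so that the result can be checked by hand. The sketch also assumes that the origin of the u-space lies in the safe domain.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def form_hlrf(g, grad_g, n_vars, tol=1e-10, max_iter=100):
    """Iterate u_new = [(grad . u - g(u)) / |grad|^2] * grad until the step is small.

    g and grad_g are the limit state function and its gradient in standard
    normal (u-)space. Returns the reliability index and the beta-point.
    """
    u = [0.0] * n_vars
    for _ in range(max_iter):
        g_val = g(u)
        grad = grad_g(u)
        grad_norm_sq = sum(c * c for c in grad)
        scale = (sum(c * ui for c, ui in zip(grad, u)) - g_val) / grad_norm_sq
        u_new = [scale * c for c in grad]
        step = math.sqrt(sum((a - b) ** 2 for a, b in zip(u_new, u)))
        u = u_new
        if step < tol:
            break
    beta = math.sqrt(sum(ui * ui for ui in u))
    return beta, u


# Hypothetical example: g = R - S with independent normal variables
# R ~ N(25, 3) and S ~ N(10, 4), written in u-space via x = mu + sigma * u.
def g_example(u):
    return (25.0 + 3.0 * u[0]) - (10.0 + 4.0 * u[1])


def grad_g_example(u):
    return [3.0, -4.0]


beta, u_star = form_hlrf(g_example, grad_g_example, n_vars=2)
pf = std_normal_cdf(-beta)  # FORM approximation of the failure probability
print(beta, u_star, pf)
```

For a linear limit state with normally distributed variables the FORM result is exact, which makes this example convenient for checking an implementation: the closed-form value is $\beta = (25 - 10)/\sqrt{3^2 + 4^2} = 3$.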
European guidelines that mention the use of probabilistic methods in assessment include, but are not limited to, 39 the German assessment guideline 40 with its supplements, 41,42 and the Austrian, 43 the Swiss, 44 and the Danish 45 sets of regulations. An example of a level four assessment according to the German guideline including a probabilistic assessment can be found in Morgen et al. 46

3 | PROBABILISTIC ASSESSMENT OF A PRESTRESSED CONCRETE BRIDGE USING NDT-RESULTS

3.1 | Bridge, limit state, initial stochastic models, and pre-investigation
The investigated bridge is a longitudinally and transversely prestressed concrete structure with four spans. It is located in Northern Germany and carries a four-lane federal highway over a park. The slab-and-beam cross-section with its two longitudinally haunched main girders is broader than 23 m and widens to the west towards the adjacent junction. In relation to the gauge of the bridge and the height of the beams (approx. 1.20 m up to 1.60 m in the pier area), the cross-section was constructed comparatively flat. The slab height is reported to be less than 50 cm in most areas. The length of the bridge is 95.80 m. The views and cross-section of the bridge shown in Figure 1 can be found in Figure 3. The structure was built in 1980.
During the assessment, the serviceability limit state (SLS) decompression was found to be decisive in transverse direction. The decompression proof serves in a broader sense to ensure the durability of the structure. The main objective is to protect the tendons against corrosion 47 and stress corrosion cracking, respectively, by excluding concrete cracking due to tensile stresses in a certain area around the tendons, at least mathematically. It should be noted that the decompression proof is occasionally performed very precisely in design for economic reasons, since in practice the calculation results often determine the number of tendons to be installed.
Initially, the semi-probabilistic assessment in SLS decompression was attempted using a girder grillage model. Based on this, the proof could not be successfully performed in transverse bridge direction. For this reason, a three-dimensional finite element (FE) model consisting of shell elements was developed (see Figure 4a,b). The main advantage of the shell model, that is, that the areal load-bearing behavior is accounted for, yields lower values for the internal forces in transverse direction compared to the grillage model. The semi-probabilistically determined tensile stresses are plotted in Figure 4c for the decisive cross-section in the bridge center within the representative 1-m-strip on which the assessment is concentrated. Due to inconsistencies in the information available prior to any inspections, it could not be decided with sufficient certainty whether the transverse tendons are located above or below the vertical center of the cross-section. The tensile stresses calculated on the basis of the two conceivable model variants differ noticeably according to Figure 4c (cf. stress flows for options 1 and 2). Although the first variant results in tensile stresses occurring on the upper slab surface, the position of the transverse tendons below the cross-section center implies that the simplified decompression proof would have to be performed on the slab undersurface, where no tensile stresses have been identified. In the second model variant, the tendon is located above the center of the cross-section. Tensile stresses do not occur in this case (option 2 in Figure 4c). In order to evaluate the validity of the competing prior information about the tendons and to validate the results, the vertical position of the transverse tendons was to be measured nondestructively in crucial cross-sections.
The shell model shown in Figure 4a,b was used to calculate the internal forces for the probabilistic assessment in SLS decompression linear-elastically assuming the tendon position according to option 2 in Figure 4c.
The calculated characteristic values of the internal forces and moments were converted into probabilistic models using common approaches. Since the shell model was developed to perform the assessment according to level 2 of the German assessment guideline (see Section 2), the loading assumptions provided in Eurocode 1 49,73,75 are considered. The traffic loads are represented using load model 1 (LM 1). With respect to a return period of 1000 years, the quantile values correspond to 99.9% fractiles. This reference was considered too conservative in Germany, so that the adjustment factors were modified and the tandem load acting on the third lane was omitted in the national application document. 50 For this reason, the LM 1 considered in the present assessment corresponds to a return period of 50 years and yields approx. 98% quantile values. The associated reference period has been implicitly modeled using extreme value distributions that stochastically represent the internal forces due to traffic.
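The relation between the return periods and the fractiles quoted above can be checked with a short sketch. It assumes the common convention that a return period of $T$ years corresponds to an annual exceedance probability of $1/T$; the function name `annual_non_exceedance` is hypothetical.

```python
def annual_non_exceedance(return_period_years):
    """Annual non-exceedance probability p = 1 - 1/T for a return period T in years."""
    return 1.0 - 1.0 / return_period_years


# Return periods mentioned in the text
p_1000 = annual_non_exceedance(1000.0)  # Eurocode reference for LM 1
p_50 = annual_non_exceedance(50.0)      # adjusted national reference
print(f"T = 1000 a: {p_1000:.3%} fractile")
print(f"T =   50 a: {p_50:.3%} fractile")
```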
In this case study, the limit state function is developed following the design equations given in Eurocode 2. [51][52][53] The standardized equations provide the basis for the probabilistic reliability assessment with regard to single cross-sections. The stress analysis is performed time-invariantly. Partial safety factors were not intended to be modified on the basis of the conducted probabilistic calculations. Creep and shrinkage are considered finished ($t \to \infty$). The limit state function is given in Equation (3), where $N$ and $M$ are the sums of the normal forces and of the bending moments calculated using the FE shell model, $A$ is the cross-section area, $W$ the section modulus, $z_p$ is the lever arm between the vertical center of the investigated cross-section and the tendon axis, $d_{Sp,y}$ is the spacing between the bottom of the slab and the tendon duct, $\epsilon$ the eccentricity of the strands inside the duct, and $h$ is the height of the cross-section. Both the detailed descriptions of the quantities used in Equation (3) and the initial stochastic models are given in Table 1. Their modeling is based on the information available prior to any measurements on site.
FIGURE 3 Views on and standard cross-section of the investigated bridge (dimensions stated in meters); extracted from Küttenbaum 11, translated
The results of the preliminary investigations based on Equation (3) and on the models provided in Table 1 are plotted in Figure 5. The reliability index is $\beta \approx 5.2$ (the FORM result equals the SORM result), which is significantly larger than the target value $\beta_{target} = 1.5$ (reference period $T = 50$ a) acc. to EN 1990. 59 This finding is consistent with the results of the comparative deterministic analysis, in which the concrete around the tendons was found to be entirely under compressive stresses (cf. Figure 4c, option 2). The computed stress in transverse direction on the upper slab surface was found to be $\sigma_{y,up} = -6.96$ MPa. The probability of concrete tensile stresses occurring at the upper edge of the slab is $P_f \approx 10^{-7}$. The target reliability value chosen in this case was basically defined for new structure design. Approaches for the optimization of target reliability levels considering the expected costs over the numerical lifetime of a structure, with respect to deviating reference periods, and with regard to the consequence classes are presented in Holicky et al. 60 The vertical position of the tendons significantly influences reliability. Both the eccentricity of the strands inside the ducts $\epsilon$ and the distance between the bottoms of the slab and of the tendon ducts $d_{Sp,y}$ can be assigned sensitivity coefficients with comparatively large values, $\alpha_{r,\epsilon} = 0.5$ and $\alpha_{r,dspy} = 0.74$, respectively (cf. initial sensitivity analysis in Figure 5). The elasticity of the mean of $d_{Sp,y}$ is noticeably larger than the corresponding value of $\epsilon$ because the calculated values are related to a 1% change in the considered distribution parameter and the mean value of $d_{Sp,y}$ is larger (cf. Table 1). The crucial internal force is the normal force due to prestressing $N_P$.
The elasticities of the standard deviations $e_{\sigma,i}$ indicate that the reduction of uncertainties represented in the geometric quantities $d_{Sp,y}$ and $\epsilon$ leads to a significant increase in reliability (see elasticities in Figure 5). The functions of the reliability and the failure probability, respectively, against the coefficient of variation $V$ and the mean value $\mu$ of the vertical tendon duct position, that is, $d_{Sp,y}$, plotted at the bottom of Figure 5 are consistent with the findings mentioned above. The parameter study of the mean indicates that the values of $\beta$ and $P_f$ still change significantly even with a larger shift in the tendon position. Such parameter studies, that is, the successive variation of individual distribution parameters, facilitate more global conclusions than the sensitivity analyses based on alpha values. They are feasible at least for normally distributed basic variables since both distribution parameters are independent of each other.
FIGURE 4 View of the finite element shell model; (a) isometric drawing; (b) modelled longitudinal and transversal tendons; (c) computed tensile stresses in transverse direction $\sigma_y$/MPa for the same investigated cross-section based on the competing information available prior to testing; geometrical dimensions stated in cm; background in (c) visualizes the maximum tensile stresses at the upper slab surface regarding option 1; 4a, 4b extracted from (Internal report, 2016) 48; cross-sections and stress flows based on (Thierling, 2020)
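How an elasticity with respect to a distribution parameter can be evaluated is sketched below. The definition used here, $e_p = (\partial\beta/\partial p)\,(p/\beta)$, that is, the relative change in $\beta$ per relative change in the parameter $p$, is an assumption consistent with the "1% change" wording above and is not necessarily the exact definition used in the cited software. The limit state $R - S$ with independent normal variables, the function names, and all numbers are hypothetical.

```python
import math


def beta_linear(mu_r, sigma_r, mu_s, sigma_s):
    """Closed-form reliability index for g = R - S with independent normal R and S."""
    return (mu_r - mu_s) / math.sqrt(sigma_r ** 2 + sigma_s ** 2)


def elasticity(func, params, name, rel_step=1e-6):
    """Elasticity e_p = (d beta / d p) * (p / beta) by central finite differences."""
    p = params[name]
    h = rel_step * p
    upper = dict(params, **{name: p + h})
    lower = dict(params, **{name: p - h})
    dbeta_dp = (func(**upper) - func(**lower)) / (2.0 * h)
    return dbeta_dp * p / func(**params)


# Hypothetical distribution parameters (not taken from Table 1)
params = {"mu_r": 25.0, "sigma_r": 3.0, "mu_s": 10.0, "sigma_s": 4.0}

e_mu_r = elasticity(beta_linear, params, "mu_r")
e_sigma_r = elasticity(beta_linear, params, "sigma_r")
print(e_mu_r, e_sigma_r)
```

In this hypothetical setting, a 1% increase in the standard deviation of $R$ lowers $\beta$ by about 0.36%, whereas a 1% increase in its mean raises $\beta$ by about 1.67%. A larger mean value therefore tends to produce a larger elasticity of the mean, which mirrors the remark above on $d_{Sp,y}$ and $\epsilon$.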
The stochastic significance of the vertical transverse tendon position and the large range of variation in reliability due to a deviation of the tendon position (possibly undetected without testing and perhaps mathematically unfavorable; cf. Figure 5) facilitate the specification of requirements on the measurements based on the results of the preliminary reliability analysis. In this specific case, the objective is to quantify a maximum permissible uncertainty $T_{MPU}$ to be represented in the stochastic model of the measurement-based basic variable $d''_{Sp,y}$. The validation in Section 3.5 consists of the comparison of this upper limit value $T_{MPU}$ with the uncertainty achieved. The quantification of the value of $T_{MPU}$ can be based on flexible criteria, for example, on a minimum value requirement for the numerical reliability after including the measured information. It is evident that validation using this criterion is likely to fail in the case of an adverse bias in the initial stochastic model even when the calculated measurement uncertainty is arbitrarily small. In this paper, two other validation criteria are used to specify the requirements. First, minor errors in the calculation of measurement uncertainty should not have a disproportionate impact on reliability. That is why a robustness criterion (in the sense of stability of the results to small errors in the models of the input quantities) has been defined. In this specific case, a 1% change in the uncertainty covered in $d_{Sp,y}$ should not lead to reliability variations greater than 5%. In principle, this limit value can be defined individually considering the investigated structure and limit state, respectively, and depends on the risk awareness of the assessing engineer.
TABLE 1 Initial stochastic models based on the information available prior to testing; according to 11 (columns: Abbr., Description)
The suitability of the value chosen in this specific case study is to be proven in view of the comparatively high structural reliability in SLS decompression by evaluating a number of other assessment scenarios in subsequent works. Second, it is required that the uncertainty covered in the measurement-based basic variable be smaller than or equal to the initially modeled uncertainty. In this individual case, the robustness criterion has been found to be decisive in determining $T_{MPU}$. Since the gradient of the reliability index against the coefficient of variation of $d_{Sp,y}$ is greater than 5% when $V_{dspy} \ge 2\%$, it follows that $T_{MPU}$ corresponds to $V_{dspy} < 2\%$ (cf. Figure 5, bottom right). It should be noted that such low values of measurement uncertainty are rarely calculated when applying ultrasound or GPR to determine the axial position (relative to the measuring surface) of a single construction element inside the concrete.
FIGURE 5 Results of the individual pre-investigation, comprising the sensitivity coefficients (top), the elasticities of the mean and of the standard deviation of the basic variables, and the functions of reliability against the distribution parameters of the spacing between the bottom of the slab and of the tendon duct (bottom); extracted from Küttenbaum 11, translated

3.2 | Measurements and measurement results
The vertical position of the transverse tendons is measured using both the ground penetrating radar (GPR) and the ultrasonic pulse echo method. Since the measuring surface is spanned on the undersurface of the slab, the quantities $d_{S,i,y}$ (GPR) and $d_{Sp,i,y}$ (ultrasound), which are referred to as sampling points in this paper, describe the distance between the lower edges of the tendons and the concrete undersurface. The time signals recorded in a small area (a few centimeters in both directions) around an analyzed measuring position are evaluated to calculate these sampling points. The reason is that localization first requires the reliable detection of a reflector. For this, in turn, data must be recorded and evaluated at equidistant measuring points around the cross-section that is decisive in the reliability assessment. A sampling point is calculated for the $i$th tendon at location $y$ in transverse bridge direction. It should be noted that, for example, the measured spacing between the transverse tendons can also be incorporated into the FE model and the reliability analysis, respectively. General information on nondestructive testing methods for civil engineering, on the ultrasonic technique, and on GPR on concrete can be found in ACI 228.2R-13, 61 IAEA, 62 and Gucunski et al. 63 The individually performed GPR measurements are described in Küttenbaum et al. 64 and taken up in this paper for comparative purposes. The following discussion focuses on the ultrasonic measurements as an example. The measurement models used to derive the individual measurement uncertainties were developed in Küttenbaum, 11 where detailed information about the individual testing on site, further measurement models suitable to provide orientation for future and comparable measurement scenarios, and a comprehensive discussion of the calculations can also be found.
The measurements were performed at a center frequency of $f = 55$ kHz. The sampling rate is $f_s = 1$ MHz and the measuring point distance is two centimeters in both lateral directions. Commercially available bistatic array transducers, each consisting of 12 parallel-connected transmitting and receiving dry point contact probes, 65 and structural scanners developed at BAM (see Figure 1, right) were applied. The imaging of the data measured over half the cross-section width is shown in Figure 6, including indications of four transverse tendons inside the slab and various longitudinal tendons inside the main girder.
In the following, it will be shown how a (quantitative) measurement result, whose quality is evaluated and whose comparability is ensured, can be derived from such (qualitatively) imaged, nondestructively measured findings. For this purpose, the concept of calculating measurement uncertainties according to GUM 13 will first be briefly outlined.
The objective is to stochastically model the measurand $Y$ by computing the measurement result. One part is the calculation of the best estimate of the measurand $\hat{y}$ representing the measurement result (measured quantity value). Because knowledge is inevitably incomplete, this value is generally considered an approximation of the purely theoretical true value of the investigated characteristic. Thus, there is basically an uncertainty associated with the measured value $\hat{y}$, which we can refer to as measurement uncertainty. By definition, the measurement uncertainty quantifies the dispersion of the values assigned to the measurand based on the incorporated information. 12 The key part of the GUM framework, and the prerequisite for the calculation of the measurement result consisting of the measured value $\hat{y}$ and the attributed measurement uncertainty, is the modeling of the measurement. Since a variety of components may contribute appreciably to the measurement uncertainty, the measurement model is composed of a number of input quantities. These quantities are usually treated as random variables and characterized by certain probability distribution functions. 66 The input quantities can be denoted by $X_i$. The functional relationship of these input quantities $X_i$ can be formulated in the form of an explicit model equation:

$$Y = f(X_1, X_2, \ldots, X_N) \qquad (4)$$

The GUM provides two types of evaluation for the quantification, that is, the stochastic modeling, of the identified and relevant input quantities $X_i$. The evaluation of measurement series using statistical methods is termed Type A evaluation and presupposes that the included observations are independent and identically distributed (iid). This requirement can be at least approximately met for ultrasonic and GPR measurements by considering time signals recorded in a sufficiently small area around the measuring point of interest (sampling point). The Type B evaluation of the input quantity is based on nonstatistical methods. 
Scientific judgments are permissible, which may be founded on subjective information. Accordingly, knowledge available prior to testing can be processed and the requirement formulated in ISO 2394 20 that the incorporation of subjective information in uncertainty quantification shall be feasible is fulfilled. Regarding the choice of a distribution type in Type B evaluation, reference to the principle of maximum entropy 67 may be useful. Especially if the number of observations is limited, the application of statistical methods may lead to less reliable results compared to Type B evaluation. Overall, both evaluation types A and B count as equal.
Regarding Type A evaluation, the sample mean $\bar{x}$ is considered the best estimate $\hat{x}$ of a (directly measurable) input quantity in many cases, provided that systematic measurement errors b have been corrected.
FIGURE 6 Imaging of the ultrasonic measurement data with indications of four transverse tendon ducts inside the slab, of various longitudinal tendon ducts inside the beam, and of the upper concrete edge; extracted from Küttenbaum 11 , translated

The standard measurement uncertainty $u(\hat{x})$ is attributed to the best estimate $\hat{x}$ of an input quantity. It can be interpreted as the standard deviation of the mean $\sigma_{\bar{X}}$ and is calculated by dividing the sample standard deviation $S$ by the square root of the number $n$ of independently observed measured values:

$$u(\hat{x}) = \sigma_{\bar{X}} = \frac{S}{\sqrt{n}} \qquad (6)$$

A standard deviation of a parameter generally expresses the expected uncertainty in the estimate of that parameter. 2 Thus, the standard measurement uncertainty in Equation (6) may be taken as a measure of how well the mean of the observed values approximates the expected value of a (normally distributed) measurand. 13 It characterizes the dispersion of an estimator 68 or, more specifically, the accuracy of the best estimate of the measurand. The standard deviation of the mean $\sigma_{\bar{X}}$ accounts for the convergence of the mean toward a theoretically exact value. It can be interpreted in such a way that the true value falls into an interval $(\bar{x} - \sigma_{\bar{X}}, \bar{x} + \sigma_{\bar{X}})$ at a level of confidence of, for example, 68%. Thus, $\sigma_{\bar{X}}$ characterizes the scattering behavior of the characteristic of interest, that is, of the directly measurable quantity. The "more common" standard deviation $\sigma_X$, on the other hand, describes the dispersion of observations (in the case of a normal distribution, around the mean) and can be interpreted such that, for example, approximately 68 values out of 100 future individual observations will be included in an interval $(\bar{x} - \sigma_X, \bar{x} + \sigma_X)$. Consequently, future individual observations are predicted. Basically, when modeling an input quantity, the objective is not to predict future observations, but to describe the quantity to be measured, that is, a characteristic. 
This is also the purpose in modeling basic variables. A distribution characterized by $\sigma_{\bar{X}}$ facilitates the prediction of which values the characteristic to be measured will take at a given level of confidence based on the incorporated information, provided the characteristic relates to the mean. Thus, the choice of the standard deviation of the mean is consistent with the purpose of this paper.
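The distinction between the sample standard deviation and the standard deviation of the mean can be illustrated with a minimal Python sketch of a Type A evaluation. The function name `type_a_evaluation` and the nine readings are hypothetical and chosen only for illustration; they are not data from the case study.

```python
import math
import statistics


def type_a_evaluation(observations):
    """Return (best estimate, sample standard deviation, standard uncertainty).

    Assumes independent, identically distributed observations that have
    already been corrected for systematic errors.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are required")
    best_estimate = statistics.fmean(observations)   # sample mean
    sample_std = statistics.stdev(observations)      # dispersion of single observations
    standard_uncertainty = sample_std / math.sqrt(n)  # standard deviation of the mean
    return best_estimate, sample_std, standard_uncertainty


# Hypothetical readings (arbitrary unit), for illustration only.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0]
x_hat, s, u_x = type_a_evaluation(readings)
print(f"best estimate = {x_hat:.3f}, s = {s:.4f}, u = {u_x:.4f}")
```

With nine observations the standard uncertainty is one third of the sample standard deviation, which shows how the uncertainty of the characteristic shrinks with the number of observations while the scatter of individual readings does not.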
The choice of a normal distribution for Type A evaluated input quantities can be justified by the central limit theorem. In this specific case, the number of observations is comparatively large, since NDT was applied. In other cases, it is conceivable that the t-distribution is better suited to describe a directly measured quantity.
The best estimate of the measurand, that is, the measured quantity value $\hat{y}$, is calculated by inserting the best estimates of the input quantities $\hat{x}_i$ into the model function expressed explicitly in Equation (4).
Finally, the error propagation law is applied to the model equation to derive the combined standard measurement uncertainty $u(\hat{y})$:

$$u(\hat{y}) = \sqrt{\sum_{i=1}^{N} c_i^2\, u^2(\hat{x}_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} c_i\, c_j\, u(\hat{x}_i, \hat{x}_j)} \qquad (8)$$

In Equation (8), the empirical covariance of two input quantities is denoted by $u(\hat{x}_i, \hat{x}_j)$, and the sensitivity coefficient associated with the input quantity $X_i$ by $c_i$. These coefficients correspond to the slope of the linearized model equation at the operating point and are calculated from the partial derivatives of the model equation with respect to the individual input quantities at the coordinates of the best estimates $\hat{x}_i$.
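The propagation step can be sketched numerically as follows. This is a minimal illustration, not the implementation used in the case study: the function `combined_standard_uncertainty`, the simplified depth model `depth = velocity * time / 2`, and all numerical values are assumptions made for this example. The sensitivity coefficients are approximated by central differences instead of analytical partial derivatives.

```python
import math


def combined_standard_uncertainty(model, estimates, uncertainties,
                                  covariances=None, rel_step=1e-6):
    """Linearized uncertainty propagation for an explicit model y = f(x_1..x_N).

    covariances: optional dict {(i, j): u(x_i, x_j)} with i < j.
    Returns (combined standard uncertainty, list of sensitivity coefficients).
    """
    coeffs = []
    for i, x_i in enumerate(estimates):
        h = rel_step * max(abs(x_i), 1.0)
        upper = list(estimates)
        lower = list(estimates)
        upper[i] += h
        lower[i] -= h
        # Central difference approximates the partial derivative at the best estimates.
        coeffs.append((model(*upper) - model(*lower)) / (2.0 * h))
    variance = sum((c * u) ** 2 for c, u in zip(coeffs, uncertainties))
    for (i, j), cov in (covariances or {}).items():
        variance += 2.0 * coeffs[i] * coeffs[j] * cov
    return math.sqrt(variance), coeffs


# Hypothetical, simplified echo model (NOT the paper's full model equation):
# depth = velocity * travel_time / 2
def depth(velocity, travel_time):
    return velocity * travel_time / 2.0


# Assumed values for illustration: 2700 m/s +/- 30 m/s, 100 us +/- 1 us.
u_depth, c = combined_standard_uncertainty(depth, [2700.0, 100e-6], [30.0, 1e-6])
print(f"d = {depth(2700.0, 100e-6) * 1e3:.1f} mm, u(d) = {u_depth * 1e3:.2f} mm")
```

Passing a `covariances` dictionary adds the mixed terms for correlated input quantities; leaving it out corresponds to treating all inputs as uncorrelated.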
The combined standard uncertainty $u(\hat{y})$ expresses the measurement uncertainty as an estimated standard deviation of the measured quantity value $\hat{y}$. In metrology, the central limit theorem is often cited as a justification for the choice of the normal distribution as representation of $Y$. As provided for within the GUM framework 14 , additional Monte Carlo simulation results were used in the present case study to verify this choice. The expanded measurement uncertainty is not introduced in this paper. Further information can be found in Joint Committee for Guides in Metrology. 13 The individual model function applied for calculating the vertical position of a transverse tendon using the ultrasonic echo technique is given in Equation (9). The measurand, that is, the vertical position $d_{Sp,i,y}$ of the ith tendon in direction of the transverse bridge axis y, is modeled as a function of the travel time $T$ of the pulse and of the propagation velocity $C_T$ of the elastic wave inside the measuring object. The symbols used in Equation (9) are explained in Table 2. The underlying evaluation types and the developed stochastic models representing the input quantities can be found in Table 2 as well. The relevance of the contributing uncertainty components is shown for both the individual ultrasonic and the GPR measurements in Figure 7.
The formulation of a stochastic model representing an input quantity is illustrated subsequently using one example each for the Type A and the Type B evaluation. The aim of a time-of-flight measurement is to determine the time span needed for a pulse to travel a certain distance within the measuring object. A recorded time signal additionally contains (at least partly) the time span required to generate, transmit, and sample the signal, the so-called lead time $T_V$. The systematic error due to the recorded lead time was estimated and corrected based on a Type A evaluation, that is, laboratory measurements. Areal measurements were carried out on reinforced concrete specimens whose properties are representative of the investigated bridge. The idea was to estimate the lead time on the basis of the time marks of the backwall echo $T_{ME1}$ and the time marks of the multiple reflection of the backwall $T_{ME2}$. Bandpass filtered raw data were evaluated and the maxima of the envelope (ME) according to Hilbert were picked. The measuring series $T_{ME1} \sim N(337.38\ \mu s;\ 0.119\ \mu s)$ and $T_{ME2} \sim N(646.93\ \mu s;\ 0.221\ \mu s)$ were derived from $n_1 = 1938$ and $n_2 = 1392$ observations, respectively. Calculating the difference $T_V = -(T_{ME2} - 2\,T_{ME1})$ yields the best estimate $\hat{t}_V = 27.83\ \mu s$, and applying Equation (8) yields the standard uncertainty $u(\hat{t}_V) = 0.33\ \mu s$, assuming $T_V \sim N$. In this way, the lead time is modeled using statistical methods (Type A) for the individually used equipment and considered material.
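The lead time calculation can be retraced with a few lines of Python. The sketch assumes that the second parameters of the two normal distributions stated above are standard uncertainties and that the two measuring series are uncorrelated; both are assumptions of this example. Under these assumptions the propagation yields roughly 0.32 μs, slightly below the reported 0.33 μs. The remaining difference may stem from covariance terms or rounding, which this sketch does not resolve.

```python
import math

# Values of the two measuring series as stated in the text (microseconds).
t_me1, u_me1 = 337.38, 0.119   # backwall echo
t_me2, u_me2 = 646.93, 0.221   # multiple reflection of the backwall


def lead_time(t1, t2):
    """Lead time T_V = -(T_ME2 - 2 * T_ME1)."""
    return -(t2 - 2.0 * t1)


t_v = lead_time(t_me1, t_me2)

# Sensitivity coefficients of the linear model: dT_V/dT_ME1 = 2, dT_V/dT_ME2 = -1.
c1, c2 = 2.0, -1.0

# Assumption: uncorrelated inputs, so the covariance term is dropped.
u_t_v = math.sqrt((c1 * u_me1) ** 2 + (c2 * u_me2) ** 2)

print(f"t_V = {t_v:.2f} us, u(t_V) = {u_t_v:.3f} us")
```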
Another input quantity can be traced back to the circumstance that the spring-mounted probe is pressed onto the concrete surface during ultrasonic testing. In principle, the measuring surface is considered as a flat reference to specify a perpendicular depth position. Although surface irregularities might be recorded in the measuring series $T_{(A)}$, the indicated depth positions of the reflectors would shift when incorrectly assuming a flat reference surface. In the present case, modeling the imperfections of the measuring surface $D_{Sp,U}$ based on standardized tolerances as limit values seems too conservative, since no irregularity has been visually observed on site. Instead, a deviation compared with an ideal reference surface of $\Delta D = \pm 5$ mm is estimated. Since only two boundary values can be derived from this estimation, a uniform distribution with $E(D_{Sp,U}) = 0$ cm and $u(\hat{d}_{Sp,U}) = 2\Delta D / (2\sqrt{3}) \approx 0.29$ cm is chosen based on the principle of maximum entropy. If such a model, determined simply via Type B evaluation, is insufficient for the individual purpose, it can generally be refined by, for example, measuring the irregularities on site.
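The Type B model of the surface imperfection reduces to the standard deviation of a rectangular distribution between two bounds. A minimal sketch, in which the helper name `rectangular_standard_uncertainty` is a hypothetical choice for this example:

```python
import math


def rectangular_standard_uncertainty(lower, upper):
    """Expected value and standard uncertainty of a uniform distribution on [lower, upper]."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    expected_value = (lower + upper) / 2.0
    standard_uncertainty = (upper - lower) / (2.0 * math.sqrt(3.0))
    return expected_value, standard_uncertainty


# Bounds of +/- 5 mm, expressed in centimeters as in the text.
mean_cm, u_cm = rectangular_standard_uncertainty(-0.5, 0.5)
print(f"E = {mean_cm:.2f} cm, u = {u_cm:.2f} cm")
```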
TABLE 2 Stochastic models of the input quantities in Equation (9) used to determine the vertical position of a transverse tendon in the decisive cross-section in the center of the bridge 11 (columns: Abbr., Description, Evaluation type). The systematic measurement errors marked with an asterisk were corrected during the reconstruction of the measurement data, which is performed to derive spatially resolved volume information as well as the imaging shown in excerpts in Figure 6, as they influence the quality of the focused indications.

FIGURE 7 Uncertainty balance: sensitivity coefficients attributed to the single uncertainty components

The computation of the individual ultrasonic measurement results is based on the GUM concept outlined above, the input quantities provided in Table 2, and Equation (9). The plot in Figure 8 shows the calculated measured quantity values $\hat{d}_{S,i,y}$ (GPR) and $\hat{d}_{Sp,i,y}$ (ultrasound), each quantifying the vertical position of the lower edge of the ith tendon duct relative to the slab undersurface. The values correspond to sampling points spaced $\Delta y = 50$ cm apart between the center of the cross-section and one of the main girders. The position $y = 0$ cm (cross-section center) is investigated for the subsequent use in reliability assessment (Section 3.4). In this paper, the second tendon from above shown in Figure 6 (areal perspective) is discussed as a representative example. The combined standard uncertainties of the sampling points were determined to be $u(\hat{d}_{Sp,i,y}) = 6$ mm…7 mm depending on the position in y-direction, that is, on the mounting depth.
The ultrasound and GPR results are largely consistent with each other. A significant difference has been found for the tendon position at the center of the cross-section ($y = 0$ cm in Figure 8). In Küttenbaum 11 it is shown that the values measured over a range of 30 cm in y-direction are not covered by the overlap of the coverage intervals spanned vertically around the radar and ultrasonic measurement values. These intervals are assumed to contain the value of the measured characteristic in this specific case with a probability of approx. 95%. A conceivable reason for the difference is the relatively large spacing of locally several centimeters between the GPR antenna and the measuring surface on site, especially in the cross-section center. The bias of the GPR result relative to the values based on the ultrasonic measurements can be traced back to the robustness of ultrasound testing with respect to the "roof-shaped" edge in the cross-section center, as the transducers are applied directly onto the concrete surface. The measured values can be verified by manual GPR measurements because the spacing between antenna and measuring surface then tends to zero. However, without additional knowledge it can only be decided arbitrarily which measured value is to be attributed greater validity. Thus, the GPR result competes with the ultrasound result in the cross-section center. One option for processing the competing models in assessment is to apply the principle of imprecise probabilities 69 as outlined in section 5.4.
The measurement results for the position to be assessed in y-direction can be found in Table 3. Both quantities can be adequately represented using a normal distribution, as verified by the slight difference between the results based on simulation and on the conventional GUM method. Finally, it should be mentioned that the correlations between the Type A evaluated input quantities, estimated by the empirical covariance, have no discernible influence on the values of measurement uncertainty in this particular case.
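The kind of cross-check between the conventional GUM method and a Monte Carlo simulation can be sketched in Python. Because the input models of Equation (9) are not reproduced here, the example uses the lead time model $T_V = -(T_{ME2} - 2\,T_{ME1})$ instead of the tendon position. Uncorrelated, normally distributed inputs, the fixed seed, and the reduced number of runs (far fewer than the $10^7$ runs reported for Table 3) are assumptions of this sketch.

```python
import math
import random
import statistics

random.seed(1)        # fixed seed so the run is reproducible
N_RUNS = 200_000      # assumption: reduced run count for a quick illustration

# Sample both inputs independently and evaluate the model for each draw.
samples = [
    -(random.gauss(646.93, 0.221) - 2.0 * random.gauss(337.38, 0.119))
    for _ in range(N_RUNS)
]

mc_mean = statistics.fmean(samples)
mc_std = math.sqrt(
    math.fsum((s - mc_mean) ** 2 for s in samples) / (N_RUNS - 1)
)

# Conventional (linearized) propagation for comparison.
gum_std = math.sqrt((2.0 * 0.119) ** 2 + 0.221 ** 2)

print(f"Monte Carlo: mean = {mc_mean:.3f} us, std = {mc_std:.4f} us")
print(f"GUM:         std = {gum_std:.4f} us")
```

For a linear model with normal inputs both approaches are expected to agree closely; noticeable differences would indicate nonlinearity or non-normal inputs.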

| NDT-supported basic variables
In order to facilitate the utilization of on-site measurement results in reliability reassessment, two research domains, that is, assessment of structures and metrology, need to be brought together. The starting point for the NDT-supported modeling of basic variables as proposed in Figure 2 is the measurement result expressed according to GUM (cf. Table 3). Although the tabulated results in this case study correspond to the NDT-based models of the basic variables, some general considerations should be made. The GUM provides a universally applicable method whose application yields comparable and revisable results that can guide comparable future […] 13 This is advantageous because realistic values should run through the assessment process and not increasingly conservative values. Additional safeties can still be conclusively captured in decision-making regarding the structural reliability. Another argument in favor of the GUM application is that the evaluation of measuring series with common statistical methods is not expected to lead to workable solutions, 70 since the determined distribution then does not allow any inferences to be drawn about those realizations which have not been observed. In Thoft-Christensen and Baker 70 it is concluded that the reasonable approach is to synthesize the distribution of a random variable (as in the GUM framework) from all available information on uncertainty components. GUM and FORM are not methodically merged, among other reasons because an impracticable number of basic variables in assessment may arise, because the measurement results could no longer be verified intermediately, and because the operating points in linearization of the limit state function and the model equation differ. 

TABLE 3 […] at $y = 0$ cm computed using the common GUM approach acc. to the main document Joint Committee for Guides in Metrology 13 and comparison with Monte Carlo simulation results (M-C-S; $10^7$ runs); results extracted from Küttenbaum 11
The combined standard uncertainty $u(\hat{y})$ corresponds to the square root of the variance of the distribution of the measurand. The expanded uncertainty $U(\hat{y})$, in turn, is an interval estimator and a multiple of $u(\hat{y})$. Computing such intervals may be useful. However, its calculation does not affect the shape of the distribution of the measurand. Thus, both the measurement uncertainty and the inherent variability of the characteristic, which is also captured in the measuring series, are covered by using $u(\hat{y})$ as starting point for modeling the scattering behavior of the measurement-supported basic variable. Moreover, the measured quantity value $\hat{y}$ is suitable to determine the expected value of the basic variable, especially if the assumption of a normal distribution is justified. Since a basic variable should cover all types of uncertainty relevant to describe a characteristic, 20 the additional incorporation of uncertainties related to modeling random variables or physical phenomena, human factors, and competing models, and also statistical uncertainty may be necessary in order to obtain an adequate representation of the characteristic being modeled. 22-24 Fundamental challenges in stochastic modeling for the calculation of very small probabilities such as the quantification of correlation and the tail-sensitivity problem 23 should also be noted. The latter does not affect the present case study, since both the modeling recommendations (and thus the initial stochastic model) and the NDT-based model are represented by normal distributions. Nevertheless, guidance regarding the tails of basic variables, as required, for example, in Ditlevsen and Madsen, 2 and the distribution types, respectively, would be meaningful in order to prevent arbitrary decisions in modeling that may significantly influence reliability.
Another issue is the consideration of prior knowledge. In this specific case, all information available prior to testing has been incorporated into the measurement uncertainty calculation. Further prior knowledge does not have to be processed, because time-invariant quantities are considered, and the sample size is comparatively large (due to the composition of the measurand from a number of uncertainty components and nondestructive testing). Further, the measured data comprehensively describe the characteristic of interest, that is, the vertical tendon position. A different situation may occur with composite measurands such as the center of a tendon bundle. The incorporation of prior knowledge (e.g., using Bayes' theorem) may also be necessary if the information available does not facilitate a reasonable decision on which of many models is best suited to represent a characteristic. Competing models may exist in practice, for example, when two different measuring methods are applied and different measurement results are obtained (as shown in this case study). In this paper, the different variants of the models are processed via the principle of imprecise probabilities. 69 Both NDT-based models are entered successively, and the effects of choosing one out of two apparently equally suitable models are estimated by calculating reliability twice.
Regarding the specific modeling of the ultrasound-based basic variable, it should be added that the measured quantity value, which must be corrected for systematic errors, corresponds to the mean value of the normally distributed basic variable. The standard uncertainty $u(\hat{y})$ covers the inherent variability and the measurement uncertainty as a standard deviation. The statistical uncertainty has been found to be less than 0.1 mm. 11 The additional coverage does not reveal any noticeable impact on structural reliability. In view of the tail-sensitivity problem, an additional justification of the normal distribution (besides the central limit theorem and modeling recommendations 34 for geometrical dimensions) may be based on the finding that the design value of the vertical tendon position in the cross-section center $d^{*}_{Sp,y}$ in original space is enclosed by an interval bounded by three times the standard uncertainty $u(\hat{y})$ around the measured value $\hat{y}$. There are no excessive doubts about the suitability of the distribution of the measurand to describe the characteristic of interest in a certain (physically meaningful) area around the best estimate $\hat{y}$.

| Reliability analyses appreciating the NDT-results
The NDT-based basic variables are listed below. These are successively implemented in the reliability analysis replacing the initial model of d Sp,y given in Table 1.
The effect of incorporating the measurement-based basic variables on reliability is as follows. While the probability of concrete tensile stresses occurring on the upper edge of the slab decreases considerably when the GPR result is included, $P_f$ changes insignificantly when using the ultrasound-based model. The reason for the second finding is that the effects of the smaller mean value compared to the initial model (shift in the direction of the vertical cross-section center, that is, computationally unfavorable away from the upper edge to be verified) and the reduced uncertainty act in opposite directions. However, due to the comparatively large β-values in relation to the target value $\beta_t = 1.5$ defined in Eurocode 0 for SLS, RC2, and $T = 50$ a, 59 the engineer's power of judgment is not unduly restricted by the two competing models. In case of doubt, it is generally possible to include the model with the less favorable effects. Nevertheless, a measurement may reveal that the tendons actually run on the other side of the vertical cross-section center than assumed prior to testing. Such observations are likely to have a substantial impact on reliability since the other (in this case the lower) edge of the slab would have to be verified in SLS decompression.
The computed sensitivity coefficients and both the elasticities of the mean and of the standard deviation after incorporating the NDT-based basic variables are plotted in Figures 9 and 10. First and in contrast to other limit states, the sensitivity attributed to the model uncertainties $\Theta$ is small. Second, the measured data-based variable $d''_{Sp,y}$ remains stochastically significant. Further, it can be derived from the relatively large values of the elasticities of $h$ and $\epsilon$ that the lever arm $z_p$, which is multiplied by the normal force $N_P$ to calculate the moment due to prestressing according to Equation (3) and which is a function of the cross-sectional height $h$, the eccentricity $\epsilon$, and $d_{Sp,y}$, still represents a crucial parameter of the structure. The elasticities are consistent with the sensitivity coefficients in this regard.

| Validation of the NDT-procedures
Finally, the measurement requirement specified in Section 3.1 as maximum permissible uncertainty $T_{MPU}$ is compared with the uncertainties $\tau_{mod}$ covered by the NDT-based models for the individual validation of the applied NDT procedures. The results of comparing the related coefficients of variation $V$ are shown in Figure 11. The decision rule is binary. The validation is successful if $\tau_{mod} < T_{MPU} = 2\%$.
The suitability of the radar and ultrasonic measurement procedure for modeling the vertical tendon position cannot be successfully demonstrated because of the strict robustness criterion (in relation to common uncertainties in nondestructive measurement of mounting depths of construction elements such as reinforcement or tendons). Admittedly, the uncertainty in both NDT-based basic variables could be reduced to 4% and 4.8%, respectively, compared to the initial model ($V = 6.1\%$ corresponding to $T_{O(\text{initial})}$ in Figure 11). However, the value of β changes by more than 5% (cf. Section 3.1) in the area around the achieved uncertainties for a 1% adjustment of the coefficient of variation $V$. The slope of the reliability plotted against $V$ in Figure 11 is approx. 6% regarding ultrasound and 8% regarding GPR. It should be mentioned that the suitability of the arbitrarily chosen value for the maximum permissible gradient is to be proven by evaluating a number of case studies in subsequent research.

FIGURE 9 Sensitivity coefficients (squared values in brackets) prior to incorporating the measurement results (centered bar) and after including the ultrasound (top bar) and GPR results (bottom); extracted from Küttenbaum 11 , plots merged, translated
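The binary decision rule can be written down in a few lines. The sketch below uses the coefficients of variation quoted in the text; the function name `is_validated` and the dictionary labels are hypothetical, and the labels deliberately do not assign the two values to a specific measurement method.

```python
T_MPU = 0.02  # maximum permissible uncertainty as coefficient of variation (2 %)


def is_validated(v_model, v_max=T_MPU):
    """Binary decision rule: the procedure is validated if tau_mod < T_MPU."""
    return v_model < v_max


# Coefficients of variation of the two NDT-based models as quoted in the text.
v_ndt_models = {"NDT-based model (V = 4.0 %)": 0.040,
                "NDT-based model (V = 4.8 %)": 0.048}

results = {name: is_validated(v) for name, v in v_ndt_models.items()}
for name, ok in results.items():
    print(f"{name}: {'validated' if ok else 'not validated'}")
```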
The robustness criterion arises from the demand that computed probabilities should be insensitive to small changes in the functions of the underlying probabilistic input models. This characteristic is referred to as robustness in this paper as well as in Ditlevsen. 71 Alternative definitions are given in Baker et al. 72 Nevertheless, the implementation of the measured information increases the engineer's power of judgment regarding the decision on structural reliability since the measurement-based models are more robust than the initial model and since crucial structure parameters should be mandatorily verified by measurements. The measurement capability index (see Figure 11) is used to compare the suitability of different NDT procedures to solve certain testing tasks and takes values of $C_m = 0.11 \ldots 0.125$ for $T_{MPU} = 2\%$. Accordingly, the suitability of the two measurement procedures considered in this case study differs only insignificantly.

FIGURE 10 Elasticities of the mean (black bars) and of the standard deviation (grey bars) after incorporating a) the ultrasound and b) the GPR measurement results (unfilled bars: initial result; filled bars: NDT-supported result; changes referring to pre-investigations in brackets); extracted from Küttenbaum 11 , translated

FIGURE 11 Change in reliability after incorporating the ultrasound (US) and ground penetrating radar (GPR)-based model of the vertical transverse tendon position and validation of the applied NDT procedures by comparing the uncertainties covered in the measurement-based stochastic models $\tau_{mod}$ and the maximum permissible uncertainty $T_{MPU}$; extracted from Küttenbaum, 11 translated

| CONCLUSION
The sole basis for decisions is the information available about the considered system. 74 This article presents a method for preparing measured data, purposefully collected on site with nondestructive testing methods, for explicit use as basic variables in the reliability assessment of existing concrete structures. The approach was demonstrated using a case study in Section 3 in which ultrasonic and GPR data were measured in order to verify a prestressed concrete bridge regarding the serviceability limit state decompression. Even though the initially calculated reliability in the SLS was comparatively high, it should be mentioned that the authors applied the measurement procedures in order to validate key assumptions, minimize biases in the stochastic model of the tendon position, and reduce the uncertainties to be covered in assessment. The costs associated with the GPR measurements in particular are moderate. The additionally conducted ultrasonic inspections are not necessarily required but were performed for comparison purposes. The total time spent on site amounted to 1 week, owing to the limitation of the measuring areas to the critical cross sections on the basis of the stresses obtained from the FE analysis. A considerable benefit of the testing already lies in finding out on which side of the cross-section center the tendons were mounted, from which it can be deduced whether the simplified decompression proof has to be performed at the top or bottom of the slab. The presented results and practical experience give reason to expect that displacements of the actual tendon position compared to the initially assumed position will also have significant effects on the reliability in SLS decompression in the assessment of other prestressed concrete structures. The suitability of the applied measurement methods was evaluated individually using a validation approach that is based on a specified maximum permissible uncertainty. 
The objective of the (ongoing) research presented in this paper is to establish NDT as a reliable source of information that can facilitate a more realistic reliability assessment of existing structures while minimizing further damage to the structure. In many cases, measurements are suitable to increase the validity of assessment results and the engineer's power of judgment regarding the reliability of a structure. In the best case, appreciating NDT results can extend the remaining service life of a structure, increase infrastructural availabilities, and optimize the consumption of resources.