Functional regression control chart for monitoring ship CO 2$_{2}$ emissions

On modern ships, the quick development in data acquisition technologies is producing data‐rich environments where variable measurements are continuously streamed and stored during navigation and thus can be naturally modelled as functional data or profiles. Then, both the CO 2$_{2}$ emissions (i.e. the quality characteristic of interest) and the variable profiles that have an impact on them (i.e. the covariates) are called to be explored in the light of the new worldwide and European regulations on the monitoring, reporting and verification of harmful emissions. In this paper, we show an application of the functional regression control chart (FRCC) with the ultimate goal of answering, at the end of each ship voyage, the question: given the value of the covariates, is the observed CO 2$_{2}$ emission profile as expected? To this aim, the FRCC focuses on the monitoring of residuals obtained from a multivariate functional linear regression of the CO 2$_{2}$ emission profiles on the functional covariates. The applicability of the FRCC is demonstrated through a real‐case study of a Ro‐Pax ship operating in the Mediterranean Sea. The proposed FRCC is also compared with other alternatives available in the literature and its advantages are discussed over some practical examples.

process control (SPC) that provides a suite of methods to continuously give a solution to the urging issue of evaluating the stability over time of functional quality characteristics. Recent contributions are Colosimo and Pacella, 5 Grasso et al 6 and Menafoglio et al. 7 As in the classical SPC, where data are scalars, profile control charts have the task of monitoring the functional quality characteristic and of triggering a signal when assignable sources of variations (i.e. special causes) act on it. When this happens, the process is said to be out of control (OC). Otherwise, when only normal sources of variation (i.e. common causes) apply, the process is said to be in control (IC).
Only recently, Centofanti et al 8 introduced the functional regression control chart (FRCC) framework to monitor a functional quality characteristic when this is influenced by one or more functional covariates. In particular, the aim of the FRCC is to improve the monitoring of the quality characteristic by including the information coming from the covariates. In this scenario, if one of these covariates manifests itself with an extreme realization, the quality characteristic may wrongly be judged to be OC, even though it shows perfectly reasonable values given the covariates. Otherwise, there may be situations where the covariates are not extreme and the quality characteristic may wrongly appear IC because the information in the covariates is not used to increase the power of the monitoring.
The FRCC framework is the functional extension of the basic Mandel's idea, 9 where the quality characteristic is monitored after being adjusted for the effect of covariates. That is, the control variable is the residual obtained from a regression of the quality characteristic on the covariates, and the focus is spotted on the residual variability not explained by the knowledge of the observed value of the covariates. In a more direct phrasing, the FRCC answers the question: given the value of the covariates, is the quality characteristic as expected? Alternatively, the question could be phrased as: does the assumed model fit the reality of the quality characteristic? If the answer is no, then special causes may have occurred that are beyond the information brought by the covariates through the chosen regression model. In particular, in the FRCC framework, the quality characteristic and the covariates are linked through a multiple functional linear regression model (MFLR), where both the response and the explanatory variables can be described by functional data. Recent examples of MFLR model can be found in Palumbo et al, 10 Centofanti et al 11,12 and Chiou et al. 13 The idea of monitoring model residuals arises also in SPC with autocorrelated data, for example, time series. [14][15][16] In this setting, residuals from an autoregressive model are used to recover independence of the observations, because conventional control charts are known not to work well if the quality characteristic exhibits even low levels of correlation over time. 17 In the regression control chart idea, residuals are used to adjust the quality characteristic for the information in the covariates.
In recent years, profile monitoring has emerged as an effective technique also in the field of maritime transport where the issue of CO 2 emission monitoring is becoming of paramount importance. 8,18,19 Indeed, in view of climate change and global warming crises, the maritime transport industry is currently facing new challenges related to CO 2 emissions. Indeed, the Marine Environment Protection Committee of the International Maritime Organization [20][21][22] has urged shipping companies to set up a framework for the monitoring, reporting and verification (MRV) of CO 2 emissions based on fuel consumption. In the face of these regulations, shipping companies are updating DAQ systems on their fleets, enabling large volumes of observational data to be automatically streamed and transferred to a remote server, bypassing human intervention. A large proportion of these data can be modelled as functional data, thus representing a new challenge for FDA and related SPC methods in this area. The DAQ system installed on modern ships facilitates in fact the collection of functional data related to CO 2 emissions as well as to other functional covariates affecting them, which include the speed of the ship, engine variables such as the propeller pitch variables, environmental variables that describe the wind and sea conditions or the cumulative navigation time (we refer to Section 2.2 for more details on the variables considered in this work). However, most of the approaches that have already appeared in maritime literature [23][24][25] do not take advantage of the potential help to managerial decision making represented by the modelling of the entire voyage profiles acquired and usually compress information in one or more scalar features extracted from them. In this setting, a recurrent request posed by maritime engineers is related to the CO 2 emissions corresponding to the given values of other recorded covariates. That is, they want to assess if CO 2 emissions are coherent with the values of the covariates, in order to identify unexpected behaviours and take corrective measures. Engineers are less concerned in identifying CO 2 emission profiles that are extreme with respect to their marginal distribution, if this can be explained by some extreme value in the covariates. They are rather interested in CO 2 emission profiles that are not consistent with covariates affecting them because, for example, they can reveal anomalous ship performance.
The FRCC can be used to meet this engineering need. In this paper, we propose to use the FRCC to monitor ship CO 2 emissions throughout each voyage, in order to identify special causes, at given values recorded by functional covariates. Specifically, we consider a particular implementation of the FRCC framework, where the functional quality characteristic, hereinafter referred to also as response, and the functional covariates are related through the multivariate functional linear regression (MFLR) model. In this paper, the MFLR model is estimated based on the multivariate functional principal components analysis (FPCA). 13,26 Then, studentized residuals are monitored through the simultaneous application of the Hotelling's 2 and the squared prediction error (SPE) control charts. [4][5][6]27 In this paper, the FRCC framework is used both retrospectively, as an aid to the practitioner to determine the IC state of the process under study and to identify an IC reference sample (Phase I), and, prospectively to monitor any departure from the IC state at future voyages (Phase II). The applicability of the FRCC is demonstrated through a real-case study of a Ro-Pax ship operating in the Mediterranean Sea, courtesy of the shipping company Grimaldi group. This work does not directly aim at a real-time feedback control, that is, in the online monitoring and immediate actions during an ongoing voyage. The FRCC framework is indeed used to signal OC voyages once they are completed, as 2 and monitoring statistics are going to be calculated on the entire profiles. Instead, the proposed FRCC focuses on tracking automatically (and possibly the whole fleet) the future OC signals, or patterns and trends that may identify, for example, engine malfunctioning, the need for hull cleaning or for any other energy efficiency initiative. Even though no action could be taken during an ongoing voyage, we believe this paper motivates the use of a functional data approach as the FRCC can be indeed used by shipping companies to evaluate, at the end of a voyage, if the observed CO 2 emission profile is anomalous, given the value of the observed covariates. This information can help to prevent and disguise possible problems in future voyages, or to schedule maintenance operations.
The paper is structured as follows. In Section 2, the structure of the data and technological details of the ship equipment are provided for the real-case study at hand. In Section 3, the main materials and methods behind the particular implementation of the FRCC framework are summarized. In Section 4, by means of the real-case study mentioned before, the FRCC is practically applied to monitor CO 2 emissions and compared with alternative methods already appeared in the SPC literature. In Section 5, we draw conclusions. All computations and plots have been obtained using the software environment R. 28

TECHNOLOGICAL BACKGROUND AND DATA STRUCTURE
For confidentiality reasons, in what follows we omit the name of the ship considered in the real-case study. However, to give an idea of the ship type, in Section 2.1, we provide its main technical features. Whereas, in Section 2.2, we describe in detail all the variables and the data used for the analysis.

Technical features
The main technical features of the ship are illustrated in Table 1. The considered ship is characterized by two engine sets, each consisting of two main diesel engines for propulsion Wärtsilä, Type 16ZAV40S, four-stroke, with a maximum continuous rating of 11,520 kW at 510 revolutions per minute (rpm) and by two variable pitch propellers and a shaft generator for electric power supply. The main engine power is used both for propulsion and electrical generation through the shaft generators, which are themselves keyed on a gearbox. The gearbox has two fast inlet shafts powered by the engine shaft, a slow outlet shaft for the propeller and a faster one to which the shaft alternator is connected. The gear ratio between the engine shaft and the propeller shaft is equal to 3.24, whereas the gear ratio between the engine shaft and the shaft alternator is equal to 0.32. The main diesel engine can be powered by three types of fuel with different percentage of sulphur (S) content in order to be compliant with the regulation in force on the geographical area to be sailed: heavy fuel oil, very low sulphur fuel oil (≤0.5% ), ultra-low sulphur fuel oil or marine gas oil (≤0.1% ). The electrical power supply of the ship consists of three diesel generators (1840 kVA, 690 V), two shaft generators (2100 kVA, 690 V) and one emergency diesel generator (480 kVA). The main engines can supply power in two different modes, at fixed rpm (constant mode) or at variable rpm (combined mode). In the constant mode, shaft generators can be used to supply electric power, even though the maximum speed cannot be reached because speed variations are only possible by changing the pitch of the propellers. Whereas, in the combined mode, the ship speed can be regulated by increasing both pitch propeller and engine rpm, but, vice versa, the possibility to engage the shaft generator is lost.

Data description
Data come from a DAQ system installed on the ship that are transmitted to the cloud with different frequencies varying from 2 to 5 min. The data refer to a specific route, that is, each profile in the data set corresponds to a voyage of the ship and all voyages have the same departure and arrival ports. A period of 11 consecutive months following a dry-dock operation on the ship is considered. Voyages in the first nine months (i.e. from the beginning of February 2020 to the end of October 2020) are used in Phase I, that is, to identify a reference data set, estimate the model and control chart limits. Voyages in the last 2 months (i.e. from the end of October 2020 to the end of December 2020) are used in Phase II to show the performance of the FRCC on monitoring new voyages. We start with a data set of 190 voyages for the Phase I (Section 4.1); then we remove outliers as described in Section 4.1.2 and we end up with a reference data set of 169 voyages; finally, we consider 22 voyages for the Phase II (Section 4.2). Note that the data refer to the navigation phase. More specifically, the navigation phase begins with the finished with engine order (when the ship leaves the departure port) and ends with the stand by engine order (when the ship enters the arrival port). Moreover, we need to identify an adequate functional domain for each voyage. Even if time is naturally suitable as a functional domain, total travel time can vary from voyage to voyage. Therefore, we prefer to use the fraction of distance traveled over the voyage as the common domain (0,1) of the data.
All the signals acquired by the DAQ system are summarized into several variables that we describe here to select later the functional covariates and response considered in the MFLR model. The ship is tracked by its global positioning system (GPS), which provides longitude and latitude coordinates. The course over ground (COG) is the actual direction of progress of a vessel, between two points, with respect to the surface of the earth, measured in degrees. The sailed distance over ground (SDOG) is the distance travelled by the vessel between two points, measured in nautic miles (NM), calculated from the GPS sensor through the Haversine formula. The speed over ground (SOG) is measured in knots (kn) and is the ratio between SDOG and the sailing time, measured in hours. The propeller pitch (P) is measured in degrees and represents the angle between the intersection of the chord line of the blade section and a plane normal to the propeller axis. An anemometer sensor provides data about the true speed ( ), measured in knots, and direction ( ), measured in degrees, of the wind. The latter is obtained as the difference between the true wind angle in earth system and the COG. Additional information on the wind variables can be found on Bocchetti et al. 25 From the two anemometer variables, the longitudinal component of the wind is calculated as cos , while the transversal component is calculated as | sin |. Note that a positive (respectively, negative) longitudinal component of the wind means that the wind blows from stern (respectively, bow). Moreover, a data fusion process also allows the integration of marine data into the data set, that is, weather forecasts about the sea state furnished by private held weather service provider. The sea state is characterized by the provider through the typical parameters, namely, height and period, used to model waves that, in turn, are roughly divided into two components: wind-driven waves, or simply waves (generated by the immediate local wind) and swell (generated by distant weather systems and usually having larger period). In particular, the height, measured in meters, is defined as the vertical distance from wave crest to wave trough; whereas, the period, measured in seconds, represents the time between successive crests of a train of waves passing a fixed point in a ship, at a fixed angle of encounter. 29 Regarding the CO 2 measurement, MRV regulation proposes direct and indirect methods. The direct method determines the amount of CO 2 emitted measuring the flow of these emissions passing in exhaust gas funnels. Instead, the indirect method calculates the CO 2 emissions based on the fuel consumption. The direct method is based on the determination of CO 2 emitted that flow in exhaust gas stacks based on the measurement of the CO 2 in the exhaust gas and the measurement of the volume of the exhaust gas flow per unit of time. This method is very sensitive to the calibration and the uncertainty related to the measurement devices. Whereas, the class of indirect method calculates CO 2 emissions as a product of the whole amount of fuel consumption of the main and auxiliary engines, boilers, gas turbines and inert gas generators times the so-called emission factor, which is calculated as the average emission rate of a greenhouse gas (GHG) relative to the activity data of a source stream, assuming complete oxidation for combustion and complete conversion for all other chemical reactions. In this paper, we use the indirect method and we focus on the main engines only.
In what follows, we list the functional variables chosen for the analysis in this work that are obtained from the signals acquired by the DAQ system. The functional response is the signal corresponding to the CO 2 emissions per hour along the TA B L E 2 Functional covariates used in the MFLR

Variable name
Unit of measurement entire voyage. In order to select functional covariates among the available signals, a very long preliminary investigation was carried out to identify the covariates that could better explain the CO 2 emissions. However, in practice, many signals that could have played the role of covariates were not able to be measured accurately. The intersection between the set of candidate and truly measurable covariates was finally identified after an intensive exchange of information and experience with marine engineers, shipping managers and operators. The following nine functional covariates have been identified, which are, thus, assumed as a characterization of the ship operational conditions: SOG, left propeller pitch, right propeller pitch, transversal and longitudinal component of the wind, wave height, wave period, swell height and derivative of the cumulative navigation time. Table 2 reports the complete list of the functional covariates considered in this work. Moreover, Figures 1 and 2 show the profiles of covariates and response, respectively, included in the reference data set used for model building and control chart limits estimation. In Section 4, we describe with more details how smooth profiles are obtained from the discrete observations acquired by the DAQ system over time, with 2-5 min frequency.

METHODOLOGY
The FRCC is a general framework for profile monitoring that can be divided into three main steps. 8 First, (i) define an MFLR model which links the functional response variablẽ, defined on the compact domain  and a vector̃of random functional covariates̃1, … ,̃, defined on the compact domain . Secondly, (ii) define the estimation method of the chosen model and thirdly, (iii) define the monitoring strategy of the functional residual, which is defined as the difference between the fitted value and̃. In what follows, we assume that̃1, … ,̃and̃have smooth realizations in 2 () and 2 ( ), that is, the Hilbert spaces of square integrable functions defined on ,  ⊂ ℝ.
A specific implementation of the FRCC can be obtained by assuming for the step (i) the following MFLR model: where and are the standardized versions of̃and̃, obtained through the transformation approach of Chiou et al. 13 The regression coefficient = ( 1 , … , ) , is a vector where 's are square integrable bivariate functions defined on the closed interval  ×  , and the random error function has zero mean and variance function 2 , and is independent of . For the step (ii), we use an estimation method based on the multivariate Karhunen-Loève's Theorem. 30 In particular, we assume that the standardized covariate and response variables can be represented as follows: An estimator of ( , ) can be readily obtained by considering the truncated version of Equation (3), that is with = E( )∕ , and , < ∞. Plugging Equation (4) into Equation (1), due to the orthonormality of the PCs and , we obtain where = ( 1 , … , ) , = ( 1 , … , ) , = ( 1 , … , ) and = { } =1,…, , =1,…, , with = ∫  ( ) ( ) . Therefore, the problem of estimating reduces to the estimation of the matrix , which can be obtained through least squares given independent realizations (̃,̃) of (̃,̃). Then, given the least-squares estimator̂of , the estimator̂of can be calculated aŝ wherê= (̂1 , … ,̂) ,̂= (̂1 , … ,̂) , witĥand̂= (̂1, … ,̂) estimators of and , respectively.
Finally, an estimator̂of iŝ= wherêare the entries of̂and̂= ∑ =1 ∫  ( )̂( ) . For the step (iii), we can define the raw functional residual as However, by following the remarks in Centofanti et al, 8 we shall better consider a scaled version of it, named studentized functional residual, and defined as The residual variance function is estimated asĈov ( −̂)( ) =̂2( ) +̂( , ), for ∈  , wherê2 is an estimator of 2 and̂is defined aŝ wherêis the estimator of the score vector of ,̂̂is the estimator of Cov( , ),̂is the estimator of the vector of the first eigenfunctions of , and̂is the estimator of Cov( ). As stated in Ref. [ 8 ], the use of the studentized functional residual, in place of the raw residual, is needed to reduce the effect of covariate mean shifts on the performance of the FRCC to identify OC condition of the quality characteristic, which can be large especially when the coefficient function is poorly estimated. In this case, it can be demonstrated that the interpretation of the FRCC becomes cumbersome, indeed a point falling outside the FRCC control limits could be wrongly assigned to a mean shift in the quality characteristic even though it shows a perfectly reasonable behaviour given the value of the covariates. This is problematic because the aim of the FRCC is to monitor the quality characteristic assigned the value of the covariates, and, thus, if a mean shift in the covariates causes an OC, the chart is saying that the value of quality characteristic disagrees with those of the covariates that, of course, it is not true. In this respect, the studentized functional residual is less influenced by covariate mean shifts than the raw residual. Indeed, the aim of Cov ( −̂) 1∕2 is to weight the raw residual on the basis of its uncertainty, such that for an extreme realization of , the residual is heavily scaled. Thus, the probability of the quality characteristic to be judged IC, when no special causes apply to the process, is larger than that corresponding to the raw residual. Note that, for a large sample sizes, consistently with the data set complexity, the studentized functional residual leads to the same results of the raw residual, because in this case it is independent of the values achieved by the covariates. Therefore, the use of studentized residual allows controlling the false alarm rate in presence of covariate mean shifts, which comes at a cost of reducing the power of the FRCC in identifying OCs in the quality characteristic, especially for extreme values of the covariates. However, this behaviour is inevitable because comes from the greater uncertainty in the model estimation originated by a limited number of extreme realizations in the reference sample.
We use a monitoring strategy based on the Hotelling's 2 and the control charts 4, 6 27, 32 applied to in Equation (9). In particular, the studentized functional residual is approximated as where the scores = ∫  ( ) ( ) and the PCs are the eigenfunctions corresponding to the eigenvalues in descending order of the covariance function of . The Hotelling's statistic 2 is obtained as follows: where = diag( 1 , … , ) is the variance-covariance matrix of = ( 1 , … , ) . Note that 2 is the squared distance of the projection of from the origin of the space spanned by the PCs standardized for the score variances. Analogously, changes along directions orthogonal to the latter space are monitored by the statistic The control charts are designed in Phase I by means of a set of functional studentized residuals , , = 1, … , , obtained by independent realizations (̃,̃) acquired under IC conditions. Phase I includes also the estimation of the MFLR model unknown parameters, the PCs and the matrix (calculated by means of the sample covariance) as well as the estimation of the control limits for both the Hotelling's 2 and the control charts. The latter can be obtained by means of the (1 − )-quantiles of the empirical distribution of the two statistics, where is chosen to control the overall type I error probability. In the monitoring phase (Phase II), the functional studentized residuals of new data are calculated and an alarm signal is issued if at least one of the corresponding 2 and statistics violates the control limits.

RESULTS AND DISCUSSION
In this section, we show the results of the application of the FRCC to the data set described in Section 2.2. In particular, the retrospective and prospective phases, that is, Phase I and Phase II, are described in Section 4.1 and Section 4.2, respectively. Moreover, in Section 4.3, a comparison with simpler monitoring approaches is shown. We have used the R package funcharts to build the FRCC and perform all the analysis shown in this paper. The package is available on CRAN at https://cran.r-project.org/web/packages/funcharts/index.html. Moreover, in the supplementary material we provide an R script and the data to reproduce the results in this paper. Note that the data have been scaled for confidentiality reason.

Phase I
Phase I comprises the recovery of smooth functional data from the discrete observations for each voyage (Section 4.1.1), the identification of the reference data set of IC voyages (Section 4.1.2) and, the estimation of the MFLR model as described in Section 3 (Section 4.1.3).

Data smoothing
The first step of the analysis is to get smooth functional data from the discrete observations for each voyage of the ship. We use B-spline basis expansion and penalized least squares to estimate the corresponding basis coefficients. A common approach is to set a quite large number of basis functions and then select the optimal smoothing parameter by minimizing the generalized cross-validation (GCV) error. 1 However, the number of available discrete points is above 200 for each voyage and, by following this approach, the GCV criterion leads to choosing the smoothing parameter equal to zero in practice for all functional variables. This is a typical problem of overfitting, as also pointed out by Reiss and Ogden, 33 which show that at finite sample sizes GCV is likely to develop multiple minima and undersmooth. Therefore, we encourage parsimony and achieve regularization by choosing a small, efficient number of basis functions, with the smoothing parameter fixed to a small positive value (i.e. 10 −10 ) to ensure identifiabilty. In Figure 3, we plot the GCV error against the number of B-spline basis functions. While increasing the number of basis functions reduces the GCV error, we select 25 basis functions for all functional variables as the elbow point of these curves.

Reference data set
Once functional data are obtained, in order to monitor a set of voyages in a considered period, it is necessary to identify a reference data set of previous IC voyages that can be used for model building and estimation of the Hotelling 2 and control chart limits. Starting from an historical data set, any voyage that is not representative of the IC conditions has to be removed from it. Specific Phase I techniques are designed for the problem of eliminating anomalous voyages from the historical data set and generally lead to the construction of control charts with different limits from the ones calculated in Phase II. However, the problem of selecting the best method to perform Phase I is beyond the scope of this paper, which is instead mainly focused on Phase II monitoring. In this paper, we use the same FRCC framework also in Phase I and based on experts' opinion, to establish which voyages are actually considered anomalous and have to be excluded. That is, we first use the historical data set to build the FRCC and estimate the control chart limits, and we plot the FRCC for the same voyages; then, voyages signalled as anomalous are carefully investigated by maritime engineers to understand if special causes occurred; in these cases, voyages are removed from the data set; the process is repeated until the final reference data set is obtained, which contains only voyages considered as IC. Starting from the initial historical data set of 190 voyages, at the end of this iterative process of identifying outliers, detecting anomalies, removing anomalous voyages and refitting the model, we get a reference data set of 169 voyages. Figure 4 shows the FRCC applied to the initial data set. The -axis label indicates the voyage number (VN) and is here intended to be a progressive counting label to denote subsequent voyages in the data set. Moreover, Figure 5 shows some OC studentized residual profiles correctly signalled as anomalous.

Model building
The FRCC relies on the choice of and in Equation (4), as well as in Equation (11). Figure 6 shows the cumulative fraction of variance explained by the functional PCs in the multivariate functional covariates, the functional response and the functional studentized residuals, respectively. Based on the results in Figure 6, we opt for a more parsimonious choice than that suggested in Centofanti et al 8  95%, respectively, that is, we select = 7, = 1 and = 8. The corresponding actual fractions of variance explained are 81%, 96% and 96%.
To allow for a possible interpretation of the selected functional PCs, in Figure 7 we plot the eigenfunctions of the covariance operator of the standardized multivariate functional covariates. Since they all have unit norm, we multiply them by the square root of the corresponding eigenvalues so that profiles with larger norm are PCs that explain a larger fraction of the total variance in the data. It can be readily envisaged that the first PC depends almost entirely on the two propeller pitch variables, the SOG and the navigation time. The latter is negatively correlated with the other variables, and for all these variables, weights are almost constant over the entire functional domain. The second PC strongly depends on the sea state descriptors (swell height, wave height and wave period) and only on the transversal component of the wind. These functional variables all have positive weight, with some parts of the domain showing slightly larger weight than others. The third PC seems to mainly depend on the longitudinal component of the wind alone. In other words, the first PC describes how fast the ship is moving, while the second and third PCs capture two distinct aspects of environmental conditions. Figure 8A shows the first PC (i.e. eigenfunction) of the covariance operator of the standardized functional response, that alone explains most of the variability in the data. This highlights that, after standardization, all  Figure 8B shows the eigenfunctions of the covariance operator of the studentized residuals. The first PC depends on the average value over the entire voyage. The second PC accounts for the difference between the first and the second half of the voyage. Whereas, some of the following PCs seem to assign a larger weight to the boundaries of the functional domain, but, however, their interpretation becomes less straightforward. Figure 9 shows the estimated functional coefficients obtained as in Equation (6). Since the functional response is approximated with a single functional PC, which, in practice, is constant over the entire domain, the functional coefficient shows only vertical bands along the direction of . The most important predictors are those associated with the first and third  Figure 10 shows the effect of this choice, where some of the more extreme raw functional residuals, shown in Figure 10A, are properly attenuated by the studentization (Figure 10B), when, as in this case, they correspond to more extreme profiles of covariates as described in Section 3. Figure 11 shows the FRCC used in Phase II, that is, the actual monitoring phase. The points correspond to 22 subsequent voyages, each of them is denoted by its VN. For simplicity, we count and label voyages again with VN 1 through 22, even though these voyages must not be confused with the first 22 ones in Phase I in Figure 4. Some Phase II voyages are signalled as OC in Figure 11. In particular voyages 1, 3, 15 and 16 are OC in both Hotelling 2 and control charts, while voyages 20 and 21 are OC in the control chart only. OC voyages are generally characterized by some unexpected behaviour in the CO 2 emissions, because they have not been properly predicted by the MFLR model in some specific part of the domain, or because the prediction error was overall moderately large for a considerable part of the voyage. Functional studentized residuals for these OC voyages are plotted in Figure 12 against the studentized residuals in the reference data set. Voyages 1, 3 and 15 plot far above the upper control limits in both the Hotelling's 2 and control charts. In particular, the functional studentized residual of voyage 1 is signalled as OC because it is larger than usual on average and at the end of the voyage. Voyage 3 shows an unusual studentized residual that is positive in the first part of the voyage, negative in the second part and extremely large at the very end of the voyage with respect to the Phase I reference profiles (depicted in grey lines). Voyage 15 shows the same unusual negative deviation at the end of the voyage as in voyage 3 and the residual profile remains negative for the whole voyage. Voyages 16, 20 and 21 are signalled as OC in Figure 11, even though they are close to the upper control limits and show less dramatic behaviour, as shown by the plots reported in the second row of panels in Figure 12. Studentized residual profiles corresponding to these voyages have in fact less prominent peaks/valleys. In particular, it is worth noting that voyage 16 is signalled as anomalous plausibly because its residual profile achieves large positive values for almost the entire voyage. Residual profile of voyage 20, even if more regular, as it is near zero in the middle of the voyage, has mild peak and valley at the extremes of the voyage. Residual profile of voyage 21, if considered pointwise, is instead almost entirely in the range of the Phase I studentized residuals. However, it is signalled as OC plausibly because of a sudden profile jump in the middle of the voyage. Finally, we want to highlight the importance of using both the Hotelling's 2 and the monitoring statistics to detect OC voyages, by discussing their difference. With the FPCA model on the functional residuals, we use the first = 8 components to approximate the data as in Equation (11), since they explain 95% of their variability ( Figure 6C). In particular, note that the first component, which alone explains about 75% of the variability in the data ( Figure 6C), in practice looks at shifts in mean during the entire voyage. Then, the Hotelling's 2 statistic in Equation (12), which is calculated on the scores of the first eight components, monitors the deviation within the FPCA model and from the discussion above we can state that a functional residual with a particularly large shift in mean will have a large 2 . On the other hand, the statistic in Equation (13) monitors the remaining part, that is, the error (11) of approximating the residual function with its projection based on FPCA, which should be small when the profile is consistent with the FPCA model. Therefore, a large value indicates that a profile is clearly different than the underlying FPCA model. In Figure 11, all the voyages signalled have above the upper control limit, but voyages 20 and 21 have the 2 IC. These OC voyages show smaller shifts in mean than the others in Figure 12, then it makes sense that the 2 statistic is IC, however they still show an anomalous behaviour, as the other OC voyages, which is well captured by the statistic.

Comparison with other methods
In this section, we try to show if it is actually convenient to use the FRCC rather than simpler approaches. Centofanti et al 8 showed that the FRCC is more powerful than the index based (INBA) control chart, which monitors the area under the response variable, and the RESP control chart, which monitors the coefficients coming from the functional PC decomposition of the response via Hotelling's 2 and control charts. We compare the FRCC with these two control charts and discuss if there are different results in terms of detection of OC voyages in this specific application. Figure 13 shows the INBA control chart, which is not able to detect any of the voyages signalled by the FRCC and seems not appropriate for this type of application. Moreover, this control chart only detects voyage 22 as OC, which is signalled also by the RESP control chart in Figure 14, but, on the other hand, is IC in the FRCC. Apparently, voyage 22 could be an anomalous one that the FRCC is not able to correctly identify. By observing the functional response profile plotted in the left panel of Figure 15, it is plausible that this voyage is signalled as OC by the INBA and RESP control charts because, marginally, the CO 2 emissions were particularly low during the entire voyage. However, by observing that the corresponding functional studentized residual plotted in the right panel of Figure 15, we can conclude that these low values of the response variable are predicted well by the MFLR model. Therefore, conditionally on the functional covariate profiles, the response variable profile of voyage 22 is correctly judged as IC by the FRCC. This example highlights the convenience in using the FRCC with respect to simpler approaches, when the interest is in monitoring a functional quality of interest conditionally on functional covariates having influence on it. The RESP control chart seems to correctly detect some of the voyages identified by the FRCC, that is, voyages 1, 3, 15 and 16, but it misses voyages (i.e. 20 and 21), while it signals voyage 11 that is IC in the FRCC. Note that voyage 11 is close the upper control limits in both the RESP control chart and the FRCC.

CONCLUSIONS
A particular implementation of the FRCC framework proposed in Centofanti et al 8 is applied in this paper to monitor the ship CO 2 emission profiles, in order to identify any special causes at given values of the functional covariates that may have an influence on them. The CO 2 emission profiles are adjusted for the effect of these covariates by means of an MFLR model estimated via multivariate FPCA. That is, the residuals from the MFLR model are monitored by using jointly the Hotelling's 2 and the control charts built on their functional PC decomposition. The specific implementation of the FRCC relies on the use of the studentized functional residual to take into account the different residual variance at different covariate values. The proposed FRCC demonstrated to be effective in the identification of anomalous voyages over the real-case study presented, which is concerned with the data collected during 2020 on a Ro-Pax ship operating in the Mediterranean Sea. Moreover, the particular implementation of the FRCC framework was compared against alternative approaches available in the literature, which, however, only look at the marginal distribution of the functional response, or only to some specific features. All these competing methods showed a lack of ability to signal some important OC and, in other situations, provided false alarms.
Since the monitoring statistics are calculated on the entire profiles acquired at the end of each voyage, this work does not directly allow for a real-time feedback control, that is, to guide actions during an ongoing voyage. Instead, the FRCC can be used to track automatically (and possibly the whole fleet) patterns and trends that may identify malfunctioning in the engines, the need for hull cleaning, or for any other energy efficiency initiative. This information can help to diagnose and prevent possible problems in future voyages, or to schedule maintenance operations.
Finally, one important output achieved in our research is the technological transfer of the FRCC tool to the shipping company Grimaldi Group. The very practical applicability of these statistical tools is in fact further investigated by providing the energy saving department of the company with R code that is able to automatically import new data from the company server, to envelop the mathematical and numerical details provided in the paper, and to routinely produce automatic voyage reports for some Ro-Pax ships of interest from their fleet.

A C K N O W L E D G E M E N T S
The authors are deeply grateful to the anonymous referees and to the editors for their suggestions and help in significantly improving the manuscript. The authors are also extremely grateful to the Grimaldi Group's Energy Saving Department engineers Dario Bocchetti, Andrea D'Ambra and Rosa Di Matteo for the access to observational data, the maritime domain insight and the general support over the course of these activities.

A U T H O R B I O G R A P H I E S
Christian Capezza is a postdoc researcher at the Department of Industrial Engineering of the University of Naples Federico II. He works on advanced statistical methodologies for engineering applications and his research project regards the development of interpretable statistical methods for the analysis of complex systems in industry 4.0.
Fabio Centofanti is a PhD student at the Department of Industrial Engineering of the University of Naples Federico II, Italy. His main research interests include functional data analysis and statistical process monitoring for industrial applications.
Antonio Lepore is an Assistant Professor at the Department of Industrial Engineering of the University of Naples Federico II, Italy. His main research interests include the industrial application of statistical techniques to the monitoring of complex measurement profiles from multisensor acquisition systems, with particular attention to renewable energy and harmful emissions. Alessandra Menafoglio is an Assistant Professor at MOX, Department of Mathematics, Politecnico di Milano. Her research interests focus on the development of innovative statistical models and methods for the analysis and statistical process control of complex observations (e.g. curves, images and functional signals), possibly characterized by spatial dependence. Biagio Palumbo is an Associate Professor in 'Statistics for experimental and technological research' at the Department of Industrial Engineering of the University of Naples Federico II, Italy. His major research interests include reliability, design and analysis of experiments, statistical methods for process monitoring and optimization and data science for technology.
Simone Vantini is an Associate Professor of Statistics at the Politecnico di Milano, Italy. He has been publishing widely in Functional and Object-Oriented Data Analysis. His current research interests include: permutation testing,