Characterizing the spatiotemporal evolution of building material stock in China's Greater Bay Area: A statistical regression method

More than half of the materials extracted from natural environments eventually accumulate as building material stock (BMS). From a linear‐to‐circular economy perspective, BMS transforms the building sector from a virgin material consumer and a waste generator to a future depository of secondary resources. Studies characterizing the amount and distribution of BMS adopt different approaches, but high data requirements restrict their applicability. This research proposes an alternative method for regional BMS quantification. The method leverages the permanent population, electricity consumption, and BMS of a sample city to develop a statistical regression model; then uses it to estimate the BMS of a larger, homogenous region. With relatively low data requirements, the new method is especially applicable in underdeveloped areas where data required for BMS quantification methods are usually unavailable or incomplete. We apply the method to characterize the spatiotemporal evolution of BMS in China's Greater Bay Area. From 2000 to 2021, the total BMS in this region increased from 4.4 to 7.7 billion tonnes, with concrete, brick, and steel accounting for 72.32%, 17.57%, and 4.71% of the total BMS, respectively. The most rapid BMS growth occurred in Guangzhou (from 534.75 to 1277.82 Mt) and Shenzhen (517.80 to 1235.48 Mt). A core–edge BMS accumulation pattern emerged in this area while the BMS peak showed a coast‐to‐inland shift. Future studies can explore generalizing this new method to characterize BMS in other developing regions.

Under the traditional economic mode (i.e., extract-make-use-dispose), BMS is regarded as a virgin material consumer because of construction, maintenance, and expansion, and a solid waste generator due to renovation and demolition (Marcellus-Zamora et al., 2016; Wang et al., 2018).
However, the circular economy strategy and sustainable development goals have shifted perceptions of BMS, now seen as a secondary resource depository with potential for reducing the need for primary materials in the future (Gontia et al., 2020; Mesta et al., 2019).
Given its significance, many methods have been developed to characterize BMS in different regions and nations. For example, Müller (2006) proposed a demand-driven approach to estimate the in-use concrete stock of residential buildings in the Netherlands. Hashimoto et al. (2009) adopted a material flow-driven approach to quantify the in-use construction mineral stock in Japan, based on national material flow statistics.
Benefitting from the advancement of geographic information system (GIS) technologies, a GIS-based method has been used to estimate multiple materials' stocks in Japan (Tanikawa & Hashimoto, 2009), Vienna (Kleemann et al., 2017), and elsewhere. Moreover, Peled and Fishman (2021) developed a nighttime light-based method to spatially map BMS in Europe. More relevant studies can be found in the review article by Nasir et al. (2021).
Despite the growing research interest in BMS, research gaps remain. For example, existing BMS quantification methods depend heavily on the availability of highly integrated data (Lanau et al., 2019; Sprecher et al., 2022), limiting their applicability. Furthermore, most BMS case studies have focused on developed cities and countries while underdeveloped ones are less investigated (Lanau et al., 2019).
Therefore, this research aims to characterize the BMS in the Guangdong-Hong Kong-Macao Greater Bay Area (GBA) by developing a novel methodology. The GBA is an urban agglomeration that is the focus of key strategic planning in China's development blueprint (CMAB, 2022). It comprises 11 cities (i.e., Hong Kong, Macao, Guangzhou, Shenzhen, Zhuhai, Foshan, Huizhou, Dongguan, Zhongshan, Jiangmen, and Zhaoqing), many of which are undergoing rapid urbanization. Our proposed methodology refines multiple feature variables from freely available data sources, and then uses them to estimate BMS based on robust statistical regression. We expect to uncover the BMS pattern in the GBA, while contributing a more applicable and reliable method for quantifying BMS in developing regions. The remainder of the paper is organized as follows.
Following this introductory section, Section 2 reviews studies on BMS quantification and clarifies research gaps. Section 3 introduces the research methodology and Section 4 describes the detailed BMS estimation process. Section 5 discusses the research contributions and shortcomings, and conclusions are drawn in Section 6.

Quantification approaches
Previous studies have proposed a range of approaches to quantifying in-use material stock in buildings. We conducted an extensive literature review and summarized existing BMS methods into seven categories, as follows:
1. Material flow-driven approach. Also called a "top-down approach" in some articles (Fu et al., 2022; Lanau et al., 2019), this estimates BMS by subtracting the material outflow from the material inflow of the building sector within a district. Material inflow data can be either directly extracted from governmental statistics reports (Fishman et al., 2014), or calculated based on building construction activity quantities (measured by construction area or expenditure) and material consumption indicators (material weight per unit of construction area or cost) (Hashimoto et al., 2009). Material outflow data are from either the demolition waste statistics (Wang et al., 2019), or the product of material inflows and assumed material survival probability distributions (Bergsdal et al., 2007).
2. Gross floor area (GFA) statistics-enabled approach. This approach benefits from governmental GFA statistics. It quantifies BMS by multiplying GFA by a material intensity (material weight per unit of floor area). Material intensities can be obtained from building specifications (Tanikawa & Hashimoto, 2009), design codes (Han & Xiang, 2013), construction handbooks (Gao et al., 2020), and so forth.
3. Demand-driven approach. This method leverages the product of population and per capita floor area to derive GFA, and then multiplies GFA by the material intensity to output BMS (Kalcher et al., 2017; Müller, 2006). When the per capita floor area indicator is unavailable in official statistics, field surveys can be used to acquire the data.
4. GIS database-enabled approach. This approach benefits from the availability of urban/national-level GIS databases that generally contain building footprint and height. GFA data is extracted, then GFA is multiplied by the material intensity to derive BMS (Lanau & Liu, 2020; Tanikawa & Hashimoto, 2009).
5. Nighttime light-based approach. This method also leverages the principle of multiplying GFA by material intensity, but it creatively uses remote sensing data (e.g., nighttime light radiance) to estimate GFA (Peled & Fishman, 2021).
6. Building archetype-based approach. This method derives BMS from the product of the number of buildings and material stock per building (Ergun & Gorgolewski, 2015). The indicator material stock per building represents the average material stock of a type of building, and it is usually extracted from building specifications, drawings, and the like (de Tudela et al., 2020; Heeren & Hellweg, 2019).
7. Regression-enabled approach. This approach uses several easy-to-access feature data (e.g., building height, floor area, building age) and machine learning regression models to predict the material stock of individual buildings (Yuan et al., 2022). Summing all individual stocks yields the total BMS. The data required for machine learning regression model training are from building demolition projects.
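The multiplication principle shared by approaches 2 to 5 can be illustrated with a minimal Python sketch of the demand-driven route. The function names and all numeric inputs below are hypothetical placeholders chosen for illustration; they are not values from this study or from the cited literature.

```python
# Hypothetical illustration of the "GFA x material intensity" principle.
# All numbers are made up for demonstration purposes.

def gfa_from_demand(population: float, floor_area_per_capita_m2: float) -> float:
    """Demand-driven estimate of gross floor area (m2)."""
    return population * floor_area_per_capita_m2


def stock_tonnes(gfa_m2: float, intensity_kg_per_m2: float) -> float:
    """Material stock in tonnes: GFA (m2) times material intensity (kg/m2)."""
    return gfa_m2 * intensity_kg_per_m2 / 1000.0  # kg -> t


population = 1_000_000        # assumed number of residents
per_capita_area = 30.0        # assumed floor area per person (m2)
concrete_intensity = 1500.0   # assumed concrete intensity (kg/m2)

gfa = gfa_from_demand(population, per_capita_area)
concrete_stock = stock_tonnes(gfa, concrete_intensity)
print(f"GFA: {gfa:,.0f} m2, concrete stock: {concrete_stock:,.0f} t")
```

The approaches differ mainly in how the GFA term is obtained (official statistics, population demand, GIS footprints, or nighttime light), while the final multiplication step is the same.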

Case study areas
As summarized by Lanau et al. (2019), previous BMS case studies have mainly investigated developed countries and cities, such as the Netherlands (Müller, 2006; Sprecher et al., 2021), Germany (Ortlepp et al., 2016, 2018; Schiller, 2007), Switzerland (Heeren & Hellweg, 2019; Wittmer et al., 2007), Japan (Fishman et al., 2014; Hashimoto et al., 2007), and Vienna (Gassner et al., 2020; Kalcher et al., 2017; Kleemann et al., 2017), possibly because the data needed for BMS estimation are readily accessible. In recent years, case studies in developing areas have also increased. For instance, Hong et al. (2016) estimated the material stock of China's residential and commercial buildings. Han and Xiang (2013) also conducted a nationwide BMS quantification in China, considering residential buildings. Other BMS studies have made efforts to cover all building types by narrowing the spatial scope to the city level, for example, Beijing (Hu et al., 2010; Mao et al., 2020) and Shanghai (Gao et al., 2020). A recent study adopted the nighttime light-based approach to estimate the BMS of three urban agglomerations in China: Beijing-Tianjin-Hebei, the Yangtze River Delta, and the GBA.

Research gaps
As reviewed above, existing BMS studies have proposed multiple BMS quantification methods and investigated many different countries and cities.
Nevertheless, research gaps remain in this field. First, the applicability of most existing methods is subject to the availability of highly integrated data. Specifically, the material flow-driven approach needs detailed material in/out-flow statistics, but such data generally only indicate nationwide material flows. Thus, this approach is applicable for nationwide BMS accounting but not for other spatial scales such as city and region (Cheng et al., 2018). It is the same for the GFA statistics-enabled approach. The demand-driven approach has been frequently used to quantify material stocks in residential buildings, but rarely applied in non-residential buildings due to the lack of key indicator data (i.e., per capita non-residential floor area).
As for the nighttime light-based approach, this outputs higher than actual BMS unless light radiance from lit infrastructures (e.g., streets, parks, urban roads, piers, and so on) can be accurately measured and then eliminated (Hsu et al., 2011). The GIS database-enabled, building archetype-based, and regression-enabled approaches are data intensive. Their use requires massive efforts to aggregate raw data for refining input variables, for example, massive data for defining building archetypes or multiple building feature data for the regression-enabled approach. Thus, innovations in BMS quantification methodologies are still needed to tackle data availability restrictions.
Meanwhile, as noted above, developing areas have received relatively less research attention than developed areas. The amount, composition, distribution, and accumulation rate of BMS in underdeveloped areas may have different patterns from those in developed ones. Hence, more research attention should be paid to emergent nations and regions (Lanau et al., 2019; Vilaysouk et al., 2021). The proposed study area, that is, the GBA, is a core urban agglomeration of the world's largest developing country, China. A recently published work quantified the BMS in the GBA based on nighttime light (Liang et al., 2023), but as explained above, such estimates tend to be higher than actual BMS (Hsu et al., 2011). This work also adopted national average material intensities and only distinguished them according to the building structure difference (brick-concrete structure and reinforced-concrete structure), introducing further quantification errors. Thus, finely characterizing BMS in the GBA will be an important contribution.

The Guangdong-Hong Kong-Macao Greater Bay Area
The case study area, the GBA, is located in the southern coastal region of China, as shown in Figure 1. It has a total area of around 56,000 km².
In 2020, the total population was over 86 million and the gross domestic product (GDP) was US$1668.8 billion (CMAB, 2022). The development objective of the GBA is to facilitate in-depth integration within the region and promote coordinated regional economic development, with a view to establishing an international first-class bay area ideal for living, working, and traveling (Bao et al., 2020).
In this study, an underlying assumption is that the buildings in the GBA can be treated as a homogenous whole based on the following facts and literature:
• The 11 GBA cities are all situated in the same climate zone. According to Yang, Lyu et al. (2020) and Zhang et al. (2022), buildings in the same climate zone tend to have similar design and construction standards, meaning similar material intensities. Being in the same seismic zone also makes their buildings' material intensities similar (Ortlepp & Gul, 2021).
• The economic difference between these cities may increase heterogeneity in the proportions of different building types. For example, the developed GBA cities (Hong Kong and Macao) tend to have a higher commercial building proportion and a smaller industrial building percentage than the developing members (e.g., Zhaoqing and Huizhou). However, this impact is very limited as residential buildings generally take up the dominant share (by floor area) in cities, ranging from approximately 70% to 90% (Harvey et al., 2014).
• Besides, the construction standards of the 11 GBA cities are similar. First, they have the same architectural origin (Lingnan architecture) (Carroll, 2007; Xue, 2014), meaning that the original building specifications should be similar. Although some European buildings appeared in Hong Kong after its colonization by Britain, their proportion was rather small (Carroll, 2007). Second, most construction materials used in Hong Kong are imported from cities in Mainland China, including some GBA members like Shenzhen, Guangzhou, and Foshan (CEED, 2023). This might lead to the similarity of construction standards between Hong Kong and those GBA exporters. Last, in recent years, with the cooperation of the Hong Kong and Macao governments, the Guangdong government has expedited the unification of construction standards within the GBA (LOCPG, 2021).

Developed statistical regression model
Dependent variable:

The rationale
As indicated by several previous studies, material stocks potentially correlate with some demographical features like population (Deng et al., 2022; Miatto et al., 2017), and socioeconomic features such as GDP (Deng et al., 2022; Miatto et al., 2017) and urbanization rate (Deng et al., 2022). If the correlation can be modeled in a robust statistical regression, then one can leverage the statistical regression model to estimate the material stocks in a district. The data of these feature variables are freely available in many cities and countries; consequently, this regression method may significantly reduce data availability restrictions faced by previous BMS quantification approaches. Inspired by this idea, this study quantifies the BMS in the GBA based on one or more feature variables and statistical regression.
Modeling the correlation between BMS and feature variables needs feature data as the independent variable and known BMS data as the dependent variable. For cities in the GBA, obtaining the required feature data is easy but the BMS data do not exist. To tackle this problem, we select Hong Kong as a case to develop the proposed regression model because the raw data needed for deriving accurate BMS are freely available in Hong Kong but not in other GBA cities. Also, the developed model based on the context of Hong Kong can be generalized to estimate the BMS of other GBA cities, because of their building similarity (see Section 3.1).
To derive the BMS of Hong Kong, this research adopts the GFA statistics-enabled approach, as presented in Equation (1). The stock of frequently used building materials is accounted for and then aggregated to derive the total BMS. To improve BMS quantification accuracy, all buildings in Hong Kong are classified into multiple clusters according to type and structure differences. This is a common error minimization practice in previous BMS studies.
TABLE 1 Independent variables preliminarily selected for building material stock estimation.

No. | Independent variable | Explanation | Reference
1 | Permanent population | Also known as usual residents; equal to the total population minus commuters (unit: person) | Bergsdal et al. (2007); Ortlepp et al. (2016)
2 | Per capita GDP | Total GDP divided by total population in a year (unit: CNY/person) | Deng et al. (2022)
3 | Construction GDP | GDP of the construction industry in a year (unit: million CNY) | Hashimoto et al. (2009); Ortlepp et al. (2016)
4 | Urbanization rate | Ratio of urban population to total population (containing urban and rural populations) | Cao et al. (2018); Deng et al. (2022); Huang et al. (2013)
5 | Electricity consumption | Total amount of electricity consumed in a year (unit: terajoule) | Guo et al. (2019)
6 | Newly constructed floor area | Gross floor area of newly constructed buildings every year (unit: m²) | Guo et al. (2019)
Note: Gross floor area refers to the area contained within the external walls of the building measured at each floor level, including any floor below the level of the ground but excluding any floor space used solely for parking and equipment installation (HKBD, 2023).
$$BMS = \sum_{m=1}^{M} \sum_{t=1}^{T} \sum_{s=1}^{S} GFA_{t,s} \times MI_{m,t,s} \tag{1}$$
where $M$ refers to building material types (e.g., concrete, wood, brick, and so on). $T$ represents building types (e.g., residential building, commercial building, and so on). $S$ means building structures (e.g., brick-concrete structure, reinforced-concrete structure, and so on). Other building attributes that may influence the material intensity (e.g., construction year) are not considered due to data accessibility limitations. $GFA_{t,s}$ refers to the gross floor area of the $t$ type of $s$ structure building. $MI_{m,t,s}$ represents the material intensity of the $m$ type of material in the $t$ type of $s$ structure building (unit: kg/m²).
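As an illustration of the aggregation described by these definitions, the following minimal Python sketch sums GFA times material intensity over materials, building types, and structures, assuming that this triple sum is what Equation (1) expresses. The cluster names, GFA values, and intensities are invented for demonstration and are not the Hong Kong dataset.

```python
from collections import defaultdict

# Hypothetical GFA (m2) per (building type, structure) cluster.
gfa = {
    ("residential", "reinforced-concrete"): 2_000_000.0,
    ("residential", "brick-concrete"): 500_000.0,
    ("commercial", "composite"): 300_000.0,
}

# Hypothetical material intensities (kg/m2) per (material, type, structure).
intensity = {
    ("concrete", "residential", "reinforced-concrete"): 1400.0,
    ("steel", "residential", "reinforced-concrete"): 60.0,
    ("concrete", "residential", "brick-concrete"): 600.0,
    ("brick", "residential", "brick-concrete"): 500.0,
    ("concrete", "commercial", "composite"): 1200.0,
    ("steel", "commercial", "composite"): 150.0,
}


def material_stocks(gfa_by_cluster, intensity_by_key):
    """Return the stock of each material in tonnes, summed over all clusters."""
    stocks = defaultdict(float)
    for (material, btype, structure), mi in intensity_by_key.items():
        area = gfa_by_cluster.get((btype, structure), 0.0)
        stocks[material] += area * mi / 1000.0  # kg -> t
    return dict(stocks)


stocks = material_stocks(gfa, intensity)
total_bms = sum(stocks.values())  # total stock across all materials (t)
for name, tonnes in sorted(stocks.items()):
    print(f"{name}: {tonnes:,.0f} t")
print(f"total: {total_bms:,.0f} t")
```

Keying the intensities by (material, type, structure) mirrors the clustering described in the text, so adding a further attribute such as construction year would only require extending the keys.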

Data sources

BUILDING MATERIAL STOCK QUANTIFICATION
According to the methodology shown in Figure 2, the first step is to develop the statistical regression model based on the data collected from Hong Kong, and the second step is to use the model to quantify the BMS of the GBA. The two steps are expanded upon as follows.

Independent variables
Multiple studies have indicated that BMS is correlated with several socioeconomic feature variables, such as GDP and population. We reviewed these studies and then extracted several independent variables from them. Selecting variables in this way helps prevent the data fishing problem, which refers to the unethical activity of picking only the "correct" observations (Sang & Aitkenhead, 2020). Also, we referred to the indicators used for building energy consumption prediction and refined some indicators as potential independent variables for BMS estimation. Table 1 summarizes all feature variables preliminarily extracted from the literature. Some other features may also have a significant correlation with BMS, for example, annual investment in construction activities (Ortlepp et al., 2016), built-up area (Yu et al., 2020), construction and demolition waste amount (Lu et al., 2021), and even carbon emission (Peng et al., 2021; Wu et al., 2018); but this study does not consider these variables because acquiring their data is relatively difficult.
After selecting potential feature variables, different data sources were accessed to acquire feature data. Specifically, the total GDP, population, construction GDP, and electricity consumption were extracted directly from the annual statistics published by the Hong Kong Census and Statistics Department (HKCSD, 2022).
Annual constructed floor area was extracted from the statistics of HKBD (2023). Annual per capita GDP and annual urbanization rate were calculated from the population and total GDP data of each year.

Dependent variables
The BMS of Hong Kong is the dependent variable. As clarified in Section 3, Equation (1) presents the BMS calculation method. First, this study takes 11 building materials into account: aluminum, asphalt, brick, ceramic tile, concrete, copper, glass, lime, plastic, steel, and wood. These may not cover all materials used in buildings, but they do make up the majority in terms of quantity and type. Some previous BMS studies quantify the stock of cement, sand, and gravel rather than that of concrete (Liang et al., 2023; Miatto et al., 2021). For convenience, this research quantifies concrete; cement, sand, and gravel stocks can then be derived by dividing the concrete stock according to the proportions provided in concrete composition studies (Ji et al., 2006) and standard specifications (Nawy, 2008).
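The conversion from a concrete stock to its constituent stocks can be sketched as a simple proportional split. The mass fractions below are placeholders for illustration only; they are not the proportions given in Ji et al. (2006) or Nawy (2008), and the function name is hypothetical.

```python
# Placeholder mass fractions of concrete constituents (illustrative only).
# The remaining share is taken to be water and other components.
ASSUMED_MIX = {"cement": 0.15, "sand": 0.30, "gravel": 0.45}


def split_concrete(concrete_t, mix=None):
    """Split a concrete stock (t) into constituent stocks by mass fraction."""
    mix = ASSUMED_MIX if mix is None else mix
    if sum(mix.values()) > 1.0 + 1e-9:
        raise ValueError("mass fractions must not sum to more than 1")
    return {name: concrete_t * share for name, share in mix.items()}


parts = split_concrete(1000.0)  # a hypothetical 1000 t concrete stock
print(parts)
```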
Then, all buildings are divided into different cohorts according to building type and structural differences. Referring to Mollaei et al. (2021) and Kleemann et al. (2017), this study distinguishes five building types, namely, commercial, residential, industrial, public, and other. Residential, industrial, public, and other buildings are further classified into either the brick-concrete or reinforced-concrete structure group; commercial buildings are divided into three structure groups: brick-concrete, reinforced-concrete, and composite. The latter is a combination of reinforced-concrete (as the core) and steel structure (as the outer embracing frame) (Wong, 2003). This kind of structure has been widely applied in super high-rise commercial buildings in Hong Kong (Wong, 2003). Additionally, steel structure buildings, while common in some cities, are rare in Hong Kong (less than 2% by building quantity) (Yang et al., 2019); these are thus excluded from the structure division. More details about differentiating building types and structures are provided in Supporting Information S1.
After determining material types and classifying building groups, the next step determines material intensities. The detailed material intensity coefficient for buildings in the GBA has not yet been investigated, but several building energy studies (Chau et al., 2012; Sham et al., 2011) have contributed some construction material consumption data obtained from the bill of quantity (BoQ) of sampled buildings in Hong Kong. Also, a recent BMS study conducted by Yuan et al. (2022) provides a few material intensity references for concrete. Multiple BMS studies based on China's context provide nationwide average material intensities (Cao et al., 2018; Huang et al., 2018; Yang, Guo et al., 2020). With these references, we decided to leverage the nationwide average material intensities as a benchmark and then calibrate it by making full use of the data from Hong Kong, so as to obtain a more precise material intensity set for Hong Kong. Supporting Information S2 provides the complete building material intensity dataset built by this study.
Next, we leveraged the iB1000 database from Hong Kong's Lands Department and the new construction and demolition datasets from the Buildings Department to calculate the annual net GFA between 2000 and 2021. The iB1000 database provides the net GFA in 2021, while the annual completed and demolished GFAs from 2000 to 2021 are displayed in the Buildings Department's datasets. Using Equation (2), we backward derive the annual net GFA from 2020 to 2000. For conciseness, the detailed GFA calculation procedure and result appear in Supporting Information S1.
where $i$ refers to year $i$ and ranges from 2021 to 2000. $NGFA_i$ means the net GFA in year $i$. $CGFA_i$ represents the newly constructed GFA in year $i$. $DGFA_i$ is the demolished GFA in year $i$.
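The backward derivation can be sketched in Python under the assumption that Equation (2) takes the form NGFA(i-1) = NGFA(i) - CGFA(i) + DGFA(i), that is, the net GFA at the end of the previous year equals the current net GFA minus what was completed plus what was demolished during year i. The function name and all figures are hypothetical.

```python
def backward_net_gfa(end_year, net_gfa_end, constructed, demolished, start_year):
    """Walk back from end_year to start_year.

    Assumed recursion: NGFA[i-1] = NGFA[i] - CGFA[i] + DGFA[i].
    `constructed` and `demolished` map a year to the GFA built or
    demolished during that year.
    """
    net = {end_year: net_gfa_end}
    for year in range(end_year, start_year, -1):
        net[year - 1] = net[year] - constructed[year] + demolished[year]
    return net


# Hypothetical figures in million m2 (not the Hong Kong statistics).
constructed = {2021: 2.0, 2020: 1.5, 2019: 1.8}
demolished = {2021: 0.4, 2020: 0.3, 2019: 0.5}

net = backward_net_gfa(2021, 300.0, constructed, demolished, 2018)
for year in sorted(net):
    print(year, round(net[year], 2))
```

Only the end-of-period stock and the two annual flow series are needed, which matches the situation described in the text where the building database records the current state only.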
The final step is to input material intensities and GFA data into Equation (1) to calculate the annual stock of the 11 selected materials, and then sum up all individual material stocks to obtain the total BMS in Hong Kong every year.

Statistical regression modeling
Based on the independent variable data introduced in Section 4.1.1 and the dependent variable data stated in Section 4.1.2, the proposed statistical regression model can be developed. Prior to regression modeling, proper metrics are needed for assessing model performance. This study chooses four metrics popular in regression modeling research, that is, root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the adjusted coefficient of determination (adjusted R²). RMSE represents the standard deviation of the residuals (prediction errors).

TABLE 2 The key parameters of developed multi-linear regression models.

Material stock type | Formula of multi-linear regression models | MAE | MAPE (%) | RMSE | Adjusted R-square

The next step is to determine an appropriate regression method. Several existing studies (Deng et al., 2022; Miatto et al., 2017) imply that most of the independent variables listed in Table 1 show a roughly linear correlation with BMS. Hence, this research adopts the multi-linear regression method to model the correlation between feature variables and BMS. Equation (3) presents the general formula of multi-linear regression. Moreover, given the number of independent variables, a stepwise linear regression technique is used for modeling. Stepwise linear regression refers to constructing a regression model iteratively (Agostinelli, 2002). In stepwise regression, independent variables are added into or removed from the regression model one by one (respectively, "forward stepwise regression" and "backward stepwise regression"). Adding or removing one independent variable represents an iteration. After each iteration, a statistical significance test is performed to analyze whether the dependent variable outputs present a significant difference (measured by p value) from previous ones. Using the stepwise regression technique can achieve an optimal regression performance while using as few feature variables as possible, consequently reducing the feature data collection cost.
$$f(x_1, \ldots, x_n) = \beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n + \varepsilon \tag{3}$$
where $f(x_1, \ldots, x_n)$ refers to the dependent variable (e.g., BMS in this study). $x_n$ refers to independent variables (e.g., GDP and population in this study). $\beta_n$ represents the coefficient of each independent variable. $n$ is the number of independent variables. $\beta_0$ is the constant term (also called "model intercept"). $\varepsilon$ is the model error term (also known as "model residual").
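To make the multi-linear form and the four performance metrics concrete, the following Python sketch fits an ordinary least-squares model with two predictors and evaluates it. It is a minimal illustration using only the standard library, not the MATLAB stepwise procedure used in this study; the data are synthetic, and the function names are hypothetical. RMSE is computed here with a divisor of n.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ..., bn] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    """Evaluate b0 + b1*x1 + ... + bn*xn for one observation."""
    return coef[0] + sum(c * x for c, x in zip(coef[1:], row))


def metrics(y, y_hat, n_features):
    """Return RMSE, MAE, MAPE (%) and adjusted R-square."""
    n = len(y)
    err = [a - p for a, p in zip(y, y_hat)]
    ss_res = sum(e * e for e in err)
    mean_y = sum(y) / n
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    r2 = 1.0 - ss_res / ss_tot
    return {
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in err) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(err, y)) / n,
        "adj_R2": 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1),
    }


# Synthetic data generated from y = 10 + 2*x1 + 0.5*x2 (no noise).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [10 + 2 * x1 + 0.5 * x2 for x1, x2 in rows]

coef = fit_ols(rows, y)
fitted = [predict(coef, r) for r in rows]
scores = metrics(y, fitted, n_features=2)
print([round(c, 6) for c in coef], scores)
```

Because the synthetic response is noise-free, the fit recovers the generating coefficients and the error metrics are close to zero; with real data the same functions report the residual spread.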
Next is feature processing. Among the six selected feature variables, as some are cross-correlated (e.g., permanent population and per capita GDP), the intercorrelation may result in multicollinearity (sometimes called "collinearity"), where two or more input variables are highly correlated with each other (Alin, 2010). Multicollinearity may make input variables' coefficients unstable or even counterfactual (Alin, 2010), but it would not change the performance of regression models (Kutner et al., 2004). In this research, the major focus is the regression model's performance rather than its feature variable parameters, so it is not necessary to take additional measures to deal with multicollinearity; furthermore, as mentioned above, this study uses the stepwise linear regression technique, which is an effective means of controlling variable collinearity (Chang & Mastrangelo, 2011).
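For readers who wish to gauge how strongly two candidate features are correlated, a common diagnostic is the Pearson correlation together with the variance inflation factor (VIF). The sketch below is an optional illustration and not a step performed in this study; it relies on the fact that, for exactly two predictors, VIF equals 1 / (1 - r²). The sample series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical feature series (not the Hong Kong statistics).
population = [6.7, 6.8, 7.0, 7.2, 7.4]            # million persons
electricity = [140.0, 146.0, 150.0, 158.0, 161.0]  # thousand TJ

r = pearson(population, electricity)
vif = vif_two_predictors(population, electricity)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```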
Finally, we implement the designed regression modeling scheme in MATLAB. For the concrete stock model, the two retained features (permanent population and electricity consumption) explain around 97.9% of the variation of the dependent variable (concrete stock). As with the concrete stock model, Table 2 shows that all multi-linear regression models perform well in estimating BMS and can be used to quantify the BMS of the GBA.
Regarding the model feature variables, whether the total BMS or the individual material stock is used as the dependent variable, stepwise linear regression results indicate that the four features, that is, urbanization rate, per capita GDP, construction GDP, and completed floor area, have no significant contribution (p value > 0.05) to improving model estimation accuracy, and thus are removed. Permanent population and electricity consumption are retained due to their significant contributions (p value < 0.05) to improving estimation accuracy. The permanent population and electricity consumption data are freely available in the government's annual statistics. Thus, applying these regression models in quantifying the GBA BMS will, to a large extent, overcome the data availability problem faced by existing BMS quantification approaches.

Applying regression models in the GBA
The model application procedure comprises three steps. First, the permanent population and electricity consumption of the other 10 GBA cities were extracted from their government's annual statistics, such as the Statistical Yearbook by Shenzhen Statistics Bureau (SSB, 2021). Note that these data can be for 1 year only (to estimate static BMS) or for a period (to reveal temporal dynamic BMS). We gathered data ranging from 2000 to 2021 to uncover the temporal BMS change in the GBA. The second step is data processing, mainly including unifying feature variables' data units and filling in their missing values (less than 1% data missing rate in this study). The last step is inputting these processed feature data into the multi-linear regression models to estimate each city's BMS. Through these steps, we successfully quantified the annual BMS of the GBA from 2000 to 2021.
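The three application steps can be sketched as follows. This is a minimal illustration under stated assumptions: the gap-filling rule (linear interpolation) is assumed, since the filling method is not specified here; the regression coefficients are invented placeholders rather than the fitted values of Table 2; and the input series are hypothetical. The unit conversion uses the exact relation 1 kWh = 3.6 × 10⁻⁶ TJ.

```python
def kwh_to_tj(kwh):
    """Convert kilowatt-hours to terajoules (1 kWh = 3.6e6 J, 1 TJ = 1e12 J)."""
    return kwh * 3.6e-6


def fill_linear(series):
    """Fill None gaps by linear interpolation between known neighbours.

    Leading or trailing gaps take the nearest known value (an assumption).
    """
    values = list(series)
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            values[i] = series[right]
        elif right is None:
            values[i] = series[left]
        else:
            w = (i - left) / (right - left)
            values[i] = series[left] + w * (series[right] - series[left])
    return values


# Placeholder coefficients (NOT the fitted model): BMS in Mt.
INTERCEPT = -50.0
B_POPULATION = 60.0      # Mt per million persons (assumed)
B_ELECTRICITY = 0.002    # Mt per TJ (assumed)


def estimate_bms(population_million, electricity_tj):
    """Apply the two-feature linear model to one city-year."""
    return INTERCEPT + B_POPULATION * population_million + B_ELECTRICITY * electricity_tj


# Hypothetical city series with one missing value each.
population = fill_linear([10.0, None, 12.0])
electricity = fill_linear([200_000.0, 210_000.0, None])
bms = [estimate_bms(p, e) for p, e in zip(population, electricity)]
print(population, electricity, [round(b, 2) for b in bms])
```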
Figure 3a visualizes the overall BMS change of the 11 GBA cities during the 21-year period. Basically, all cities witnessed material stock growth, but the growth rate varies from city to city. The most rapid BMS increase occurred in Guangzhou (from 534.75 to 1277.82 Mt) and Shenzhen (517.80 to 1235.48 Mt), and they have become the largest BMS contributors in the GBA. The total BMS in Dongguan, Foshan, and Huizhou also rose significantly. In comparison, the BMS increments in Hong Kong, Macao, Jiangmen, Zhongshan, Zhaoqing, and Zhuhai were small during the period.
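The reported start and end stocks for Guangzhou and Shenzhen can be turned into absolute increases, growth ratios, and implied average annual growth rates with a few lines of Python. The compound annual growth rate is a derived summary introduced here for illustration and is not a statistic reported by the study; the function name is hypothetical.

```python
def growth_summary(start_mt, end_mt, years):
    """Return (absolute increase, end/start ratio, compound annual growth rate)."""
    increase = end_mt - start_mt
    ratio = end_mt / start_mt
    cagr = ratio ** (1.0 / years) - 1.0
    return increase, ratio, cagr


# Start (2000) and end (2021) stocks in Mt, as reported in the text.
cities = {"Guangzhou": (534.75, 1277.82), "Shenzhen": (517.80, 1235.48)}
YEARS = 2021 - 2000

summary = {name: growth_summary(s, e, YEARS) for name, (s, e) in cities.items()}
for name, (inc, ratio, cagr) in summary.items():
    print(f"{name}: +{inc:.2f} Mt, x{ratio:.2f}, {cagr:.2%} per year")
```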
These findings indicate that the 21-year BMS growth in the GBA mainly resulted from Guangzhou, Shenzhen, Dongguan, Foshan, and Huizhou. In other words, the five cities were the major construction material consumers in the GBA from 2000 to 2021; also, it is foreseeable that their role may not change in the coming several years, according to their uptrends shown in Figure 3a. Regarding material composition (Figure 3b), concrete accounts for the largest share of the total BMS (72.32%), while brick (17.57%) and steel (4.71%) make up the second and third largest proportions, respectively. The other materials account for only a small percentage.

FIGURE 4 Mapping the spatiotemporal evolution of building material stock in the Greater Bay Area (unit: t). The underlying data can be found in Supporting Information S3.
Figure 4 maps the spatiotemporal BMS evolution. The spatial distribution of BMS in the GBA shows a clear core-edge pattern. That is, most BMS is accumulated in the four central cities, that is, Hong Kong, Shenzhen, Dongguan, and Guangzhou, while the peripheral cities only account for a small share. Moreover, it can be clearly observed that the BMS peak moved inland from the coastal area, namely, from Hong Kong to Shenzhen, and then to Guangzhou.

Comparison with previous studies
Comparing the BMS quantified by this study to the counterpart of previous studies can verify the validity of the statistical regression method.
Direct comparison of the total BMS is difficult due to the difference in covered material types, building types, and built-up area (Liang et al., 2023).
This research compares the BMS density (BMS per built-up area) of three GBA cities, where BMS densities are also available in existing studies.
As presented in Table 3, although the BMS density gap between this and previous studies is not negligible, it is acceptable when considering the large spatial scale involved. This result supports the effectiveness and feasibility of the statistical regression method; it can be an alternative regional BMS quantification approach, under the common situation where only one or a few member cities within a homogenous region have available data.
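The density comparison rests on two simple quantities: BMS per unit of built-up area, and the relative gap between two density estimates. A minimal Python sketch follows; all inputs are hypothetical and do not reproduce the values of Table 3, and the function names are illustrative.

```python
def bms_density(bms_t, built_up_area_km2):
    """BMS density in tonnes per km2 of built-up area."""
    return bms_t / built_up_area_km2


def relative_gap(this_study, reference):
    """Signed relative difference of this study's value against a reference."""
    return (this_study - reference) / reference


# Hypothetical inputs for one city.
density_here = bms_density(1_200_000_000.0, 1_000.0)  # t per km2
density_reference = 1_000_000.0                        # t per km2 (assumed)
gap = relative_gap(density_here, density_reference)
print(f"density: {density_here:,.0f} t/km2, gap: {gap:+.1%}")
```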

Figure 2 summarizes the overall methodological framework proposed by this research. It comprises two parts: (1) developing a regression model based on the Hong Kong case, and (2) applying the developed regression model to characterize the BMS of the GBA. The methodology is built on two theoretical foundations: (1) the significant statistical correlation between BMS and feature variables, and (2) the building similarity within a neighboring area, as clarified in Section 3.1.
FIGURE 2 The methodology of this research.
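The two-part framework can be illustrated with a small sketch: fit a two-variable linear model on a sample city, then apply the fitted coefficients to another city's feature variables. The estimator here is ordinary least squares solved through the normal equations, which is an assumption; the function names and the synthetic data are hypothetical and carry no real units or values.

```python
def fit_two_variable_ols(x1, x2, y):
    """Fit y = b1*x1 + b2*x2 + b0 by ordinary least squares."""
    rows = [[a, b, 1.0] for a, b in zip(x1, x2)]
    # Augmented normal-equation system (X^T X | X^T y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(3):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [v - factor * p for v, p in zip(aug[k], aug[col])]
    b1, b2, b0 = (aug[i][3] / aug[i][i] for i in range(3))
    return b1, b2, b0


# Part 1: fit on the sample city (synthetic data, generated from known coefficients).
x1_sample = [1.0, 2.0, 3.0, 4.0, 5.0]   # stand-in for permanent population
x2_sample = [2.0, 1.0, 4.0, 3.0, 6.0]   # stand-in for electricity consumption
y_sample = [3.0 * a + 2.0 * b + 10.0 for a, b in zip(x1_sample, x2_sample)]
b1, b2, b0 = fit_two_variable_ols(x1_sample, x2_sample, y_sample)

# Part 2: apply the fitted model to another city's feature variables.
estimate = b1 * 6.0 + b2 * 5.0 + b0
print(b1, b2, b0, estimate)
```

Transferring coefficients in part 2 is only meaningful under the second theoretical foundation, namely that buildings in neighboring cities are similar enough for the sample city's relationship to hold.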
In Hong Kong, a wide range of feature variable data (e.g., population, GDP, and urbanization rate) is accessible in the Annual Digest of Statistics issued by the Census and Statistics Department (HKC&SD, 2022). Databases provided by other Hong Kong Government departments, for example, the Lands Department (HKLD, 2022) and the Buildings Department (HKBD, 2023), also offer potential independent variables. Comparable public statistics containing rich feature variable data are also freely accessible in the other GBA cities.
Several open sources provide the raw data for calculating BMS in Hong Kong. The iB1000 database, a 1:1000 digital topographic map issued by the Lands Department (HKLD, 2022), contains the footprint, perimeter, height, usage type, and address of all buildings in Hong Kong. However, it can only yield the GFA at the end of 2021, because information on demolished buildings is removed from the database during regular updates. Two further datasets are therefore gathered and processed to obtain the past annual demolished GFA and constructed GFA, respectively. Both come from the annual statistics published by the Buildings Department (HKBD, 2023) and cover 2000 to mid-2022. Combining the two datasets with the iB1000 database allows us to derive the annual net GFA of different building clusters from 2000 to 2021. Regarding material intensities, no dataset currently exists for Hong Kong, but one can be created by aggregating data from previous publications, such as Cao et al. (2018) and Yang, Guo et al. (2020). With the abovementioned datasets, the annual BMS of Hong Kong between 2000 and 2021 can be calculated and used as the dependent variable for regression modeling.
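One way to read the data-processing steps above is as a back-casting balance: starting from the end-of-2021 GFA snapshot, each earlier year is recovered by subtracting that year's constructed GFA and adding back its demolished GFA, and the stock follows by multiplying GFA by material intensities. The sketch below assumes this balance and a single building cluster; the function names, intensities, and numbers are hypothetical.

```python
def backcast_net_gfa(gfa_end, constructed, demolished, end_year):
    """Back-cast end-of-year GFA from a known end-year snapshot.

    Assumed balance: GFA[t-1] = GFA[t] - constructed[t] + demolished[t].
    `constructed` and `demolished` map consecutive years to GFA (m^2).
    """
    gfa = {end_year: gfa_end}
    for year in sorted(constructed, reverse=True):
        gfa[year - 1] = gfa[year] - constructed[year] + demolished[year]
    return gfa


def material_stock(gfa_m2, intensity_t_per_m2):
    """Material stock per material: GFA times material intensity (t/m^2)."""
    return {name: gfa_m2 * mi for name, mi in intensity_t_per_m2.items()}


# Hypothetical inputs for one building cluster, for illustration only.
gfa_by_year = backcast_net_gfa(
    gfa_end=100.0,
    constructed={2021: 5.0, 2020: 4.0},
    demolished={2021: 1.0, 2020: 2.0},
    end_year=2021,
)
stock_2019 = material_stock(gfa_by_year[2019], {"concrete": 1.5, "steel": 0.1})
print(gfa_by_year, stock_2019)
```

In practice the balance would be applied per building cluster, each with its own material intensities, and the cluster stocks summed to obtain the annual BMS.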

FIGURE 3 (a) The total building material stock change of the 11 GBA cities from 2000 to 2021. (b) The proportion of different material stocks of the 11 Greater Bay Area cities. The underlying data can be found in Supporting Information S3.
Figure 4. Additionally, the total BMS increased from around 4.4 billion tonnes in 2000 to 7.7 billion tonnes in 2021, corresponding to an average growth rate of approximately 155.6 Mt/year.
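As a rough check on the average growth rate, the rounded endpoint totals quoted above can be divided by the 21-year span. This is only an approximation: the rounded figures give about 157 Mt/year, and the reported 155.6 Mt/year presumably derives from the unrounded totals.

```latex
\frac{7.7\times10^{3}\,\text{Mt} - 4.4\times10^{3}\,\text{Mt}}{2021 - 2000}
  = \frac{3300\,\text{Mt}}{21\,\text{yr}}
  \approx 157\,\text{Mt/year}
```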

Figure 3b reveals the proportion of different material stocks of the GBA. Concrete takes up the largest share (around 72.32%), and brick (17.57%)
Total material stock ($f_T$): $f_T = 31.44X_1 + 1162.76X_2 + 217621617.9$
Note: $X_1$ and $X_2$ represent permanent population and electricity consumption, respectively. The units of permanent population and electricity consumption are person and terajoule, respectively.
estimated values by a regression model. MAE measures the error between actual and estimated data points. MAPE is a measure of the estimation accuracy of a regression model. Adjusted R-square indicates how much variation of a dependent variable can be explained by independent variables in a regression model. More explanations about the four indicators can be found in Emmert-Streib and Dehmer (2019).
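The total material stock equation and the four assessment indicators can be sketched in a few lines. The coefficients are those quoted in the text; the indicator formulas are the commonly used definitions (RMSE and MAE averaged over n observations, MAPE in percent), which is an assumption because the formulas are not given here. The function names and the toy inputs are hypothetical.

```python
import math


def total_stock(population, electricity_tj):
    """Total material stock f_T from the equation quoted in the text.

    population: permanent population (persons); electricity_tj: terajoules.
    """
    return 31.44 * population + 1162.76 * electricity_tj + 217621617.9


def assessment_indicators(actual, estimated, n_predictors):
    """RMSE, MAE, MAPE (%) and adjusted R-square, using common definitions."""
    n = len(actual)
    residuals = [a - e for a, e in zip(actual, estimated)]
    ss_res = sum(r * r for r in residuals)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "mape": 100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n,
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1),
    }


# Toy values for illustration only; they are not data from the study.
scores = assessment_indicators(
    actual=[100.0, 200.0, 300.0, 400.0],
    estimated=[110.0, 190.0, 310.0, 390.0],
    n_predictors=2,
)
print(total_stock(1_000_000, 1000), scores)
```

RMSE and MAE carry the unit of the dependent variable, so they are only meaningful relative to the stock magnitude, whereas MAPE and adjusted R-square are scale-free and can be compared across models.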
Table 2 showcases the key parameters of the developed multiple linear regression models. Among the four selected model assessment indicators, the MAPE and Adjusted R-square values of different models are comparable, while the others are not. When using the concrete stock as the dependent variable, the developed regression model has the largest MAPE (0.28%) and the smallest Adjusted R-square (0.979). These two extreme values mean that this model performs worst among the models. However, a MAPE of 0.28% (equivalent to an accuracy of 99.72%) indicates that the regression model already has a sufficiently high estimation accuracy. This is further supported by the MAE of 1,212,090, which is much smaller than the concrete stock value, for example, 407,605,105 in 2000 and 459,168,853 in 2021. Meanwhile, relative to the concrete stock value, an RMSE of 1,616,316 means that the residuals of the estimated concrete stock have a very small standard deviation. The Adjusted R-square value (0.979) indicates that