The PIOLab: Building global physical input-output tables in a virtual laboratory

Informed environmental-economic policy decisions require a solid understanding of the economy’s biophysical basis. Global physical input-output tables (gPIOTs) collate a vast array of information on the world economy’s physical structure and its interdependence with the environment. However, building gPIOTs requires dealing with mismatched and incomplete primary data with high uncertainties, which makes it a time-consuming and labor-intensive endeavor. We address this challenge by introducing the PIOLab, a virtual laboratory for building gPIOTs. It represents the newest branch of the Industrial Ecology virtual laboratory (IELab) concept, a cloud-computing platform and collaborative research environment through which participants can use each other’s resources to assemble individual input-output tables targeting specific research questions. To overcome the lack of primary data, the PIOLab builds extensively upon secondary data derived from a variety of models commonly used in Industrial Ecology. We use the case of global iron-steel supply chains to describe the architecture of the PIOLab and highlight its analytical capabilities. A major strength of the gPIOT is its ability to provide mass-balanced indicators on both apparent/direct and embodied/indirect flows, for regions and disaggregated economic sectors. We present the first gPIOTs for 10 years (2008-2017), covering 32 regions, 30 processes and 39 types of iron/steel flows. Diagnostic tests of the data reconciliation show a good level of adherence between raw data and the values realized in the gPIOT. We conclude by elaborating on how the PIOLab will be extended to cover other materials and energy flows.


INTRODUCTION
Progress towards sustainability necessitates comprehensive, quantitative research analyzing the relation between socioeconomic activities and biophysical processes across different spatiotemporal scales (Haberl et al., 2019). Socioeconomic activities are directly and indirectly enabled by 'free gifts of nature' (Daly, 1968; Duchin, 2009; Leontief, 1970), i.e. the extraction of resources from, and dilution of pollutants to, the natural environment. These environmental pressures are a dominant driving force of global environmental change, e.g. global warming, biodiversity loss and environmental degradation (Steffen et al., 2015). Due to globalization, production and consumption activities take place at an increasing spatial distance (Fischer-Kowalski et al., 2015; Schaffartzik, Mayer, Eisenmenger, & Krausmann, 2016), giving rise to environmental burden shifting between regions through international trade (Oita et al., 2016; Peters, Minx, Weber, & Edenhofer, 2011). Global models that reflect such biophysical interdependencies can play a pivotal role in informing environmental-economic policy decisions.
A wide spectrum of models and accounting frameworks is available to industrial ecologists, each describing certain aspects of the socioeconomic metabolism (SEM), i.e. the biophysical flows exchanged between societies and their natural environment as well as the flows within and between social systems (Haberl et al., 2019; Pauliuk, Majeau-Bettez, & Müller, 2015). This includes process-based life-cycle assessment (LCA; EU JRC, 2010), environmentally extended monetary input-output analysis (EEIOA; Miller & Blair, 2009), material or substance flow analysis (MFA/SFA; Baccini & Brunner, 2012) and economy-wide material flow accounting (ew-MFA; Krausmann, Schandl, Eisenmenger, Giljum, & Jackson, 2017). To overcome the limitations of individual methods, various combinations of approaches have been presented. For example, the waste input-output (WIO) model was developed to explicitly take into account the interdependence between flows of goods and wastes in input-output (IO) analysis (Nakamura & Kondo, 2002); hybrid LCAs were proposed to relax the system boundary selection problem; and, more recently, dynamic MFA techniques (Müller, Hilty, Widmer, Schluep, & Faulstich, 2014) are increasingly applied to incorporate stock-flow relations in ew-MFA (Krausmann, Wiedenhofer et al., 2017; Wiedenhofer, Fishman, Lauk, Haas, & Krausmann, 2019). Such cross-fertilizations between modelling schools have the potential to be more comprehensive and informative than any single approach by itself.
Physical IO tables (PIOTs), sometimes referred to as IO-based MFA (Courtonne, Alapetite, Longaretti, Dupré, & Prados, 2015; Nakamura, 2011), are conceived as a way of combining the strengths of IO modeling and (ew-)MFA (Wachs & Singh, 2018). When constructed following an economy-wide perspective using a single physical unit (e.g. metric tonnes), PIOTs represent a mass-balanced map of biophysical flows between socio-metabolic processes, including society-nature interactions. These principles bring a number of advantages. First, PIOTs can circumvent the homogeneous price assumption of EEIOA, which implies proportionality between monetary and physical flows (Weisz & Duchin, 2006) 1. In fact, physical quantities are not sold at the same price to all consumers, an issue already discussed in early energy IO analysis (Bullard & Herendeen, 1975). Second, PIOTs capture all physical flows whether they have a monetary value or not, including wastes and secondary materials. This is an advantage over monetary IO tables (MIOTs) whenever research questions focus more strongly on the biophysical dimension of economic activities. For example, as EEIOA is increasingly used for analyzing circular economy strategies (Aguilar-Hernandez et al., 2019; Çetinay, Donati, Heijungs, & Sprecher, 2020; Donati et al., 2020; Tisserant et al., 2017; Towa, Zeller, & Achten, 2020; Wiebe, Harsdorff, Montt, Simas, & Wood, 2019), the reliance on MIOTs has been identified as a major limitation because circularity policies are usually defined in physical units (Aguilar-Hernandez, Sigüenza-Sanchez, Donati, Rodrigues, & Tukker, 2018). Third, by tracing material flows through economic sectors in a mass-balanced manner, PIOTs open up the black box of ew-MFA and thereby widen its applicability for policy-oriented analysis that focuses on specific economic activities, such as manufacturing, construction, household consumption or public procurement (Altimiras-Martin, 2014; Giljum & Hubacek, 2009).
Finally, PIOTs contribute to widening the empirical scope of MFA studies, whose main interest is the quantification of material cycles. So far, most MFA/SFA studies are geographically constrained to single (world) regions and many times trade flows of final goods are ignored (Chen & Graedel, 2012). Due to globalized production-consumption systems, however, closing material cycles must be conceived at the global level (Graedel, Reck, Ciacci, & Passarini, 2019), an undertaking for which a global multi-regional PIOT (gPIOT) provides a practical framework. gPIOTs can function as an integration framework for different data sources. The System of Environmental-Economic Accounting (SEEA; United Nations et al., 2014), which follows the accounting conventions from the System of National Accounts, arranges physical flows and stocks in a series of accounts. Global physical supply-use tables (PSUTs), the building blocks of gPIOTs, play a key role in this regard. Data for gPIOTs must be obtained from a variety of sources, such as energy and waste accounts and statistics on production, recycling, emissions and international trade. The application of the mass balance principle means that a coherent picture emerges from the integration of the different data sources.
Like their monetary counterparts, PIOTs are generally underdetermined systems, i.e. not all data elements of the table are explicitly informed by primary data. A recurring challenge of all PIOT construction efforts is dealing with incomplete and mismatched data sources with high uncertainties, which makes the construction process a time-consuming and labor-intensive undertaking. However, in order to be relevant for policy making, IO models need to be created and updated in a timely, continuous and cost-efficient way (Wiedmann, Wilting, Lenzen, Lutter, & Palm, 2011).
The present work addresses this challenge by introducing the PIOLab, a virtual laboratory for building gPIOTs. It represents the newest branch of the Industrial Ecology virtual laboratory (IELab) concept, which is a collaborative research environment with a cloud-based high-performance computing platform and a single-step reconciliation engine. This enables modelers to use each other's resources to assemble their own tailor-made large-scale multi-regional input-output model (MRIO) that is fit for purpose in a cost-efficient and timely way (Lenzen, 2014). IELabs are an evolving niche of IO research that hitherto were mainly used for compiling environmentally extended monetary IO databases (Geschke & Hadjikakou, 2017). Since the construction of the forerunner IELab for Australia (Lenzen, Geschke, Malik et al., 2017; Wiedmann, 2017), various branches have been developed, including a global MRIO laboratory (Lenzen, Geschke, Abd Rahman et al., 2017) and several regional, i.e. subnational, MRIO modeling suites for various countries, including for example Japan (Wakiyama, Lenzen, Geschke, Bamba, & Nansai, 2020), China (Wang, 2017), Indonesia (Faturay, Lenzen, & Nugraha, 2017), Taiwan (Faturay, Sun et al., 2020) and the United States (Faturay, Vunnava, Lenzen, & Singh, 2020).
The PIOLab is the first IELab designed for building purely physical IO tables. To overcome the lack of primary data in the physical domain, the PIOLab philosophy is to build extensively upon secondary data derived from accounting frameworks and models that each describe certain aspects of SEM, such as MFA, ew-MFA, WIO-MF and LCA. To describe the architecture of the PIOLab and to highlight its analytical capabilities, we showcase the first gPIOT of global iron-steel supply chains, which covers 32 regions, 30 processes and 39 types of iron/steel flows for the years 2008-2017.
Steel was selected as a first case study for several reasons. It is a key building block of manufactured capital, i.e. in-use stocks (Graedel, 2010; Song et al., 2020; Weisz, Suh, & Graedel, 2015), and a strategic material for sustainable development (Müller, Wang, & Duval, 2011; Pauliuk & Müller, 2014), but also a major concern for local pollution (e.g. particulate matter) and CO2 emissions (Ryberg, Wang, Kara, & Hauschild, 2018; van der Voet, 2013). The iron and steel industry is the largest industrial source of CO2 emissions from fossil fuels (5% in 2014) and, due to the high energy intensity of steel production, particularly challenging to decarbonize (Carpenter, 2012; Davis et al., 2018). With steel being a highly recyclable material, one of the most important technological options for the steel industry, besides energy and material efficiency measures (Allwood, 2013; Allwood, Cullen, & Milford, 2010), is to transition towards a circular economy by increasing the use of steel scrap as feedstock (Nechifor et al., 2020). Considering all this in the context that metal supply chains are increasingly organized on the global level (Schaffartzik et al., 2016), iron and steel provides an excellent case study to present the challenges faced in the compilation of gPIOTs and the utility of the PIOLab.
The paper is structured as follows. Section 2 describes the methodological features of the PIOLab and its application to the iron-steel case. Section 3 presents an analysis of the alignment of the final gPIOT with its raw data sources and process descriptions, which includes realizations of input and output ratios of the blast furnace process. Furthermore, this section presents a comparison of ew-MFA headline indicators and an overview of international trade in iron-steel supply chains. Section 4 (discussion and outlook) discusses the advantages, shortcomings and policy relevance of the PIOLab approach and outlines upcoming developments.

METHODOLOGY
The method section is structured in three parts. It starts by briefly summarizing the features that are common to all IELabs (2.1). General characteristics and unique features of the PIOLab approach are presented in 2.2, followed by the particularities of building iron-steel gPIOTs (2.3), i.e. data sources and important modelling assumptions.

2.1. The IELab concept
An overarching goal of the IELab concept is to maximize flexibility so that users can compile their own tailor-made MRIO database that fits individual needs and research goals while at the same time giving access to data and code repositories that others have contributed. The IELab architecture rests on three pillars.
The first pillar is a generalized raw data management system that streamlines the extraction of information from data sources and organizes it for further processing. The functions that perform this task are called 'data feeds'. Processed raw data are matched to a root classification, which is the maximum sectoral and regional detail that is theoretically achievable 2, assuming universal availability of highly disaggregated data. To be used as constraints in the reconciliation process, raw data need to be concorded to the root classification.
The second pillar is a single-step reconciliation engine that compiles the MRIO table in a fully automated way. It requires an initial estimate of the table as input for reconciling the constraints provided by the data feeds. The resultant table is termed a 'base table', where the so-called base classification is any unique aggregation of the root classification. The reconciliation engine of the IELab is AISHA 3 (Geschke, Lenzen, Kanemoto, & Moran, 2011). It provides different reconciliation algorithms, including RAS-type methods like KRAS (Lenzen, Gallego, & Wood, 2009) and least-squares constrained optimization approaches (Ploeg, 1988). Besides merely aligning the row and column sums of tables, such algorithms can incorporate constraints on arbitrarily sized and shaped subsets of matrix elements, including for example known sums of elements and ratios between flows. These algorithms consider information on the uncertainty, i.e. the standard errors (SE), of both the constraints and the initial estimate, which enables them to handle conflicting constraints and find compromise solutions.
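To illustrate how standard errors let a reconciliation step trade off conflicting information, the following minimal sketch applies a generalized least-squares adjustment to a two-element initial estimate. It is not AISHA's implementation: the function name `reconcile`, the toy numbers and the choice of a closed-form least-squares update are assumptions made purely for illustration.

```python
import numpy as np

def reconcile(x0, se_x0, G, c, se_c):
    """Adjust an initial estimate x0 towards linear constraints G @ x = c.

    Minimizes sum(((x - x0) / se_x0)**2) + sum(((G @ x - c) / se_c)**2),
    so elements and constraints with a small SE are adhered to more closely.
    """
    Sx = np.diag(np.asarray(se_x0, dtype=float) ** 2)  # variances of the initial estimate
    Sc = np.diag(np.asarray(se_c, dtype=float) ** 2)   # variances of the constraint values
    residual = c - G @ x0                              # how far x0 violates each constraint
    return x0 + Sx @ G.T @ np.linalg.solve(G @ Sx @ G.T + Sc, residual)

# Hypothetical toy case: two table elements and two conflicting data points
# that both report their sum (32 with a low SE, 36 with a high SE).
x0 = np.array([10.0, 20.0])
se_x0 = np.array([1.0, 3.0])
G = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([32.0, 36.0])
se_c = np.array([0.5, 2.0])

x = reconcile(x0, se_x0, G, c, se_c)
print(x, x.sum())
```

In this sketch the realized sum lands closer to the data point with the lower SE, and the element of the initial estimate with the larger SE absorbs most of the adjustment.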
The third pillar of the IELab is a cloud-based, high-performance computer (HPC) architecture with parallel computing capabilities, in order to undertake the large-scale constrained-optimization task in a reasonable amount of time. Each IELab builds on the same general architecture. A decisive feature of the IELab concept is the large degree of flexibility afforded by avoiding the need to lock in any particular MRIO structure, that is, a regional and sectoral classification for the base table, at the time the laboratory is constructed. For further details on the IELab concept, we refer to the editorial of a special issue on virtual laboratories by Geschke and Hadjikakou (2017). The IELab architecture and workflow 4 as realized in the PIOLab are shown in Figure 1.
Figure 1: A generalized representation of the PIOLab architecture and the workflow. Yellow squares stand for datasets, i.e. arrays that contain information. Blue circles depict functions or general processing steps. The main purpose of feed functions is to transform the data and its classification in such a way that it can be used as an input in the single-step reconciliation process (AISHA). Two general types of data feeds exist, i.e. "data feeds for data constraints", which are based on primary or secondary data, and "data feeds for fundamental constraints", which include for example balancing constraints.
2 Aggregation of more detailed raw data may lead to an undesirable loss of valuable information. See for example arguments in . The overarching goal of the data management system of the IELab is to maximize the information that can be extracted from a raw data source and to avoid information being buried in the unfortunate aggregation of the source data.
3 Originally, AISHA was developed in the course of the construction of the Eora MRIO. The acronym stands for "An Automated Integration System for Harmonized Accounts".

2.2. The PIOLab approach
Initial estimate and secondary data sources
AISHA requires initial estimates for all elements of the base table, but primary data for all physical flows is usually not readily available (Kovanda, 2019). The PIOLab approach is to combine secondary data derived from available socio-metabolic accounting frameworks and models whenever primary data is not available. This can include life-cycle inventory databases (Wernet et al., 2016) for designing more detailed process descriptions; results from dynamic material flow models (Pauliuk, Wang, & Müller, 2013) for incorporating addition to and withdrawal from societal material stocks; or hybrid IO models, such as the waste IO approach to material flows (WIO-MF; Nakamura et al., 2007) for producing initial estimates of physical inter-industry flows.
The WIO-MF framework occupies a special place in the list of secondary data providers due to its ability to estimate large numbers of unknown material flows. WIO-MF allows inferring a physical flow matrix from monetary IO tables (Nakamura et al., 2007; Nakatani et al., 2020; Nuss et al., 2016; Nuss, Ohno, Chen, & Graedel, 2019). A filter matrix $\Phi$ containing zeros and ones is used to remove those monetary flows, i.e. inputs, from the technology matrix $A$ of the conventional monetary MRIO model that do not correspond with apparent physical quantities ($\tilde{A} = \Phi \circ A$). Here, $\circ$ indicates an element-wise multiplication of two matrices. Subsequently, a new Leontief inverse can be calculated ($\tilde{L} = (I - \tilde{A})^{-1}$) to estimate a modified gross production vector by post-multiplying the inverse with the total final demand vector ($\tilde{x} = \tilde{L} y$) 5. The modified variables are used to allocate the material use of industries to final demand via $C = \hat{m} \tilde{L} \hat{y}$, where $\hat{m}$ stands for the diagonalized direct material use intensity vector $m = \hat{\gamma} \hat{\tilde{x}}^{-1} u$, with $u$ denoting sectoral material use and $\gamma$ a yield vector. Note that the multiplication of sectoral material use with a yield vector results in a flow matrix where only useful output is present 6. When constructed from a MRIO database, $C$ reflects a multi-region physical flow table and the material composition of final products, covering for example flows from higher manufacturing to end-use.
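The WIO-MF steps described above (filtering the technology matrix, recomputing the Leontief inverse, and allocating material use to final demand) can be sketched numerically as follows. The three-sector economy, all numbers and all variable names (`A`, `F`, `y`, `u`, `gamma`, `C`) are hypothetical and chosen only for illustration; this is a sketch of the general logic, not the PIOLab code.

```python
import numpy as np

# Hypothetical 3-sector economy: 0 = steel, 1 = machinery, 2 = services.
A = np.array([[0.10, 0.30, 0.02],
              [0.05, 0.10, 0.05],
              [0.20, 0.20, 0.10]])    # monetary technology matrix
F = np.array([[1, 1, 1],
              [1, 1, 1],
              [0, 0, 0]])             # filter: 0 removes inputs assumed to be non-physical
y = np.array([5.0, 60.0, 100.0])      # total final demand vector (monetary units)
u = np.array([50.0, 0.0, 0.0])        # direct material use by sector (tonnes)
gamma = np.array([0.9, 0.8, 1.0])     # yield: share of material use ending up in useful output

A_f = F * A                                # element-wise filtering of the technology matrix
L_f = np.linalg.inv(np.eye(3) - A_f)       # Leontief inverse of the filtered system
x_f = L_f @ y                              # modified gross production vector
m = gamma * u / x_f                        # material use intensity (useful output only)
C = np.diag(m) @ L_f @ np.diag(y)          # material allocated to final demand

print(C.sum(axis=0))  # steel content of final demand, by product
```

Because the intensity vector is scaled by the yield, the entries of `C` sum to the useful material output (here 0.9 × 50 t), which is the property the text refers to when it notes that only useful output is present in the flow matrix.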
The decision for or against using a WIO-MF approach for making initial estimates is up to the PIOLab user, and the approach can be adapted for specific regions, processes and flows when deemed necessary for a particular research question. The construction of the WIO-MF model obliges modelers to make simplifying assumptions that introduce uncertainties. This includes assumptions on the physical or monetary nature of flows, yields of processes and price homogeneity. However, because AISHA requires uncertainty estimates, i.e. standard errors for all elements of the initial estimate, the reconciliation process addresses these assumptions. Given the unfavorable situation of general data scarcity in the physical domain, we believe the WIO-MF approach represents a generalizable estimation method for filling data gaps in the initial estimate of gPIOTs.
5 Viewed from the perspective of a structural path analysis, the modified variables reflect only such paths that are assumed to be physical inter-industry flows.
6 For clarifications on the distinction between useful output and gross output (i.e. throughput) in (physical) IO models, we refer to section three in the supplementary information of Pauliuk et al. (2015).

Root classification
Modelling the full scale and composition of all biophysical flows in one highly detailed gPIOT is very ambitious and by no means always necessary, since useful analysis can be conducted for any of the individual materials or substances of the full set of flows. Importantly however, conceptual and physical consistency is a key feature of the PIOLab, which is operationalized by defining a specific root classification.
As long as there is no universal root classification for all material and energy flows, the approach of the PIOLab is to conceptually separate the socio-economic metabolism (SEM) into two sections, the 'modelled core system' and 'other SEM processes' 7. The modelled core system is the specific subset of socio-metabolic processes and biophysical flows that the researcher is primarily interested in and wants to model in detail, such as iron and steel supply chains in this paper. This core system is represented by the physical supply and use tables and the final use matrix, shown in Figure 2 in yellow. For stock-building materials (such as steel), the final use matrix represents gross additions to material stocks. The scope and the highest possible level of detail of the modelled core system are set by the root classification. Physical flows between the modelled core system and other SEM processes, i.e. all inputs to and outputs from the core system that are processed, treated or used by other societal activities, are represented by additional rows (inputs) and columns (outputs), shown in blue in Figure 2. Hereafter, these flows are termed SEM-inputs and SEM-outputs.
Finally, the introduction of rows for inputs from nature and columns for the discard of residuals represents the interaction of the modelled core system with the natural environment, shown in green in Figure 2 8. Inputs from nature together with SEM-inputs constitute the 'boundary inputs' to the core system. SEM-outputs plus discard of residuals represent the 'boundary outputs' (see Figure 2).
This conceptualization yields a full system description of SEM with varying degrees of detail, following the mass balance principle as well as the SEEA guidelines. Moreover, it allows modelers and PIOLab users to invest their efforts in the subset of biophysical flows and socio-metabolic processes that are most relevant for their research, while retaining conceptual consistency.
Figure 2: The final use matrices show final consumption products, which also include gross additions to societal material stocks for stock-building materials such as steel. The blocks indexed 1,2 and 2,1 stand for trade flows of intermediates and final products, respectively. In general, columns depict the inputs and rows the outputs of processes. Summation of all process inputs yields $x' = iU + kB^{\text{in}}_{\text{SEM}} + mB^{\text{in}}_{\text{nat}}$, where $U$ is the use table, $B^{\text{in}}_{\text{SEM}}$ the SEM-inputs, $B^{\text{in}}_{\text{nat}}$ the inputs from nature, $'$ stands for transposition of vectors and $i$, $k$, $m$ for appropriate summation row vectors containing ones. Summation of all process outputs gives $x = Vj + B^{\text{out}}_{\text{SEM}}l + B^{\text{out}}_{\text{nat}}n$, where $V$ is the supply table, $B^{\text{out}}_{\text{SEM}}$ the SEM-outputs, $B^{\text{out}}_{\text{nat}}$ the discard of residuals, and $j$, $l$ and $n$ stand for appropriate summation column vectors containing ones. Vector $x$ stands for total supply-use of processes (i.e. throughput) and $q$ for total supply-use of flows of useful output, which is why $\sum q < \sum x$.
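The accounting identities of this system description can be checked on a toy example. The sketch below builds a hypothetical two-process, two-flow physical supply-use system (ironmaking and steelmaking) and verifies that process inputs equal process outputs and that boundary inputs equal boundary outputs plus final use. All array names and numbers are illustrative assumptions, not data from the PIOLab.

```python
import numpy as np

# Processes: 0 = ironmaking, 1 = steelmaking. Flows: 0 = pig iron, 1 = steel.
V = np.array([[100.0, 0.0],
              [0.0, 100.0]])            # supply table (processes x flows), useful output
U = np.array([[0.0, 100.0],
              [0.0, 0.0]])              # use table (flows x processes)
Y = np.array([[0.0],
              [100.0]])                 # final use (flows x final use categories)
sem_in = np.array([[45.0, 0.0],         # coke
                   [20.0, 0.0],         # flux
                   [0.0, 10.0]])        # scrap (SEM-inputs x processes)
nat_in = np.array([[155.0, 0.0],        # iron ore
                   [125.0, 5.0]])       # air/oxygen (inputs from nature x processes)
sem_out = np.array([[210.0, 35.0],      # ironmaking: gas, slag
                    [0.0, 15.0]])       # steelmaking: slag (processes x SEM-outputs)
nat_out = np.zeros((2, 1))              # discard of residuals (processes x residuals)

# Column sums play the role of the summation row vectors, row sums that of the column vectors.
x_in = U.sum(axis=0) + sem_in.sum(axis=0) + nat_in.sum(axis=0)       # total process inputs
x_out = V.sum(axis=1) + sem_out.sum(axis=1) + nat_out.sum(axis=1)    # total process outputs
q = V.sum(axis=0)                                                    # supply of useful flows

boundary_inputs = sem_in.sum() + nat_in.sum()
boundary_outputs = sem_out.sum() + nat_out.sum()

print(x_in, x_out, q.sum(), boundary_inputs, boundary_outputs + Y.sum())
```

In this toy system the throughput of processes exceeds the supply of useful flows because by-products such as gas and slag are part of throughput but not of useful output.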
The PIOLab differs from other IELabs by including non-market flows in its root classification, i.e. physical flows that are not distributed through markets and therefore have no monetary value. The sectoral root classification of the global MRIO Lab (Lenzen, Geschke, Abd Rahman et al., 2017) is based on the Harmonized System/Central Product Classification (HSCPC) for traded commodities and in its current version differentiates 6357 products. The root classification of the global MRIO laboratory does not, for example, include such flows as molten/hot metal (e.g. liquid steel) or scrap from processing (e.g. the rolling and forming of steel), since they are situated within steel plants and therefore not distributed via markets. However, these flows are important when constructing gPIOTs, as they constitute inputs and outputs of crucial transformation processes and are therefore decisive for maintaining mass balances and the integrity of the overall system description.

2.3. Building iron-steel gPIOTs
The root classification of the forerunner PIOLab for the global iron-steel metabolism differentiates 76 transformation processes, 266 types of ferrous material flows and 221 regions. The regional root classification is identical to the one of the global MRIO Lab. All processes that use ferrous materials are included, i.e. mining, iron- and steelmaking, casting, rolling, forming, waste management, steel-using manufacturing sectors as well as the final use phase, i.e. fixed capital formation and final consumption of households or governments. This also includes flows of recycled steel scrap from processing (i.e. rolling and forming) and manufacturing (also termed 'new scrap'). Extraction of crude ore and the combustion of oxygen by furnaces represent inputs from nature. Wastes discarded to the natural environment, such as greenhouse gas emissions, represent outputs to nature. Coke, flux and EoL-scrap ('end-of-life scrap', also termed 'old scrap') are treated as SEM-inputs. By-products of the iron- and steelmaking process, such as blast furnace gas or slag, are classified as SEM-outputs. Table 1 lists all primary and secondary data for setting up the initial estimate of the base table and writing data feed constraints. The statistical yearbooks of the World Steel Association report 27 outputs of the iron and steel industry (intermediate and finished steel products) in 135 countries, accounting for approximately 85% of global steel production (World Steel Association, 2018). Iron ore grades and extraction volumes are taken from the UNEP-IRP global material flow database (UNEP-IRP, 2017). Information on physical trade flows is sourced from BACI (Gaulier & Zignago, 2010), which differentiates a large number of finished steel products (187). This is why the root classification for flows (266) is more detailed than for processes (76). Data on the regional supply of EoL-scrap is sourced from the dynamic MFA model of Pauliuk and colleagues (2013).
Coke consumption and production of blast furnace gas are taken from IEA's energy balances (IEA, 2012). We use the MRIO database EXIOBASE (Wood et al., 2015) to build a WIO-MF model for providing initial estimates of the steel flows in manufacturing and to link 10 steel-using manufacturing sectors with final consumption. Information on manufacturing yields, required for the WIO-MF model, is sourced from the global steel flow model of Cullen and colleagues (2012). Cullen and colleagues' model also provides estimates for the end-use split, i.e. the share of finished steel products used by different manufacturing sectors. In general, most data sources report the useful output of processes, i.e. the supply table, such as the amount of liquid steel produced in basic oxygen furnaces (BOF) in China. There is little information on the input side of these processes, i.e. the use table (e.g. the amount of steel scrap and pig iron used in BOF in China). The PIOLab uses process descriptions sourced from MFAs to derive initial estimates for the input side (i.e. the use table) and at the same time to formulate ratio constraints for the reconciliation algorithm.

RESULTS
This section begins with an analysis of constraint realizations, followed by a comparison of ew-MFA headline indicators that are derived from the iron-steel gPIOT with results from established databases. The analysis of the constraint realization and the ew-MFA indicators is based on the gPIOT for the year 2008, since this is the base year for which the initial estimate has been constructed. Finally, we present an analysis of international trade flows in the iron-steel supply chain to highlight the analytical capabilities of the gPIOT. This analysis is based on the more recent gPIOT for 2014. The gPIOTs are available as a time series for the years 2008-2017.

Analysis of constraint realizations
As a diagnostic test of the constructed base table, we present a rocket plot (Figure 4-A) to show how well the final gPIOT adheres to its raw data constraints. Each raw data set is accompanied by standard errors (SE) so that the reconciliation engine can find a compromise solution for conflicting data points, adhering more closely to data that are tagged with relatively low SE. As indicated in the method section, besides raw data on trade, mostly process outputs, i.e. production values (e.g. iron ore, EoL-scrap and various steel products), were available for use as data point constraints. In the optimization run presented here, we have assigned higher SE to trade data than to production data. This can be seen in the rocket plot shown in Figure 4-A (colors of points correspond to the different data sources, where yellow stands for the trade data), which compares the raw data constraint values (x-axis) against the realized gPIOT values (y-axis) for the year 2008. The rocket plot shows that larger constraint values, which are mostly production values, adhere more closely to the reported values. 17,092 data points were used as constraints in the reconciliation, of which the large majority (16,172) refers to trade flows 10. The rocket plot also shows the density distribution of data points (three circles in grey for 25, 50 and 75%) to highlight areas of heavy overplotting. The outer circle, in the range of 10 to 500,000 tonnes, contains 75% of all data points.
Next, we present the realization of mass balances of the full system and of individual processes. The balance of the full system is visualized by the bar chart in Figure 4-C, and the balances of individual processes by the scatter plot in Figure 4-B. Out of the 960 processes (32 regions with 30 processes each), 32 show a relative imbalance larger than 15%; these are foremost section mills, rod mills and ingot casting with smaller throughputs.
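A process-level balance check of this kind can be sketched in a few lines. The definition of relative imbalance used here (deviation of inputs from outputs, relative to outputs), the region-process labels and the numbers are assumptions for illustration; the source does not spell out the exact formula.

```python
def relative_imbalance(total_input, total_output):
    """Signed deviation of process inputs from outputs, relative to outputs."""
    return (total_input - total_output) / total_output

# Hypothetical (region, process) totals in tonnes: (total inputs, total outputs).
processes = {
    ("CN", "blast furnace"): (3.45e8, 3.40e8),
    ("NL", "rod mill"): (1.2e5, 1.5e5),
    ("GB", "ingot casting"): (8.0e4, 6.5e4),
}

THRESHOLD = 0.15  # the +/-15% band around the equality line

flagged = {}
for key, (total_in, total_out) in processes.items():
    imbalance = relative_imbalance(total_in, total_out)
    if abs(imbalance) > THRESHOLD:
        flagged[key] = imbalance

for key, imbalance in flagged.items():
    print(key, f"{imbalance:+.1%}")
```

In this toy data set the large-throughput process stays within the band, while the two small-throughput processes are flagged.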

Figure 4-A: The top left scatter plot is a rocket plot that compares the values reported in the data sources (x-axis) with the values realized in the 2008 iron-steel gPIOT (y-axis). Yellow indicates BACI trade data and dark blue other data sources. Due to overplotting, the density distribution of data points (25, 50 and 75%) is shown with three circles (grey lines). The inner circle marks the area that comprises 25% of all data points and the outer circle 75%. Figure 4-B: The top right scatter plot compares total inputs (y-axis) and outputs (x-axis) of all processes in all regions of the base classification. Colors indicate the type of process. The dashed lines mark +/-15% deviation from the 45-degree equality line. Both scatter plots have a log10 scale in units of metric tonnes. Figure 4-C: The bar chart at the bottom shows the overall mass balance of the full system. It compares the boundary inputs (SEM-inputs plus inputs from nature) to the boundary outputs (SEM-outputs plus discard of residuals) plus final use. Boundary inputs reflect the sum of inputs from SEM and nature, which must equal the sum of final use, i.e. final demand, plus boundary outputs to SEM (by-products and wastes/emissions for further treatment) and nature (wastes/emissions without further treatment). Please note that, as mentioned in the method section, in the construction of the present gPIOT we assume that all wastes/emissions are subject to some form of treatment and hence boundary outputs to nature are zero. All results refer to the year 2008. All data can be found in the supporting information (SI-1).
As an example implementation of ratio constraints, we present the realization of technical input-output ratios for the blast furnace process 11. The Steel Manual (Stahlinstitut VDEh, 2015) is currently the PIOLab's main technical process description for iron and steelmaking. The material balance of the blast furnace process is shown in Figure 5-A. It specifies that a tonne (t) of pig iron from the blast furnace usually requires inputs of approximately 1.55 t of iron ore, 0.45 t of coke and coal, 1.25 t of air/oxygen (for combustion) and 0.2 t of flux 12 (mostly limestone). As by-products, 2.1 t of blast furnace gas and 0.35 t of slag are produced, resulting in a total throughput of 3.45 t.
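The material balance quoted above can be verified with simple arithmetic, and the same ratios can be used to scale a reported pig iron output into input-side estimates. The sketch below does both; the dictionary layout and the helper `scale_recipe` are illustrative assumptions, while the ratios are those given in the text.

```python
import math

# Ratios per tonne of pig iron, as quoted from the process description.
inputs = {"iron ore": 1.55, "coke and coal": 0.45, "air/oxygen": 1.25, "flux": 0.20}
outputs = {"pig iron": 1.00, "blast furnace gas": 2.10, "slag": 0.35}

total_in = sum(inputs.values())
total_out = sum(outputs.values())

# The process is mass balanced: inputs and outputs both add up to the throughput.
balanced = math.isclose(total_in, total_out) and math.isclose(total_in, 3.45)

def scale_recipe(pig_iron_tonnes):
    """Scale the per-tonne input ratios to a given pig iron output (hypothetical helper)."""
    return {name: ratio * pig_iron_tonnes for name, ratio in inputs.items()}

estimate = scale_recipe(1000.0)  # e.g. 1000 t of pig iron
print(balanced, estimate)
```

Such scaled values correspond to the kind of input-side initial estimate that a process description can provide when only the useful output of a process is reported.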
The realization of these ratio constraints is visualized in Figure 5-B. In general, iron ore input ratios show the largest deviations, which indicates conflicts between different data sources (e.g. World Steel, BACI and input-output ratios). However, these variations are less than 0.17 t/t i.e. 9% of the target value 1.55 t/t. The largest deviation for gas output ratios is 0.07 t/t i.e. 3% of the target value 2.1 t/t. Other ratios, such as for the output of slag (with the target value 0.35 t/t) or the input of coke (0.45 t/t), show even smaller variations than iron ore and gas. In terms of regions, the largest deviations from these ratio constraints are found for the Netherlands (NL), Great Britain (GB) and the United States (US). For all other regions, blast furnace ratios have been well realized.

Comparison of gPIOT results with established databases
To show the extent to which PIOLab results differ from or agree with established databases, we sourced ew-MFA data for iron and steel flows from the UNEP-IRP global material flow database (UNEP-IRP, 2017) and compared these to the respective results from the iron-steel gPIOT. This comparison is shown in Figure 6 and includes Domestic Extraction (DE) and Domestic Material Consumption (DMC) as an indicator of national apparent consumption (Krausmann, Schandl et al., 2017). In addition, we calculated Raw Material Consumption (RMC), i.e. material footprint indicators, with the gPIOT 13 and compared them to results from the EXIOBASE MRIO (Wood et al., 2015).
13 Different types of physical IO tables and models can be constructed from the same set of PSUTs. The footprint-type indicators presented in this paper are derived from a gPIOT which is constructed using the approach proposed by Suh (2004), which is discussed in detail in the work of Altimiras-Martin (2014). IO models assume industry/process output to be homogeneous, which conflicts with a situation where process outputs comprise commodities and by-products. Suh proposed calculating footprint-type indicators via endogenization of by-products by treating them as negative inputs. In other words, the corresponding matrices (see Figure 2) are transposed and entered with a negative sign. This yields a physical IO model where these flows can be endogenously derived from final demand. In the present work, this is achieved by using a modified total process output vector, which omits these flows and contains only 'useful output', for building the IO table and subsequently the total requirement matrix. Using this vector and the multi-regional PSUTs, an IO model can be estimated following the commodity-by-industry framework as summarized in chapter five in Miller and Blair (2009). First, we calculate the industry input coefficients matrix, with the dimension product-by-industry, and the market share matrix, with the dimension industry-by-product. Multiplying the two gives the product-by-product technology matrix, from which the total requirement matrix is obtained as the Leontief inverse. All codes of the RMC calculus are included in the GitHub code repository. For further details on the commonalities and differences of different physical IO models, we refer the interested reader to chapter three in the supplementary information of Pauliuk et al. (2015).
Indicators are presented on the level of single countries (scatter plots in Figure 6) and for five aggregated world regions (i.e. Asia/Pacific, America, Europe, Africa, Middle East) and China (bar charts in Figure 6). Hereafter, percentage deviations refer to the hypothetical mean value of the compared results.
The best alignment is observed for DE. On average, DE indicators for world regions vary by 0.8% and for countries by 3.4%. Compared to DMC and RMC, the deviations in DE results are rather small because the UNEP-IRP's extraction account is also used for setting up data point constraints. Larger relative deviations are observed for DMC, where results for world regions vary on average by 8.3% and by 22.9% for single countries. Twelve out of the 32 regions show DMC deviations that are smaller than 15%, which includes China, India, Germany and Great Britain. The largest relative deviation in DMC is found for Sweden. Here, the UNEP-IRP result (10.5 Mt) is around twice the size of the PIOLab result (5.1 Mt). The country with the second largest relative deviation is the United States, where the gPIOT result (116.4 Mt) is around two thirds (66%) larger than the result from the UNEP-IRP database (69.9 Mt). Since DE results are very similar, differences in DMC must be rooted in the different estimates of physical trade flows and the physical trade balance (PTB; calculated via physical imports minus physical exports). Regarding the deviations in the DMC of Sweden, we see for example that UNEP-IRP and PIOLab data show very similar imports of ca. 6.8 and 6 Mt respectively. For exports, however, PIOLab results (29.8 Mt) are 4.6 Mt larger than the results reported in the UNEP-IRP data set (25.2 Mt). Deviations in the DMC of the United States are primarily a result of differences in imports, where the UNEP-IRP account reports 74.6 Mt and the gPIOT 111.2 Mt. Differences in the directionality of the PTB are only observed for one Rest-of-the-World (RoW) region, i.e. RoW America (WL). Taking a closer look at the UNEP-IRP trade data, we found that global trade flows are not fully mass-balanced, meaning global imports and exports do not sum up to the same total, which might explain some of the differences observed in the comparison of UNEP-IRP and gPIOT results.
Of all the ew-MFA indicators compared, the largest relative deviations are observed for RMC. On average, RMC results for world regions vary by 29.4% and for single countries by 41.3%. Out of the 32 regions, 7 show variations that are smaller than 15%, which includes France, Great Britain, South Korea and Mexico. The largest relative deviations are observed for Brazil and Australia, two major iron ore extracting countries. For Brazil, the EXIOBASE result (133.9 Mt) is around 5 times larger than the PIOLab result (26.4 Mt). For Australia, the EXIOBASE result (60.3 Mt) exceeds the PIOLab result (17.8 Mt) by more than a factor of three. We find that the deviations in the RMC of Brazil and Australia are primarily a result of the differences in the RME of exports. For Brazil, the PIOLab result for the RME of exports (332 Mt) is 102.4 Mt larger than the result of EXIOBASE (229.6 Mt). For Australia, the PIOLab result (335 Mt) is 39.4 Mt larger than the EXIOBASE result (295.6 Mt). With the exception of China, we find that the directionality of raw material trade balances (RTB; calculated by subtracting RME of exports from the RME of imports), meaning whether a region is a net-importer or net-exporter of RMEs, is the same for EXIOBASE and the PIOLab. According to the PIOLab, China is a net-importer of RMEs (178.3 Mt), while EXIOBASE identifies China as a net-exporter (-98 Mt). Consequently, the PIOLab result for the RMC of China (1005.6 Mt) is 280.3 Mt larger than the EXIOBASE result (725.3 Mt).
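The accounting identities behind the DMC comparison above can be illustrated with the figures reported for Sweden. The sketch assumes the standard ew-MFA identity DMC = DE + PTB with PTB = imports - exports, and it reads "deviation relative to the mean of the compared results" as the absolute difference divided by the mean of both values; that reading and all function names are assumptions, not taken from the PIOLab code.

```python
def physical_trade_balance(imports, exports):
    """PTB as defined in the text: physical imports minus physical exports."""
    return imports - exports


def deviation_from_mean(a, b):
    """Assumed reading of the deviation measure: |a - b| over the mean of a and b."""
    return abs(a - b) / ((a + b) / 2.0)


# Values for Sweden in Mt, as quoted in the text.
sweden = {
    "UNEP-IRP": {"dmc": 10.5, "imports": 6.8, "exports": 25.2},
    "PIOLab": {"dmc": 5.1, "imports": 6.0, "exports": 29.8},
}

ptb = {
    source: physical_trade_balance(v["imports"], v["exports"])
    for source, v in sweden.items()
}

# If DE is (nearly) identical in both sources, the DMC gap must equal the PTB gap.
dmc_gap = sweden["UNEP-IRP"]["dmc"] - sweden["PIOLab"]["dmc"]
ptb_gap = ptb["UNEP-IRP"] - ptb["PIOLab"]

print(f"DMC gap: {dmc_gap:.1f} Mt, PTB gap: {ptb_gap:.1f} Mt")
print(f"relative DMC deviation: {deviation_from_mean(10.5, 5.1):.1%}")
```

With the quoted import and export figures, the gap in the physical trade balance accounts for the full gap in DMC, which is consistent with the statement that DE results are very similar in both sources.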
Data on aggregated trade flows can be found in the supporting information (SI-1). Variations in the RTB and the RME of exports can be interpreted in light of the different allocation logics i.e. value-based vs. physical allocation. When calculating RMC with EXIOBASE, raw material extraction is distributed to final demand following the monetary transactions between sectors and regions. Crude ore is allocated to all monetary supply chains, including the ones of for example service sectors that mostly serve domestic final demand. In contrast, the gPIOT-based RMC calculus follows a physical allocation based on the apparent physical flows between processes and regions. Consequently, the gPIOT calculus allocates crude iron ore exclusively to physical supply chains that distribute ferrous materials. The observed differences in the RME of exports suggest that with EXIOBASE a significant share of the extracted iron ore is allocated to domestic supply chains, which do not physically distribute iron and steel.
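The physical allocation logic can be made concrete with a toy example. The following is a minimal sketch of a demand-driven (Leontief-type) footprint calculation on a single-region chain of three processes, not the PIOLab implementation: the coefficient of 1.55 t iron ore per tonne of pig iron is taken from the blast furnace balance above, while the 0.9 t/t pig iron input to steel, the final demand of 100 t and all names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


products = ["iron ore", "pig iron", "steel"]

# Technology matrix A: A[i][j] = tonnes of product i needed per tonne of product j.
A = [
    [0.0, 1.55, 0.0],  # iron ore into pig iron (Steel Manual value from the text)
    [0.0, 0.0, 0.9],   # pig iron into steel (hypothetical)
    [0.0, 0.0, 0.0],
]
final_demand = [0.0, 0.0, 100.0]  # 100 t of steel for final use (hypothetical)

# Total output x solves (I - A) x = y, i.e. x = (I - A)^-1 y.
identity_minus_a = [
    [(1.0 if i == j else 0.0) - A[i][j] for j in range(3)] for i in range(3)
]
total_output = solve(identity_minus_a, final_demand)

# Extraction is attributed only to chains that physically pass the material on.
ore_footprint = total_output[0]
ore_multiplier = ore_footprint / final_demand[2]

for name, value in zip(products, total_output):
    print(f"{name}: {value:.1f} t")
print(f"iron ore per tonne of steel for final use: {ore_multiplier:.3f} t/t")
```

In a monetary model the same extraction would also be spread over sectors that buy from mining in value terms without receiving ferrous material, which is the difference in allocation logic discussed above.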

International trade in iron-steel supply chains
A major strength of the gPIOT framework is its ability to provide detailed sectoral information on both apparent/actual/direct physical flows and virtual/embodied/indirect flows (footprint-type indicators). To exemplify this, we provide a brief overview of global iron-steel supply chains and the scale of international trade flows. The following analysis is based on the gPIOT for the year 2014. To visualize the different processes and flows, Figure 7-A shows a Sankey diagram derived from an aggregated version of the gPIOT.
In 2014, the world consumed 1333 Mt of steel in final products, i.e. gross addition to material stocks. With ca. 345 Mt EoL-Scrap withdrawn from existing material stocks, this translates into a global net-addition to material stocks of 988 Mt. Construction is the most important sector, accounting for 59.5% of global steel consumption or 793 Mt, followed by machinery with 173 Mt. Together with motor vehicles (151 Mt), these three sectors account for 84% of global steel consumption.
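The headline figures in this paragraph follow from simple stock and share arithmetic, which can be retraced as follows. All numbers are the 2014 values quoted above; the variable names are illustrative.

```python
# Global steel in final products and end-of-life scrap in 2014 (Mt), from the text.
gross_addition_to_stocks = 1333.0
eol_scrap = 345.0

# Steel for final use by the three largest sectors (Mt), from the text.
sector_use = {"construction": 793.0, "machinery": 173.0, "motor vehicles": 151.0}

# Net addition to stocks = gross addition minus scrap withdrawn from stocks.
net_addition_to_stocks = gross_addition_to_stocks - eol_scrap

# Share of each sector in global steel consumption.
shares = {s: use / gross_addition_to_stocks for s, use in sector_use.items()}
top_three_share = sum(shares.values())

print(f"net addition to stocks: {net_addition_to_stocks:.0f} Mt")
for sector, share in shares.items():
    print(f"{sector}: {share:.1%}")
print(f"top three sectors: {top_three_share:.0%}")
```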
With regard to international trade, we find that 13.5% of all the steel for final use is not manufactured within the country where the products are finally consumed. The construction sector stands out, as it sources only around 0.6% of its steel products for final use from international markets. On the other hand, manufactured products such as machinery and motor vehicles have trade shares of 32 and 31% respectively (see Figure 7-C) 14. Products nec (i.e. office and electrical machinery, computers, communication equipment and other appliances) have a trade share of 39%. The largest trade share of all the flows in iron-steel supply chains is found for iron ore: 59.5% or 1209 Mt of iron ore is distributed via international markets (compare with the Sankey in Figure 7-A). Steel scrap (output of scrap preparation) has a trade share of 23%, followed by finished steel (output of rolling & forming) with 16%. Castings and products of reduced iron have smaller trade shares of 8 and 5% respectively.
The gPIOT can also be used to quantify upstream material inputs on the sector level. This is shown in Figure 7-B, which depicts the RMC and material multipliers reflecting the amount of reduced iron and steel scrap that are directly used in steelmaking (BOF and EAF) and indirectly required to deliver one tonne of steel for final use. The sector with the largest upstream scrap input per tonne is construction. To produce 1 tonne of manufactured steel for construction, approx. 0.6 tonnes of scrap and 0.93 tonnes of reduced iron are required. All other sectors show smaller scrap multipliers of less than 0.5 t/t, which points toward the fact that steel scrap is usually down-cycled (scrap from motor vehicles is used for construction) because of impurities (with other metals or materials).
In terms of RMC, we find again construction to be the most important sector with 1958 Mt of crude ore, followed by machinery (408 Mt), products nec (331 Mt) and motor vehicles (332 Mt). The regional distribution shows that ca. 53% of crude ore is embodied in international supply chains. Motor vehicles stand out with a trade share of embodied crude ore of 66%. Moreover, large upstream, i.e. indirect trade dependencies, are found for the construction sector (trade share of 50%), which stands in stark contrast to the trade share of 0.6% for manufactured construction materials (compare Figure 7-B with 7-C).
Figure 7-B shows for the different manufactured products indirect/upstream material inputs, which includes RMC (y-axis on the left) and material multipliers (y-axis on the right) reflecting the amount of reduced iron and steel scrap that are directly used in steelmaking (BOF and EAF) and indirectly required to deliver one tonne of steel for final use. RMC results are disaggregated into crude ore that is embodied in the final consumption of the country where it is extracted (domestic/in-country flow in dark blue color) and the ore that is embodied in international supply chains (yellow color). Figure 7-C shows steel for final use, i.e. gross addition to material stocks by different manufactured products, which is also depicted on the far right side of the Sankey. Steel in final products that is internationally traded is shown in yellow color. Flows that are situated within regions are shown in dark blue color. The data is included in the supporting information (SI-1). The results refer to the year 2014.

DISCUSSION AND OUTLOOK
The iron-steel gPIOT presented here exemplifies how the PIOLab combines the strengths of different modelling schools and accounting frameworks commonly used in Industrial Ecology to synthesize a model that has the potential to be more comprehensive and informative than any individual approach.
Using an IO model in physical units makes footprint-type indicators insensitive to price inhomogeneity and fluctuations, including differences in taxation schemes and subsidies, which distort estimates of actual physical flows between sectors when a monetary allocation logic is applied. The comparison of results from the gPIOT and the monetary MRIO EXIOBASE revealed large differences in RMC indicators, especially for important iron ore and steel producing/consuming countries like Australia, Brazil and China. The present gPIOT yields more robust RMC indicators that are mass-balanced and thus better aligned with other ew-MFA headline indicators (such as DE and DMC) since they can be derived from the same integrated top-down database. The ability to generate a consistent set of indicators on different levels of aggregation is one of the main strengths of the gPIOT. The model can deliver highly aggregated information on the country level, but global material flows can also be disaggregated into different sectors and regions, which allows opening up the black box of ew-MFA and thereby widening its applicability for policy-oriented analysis that focuses on specific economic activities. For example, the analysis of upstream scrap inputs per tonne of final product revealed how scrap demand is strongly driven by the manufacturing of construction materials. Due to the explicit consideration of scrap supply and use, gPIOTs can provide a practical framework for analyses that aim at closing globalized material cycles.
The diagnostic tests of the base table revealed a good level of adherence between reported data and the values realized in the gPIOT, comparable to constraint realizations in other IELabs (Wakiyama et al., 2020). Conflicts between different data sources are unavoidable when constructing multi-regional IO tables. In the optimization run presented here, we have assigned higher standard errors (SE) to trade data than to production data in order to give priority to a better realization of the latter, for two reasons. Firstly, trade data points are several orders of magnitude smaller than production data points and we assume, according to the law of large numbers, that the relative SE of for example crude steel production in China (512 Mt in the year 2008) tends to be smaller than that of welded tubes traded between Hungary and South Africa (1.1 t in 2008). Secondly, BACI reports only gross trade that includes re-exports, which are not present in an MRIO framework. We compared raw data of production and trade to reveal the scale of iron ore re-exports. According to the UNEP-IRP database, only 48 countries extract iron ore but BACI lists more than 100 exporting regions. The mismatch between trade and other data sources is relaxed by tagging trade data with relatively higher SE.
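How standard errors steer the reconciliation can be illustrated with a deliberately simple case: one flow for which two sources report conflicting values. The sketch uses an inverse-variance weighted mean, which is the least-squares solution for a single unknown. It is not the AISHA algorithm, which reconciles many constraints simultaneously, and all numbers and names are hypothetical.

```python
def reconcile(estimates):
    """Inverse-variance weighted mean of (value, standard_error) pairs.

    Returns the reconciled value and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    weighted_sum = sum(w * value for w, (value, _) in zip(weights, estimates))
    reconciled = weighted_sum / sum(weights)
    reconciled_se = (1.0 / sum(weights)) ** 0.5
    return reconciled, reconciled_se


# Hypothetical conflict: a production-based estimate with a small SE
# and a trade-based estimate of the same flow with a large SE (both in Mt).
production_based = (100.0, 2.0)
trade_based = (120.0, 20.0)

value, se = reconcile([production_based, trade_based])
print(f"reconciled value: {value:.2f} Mt (SE {se:.2f} Mt)")
```

Because the weights scale with the inverse of the squared SE, the reconciled value stays close to the production-based estimate, which mirrors the priority given to production data in the optimization run described above.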
The flexibility of the PIOLab enables Industrial Ecologists to create large-scale models tailored to specific needs in a cost-efficient way. All data mapped to the root classification can be re-used when compiling new versions of the iron-steel gPIOTs in the future. New data is added with data feeds to formulate new constraints. For example, researchers might want to construct gPIOTs that reflect other data for iron ore extraction (Schaffartzik et al., 2014) or EoL-Scrap supply (Myers, Reck, & Graedel, 2019). Finished steel products can be broken down into different grades, such as stainless and (non-)alloy steel, using additional data sets or proxy information for disaggregation (Reck, Chambon, Hashimoto, & Graedel, 2010). Another possibility is to construct gPIOTs that align with reported official statistics of a particular country of interest. Similar to the SNAC (Single Country National Accounts Consistent) method, which prepares a global MRIO that respects a given national monetary IO table (Edens et al., 2015), the PIOLab can incorporate more country-specific material flow data (e.g. steel yearbooks of China; CNKI, 2018) to construct a gPIOT that is customized for policy applications for a certain region. In the same vein, modelers can impute further regional information on process descriptions (input-output ratios), yield factors (Wang, Jiang, Geng, Hao, & Zhang, 2014; Wang, Müller, & Hashimoto, 2015) or the supply and use of by-products like furnace slag (Rieger & Schenk, 2019) to compile global tables that better reflect regional differences.
Data infrastructures and related tools have developed impressively over the last decade in Industrial Ecology. Data collections such as the Yale Stocks and Flows database (Myers et al., 2019), the building material composition database (Heeren & Fishman, 2019), the UNEP-IRP ew-MFA accounts (Schandl et al., 2018) or the Industrial Ecology Data Commons (Pauliuk, Heeren, Hasan, & Müller, 2019) represent a cumulative body of knowledge that the PIOLab can build on in the future. The efficient reuse of such data collections for constructing gPIOTs for various material and energy flows requires the development of a universal root classification that covers all socio-metabolic processes and flows, which are described therein. Because the same flow could be measured in different units (mass, energy content or monetary value), the multi-layered SUT approach as proposed by Merciai (2019) would provide a suitable integration framework where relations between layers are imputed via ratio constraints which reflect calorific values (Joule/kg) and prices (Euro/kg). Such a multi-layered 15 socio-metabolic root classification would constitute a model-independent reference system with clear and unambiguous definitions across modelling schools and accounting frameworks. This is very much in line with ongoing debates in the Industrial Ecology community regarding standardization and harmonization of data exchange and processing (Pauliuk, 2020;Petavratzi et al., 2018) as well as efforts to foster transparency and reproducibility of modelling results (Hertwich et al., 2018).
An IELab capability that the PIOLab has not tapped so far 16 is the possibility to create subnational and hierarchically nested multi-geography IO tables that disaggregate economy-wide material flows into different states and local regions. Because environmental and socioeconomic conditions can vary significantly within countries, IO models should aim to move from the aggregated national to a more detailed spatial level (Sun, Tukker, & Behrens, 2019). IELabs facilitate different regionalization methods including location quotient approaches, where national IO tables are disaggregated with proxy information to produce an initial estimate of the subnational table for the reconciliation routine (Flegg, Mastronardi, & Romero, 2016; Jahn, 2017; Jahn, Flegg, & Tohmo, 2020). Nesting such a subnational multi-regional table into a gPIOT would enable biophysical assessments of global supply chains that take into account the heterogeneity of environmental pressures and impacts on the subnational level (Moran, Giljum, Kanemoto, & Godar, 2020). This multitude of possibilities to refine, adjust and expand gPIOTs makes the PIOLab a very promising tool for addressing a wide range of sustainability-related challenges in the future.
tables reconciled by AISHA are accompanied by standard deviations that can be used for uncertainty assessments. In general, uncertainty analysis has received increasing attention in MFA (Laner, Rechberger, & Astrup, 2014; Lupton & Allwood, 2018). The standard deviations from AISHA can be used for Monte Carlo simulations and to estimate confidence intervals of derived indicators (see for example Moran & Wood, 2014). Secondly, IELabs are increasingly used for disaster analysis. Faturay and colleagues (2020) quantified spillover effects resulting from different earthquakes and typhoons in Taiwan. Lenzen and colleagues (2019) quantified indirect economic damages from a tropical cyclone in Australia. Another recent study by Lenzen and colleagues (2020) estimated global economic impacts of the COVID-19 pandemic. Using the PIOLab for biophysical disaster analysis represents an interesting new research avenue.
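A Monte Carlo propagation of table uncertainties into a derived indicator could look roughly as follows. The sketch assumes independent, normally distributed errors and uses DMC = DE + imports - exports as the derived indicator; the means, standard deviations and function name are hypothetical and do not come from AISHA output.

```python
import random


def simulate_dmc(inputs, n_draws=10_000, seed=42):
    """Propagate uncertainty into DMC = DE + imports - exports.

    `inputs` maps each term to a (mean, standard_deviation) pair. Errors are
    assumed to be independent and normally distributed (a simplification).
    Returns the simulated mean and an approximate 95% interval.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        de = rng.gauss(*inputs["de"])
        imports = rng.gauss(*inputs["imports"])
        exports = rng.gauss(*inputs["exports"])
        draws.append(de + imports - exports)
    draws.sort()
    mean = sum(draws) / n_draws
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return mean, lower, upper


# Hypothetical means and standard deviations in Mt.
hypothetical_inputs = {
    "de": (50.0, 2.0),
    "imports": (30.0, 3.0),
    "exports": (10.0, 1.0),
}

mean, lower, upper = simulate_dmc(hypothetical_inputs)
print(f"DMC: {mean:.1f} Mt (95% interval {lower:.1f} to {upper:.1f} Mt)")
```

In practice, errors of reconciled table entries are correlated through the balancing constraints, so treating them as independent is likely to misstate the width of the interval.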

SUPPORTING INFORMATION
The supporting information includes all data needed to reproduce the figures (SI-1), the domestic supply tables (SI-3) and the domestic use tables along with aggregated trade flows (SI-4) for the year 2008. Moreover, base and root classifications together with the root-2-base region aggregator can be found in SI-2. A GitHub repository with R scripts is openly available, allowing users to reproduce the figures and perform IO analysis with the gPIOT (www.github.com/fineprintglobal/PIOLab). The time series of the gPIOTs (2008-2017) can be downloaded from Zenodo (https://zenodo.org/record/4385975).