Reference study of CityGML software support: the GeoBIM benchmark 2019 -- Part II

OGC CityGML is an open standard for 3D city models intended to foster interoperability and support various applications. However, through our practical experience and discussions with practitioners, we have noticed several problems related to the implementation of the standard and the use of standardized data. Nevertheless, a systematic investigation of these issues had never been performed, and there was thus insufficient evidence for tackling the problems. The GeoBIM benchmark project aims at gathering such evidence by involving external volunteers, who report on the behaviour of tools with respect to relevant aspects (geometry, semantics, georeferencing, functionalities), analysed and described in this paper. The study explicitly points out the critical points embedded in the format, as an evidence base for future developments. This paper is written in tandem with Part I, which describes the results of the benchmark related to IFC, the counterpart of CityGML within building information modelling.


Introduction
Interoperability through open standards is critical for the effective re-use and exchange of data, and it is essential for the reciprocal integration of data of different natures. The integration of 3D city models with Building Information Models (BIMs) has become a widely discussed topic in recent research. Two open standard data models considered for accomplishing such an integration are the Open Geospatial Consortium (OGC) CityGML 1 for 3D city models, and the buildingSMART Industry Foundation Classes (IFC) 2 for BIM models.
However, even though open standards are highly desirable, there is significant debate surrounding the CityGML standard. As examples, Ledoux et al. (2019) and Arroyo Ohori (2020) report some of the issues of the CityGML standard from the developers' point of view. Meanwhile, various users on the web have also reported related issues, albeit in a less academic context 3 . The overall narrative is that the great number of ways in which the same models can be represented increases the implementation effort for software developers and reduces the quality of their implementations, thereby reducing interoperability and lowering the expectations of end-users.
Nevertheless, these issues are mostly discussed informally by practitioners and academics and have not been tested systematically. In order to gain more insight into the topic, the support of CityGML in software was investigated as part of the GeoBIM benchmark project 4 (see Section 2) and is reported in this paper. Within the project, the approach to the study of the support for the two standards involved in the GeoBIM integration (IFC and CityGML) was conceived in parallel, also with the aim of understanding whether one of the two offered more effective solutions that could possibly be borrowed by the other in future developments. However, the final outcomes of the two tasks are very specific to each standard and deserve to be presented and discussed separately, considering the specificities of each case. For these reasons, this paper, focusing on the results of Task 3, the benchmark task related to the support for CityGML, is written in tandem with Noardo et al. (2020), which describes Task 1, covering the support for IFC. In order to allow each paper to be read on its own, the two papers share some information (i.e. the first part of Section 2, explaining the general context and motivation of the study; Section 3.1, covering the initial part of the methodology about the entire GeoBIM benchmark set-up; and Section 3.3, concerning some similarities in the methodology).

The GeoBIM needs and the concept of this study - Common with the Task 1 paper
Two kinds of 3D information systems have increasingly been developed, studied and used in recent times, revealing their potential in the related fields:
• 3D city models, which are used to represent city objects in three dimensions and advance previous 2D maps and other cartographic products, in order to support city analysis and management, city planning, navigation, and so on (e.g. Biljecki et al. (2015); Kumar et al. (2017); Egusquiza et al. (2018); Jakubiec and Reinhart (2013); Liang et al. (2014); Bartie et al. (2010); Peters et al. (2015); Nguyen and Pearce (2012));
• Building Information Models (BIM), which are used in the architecture, engineering and construction (AEC) fields to design and manage buildings, infrastructure and other construction works, and which also have features useful for project and asset management (e.g. Petri et al. (2017); Haddock (2018); Azhar (2011)).
Several international standards exist to govern the representation of the built environment in a shared way, to foster interoperability and cross-border analysis (and, consequently, actions), and to allow the reuse of tools, analysis methods and the data themselves for research and, possibly, government. Some examples of international standards are: the European Directive establishing an Infrastructure for Spatial Information in Europe (INSPIRE) 5 , aimed at the representation of cross-border areas of land in Europe for common environmental analysis; the Land and Infrastructure conceptual model (LandInfra) 6 , by the Open Geospatial Consortium (OGC), aimed at the representation of land and civil engineering infrastructure facilities; and the green building data model (gbXML) 7 , aimed at the representation of buildings for energy analysis.
Nonetheless, the two dominant reference open standards for those two kinds of models are CityGML 8 , by the OGC, focusing on the urban-scale representation of the built environment, and the Industry Foundation Classes (ISO, 2013) 9 , by buildingSMART, aimed first at the very detailed representation of built works for design and construction purposes, but also intended to enable project management throughout the process, and asset and facility management in a later phase. Both standards are intended to be very comprehensive and are therefore broad and articulated. They both use complex data models allowing for a wide variety of models using object-oriented representations, even if this comes at the cost of slower and more inconsistent implementations.
Due to the overlapping interests of the two fields (meeting in the building-level representation), increasing attention is being paid to 3D city model-BIM integration (GeoBIM), where the exchange of information between geospatial (3D city model) and BIM sources enables the reciprocal enrichment of the two kinds of information, with advantages for both fields, e.g. automatic updates of 3D city models with high-level-of-detail features, automatic representation of BIMs in their context, automated tests of the design, and so on (Liu et al., 2017; Fosu et al., 2015; Aleksandrov et al., 2019; Kumar et al., 2019; Niu et al., 2019; Noardo et al., 2019a; Arroyo Ohori et al., 2018; Kang and Hong, 2015; Stouffs et al., 2018; Lim et al., 2019; Sun et al., 2019b).
The GeoBIM subject can be divided into several sub-issues.
1. First, the harmonization of data themselves, which have to concretely fit together, with similar (or harmonizable) features (e.g. accuracy, kind of geometry, amount of detail, kind of semantics, georeferencing).
2. Second, interoperability is fundamental to the integration. It is important to note that, before enabling interoperability among different formats (e.g. GIS formats and BIM formats), which is the subject of point three below, interoperability GIS-to-GIS and BIM-to-BIM is itself essential. That means that data formats have to be understood and interpreted uniquely and correctly, both by any person and by any supporting software. Moreover, an interoperable dataset is supposed to remain altogether unchanged when going through a potentially infinite number of imports and exports by software tools, which may convert it to their specific native formats and export it back. For this, it is desirable to rely on open standards.
3. Third, the effective conversion among different formats, i.e. transforming a dataset in one (standardised) format into another, in compliance with the specifications and features of the target format.
4. Fourth, the procedures employing 3D city models and the ones based on BIM should be adapted in order to gain the most from the integrated use of both, since those systems enable processes which are usually more complex than simple representations.
The many challenges implied by the points above are still far from being solved, and one of the essential initial steps is actually to outline such challenges more sharply.
In particular, the second point (interoperability and the involved standards) is often considered to be solved by standardization organizations. It is indeed desirable to rely on open standards for this, because the well-documented specifications of open standards enable longer-term support, as well as genericity with respect to different software vendors, as opposed to closed point-to-point solutions that merely connect one proprietary system to another (and might be discontinued or stop working at any moment). However, our previous experiences suggest that, unfortunately, the support for open standards in software is often lacking.
The researchers promoting this study (as users of data, advocates of open standards and developers of tools adopting such standards) have noticed, over their research and work activities, that the use of those standards in data and their implementations in software are not always straightforward, nor completely consistent with the standard specifications. Many tools, when managing standardized data, do not support features or functionalities as adequately as they do when the data is held in the native formats of the software. In addition, software tools have limitations with respect to the potential representation (geometry, semantics, georeferencing) of data structured following these standards, or can generate errors and erroneous representations by misinterpreting them.
The standards themselves are partly at fault here, since they often leave some details undefined, with a high degree of freedom and various possible interpretations. They allow high complexity in the organization and storage of the objects, which does not work effectively towards universal understanding, unique implementations and consistent modelling of data. This is probably due to the fact that such standards often originate as amalgamations of existing mechanisms and compromises between the various stakeholders involved. These experiences have been informally confirmed through exchanges within the scientific community and especially with the world of practitioners, who are supposed to work with (and have the most to gain from) those standardized data models and formats. However, more formal evidence on the state of implementation of these open standards, and on which problems can be attributed to the standards themselves, had not been compiled so far.
For this reason, the GeoBIM benchmark project 10,11 was proposed and funded in 2019 by the International Society for Photogrammetry and Remote Sensing (ISPRS) 12 and the European Association for Spatial Data Research (EuroSDR) 13 . The aim of the benchmark was to get a better picture of the state of software support for the two open standards (IFC and CityGML) and the conversions between them, in order to formulate recommendations for further development of the standards and the software that implements them. In addition, we tested two known major technical issues related to GeoBIM integration and which are known to be solved only partially in practice: the ability of tools and methods to georeference IFC and the conversion procedures between IFC and CityGML.
The relevant outcomes regarding the OGC CityGML standard are the subject of this paper.

OGC CityGML: overview and knotty points
CityGML 14 (by Open Geospatial Consortium) (OGC, 2012) is the most internationally widespread standard to store and exchange 3D city models with semantics in the geospatial domain. It establishes a structured way to describe the geometry and semantics of city objects.
CityGML 2.0 (current version, considered for this project) contains classes structured into 12 modules, each of them extending the core module, containing the most general classes in the data model, with city object-specific classifications, e.g. Building, Bridge, WaterBody, CityFurniture, LandUse, Relief, Transportation, Tunnel, Vegetation. These modules contain one or more classes representing specific types of objects, which differ in the way they are structured into smaller parts and the attributes that are expected for each. The most developed and most used module in practice is the Building module.
Moreover, CityGML supports the possibility to further extend the schema through a standardized Application Domain Extension (ADE) mechanism. Some existing official ADEs, which could be useful for future tests of this project, are, for example, the Noise ADE, the Energy ADE, and the Utility Network ADE. However, it is known that ADEs have poorer software support, since most implementations would need to specifically encode the new objects and attributes added by an ADE.
CityGML proposes a very attractive way of managing concepts useful for user communities. For example, it covers the most basic 3D city information with a meaningful object-oriented representation, with deep hierarchies and complex relationships, which is a more faithful representation of reality than a simpler relational one. However, the downside is that complex and unusual connections of information to internal/external sources are used in some cases (e.g. the prevalence of xlink-connected geometries within a file, the complex set of attributes used to store addresses, or the possibility to use references to external files through URIs), which can be a problem for the implementation of such a model.
CityGML geometries are essentially the same for most classes: objects are represented as boundary surfaces embedded in 3D and consist of triangular or polygonal faces, possibly with holes.
CityGML as a data format is implemented as an application schema for the Geography Markup Language (GML) (OGC, 2004); CityGML uses version 3.1.1 of GML. It is an open and human-readable format, which means that, potentially, the information could be retrieved even after losing backwards compatibility in software. However, GML presents many issues from a software developer's point of view since, for example, too many alternatives 15 are allowed even for simple objects, and a supporting application is supposed to foresee all possible combinations of them. The result of this complexity is that few software programs completely support all possible combinations, and most of its richness and power is lost (as the results of this research further demonstrate). An additional consequence of this kind of storage is its computational requirements: usually very large and complex files are produced, and it can be time- and resource-intensive to manage them properly in software.
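To make this verbosity concrete, the following sketch parses a minimal, hand-written CityGML 2.0 fragment (a hypothetical building with a single wall surface, not one of the benchmark datasets) and shows how many nested GML elements separate the feature from its actual coordinates:

```python
# Minimal, hand-written CityGML 2.0 fragment: one building with one wall
# surface. Note the nesting required before any coordinate appears.
import xml.etree.ElementTree as ET

CITYGML_FRAGMENT = """<bldg:Building
    xmlns:bldg="http://www.opengis.net/citygml/building/2.0"
    xmlns:gml="http://www.opengis.net/gml">
  <bldg:boundedBy>
    <bldg:WallSurface>
      <bldg:lod2MultiSurface>
        <gml:MultiSurface>
          <gml:surfaceMember>
            <gml:Polygon>
              <gml:exterior>
                <gml:LinearRing>
                  <gml:posList srsDimension="3">
                    0 0 0  10 0 0  10 0 3  0 0 3  0 0 0
                  </gml:posList>
                </gml:LinearRing>
              </gml:exterior>
            </gml:Polygon>
          </gml:surfaceMember>
        </gml:MultiSurface>
      </bldg:lod2MultiSurface>
    </bldg:WallSurface>
  </bldg:boundedBy>
</bldg:Building>"""

GML = "{http://www.opengis.net/gml}"
root = ET.fromstring(CITYGML_FRAGMENT)

# Extract the coordinate list: it sits eight element levels below the
# Building feature itself.
pos_list = root.find(f".//{GML}posList")
coords = [float(v) for v in pos_list.text.split()]
print(len(coords) // 3, "vertices")  # 5 vertices (the ring is closed)
```

A conforming reader must also handle the many alternative geometry encodings that GML allows for the same polygon (e.g. `gml:pos` per vertex instead of `gml:posList`), which is precisely the implementation burden discussed above.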
As a possible solution to those issues, CityJSON 16 (version 1.0.0) was recently released, providing a JSON encoding for a subset of the CityGML 2.0.0 data model. CityJSON follows the philosophy of another (non-standardised but working) encoding of CityGML, 3DCityDB (Yao et al., 2018): to store the models efficiently and allow practitioners to access features and their geometries easily. The deep hierarchies of the CityGML data model are replaced by a simpler representation. Furthermore, some additional restrictions are applied, and one and only one way is allowed to represent the semantics and the geometries of a specific feature. CityJSON is in the process of becoming an OGC community standard. There is already a broad consensus around it among users, who are choosing it as an alternative to CityGML, and many tools have already been developed to work with it effectively. Some of the tests performed within this study use CityJSON as a pathway to process CityGML, in order to manage the data effectively within software.
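As an illustration of this flattening, a minimal CityJSON 1.0 document (hand-written here for a single hypothetical LoD1 building, not one of the benchmark datasets) can be sketched as a plain Python dictionary:

```python
import json

# Minimal CityJSON 1.0 document: one LoD1 building stored as a Solid.
# Boundaries reference the shared, top-level vertex list by index,
# instead of repeating coordinate strings as GML does.
city_model = {
    "type": "CityJSON",
    "version": "1.0",
    "CityObjects": {
        "building-1": {
            "type": "Building",
            "attributes": {"measuredHeight": 3.0},
            "geometry": [{
                "type": "Solid",
                "lod": 1,
                # one shell -> six surfaces -> one ring each (a box)
                "boundaries": [[
                    [[0, 3, 2, 1]], [[4, 5, 6, 7]],
                    [[0, 1, 5, 4]], [[1, 2, 6, 5]],
                    [[2, 3, 7, 6]], [[3, 0, 4, 7]],
                ]],
            }],
        }
    },
    "vertices": [
        [0, 0, 0], [10, 0, 0], [10, 10, 0], [0, 10, 0],
        [0, 0, 3], [10, 0, 3], [10, 10, 3], [0, 10, 3],
    ],
}

# Every feature and its geometry is reachable in one step:
bld = city_model["CityObjects"]["building-1"]
print(bld["type"], "with", len(bld["geometry"][0]["boundaries"][0]), "faces")
encoded = json.dumps(city_model)
```

Compared with the GML encoding, there is exactly one way to write this building, which is what makes complete software support feasible.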
In this paper, the framework of the GeoBIM benchmark methodology is described, with a specific focus on the part of the project regarding the investigation of the support for CityGML and related results.

The GeoBIM benchmark general set-up - Common with the Task 1 paper
The benchmark was intended as a way to combine the expertise of many people with different skills, coming from several fields and interests, in order to describe the current ability of software tools to use (i.e. read, visualize, import, manage, analyse, export) CityGML and IFC models and to understand their performance while doing so, both in terms of information management functionalities and of possible information loss. Moreover, since the large size of such standardised datasets often causes difficulties in their computational management, the ability to handle large datasets was a further part of the tests.
In particular, the four topics investigated in the benchmark are:
Task 1 What is the support for IFC within BIM (and other) software?
Task 2 What options for geo-referencing BIM data are available?
Task 3 What is the support for CityGML within GIS (and other) tools? 17
Task 4 What options for conversion (software and procedural) (both IFC to CityGML and CityGML to IFC) are available?
For this purpose, a set of representative IFC and CityGML datasets was provided (Noardo et al., 2019b) and used by external volunteer participants in the software they wished to test, in order to check its support for the considered open standard (Noardo et al., 2019a).
Full details about the tested software and a full list of participants can be found on the respective pages of the benchmark website 18 . The significant number of participants, and the balance in skills, fields of work and levels of confidence with the tested software (which participants were asked to declare), made it possible to limit bias in the results.
The participants described the behaviour of the tested tools following detailed instructions and delivered the results in a common template with specific questions, provided as online forms. In the end, they delivered both their observations and the model as re-exported back to the original standardised format (CityGML or IFC).
In order to cover the widest possible part of the list of software potentially supporting the investigated standards, we completed the testing ourselves, searching the online documentation of both the standards and the potential software.
In the final phase of the project, the team promoting the study analysed the participants' observations, descriptions and delivered documentation (screenshots, log files, related documents and web pages). From this review, an assessment of the performance and functionalities of the tested tools was derived. Moreover, the delivered models were validated and analysed using available tools, when possible, and/or through manual inspection (Section 3.3). This approach allowed us to investigate the level of interoperability offered by the standard and its software implementations, by comparing the features of the exported models with those of the imported ones.
It is important to note that the test results are not intended to substitute the official documentation of each software package. Moreover, there were no expertise or skill requirements to participate in the benchmark tests. Therefore, some information could be wrong or inaccurate, due to limited experience with the tested software or with the topics involved. The declared level of expertise is intended to mitigate this possible bias. Moreover, the benchmark team and the authors tried to double-check the answers (at least the most unexpected ones) as much as possible, but the answers reported in the data were generally not changed from the original ones. Any discrepancies between the best potential software performance and what was actually tested may in any case indicate a low level of user-friendliness of the tools (and thus a degree of difficulty in achieving the correct result).

The provided CityGML datasets
A number of datasets from different sources were identified, pre-processed and validated for this benchmark activity (see Noardo et al. (2019b) for details). The datasets were chosen to test both the most common features of such data and the main detected issues regarding the interesting but tricky aspects of the format.
Therefore, a large LoD1 file was chosen to test the support for a quite simple but extensive dataset: the whole city of Amsterdam in LoD1, covering many city-related objects and useful for testing software- and hardware-related performance. Second, a two-LoD file (LoD1 and LoD2) representing a district of Rotterdam was aimed at testing the support for different LoDs stored in the same file. Finally, the more complex geometries (and related different semantics) needed for an LoD3 representation were tested by means of the synthetically generated file BuildingsLoD3.gml (Table 1).

Dataset | Description | Dimension | Source | Aim
Amsterdam.gml | Seamless city model covering the whole city of Amsterdam, including several CityGML city entities (vegetation, roads, water, buildings, and so on). Level of Detail (LoD) 1. | GB | Generated through 3dfier by TUDelft 19 | Test of the hardware- and software-connected performances (it is a very heavy model), and support for the included city classes.
RotterdamLoD12.gml | District of Rotterdam in LoD1 and LoD2. | | | Test of the support for multiple LoDs and textured files.
Buildings-LoD3.gml | Procedurally modelled buildings in LoD 3. | 1.33 MB | Generated through Random3Dcity 20 | Test of the support for LoD 3 files and related classes.
Analysis of the answers about the support for CityGML - Partially similar to the Task 1 paper
The methodologies for analysing the results about the support of software for IFC (Task 1) and CityGML (Task 3) are very similar, since they were conceived to test similar issues concerning interoperability and the ability of software to keep files consistent after import-export phases.
The initial part of results analysis (Section 4.2) is qualitative, providing the description of software support and functionality based on the delivered answers.
The complete answers and documents delivered in the online templates 21 were double-checked for correctness and consistency with respect to the questions asked. However, due to the nature of the initiative, we trusted the delivered information about the software, double-checking it with new tests only in cases of inconsistent answers in different tests of the same software, or of unexpected answers. In these cases, we also considered the level of expertise of the participant to assess whether further checks were actually needed.
The delivered answers in the templates were critically assessed, cross-checking them against the different tests of the same software and the attached screenshots. A score was assigned to each aspect considered for the assessment of general support and software functionality: 1 (full support); 0.5 (partial support); 0 (no support). These scores are synthesized in a table (Table 3), from which it is also easier to deduce possible patterns across many issues for a single software package, or across many software packages for a single issue.
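As an illustration, the scoring and its aggregation per software package can be sketched as follows (the tool names and scores below are invented placeholders, not actual benchmark results):

```python
# Hypothetical support scores per software package and per assessed aspect:
# 1 = full support, 0.5 = partial support, 0 = no support (placeholder data).
scores = {
    "Tool A": {"georeferencing": 1.0, "semantics": 0.5, "geometry": 1.0},
    "Tool B": {"georeferencing": 0.0, "semantics": 0.5, "geometry": 0.5},
}

def overall_support(aspect_scores):
    """Average the per-aspect scores into one indicative support value."""
    return sum(aspect_scores.values()) / len(aspect_scores)

for tool, aspects in scores.items():
    print(f"{tool}: {overall_support(aspects):.2f}")
```

Laying the scores out as a matrix (tools as rows, aspects as columns) is what makes both kinds of patterns visible: a weak row indicates a poorly supporting tool, a weak column a generally problematic aspect.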
The definitions of software groups are becoming increasingly fuzzy, since the functionalities of all of them are continuously being extended and now tend to overlap with each other. However, in the tables and, more generally, in the analysis, in order to help the detection of possible patterns, the tested software packages are classified considering the criteria that usually guide the choices made by users, based on their different needs for specific tasks:
• GIS - people expect GIS to combine different kinds of geodata and layers in a holistic system, structured in a database, and to run analyses on them;
• 'Extended' 3D viewers - software originally developed for visualising and querying 3D semantic models, including georeferencing, which was (sometimes later) extended with new functions for applying symbology or making simple analyses;
• Extract, Transform and Load (ETL) software, and conversion software, which are expected to apply defined transformations or computations to data;
• 3D modelling tools, which have good support for geometry editing, but were not originally intended to manage georeferenced data or semantics;
• Analysis software, intended for a few kinds of very specific analysis (e.g. energy analysis);
• BIM software, intended to design buildings or infrastructure according to BIM methods.
The investigated issues, reflected in the different sections of the provided templates, regarded mainly the support of the software for the two standards (how the software reads and visualises the datasets) and the functionalities allowed by the software with standardised datasets (what it is possible to do with such data). In particular, the test about support was intended to check: how the georeferencing information in the files is read and managed; how the semantics are read, interpreted and kept after the import; and how the geometry behaves after the import. Moreover, some additional questions for Task 3 were intended to investigate whether any additional step or conversion was necessary to use the file within the tested software, or whether a straightforward workflow was sufficient (e.g. opening the software and importing the file by pushing a button).
Additional questions therefore are:
• What kinds of formats can be managed (is the file supported through the GML encoding, or is a conversion needed, in addition or as an alternative, to the CityJSON format or to a database, mainly SQL)?
• Does the software support the file out of the box, or is some specific plug-in or add-on needed?
• What kind of visualization is enabled (3D, 2D, with textures, with specific themes)?
• What kind of editing is possible (attributes, geometry, georeferencing)?
• What kinds of query are possible (querying a single object to read its attributes, selection by conditions on attributes, spatial queries, computation of new attributes)?
• What analyses are allowed? This topic is more complex, since very different analyses are possible. Therefore we summarized it by defining two analysis types: 'Type 1' is any kind of analysis regarding the model itself (like geometric or semantic validation), and 'Type 2' covers simulations and analyses of the performance of the represented object (e.g. a building) with respect to external factors, in the city or environment (e.g. shadow, noise, energy, etc.).
• Finally: is it possible to export back to CityGML?
One more aspect that the testers of Task 3 (support for CityGML) were asked about, and that was checked in the delivered results, was the support for ADEs, although it was only checked in theory, since no ADE datasets were provided.
Moreover, the support for each of the delivered datasets was noted, given their specific features: multi-LoD management (through the RotterdamLOD12.gml dataset), LoD3 management (through BuildingsLOD3.gml) and the management of a large LoD1 model (the amsterdam.gml dataset).
These first parts provide a reference about the tools themselves for people intending to use standardised information. In addition, the most challenging tasks and most frequent issues in the management of the standards were expected to be pointed out.
A second, more quantitative, part of the analysis considers the delivered models exported back to CityGML 22 from the tested software (Section 4.2.4). The numbers and types of features of these files were computed and compared to the same features in the initial datasets provided for the test.
The semantics were checked, in terms of numbers of entities and relationships, as computed by the statistics tool of the KIT FZK Viewer 23 . Moreover, the presence and consistency of attributes was also checked by means of manual inspection in 3D viewers (FZK Viewer and azul 24 ).
In addition, the CityGML schemas were validated by means of the GML schema validator of the FZK Viewer and the CityGML schema validator 25 .
The numbers and kinds of geometries were also counted with the FZK statistics tool, further supported by manual inspection in some cases. Finally, the val3dity 26 validation tool (Ledoux, 2013) allowed testing the validity of the re-exported geometries.
This allowed us to assess the level of interoperability that the connected standards-tools can actually reach in the different cases: i.e. can the data be imported and re-exported without any change?
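A minimal sketch of such a round-trip check (using Python's standard XML parser on two hypothetical in-memory documents, rather than the actual FZK/val3dity toolchain used in the benchmark) simply compares the element counts per tag before and after the export:

```python
import xml.etree.ElementTree as ET
from collections import Counter

def element_counts(xml_text):
    """Count elements per (namespaced) tag in a GML/CityGML document."""
    return Counter(el.tag for el in ET.fromstring(xml_text).iter())

# Hypothetical before/after snippets: the re-export lost one child element.
original = "<root><a/><a/><b><c/></b></root>"
reexport = "<root><a/><a/><b/></root>"

# Counter subtraction keeps only the tags that appear less often after
# the round trip, i.e. the information lost by the import/export cycle.
lost = element_counts(original) - element_counts(reexport)
print(dict(lost))  # {'c': 1}
```

In the real analysis the same idea is applied to CityGML classes, attributes and geometries, with validators checking that the surviving elements are also still valid.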
A further assessment (Section 4.2.3) was intended to evaluate software- and hardware-related performance. The times declared by the testers were compared for the three datasets, to see whether their computational weight could affect their management within software.
Given the complexity of measuring software performance to the closest second, this was not requested from the users. Instead, they were asked to provide an approximate timing value for each test, according to a classification that reflects the way timings affect the perception or the work of a user:
• It is almost immediate (good!)
• Less than a minute (ok, I will wait)
• 1-5 minutes (I can wait, if it is not urgent)
• 5-20 minutes (in the meantime I do other things)
• 20 minutes-1 hour (I cannot rely on it for frequent tasks)
• More than 1 hour (I launch my process and go home, definitely ineffective for regular work)
Other options included reporting that the software crashed or that the task was not possible with the software provided. Participants were also asked to provide information about the specification of their machine, as this may impact the overall performance of the software. Due to their diverse levels of size and complexity, timing results are summarised per dataset.
22 It is possible to access the exported models at https://www.dropbox.com/sh/j39xykfynd0uvkj/
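As an illustration, the timing classes can be expressed as a simple binning function (the five-second cut-off for "almost immediate" is our assumption; the labels paraphrase the list above):

```python
# Map a rough runtime in seconds to the benchmark's timing classes.
# Upper bounds are in seconds; the first bin's bound is an assumption.
TIMING_CLASSES = [
    (5, "almost immediate"),
    (60, "less than a minute"),
    (5 * 60, "1-5 minutes"),
    (20 * 60, "5-20 minutes"),
    (60 * 60, "20 minutes-1 hour"),
]

def timing_class(seconds):
    """Return the coarse timing class for an approximate runtime."""
    for upper_bound, label in TIMING_CLASSES:
        if seconds <= upper_bound:
            return label
    return "more than 1 hour"

print(timing_class(90))    # 1-5 minutes
print(timing_class(7200))  # more than 1 hour
```

Coarse bins like these trade precision for comparability: testers on different machines can still report consistent, user-perception-oriented values.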
4 Results: support of software for CityGML

Tested software against support for CityGML
In order to ensure the coverage of most of the software solutions able to manage CityGML, we checked that the main currently used GIS software packages were tested, and in addition we tried to ensure that the tools declaring some support for CityGML were tested as well.
The tested software packages were 15 in total, with several tests for some of them, likely covering most of the possible solutions. The full list is reported in . Some of the software packages were tested several times, especially the best-known and most used GIS tools, where the inclusion of 3D city model data could be the most interesting and useful, as well as the most natural, since they are the current tools employed to manage geoinformation.
QGIS 27 was tested four times, plus one partial answer, by participants having different levels of expertise (beginners, current users and experts). The four tests conducted adopted different approaches, though: built-in support for GML, import of the file after conversion to CityJSON, and the use of the CityJSON loader plug-in. Additional methods are the import through citygml4j and the use of GMLAS 28 . However, the GMLAS procedure was not successful for the import of the data.
The CityJSON 29 encoding was used to enable the import of the datasets into Blender, by means of the CityJSON plug-in 30 , which could be useful for modelling and editing the models.
ESRI ArcGIS was also tested multiple times, both in the standard version (3 tests) and in the Pro version (2 tests). Other popular software packages were FME (Feature Manipulation Engine) 31 from Safe Software, tested twice, and the freeware extended 3D viewer KIT FZK Viewer 32 , also tested twice (by testers with expertise levels 1 and 2).
Moreover, other applications were considered (raising the number of tested software packages to 26), selected on the basis of information found on the CityGML Wiki under 'software' 33,34,35 or on the websites of programs declaring support for CityGML. However, some of them, for which the full test was not performed, did not actually support the data or were no longer available, namely:
• Autodesk Landxplorer CityGML viewer is quite outdated: information about it can be found until approximately 2011 36 , and no download was found except for an external website publishing an executable of its 2009 version 37 ;
• Bentley Map and Bentley Microstation were tested, but it was not possible to import CityGML datasets;
• SketchUp+CityGML plugin 38 is outdated: it worked for SketchUp v.2015, whilst the newest one is v.2019;
• Cesium viewer and Google Earth work only through conversion. Different tools are available for that purpose, but the participants in this study chose the 3DCityDB converter. 3DCityDB loads CityGML into relational tables in PostGIS or Oracle Spatial and can export to CityGML, KML/COLLADA and glTF. Rendering these formats is very efficient, at the expense of losing complex semantics and relationships;
• Sidefx Houdini was proposed for testing by one of the participants, but it does not work with CityGML;
• iTOWNS 39 does not support CityGML, except, possibly, through some conversion to PostGIS 40 ;
• HALE studio 41 should also be able to support CityGML, but the function to import CityGML was not actually found in the software.

Software support for CityGML
In this section, the qualitative analysis of the answers delivered by the participants 42 , describing the software tools and the tests, is reported. Those answers are summarized in Table 3, where the colour scale is assigned according to the scores for Task 3, from 1 (full support, green) to 0 (no support, red). In the table, the software supporting CityGML through conversion to other formats, namely CityJSON or through FME-based scripts, is indicated in purple and orange, respectively.

Loading of CityGML data
First, the support for loading (City)GML files directly by the software was evaluated. This functionality would allow conversions between different formats to be avoided, ensuring easier use of such data by experts of other domains. Also, it would prevent the introduction of further errors and inconsistencies to the data, due to the conversion process.
In the case of GIS software (ArcGIS and QGIS), direct import of GML files only worked correctly with the LoD1 data, which is basically 2.5D. Moreover, very few functionalities are enabled in this case: it is only possible to visualise and query the data, without editing. The other two datasets -that is, the LoD3 and multi-LoD files, which are both "real" 3D -were interpreted in a completely wrong way: geometries were only loaded as points for the multi-LoD dataset, and only attributes with no geometries were loaded for the LoD3 dataset.
To import the data consistently, additional software or specific plug-ins were required in both cases. For ArcGIS, the 'Data Interoperability' toolbox was used, which is based on the Safe Software Feature Manipulation Engine (FME). For QGIS, several plug-ins exist, of which only one worked (CityJSON Loader 43 ), and it requires a conversion to the CityJSON format. The conversion to a relational database through 3DCityDB would also work in both cases, but again that involves a conversion and external tools in the process.
42 https://www.dropbox.com/sh/35ke2uc5yfykik9/AAB5LmF56qr2jJ-hPLzo7Aypa?dl=0
The case of the 'extended 3D viewers' is different, since they were implemented specifically to work with the (City)GML format and can, therefore, read it directly. The same is true for the ETL and conversion tools, which can interpret the GML format, since their aim is generally to transform it into something else. CitySim Pro also implements a specific import for CityGML, which works.
That is not the case for 3D modellers and BIM software, where external tools are needed: the proprietary software ESRI CityEngine and Autodesk Infraworks only manage their own proprietary formats, therefore the import through a connected FME-based converter is necessary. Blender, being open source, can manage several more standardised formats, but a specific plug-in is needed and, furthermore, it imports the data in CityJSON, which means the data must first be converted to CityJSON 44 .
To summarise, 70% of the tested software can read GML directly; however, most of these (9 out of 15) were specifically programmed for CityGML. The remaining ones need conversions that usually go through FME processing (especially the proprietary ones) and/or through conversion to CityJSON 45 , or require the use of plug-ins.
Another point regarding the support of software for the CityGML data model concerns the management of more complex kinds of geometry once imported, namely: the consistent interpretation, visualisation and use for analysis of multi-LoD datasets, and the handling of LoD3 data.
The consistent interpretation of multi-LoD data is intended as the possibility to read the information of the CityGML objects associated with the various LoD geometries at once, choosing among them from the same interface for visualisation or for use in analysis. This was tested through the RotterdamLoD12.gml dataset. Only 40% of the tested software is able to manage them consistently:
• No consistency in GIS (the LoDs are read together and superimposed);
• Partial support from the ETL and conversion tools (in FME the model is either considered as a unique aggregate or the LoDs can be uploaded separately; in 3DCityDB it is necessary to choose which LoD to work with);
• Being based on FME, the same is true for CityEngine and Infraworks;
• Partial support in CitySim Pro, since only LoD2 and LoD3 are allowed, which made it difficult for us to test;
• The issue is well managed by most of the extended 3D viewers (except for tridicon CityDiscoverer Light, FMEDataInspector and azul);
• Blender can also manage them consistently (through the CityJSON format).
43 https://github.com/tudelft3d/cityjson-qgis-plugin
44 For the conversion citygml-tools can be used: https://github.com/citygml4j/citygml-tools
45 An FME reader/writer for CityJSON has now been implemented (https://github.com/safesoftware/fme-CityJSON), so that it likely could also be used for importing CityJSON data. However, this functionality was not yet available during the data collection phase of the benchmark.
LoD3, tested through the BuildingsLoD3.gml dataset, is well supported by all of the tested software once the necessary conversions are made (only QGIS fails to read it when the GML format is used).
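In the CityJSON encoding, the LoD of each geometry is stored explicitly in a `lod` property of the geometry object, which is what makes a consistent multi-LoD reader straightforward to sketch. The `sample` dict below is synthetic, not one of the benchmark datasets.

```python
def lods_per_object(cj: dict) -> dict:
    """Map each CityObject id to the sorted list of LoDs of its geometries."""
    return {
        oid: sorted({str(g.get("lod")) for g in obj.get("geometry", [])})
        for oid, obj in cj.get("CityObjects", {}).items()
    }

# Synthetic two-LoD building, mimicking the RotterdamLoD12 situation:
sample = {
    "type": "CityJSON",
    "CityObjects": {
        "bldg-1": {
            "type": "Building",
            "geometry": [{"type": "Solid", "lod": 1},
                         {"type": "Solid", "lod": 2}],
        }
    },
}
```

A viewer offering consistent multi-LoD support would let the user pick one of the returned LoDs per object; here `lods_per_object(sample)` yields `{'bldg-1': ['1', '2']}`.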
A second level of the test checked that the data were interpreted and read correctly, that is, that their georeferencing information, semantics and geometry were not lost in the conversion to the native formats read by the software, but actually remained interoperable.

Loading georeferencing
Looking at georeferencing, the information about the coordinate reference system (CRS) can be consistently read by all the software managing the files (except for Blender, which is not intended to manage georeferenced objects), as can the heights. In a few cases it is necessary to set the CRS manually: sometimes in QGIS or ArcGIS, in novaFACTORY, and it also has to be explicitly set in FME and, consequently, in CityEngine. No relevant problems were found for this aspect, as expected from software born to manage geoinformation.

Loading semantics
Semantics proved more difficult to manage:
• 'Extended' 3D viewers as well as ETL and conversion tools can usually interpret the semantics properly, preserving the consistency of entities, hierarchies, attributes and further relationships, with some exceptions: for example, hierarchy and relationships are managed through parent IDs in relational-database fashion in some cases (FME and related software);
• tridicon CityDiscoverer Light has no functionality to read either georeferencing or semantics information;
• In the GIS tools the semantics is converted to the internal, relational data structure, therefore part of the information is always lost or managed through different tables connected by means of IDs, which makes the structure far more complex and less usable.
In QGIS the testers 46 pointed out that, when the data are managed through the GML format, the entity information (wall, door, building) can sometimes be lost, with the attributes listed under a generic 'cityobjectmember'. In the CityJSON case, instead, the geometries are correctly recognised, but they cannot be accessed through the attribute table. The entity name (Building, BuildingPart and so on) is kept in the attribute 'type' of the CityJSON format, and the relationships are managed through IDs stored in the attribute tables. The management of attributes is sometimes made more complex by their storage in different tables, connected through IDs.
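The id-based management of relationships described above can be illustrated with CityJSON's `children`/`parents` id arrays: rebuilding the hierarchy is a matter of following ids. This is a sketch over synthetic data, not one of the benchmark files.

```python
def children_of(cj: dict, oid: str) -> list:
    """Return (type, id) pairs of the children of a CityObject,
    following the id references stored in its 'children' array."""
    objs = cj.get("CityObjects", {})
    return [(objs[c].get("type"), c)
            for c in objs.get(oid, {}).get("children", [])]

# Synthetic Building with one BuildingPart linked by id:
sample = {
    "CityObjects": {
        "b1": {"type": "Building", "children": ["b1-part"]},
        "b1-part": {"type": "BuildingPart", "parents": ["b1"]},
    }
}
```

Calling `children_of(sample, "b1")` returns `[("BuildingPart", "b1-part")]`; a flat attribute table exposes only the raw id strings, which is why the hierarchy becomes hard to use once the data land in a relational structure.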

Loading geometry
The tested tools offered little possibility to assess the geometry validity and correctness; more results are presented in the analysis of the exported models (Section 4.2.4). However, the geometries look generally good, except in GIS software when directly importing GML files without conversion: in that case, as already discussed, the 3D (LoD3) and multi-LoD geometries were not read correctly.
A secondary question about the support for CityGML regarded the possibility to manage Application Domain Extension (ADE) information. This is managed by FME and some of the software using FME-based procedures (ArcGIS), by FZK Viewer (through the addition of the schemas to the software files), eveBIM, novaFACTORY and Blender. In none of them, however, is it possible to use the information for analysis, except for FME and ArcGIS, as stated in the answers. In the other cases such information can only be viewed and sometimes queried.

Using CityGML data
The functionalities asked for testing were: visualisation, editing, query and analysis.
All the tested tools can visualise the data in 3D, and some of them (the GIS software, the ETL and conversion tools, plus 1Spatial Elyx3D and FMEDataInspector) can also do so in 2D, which is not, however, the priority of those tools nor of those data.
Partial support also regards the visualisation of textures, consistently provided by half of the software tools: ArcGIS Pro, FMEDataInspector, FME, CityEngine, Infraworks (all based on FME conversion and able to visualise textures in their native formats), eveBIM and 1Spatial Elyx3D. Thematic visualisation (i.e. the application of symbology based on variables such as the name of entities, attribute values, queries and so on) is enabled, as the association of different colours to different entities, in azul, novaFACTORY, 1Spatial Elyx3D, Blender and CitySim Pro. FZK Viewer also allows some thematic symbology based on the values of some attributes; a drawback is that only a limited number of pre-set thematisations are allowed, without full customisation. It is nevertheless the most advanced application for this functionality, which could be very useful for current GIS users working with 3D information. QGIS also allows the application of symbology to 3D data in its 3D viewer.
The possibility to edit the data is also important for users. For this task, the GIS tools offer the best support: it is usually possible to edit attributes, geometry (for example, by moving vertices) and georeferencing. This is however not possible with GML data in QGIS. Limited editing is possible in the enhanced 3D viewers: for example, it is only possible to remove objects in FZK Viewer; novaFACTORY can edit attributes and geometry through a module called 'Feature3D' and additional plug-ins; and 1Spatial Elyx3D can edit attributes only if the data are converted to a relational DBMS and imported in that format.
We can notice that, unfortunately, the tools which can best read the GML format (the enhanced 3D viewers) are the least able to manage the editing. This is likely because good functionality is achieved partly through a conversion to an optimised internal data model, from which it is hard to export back to CityGML.
Moreover, any kind of change is possible through the workflows of FME, and partially in CityEngine and Blender, where it is possible to edit attributes and geometry. The format edited in CityEngine is the native software format as converted by FME, while in Blender the CityJSON format is managed and can be edited. Finally, CitySim Pro allows the editing of some energy-related parameters, external to the dataset, according to its scope. Also, 3DcityDB does not include editing, query and analysis functionalities, but of course those are not the aim of the software and it is not considered a limitation.
The query functionality is another important requirement for working with data effectively. The synthesis table (Table 3) shows that most of the tested software, especially the GIS packages and 3D viewers (except tridicon CityDiscoverer Light), allows querying the objects directly (by 'clicking' on them) to read their attributes. CityEngine and Blender also allow that. More complex queries, such as the selection of entities based on rules, are possible in some of the software. They are well managed in GIS, although the 2D footprint of the geometry is often what is considered when running spatial queries. In the 3D viewers, except for tridicon CityDiscoverer Light, all sorts of queries are also possible, even if reduced in some cases: for example, only very simple queries can be performed in FMEDataInspector, like looking for one attribute in a table, and in FZK Viewer they are quite predefined. In FME and Blender, instead, any kind of query is supported but it has to be specifically programmed by the user: through FME transformers in FME, or in Python in the case of Blender.
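As a sketch of the kind of user-programmed, rule-based query mentioned for Blender, the snippet below filters CityObjects of a CityJSON-style dict by an attribute predicate. The data and the `yearOfConstruction` values are synthetic, used only for illustration.

```python
def select(cj: dict, obj_type: str, predicate) -> list:
    """Rule-based selection: ids of CityObjects of the given type whose
    attributes satisfy the predicate (a callable on the attributes dict)."""
    return [
        oid
        for oid, obj in cj.get("CityObjects", {}).items()
        if obj.get("type") == obj_type and predicate(obj.get("attributes", {}))
    ]

# Synthetic dataset: two buildings and a road.
sample = {
    "CityObjects": {
        "b1": {"type": "Building", "attributes": {"yearOfConstruction": 1935}},
        "b2": {"type": "Building", "attributes": {"yearOfConstruction": 1998}},
        "r1": {"type": "Road", "attributes": {}},
    }
}

# Select buildings built before 1950:
old = select(sample, "Building",
             lambda a: a.get("yearOfConstruction", 9999) < 1950)
```

Here `old` evaluates to `["b1"]`; in Blender the same logic would run over the imported objects rather than the raw dict.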
The analysis functionalities of the software were considered separately for the analysis of the data themselves (geometry, semantics and georeferencing validity) and for the simulation of the interaction of the represented objects with their context (Type 1 and Type 2, respectively). This topic is probably not treated exhaustively here, since not all the participants reported on it with the best possible accuracy and, moreover, the least expert among them may not have found the less apparent analysis tools. Nevertheless, the tests show that, generally, partial support for analysis is available in GIS tools, which mainly work with 2D information. Sometimes additional plug-ins or toolboxes can perform further 3D analysis. Such tools are quite new and little developed even for 3D information in the software's native format, therefore we could not determine whether they behave differently when used on CityGML data. In ArcGIS, the 3D Analyst extension is necessary, which used to deal with 2.5D information like digital terrain and surface models and is probably being extended to real 3D as well. Moreover, it is possible to program further analyses through Python scripts, which is however not so user-friendly for the least expert users. Very little analysis is possible within the 3D viewers: most of the time measurements on the model are possible; some energy analysis can be done in FZK Viewer and novaFACTORY; shadow analysis in some cases (eveBIM and novaFACTORY); overlay, buffer and visibility analysis can be performed in novaFACTORY, 1Spatial Elyx3D, CityEngine and Blender. Moreover, validation of the schema can be done in FZK Viewer, and the same, plus some geometric validation and many other analyses, is possible in FME, which is the tool best supporting this kind of functionality, through workflows built by means of its transformers. The specific goal of CitySim Pro is to run energy analyses.
However, in this test those analyses did not work with the provided CityGML datasets. The implementation of analysis tools in software supporting CityGML is therefore not very developed yet, with some exceptions for visibility and shadow analysis and the occasionally available energy tools. The availability of tools analysing 3D information generally needs to be enhanced.
Finally, few software packages can export back to CityGML, even those importing it through FME (e.g. CityEngine, Infraworks). The GIS tools can export through the same tools they used for the import (the ArcGIS Data Interoperability toolbox, and the CityJSON conversion tools in QGIS). In some 3D viewers, the data are not converted at all, therefore they are just saved with any changes, without being exported (which would instead imply a conversion). Moreover, the ETL and conversion tools are developed on purpose to run import-export processing, and are therefore able to export too.

Software performances with CityGML data
A total of 21 different reports were returned, covering 15 different software packages. In particular, multiple results were returned for Esri ArcGIS Pro (2 sets), Esri ArcGIS (3 sets), QGIS (3 sets) and FME (2 sets). These offer the opportunity for timing comparisons to investigate the impact of hardware on software performance. Table 4 gives a summary of the success rates returned for the tests on the Rotterdam data, and Table 5 gives the count of the different timing values for the successful tests.

Rotterdam dataset
Note that in some cases users reported results for some of the tests but not for all of them ("No result reported"). Additionally, some users typed in comments such as "no error" instead of giving a specific timing; these are included in the "No result reported" count. For the Rotterdam dataset, 2 out of 17 of the successfully completed tests took over 20 minutes to open the dataset (CityEngine and ArcGIS, with the second ArcGIS test reporting that this task took between 5 and 20 minutes). The vast majority of the software packages tested could zoom and pan the data immediately, with only one package taking between 5 and 20 minutes to zoom into the dataset (eveBIM).
Only 10 out of the 21 tests report a successful export to CityGML, with all of these taking less than 5 minutes to execute.
Table 6 gives a summary of the success rates returned for the tests on the LoD3 data. The results reflect the small size of this dataset, and similarly all 11 successful attempts to re-export the data took less than 5 minutes. Rotation, zoom and pan are very rapid, but one report notes a time of between 5 and 20 minutes to inspect relationships between objects (Esri ArcGIS).
Table 8 gives a summary of the success rates returned for the tests on the Amsterdam data, and Table 9 gives the count of the different timing values for the successful tests. As above, in some cases users reported results for some of the tests but not for all of them ("No result reported"). These results demonstrate clearly the impact of the larger dataset on the tests carried out, with only 8 out of 21 reports indicating that the software was able to handle the data (48%), and an additional 38% reporting software crashes. The three 'no result reported' cases here relate to QGIS, where the testers either displayed the data as points or first converted the data to CityJSON, which is not what the task required. None of the testers reported a time of less than 5 minutes to open the dataset, and 5 reported a time of over 1 hour. 3DCityDB reported an export time of 1-5 minutes, perhaps due to its bespoke development for CityGML, in contrast to the generic functionality offered by many of the other software packages.

Multiple Tests on Same Software Packages
The crowdsourcing approach taken in this project resulted in multiple participants testing the same software, providing the opportunity for comparison. However, the three QGIS tests were eliminated from the comparison, as one imported the data as points and the other two converted the data to CityJSON.
• Comparing the FME Data Inspector tests, many of the results obtained were identical in terms of performance time. A slight difference (1-5 minutes versus less than one minute for the import of the Rotterdam data) could perhaps be attributed to the different RAM values with the first participant having 16GB and the second 32GB. However, this did not make a difference in terms of the export time for the Amsterdam dataset, reported at 5-20 minutes by both participants.
• For the two ArcGIS Pro tests, one user reported that the Rotterdam dataset caused the software to crash when zooming and panning, meaning that results could not be compared. All other results (Amsterdam and LoD3) were similar.
• For the Amsterdam dataset, two out of three of the ArcGIS testers managed to import and visualise the data, whereas the third user reported that this crashed their system. This user also reported slower export times (1-5 minutes) for the LoD3 and Rotterdam datasets, in comparison to less than a minute reported by the other two users. Examining the hardware used, all three users had 16GB of RAM and dedicated graphics cards. The only potentially significant difference between the machines is that the first two ran on an Intel i7 and the third on an Intel i5; however, the latter has a reported CPU speed (from the user) of 3.2GHz, whereas the former two both use the Intel i7-8750H chip, which reports a speed of between 2.2 and 4.1GHz. Similar amounts of hard drive space were available for all three tests (270, 210 and 298GB). Perhaps importantly, it was not specified whether the hard drives were solid state, which could be a significant factor when importing a large dataset and writing it temporarily to disc.
• The first FZK Viewer respondent was able to perform some timing tests on analytical tasks using the Amsterdam data, reporting that Type 1 tests took more than 1 hour (XML Schema validation) and less than a minute (distance measurement); the second report did not include this information, nor time measurements for panning the LoD3 dataset. While visualisation times for both FZK Viewer reports were equal for the Amsterdam dataset, a marked difference could be noted when it came to zooming into and panning around the model, with one user reporting a time of more than one hour and the other "it is almost immediate". For query and inspection, the first test reports 5-20 minutes whereas the second reports "more than one hour". There is no particular difference in the hardware used for testing that would account for this (both machines have 32GB RAM and run Windows 10 with an Intel i7 processor), and for the other visualisation test (rotation) very similar results are reported.

Writing of CityGML files
The data exported from the tested tools 47 and delivered by participants were analysed by means of the tools described in Section 3.3.
The information about georeferencing is always kept unchanged, with two exceptions: novaFACTORY when exporting amsterdam.gml, and one of the ArcGIS tests exporting the BuildingsLoD3.gml dataset. The amsterdam.gml dataset exported by novaFACTORY reports the reference system Amersfoort / RD New, EPSG:28992, the Dutch national reference system for plane coordinates, instead of EPSG:7415, which simply combines the same CRS with the Dutch national height system ('NAP height'). Consequently, the z values of the file bounding box are different, although it is possible to choose the correct reference system for the export. In one of the ArcGIS tests with the BuildingsLoD3 dataset, the coordinates of the bounding box are slightly shifted, from a minimum of (-0.7, -0.61, 0) to (0, 0, 0) and a maximum of (70.39, 67.54, 16.7) to (69.93, 67.21, 16.7), with related consequences for the total dimensions of the bounding box and the area of the base surface.
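The effect of the reported shift on the bounding-box dimensions can be checked with a few lines; the corner values are copied from the analysis above.

```python
# Envelope corners of BuildingsLoD3 before and after the ArcGIS export.
orig_min, orig_max = (-0.7, -0.61, 0.0), (70.39, 67.54, 16.7)
exp_min, exp_max = (0.0, 0.0, 0.0), (69.93, 67.21, 16.7)

def dims(lo, hi):
    """Edge lengths of an axis-aligned bounding box."""
    return tuple(round(h - l, 2) for l, h in zip(lo, hi))

# The exported box is smaller in x and y, unchanged in z:
# dims(orig_min, orig_max) -> (71.09, 68.15, 16.7)
# dims(exp_min, exp_max)   -> (69.93, 67.21, 16.7)
```

The export thus does not merely translate the model to the origin: the x and y extents shrink by roughly 1.2 m and 0.9 m, which is consistent with the reported change in the base-surface area.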
Tables 10, 11 and 12 represent the differences between the statistics and computed features (related to both semantics and geometry) in the original datasets and in the ones exported by the tested tools. In the left part of the tables, the absolute numbers of features that are lost (orange gradient) or generated (red gradient) with respect to the original files are counted. Only when the value is 0 (green) were there no changes in the files and the data exchange was successful. These tables also include the entities and geometric elements which were not present in the original files but appeared in the exported ones.
In the right part of the tables, the same values are expressed as a percentage of the original amount, in order to allow a more meaningful representation. It can be seen that interoperability is only achieved by FME, and only with the Amsterdam and BuildingsLoD3 datasets; it does not keep the features completely unchanged in RotterdamLoD12. This would not be acceptable for use in practice.
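The loss/generation accounting used in the tables can be sketched as follows; the counts below are synthetic, not taken from Tables 10-12.

```python
def diff_report(original: dict, exported: dict) -> dict:
    """Per feature type: (absolute change, change as % of the original count).
    Negative = features lost, positive = features generated; 'inf' marks
    types absent from the original file but present in the export."""
    report = {}
    for ftype in set(original) | set(exported):
        delta = exported.get(ftype, 0) - original.get(ftype, 0)
        base = original.get(ftype, 0)
        report[ftype] = (delta, 100.0 * delta / base if base else float("inf"))
    return report

# Synthetic example: 4 wall surfaces lost, 2 unexpected MultiSurfaces added.
before = {"Building": 10, "WallSurface": 40}
after = {"Building": 10, "WallSurface": 36, "MultiSurface": 2}
```

On this example, `diff_report(before, after)` gives `(0, 0.0)` for Building (a successful exchange, green in the tables), `(-4, -10.0)` for WallSurface (lost features), and a generated MultiSurface type absent from the original.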
Since a software package could be tested more than once, with different results, the same software can appear more than once in the tables reporting the analysis of the delivered models.

Limited interoperability, a matter of complexity?
Although we expected clearer patterns in the results, which could have made it possible to better understand the remaining interoperability problems of CityGML models, the only clear result is that very little interoperability is actually reached. There are very few tools able to read the standardised datasets correctly, and even fewer able to export them consistently. The ability to uniquely interpret the models and to leave them consistent through the import-export phases is absolutely essential for interoperability and what it enables (data exchange, data re-use and so on). At this stage, however, it is not possible to trust standardised models, even just for simple file exchange.
Furthermore, 3D city models are expected to be usable and powerful in supporting analysis, problem solving and management actions. However, the inability of tools to manage 3D information, and especially standardised 3D models in CityGML, critically hinders this potential. This is even more apparent when looking at the discrepancy between the ability of software to correctly read and interpret the data and their functionalities for editing, querying and analysing them. In fact, the tools which are better at supporting those functionalities (e.g. GIS) are not the best at interpreting the contents of the standardised datasets, and vice versa. The frustration expressed informally, as reported in the introduction (Section 1), about the gap between user expectations and what is actually feasible (especially when they are just users and not geomatics and 3D modelling expert programmers) is therefore justified by our study.
Results were reported for 15 software packages, including both bespoke CityGML viewers and generic GIS tools. Support for CityGML is very mixed; in particular, it is not directly supported (in the GML 3D form) in perhaps the most popular open-source GIS package, QGIS. CityGML is supported in the most popular commercial package, ArcGIS (including Pro), but only one out of 5 users was able to display an extended 3D city model (Amsterdam), even though the display and management of an entire city model is likely to be a task that users will be interested in accomplishing.
While the georeferencing information is supported quite well by the analysed tools, problems have been reported with semantic and geometric properties. Semantics is sometimes lost or problematic, with the main issues related to the management of hierarchies and other relationships. Although these are among the most exciting possibilities of this kind of model, the lack of suitable support should compel a rethinking of complex relationships, to make them simpler and effectively manageable. A clear way to structure entities, together with priorities and limits, also has to be defined.
For applications, it is often preferable to rely on a simpler model that is implemented consistently in all applications than on a more complex one that will be implemented in fewer applications and with a lower degree of consistency. Most applications clearly load CityGML datasets into their own internal data models, something that can be seen most clearly in the differences between the imported and exported files, which disproves the assumption that the complex data model of CityGML is directly implemented in applications.
The issues and challenges shown by the results of this benchmark are influenced by a number of factors, among which:
• the complexity of CityGML and, implicitly, of 3D city models: a complex data model such as CityGML is required to describe and render a wide range of characteristics of the city, from a geometrical and, even more, from a semantic point of view;
• a CityGML v2 dataset combines features and information from both the 3D graphics and the geoinformation domains, which in turn causes issues for software that cannot easily handle this data model. Some software packages are tailored to more specific domains: geoinformation software (GIS, ETL, etc.), which does not always focus on complex data structures or 3D graphics, whereas other software is tailored to 3D graphics/visualisation, such as 3D modellers, and does not emphasise semantics;
• another factor which may influence the interoperability issues is the way the standard is exploited and used by users, e.g. how they populate the schema and which encoding they use;
• last but not least, issues may also arise from the way a software package uses computational resources. For example, some software is more efficient in using CPU and GPU power and can handle complex CityGML data more efficiently, whereas other software relies only on CPU power (and may not use more than one CPU thread simultaneously), causing issues when reading/writing/exporting data, which can affect other types of data as well, not just CityGML.

The computational load
While it is not possible to say which software package is fastest -the approach to timing used general timing categories rather than requiring the user to undertake the onerous task of exact time measurement, and performance also depends on the hardware used -we can report that none of the software packages managed to carry out the visualisation task in under 5 minutes for the Amsterdam data, although 13 packages achieved this with the Rotterdam dataset and 18 with the smaller LoD3 dataset. For CityGML export, 10 software packages managed to export the data in 5 minutes or less for Rotterdam, 11 for LoD3, but only 1 for Amsterdam.
A simplification of the geometry representation mechanisms could improve the related software performances, greatly affecting the time spent parsing and loading files, as well as the memory used while parsing the notoriously verbose GML geometry definitions.

The voluntary participation
The fact that the inquiry is based on voluntary and completely open contribution is considered both a strength and a limitation of this work. It is essential to cover the investigated subject as thoroughly as possible: as many software packages as possible, with many experts involved, and the inclusion of less expert users to also test user-friendliness. The limitation is the incompleteness of the resulting review of tools, which was mitigated by integrating into the testing additional packages that had not been considered at the beginning. Moreover, the tests reporting suspicious results according to the promoting team's experience, for either too good or too bad performances, were double-checked with new tests or by asking for clarifications. A further issue could be the inexperience of some testers, reporting tool behaviour inaccurately. To reduce this eventuality, it was checked that all the delivered answers were described with sufficient care, whatever the level of expertise. Once this was verified, answers conflicting with a tool's actual potential could indicate a deficiency in the suitability of the tool for use by inexpert users, which would anyhow be necessary for the models to be used in practice. The involvement of a large part of the community is also important for perceiving the relevance of the topic, which remains a somewhat hidden issue within the high-level standardisation and academic communities.

Conclusion
This study was designed to point out and provide evidence about the support and issues of available software for standardised information in CityGML version 2.0. Interoperability is essential for a number of use cases, even for merely exchanging and re-using data. Standards are supposed to enable such interoperability, and standardization is the essential premise for the development of any integration, including GeoBIM. CityGML is a popular open reference standard for the management and storage of 3D city models. However, a number of issues with using CityGML as a data format have been reported informally, preventing the effective use of such datasets, and no systematic evidence was available on which to base future improvements in implementations, in data modelling and in the standard itself.
Possible bias in the results could come from the limited expertise of some participants making the tests, or from initial inaccuracies in the provided datasets, which come from practice. However, the datasets were validated and improved for the purpose of the benchmark in order to limit their influence on the quality and reliability of the results. Therefore, if any such bias remained despite the great effort spent controlling for it, it would reflect additional drawbacks of the standard itself: insufficient clarity about its use for modelling actual datasets, and a difficulty of implementation that can produce unintuitive tools.
This study shows the drawbacks of the CityGML (v2.0) standard and its implementations, and the related difficulties, which are partly due to the challenges the standard is intended to address (e.g. representing information about a huge and complex field, and remaining flexible to multiple needs). The outcomes are important for acknowledging these issues officially, and as a base for future research in the field and the development of concrete solutions, such as the addition of constraints and specific guidelines, simpler ways to store geometry, a better selection of useful semantics, and so on.
Considering the results of this study as evidence, future work should aim at resolving the outlined issues: for example, testing specific kinds of geometries, how to constrain them, and how to guarantee that software can import, read, use and re-export them without any change. The same is true for semantics: which categories and relationships are essential for the models to be useful? How could they be simplified without losing effectiveness? Moreover, considering the performances related to computational requirements, it is clear that the reduction of data size is urgent.
Some work is already being done to solve such issues by working towards less complex models that offer more straightforward choices and are easier to implement, for example with the introduction of CityJSON. In parallel, CityGML version 3.0 is about to be released (Kutzner et al., 2020). This new version introduces many features, such as a new LOD concept (based on Löwner et al. (2016)), an extended version management system (introducing bitemporal time stamps, see Chaturvedi et al. (2017)), specific classes to facilitate easier conversion to IFC, and the possibility to introduce logical spaces (as a complement to physical spaces). The latter can be used, for example, for 3D cadastre units, enabling a connection between CityGML and e.g. the Land Administration Domain Model (International Organization for Standardization (ISO), 2012) without introducing an ADE (Sun et al., 2019a). CityGML version 3.0 has interesting features for establishing new national standards (Eriksson et al., 2020); however, it is uncertain whether this new version will be able to address the technical interoperability issues evaluated in this benchmark, and there is a risk that the new features (even though interesting from several application perspectives) may even increase the complexity highlighted in this study and therefore pose further interoperability challenges.
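The more straightforward encoding offered by CityJSON can be illustrated with a minimal sketch. The fragment below (illustrative only, not a complete, spec-valid file; the object name "building-1" is invented) shows the key design choice: geometries reference a single shared vertex array by index, instead of repeating verbose inline coordinate markup for every surface.

```python
import json

# A minimal CityJSON-style document (illustrative fragment): a
# MultiSurface boundary is a list of surfaces, each a list of rings,
# each a list of indices into the shared "vertices" array.
city = {
    "type": "CityJSON",
    "version": "1.0",
    "CityObjects": {
        "building-1": {
            "type": "Building",
            "geometry": [{
                "type": "MultiSurface",
                "lod": 1,
                "boundaries": [[[0, 1, 2, 3]]],  # one surface, one ring
            }],
        }
    },
    "vertices": [[0, 0, 0], [10, 0, 0], [10, 10, 0], [0, 10, 0]],
}

# Plain JSON round-trips with any standard library, no XML/namespace
# machinery required.
text = json.dumps(city)
parsed = json.loads(text)

# Resolve the ring back to coordinates via the shared vertex list.
ring = parsed["CityObjects"]["building-1"]["geometry"][0]["boundaries"][0][0]
coords = [parsed["vertices"][i] for i in ring]
print(coords)
```

Beyond terseness, the indexed vertices also deduplicate coordinates shared between adjacent surfaces, which is one way such an encoding addresses the data-size concern raised above.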

Acknowledgements
This work was possible thanks to the collaboration of the whole GeoBIM benchmark team (whose work was an in-kind contribution to the project), all the data providers, the participants making the tests, listed on the GeoBIM benchmark website 48 , and the participants in the GeoBIM benchmark workshop 49 .