Enhancing point cloud semantic segmentation in the data‐scarce domain of industrial plants through synthetic data

Digitizing existing structures is essential for applying digital methods in architecture, engineering, and construction. However, the adoption of data‐driven techniques for transforming point cloud data into useful digital models faces challenges, particularly in the industrial domain, where ground truth datasets for training are scarce. This paper investigates a solution leveraging synthetic data to train data‐driven models effectively. In the investigated industrial domain, the complex geometry of building elements often leads to occlusions, limiting the effectiveness of conventional sampling‐based synthetic data generation methods. Our approach proposes the automatic generation of realistic and semantically enriched ground truth data using surface‐based sampling methods and laser scan simulation on industry‐standard 3D models. In the presented experiments, we use a neural network for point cloud semantic segmentation to demonstrate that compared to sampling‐based alternatives, simulation‐based synthetic data significantly improve mean class intersection over union performance on real point cloud data, achieving up to 7% absolute increase.

extent for newly planned and erected buildings, albeit with reduced geometric detail depending on the chosen approach and use case (Gregor et al., 2009); in many cases, no digital models are available at all (Talebi, 2014).
At a later stage in the building lifecycle, the as-is status might deviate heavily from the as-designed or as-built status due to undocumented changes because models are rarely fully updated (Volk et al., 2014). Furthermore, the older an existing facility is, the less likely it becomes that its stakeholders possess any useful digital representation; recreating a detailed, semantic 3D model entirely by hand is extremely expensive and time-consuming (Fumarola & Poelman, 2011). The research field of "scan-to-BIM" focuses on methods that allow automating parts of this process to digitize the existing building stock (Bosché et al., 2015; Lu & Brilakis, 2019).
The as-is conditions of the built environment must be captured in the field first to provide the necessary data basis for approaches associated with scan-to-BIM. Such data acquisition is preferably performed in 3D, using laser scanning or photogrammetry (Li et al., 2022), during the construction phase (Chern et al., 2023; Z. Wang et al., 2022) or in the context of existing projects (Tong et al., 2023; Wu et al., 2022; Zheng et al., 2022). This reality capture results in point clouds of millions of points representing the object surfaces visible to the sensor. Such point clouds can be very precise, but they have a few major shortcomings and cannot be used directly for further activities such as redesigning: They do not inherently carry semantic information, include noise, and lack closed surfaces because of occlusions (Walsh et al., 2013). To convert them into formats valuable to engineering, facility management, and other activities, they must be processed intelligently to enrich them with further information or create surface or volumetric models that can seamlessly be used in subsequent processing steps. Traditionally, this is a manual task conducted by trained engineers (Hullo & Thibault, 2014): Pre-processed data are filtered, cut into subsets such as slices, and finally used to create 3D models by hand that are as close to the captured point cloud as possible. This process is time-consuming, yields subjective results, and is therefore inflexible and expensive. Lu and Brilakis (2019) investigated the effort required for the manual modeling of bridge infrastructure; Agapaki et al. (2018), addressing the challenge of resource intensity, identified the most critical objects and their frequency in industrial models, along with the manual effort of modeling them. In Fumarola and Poelman (2011), different approaches applied to several projects are presented and evaluated concerning their degree of automation and individual requirements. Hullo et al. (2015) report on a large-scale study that reconstructed a nuclear reactor building from terrestrial laser scanning (TLS) and image data, in which around 70% of the overall required time was spent on the reconstruction of CAD (computer-aided design) models. Due to these circumstances, many attempts have been made to automate parts of the scan-to-BIM process, with scopes ranging from volumetric models on the building level (Ochmann et al., 2016) to domain-specific solutions (Smith & Sarlo, 2022). Beyond the generation of a model, typical scenarios include urban applications related to traffic (Balado et al., 2019; Esmorís et al., 2023) and structural health monitoring (Oh et al., 2017; H. S. Park et al., 2007; S. W. Park et al., 2015; Yin Zhou et al., 2022).
The initial step of enriching the raw input point clouds with semantic information is highly labor-intensive, as it requires the user to navigate large unstructured datasets to first identify and then manually separate objects and systems in 3D space. This step is also denoted as semantic segmentation and has received much attention because the underlying technical problem is relevant for those working with the built environment and in autonomous driving, geosciences, augmented and virtual reality applications, and many more. Traditional strategies to perform such semantic enrichment rely on hand-crafted features or well-known geometric properties of object classes (L. Ma et al., 2018; Macher et al., 2017; Sharif et al., 2017); more recently, purely data-driven methods show the most promising results (Croce et al., 2021; Perez-Perez et al., 2021). However, the latter methods rely on the availability of large amounts of high-quality annotated point cloud data for method design and evaluation. The amount and quality of these annotated data are crucial to the success of automated methods of point cloud semantic enrichment. The process of manual annotation is costly and time-consuming, as pointed out by Huang et al. (2023) for the case of 2D ground penetrating radar data; for 3D point clouds, this issue is even more severe (Shi et al., 2021).
In domains related to urban scenes or indoor office environments, an increasing amount of open-source data is available for these purposes. In this research, however, we focus on the data-weak domain of industrial facilities. While capturing manufacturing plants and refinery scenes has been widespread for more than 15 years (Shellshear et al., 2015), industrial owners and operators usually do not annotate large amounts of these data, let alone publish them, due to confidentiality and employee privacy issues. Consequently, there are too few annotated point clouds for training machine-learning models for automated point cloud segmentation.
At the same time, for manufacturing plants that are subject to periodic changes due to product cycles and frequent adaptations for optimization, 3D design models have become an industry best practice for steel and plant construction (Wiendahl et al., 2015). This domain has been very active in developing and adapting digital methods with regard to 3D models for planning and operation (Gregor et al., 2009), evident, for instance, in national regulation for the standardization of 3D models used in the German car manufacturing industry (VDA, 2009). While detailed 3D models representing building structure and technical equipment can be utilized to generate synthetic ground truth data, their complex, intertwined geometry limits the value of synthetic data generated using conventional, sampling-based methods. Simulation methods for considering these specific conditions have not been investigated in this context.
This paper presents a method to generate realistic, semantically rich ground truth data based on specimens of such 3D design models by applying state-of-the-art laser scan simulation. This type of simulation is able to consider the precision and accuracy of the used sensors through equipment parameters and the complexity of the surrounding scene's layout. In doing so, the paper aims to contribute to an increase in the performance of point cloud semantic segmentation for domains with no publicly available datasets by introducing this level of realism to synthetic data, thus reducing the amount of manually annotated data necessary to achieve useful results. While much less computationally expensive, conventional methods to generate synthetic data based on such 3D models fail to achieve the realism necessary to depict complex scenes in the industrial domain well enough to learn distinctive features for semantic segmentation. We conduct an extensive experiment to determine whether the increased effort of the presented simulation-based method is justified in comparison to conventional, sampling-based generation methods; results are analyzed and discussed in detail. The results of the experiment clearly show the superiority of the more realistic, simulation-based method to generate synthetic training data. Thus, the presented approach is able to facilitate existing scan-to-BIM approaches by improving segmentation results while minimizing manual work.
This study targets academics and professionals seeking practical solutions in industrial applications. It explores the potential of neural networks for point cloud semantic segmentation in domains where open datasets are unavailable. Our approach tests the hypothesis that synthetic datasets can effectively train models for industrial use. We offer comprehensive explanations for creating these synthetic datasets and conduct thorough experiments to validate their practicality in industrial contexts. Additionally, the paper extends the body of knowledge by comparing various data generation methods, highlighting their unique potentials and limitations.
The paper is structured as follows: Section 1 introduces the research topic and provides an overview of our approach. In Section 2, we review relevant prior research to establish the context and motivation for our work. Section 3 presents our methodological approach, detailing the methods and techniques used in our study. Section 4 discusses the extensive experiments conducted and presents the obtained results. Section 5 is dedicated to the discussion of our findings and their implications. Finally, in Section 6, we conclude our presentation and provide directions for future research.

RELATED WORKS
This paper presents an approach to address the scarcity of training data in point cloud semantic segmentation within the industrial domain. The following subsections provide an overview of related works on point cloud enrichment, domain-specific training data, and investigations into the potential of synthetically generated training data.

Point cloud enrichment
All options to investigate and further process the captured data of an existing structure depend on the information that can be recognized in the point cloud. If the objects of interest possess well-known geometric properties, this can be achieved using specifically chosen geometric features. One popular method is principal component analysis, which was, for example, applied by Yokoyama et al. (2013) to detect poles in urban scenes. The application of data-driven methods allows algorithms to learn the critical features from annotated training data instead of exploiting prior knowledge about geometry or materials. In our approach, point cloud semantic segmentation (PCSS as per Xie et al., 2019) with supervised deep learning was chosen as it allows us to generate point-wise class predictions. Current learning-based methods are able to capture critical features for a large number of classes at once. In further steps, this enables the targeted application of class-specific instance segmentation and reconstruction methods.
PointNet (Qi, Su, et al., 2017) and its successor PointNet++ (Qi, Yi, et al., 2017) were crucial development steps for the discipline because the underlying architecture allowed deep learning on points directly without translating them into a structured representation like a voxel grid first (e.g., VoxNet; Maturana & Scherer, 2015). The performance of architectures, evaluated on a few specific datasets, has constantly been improving since then. Xie et al. (2019) and Zhang et al. (2019) present recent overviews of the topic; online resources such as Papers With Code (2021) can help to provide an up-to-date roundup in this fast-changing environment. The work of Mirzaei et al. (2022) contains a comprehensive overview of methods used by state-of-the-art point cloud deep learning network architectures. Among others, a notable performance increase for semantic segmentation on point clouds was achieved by applying kernel point convolutions (KPConv; Thomas et al., 2019) and Point Cloud Transformers (Guo et al., 2021).
In the domain of AECO (architecture, engineering, construction, and operations), these developments were followed with much interest, as they offer a universal first step toward automating the scan-to-BIM process, compared to the application of hand-crafted features. Perez-Perez et al. (2021) introduced Scan2BIM-Net, a combination of different networks for semantic segmentation of a case-study indoor environment point cloud dataset.
Industrial facilities pose more specific challenges than commercial buildings, along with different important object classes. Yin et al. (2021) adapted the PointNet++ architecture to their version of ResPointNet++ to achieve improved results for PCSS in an industrial environment. In Agapaki and Brilakis (2020), PointNet++ was extended by a neighborhood consideration to enhance its performance on the authors' manually annotated industrial dataset. The same authors expanded their scope to an instance segmentation approach, starting from an ideal set of semantic segments using a search algorithm and boundary segmentation (Agapaki, 2020).

Domain-specific training data for point cloud semantic segmentation
As for all data-driven methods, the performance of network architectures for point cloud semantic segmentation heavily depends on the quality and quantity of available training data (Gao et al., 2020). This issue has been addressed and partially solved for some domains through the availability of large-scale open-source datasets. For indoor office environments, well-known examples are S3DIS (Armeni et al., 2016) and ScanNet (Dai et al., 2017). For outdoor urban scenes relevant to the development of autonomous driving and smart infrastructure, among others, there are the datasets of KITTI (Geiger et al., 2013), Vaihingen (Rottensteiner et al., 2013), Paris-Lille (Roynard et al., 2018), and more. Some of the introduced works applying methods of PCSS in AECO have specifically addressed the industrial domain (Agapaki & Brilakis, 2020; Yin et al., 2021) and introduced their work along with datasets the authors prepared and used for the development and validation of their methods. To this date, there are no labeled point cloud datasets publicly available for complete industrial scenes, which has been identified as a significant bottleneck for the wider adoption of PCSS by Cazorla et al. (2021).

Synthetic training data for semantic segmentation
For applications where such ground truth data are rare, researchers and practitioners have considered circumventing the effort for manual data collection and labeling along with potential privacy issues by utilizing synthetic data. Among those approaches are attempts in 2D to generate images and depth maps from 3D models with HoliCity (Yichao Zhou et al., 2020) and to extract frames from video games (Richter et al., 2016). Similar efforts to generate synthetic images have been made by Hong et al. (2021) in the AECO domain.
To generate point clouds that exhibit realistic properties to imitate real laser scan point clouds, some approaches work on top of existing simulation tools, such as the CARLA simulator for autonomous driving (Dosovitskiy et al., 2017). This framework was used to simulate laser scan point clouds in an urban environment similar to the KITTI dataset (Geiger et al., 2013) to create the so-called KITTI-CARLA dataset (Deschaud, 2021) and similarly for the PARIS-CARLA-3D (Deschaud et al., 2021) dataset. With SynthCity, Griffiths and Boehm (2019) provide a synthetic point cloud dataset representing urban scenes along with a highly realistic, textured 3D model of the city.
To investigate the value synthetic data have for use as training data and thus the added value they can bring to scan-to-BIM toolchains, several related contributions are relevant for this work: Frías et al. (2022) used BIM objects to generate synthetic point clouds by sampling, to then render them to images and use them for object classification. For the application in historical buildings, Morbidoni et al. (2020) used synthetic, sampling-based point cloud data generated based on structural components of available 3D models to train an adapted version of DGCNN (dynamic graph convolutional neural network, Y. Wang et al., 2019) for semantic segmentation. In the context of office environments, some studies (J. W. Ma et al., 2020; Zhai et al., 2022) used the S3DIS dataset to investigate the potential of synthetic point cloud data for training a neural network for point cloud semantic segmentation.
For the experiment presented by J. W. Ma et al. (2020), a subset of the S3DIS dataset ("Area 1") was remodeled manually in an engineering application. The pure geometry of the objects in the model was then exported to sample evenly spaced points on a 3D grid within the objects' volumes to generate synthetic training data and finally annotate the point cloud on an instance level with accordingly reduced manual effort. Subsequently, these data were used to train neural networks for semantic segmentation. The study showed that an increase of 7.1% in accuracy was feasible by augmenting a small real-world dataset with an additional large set of synthetic, sampling-based data. Finally, an experiment was conducted to investigate the potential of hybrid training datasets with varying compositions between real and synthetic data in steps of 20%. In this, the dataset with 80% real and 20% synthetic data fell only 1.52% short of the best-performing real dataset in terms of accuracy. These findings, while promising, leave some questions as to the realism of the data generated using the presented method for sampling based on a volumetric grid. We presume that with more realistic synthetic data, better results can also be obtained on real data.
The above-introduced research shows the lack of suitable training data as a critical bottleneck for applying data-driven methods for point cloud semantic segmentation in specific domains beyond commonly used, unspecific benchmark environments. Beyond existing related works, an investigation of domain-specific potential added value of synthetic data used for training networks is identified as a research gap that this paper aims to fill. The industrial domain poses specific challenges in terms of relevant object classes and complex, intertwined object surfaces openly visible in the facilities. These boundary conditions motivate the implementation of two different methods of synthetic data generation that allow us to take into account these particular circumstances to varying extents. Complex industrial facilities often possess detailed 3D models, representing at least the as-designed status of the building structure, technical equipment in the buildings, and, depending on the domain, production equipment, logistic systems, and more. These models include complete scenes and can be leveraged to create synthetic point cloud data that can be used to train networks for point cloud semantic segmentation. Depending on the chosen method and on the quality and structure of the available models, the data generation process can be fully automated or achieved with very little manual effort: Large amounts of annotated point cloud data can be generated automatically to train and improve data-driven methods for flexible requirements for respective 3D models of industrial facilities.

RESEARCH METHODOLOGY
This paper investigates to which extent synthetic point clouds of varying quality can be used to address the challenge that the insufficient availability of annotated training data poses to the applicability of data-driven approaches for point cloud semantic segmentation for building infrastructure in the industrial domain.
The manual effort otherwise required to annotate such datasets can be completely avoided or reduced drastically while avoiding human errors in the annotation process. The only fixed requirement is the availability of input 3D models, which are, in many cases, readily available. Figure 1 depicts the underlying logical structure of the presented method. The quality and inherent value of the resulting, synthetically generated point clouds are highly dependent on the quality of the underlying 3D models, especially regarding the level of detail and completeness, and the chosen method for the point cloud generation.
This paper aims to investigate the value of different types of synthetic data as training data for point cloud semantic segmentation applications in direct comparison. Two ways to generate synthetic, annotated point cloud data based on 3D models of industrial facilities are introduced and applied to an illustrative model; two independent reference datasets are collected by TLS in industrial facilities and manually annotated to enable the evaluation of real-world applicability and generalization potential. Subsequently, several semantic segmentation experiments are performed using a fixed training and testing setup for all mentioned datasets as training and test data, respectively. The final evaluation is performed with regard to the real laser scan datasets to show the extent of actual industrial applicability.
FIGURE 1 Data preparation workflow, highlighting differences between conventional, manual annotation and the generation of synthetic, annotated data using existing 3D models.
Beyond the applicability of homogeneous, synthetic training data, in a second experiment, the purely synthetic datasets are combined with small subsets of real data to investigate the potential value that can be achieved with minor manual annotation. This combination of synthetic data with real-world data is expected to improve results and is therefore referred to as augmentation in the following. While this implies an increased effort for data preparation, it is a practical way to achieve results without significant shortcomings.

Point cloud datasets
The objective of the conducted experiments is to assess the practical applicability of synthetic data within the context of industrial facilities. To achieve this, we employ suitable reference datasets in the form of real laser scan point clouds captured using industry-standard TLS equipment.
There are multiple solutions to create a synthetic point cloud based on a given 3D model. In this paper, two alternatives are presented and compared: one based on sampling and one using state-of-the-art laser scan simulation. Both methods allow for the preservation of the semantics of the model in the process such that the resulting point cloud is fully annotated and can directly be used to train a neural network model for semantic segmentation. The described steps are chosen such that the procedure can be applied based on any conventional 3D model. No color or material attributes are used in either approach. While they would help to improve model performance (Zhai et al., 2022), standard industrial 3D models do not commonly contain this information: To ensure industrial applicability with no overhead effort for data preparation, only the model geometry is taken into account for data generation. Figure 2 depicts an overview of the process of generating the synthetic point clouds as described in the following.
In the first step, the modeled scene must be split into separate entities representing semantic classes or instances. All objects are first exported separately into individual OBJ (Wavefront OBJ) files to comply with a pre-defined class split according to the classes that should be included in the semantic segmentation. The effort necessary for this step depends on the model structure in terms of semantics and how well they can be mapped to the classes that should be investigated in the point cloud. In a single-layered 3D CAD model, this step has to be performed manually: Connected objects that include multiple classes might have to be separated. If the model contains all necessary information for this split and is, for example, stored in the Industry Foundation Classes (IFC) format (ISO, 2018), this can be fully automated by parsing relevant object properties. The further steps undertaken differ between simulation and sampling.
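The class split described above can be illustrated with a minimal sketch. It assumes that each model element has already been reduced to a type name and a list of triangles; the `type_to_class` mapping, the class names, and the `clutter` fallback are hypothetical and not taken from the paper. The sketch groups triangles by semantic class and serializes each group as Wavefront OBJ text (`v` lines for vertices, `f` lines with 1-based indices).

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def group_by_class(elements: List[dict],
                   type_to_class: Dict[str, str],
                   default: str = "clutter") -> Dict[str, List[Triangle]]:
    """Collect the triangles of all elements under their semantic class label."""
    groups: Dict[str, List[Triangle]] = {}
    for element in elements:
        # Unknown element types fall back to a catch-all class (an assumption).
        label = type_to_class.get(element["type"], default)
        groups.setdefault(label, []).extend(element["triangles"])
    return groups


def to_obj(triangles: List[Triangle]) -> str:
    """Serialize triangles as OBJ text; vertices are not de-duplicated."""
    lines = []
    for tri in triangles:
        for x, y, z in tri:
            lines.append(f"v {x} {y} {z}")
    for i in range(len(triangles)):
        base = 3 * i  # OBJ face indices are 1-based
        lines.append(f"f {base + 1} {base + 2} {base + 3}")
    return "\n".join(lines) + "\n"


# Hypothetical mapping from model element types to segmentation classes.
TYPE_TO_CLASS = {"IfcPipeSegment": "pipe", "IfcBeam": "structure"}
```

Writing one OBJ file per key of the returned dictionary then yields the per-class geometry files that both generation methods start from.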
Multiple solutions are available to simulate a laser scan on a given 3D geometry. As the resulting data should resemble an actual laser scan, equipment and measurement behavior should be emulated as well as possible; model semantics must be included in the resulting point cloud to avoid any manual annotation effort. The measurement itself is based on a line-of-sight evaluation between a sensor emitting laser rays in patterns according to equipment-specific functionality and the surface of an object in the scene. Depending on distance, incidence angle, surface material parameters, and equipment parameters such as precision, simulation engines return results close to real laser scans.
FIGURE 3 "BIM-to-scan" workflow, in terms of data content, application, and file formats used, adapted from Noichl et al. (2021).
Existing solutions for this include the educational platform of VRScan3D (Luhmann et al., 2022), the Blender-based (Blender Online Community, 2021) tool of BlenSor with a focus on mobile scanning platforms and depth cameras (Gschwandtner et al., 2011), and Helios++ (Winiwarter et al., 2021), with a wide variety of applications and flexible setup opportunities including different sensor types and mobile platforms. For this context, Gonzalez Stefanelli et al. (2022) present an overview of suitable platforms for data generation based on 3D building models. In our approach, we use Helios++ for laser scan simulation. The simulation kernel is based on ray tracing, simulating laser beams by sampling from probabilistic distributions, and considering material-specific reflectivity parameters.
The process of generating synthetic, annotated point clouds based on semantic 3D models through laser scan simulation is depicted in Figure 3. As a primary step for data preparation for laser scan simulation, the 3D model is exported from the authoring tool using conventional CAD exchange formats (*.fbx, *.dwg) or *.ifc format. All of these exchange formats can be imported into Blender by default or using specific add-ins (such as https://blenderbim.org/). After import, the objects are collected in a Blender scene, which is then converted to a Helios++ simulation scene using our adaptation of the Blender2Helios tool (Neumann, 2020). The simulation tool itself allows for customizing all aspects of the simulation. In a set of XML files, the user can define scanner properties such as range, resolution, precision, and field of view.
Furthermore, scanning locations in the survey are specified, as well as the scene itself, which is built from a set of geometric objects stored in separate OBJ files. The simulation takes into account these equipment parameters to simulate rays cast from the virtual scanner's sensor location, trace them, and report intersections with the scene as hits. Therefore, any point in the resulting point cloud can be clearly attributed to the class information from the underlying object. This constitutes a perfect, error-free annotation as part of the process that is reproducible and scalable (Winiwarter et al., 2021). The user-definable parameters include the field of view, resolution, and precise coordinates of the laser scan sensor in the scene. The simulated point cloud possesses realistic properties, such as occlusions and minor measurement inaccuracies. After the simulation step is complete, we calculate the mean surface density of the simulation-based synthetic point cloud for later use in creating the sampling-based synthetic data (cf. Figure 2).
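To illustrate why ray-based simulation yields both occlusions and error-free labels, the following minimal sketch casts a single ray against a labeled triangle scene and keeps only the nearest hit. It is not the Helios++ implementation: it uses the standard Möller–Trumbore ray-triangle test, omits measurement noise, beam divergence, and material reflectivity, and all function names and the scene layout are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_parameter(origin: Vec3, direction: Vec3, tri: Triangle,
                           eps: float = 1e-9) -> Optional[float]:
    """Moeller-Trumbore test; returns the ray parameter t of the hit or None.

    t equals the metric distance only if `direction` has unit length.
    """
    a, b, c = tri
    e1, e2 = sub(b, a), sub(c, a)
    h = cross(direction, e2)
    det = dot(e1, h)
    if abs(det) < eps:  # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, a)
    u = inv * dot(s, h)
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = inv * dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return None
    t = inv * dot(e2, q)
    return t if t > eps else None  # ignore hits behind the sensor


def cast_ray(origin: Vec3, direction: Vec3,
             scene: List[Tuple[str, Triangle]]) -> Optional[Tuple[Vec3, str]]:
    """Return (hit point, class label) of the nearest surface, or None."""
    best: Optional[Tuple[float, str]] = None
    for label, tri in scene:
        t = ray_triangle_parameter(origin, direction, tri)
        if t is not None and (best is None or t < best[0]):
            best = (t, label)  # closer surfaces occlude those behind them
    if best is None:
        return None
    t, label = best
    point = (origin[0] + t * direction[0],
             origin[1] + t * direction[1],
             origin[2] + t * direction[2])
    return point, label
```

Because every returned point inherits the label of the triangle it hit, no annotation step is needed, and surfaces hidden behind closer geometry never contribute points.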
While laser scan simulation produces realistic results, it is also computationally expensive. Point clouds can be generated directly on the previously prepared parts of the 3D model as we process them as a triangulated mesh in OBJ format. There are various methods to do so: As they describe the surface, triangle vertices can be directly interpreted as points of the point cloud. Depending on mesh resolution and face size, this can lead to sparse clouds and highly irregular point densities. To achieve a more uniform point distribution, points can be randomly sampled on each face's surface, with the number of points per face determined by the face's area. Poisson disk sampling (Corsini et al., 2012) is an alternative method that is able to distribute points on the faces of the triangulated mesh representation of the model even more homogeneously. These methods are common practice in the field and are implemented in widely used open-source tools like CloudCompare (2021) or MeshLab (Cignoni et al., 2008).
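The area-weighted random sampling mentioned above can be sketched as follows. This is a generic illustration and not the CloudCompare implementation: faces are drawn with probability proportional to their area, which matches "points per face determined by the face's area" in expectation, and each point is placed uniformly inside its face using the common square-root barycentric formula. All names are illustrative.

```python
import math
import random
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def triangle_area(tri: Triangle) -> float:
    """Half the norm of the cross product of two edge vectors."""
    a, b, c = tri
    u = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    v = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def sample_point_in_triangle(tri: Triangle, rng: random.Random) -> Vec3:
    """Uniform sample: P = (1 - r1) A + r1 (1 - r2) B + r1 r2 C, r1 = sqrt(u1)."""
    a, b, c = tri
    r1 = math.sqrt(rng.random())
    r2 = rng.random()
    wa, wb, wc = 1.0 - r1, r1 * (1.0 - r2), r1 * r2
    return (wa * a[0] + wb * b[0] + wc * c[0],
            wa * a[1] + wb * b[1] + wc * c[1],
            wa * a[2] + wb * b[2] + wc * c[2])


def sample_mesh(triangles: List[Triangle], n_points: int,
                rng: random.Random) -> List[Vec3]:
    """Draw n_points on the mesh surface, choosing faces proportionally to area."""
    areas = [triangle_area(t) for t in triangles]
    chosen = rng.choices(triangles, weights=areas, k=n_points)
    return [sample_point_in_triangle(t, rng) for t in chosen]
```

Unlike the ray-based approach, this procedure covers every face regardless of visibility, which is the source of the missing occlusions discussed in the text.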
We start the sampling process by first over-sampling points for each semantic object. Based on the initial class split, a manually defined, high number of points is sampled on the surface of the class objects using the random sampling functionality of CloudCompare (2021). The number of points is chosen so that the resulting surface density exceeds the mean surface density from the simulation. By doing so, the full surfaces of all meshes are covered with points, regardless of location or orientation. Consequently, surfaces within a model that are either contained within other objects or located within other geometric bodies, like layers within walls or ceilings, are also incorporated into the resulting point cloud.
After sampling, the surface density is calculated based on the surface geometry to ensure the chosen number of points is sufficient to reach the mean surface density of the simulation-based point cloud. Otherwise, sampling is repeated with an increased number of points until the target is met. The over-sampling step is necessary because it is impossible to sample points to generate a specific point density without prior calculations on the underlying geometry. After a sufficient point density has been verified, the over-sampled point cloud is down-sampled to ensure the same minimum point spacing as in the simulation-based point cloud (cf. Figure 2). The resulting point clouds have the same mean surface density but different overall properties. Figure 4 illustrates these differences in properties concerning local point densities and occlusions in a simple example.
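The over-sample, verify, and down-sample loop can be sketched as below. Several simplifications are assumptions of this sketch rather than the paper's procedure: density is approximated as the point count divided by the total surface area, the `sample_fn` callback and the `growth` factor are hypothetical, and the down-sampling is a simple greedy minimum-spacing filter accelerated with a hash grid instead of the CloudCompare tooling.

```python
import math
from typing import Callable, Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def oversample_until_dense(sample_fn: Callable[[int], List[Vec3]],
                           surface_area: float,
                           target_density: float,
                           n_initial: int,
                           growth: float = 2.0) -> List[Vec3]:
    """Repeat sampling with more points until points / area reaches the target."""
    n = n_initial
    while True:
        points = sample_fn(n)
        if len(points) / surface_area >= target_density:
            return points
        n = int(math.ceil(n * growth))


def downsample_min_spacing(points: List[Vec3], min_spacing: float) -> List[Vec3]:
    """Greedily keep a point only if no kept point is closer than min_spacing."""
    grid: Dict[Tuple[int, int, int], List[Vec3]] = {}
    kept: List[Vec3] = []
    for p in points:
        key = tuple(int(math.floor(c / min_spacing)) for c in p)
        too_close = False
        # Any conflicting point must lie in the same or an adjacent grid cell.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    cell = (key[0] + dx, key[1] + dy, key[2] + dz)
                    for q in grid.get(cell, []):
                        if math.dist(p, q) < min_spacing:
                            too_close = True
        if not too_close:
            kept.append(p)
            grid.setdefault(key, []).append(p)
    return kept
```

In this sketch, the target density and minimum spacing would be taken from the simulation-based point cloud, so that both synthetic variants end up comparable in density while still differing in occlusion behavior.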

Semantic segmentation
The core steps of our experiments are the training, testing, and evaluation of a state-of-the-art neural network for point cloud semantic segmentation on variations of our data. For semantic segmentation, the method of KPConv (Thomas et al., 2019) is currently among the best-performing convolution methods.
For network architecture, we therefore use the kernel point fully convolutional neural network (KP-FCNN), which is a fully convolutional network for semantic segmentation introduced by the authors along with KPConv (Thomas et al., 2019). This architecture is well established, has been used in related studies (Deschaud et al., 2021; Soilán et al., 2021), and is among the best-performing architectures for semantic segmentation on S3DIS (Papers With Code, 2021), a core benchmark for point cloud semantic segmentation for indoor scenes in the built environment. The presented work uses the available PyTorch implementation as published on Hugues Thomas' public GitHub repository (Thomas, 2021).
This work focuses on investigating the value of different types of point cloud data used for training a neural network for semantic segmentation. The specific performance in question is the trained network's ability to correctly predict class labels per point in a real laser scanning point cloud test set that is not used in the training phase. To do so, one and the same neural network architecture is trained from scratch on a variety of datasets of synthetic, real, and hybrid point clouds. After running inference on the real laser scanning point cloud test set, the results are compared to the ground truth class labels of the test set. In this final evaluation step, the metrics of F1-score and intersection over union (IoU) are investigated. These metrics can be calculated based on correct predictions (denoted as true positives TP and true negatives TN) and false predictions (denoted as false positives FP and false negatives FN) as follows:

• Precision:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{1}$$
• Recall:
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{2}$$
• F1-score:
$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3}$$
• Intersection over union (IoU):
$$\mathrm{IoU} = \frac{TP}{TP + FP + FN} \tag{4}$$

For evaluating results in point cloud semantic segmentation, intersection over union or Jaccard index (Equation 4) is commonly used as a measure of similarity between the ground truth point cloud dataset and the predicted point classes. While IoU is a measure of similarity between the ground truth and the prediction labels, the F1-score is the harmonic mean of precision and recall and serves as a balanced measure of accuracy. Both metrics are evaluated at the class level to prevent skewed results that overestimate performance in imbalanced datasets. To evaluate overall experiment performance, the metric is first calculated per class and then averaged over all classes of the dataset per Equation (5), where C is the overall number of classes, μ is a placeholder for the respective investigated metric (cf. Equations 3 and 4), and the prefix mc indicates a mean class metric:
$$\mathrm{mc}\mu = \frac{1}{C} \sum_{c=1}^{C} \mu_c \tag{5}$$
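As an illustration of how the per-class and mean class metrics fit together, the following sketch computes the F1-score and IoU per class from two label lists and then averages them over all classes. The function names and the toy labels are hypothetical, and the convention of returning 0 when a denominator is zero is an assumption made for this example.

```python
def count_outcomes(y_true, y_pred, num_classes):
    """Per-class true positives, false positives, and false negatives."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # the true class was missed
            fp[p] += 1  # the predicted class was wrongly assigned
    return tp, fp, fn


def ratio(num, den):
    """Division that returns 0.0 for an empty denominator (assumed convention)."""
    return num / den if den else 0.0


def class_scores(tp, fp, fn):
    """F1-score and IoU of a single class."""
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    iou = ratio(tp, tp + fp + fn)
    return f1, iou


def mean_class_scores(y_true, y_pred, num_classes):
    """Mean class F1 (mcF1) and mean class IoU (mcIoU) plus the per-class values."""
    tp, fp, fn = count_outcomes(y_true, y_pred, num_classes)
    scores = [class_scores(tp[c], fp[c], fn[c]) for c in range(num_classes)]
    mc_f1 = sum(s[0] for s in scores) / num_classes
    mc_iou = sum(s[1] for s in scores) / num_classes
    return mc_f1, mc_iou, scores


# Toy labels for three classes (hypothetical data).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 0]
mc_f1, mc_iou, per_class = mean_class_scores(y_true, y_pred, num_classes=3)
```

In the toy example, the single point of class 2 is mispredicted, so that class contributes zero to both means even though it holds only one sixth of the points. This is the effect that class-level averaging is meant to expose in imbalanced data.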
Furthermore, a variety of confusion matrices is evaluated to identify specific patterns of misprediction between certain classes. Because the data are imbalanced, the confusion matrices are presented in normalized form instead of absolute values to keep them readable.
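A normalized confusion matrix can be obtained as sketched below. The text does not state which normalization was applied; this example assumes normalization per ground truth class (row-wise), and the function names and toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its sum; rows of absent classes stay all zero."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Toy labels for three classes (hypothetical data).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
cm_normalized = normalize_rows(cm)
```

With row-wise normalization, each row sums to one regardless of how many points a class contains, so small classes remain as visible as large ones.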

EXPERIMENTS AND RESULTS
Several experiments were performed in the framework of this contribution using a state-of-the-art network for point cloud semantic segmentation on varying training and test datasets. The experiments differ mainly in the training dataset used. The individual datasets are presented in more detail below.

Datasets
Two real industrial facilities are part of these experiments: an active industrial cooling plant and a cleared factory hall. These facilities were chosen because they contain all typical objects for industrial buildings, including steel beams, pipe runs, ventilation ducts, and cable routing. Nevertheless, they differ significantly from each other. The cooling plant has a footprint of roughly 640 m², comprising a built volume of 4480 m³; the factory hall is significantly larger, with a footprint of 2850 m² and a built volume of roughly 36,200 m³. The cooling plant facility represents the core of this investigation, as it allowed performing the case study to its full extent: A detailed as-designed 3D model of the facility was available, along with access to the facility to conduct a laser scan. Thus, it was possible to create point clouds reflecting this facility in the three independent ways introduced in Section 3.1: two synthetic point clouds using sampling and simulation and an actual laser scan depicting the real as-is situation in the facility.
To investigate how the findings of the core case study transfer to a different exemplary dataset in a typical industrial use case, the same laser scanning system as in the cooling plant was used to collect point cloud data from a cleared factory hall before it was repurposed.
The classes used to annotate those point clouds are introduced in Table 1. These classes do not follow a use-case-specific structure but represent the major object types present in the case study facilities.
To gather the real laser scan datasets, a TLS scan was performed with the help of a professional surveying expert inside the case study cooling plant and the cleared factory hall. The cooling plant is the core case study dataset; the factory hall is a highly different facility yet comparable in terms of present classes. Excerpts of the two point clouds are depicted in Figure 5 to illustrate this difference.
In the cooling plant, a total of 28 single scans were conducted using a FARO Focus S laser scanner, registered and processed through the manufacturer's native processing software; targetless registration could be performed as the surveyor had ensured sufficient overlaps. The resulting registered, de-noised point cloud comprises 7 × 10^8 points (cf. Figure 6). For the factory hall, a total of 15 single scans were conducted at a higher resolution than in the cooling plant to ensure sufficient density for the larger required scanning distances, which led to a total of 6.7 × 10^8 points. The cooling plant point cloud was down-sampled with a minimum distance of 5 mm and the factory hall point cloud with a minimum distance of 10 mm between points to create a more uniform point density throughout the datasets and reduce overall size for further processing. In the factory hall, the minimum point spacing had to be increased to keep the resulting point cloud at a manageable size.
Subsequently, the data were divided into even, box-shaped segments and down-sampled to 5 × 10^5 points per segment to be suitable for processing in a web-based annotation tool. Manual labeling was then conducted with the AWS SageMaker GroundTruth tool (Amazon Web Services, 2021) with the classes introduced in Table 1. Annotation was performed by hand and took a total of 82 h for the cooling plant dataset and 57 h for the factory hall; the latter was faster due to its simpler overall structure and the experience gained from the first dataset.
After this, the manually collected label information was realigned with the input point clouds. Subsequently, the annotated points' class information was extrapolated to the points not present in the working sets after down-sampling using a k-nearest neighbor approach (k = 5). As a result, the real laser scan datasets are fully annotated with their original number of points.
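The extrapolation of labels from the annotated working set back to the full-resolution point cloud can be sketched as follows. The text only states that a k-nearest neighbor approach with k = 5 was used; the majority vote, the brute-force neighbor search, and all names below are assumptions for illustration. A real pipeline with millions of points would use a spatial index instead.

```python
import math
from collections import Counter


def propagate_labels(labeled_points, labels, query_points, k=5):
    """Assign each query point the most frequent label among its k nearest labeled points."""
    result = []
    for q in query_points:
        # Brute-force neighbor search: sort labeled points by distance to q.
        nearest = sorted(
            range(len(labeled_points)),
            key=lambda i: math.dist(q, labeled_points[i]),
        )[:k]
        votes = Counter(labels[i] for i in nearest)
        result.append(votes.most_common(1)[0][0])
    return result


# Two small labeled clusters (hypothetical data), class 1 near the origin, class 2 near x = 10.
labeled = [
    (0, 0, 0), (0.1, 0, 0), (0, 0.1, 0), (0.1, 0.1, 0), (0, 0, 0.1),
    (10, 0, 0), (10.1, 0, 0), (10, 0.1, 0), (10.1, 0.1, 0), (10, 0, 0.1),
]
labels = [1] * 5 + [2] * 5
queries = [(0.05, 0.05, 0.0), (9.9, 0.0, 0.0)]
predicted = propagate_labels(labeled, labels, queries, k=5)
```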
Generating the synthetic datasets started from one industrial facility 3D model for both versions of data generation. The 3D CAD model used for this experiment depicts the as-designed status of the cooling plant facility. Semantic information on the contained objects is therefore organized in layers according to the responsible crafts involved in the construction project. Starting from this 3D CAD model, all objects were exported separately into individual files according to their layer-based semantics. Subsequently, the resulting collections of objects were further split or combined to comply with the pre-defined class split as introduced in Table 1.
The resulting OBJ files were then processed in different ways for the methods of simulation and sampling. For the simulation version, the sensor parameters and scanning locations in Helios were defined to be identical to those of the real TLS scan to keep the result as close to reality as possible. These results, a comparison to the real laser scan, and the exact scanning parameters used are presented in Noichl et al. (2021); the simulation took around 1 h to complete. In the final step, the simulation-based point cloud was combined from the single scans and subsequently down-sampled to a minimum point spacing of 5 mm like the real laser scanning point cloud. The mean surface density in the resulting point cloud was calculated as roughly 25,000 pts in a radius of 5 cm using CloudCompare. For the sampling method, the separate OBJ files were processed as described with CloudCompare (cf. Figure 2). Over-sampling was performed with 10^6 points per object to achieve sufficient density. Subsequently, the resulting individual point clouds were down-sampled to the required mean surface density of 25,000 pts in a 5-cm radius. For larger objects where down-sampling would not reduce the absolute number of points, we repeated the process with increasing initial point numbers until this requirement was fulfilled. This iterative process took less than 10 min in total to compute. The separate point clouds were then combined and down-sampled with a minimum point spacing of 5 mm, like both other datasets. Note that neither generation method requires specific registration, as the global object coordinates are known from the beginning and preserved through the process; hence, these individual point clouds can be combined without further computation.
After the preparation of the full point clouds, all three cooling plant datasets are split into training, augmentation, and test sets. The distinct augmentation set is used for training the network; it is the portion of a synthetic dataset that is replaced by its counterpart from the real laser scanning dataset to investigate the value of hybrid, augmented point cloud data. To create distinct datasets, the inliers of two bounding boxes in two corners of the cooling plant point clouds are separated from the full point clouds. The remaining major parts of the cooling plant point clouds are used as training sets. As the primary purpose of the factory hall dataset is testing the ability to generalize, only a small part of the dataset is separated for data augmentation purposes; the rest remains as the testing dataset. As an overview of the point cloud data used in the experiment, the real laser scan datasets of the cooling plant and factory hall are depicted in Figure 6, along with an illustration of the introduced dataset splits for both datasets. The resulting total and per-set point numbers are summarized in Table 2, which shows the variation between the data types and datasets.
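The bounding-box split can be sketched as follows. The box coordinates and function names are hypothetical, and axis-aligned boxes are an assumption; the text only states that the inliers of two bounding boxes in two corners of the point cloud are separated.

```python
def in_box(point, box_min, box_max):
    """True if the point lies inside the axis-aligned box (bounds included)."""
    return all(lo <= x <= hi for x, lo, hi in zip(point, box_min, box_max))


def split_by_boxes(points, augmentation_box, test_box):
    """Assign each point to the test, augmentation, or training set."""
    split = {"train": [], "augmentation": [], "test": []}
    for p in points:
        if in_box(p, *test_box):
            split["test"].append(p)
        elif in_box(p, *augmentation_box):
            split["augmentation"].append(p)
        else:
            split["train"].append(p)  # everything outside both boxes
    return split


# Hypothetical boxes in two opposite corners of a 10 m x 10 m footprint.
augmentation_box = ((0.0, 0.0, 0.0), (1.0, 1.0, 3.0))
test_box = ((9.0, 9.0, 0.0), (10.0, 10.0, 3.0))
points = [(0.5, 0.5, 0.5), (9.5, 9.5, 0.5), (5.0, 5.0, 1.0)]
split = split_by_boxes(points, augmentation_box, test_box)
```

Since the synthetic and real point clouds share one coordinate system, the same boxes can be applied to all three datasets, which keeps the splits spatially identical.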
The actual objects vary between the as-designed and as-is state of the cooling plant facility; therefore, the class split varies between the individual datasets. Although to a lesser degree, the numbers also differ between the two synthetic datasets due to occlusions, further amplified by modeling details in the original CAD models. The numbers of points per class and dataset are summarized in Figure 7a,b. As depicted in Figure 7a, the distribution of points over the various classes is quite imbalanced, which is challenging for applying machine learning algorithms but very common for point cloud scenes, as can be seen in similar experiments (J. W. Ma et al., 2020; Soilán et al., 2021). The class analysis is omitted for the factory hall dataset to keep the study concise. While other datasets and types that aim for classification tasks (2D and 3D) can be extended by more samples of specific classes to reduce class imbalance, datasets depicting full scenes for semantic segmentation (2D and 3D) cannot be balanced easily. Publicly available datasets such as S3DIS (Armeni et al., 2016) and KITTI (Geiger et al., 2013) show similar characteristics.
However, the normalized evaluation in Figure 7b shows that the overall point distribution per class is comparable throughout the data types. The overall dataset split is, therefore, comparable. The sampling-based dataset has the overall highest number of points. In this dataset, all model surfaces are covered in points; neither occlusions nor model parts within convex volumes are spared. However, the difference between simulation-based and real laser scan data is not that significant. In the simulation, occlusions are realistically considered, as investigated in Noichl et al. (2021) and depicted in Figure 8. With this, the explicitly stated limitation of sampling-based approaches, as identified in the study presented in J. W. Ma et al. (2020), which used a volume-based sampling approach, is addressed. Still, the overall number of points in the simulated point cloud is slightly increased, compared to the actual laser scan point cloud, as the used model is incomplete regarding highly complex surfaces and temporary and movable objects in the existing facility. First, the class of "noise" is only present in the real dataset. The synthetic datasets are generated using an as-designed model of the facility and therefore inherently do not contain non-essential or temporary objects. As the applied pre-processing includes a minimum distance down-sampling step, the reduced surface complexity leads to a reduced number of remaining points. Furthermore, after capture, the real laser scan point cloud was cleaned of noise resulting from the facility's highly reflective materials, below-minimum distance surfaces, and incidence angles.

Experiment setup
Two separate experiments were conducted to evaluate the validity of the proposed approach of using synthetic point cloud data for training the KP-FCNN architecture for point cloud semantic segmentation. The main steps were repeated for three homogeneous datasets (Experiment 1) and four hybrid datasets (Experiment 2). To describe the steps of Experiment 1, Figure 9 provides an overview of the process: data preparation for all three datasets, semantic segmentation, and evaluation of results. The machine learning model parameters remained unchanged through all trials to avoid any distortion between single experiments. For the processing in the KP-FCNN, point cloud data were pre-processed in the first step by down-sampling using a voxel grid with a consistent voxel size of 0.02 m. The radius of the kernel for convolutions to be applied on the points of the point clouds was set to 1.5 m. The learning rate was fixed to 0.01, the batch size to 6, and the maximum number of epochs to 500. After training was complete, the networks were tested on the designated testing parts of the real laser scan point clouds. Thus, it was possible to investigate how well the network performed on this real test data after training on each specific training set. Subsequently, the evaluation metrics introduced in Section 3.2 were calculated for each run and finally compared between experiment runs.
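The stated setup can be collected in a small configuration, and the voxel grid pre-processing can be sketched in a few lines. The dictionary keys below are illustrative names, not the parameter names of the KPConv implementation, and reducing each voxel to the mean of its points is an assumption made for this example.

```python
import math
from collections import defaultdict

# Values as stated in the text; key names are hypothetical.
TRAINING_SETUP = {
    "voxel_size_m": 0.02,
    "kernel_radius_m": 1.5,
    "learning_rate": 0.01,
    "batch_size": 6,
    "max_epochs": 500,
}


def voxel_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their mean position."""
    accumulators = defaultdict(lambda: [0.0, 0.0, 0.0, 0])  # x, y, z sums and count
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        acc = accumulators[key]
        for i in range(3):
            acc[i] += p[i]
        acc[3] += 1
    return [tuple(acc[i] / acc[3] for i in range(3)) for acc in accumulators.values()]


# Two points share a voxel, the third lies in a different one (hypothetical data).
points = [(0.001, 0.001, 0.001), (0.003, 0.003, 0.003), (0.05, 0.0, 0.0)]
reduced = voxel_downsample(points, TRAINING_SETUP["voxel_size_m"])
```

Applying one voxel size to all training sets removes density differences below the grid resolution, so the remaining differences between the datasets are mainly in coverage and occlusion.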

Experiment 1: Homogeneous training data
The first experiment investigates how well synthetic point cloud data can be used as training data for a neural network to perform point cloud semantic segmentation on a real laser scanning point cloud. More precisely, homogeneous sets of sampling-based and simulation-based synthetic data were used to enable a direct comparison and quantify their value for application on real-world problems. The introduced network architecture was trained and tested separately on all datasets introduced above. As depicted in Figure 10, for both synthetic data runs, the loss stabilized around 350 epochs; the chosen 500 epochs of learning seem suitable for this task.
As performance on synthetic data was not the purpose of this investigation, the performance of these trained networks on the designated test datasets was tested in the next step. As introduced in Section 4.1, these data were extracted from the real laser scan point clouds and were not used in training any of the networks. The test set point clouds with predicted class labels were then evaluated against the manually created ground truth. The resulting mean metrics were calculated over all classes per Equations (1)-(5) and are depicted in Table 3.
For the cooling plant test set, the network trained on sampling-based data achieved a mean class intersection over union (mcIoU) of 23%, while the one trained on simulation-based data reached 30%. The network trained on the real laser scan data significantly outperformed both with an mcIoU of 69%. In terms of mcIoU, the simulation-based dataset outperformed the sampling-based by seven percentage points, amounting to an increase of 30% relative to the sampling-based performance. For the mean class F1-score (mcF1), the relative improvement averaged 27%.
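The relative figure follows directly from the two absolute mcIoU values reported above. Written out, with the subscripts sim and samp as shorthand introduced here for the simulation-based and sampling-based runs:

```latex
\[
\frac{\mathrm{mcIoU}_{\mathrm{sim}} - \mathrm{mcIoU}_{\mathrm{samp}}}{\mathrm{mcIoU}_{\mathrm{samp}}}
= \frac{30\,\% - 23\,\%}{23\,\%}
= \frac{7}{23}
\approx 0.30
\]
```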
For the factory hall dataset, the results ranged lower but showed the same characteristics. The model trained on real data yielded the highest results (mcIoU 26%, mcF1 36%), noticeably lower than for the cooling plant. The type and dimension of objects in the cooling plant and factory hall datasets vary significantly. Despite this, the similarity between the separate training and testing sets within the same dataset is high. For sampling- and simulation-based training data, the difference between the datasets is less evident, resulting in lower but comparable results for the factory hall.
For those, the difference in value between sampling- and simulation-based data was identical to the one measured for the cooling plant dataset, with an absolute increase of 7% mcIoU and 8% mcF1.
For the cooling plant test data, Table 3 presents the F1-score for each class. Among the two synthetic candidate datasets, the simulation-based dataset outperforms the sampling-based dataset in most classes, with the differences ranging between 1% and 10%. However, there are a few exceptions. Both the ventilation duct and clutter classes have an F1-score of 0% because they are not properly represented in the synthetic datasets; clutter is absent from them altogether. Moreover, the wall and tank classes show significant improvements of +21% and +24%, respectively, due to the discrepancy between their surface representation in the 3D CAD model and the visible surfaces in the facility. For example, the walls are modeled with several layers that are all used for sampling but are inherently occluded in the laser scan simulation. Conversely, the sampling-based dataset performs 4% better for the floor class. Although this study's findings cannot fully explain this specific exception, the overall results strongly favor the simulation-based dataset over the sampling-based dataset.
Figure 11 shows a snippet of point predictions per experiment run along with visualized false predictions. Well-performing and failing classes can be distinguished as follows: wall, floor, pipe fittings, and pipe accessories range above 50%. The visual check and false prediction figures of Figure 11 for Experiment 1 are depicted in Figure 12. A clear result in comparing both experiments is the inability to produce good results for elements that are not or are poorly depicted in the utilized 3D models. In both trials of Experiment 1, the facility's ceiling is poorly predicted (cf. Figures 11 and 12).
As shown in Figure 8a,b, the ceiling is depicted by a simple plane in the underlying as-designed model. While this representation might be sufficient for planning purposes, it lacks the geometric precision needed to produce synthetic point clouds for training a neural network for semantic segmentation.
The bracing elements themselves are modeled in detail, but the proximity to the ill-represented ceiling class leads to poor results for this class as well. For the sampling case, predictions for pipe accessories are mixed between pipe fitting and pipe accessories, while in the simulation-based trial, the predictions are more homogeneously pipe accessories.

Experiment 2: Augmented training data
To improve the network's performance, the purely synthetic training point cloud data were augmented with smaller amounts of annotated, real laser scan data.In a practical application, these conditions could be achieved by generating a large amount of synthetic data leveraging industrial facility 3D models; in addition to that, a small amount of point cloud data from the facility in question or a similar one would be manually annotated and added to the synthetic data to form a hybrid training dataset.
In this second experiment, the designated augmentation part of the synthetic datasets was replaced by the respective augmentation counterpart of the real laser scan point clouds. Thus, the underlying dataset split was not changed (cf. Figure 6) while aiming to overcome the shortcomings identified in the first experiment. For the reasons laid out in Section 4.1, the total number of points varied between the datasets. Hence, the augmentation set of 7.2 × 10^6 points, or 12.8% of the real laser scan dataset, led to a different percentage of the overall available points for training in the respective augmented dataset. For the sampling-based dataset, the substituted augmentation set constitutes 5.3% of the available points for training; for the simulation-based version, 11.3%. The underlying calculations were based on the dataset-specific numbers of points and relative splits introduced in Table 2. Except for the datasets used, all parameters remained unchanged from Experiment 1. Similar to the first experiment, training loss stabilized around 350 epochs; therefore, the training parameters carried over from Experiment 1 were acceptable for the second experiment.
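The stated shares allow a rough back-of-the-envelope estimate of the hybrid training set sizes. The following assumes that each percentage is the share of the 7.2 × 10^6 substituted real points in the respective hybrid training set; because the percentages are rounded, the results are approximate, and the exact numbers are those of Table 2. The symbols N_samp and N_sim are shorthand introduced here.

```latex
\[
N_{\mathrm{samp}} \approx \frac{7.2 \times 10^{6}}{0.053} \approx 1.36 \times 10^{8},
\qquad
N_{\mathrm{sim}} \approx \frac{7.2 \times 10^{6}}{0.113} \approx 6.4 \times 10^{7},
\qquad
\frac{N_{\mathrm{sim}}}{N_{\mathrm{samp}}} \approx \frac{0.053}{0.113} \approx 0.47
\]
```

Under this reading, the simulation-based training set is roughly half the size of the sampling-based one.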
Both trained networks were again tested on the test set taken from the real laser scan dataset. As in Experiment 1, these data were not used in any training. The performance increases significantly for both configurations. In this setup, the augmented sampling-based data achieve 60% mcIoU, and the simulation-based data reach 65%, which is close to the performance of the complete real training dataset with an mcIoU of 69% on the real testing data. The augmented simulated data thus missed the benchmark achieved by the real data by only 4 percentage points, while the augmented sampling-based alternative performed significantly worse with a remaining delta of 9 percentage points. In relative terms, the simulation-based augmented dataset outperforms the sampling-based version by 8.3%.
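The gaps and the relative improvement follow from the three mcIoU values given above:

```latex
\[
69\,\% - 65\,\% = 4 \text{ percentage points},
\qquad
69\,\% - 60\,\% = 9 \text{ percentage points},
\qquad
\frac{65\,\% - 60\,\%}{60\,\%} = \frac{5}{60} \approx 0.083
\]
```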
In a more detailed description, the F1-scores are depicted per class in Table 4. As in the first experiment, the simulation-based approach outperforms the sampling-based approach across most classes. There are two minor exceptions (clutter and equipment), where the latter achieves slightly higher results, but in summary, the results are clearly in favor of the simulation-based alternative as indicated in the macro mean introduced in Table 5. In addition to the well-performing classes of Experiment 1, after augmentation, results range above 50% for the classes of ceiling, beam, ventilation duct, cable routing, bracing, clutter, and equipment. These classes now constitute the majority of classes overall; the results are convincing for the entire scene (cf. Figure 13). The most significant improvement between Experiments 1 and 2 can be found in the classes of ceiling and bracing. As identified during Experiment 1, this area of the facility has a very simplified geometry in the as-designed model that is used for generating the synthetic datasets. This issue was solved by adding a small amount of real laser scan points to the training set.
Visually, Figure 14 also shows a clear improvement in comparison to the results of Experiment 1 (Figure 11). While the overall mispredictions are drastically reduced in comparison to the first experiment, the geometric differences between the data types are less obvious. Both synthetic approaches, however, fail to properly predict point classes for movable items (classified as clutter, such as fire extinguishers) and secondary support structures. As the synthetic datasets are based on an as-designed model of the facility, those objects are not included in those point clouds.
We extended our experiment to the factory hall point cloud to investigate the potential such a hybrid dataset has in the context of a different facility.For this, a small part form inference on the factory hall test set; all results of this are collected in Table 6.
The results show similar characteristics as in Experiment 1, as the model performance on the factory hall dataset is below the results for the cooling plant. By augmenting the synthetic cooling plant training data with real data from the cooling plant dataset, the performance increase on the factory hall dataset ranges between 5% and 10%, and the absolute added value of simulated data over sampling-based data is clear with 3% and 5%. Noticeably, the model trained on real training data from the cooling plant performs similarly to the model trained on the augmented, synthetic cooling plant dataset (cf. Table 7).
Even at this small size, the augmentation set from the factory hall leads to the best performance. The difference in performance between sampling- and simulation-based data is persistent at 2%-3%.

Comparison with related experiments
The closest related study is the one introduced by J. W. Ma et al. (2020). In contrast to the mentioned paper's method for data generation, the sampling technique used in this paper is limited to the surface of the objects instead of the full volume of the object, which brings the presented method significantly closer to an actual laser scan that is limited to object surfaces. Furthermore, points are sampled using a random distribution instead of a grid for the presented sampling-based approach; the simulation-based version introduces TLS-specific properties such as realistic placement-dependent occlusions, range-dependent resolution, and precision. In their further studies regarding point densities, J. W. Ma et al. (2020) seem to have used random down-sampling, which lifts the spatial restriction of the grid to a certain degree.
Despite the limitations in comparability regarding dataset and sampling technique, this study shows that the simulated synthetic dataset carries significantly more information than the already improved sampling-based alternative while saving storage space. This was shown in the results of the presented experiments, where the dataset generated by simulation has shown increased value for training a neural network to perform well on real, unseen data, in comparison to sampling-based synthetic data, both for homogeneous synthetic data and for synthetic data augmented with real laser scan points.

DISCUSSION
Synthetic point clouds generated through simulation depict reality better than sampling-based alternatives.
Specifically in the industrial domain, where laser scanning is the predominant acquisition technique and complicated geometries often lead to strong occlusions, it makes sense to apply data generation methods that result in data that are as close as possible to real data. The experiments presented in this paper investigate the value that synthetic point cloud data created using two alternative methods have as training data for neural networks for semantic segmentation.
The two introduced experiments pursue different goals. While Experiment 1 aims to clarify the added value of generating more realistic synthetic point cloud data for training neural networks for point cloud semantic segmentation, Experiment 2 aims to match the performance of a neural network trained on real laser scanning data with a minor trade-off by adding a small subset of manually prepared ground truth data.
Both experiments were successful and show that, without pre-trained networks or large-scale labeled data, new domains can be tackled with regard to point cloud semantic segmentation, given that 3D models of comparable scenes are available.
While sampling-based methods are a valid option, employing simulation-based approaches leads to fewer overall points in the point cloud training set, saving storage space and significantly improving overall performance. In the presented method of data generation, the only necessary manual intervention remains in preparing the CAD model for simulation by assigning object classes according to the chosen class structure. This step is necessary if the available 3D model does not possess instance-level semantics, which was the case in the presented experiment. This step can be fully automated by parsing and filtering object semantics in a complete and correct BIM or 3D CAD model with full semantic information on the instance level.
Depending on the use case, the overall results of Experiment 1 might be insufficient for robust further processing due to weak results for certain classes. With the results of Experiment 2, it could be shown that with limited manual intervention, synthetic data lead to robust results that provide a good trade-off between expensively annotated real-world scan data and the cheap solution of sampling-based synthetic data.
The presented approach comes with several limitations. As the experiments have shown, the quality of the 3D model is decisive for the value of the synthetic data generated on its basis. Poor representation of actual geometric features directly influenced the generalization power of our method and led to poor prediction performance for affected classes and neighboring objects belonging to other classes.
The experiments showcase the application of the proposed method for generating synthetic point cloud data only on one facility. Furthermore, compared to other ground truth point cloud datasets for semantic segmentation, the presented three datasets representing the cooling plant are relatively small. In general, this is a shortcoming for the training of a neural network.
To evaluate the potential of such a baseline of training data to generalize and perform inference on other datasets, a second point cloud dataset of real laser scan data was added to the experiment. The results of these extended experiments on the second dataset are in line with the findings on the first dataset. It could be shown that, with very limited additional annotation effort, the baseline training data could be extended to be useful for a significantly different dataset in terms of purpose, layout, size, and included object classes.
The comparability of the presented method is inherently limited due to the absence of established benchmarks and public data. For example, access to their models and sampled data would be required to evaluate the added value of the presented method in comparison to the work using a manually remodeled part of S3DIS and volumetric grid-based sampling (J. W. Ma et al., 2020). To date, there is a lack of publicly available ground truth datasets showing full industrial scenes that could be used as a starting point for training models and benchmarking developed approaches.

CONCLUSION AND OUTLOOK
This paper proposes to integrate realistic synthetic ground truth data into a workflow for point cloud semantic segmentation for the industrial domain, where the absence of publicly available ground truth datasets prevents the implementation of standard approaches with readily available annotated real ground truth data for training. At the same time, the industrial domain poses specific challenges, mainly regarding geometry, resulting occlusions, and specific classes. The presented work shows that realistic synthetic data are helpful for semantic segmentation. Furthermore, compared with data generated in a sampling-based method, the synthetic data created using laser scanning simulation show a substantial performance increase. Creating such synthetic data requires no manual effort, given suitable 3D models are available; thus, they can be generated quickly with complete, error-free class annotations. Furthermore, as they depict the laser scan in terms of occlusions and surface coverage, in comparison with a full, sampling-based approach, around half of the overall point cloud size is sufficient for reaching the same mean surface density as the real point cloud dataset. Combined with a small amount of real laser scan data, synthetic datasets can produce results close to the presented benchmark achieved using purely real scanning data. Thus, applying scan simulations provides significant effort-saving potential in further processing steps in scan-to-BIM.
While it has been shown that the approach yields promising results for the industrial domain, extending similar investigations with comparable parameters to a wider variety of applications, such as infrastructure or conventional office indoor spaces, would allow for more general statements about the value and limitations of this approach. Semantic segmentation, while currently arguably the most valuable approach for semantic enrichment in laser scan point clouds for further processing and model reconstruction, is inherently limited to object classes and could be enhanced significantly by instance segmentation. Another interesting continuation of this research is an extension of the presented synthetic training data generation method with this instance aspect, addressing this remaining gap. Furthermore, an investigation of the applicability for different domains with specific requirements and classes seems interesting, along with the impact of model quality on the process.

ACKNOWLEDGMENTS
This work was conducted within the scope of a research project funded by AUDI AG, Ingolstadt, Germany. Further support was received in the form of hardware from NVIDIA through their Applied Research Accelerator program. The support of both companies is gratefully acknowledged.
Open access funding enabled and organized by Projekt DEAL.

FIGURE
Steps for the process of generating the simulation- and sampling-based synthetic point clouds: model data preparation (left), simulation-based data generation (middle), and sampling-based data generation (right).

FIGURE
Exemplary pipe cross-section segment from three different point cloud sources: synthetic sampling-based (left), synthetic simulation-based (middle), and real laser scan (right).

TABLE 1
Point cloud classes with corresponding integer ID.

FIGURE
Investigated laser scan datasets, exemplary sections of the cooling plant (left) and factory hall (right), equal scale.
FIGURE 6 Investigated datasets, dataset splits indicated: cooling plant (top) and factory hall (bottom), 10 m for scale.

FIGURE
Distributions of points per class in comparison between dataset types in cooling plant: (a) total points per class and dataset type and (b) normalized share of class per dataset type.

FIGURE 8 Comparison between point cloud snippets: (a) sampling-based, (b) simulation-based, and (c) real, laser-scanned point cloud, semantic classes of ground truth color-coded.
FIGURE 9 Experiment workflow for Experiments 1 and 2 with regard to data sources and process steps.

• three homogeneous datasets: two synthetic point clouds, one sampling-based and one simulation-based, and one real laser scanning point cloud (Experiment 1);
• four hybrid datasets of synthetic data augmented with a fixed amount of real laser scanning data, two each from the cooling plant and the factory hall datasets (Experiment 2).

FIGURE
Loss curves for 500 epochs training purely on synthetic data: sampling-based and simulation-based.
indicate this. The confusion matrices
FIGURE 11 Experiment 1: Point class predictions, network trained on sampling-based (a) and simulation-based (c) datasets; false predictions (red) for training on sampling-based (b) and simulation-based (d) datasets; ground truth class labels (e) and legend for a, c, e (f).

FIGURE
Confusion matrices of the real laser scanning test set for models trained on (a) sampling-based and (b) simulation-based synthetic training data.

FIGURE
Confusion matrices of the real laser scanning test set for models trained on (a) augmented sampling-based and (b) augmented simulation-based synthetic training data.

of the factory hall dataset was designated as augmentation data; the entire rest of the point cloud was designated as testing data. Just like in the first implementation of the experiment, the model was trained on a dataset of synthetic cooling plant data augmented with real data from the cooling plant. Subsequently, the synthetic cooling plant training data were augmented with the augmentation part of the factory hall dataset; training the model was repeated with this hybrid dataset. Finally, both of these models trained on hybrid point cloud datasets were used to per-

FIGURE 14 Experiment 2: Point class predictions, network trained on sampling-based (a) and simulation-based (c) datasets; false predictions (red) for training on sampling-based (b) and simulation-based (d) datasets; ground truth class labels (e) and legend for a, c, e (f).

TABLE 6 Results for Experiment 2 on the factory hall dataset: macro mcIoU and mcF1 for varying setups, training on synthetic datasets generated by SAM and SIM augmented with real laser scanning data from the cooling plant dataset (+) and the factory hall dataset (*); Δ_SAM, Δ_SIM indicate the absolute changes in comparison to Experiment 1.

        SAM+   Δ_SAM   SIM+   Δ_SIM   SAM*   Δ_SAM   SIM*   Δ_SIM
mcIoU   0.23   +0.08   0.26   +0.04   0.35   +0.20   0.37   +0.15
mcF1    0.29   +0.09   0.34   +0.06   0.42   +0.22   0.45   +0.17

Florian Noichl https://orcid.org/0000-0001-6553-9806
Fiona C. Collins https://orcid.org/0000-0001-5246-7727
Alexander Braun https://orcid.org/0000-0003-1513-5111
André Borrmann https://orcid.org/0000-0003-2088-7254

REFERENCES
Agapaki, E. (2020). Automated object segmentation in existing industrial facilities [Doctoral dissertation, University of Cambridge]. https://doi.org/10.17863/CAM.52102
Agapaki, E., & Brilakis, I. (2020

Dataset splits per data source type.
TABLE 2 Results for Experiment 1 on cooling plant data: class-wise F1-score for training on homogeneous datasets generated by SAM and SIM; REAL for reference.
TABLE 3

TABLE 4
Results for Experiment 2 on cooling plant dataset: class-wise F1-score for training on datasets generated by sampling (SAM+) and simulation (SIM+) augmented by 12% real data, homogeneous real laser scan data (REAL) for reference; Δ_SAM, Δ_SIM indicate the absolute changes in comparison to Experiment 1.

Dataset          Cooling plant        Factory hall
Training setup   SAM   SIM   REAL     SAM   SIM   REAL
Results for Experiment 1: mean class intersection over union (mcIoU) and mean class F1-score (mcF1) for varying setups, training on cooling plant datasets: synthetic data generated by means of sampling (SAM) and simulation (SIM); real laser scanning data (REAL) for reference.
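For readers who wish to reproduce the reported scores, the following minimal sketch illustrates how mcIoU and mcF1 can be derived from a confusion matrix such as those shown in the confusion matrix figures. It is not the evaluation code used in the experiments. The function names `class_metrics` and `macro_scores` are hypothetical, the matrix layout (rows: ground truth, columns: prediction) is an assumption, and excluding classes that occur neither in the ground truth nor in the predictions from the mean is one possible convention, not necessarily the one applied in this work.

```python
import numpy as np


def class_metrics(conf):
    """Per-class IoU and F1 from a square confusion matrix.

    Assumed layout: conf[i, j] counts points of ground truth class i
    that were predicted as class j.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)                # correctly labelled points per class
    fp = conf.sum(axis=0) - tp        # predicted as class c, but other ground truth
    fn = conf.sum(axis=1) - tp        # ground truth class c, but predicted otherwise
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / (tp + fp + fn)     # NaN if the class never occurs at all
        f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


def macro_scores(conf):
    """Unweighted mean over classes (mcIoU, mcF1); absent classes are skipped (assumption)."""
    iou, f1 = class_metrics(conf)
    return float(np.nanmean(iou)), float(np.nanmean(f1))


if __name__ == "__main__":
    # Toy two-class example, not data from the paper.
    toy = [[2, 1],
           [0, 3]]
    mc_iou, mc_f1 = macro_scores(toy)
    print(f"mcIoU={mc_iou:.3f}, mcF1={mc_f1:.3f}")
```

Because both scores are unweighted means over classes, rare classes influence the result as strongly as frequent ones, which matters for the imbalanced class distributions typical of industrial point clouds.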