Urban wind field prediction based on sparse sensors and physics‐informed graph‐assisted auto‐encoder

The urban wind flow field is a critical element for downstream research, such as the mitigation of urban wind disasters, the assessment of the urban wind environment, and urban drone route planning. However, it is impractical to arrange a large number of sensors to monitor an urban wind flow field. Hence, acquiring the entire urban wind flow field via sparse sensors would be highly valuable. To date, no scheme, including deep-learning (DL) models, has been specifically designed for this purpose. This study presents an innovative approach to reconstructing complex high-resolution urban wind fields from sparse sensors, using a physics-informed graph neural network (GNN)-assisted auto-encoder. The proposed method leverages the relationship between sensors and their surrounding environment, enabled by the deep feature-mining capability of GNNs. As a result, the utilization of, and emphasis on, sparse sensor data are significantly enhanced. The continuity equation of fluid flow is incorporated into the loss function of the convolutional neural network to improve the stability and performance of the model. The findings suggest that, in contrast to prevalent generative DL models, the proposed model yields an approximately 50% reduction in root mean square error when reconstructing high-resolution urban wind fields for multiple wind attack angles.


INTRODUCTION
As urbanization worldwide continues to escalate, the assessment of the urban environment becomes increasingly imperative for safety, health considerations, and airspace management of drone traffic. Of particular significance is the exploration of the urban wind environment, since issues such as urban wind disasters, the heat island effect, pollutant diffusion, pedestrian comfort, and nascent drone pathfinding are all intricately intertwined with it. Thus, the instant reconstruction of the wind field across the entire urban expanse, utilizing just a limited number of sparsely arranged anemometers, holds paramount importance for the investigation of proximate domains as well as for the early warning of urban wind disasters. Research on urban wind fields can be mainly categorized into computational fluid dynamics (CFD) simulation, wind tunnel testing, field measurements, and data-driven methods.
Within the realm of CFD simulation, numerous researchers have made valuable contributions and achieved noteworthy outcomes. For instance, Toparlar et al. (2015) conducted unsteady Reynolds-averaged Navier-Stokes (URANS) simulations to predict urban wind flow and temperatures in the Bergpolder Zuid region. The simulation results were validated using high-resolution thermal infrared satellite imagery, leading to the conclusion that CFD has the potential to accurately predict the urban microclimate. The influence of wind direction and urban surroundings on the air change rate per hour (ACH) of a large semi-enclosed stadium was assessed using CFD by Van Hooff and Blocken (2010). The simulation results were validated using field measurement data. The results indicate that both wind direction and urban surroundings have a significant effect on ACH. The relationship between pollution sources and pollutant concentrations along their pathways in the Manhattan area was reconstructed through detailed CFD modeling by Huber et al. (2006). The dispersion route of tracers in the urban environment was reconstructed by Flaherty et al. (2007) through the combination of CFD and semi-empirical building-resolved models. Ramponi et al. (2015) presented 3D steady RANS simulations and a passive scalar transport equation to calculate the effective local mean age of air at pedestrian level as an indicator of pollutant removal efficiency. Grid-convergence analysis and wind-tunnel measurements were used to verify the effectiveness of their results, which show that a higher flow rate through the main street reduces the flow rates through the parallel narrower streets, negatively affecting their ventilation efficiency.
Wind tunnel experiments and on-site measurements are valuable approaches for studying urban wind patterns.
Numerous exemplary studies have been conducted using these methods. The influence of twisted wind flows on the pedestrian-level wind field in an urban area was investigated by Weerasuriya et al. (2018) using a boundary layer wind tunnel. The findings indicate that twisted winds tend to produce higher wind speeds at the pedestrian level compared to the conventional wind profile. Tablada et al. (2009) proposed the construction of a summer comfort zone for residential buildings in Old Havana, drawing upon field measurement data and a limited comfort survey. The wind speed and turbulence characteristics within the first 2 m above ground level were measured by Zou et al. (2021) at three urban green spaces surrounded by buildings of varying heights and densities. The on-site wind data obtained from this study serve as a reference for designing the corresponding experimental conditions. The relationship between four common architectural forms and the wind environment was investigated by Gao et al. (2012). Measurements of wind speed, direction, and other factors were obtained from six representative locations, and a statistical analysis identified the key factors that describe the impact of built forms on the resulting airflows. Mikhailuta et al. (2017) demonstrated the comprehensive alteration of undisturbed wind flow resulting from nonuniform topography and building arrangements, using 15 years of data collected from urban monitoring stations. These results are valuable for validating numerical simulation models of air pollution dispersion and improving the parameterization of a wide range of urban wind flow problems.
There also exists an extensive range of research on urban wind fields that utilizes data-driven methods (Fan et al., 2022; Song et al., 2022; Tang & Zeng, 2022). Bui-Thanh et al. (2004) successfully demonstrated for the first time that the proper orthogonal decomposition (POD) method can effectively reconstruct complete aerodynamic flow fields of subsonic and transonic nature based on sparse data. Willcox (2006) extended the gappy POD method to the reconstruction problem of nonstationary flow fields. They established a scheme spanning optimal sensor placement to sparse-sensor-based flow field reconstruction, and verified the effectiveness of this method on a subsonic airfoil model. Shao et al. (2023) introduced a novel physics-informed graph neural network (GNN) for efficiently predicting urban wind fields using irregular unstructured mesh data from CFD simulation. The results demonstrate that the proposed model runs significantly faster (1-2 orders of magnitude) than CFD, while maintaining a nearly consistent level of accuracy. Høiness et al. (2021) incorporated the signed distance function (SDF) as an additional data channel into the generative adversarial network (GAN) model. The outcomes evince the advantageous influence of the SDF in enhancing the prognostic capacity of the model regarding the wind field encircling bluff bodies. Kashefi and Mukerji (2022) incorporated physical constraints into the loss function of the conventional point cloud neural network (Qi et al., 2017). The resultant model facilitates the efficient computation of wind fields surrounding an array of bluff objects, while maintaining remarkable generalization capacity across these geometries. Strönisch et al. (2022) applied a graph convolution network (GCN) to predict the stationary fluid flow field of complex geometries. The results demonstrate that the proposed algorithm produces an exceptional initial flow field for CFD and reduces the computational time of CFD in certain cases.
In brief, prior research has produced an assortment of surrogate models via various data-driven learning techniques, exhibiting functionality comparable to CFD models and demonstrating superb performance. Nonetheless, for the rapid reconstruction of urban wind fields, an ideal deep learning (DL) model must be capable of promptly reconstructing intricate flow fields obstructed by diverse civil structures under varying wind attack angles, utilizing data obtained from sparse sensors. To date, no such DL model has been proposed for this challenge.
This paper introduces the first model specifically designed to tackle this issue. In response to the challenge of real-time reconstruction of urban wind fields, this paper proposes a novel DL model that draws upon a range of approaches, including a convolutional network architecture, a graph network architecture, the point cloud data format, attention mechanisms, and physical information. This auto-encoder-based model achieves extremely impressive levels of performance. At the same time, this paper also examines the limitations of two commonly used conditional generative models, GAN (Goodfellow et al., 2020) and diffusion (Ho et al., 2020), in Section 4, thus demonstrating the efficacy and superiority of the proposed model for this task across a broad range of conditions.
It is worth mentioning that the primary contribution of this study is the advancement of high-resolution urban wind flow field reconstruction techniques. The proposed approach is based on data acquired using virtual sparse sensors in large eddy simulation (LES), rather than real sensor data, due to the absence of corresponding sensors selected by the optimal sensor placement scheme (Gao et al., 2023; Luo et al., 2023). Deploying this scheme in practice requires the use of transfer learning to tune the DL model based on actual sensor data in the designated area. Consequently, the adapted model can better conform to the actual environment.
This paper is structured as follows. Section 2 presents an introduction to the data source and processing methodology used in this study. Section 3 outlines the details of the proposed DL model. Section 4 elaborates on the validation of the model through statistical and image results. Finally, Section 5 provides the concluding remarks.

TIME SERIES DATA OF URBAN WIND FIELD
This section provides a detailed description of the data set used in this study, as well as the procedures implemented in the preprocessing of the data set.

Raw CFD data
The urban block model utilized in this study was derived from the AIJ benchmark located in Niigata, Japan, with a model scale of 1/250. The selected urban model is considered a typical urban area, consisting of densely packed low-rise buildings in addition to a small number of medium- and high-rise buildings. The complexity of the urban environment represented in the model allows for an accurate demonstration of the interference that dense urban buildings can exert on the wind field. Furthermore, the wind field changes considerably under different wind attack angles in this specific urban environment. The utilization of this specific urban model allows for a comprehensive examination of the capability of the proposed DL model to accurately predict the wind field under complex urban wind conditions and various wind attack angles. The geometry of the model used in this study is illustrated in Figure 1a, and CFD simulations of wind flow were conducted at 16 different attack angles, as depicted in Figure 1b.
In each CFD simulation run for each specific wind attack angle, a simulation time step of 0.00002 s was employed. The simulation was carried out for a total of 30 s, and the first 1 s of unstable data was eliminated. A sampling time interval of 0.1 s was then used to extract the data. Consequently, a total of 4640 data time steps were collected for this experiment, obtained by multiplying 16 wind attack angles by 290 time steps. Of these, 3680 time steps were dedicated to training, whereas 960 time steps were reserved for testing purposes. The training data accounted for approximately 80% of the total data. The specific details and validity of the CFD results have been comprehensively outlined and established in our previously published work (Gao et al., 2023; Gu et al., 2023).
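The time-step bookkeeping above can be checked in a few lines (values taken directly from the text; variable names are illustrative):

```python
# Dataset bookkeeping from the text: 16 attack angles, 30 s simulated,
# first 1 s discarded, sampled every 0.1 s.
n_angles = 16
steps_per_angle = int(round((30.0 - 1.0) / 0.1))   # 290 usable time steps
total_steps = n_angles * steps_per_angle           # 4640 steps in total
train_steps, test_steps = 3680, 960
train_fraction = train_steps / total_steps         # ~0.79, i.e., roughly 80%
```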

Data preprocessing
The method of data extraction applied in this study involved generating a probe matrix that encompassed the entire research area of the CFD results. Once generated, the matrix was positioned at a height of 2 m above the ground, corresponding to typical pedestrian height in real-world scenarios. Within the context of this study, the wind field data obtained specifically from a height of 2 m have been selected to empirically examine the efficacy of the proposed model. Nevertheless, following adequate training incorporating height-specific data, the model itself can be appropriately adapted and configured for computation and utilization across a diverse range of height planes. In addition to the extracted CFD data, two auxiliary data channels were also created in this study: the Binary channel and the SDF channel. The Binary channel is used to identify the presence of obstacles such as buildings at each point in space. Specifically, points with obstacles are labeled with zero, while non-obstructed points are labeled with one. The SDF channel, on the other hand, indicates the shortest distance between each point in space and the nearest obstacle. Both channels were generated with the same spatial resolution as the probe matrix, ensuring consistency in the analysis of the CFD simulation results. This study employed the SDF channel calculation method based on the approach proposed by Høiness et al. (2021). The Binary channel and SDF channel are shown in Figure 2a and Figure 2b, respectively. The DL model proposed in this study is mainly composed of two parts: a main structure and an auxiliary structure. The main structure is responsible for performing plane-to-plane mapping, while the auxiliary structure serves as both a feature extractor and a classifier, providing additional information to the main structure from a point-and-graph perspective.
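As an illustration of the two auxiliary channels, a minimal NumPy sketch is given below. It assumes a boolean obstacle mask on the probe grid; the brute-force nearest-obstacle search stands in for the SDF computation of Høiness et al. (2021), and a proper distance transform would be preferable at the full 128 × 128 resolution:

```python
import numpy as np

def make_channels(obstacle_mask):
    """obstacle_mask: bool array, True where a building occupies the cell.
    Returns the Binary channel (0 inside obstacles, 1 outside) and an
    SDF-like channel (distance from each cell to the nearest obstacle cell)."""
    binary = (~obstacle_mask).astype(np.float32)
    ys, xs = np.nonzero(obstacle_mask)                 # obstacle cell coordinates
    obst = np.stack([ys, xs], axis=1).astype(np.float32)
    h, w = obstacle_mask.shape
    grid = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"),
                    axis=-1).astype(np.float32)
    # Brute-force nearest-obstacle distance; fine for illustration, but a
    # distance transform should be used for the full 128 x 128 grid.
    d = np.linalg.norm(grid[:, :, None, :] - obst[None, None, :, :], axis=-1)
    sdf = d.min(axis=-1)
    return binary, sdf
```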
The main structure of the DL model utilized in this study is designed to accommodate sparse sensor data as input. The generation of the sparse sensor data proceeded through the following four-step process: (1) A 128 × 128 zero matrix was created to serve as the basic framework. (2) The 30 optimal sensors were identified in the urban area based on the established method presented in our previous work (Gao et al., 2023). (3) Time-history data of the wind field were then extracted from these specific locations. (4) The extracted wind data were inserted into the zero matrix created in step 1, with the data being placed according to their respective spatial positions within the matrix. The label data used in the main structure comprise all data extracted from the 128 × 128 probe matrix within OpenFOAM. It is important to note that the data returned by probes located inside buildings have been set to zero to account for the obstructed airflow within these structures. The input and label data for the main structure are presented in Figure 3a and Figure 3b, respectively.
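The four-step generation of the sparse input can be sketched as follows (a hedged illustration; the function name, array shapes, and the two-channel u/v layout are assumptions, not the paper's exact implementation):

```python
import numpy as np

def build_sparse_input(sensor_xy, sensor_uv, size=128):
    """Steps (1)-(4) above as a sketch: scatter wind readings from the
    optimal sensor locations into an otherwise-zero two-channel grid.
    sensor_xy: (n, 2) integer row/col indices; sensor_uv: (n, 2) u, v readings."""
    grid = np.zeros((2, size, size), dtype=np.float32)   # step (1): zero matrix
    rows, cols = sensor_xy[:, 0], sensor_xy[:, 1]        # step (2): sensor cells
    grid[0, rows, cols] = sensor_uv[:, 0]                # steps (3)-(4): u channel
    grid[1, rows, cols] = sensor_uv[:, 1]                # steps (3)-(4): v channel
    return grid
```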
The input data for the auxiliary structure take the form of point clouds, constructed according to the established 128 × 128 spatial resolution format. Specifically, the input data consist of all optimal sensors and the 8 closest neighboring points surrounding each sensor, forming a comprehensive point cloud of each sensor and its surrounding environment. For each point within the point cloud, seven features were included: the x-coordinate, the y-coordinate, the CFD simulation moment, the x-direction wind speed component u, the y-direction wind speed component v, the obstacle mark value Binary, and the spatial distance value SDF. One key contrast between our study and other point cloud-based approaches is that the positional relationships between points are determined not only by the point features, but also by the structured arrangement of the data within the 128 × 128 spatial resolution matrix. By incorporating a convolution module into the DL architecture proposed in this study, this data arrangement enables the model to better exploit the data information. It is important to note that in real-world scenarios, the u and v values of the eight neighboring points surrounding each sensor are often unavailable. Therefore, in this study, the wind speeds of neighboring points located outside of buildings are set to be the same as that of the sensor in the center, while the wind speeds of neighboring points located inside of buildings are set to 0. Under this setting, the "edge" values obtained by the Get Graph Feature (GGF) layer (see Section 3.1.1) for u and v of neighboring points located outside of buildings become 0, and thus the potential negative impact on the model is minimized. The input data for the auxiliary structure are shown in Figure 4. The label data for the auxiliary structure comprise a set of discrete numerical values ranging from 1 to 16. These values correspond to the input data from the 16 different wind attack angles considered in this study.

REAL-TIME RECONSTRUCTION DL MODELS
This section provides a detailed overview of the DL model structure proposed in this study, including both the auxiliary structure (RGCNNC) and the main structure (PGI-AE). The data utilized for RGCNNC are extracted from sparse sensor positions and their neighboring points. The sparsely gathered data from the urban wind field are organized in a condensed data structure within the neural network. Throughout the processing, any modifications in the data structure are thoroughly documented, enabling us to interpolate the data back to their original sparse structure using their initial spatial locations. To facilitate comprehension of the overall framework, the auxiliary structure will be introduced first, followed by a discussion of the main structure.

GGF layer
GGF layer is the initial layer of RGCNNC. Its primary function is to establish "one-way edges" between neighboring points and their respective sensor point in the center across all feature dimensions. As shown in the top portion of Figure 5, GGF layer takes the original point cloud data (as demonstrated in Figure 4) as input. Within this layer, the sensor data in the center for each of the seven feature channels are extracted from the central position of the input data (the green grid of the original point cloud data), and then used to populate a 3 × 3 matrix for each feature dimension, thereby forming the sensor point cloud data. The sensor point cloud is then subtracted element-wise from the original point cloud data. The outcome of this operation is known as the "edge feature" due to its correspondence, from a graph-theoretic standpoint, to the creation of "one-way edges" from each neighboring point to its respective sensor point in the center. These "one-way edges" effectively represent the underlying influence of neighboring points on the central point, allowing for an accurate characterization of the structural relationships and interactions between the neighboring points and the central point. Finally, the original point cloud data and the edge feature are concatenated along the feature dimension to obtain the graph feature, which contains both the original point features and the edge features.
It is noteworthy to mention that the architecture proposed in this study only establishes a graph connection between the sensor point in the center and its neighboring points.
Graph connections between the sensors in the center, however, are not established. The urban wind field is significantly influenced by numerous adjacent buildings, which substantially alters the flow dynamics and weakens the correlation between the data collected at the various sensor locations. Furthermore, the data retrieved from a sensor inevitably reflect the influence of the surrounding flow field. Hence, in urban wind fields, the approach proposed in this paper is deemed to be the more practical solution.
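For concreteness, the GGF operation on a single sensor neighborhood can be sketched as below (a NumPy illustration; placing the sensor at index 4 of the flattened 3 × 3 cell is an assumed convention). Consistent with the fill rule of the data preprocessing section, outside-building neighbors that copy the sensor's wind speed yield exactly zero u/v edge values:

```python
import numpy as np

def get_graph_feature(points, center_idx=4):
    """Sketch of the GGF layer for a single sensor neighborhood.
    points: (9, c) features of the flattened 3 x 3 cell, sensor in the middle.
    The sensor row is broadcast and subtracted from every row to form the
    'one-way edge' features, then concatenated with the original features
    along the channel axis to give the (9, 2c) graph feature."""
    center = points[center_idx:center_idx + 1, :]   # (1, c) sensor point
    edge = points - center                          # (9, c) edge features
    return np.concatenate([points, edge], axis=1)   # (9, 2c) graph feature
```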

Neighboring points assemble (NPA) layer
NPA layer serves as an intermediary layer between the information extraction and the information enrichment components for the classification task (as demonstrated in Figure 8). The main objective of the NPA layer is to condense the spatial-wise data. More precisely, it synthesizes the impact of the eight neighboring points on the sensor point in the center into a single value, in preparation for the generation of the subsequent classification evaluation metrics. Viewed from a data structure perspective, following processing by GGF layer and other convolutional modules (refer to Figure 8), the data are presented in the form N × 9 × C before they enter NPA layer. Here, N denotes the number of sensors, 9 represents the sensor point in the center plus its eight neighboring points, and C represents the number of feature channels. As indicated in Section 3.1.1, the channel axis is treated as the feature dimension for convolution. However, at the NPA layer, the axis of size 9 (representing the sensor point in the center and the eight neighboring points) of N × 9 × C is treated as the feature dimension. In other words, from the perspective of computer vision, the data are reconsidered as an N × C image with 9 feature channels. Thereafter, dimensionality reduction convolutions are performed, ultimately producing only one feature channel, which represents the comprehensive value of the spatial influence. The procedure is depicted in Figure 6.
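A minimal sketch of the NPA re-interpretation and reduction (the learned dimensionality-reduction convolutions are replaced here by a single fixed weight vector, purely for illustration):

```python
import numpy as np

def npa(x, w):
    """Sketch of the NPA layer. x: (n, 9, c) graph data; the size-9 axis
    (center sensor + 8 neighbors) is re-read as the channel axis of an
    n x c 'image' and collapsed to one channel by a weighted reduction.
    w: (9,) reduction weights (learned in the real model; fixed here)."""
    x = np.transpose(x, (1, 0, 2))               # (9, n, c): 9 becomes channels
    return np.tensordot(w, x, axes=([0], [0]))   # (n, c): single output channel
```

With uniform weights this reduces to averaging over the 9 spatial positions, which is the simplest instance of "synthesizing the neighbors' impact into a single value".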

Point cloud and graph features enrichment (PCGFE) layer
PCGFE layer is responsible for a further condensation of the point and graph information after the processing of NPA layer, in order to generate the evaluation indicators necessary for the classifier. In the PCGFE layer, the information from all sensors is condensed by extracting the maximum and mean values across all sensors for each feature channel, as illustrated in Figure 7. Subsequently, the maximum and mean values are concatenated into a one-dimensional vector that enters the subsequent fully connected mapping layers.
Identifying the maximum and mean values of each feature is a commonly employed feature enrichment technique in point cloud neural networks, based on the permutation-invariant and transformation-invariant properties of point cloud data. The data utilized in this study can be classified as two-dimensional point cloud data situated at specific locations (sensor locations), thereby obeying the aforementioned point cloud data properties. This manipulation may result in a slight loss of graph information within the data. Nonetheless, the main model has already received the most complete graph information, extracted via Graph features 1-5 in Figure 8, before the PCGFE layer is reached. Therefore, the minor loss here only limits the extent to which graph features can contribute to the classification task and has a negligible impact on the main model. The work of Wang et al. (2019) has established the effectiveness of the information enrichment methodology employed in this study for classification tasks.
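The max-and-mean enrichment of the PCGFE layer amounts to the following (a sketch; the real layer feeds the concatenated vector into fully connected layers):

```python
import numpy as np

def pcgfe(x):
    """Sketch of the PCGFE layer: condense all-sensor information by taking
    the per-channel max and mean over the sensor axis, then concatenate them
    into one vector for the fully connected classifier head.
    x: (n, c) per-sensor features after the NPA layer. Returns (2c,)."""
    return np.concatenate([x.max(axis=0), x.mean(axis=0)])
```

Both max and mean are symmetric functions of the sensor axis, which is what makes the result invariant to the ordering of the sensors (the permutation-invariance property cited above).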

RGCNNC structure
The primary objective of the RGCNNC model is to accurately determine the wind attack angle of the wind field based on the point cloud data obtained from the N sensors and their eight neighboring points at any given time. In other words, RGCNNC works as a classifier (Alam et al., 2020; Pereira et al., 2020; Rafiei & Adeli, 2017). The structure of the RGCNNC model is divided into two critical components: the information extraction and information enrichment sections, as illustrated in Figure 8.
The objective of the information extraction part (as shown in Figure 8) is to extract the relevant data information. After processing by GGF layer, all point cloud and graph data are modified into a format resembling images for feature mining based on convolution-related operations, as described in Section 3.1.1. Subsequently, five up-dimensional convolution modules are applied to this data structure to maintain spatial resolution consistency (N × 9) while simultaneously increasing the number of feature channels to fully express the information contained in the data. Furthermore, residual connections are incorporated to enhance the utilization of the initial feature information and ensure that the learning direction of the model is guided by the initial features. During the training of the RGCNNC model, the structures mentioned above aim to uncover the latent information within the data and distinguish different wind attack angles in a high-dimensional feature space, with the ultimate goal of improving the classifier accuracy. From the processing result of the m-th convolution module (m = 1, 2, 3, 4, 5), a high-dimensional graph feature, designated as Graph feature m, is derived, as depicted in Figure 8. Through the utilization of diverse feature channel depths varying from shallow to deep (64, 128, 256, 512, and 1024 feature channels for Graph features 1-5), these graph features comprehensively capture the traits of the data from the sensors and their neighboring points for each wind attack angle, further outlining the disparities among these characteristics under different wind attack angles in a detailed and multifaceted manner. These messages sufficiently depict the information required by the main model structure.
The information enrichment part (as shown in Figure 8) condenses the high-dimensional feature-represented data by gradually lessening their dimensions while simultaneously refining them into suitable indicators that suggest the likelihood of the input data belonging to each wind attack angle. While the graph features generated by each convolution module of the information extraction part possess their respective strengths in representing the input data, in general, the data with the highest feature dimensions include the broadest range of information. Moreover, the incorporation of residual connection technology further ensures that information in the shallow layers of the neural network can be effectively transferred to the deep layers. Hence, given that the succeeding portion of the model aims to condense the data, excessive superfluous information may have a deleterious impact. As a result, the information enrichment part solely processes the last graph feature, namely, Graph feature 5. NPA layer integrates spatial-level information, specifically the influence of neighboring points on the sensor data in the center, while the PCGFE layer integrates sensor-level information, which is derived from all sensors and their corresponding neighboring points. Subsequently, the fully connected module conducts a gradual reduction in feature dimensionality on the condensed information and ultimately produces 16 values that are utilized by the cross-entropy loss function to determine the wind attack angle of the input data.
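The classification head described above reduces to 16 logits scored by cross-entropy; a minimal, framework-free sketch (function names are illustrative, and a DL framework's built-in criterion would be used in practice):

```python
import numpy as np

def cross_entropy(logits, label):
    """The 16 output values are interpreted as class logits; this computes
    the cross-entropy of the true wind-attack-angle class in a numerically
    stable way (log-sum-exp with the max subtracted)."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def predict_angle_class(logits):
    """At inference time, the predicted attack-angle class is the argmax."""
    return int(np.argmax(logits))
```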

Physical loss generator
Numerous existing physics-informed neural networks in the domain of fluid dynamics set spatial parameters, for instance, x and y, as the input data and identify the output data as flow field variables, such as u and v. Subsequently, they compute the Navier-Stokes equations by utilizing the automatic differentiation function of PyTorch or TensorFlow (Cai et al., 2021; Jin et al., 2021; Lucor et al., 2021). This approach is theoretically perfect and conforms entirely to mathematical standards. However, the usage of automatic differentiation requires that the neural network structure be fully connected. This requirement makes it exceedingly challenging to apply this rigorous mathematical technique to convolution-based or other non-fully connected structures. In order to overcome the previously mentioned issues, this study proposes a facile and user-friendly technique of integrating physical constraints into non-fully connected neural network structures, particularly those based on convolutions, within the context of a data-driven approach.
The two-dimensional mass conservation (continuity) equation of an incompressible fluid is ∂u/∂x + ∂v/∂y = 0, where ∂u/∂x represents the variation in u upon a unit displacement in the x-direction, while ∂v/∂y denotes the variation in v upon a unit displacement in the y-direction. Derived from this realization, the physical loss for this study has been devised, which solely concerns the output data, as depicted in Figure 9. To elaborate, consider u as an example. After the main structure (PGI-AE) generates the predicted u wind field, for each pair of the i-th and (i + 1)-th columns of the u matrix, the difference between them is calculated in the positive x-direction, serving as a discrete approximation of ∂u/∂x; the row-wise differences of the predicted v field approximate ∂v/∂y in the same manner. Given this interpretation, the physical loss function can be expressed by Equation (1):

L_phys = (1/N) Σ_{i,j} (λ_u Δx u_{i,j} + λ_v Δy v_{i,j})²,  (1)

where Δx u_{i,j} = u_{i,j+1} − u_{i,j}, Δy v_{i,j} = v_{i+1,j} − v_{i,j}, and λ_u and λ_v are coefficients that eliminate the aforementioned difference in numerical magnitude. In this study, λ_u and λ_v have been defined to be identical, because the data are uniformly sampled in space (i.e., Δx = Δy). The values of λ_u and λ_v can also be viewed from the perspective of the loss function, as they specify the degree of emphasis placed on the physical loss throughout the training of the model. In this paper, λ_u and λ_v are adjusted so that the physical loss and the mean squared error (MSE) loss are commensurate in magnitude.
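A hedged sketch of the resulting loss computation, using column-wise differences of u and row-wise differences of v as the discrete continuity residual (the function name and the exact cropping of the difference arrays are illustrative assumptions):

```python
import numpy as np

def physical_loss(u, v, lam_u=1.0, lam_v=1.0):
    """Discrete continuity-equation penalty: column-wise differences of the
    predicted u field approximate du/dx, row-wise differences of v approximate
    dv/dy, and their weighted sum should vanish for a divergence-free 2D
    field. lam_u and lam_v play the role of the scaling coefficients and are
    equal here since the grid spacing satisfies dx = dy."""
    du_dx = np.diff(u, axis=1)[:-1, :]   # (h-1, w-1) differences along x (columns)
    dv_dy = np.diff(v, axis=0)[:, :-1]   # (h-1, w-1) differences along y (rows)
    div = lam_u * du_dx + lam_v * dv_dy  # discrete divergence residual
    return float(np.mean(div ** 2))
```

Because only array differences are involved, the penalty back-propagates through any convolutional decoder without requiring the fully connected structure that automatic differentiation of the governing equations would demand.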
In comparison with other methods of generating physical loss, the approach presented in this paper boasts several advantages: it is simple to expand and alter in accordance with a prescribed equation, features a rapid calculation speed, is devoid of time-related terms, and is disentangled from the discretization process of CFD. Evidently, the physical loss generation method demonstrated in this paper possesses certain constraints. Specifically, it can only be employed with data in a regular grid format. Nevertheless, this constraint already exists for convolutional neural networks. Therefore, the approach presented here does not impose any additional usage limitations.

Graph feature encoders
The primary objective of the graph feature encoders is to rearrange the data from Graph features 1-5 (obtained through RGCNNC) according to their corresponding sensor spatial positions and to further extract the information implied in Graph features 1-5. In particular, Graph feature i (i = 1, 2, 3, 4, 5) is characterized by an N × 9 × C_i data structure, where N represents the number of sensors, and 9 corresponds to the sensor together with its 8 neighboring points. Throughout the data processing phase in the information extraction part of RGCNNC, every change in the data structure resulting from each step of the process is meticulously documented. This enables the interpolation of the data in Graph feature i into a 128 × 128 zero matrix, taking into account the spatial positions of the corresponding sensors. This crucial process is prominently depicted in Figure 10.
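The rearrangement step can be sketched as a scatter of the N × 9 × C feature block into a zero matrix, given the recorded cell positions (a hedged illustration; the paper's exact bookkeeping may differ):

```python
import numpy as np

def scatter_graph_feature(feat, cell_rc, size=128):
    """Sketch of the rearrangement: place Graph feature i, shaped (n, 9, c),
    into a size x size zero matrix using the recorded spatial position of
    each sensor cell and its 8 neighbors.
    cell_rc: (n, 9, 2) integer row/col indices of every center/neighbor cell."""
    n, k, c = feat.shape
    out = np.zeros((c, size, size), dtype=feat.dtype)
    rows = cell_rc[..., 0].ravel()
    cols = cell_rc[..., 1].ravel()
    out[:, rows, cols] = feat.reshape(n * k, c).T   # scatter all n*9 cells
    return out
```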
After the rearrangement of the data based on their corresponding sensor spatial positions, Graph feature i is subsequently transmitted to and processed within the appropriate graph feature encoder (Graph Feature Encoder i), wherein Graph feature i is treated as an image with C_i feature channels. The graph feature encoders are purposefully and systematically designed based on the unique number of feature channels of each graph feature. The design framework of the graph feature encoders is broadly categorized into two primary components: the feature buffer convolution module and the feature dimension reduction convolution module. The feature buffer convolution module is composed of a dual-layered convolutional structure.
The initial convolutional layer compresses the spatial data information while retaining the original number of feature channels. Subsequently, the second convolutional layer maintains the existing data structure and enhances the establishment of the underlying data mapping relationship through the utilization of a 1 × 1 convolution kernel. The feature buffer convolution module is solely employed when operating on graph features containing 256 or more feature channels. The primary aim of this module is to alleviate the persistent and substantial pressure generated during the data information concentration phase. By reducing this pressure, the module plays a key role in preserving the maximal amount of information within the high-feature-dimension data. The feature dimension reduction convolution module is composed of two individual layers: the spatial dimension reduction convolution layer and the feature dimension reduction convolution layer, which are systematically integrated to condense the intrinsic data characteristics at the spatial resolution and feature channel scales, respectively. The feature buffer convolution module and the feature dimension reduction convolution module operate together to refine the corresponding graph feature size to a degree that enables seamless concatenation with the inner data structure of the main model (PGI-AE). As previously stated, the graph feature exclusively consists of values relating to the sensor in the center and its neighboring points, and the remaining points in the 128 × 128 data matrix are nullified to 0.
It can thus be deduced that the application of the graph feature can be interpreted as an attention mechanism (Li et al., 2023), selectively focusing exclusively on each sensor and its neighboring points. The structure of the graph feature encoder is visualized in Figure 11.
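The attention-like masking described above can be sketched in a few lines: feature values belonging to a sensor (or one of its neighboring points) are scattered into an otherwise all-zero 128 × 128 grid, so downstream convolutions see nonzero activations only at those locations. This is an illustrative sketch, not the paper's implementation; the function name and the flat (row, col) position format are assumptions.

```python
def embed_graph_feature(values, positions, size=128):
    """Scatter per-sensor feature values into a size x size grid,
    zeroing every cell that is not a sensor (or neighbor) location,
    i.e., the attention-like masking described above.
    values[k] is the feature at grid position positions[k] = (row, col)."""
    grid = [[0.0] * size for _ in range(size)]
    for (r, c), val in zip(positions, values):
        grid[r][c] = val
    return grid
```

In the full model, one such grid is produced per feature channel of the graph feature before it enters the corresponding graph feature encoder.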

PGI-AE structure
The key objective of PGI-AE is to reconstruct the wind field data for the entire research area at any wind attack angle, based on the real-time sparse u and v data received from the sensors. This task is achieved through the integration of the graph data provided by the auxiliary structure. The PGI-AE model is predominantly constituted of three primary components: the sensor message encoder, the graph feature encoder, and the comprehensive decoder. The sensor message encoder module, composed of Sensor Feature Extraction Blocks and skip-connection convolution modules, is exclusively dedicated to extracting multilevel features from the sensor data input. This is accomplished via a series of convolutional filters, each possessing a unique receptive field, which extract both localized and global information. The information obtained through this feature extraction process is then relayed to the corresponding comprehensive decoder module for further processing. The methodology employed for the sensor message encoder is inspired by the established UNet structure (Falk et al., 2019). The graph feature encoder module is a crucial component within the proposed framework, dedicated to processing the graph information received from the auxiliary structure at different depths.
The resulting processed data are then transmitted to the corresponding level of the comprehensive decoder module, yielding a robust and effective representation of the original input data. After both the sensor message encoder and the graph feature encoder modules complete feature extraction, the resulting information is combined by the comprehensive decoder module, producing a unified representation of the complete wind field data for the overall study area. Notably, the predicted wind field is subjected to the dual constraints of MSE loss and physical loss during the training phase, yielding a carefully refined, high-quality output. The fundamental structure of the proposed PGI-AE model is visualized in Figure 12. Within the domain of conditional generative models, the auto-encoder structure is among the most widely utilized techniques. This classical framework comprises a two-part structure featuring an encoder and a decoder. The encoder module undertakes the crucial task of extracting information from the input, while the decoder leverages this extracted information to enable conditional generation of the desired output. Thus, the structure in Figure 12 closely resembles the classic auto-encoder framework once the auxiliary structure (RGCNNC), physical loss, graph feature encoders (GFE i), and their counterparts within the comprehensive decoder are removed. As detailed within the results section, this remaining framework is referred to as the Basic Model. In addition to the auto-encoder framework, the most widely recognized conditional generation models comprise a range of models featuring GAN and Diffusion structures. Within the results section of this study, an exhaustive discussion of the limitations affiliated with models driven by these two fundamental structures
is presented. Such critical analysis is aimed at emphasizing the validity and superiority of the proposed model relative to these competing models, thus demonstrating its fundamental efficacy and performance.

RESULTS
This section presents the comprehensive experimental results of both the auxiliary structure and the main structure via a series of concise yet meticulous ablation experiments. These experiments primarily focus on the impact of auxiliary physical input data channels, physical loss, and the novel neural network structures proposed within this study on the resulting experimental outcomes. Additionally, a critical discussion of the inherent limitations of the GAN and Diffusion models with respect to the presented task is provided for comparative analysis. All experiments were conducted on an NVIDIA RTX 3090 GPU. The proposed model utilizes 5 GB of GPU memory and completes the prediction of a wind field for a single time step in less than 1 s.

Auxiliary structure (RGCNNC) results
From an intrinsic perspective, the primary task of RGCNNC is to leverage the point cloud data sourced from the sensors, in conjunction with their neighboring points, to determine the attack angle of the wind field. From a global structural standpoint, the auxiliary structure serves the crucial purpose of seamlessly providing primary graph feature information to the main structure. Thus, this section details the experimental results of RGCNNC in the context of these two distinct yet interrelated aspects.
In its capacity as a classifier, a comprehensive data set containing 3680 data time steps sourced from 16 distinct wind attack angles is employed for training. After 548 epochs, the training set classification accuracy attains 100%. When the trained model is applied to the 960 data time steps of the test set, a classification accuracy of 95.1% is achieved. This result validates the intrinsic ability of RGCNNC to efficiently extract pertinent wind field information contained within point cloud data while accurately differentiating the nuanced data features of distinct wind attack angles.
Operating as an auxiliary structure, RGCNNC facilitated the acquisition of a considerable volume of data features. Owing to space constraints, only the noteworthy properties of Graph feature 1, which comprises 64 distinct feature channels, are highlighted here. Focusing on the salient data of the 2nd, 33rd, and 64th feature channels, concrete evidence is presented of the inherent proficiency of the model in accurately extracting discrete data features and reassigning them to their corresponding distinct feature channels (Figure 13).

Single wind attack angle results
As stated in the research conducted by Karniadakis et al. (2021), there are three predominant techniques for performing physical information fusion within neural networks: the integration of physical information into the input data, the utilization of a customized network architecture, and the adoption of a loss function based on physical equations. The current study introduces a specially designed network structure that includes a loss function based on physical equations. Furthermore, the study aims to enrich the input data with as much physical information as possible. This was achieved by including Binary and SDF data in the auxiliary structure and adding them to the Basic model for ablation experiments that evaluated their effects on model performance. For the ablation experiment, the data collected under the north-west (NW) wind attack angle were selected. Theoretically, this wind direction corresponds to the closest value ranges for u and v, ensuring that the model performance verification is not impacted by the differences between these value ranges. The training data set consisted of 230 time steps (samples), while the test data set comprised 60 time steps (samples). In evaluating the performance of the model on the test data set, four indicators were primarily used: R², MSE, RMSE, and MAE.
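For reference, the four indicators can be computed as follows. This is a standard, minimal pure-Python sketch applied to flattened field values, not the authors' evaluation code; the function name is an assumption.

```python
import math

def field_metrics(y_true, y_pred):
    """Compute R^2, MSE, RMSE, and MAE between a reference field and a
    reconstructed field, both given as flat lists of values."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # fraction of variance explained
    return {"R2": r2, "MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae}
```

In the boxplot figures, each sample contributes one such set of metrics, computed per time step of the test set.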
The statistical values of the four aforementioned indicators for both the u and v reconstruction experiments are presented in Figure 14. The term "Sensor" indicates the input mode wherein only u and v sparse sensor data are fed into the Basic model. By contrast, Binary, SDF, and Binary+SDF refer to the inclusion of the respective auxiliary channel data into the input data alongside the Sensor mode inputs. In the boxplots presented, the upper and lower edges of the boxes signify the upper and lower quartiles, respectively. The median is represented by the central orange line. Meanwhile, the highest and lowest horizontal lines indicate the maximum and minimum values of the data set, while outliers are marked with circles. Finally, the mean value is reflected as a green triangle. As such, the height of the box and the degree of outlier deviation in the boxplot are directly proportional to the extent of model performance fluctuation. Similarly, the mean value and median reflect the overall model performance, providing insights into model performance with and without the impact of outlier samples.
Figure 14 indicates that the overall model performance remains relatively consistent across the various input modes, as indicated by the insignificant differences in box position. However, the Sensor and SDF input modes demonstrate superior overall performance in the u-related indicators and exhibit greater stability, as evidenced by the reduced number of outliers and minimal degree of deviation. Based on these findings, these two input modes are selected for further experimentation. It is evident from this study that an increase in the physical information of the model input data does not yield a significant impact. Moreover, the superior performance reported in the study conducted by Høiness et al. (2021) could not be replicated. This could potentially be attributed to the complexity of the flow field, the different neural network structure, or the utilization of different input data (sparse data) in this study.
In light of the experimental results presented above, the physical loss item was incorporated only with the Sensor and SDF input modes to assess its combination effect. The experiment results are illustrated in Figure 15. Incorporating the physical loss has improved nearly all the indicators included in Figure 15. In terms of v, the box of the Sensor input mode has shifted downward, whereas the position of the SDF+PL box has remained relatively unchanged. This finding suggests that the physical loss did not result in significant improvement in predicting v under the SDF input mode, considering negligible abnormal samples. Consequently, regarding the overall model performance, it can be inferred that the Sensor+PL combination yields optimal results, and the effectiveness of incorporating physical loss to enhance the overall model performance has been validated. A thorough evaluation of the model performance stability, specifically the box height, suggests that incorporating the physical loss effectively optimizes the performance of both input modes in terms of the R², MSE, and MAE metrics for u and the MSE and RMSE metrics for v. However, solely the SDF+PL combination exhibited a decrease in the box height for the u RMSE and v R² indicators, and solely the Sensor+PL combination showed a reduction in box height for the v MAE. Thus, when the extreme samples are considered, the SDF+PL input mode performs the best; when the extreme samples are ignored, the Sensor+PL combination yields the superior outcome. Furthermore, the effectiveness of the physical loss in enhancing the overall model performance stability has been validated. An assessment of the extreme performance of the model, chiefly the deviation degree of outliers, reveals that incorporating the physical loss either reduces the number of outliers or restricts their deviation to the original (Sensor or SDF) maximum value
range. This finding holds true for almost all indicators. It is worth noting that the phenomenon of outliers reaching beyond the original maximum value range is solely evident in the u MAE of the Sensor+PL input mode. Nonetheless, upon the removal of this outlier, significant optimization of the Sensor+PL input mode performance concerning this metric is apparent in both overall performance and stability. Integrating the three aforementioned perspectives of evaluation, the Sensor+PL input mode combination excels concerning the overall model performance. Notably, the physical loss enhances the performance of both input modes concerning model stability. Consequently, Sensor+PL is considered the ideal combination and is incorporated into the subsequent round of experiments.
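The physical loss penalizes violations of the 2-D continuity equation ∂u/∂x + ∂v/∂y = 0 on the predicted field. A minimal finite-difference sketch of such a loss term is shown below; the paper evaluates this kind of gradient stencil with fixed convolution kernels inside the network, so this pure-Python version is an illustrative stand-in rather than the exact implementation.

```python
def continuity_residual(u, v, dx=1.0, dy=1.0):
    """Mean squared residual of the 2-D continuity equation
    du/dx + dv/dy = 0, evaluated with central differences on the
    interior of the grid. u and v are 2-D lists indexed [row][col],
    where rows advance along y and columns along x."""
    ny, nx = len(u), len(u[0])
    total, count = 0.0, 0
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dudx = (u[j][i + 1] - u[j][i - 1]) / (2 * dx)
            dvdy = (v[j + 1][i] - v[j - 1][i]) / (2 * dy)
            total += (dudx + dvdy) ** 2
            count += 1
    return total / count
```

In training, such a residual would be added to the MSE loss with a weighting coefficient, nudging the reconstructed field toward mass conservation.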
After conducting the aforementioned two-step inspection, it can be deduced that the Sensor and Sensor+PL models exhibit superior performance. Consequently, these two models were selected for comparison with the proposed model, and the corresponding comparative results are depicted in Figure 16. The model performance was assessed based on several key parameters: the position of the box, which represents the overall model performance; the height of the box, which reflects the overall stability of the model; and the degree of deviation exhibited by the outliers, which serves as an indicator of the extreme performance of the model. Based on the R² analysis, the proposed model outperforms the other two models for both u and v, as its R² values are significantly closer to 1. This finding suggests that the auxiliary structure provides information that is similar to that of the input data for a given attack angle, resulting in a higher level of interpretability of the input data to the model output. In terms of RMSE, MSE, and MAE, the performance of the proposed model for u is significantly better than that of the other two models. Specifically, the position of the box for the proposed model is notably lower than that of the other models, indicating superior overall performance. In terms of the stability of the model performance, the box height for the proposed model is comparable to that of the Sensor+PL model; however, the proposed model exhibits no outliers in the results. This finding suggests that the proposed model is more stable than the Sensor+PL model. Regarding v, the performance of the proposed model falls between that of the Sensor and Sensor+PL models. Specifically, the position of the box for the proposed model is intermediate to that of the Sensor and Sensor+PL models, signifying that the overall performance of the proposed model is in between those of the other two models. The box heights for all three models are similar, and the deviation degree
of the outliers for the proposed model is also intermediate to those of the Sensor and Sensor+PL models, suggesting that the performance stability of the proposed model is also intermediate to that of the other two models. Taking into account the R² analysis, it can be inferred that the auxiliary structure can provide information similar to the input data for a given attack angle. However, this information lies on the boundary between effective and redundant information for the main structure to address the task of single attack angle reconstruction. Figure 17 presents a visualization of the best and worst results of the three models (excluding outliers), where the u RMSE serves as the evaluation criterion. The visualization outcomes suggest that all three models exhibit excellent performance stability in the single attack angle task. Moreover, the performance of the proposed model is comparable to that of the other models, with no significant differences observed. Hence, it is deemed most effective to solely incorporate physical loss into the main structure for reconstructing wind fields for single attack angles. Nonetheless, in reality, an area is seldom accompanied by a single wind direction. Thus, the multiple wind attack angle outcomes, as presented in the following section, hold significant practical and valuable implications.
To facilitate performance comparison in the single attack angle task, a GAN-structure-based model is also adopted. Given the similarity between the super-resolution task, which can be regarded as an image generation task based on image constraints, and the task type addressed in this paper, the SRGAN model proposed by Ledig et al. (2017) is adopted for performance comparison purposes. Additionally, the super-resolution coefficient of the model is modified to 1 to align with the data structure of the current task. Table 1 presents the average statistical outcomes of the SRGAN model when there is no physical information input, along with the other models discussed in this study. The analysis reveals that the GAN-structure-based model performs worse than any model founded on the auto-encoder structure utilized in this paper. With regard to the input mode involving any auxiliary input channel data highlighted in this study, it is important to note that the GAN model is unable to converge. This phenomenon can be attributed primarily to differences in the loss function. The loss function employed in auto-encoder structures is typically the MSE between the model-generated image and the corresponding real image.
While this approach provides a direct and comprehensive means to gather information, its drawback lies in the fact that the MSE-guided model generates blurred results due to the averaging of features across all samples. As a consequence, the model may fail to capture the fine details and nuances of individual samples. In the domain of image generation, one of the foremost strengths of the GAN-based model is its ability to produce sharp and clear results with minimal blurring. This goal is typically accomplished by incorporating a loss function comprising two key elements: the score given by the discriminator and the extracted upper image features. The score denotes the level of confidence with which the discriminator perceives the image produced by the model as a genuine representation. The extracted upper image features pertain to the utilization of a trained multilayer convolutional architecture, such as the initial 18 layers of the VGG-19 model, as a feature-extraction mechanism to extract relevant high-level features from both the model-generated and real images. This process yields a pair of upper image features with lower spatial resolution and larger feature channels. The MSE between these two features is then used as part of the loss function. Under these constraints, the generator is optimized to achieve the highest possible discriminator score and the smallest possible MSE of the extracted upper image features. It is evident that the loss function of the GAN introduces a greater level of complexity to the task of the generator, in comparison to that of the auto-encoder. The method of extracting upper image features presents a significant challenge for GANs when attempting to extract meaningful information from sparse data. In cases where sparse data are augmented with inputted physical information, there is a high likelihood that the sparse sensor data will be disregarded, resulting in a failure of the model to converge. As a result, GANs require adequate and
efficacious input data to serve as a foundation in order to surmount numerous challenges and successfully accomplish tasks with proficiency. However, for this particular task, the information available from the input data is both limited and inadequate, making it an exceptionally challenging endeavor for GANs.
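The SRGAN generator objective described above can be summarized as a content term (MSE between high-level features, e.g. VGG-19 activations, of the generated and real images) plus a small adversarial term −log D(G(x)). The sketch below, with feature maps flattened to lists and the 10⁻³ weighting taken from Ledig et al. (2017), is only a schematic of that loss, not the comparison model's actual code.

```python
import math

def srgan_generator_loss(disc_score, feat_fake, feat_real, adv_weight=1e-3):
    """Schematic SRGAN perceptual loss for the generator:
    content term = MSE between high-level feature vectors of the
    generated and real images; adversarial term = -log(D(G(x))),
    where disc_score is the discriminator's confidence in (0, 1]."""
    n = len(feat_fake)
    content = sum((a - b) ** 2 for a, b in zip(feat_fake, feat_real)) / n
    adversarial = -math.log(disc_score)
    return content + adv_weight * adversarial
```

With sparse sensor input, the feature-extraction stage sees mostly zeros, so both terms provide weak gradients, which is consistent with the convergence failure observed above.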
Concerning the diffusion-structure-based model, the model that best fits the requirements of this study is the SRDiff model proposed by Li et al. (2022). The SRDiff model imposes more rigorous demands on the input data information. The method commences by utilizing standard interpolation methods, such as bicubic interpolation, to conduct preliminary super-resolution on low-resolution data. The images then undergo refinement via the diffusion model. The present task cannot be accomplished using this technique because traditional interpolation methods cannot transform sparse data into a complete data matrix. As a result, the auto-encoder structure is more efficient than the other two generative structures for translating sparse data into full-field data via the use of the MSE between the synthesized image and the authentic image as the loss function.
Furthermore, Table 1 presents a comparison of the variational auto-encoder (VAE) model using a fully connected structure and the Transformer model employing convolution and self-attention with the other models discussed in this paper. The experimental results indicate that, despite the incorporation of prior knowledge, the fully connected framework faces challenges in addressing complex reconstruction problems. While the Transformer model outperforms the VAE, it falls short in comparison with the other models examined in this study.

Multiple wind attack angles results
In this section, the Sensor, Sensor+PL, and proposed models, which exhibited the most superior performance in the single wind attack angle test, underwent further performance testing under multiple wind attack angles. The statistical results of the test set, comprising 960 samples, are depicted in Figure 18. The statistical findings illustrated in Figure 18 indisputably validate the advanced capabilities of the proposed model in complex urban wind fields, particularly under varied wind attack angles. From a statistical standpoint, only the primary value range of the proposed model conforms to a reasonable distribution between 0 and 1, as evidenced by the R² metric. Conversely, the R² values of the remaining two models suggest the absence of a meaningful causal relationship between the input and output variables. As previously stated in Section 4.2.1, the task at hand involves a nonlinear mapping, which may limit the efficacy of R² as a standalone metric for evaluating model performance. As such, it is advisable to consider the other three indicators in conjunction with R² when judging the effectiveness of the model. Regarding the evaluation of the proposed model, it is important to consider the RMSE, MSE, and MAE metrics for both u and v, in which the model exhibits exceptional performance. Utilizing the mean value (represented by a green triangle) as the criterion, a significant decrease was observed in the errors for u when compared to the Sensor input mode, with reductions in RMSE, MSE, and MAE of 51.96%, 78.85%, and 55.94%, respectively. For v, reductions of 53.65%, 81.07%, and 57.20% were observed, respectively, compared to the Sensor input mode. In comparison to the Sensor+PL input mode, reductions in the u errors of 48.94%, 76.04%, and 52.61% were observed, respectively, and for v, reductions of 46.12%, 73.47%, and 49.39% were recorded. Furthermore, considering the overall
performance of the model, including the position of the boxplot, the height of the boxplot indicating performance stability, and the deviation degree of outliers indicating extreme performance, the proposed model emerges as the clear leader. The aforementioned results serve as a compelling validation of the exceptional ability of the proposed model to address the challenge of sparse sensor-based high-speed reconstruction of the urban wind field at multiple wind attack angles. Moreover, these results are derived from a statistical perspective, which further reinforces the robustness and reliability of the proposed solution.
To provide a more intuitive comparison of the performance of each model under various wind attack angles, the average test metrics of each model are compiled in Table 2. Moreover, in order to evaluate the impact of different sensor placement schemes, Table 2 also presents the average statistical metrics of wind field reconstruction for the proposed model using 30 randomly selected sensor positions. The comparison results reveal an interesting phenomenon: when sensors are randomly distributed, the average metrics of the model tend to exhibit smaller errors. To further investigate this phenomenon, the wind field reconstruction results at the time step with the best performance in the multiple wind attack angles scenario (25.6 s) were visualized for the two sensor placement schemes, as depicted in Figure 19. It is evident that although the model demonstrates better average metrics under the random sensor placement scheme, it struggles to accurately reconstruct the corresponding area due to a lack of sufficient sensors in the wind speed transition zone. Therefore, effective sensor placement has a positive impact on enhancing the predictive performance of the proposed model. Similar to the single attack angle condition, visual representations of the best and worst outcomes are provided (with the u RMSE used as the selection criterion) for the three models (excluding outliers) in Figure 20. It is evident that in cases involving multiple wind attack angles, the Sensor input mode does not provide an effective solution for reconstructing the directional wind field based on sparse sensor data. The results obtained under this approach can be viewed as an average outcome for all wind directions, lacking the specificity required for satisfactory performance. Similarly, the results obtained using Sensor+PL exhibited minimal improvement in overall performance, despite the reduction in RMSE values achieved through the incorporation of physical loss. However, for the
proposed model, the situation has markedly improved. First, the reconstructed wind field exhibits clear directionality, a crucial factor in accurately predicting the wind field. Second, the reconstruction results effectively capture the regions of maximum wind speed within the wind field, providing a highly detailed and informative representation. Additionally, Figure 20 shows significantly different performance for u and v in both the Sensor and Sensor+PL models. However, in the proposed model, there is no significant difference in RMSE between u and v. A possible explanation is that, without auxiliary structures to reinforce sensor information mining, the performance of the primary structure declines more sharply when facing a wind field with stronger fluctuations or sudden changes. For example, in Figure 20a, the fluctuations in (iv) are significantly stronger than those in (iii), resulting in a corresponding increase in the RMSE value of u. In Figure 20b, when compared to (iii), the wind speed in the urban area of (iv) experiences greater differences or sudden changes relative to the outside of the urban area; consequently, the RMSE value of v also increases. In the context of urban wind fields, such reconstruction results prove highly valuable as they can rapidly facilitate various applications, such as wind disaster warning, wind energy collection, pollutant dispersion analysis, and urban drone route planning. Furthermore, these outcomes can also be used as initial field inputs for more in-depth simulations to accelerate simulation convergence. It is worth noting that this study employed a total of 16 wind attack angles, with an interval of 22.5°, in line with the urban wind field benchmark of the Architectural Institute of Japan (AIJ). However, for more specific wind attack angle data requirements, employing an appropriate database for training becomes necessary. The exceptional performance of the proposed model in the current
case suggests that it holds tremendous potential for application in scenarios involving more extensive wind attack angle data.

CONCLUSIONS
This study presents a novel DL model based on the auto-encoder architecture, specifically designed for addressing the issue of sparse sensor-based rapid reconstruction of complete urban wind fields. A comprehensive validation was conducted for both single wind attack angle and multiple wind attack angle conditions, utilizing a range of quantitative indicators to assess model performance.
Within the context of this study, the wind field data obtained at a height of 2 m have been selected to empirically examine the efficacy of the proposed model. Nevertheless, following adequate training incorporating height-specific data, the model can be appropriately adapted and configured for computation across a diverse range of height planes. This paper delves into the influence of physical information on model performance across all three critical components: input data, neural network structure, and loss function. Regarding input data, this study investigates the influences of sensor wind speed (Sensor data), building location information (Binary data), and spatial distance information (SDF data) on model reconstruction performance. Results suggest that, despite incorporating physical information in the form of input data, optimal model performance for the task at hand remains elusive.
With regard to the neural network architecture, an auxiliary structure founded on the GCN has been specifically designed for the present task. The aim of this structure is to unveil and fortify the correlation between each sensor and its neighboring environment. This information is subsequently transmitted to the main structure in the form of an attention-like mechanism, with the objective of augmenting the overall ability of the algorithm to extract and utilize information. The experimental findings demonstrate that, in the presence of multiple wind attack angles, the integration of the auxiliary structure results in a notable decrease in model prediction error of approximately 50% for both RMSE and MAE, and approximately 75% for MSE. However, in the case of the single wind attack angle task, the optimization effect of the auxiliary structure is not particularly notable.
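The reported reductions are mutually consistent: since MSE = RMSE², a fractional RMSE reduction r implies an MSE reduction of 1 − (1 − r)², so a ~50% RMSE drop corresponds to a ~75% MSE drop. The helper below is a sanity-check sketch of this relationship, not code from the paper; exact agreement with the tabulated values is not expected because the paper averages metrics per sample.

```python
def mse_reduction_from_rmse(rmse_reduction):
    """Given a fractional reduction in RMSE, return the implied
    fractional reduction in MSE (because MSE = RMSE ** 2)."""
    return 1.0 - (1.0 - rmse_reduction) ** 2
```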
Regarding the loss function, a novel physical loss function tailored specifically for convolutional architectures is presented in this paper. This loss function possesses numerous advantages, including ease of extension to other gradient-based physical equations, rapid computation, and no limitation on the discretization format. Experimental evidence demonstrates that the proposed physical loss function positively impacts the performance of the model in both single and multiple attack angle conditions. Furthermore, when applied under a single attack angle condition, the physical-loss-equipped Basic model operating in Sensor input mode (Sensor+PL) is deemed optimal when compared to the other models mentioned in this paper. Additionally, this research showcases that the auto-encoder structure exhibits superior adaptability when tasked with constructing full-field data based on sparse information, in comparison to well-known DL models such as GAN and Diffusion structures.
In summary, this study proposes a specially designed neural network structure for the rapid reconstruction of urban wind fields based on sparse sensors. This approach considers three key physical aspects, namely, observational biases, inductive biases, and learning biases (Karniadakis et al., 2021), and showcases the exceptional capabilities of the proposed model. Consequently, this novel DL model serves as an effective tool for acquiring wind field data through optimally arranged sparse sensors. This approach can provide the highly effective fundamental wind field information needed for research focused on urban wind disaster warning, heat island effects, urban pollutant diffusion, and urban drone route planning. However, there are still some shortcomings. For example, owing to the difficulty of obtaining real-world data, this study has not yet been validated against actual measurements. The sparsity of the sensor data used in this study may also cause problems: when there are dramatic wind speed changes at locations far away from any sensor, the model will find it difficult to make accurate predictions. In future research, we aim to investigate the performance of the algorithm using real-world data and address potential concerns such as noise interference.

ACKNOWLEDGMENTS
This study is supported by Shenzhen Science and Technology Program (SGDX20210823103202018, KQTD20210811090112003), National Natural Science Foundation of China (52278493, 52108451), and the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. T22-504/21-R).
Open access publishing facilitated by The University of Sydney, as part of the Wiley - The University of Sydney agreement via the Council of Australian University Librarians.
[Correction added on March 20, 2024, after first online publication: CAUL funding statement has been added.]

FIGURE 1 Information of computational fluid dynamics (CFD) simulations.

FIGURE Input and label data of the main structure.

FIGURE Structure of point cloud data: Green grids/circles are optimal sensors; the eight colorful grids around each green grid are neighboring points.

The proposed model consists of the auxiliary and main structures. The auxiliary structure (residual graph convolution neural network classifier [RGCNNC]) has two tasks: (1) accurately determining the wind attack angle of the wind field based on the point cloud data obtained from all sensors and their neighboring points and (2) providing high-dimensional graph features to the main structure to help reconstruct the whole wind field. The main structure (physics & graph-informed auto-encoder [PGI-AE]) has only one task: reconstructing the whole wind field in real time based on sparse sensor data and the high-dimensional graph features provided by the auxiliary structure (RGCNNC).
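The sensor-plus-neighbors point cloud described above can be assembled by simple index gathering. The sketch below is illustrative only: the function name, offset ordering, and node layout are assumptions, not the paper's RGCNNC input code. It stacks each sensor grid cell with its eight surrounding cells to form one node per sensor:

```python
import numpy as np

def gather_point_cloud(field, sensor_idx):
    """For each sensor grid cell, stack its value with the values of the
    eight surrounding cells, yielding one 9-feature node per sensor.
    `field` is a 2-D wind-component grid; `sensor_idx` is a list of
    (row, col) interior positions."""
    offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]
    nodes = []
    for i, j in sensor_idx:
        nodes.append([field[i + di, j + dj] for di, dj in offsets])
    return np.asarray(nodes)  # shape (num_sensors, 9); index 4 is the sensor itself

field = np.arange(25, dtype=float).reshape(5, 5)
pc = gather_point_cloud(field, [(2, 2)])
print(pc.shape)  # (1, 9)
```

In practice one such node block would be built per wind component and fed to the graph layers, with graph edges connecting each sensor node to its neighbors.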

FIGURE 5 Details of the get graph feature (GGF) layer: Gray operators represent element-wise operations; green operators represent channel-wise operations; black dots represent omitted data; the black arrow represents tensor form transformations.

FIGURE Details of the point cloud and graph features enrichment (PCGFE) layer: The black arrow represents the tensor concatenating operation.

FIGURE 8 Auxiliary structure: residual graph convolution neural network classifier (RGCNNC). The purple dotted line in the information extraction part represents the residual connection, and the two connected data streams are concatenated in the feature dimension.
to negative along the x-axis (i.e., from right to left, x_{i+1} − x_i), emitting ΔUx data, which signifies the variation of Ux upon a displacement of Δx in the x-direction. As a result, ∂Ux/∂x and ΔUx/Δx possess the same physical connotation, albeit with a certain variation in numerical magnitude. Similarly, the association between ∂Uy/∂y and ΔUy/Δy also adheres to the aforementioned derivation.
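Under the assumption that the two velocity components are tied together by the standard 2-D incompressible continuity equation (consistent with the continuity-equation loss described earlier), the discrete approximation argued above can be written as:

```latex
% 2-D incompressible continuity equation
\frac{\partial U_x}{\partial x} + \frac{\partial U_y}{\partial y} = 0
% forward-difference approximation on the prediction grid
\frac{\Delta U_x}{\Delta x} + \frac{\Delta U_y}{\Delta y} \approx 0,
\qquad
\Delta U_x = U_x(x_{i+1}, y_j) - U_x(x_i, y_j),
\quad
\Delta U_y = U_y(x_i, y_{j+1}) - U_y(x_i, y_j)
```

The physical loss then penalizes the magnitude of the left-hand residual over the predicted field, which requires only shifted subtractions and is independent of any particular discretization format.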

FIGURE Generation of physical loss: Numbers in the ΔUx grids indicate the columns where the Ux data are located, and numbers in the ΔUy grids indicate the rows where the Uy data are located; ΔUx(i) indicates the ΔUx data in the ith column, and ΔUy(j) indicates the ΔUy data in the jth row; Δx indicates the side length of the grid in the x direction, and Δy indicates the side length of the grid in the y direction; Cut indicates data cutting; Sum indicates an element-wise summation of data.

FIGURE 10 Details of graph feature format conversion: The first tensor dimension is the number of feature channels of graph feature i and the second is the number of sensors; the white block on the right of the figure represents a value of 0.

FIGURE Details of graph feature encoder i (GFE i): FBCM is the abbreviation of feature buffer convolution module, FDRCM is the abbreviation of feature dimension reduction convolution module, ReflectionPad pads only one pixel along the edges of the image, and ×(i − 2) denotes that FDRCM is repeated (i − 2) times. When i = 1, only the modules in the dashed box are executed; when i = 2, only one FDRCM module is kept.
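The depth rule in this caption (i = 1 uses only the dashed-box modules, i = 2 keeps a single FDRCM, and each further increment of i adds one FDRCM repeat) can be made concrete with a small helper. This is one reading of the caption, offered as an illustrative sketch rather than the paper's published layer table:

```python
def gfe_module_plan(i):
    """Module sequence for graph feature encoder i (GFE i), following
    one interpretation of the caption's repetition rule."""
    if i == 1:
        return ["FBCM"]                      # dashed-box modules only
    # one base FDRCM at i = 2, plus (i - 2) additional repeats beyond that
    return ["FBCM"] + ["FDRCM"] * (i - 1)

print(gfe_module_plan(2))  # ['FBCM', 'FDRCM']
print(gfe_module_plan(4))  # ['FBCM', 'FDRCM', 'FDRCM', 'FDRCM']
```

Encoding the rule this way makes the variable-depth encoders easy to instantiate in a loop over i, with each extra FDRCM reducing the feature dimension one more step.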

FIGURE The proposed model: physics-informed graph-assisted auto-encoder. PGI-AE is the abbreviated name for the main structure (physics & graph-informed auto-encoder), RGCNNC is the abbreviated name for the auxiliary structure (residual graph convolution neural network classifier), ReflectionPad pads only one pixel along the edges of the image, Channel Cat represents concatenation along the feature channel dimension, the upsampling factor of Up Sample is 2, and GFE i is the abbreviated name for graph feature encoder i.

FIGURE 14 Statistical test results of the Basic model with assistant data channels for a single wind attack angle: Sensor represents the input mode in which only Ux and Uy sparse sensor data enter the Basic model; Binary, signed distance function (SDF), and Binary+SDF, respectively, represent the addition of the corresponding auxiliary input channel data on the basis of the Sensor input mode. The green triangle represents the mean value, and the orange line represents the median value.
Figure 14 indicates that the overall model performance remains relatively consistent across the various input modes, as indicated by the insignificant differences in box position. However, the Sensor and SDF input modes demonstrate superior overall performance in the Ux-related indicators and exhibit greater stability, as evidenced by the reduced number of outliers and minimal degree of deviation. Based on these findings, these two input modes are selected for further experimentation. It is evident from this study that an increase in the physical information of the model input data does not yield a significant impact. Moreover, it is noteworthy that the superior performance obtained in the study conducted by Høiness et al. (2021) could not be replicated. This could potentially be attributed to the complexity of the flow field, the different neural network structure, or the utilization of different (sparse) input data in this study. In light of the experimental results presented above, the physical loss term was incorporated only with the Sensor and SDF input modes to assess its combined effect. The experiment results are illustrated in Figure 15. Incorporating the physical loss has shifted nearly all the indicators in Figure 15 toward positions suggesting better model performance. An analysis of the overall model performance, particularly the position of the box for the MSE and RMSE indicators for Ux, as well as the MSE, RMSE, and MAE metrics for Uy, indicates that incorporating the physical loss leads to a downward shift in the position of the boxes, that is, an overall improvement in model performance. For the MAE metric
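The indicators compared in these box plots are the standard field-reconstruction metrics. A minimal NumPy sketch of how they can be computed per wind component (the function name and dictionary layout are illustrative, not the paper's evaluation code):

```python
import numpy as np

def field_metrics(pred, true):
    """MSE, RMSE, MAE, and R^2 between a reconstructed and a reference
    wind-component field, matching the indicators reported here."""
    err = pred - true
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(err))
    ss_res = np.sum(err ** 2)                      # residual sum of squares
    ss_tot = np.sum((true - true.mean()) ** 2)     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

true = np.array([[0.0, 1.0], [2.0, 3.0]])
pred = true + 0.5                                  # uniform 0.5 offset
m = field_metrics(pred, true)
print(m["RMSE"])  # → 0.5
```

Lower MSE/RMSE/MAE and higher R² correspond to the "downward shift" of the boxes described above.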

FIGURE 16 Statistical test results of the Basic model, the physics-loss-embedded Basic model, and the proposed model with Sensor input mode for a single wind attack angle: Sensor represents the input mode in which only Ux and Uy sparse sensor data enter the Basic model; PL represents the embedding of the physical loss. The green triangle represents the mean value, and the orange line represents the median value.

FIGURE The prediction test results of the single wind attack angle condition: Sensor represents the input mode in which only Ux and Uy sparse sensor data enter the Basic model, PL represents the embedding of the physical loss, and Proposed model represents the model proposed in this paper. Ux RMSE is selected as the criterion of model performance.

TABLE 1 Average test results of models for a single wind attack angle: Avg is the abbreviation of average; Sensor represents the input mode in which only Ux and Uy sparse sensor data enter the Basic model; Binary, SDF, and Binary+SDF, respectively, represent the addition of the corresponding auxiliary input channel data on the basis of the Sensor input mode; FC corresponds to the fully connected structure; F indicates that the corresponding model fails to converge.

FIGURE 18 Statistical test results of the Basic model, the physics-loss-embedded Basic model, and the proposed model with Sensor input mode for all wind attack angles: Sensor represents the input mode in which only Ux and Uy sparse sensor data enter the Basic model. The green triangle represents the mean value, and the orange line represents the median value.

FIGURE The prediction test results of the all wind attack angles condition: Sensor represents the input mode in which only Ux and Uy sparse sensor data enter the Basic model, PL represents the embedding of the physical loss, and Proposed model represents the model proposed in this paper. Ux RMSE is selected as the criterion of model performance.

TABLE 2 Average test results of models for all wind attack angles: Avg is the abbreviation of average; Sensor represents the input mode in which only Ux and Uy sparse sensor data enter the Basic model; PL represents the embedding of the physical loss; RS means random sensor arrangement scheme. Columns: Avg Ux R², Avg Ux MSE, Avg Ux RMSE, Avg Ux MAE, Avg Uy R², Avg Uy MSE, Avg Uy RMSE, Avg Uy MAE.

FIGURE 19 Influence of the sensor arrangement.