Novel Geometric Parameters for Assessing Flow Over Realistic Versus Idealized Urban Arrays

Urban heterogeneity, such as the variation of street layouts, building shapes, and building heights, cannot be fully represented by density parameters commonly used in idealized urban environmental analyses. To address this shortcoming and better model flow fields over complex urban neighborhoods, we propose two novel descriptive geometric parameters, alignedness and building facet entropy, which quantify the connectivity of inter‐building spaces along the prevailing wind direction and the variation of building facet orientations, respectively. We then conducted large eddy simulations over 101 urban layouts, including realistic urban configurations with uniform building height as well as idealized building arrays with variable heights, and evaluated the resulting bulk flow properties. Urban canopy flow over realistic neighborhoods resembles staggered building arrays for low urban densities but becomes similar to aligned configurations beyond λp ∼ 0.25 where the realistic flow is less sensitive to changes in density. We further show that compared to traditional density parameters (such as plan and frontal area densities), the mean alignedness, a measure of connectivity of flow paths in street canyons, better predicts canopy‐averaged flow properties. Furthermore, for realistic urban flow, the dispersive momentum flux shows a clear increasing trend with building density, and a decreasing trend with alignedness, which is in contrast with idealized cases that exhibit no clear trend. This distinct behavior further highlights the necessity of evaluating flow over realistic urban layouts for flow parameterization. This study provides an improved method of describing urban layouts for flow characterization that can be applied in neighborhood‐scale urban canopy parameterization.

Numerical simulations (such as large-eddy simulation (LES) or Reynolds-averaged Navier-Stokes (RANS) models) of idealized urban layouts provide critical information for many applications. For example, urban canopy parameterization (UCP) can be developed from horizontally averaged flow information in idealized urban configurations (Nazarian et al., 2020; Santiago et al., 2013), which enables larger-scale models to take urban surface effects into account. Idealized urban surfaces are also well suited to studying the three-dimensional spatial variability of urban flow with thermal stratification (Li & Bou-Zeid, 2019; Nazarian & Kleissl, 2016), drag force distribution (Santiago et al., 2008), and street ventilation (Buccolieri et al., 2019) in the urban canopy. However, idealized urban canopy flow is not easily comparable with flow over realistic urban surfaces, where turbulent length scales are less uniform due to intrinsic urban heterogeneity (Sharmin et al., 2017). As a result, idealized studies can significantly oversimplify flow dynamics in urban canyons, which limits the scalability of idealized microscale simulations to real-world applications.
With increased computational power and access to high-resolution urban geometry data sets, neighborhood- or city-scale simulations over realistic urban layouts have become more feasible, further demonstrating the necessity of considering urban heterogeneity in applications. For example, Bou-Zeid et al. (2009) conducted simulations over a suburban area characterized by discontinuous clusters of low buildings with different geometric details, demonstrating that the geometric representation of the urban canopy can affect the mean and turbulent flow properties. Geometric heterogeneity also modifies the aerodynamic characteristics of urban space by altering the form drag applied to the canopy flow driven by the external pressure gradient in the Roughness Sublayer (RSL). Studies of realistic urban layouts also reveal the deficiencies of idealized array assumptions (Giometto et al., 2016). Accordingly, flow parameterizations obtained from idealized arrays, even those with variable building heights, cannot fully represent realistic urban areas (Kanda et al., 2013). Focusing on outdoor thermal comfort, for example, Letzel et al. (2012) found that street orientation, the interconnectivity of air spaces, and building heights could have profound impacts on the wind environment. Further, Santiago et al. (2017) used a RANS model to assess the evolution of the heterogeneous distribution of pollutants over time, highlighting the importance of adopting realistic geometry in real-world applications.
However, only a limited number of scenarios have been studied, and inter-study comparison is not feasible because simulation setups (such as boundary conditions, level of detail in geometry (Biljecki et al., 2016), and domain resolution) differ with the purpose of each study, making quantitative conclusions on the impact of realistic configurations nearly impossible. To make matters worse, despite ever-increasing computational power, it is not feasible to run microscale simulations for every urban neighborhood of concern, at least in the foreseeable future (Blocken, 2018). Therefore, classifications and characterizations of the urban canopy flow based on key morphological parameters are needed. One of the most useful tools to address this challenge is the urban morphometric method, which reduces the inter- and intra-variability of urban surfaces by relating geometric parameters to bulk flow properties. The method starts with microscale simulations of idealized arrays for different scenarios, followed by regressions of flow parameters against the morphological parameter of interest, and yields urban flow models as the final outcome (Santiago et al., 2013). Conventionally, urban density is the most common parameter applied for this purpose (Chokhachian et al., 2020; Nazarian et al., 2020; Santiago et al., 2008) since it is intuitive and easily obtainable. However, conventional urban densities usually capture only the volume of roughness elements in a neighborhood and cannot unequivocally determine an urban layout, and consequently the flow over it, because other equally influential urban factors such as building shapes and street layouts are beyond their capacity.
Meanwhile, studies that quantitatively connect altered microscale canopy flow characteristics to aspects of urban morphology other than density are rare, leading to inaccurate prediction of urban climate components such as the energy balance in the Weather Research and Forecasting (WRF) model (Sun et al., 2021). Therefore, it is paramount that the relationships between urban morphology and urban flow be thoroughly examined to determine where and how to incorporate geometric parameters into large-scale models. The limitations associated with relying solely on density parameters in urban flow modeling motivate a more comprehensive characterization of urban heterogeneity, posing three research questions:
1. Necessity: Is the idealized array of buildings with varying densities adequate to represent the realistic urban environment?
2. Improvement: When characterizing horizontal heterogeneity, is it possible to employ an alternative geometric parameter with better adaptability to realistic urban form to complement or replace the density parameters?
3. Adaptability: To what extent will horizontal geometrical parameters maintain their adaptability as vertical heterogeneity increases?
Here, we address these research questions by defining two types of geometric parameters that aim to describe urban heterogeneity beyond density. Next, focusing on the aerodynamic effect of urban structures that shapes the urban boundary layer, flow over 101 urban surfaces is simulated using an LES model. The simulation cases first consider idealized building arrays with uniform height as a baseline against which vertical and horizontal heterogeneity are measured. The two heterogeneities are then considered separately by including realistic urban forms with uniform height and idealized urban forms with variable building height. Next, simulation results are used to test the performance of the novel geometric parameters using spatially averaged vertical profiles, canopy-averaged properties, and the aerodynamic drag of the urban surface. The paper is organized as follows: Section 2 introduces two novel geometric parameters, alignedness and entropy of building facets, to characterize the aerodynamic features of urban geometries. Section 3 describes the research method as a three-step process involving site selection, numerical simulation, and evaluation of different aspects of urban flow modeling. Section 4 evaluates similarities and dissimilarities between idealized building arrays and realistic urban layouts and tests the performance of the novel geometric parameters in addressing flow variability. Finally, Section 5 presents concluding remarks on the new findings of this study.

Novel Geometric Parameters to Describe Urban Layouts
The urban boundary layer is primarily shaped by interactions between urban structures and thermal and mechanical drivers. Focusing on the mechanical forcing, drag exerted by building facets counteracts the pressure gradient; the net effect of drag decelerates the flow and generates mechanical turbulence, promoting the mixing of scalars such as temperature and pollutants. The effect of buildings follows the laws of fluid dynamics and is mostly studied with numerical or experimental approaches, which demand substantial computational or experimental resources. As a more cost-effective method, morphometric approaches are used in urban climate analyses where urban form parameters (i.e., detailed shape, size, and orientation of urban structures) are employed to predict the properties of the urban flow (Ratti et al., 2006; Ratti & Richens, 1999). However, such analyses are usually undertaken with geometric factors derived from the densities of buildings, leading to shortcomings in the representation of realistic urban configurations.
To demonstrate the deficiency of density parameters in describing urban flow fields, we compare the pedestrian-level wind environments for "aligned" and "staggered" building layouts (Coceal et al., 2007) with an intermediate plan area density λp = 0.25 in Figure 1. Both configurations consist of nine identical buildings but differ in building placement. In the "aligned" configuration, buildings are arranged in a way that results in three run-through channels (urban streets). When the wind is parallel to the street canyons, three jet-like impingements of rapid high-momentum flow, driven by the pressure gradient in uninterrupted streets, can be identified. The staggered configuration, on the other hand, is characterized by a more evenly distributed wind field. Although they have the same density, flow over these two configurations behaves differently, with the aligned configuration characterized by reduced flow dispersion and increased sheltering (Grimmond & Oke, 1999). This discrepancy has been addressed in the literature by discussing the aligned and staggered configurations separately (e.g., Macdonald, 2000), but this still leaves ambiguity about the most representative configuration for describing the realistic urban form.
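The point that plan area density alone cannot distinguish the two layouts can be illustrated with a minimal Python sketch. The raster builder below (`cube_array`, with hypothetical parameters `n` and `w`) is an illustration only, not the geometry used in the study; it assumes square buildings on square lots and applies the stagger by shifting every other column of buildings in the spanwise direction.

```python
import numpy as np

def cube_array(n=4, w=4, staggered=False):
    """Binary raster (1 = building, 0 = air); rows are y, columns are x.

    Each w x w building sits on a 2w x 2w lot, so lambda_p = w^2 / (2w)^2 = 0.25.
    In the staggered layout, every other column of buildings is shifted by w
    cells in y (periodically), which blocks the streets running along x.
    """
    lot = 2 * w
    unit = np.zeros((lot, lot), dtype=int)
    unit[:w, :w] = 1
    grid = np.tile(unit, (n, n))
    if staggered:
        for i in range(1, n, 2):
            cols = slice(i * lot, (i + 1) * lot)
            grid[:, cols] = np.roll(grid[:, cols], w, axis=0)
    return grid

def plan_area_density(grid):
    """Fraction of the plan area covered by buildings."""
    return float(grid.mean())

aligned = cube_array(staggered=False)
staggered = cube_array(staggered=True)
print(plan_area_density(aligned), plan_area_density(staggered))
```

Both rasters report the same plan area density even though their street patterns differ, which is the ambiguity the parameters proposed in this section aim to resolve.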
10.1029/2022MS003287
To address this gap, we exploit the rich geometric information embedded in realistic urban layouts, and expand the conventional focus on densities to describe the street connectivity (Figure 2a) and the orientations of building facets (Figure 2b). In defining these novel parameters, two considerations were made: (a) Simplicity: The calculation of the parameters should not require extensive computational power, so that they remain applicable to larger domains. (b) Availability: Considering that urban morphology is spatially and temporally variable, the information required by the calculations has to be available for most geometry data sets.
Although some shortcomings are identified, conventional densities effectively reflect the building effects for regular urban arrays. Therefore, we also include two density parameters (λ p and λ f ) as a benchmark for testing the performance of the other two types of novel geometric parameters that will be derived in Sections 2.1 and 2.2. It is worth noting that although both novel parameters are based on the 2D cross-section of a neighborhood and may yield quite diverse values at different heights, considering most cases have uniform heights, they are evaluated at the ground level for simplicity.

Alignedness: A Measure of Street Connectivity
Urban wind flows through streets and building gaps, producing coherent structures enclosing high-speed (street lines) and low-speed (cavities) regions. These structures have been found to be several times larger than the scale of geometry roughness and are highly correlated with turbulent organized structures within and above the urban canopy (Inagaki et al., 2012). The repeated penetrating streets of idealized building arrays (no buildings obstructing the flow path, as shown by red grids in Figure 2a), however, are not always apparent in realistic urban layouts, which poses a great challenge for flow models that aim to reproduce the physical phenomena associated with such urban features. To provide an indicator that is more applicable to realistic urban layouts, we quantitatively characterize the strength and frequency of the jet-like channeling flow seen in the aligned building arrays with alignedness, which aims to evaluate the occurrence and dominance of long streets along the prevailing wind direction in an urban geometry.
Assuming the wind is unidirectional (three examples with the wind direction along the x-axis are shown in Figure 2a), the urban geometry is first discretized into M one-dimensional profiles along the streamwise direction (x). We then define the uninterrupted street length C(y, n, θ) as the length between two obstacles (red and green colored grids in Figure 2a), where y ∈ [1, ⋯, M] is the index of the profile, n denotes the nth continuous canyon/street, and θ is the prevailing wind direction. Under the periodic boundary conditions applied in Section 3.2, C(y, n, θ) will be different over three special geometric conditions shown in Figure 14.
We hypothesize that these continuous streets along the prevailing wind direction, $C(y, n, \theta)$, lead to faster local wind speed and other traceable flow features. In this sense, we define alignedness as a 1D profile in the spanwise direction y that takes the longest uninterrupted street length, $\max_{\forall n} C(y, n, \theta)$. One of the key advantages of the alignedness parameter is its dependency on wind direction (it is calculated as a function of wind direction), which removes the need to parameterize the impact of wind direction separately. A schematic diagram with three different wind angles over the same geometry is shown in Figure 3. Evaluating alignedness from different wind directions alters the map of the longest uninterrupted streets (red and green grids in Figure 3). In practice, the alignedness parameter can be pre-calculated before modeling exercises based on the prevailing wind direction or on segments of wind directions in θ ∈ [0°, 90°]. The alignedness profile is then normalized in two ways:
a) Mean alignedness, $\gamma_m(\theta)$, following Equation 1, is normalized by the domain length $L_x$ in the streamwise direction. $\gamma_m(\theta)$ falls in a similar range as conventional densities, that is, $\gamma_m(\theta) \in [0, 1]$. It is worth noting that the value of $\gamma_m(\theta)$, like that of conventional density parameters, depends on the computational domain size. However, since the sites selected in this study (Section 3.1) are all sufficiently large to reflect individual neighborhoods, this limitation will not affect our analysis.
b) Modified alignedness, $\gamma_m^*(\theta)$, following Equation 2, instead takes the maximum length of uninterrupted streets after normalization by the height of the building ahead, $H(y, n)$. $\gamma_m^*(\theta)$ is independent of domain size and looks for streets with the lowest sheltering effect in the building canopy.
In this study, the prevailing wind is kept along the streamwise direction (i.e., θ = 0); therefore, the following discussion uses the simpler notation $\gamma_m$ and $\gamma_m^*$ for these two geometric indicators. There are other ways to capture the occurrence of channeling flow in urban canopies. However, $\gamma_m$ and $\gamma_m^*$ are shown to outperform three other alignedness parameters (namely, γp, γc, and γs). Their derivation and evaluation are discussed in Appendix A.
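The alignedness calculation described above can be sketched in Python for θ = 0 on a rasterized layout. This is one plausible reading of the text rather than a reproduction of Equations 1 and 2, which are not shown here: it assumes that the longest air run in each streamwise profile is averaged over all M profiles, that runs wrap around periodic boundaries but are capped at the domain length, and that the building height is uniform for the modified variant. All function names are hypothetical.

```python
import numpy as np

def longest_air_run(row, periodic=True):
    """Longest run of consecutive air cells (0) in one streamwise profile."""
    air = np.asarray(row) == 0
    if air.all():
        return air.size                      # fully open street, capped at L_x
    if periodic:
        # Rotate so the profile starts at a building cell; no run can then wrap.
        air = np.roll(air, -int(np.argmin(air)))
    best = run = 0
    for is_air in air:
        run = run + 1 if is_air else 0
        best = max(best, run)
    return best

def mean_alignedness(grid, dx=1.0, periodic=True):
    """Assumed form: mean over y of max_n C(y, n) / L_x, for wind along x."""
    ny, nx = grid.shape
    longest = np.array([longest_air_run(r, periodic) for r in grid]) * dx
    return float(longest.mean() / (nx * dx))

def modified_alignedness(grid, height, dx=1.0, periodic=True):
    """Assumed uniform-height form: mean over y of max_n C(y, n) / H."""
    longest = np.array([longest_air_run(r, periodic) for r in grid]) * dx
    return float(longest.mean() / height)

# Toy layouts with lambda_p = 0.25: 2 x 2-cell buildings on 4 x 4-cell lots.
unit = np.zeros((4, 4), dtype=int)
unit[:2, :2] = 1
aligned = np.tile(unit, (4, 4))
staggered = aligned.copy()
for i in (1, 3):                             # shift every other column block in y
    staggered[:, 4 * i:4 * (i + 1)] = np.roll(aligned[:, 4 * i:4 * (i + 1)], 2, axis=0)

print(mean_alignedness(aligned), mean_alignedness(staggered))
```

On these toy layouts the aligned array scores higher than the staggered one despite the identical density, which matches the qualitative behavior the parameter is meant to capture. The values also change with the number of repeated lots, consistent with the domain-size dependence noted for the mean alignedness.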

Entropy: A Measure of Disorder in Orientations of Building Facets
The interaction between roughness elements and urban flow occurs near the building facets. Unlike cubes in an idealized configuration, building shapes and their orientations relative to the approaching wind vary greatly in cities and result in different near-surface flow fields (Santiago et al., 2013). To take the variation of facet orientations into account, we adopt an indicator from Boeing (2019) that quantifies how well building facets follow a geometric order by counting their orientations relative to the prevailing wind direction. The entropy of urban facets ϕ is based on the global Shannon entropy H (Shannon, 1948), which describes the probability distribution of the state of the entire site based on the number of types and their proportional abundance. By taking the relative orientations of building facets, weighted by their length, as a one-dimensional random sequence and normalizing by the theoretical maximum H max (circular shape) and minimum H min (square shape), ϕ is calculated as in Equation 3, where f ∈ [0, 1, ⋯, F] is the index of the orientation bins prescribed by the user; here F = 36, with each orientation binned to angles divisible by 10° (0°, 10°, 20°, …). P(w f) represents the proportion of orientations, weighted by their length w f, that fall in bin f. By definition, lower entropy corresponds to a more square-like overall shape of buildings and therefore to an increased effective width (Agbaglah & Mavriplis, 2019) in terms of flow separation and wake, and to stronger interaction with other buildings. As a result, we expect lower orientation entropy to lead to slower urban wind speed and other corresponding flow features. The normalized entropy of facets ϕ evaluates to zero for idealized (i.e., staggered and aligned) square-shaped building arrays. Therefore, ϕ is designed to be meaningful only for realistic urban layouts.
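A minimal Python sketch of a length-weighted facet entropy is given below. Equation 3 is not reproduced in the text, so the normalization used here, (H − H min)/(H max − H min) with H min = ln 4 for a square and H max = ln 36 for equal weight in all 36 bins, is an assumption chosen to be consistent with the statement that square-shaped arrays yield zero; the published form may differ (for example, by an exponent). The function name and inputs are hypothetical.

```python
import numpy as np

def facet_entropy(angles_deg, lengths, n_bins=36):
    """Length-weighted orientation entropy of building facets (assumed form).

    angles_deg : facet orientations relative to the wind direction, in degrees
    lengths    : facet lengths, used as weights w_f
    """
    angles = np.asarray(angles_deg, dtype=float)
    weights = np.asarray(lengths, dtype=float)
    bin_width = 360.0 / n_bins
    # Snap each facet to the nearest multiple of 10 degrees (for n_bins = 36).
    bins = np.round(angles / bin_width).astype(int) % n_bins
    w_per_bin = np.bincount(bins, weights=weights, minlength=n_bins)
    p = w_per_bin / w_per_bin.sum()
    p = p[p > 0]                               # empty bins contribute nothing
    h = -np.sum(p * np.log(p))                 # Shannon entropy
    h_min = np.log(4.0)                        # square: four equal facets
    h_max = np.log(float(n_bins))              # circle: all bins equally likely
    # Note: the result can fall below 0 if fewer than four bins are occupied.
    return float((h - h_min) / (h_max - h_min))

square = facet_entropy([0, 90, 180, 270], [10, 10, 10, 10])
circle_like = facet_entropy(np.arange(0, 360, 10), np.ones(36))
print(square, circle_like)
```

Under this assumed normalization, a square building aligned with the wind sits at the lower bound and a polygon with one equal facet per bin sits at the upper bound, so realistic footprints fall in between.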

Data and Methods
Geometric parameters proposed in Section 2 will be assessed via a three-step method shown in Figure 4. The first step involves obtaining and processing urban layouts for realistic configurations. We prepare realistic urban geometry by rasterizing building footprints from the open-source data set OpenStreetMap (OpenStreetMap contributors, 2017) obtained for several major cities (Section 3.1) such as Sydney and Melbourne (Australia), Barcelona (Spain), and Detroit, Los Angeles, and Chicago (United States). The boundary of a selected neighborhood is usually irregular in shape, while the computational domain of a model using a Cartesian grid (PALM in this study) is required to be rectangular for parallelization. Accordingly, additional changes are required: to minimize the empty space artificially created in the numerical domain, the rasterized urban layout is first rotated so that the two most dominant building facet directions are orthogonal and parallel to the wind direction. This ensures that the domain can be contained in a small rectangular space. Note that although the flow statistics depend significantly on wind angle, the rotation for most cases is less than 15° and could be considered a specific wind direction condition in the real world. Then, we trim the lateral area to minimize the artificial open space around the selected neighborhood and to meet the parallelization requirements of the numerical model. In the second step, the 2D arrays of buildings are prepared for LES (Section 3.2) considering the building height distribution. Finally, geometric parameters are evaluated against selected flow properties from the simulations to test their effectiveness in reflecting urban flow heterogeneity.

Sites Selection
Simulations conducted in this study cover both realistic and idealized urban layouts. We first consider 14 idealized urban arrays with uniform heights (7 staggered and 7 aligned arrays as shown in Figure 1, UA/US) with varying densities as the baseline for assessing how well idealized layouts reflect the realistic urban canopy flow. Urban heterogeneities in the horizontal direction are introduced with 59 realistic urban layouts with a uniform height (referred to as UR) that cover a range of horizontal heterogeneities (represented by plan and frontal area densities as well as the novel parameters). Building heights in UR cases are all set to be the same as in the idealized cases. To answer the third research question about the validity of geometric parameters under increased vertical heterogeneity, building height variability is quantified as the standard deviation of building height H std and tested with 28 idealized urban arrays with two levels of height variability (H std = [2.8 m, 5.6 m]) for both staggered and aligned configurations, that is, VA/VS. While the impact of H std is not directly parameterized here, the novel geometric parameters can be evaluated at every vertical level, γ(z), to account for vertical heterogeneity. This is similar to Multi-Layer Urban Canopy Models (MLUCM), such as Martilli et al. (2002), which often depend on the vertical profile of plan area density (λp(z)) to consider vertical heterogeneity. Details of the cases considered in this study are shown in Table 1.
For realistic urban layouts, domains are selected with a similar size to the state-of-the-art grid resolution of mesoscale models (Δx = Δy ∼ 300 m, e.g., F. Chen et al., 2022;Leroyer et al., 2014), that is, each case could be deemed as one grid cell in a high-resolution mesoscale model. This arrangement further ensures evaluations among different urban surfaces precisely address the inter-grid variability of urban flow in mesoscale modeling.
To provide a systematic comparison, the selected urban neighborhoods need to cover a reasonable range of the geometric conditions proposed in Section 2. Based on a preliminary evaluation of the geometric parameters across several cities, the selected ranges are as follows: plan area density λp ∈ [0.06, 0.5], mean alignedness γm ∈ [0.2, 0.8], and entropy of facets ϕ ∈ [0.3, 0.8]. In addition, urban canyons in the selected sites must not be so narrow that the resolution determined in Section 3.3 creates artificial connections between buildings. We intentionally avoid sites with building gaps narrower than 3 grid cells (3 m in the real world) so that the gaps can be fully resolved by the LES.
The wide range of geometric parameters demands a large building geometry data set that covers cities of different layouts. However, the realistic geometry data used over the last decade are quite diverse: the majority were either commercially obtained (e.g., Kanda et al., 2013) or provided by the local government (e.g., Giometto et al., 2016)

Table 1 Geometric Details of 101 Urban Layouts and Example of 3 Types of Urban Layouts Simulated in This Study
Note. Red, black, and green colored geometries show aligned, staggered, and realistic urban layouts in 3D view, respectively. UA/US refers to uniform-height idealized urban arrays arranged in aligned and staggered form, VA/VS refers to variable-height idealized urban layouts, and UR refers to realistic urban layouts with uniform height.

are shown in Figures 5b and 5c. Although realistic urban layouts in this study are all considered with a uniform building height, OSM2LES has the capacity to prepare urban layouts with both uniform and realistic building heights. Figure 6 shows the correlation between plan area density λp and three other geometric parameters for the 101 cases discussed in this study. The two mean alignedness parameters, which focus on the air space, have a clear negative correlation (r = −0.75 and r = −0.62) with λp, indicating that the street connectivity parameters intrinsically carry some contribution from building density. The entropy of facets ϕ has no clear correlation (r = −0.29) with density, which is expected given that the calculation of the entropy focuses on the orientations of building facets and does not contain global quantitative information as density does.

Figure 6. Correlation between λp and three novel geometric parameters: two mean alignedness parameters $\gamma_m$, $\gamma_m^*$ and entropy of facets ϕ. Red, black, and green colored dots refer to the aligned, staggered, and realistic urban layouts. Correlation coefficients r are shown at the lower left of each plot, respectively.

Numerical Model and Computational Setup
We performed large-eddy simulations (LES) using the Parallelized Large-eddy Simulation Model (PALM, version r4554) (Maronga, Banzhaf, et al., 2020) with the computational domain discretized using second-order central differences (Piacsek & Williams, 1970). The grid spacing was equidistant in the horizontal directions, and variables were arranged on a staggered Arakawa C-grid. Time integration was performed using a minimal storage scheme (Williamson, 1980) to solve the prognostic filtered incompressible Boussinesq equations. The filtering process yielded four unknown covariance terms, which were parameterized using a 1.5-order closure (Deardorff, 1980). The pressure perturbation was obtained from a Poisson equation, which was solved using the FFTW scheme (Frigo & Johnson, 1998).
Simulation setups followed Nazarian et al. (2020), who validated their results against Direct Numerical Simulation (DNS) (Coceal et al., 2007) and wind tunnel experiments (Brown et al., 2001) for idealized configurations. The Level of Detail (LOD) (Biljecki et al., 2016) for realistic layouts was set to 1.2, where the small building parts were identified but with a uniform building height (H = 16 m), allowing direct comparison with idealized configurations focused on horizontal urban heterogeneity. The total domain height $H_T$ was set to 7.6 times the mean building height H, with the first 50 m at Δz = 0.5 m resolution and stretched thereafter with a stretching ratio of 1.1 (Nazarian et al., 2020). The horizontal resolution, after the sensitivity tests in Section 3.3, was set to Δx = Δy = 1 m. Periodic boundary conditions were employed at all four lateral boundaries to ensure that the simulations capture the flow statistics of a sufficiently large urban area. The top boundary condition for momentum was free-slip, while the bottom boundary condition was set to no-slip to enforce a parallel flow. Following the Monin-Obukhov similarity theory (MOST), a constant flux layer was assumed as the boundary condition between surfaces and their first adjacent grid level (Maronga, Knigge, & Raasch, 2020). The canopy flow was studied under neutral atmospheric conditions, where the only flow driver is a constant pressure gradient of magnitude $\rho u_\tau^2 / H_T$ in the x direction, where ρ is the density of air and $u_\tau$ is the friction velocity. An a posteriori estimate of $u_\tau$ within the canyon yielded a similar value (error ϵ ≤ 4%, due to variation of building volume in the domain). The friction Reynolds number $Re_\tau = u_\tau H_T / \nu \sim 2.5 \times 10^6$ is large enough to neglect viscous effects.
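The vertical grid described above can be sketched as follows. This is a simple illustration of uniform spacing followed by geometric stretching, assuming the stretching ratio is applied to each successive layer above 50 m and that the grid stops at the first level at or above $H_T$; it is not PALM's own grid-generation routine, and `stretched_levels` is a hypothetical helper.

```python
import numpy as np

def stretched_levels(h_top, dz0=0.5, z_uniform=50.0, ratio=1.1):
    """Cell-edge heights: uniform dz0 up to z_uniform, then geometric stretching."""
    n_uniform = int(round(z_uniform / dz0))
    edges = list(dz0 * np.arange(n_uniform + 1))   # 0, dz0, ..., z_uniform
    dz = dz0
    while edges[-1] < h_top:
        dz *= ratio                                 # each layer is 10% thicker
        edges.append(edges[-1] + dz)
    return np.array(edges)

H = 16.0                 # uniform building height [m], from the text
H_T = 7.6 * H            # total domain height [m], from the text
z = stretched_levels(H_T)
dz = np.diff(z)
print(f"{len(dz)} layers, top at {z[-1]:.1f} m, thickest layer {dz[-1]:.2f} m")
```

The canopy (the lowest 16 m) falls entirely within the uniformly resolved region, so stretching only affects the flow well above the buildings.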
The simulations were first spun up for 3 hr, corresponding to ∼140 eddy turnover times (T = H/u τ), to reach a quasi-steady state; the output was then time-averaged and stored every half hour for the next 8 hr (∼375 eddy turnover times) to filter out eddies generated by urban structures.
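The eddy turnover figures quoted above can be cross-checked with a short calculation. The friction velocity below is not stated in the text; it is only what the quoted durations and H = 16 m imply under T = H/u τ, so it should be read as an inferred, approximate value.

```python
H = 16.0                                   # uniform building height [m], from the text

spinup_s, spinup_turnovers = 3 * 3600.0, 140.0
averaging_s, averaging_turnovers = 8 * 3600.0, 375.0

# Eddy turnover time implied by each quoted duration
T_spinup = spinup_s / spinup_turnovers            # about 77 s
T_averaging = averaging_s / averaging_turnovers   # about 77 s

# Friction velocity implied by T = H / u_tau (inferred, not given in the text)
u_tau = H / T_spinup                              # about 0.21 m/s

print(f"T (spin-up)   = {T_spinup:.1f} s")
print(f"T (averaging) = {T_averaging:.1f} s")
print(f"implied u_tau = {u_tau:.3f} m/s")
```

The two durations imply nearly the same turnover time, so the quoted counts are mutually consistent.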

Grid Sensitivity Analyses
The grid resolution is set to allow a clear representation of street canyons and to resolve cross-canyon eddies at a scale of meters, while covering an area larger than 300 m × 300 m to capture the largest motions within the urban canopy. These two requirements make the simulations computationally expensive and demand an optimal grid resolution. To simulate as many scenarios as possible within computational means, we determine the resolution that best balances precision and cost by testing two grid sizes, namely, Δx = Δy = 1 m and Δx = Δy = 0.5 m.
The pedestrian-level (z = 1.75 m) wind field is compared in terms of the wind speed $U = \sqrt{\bar{u}^2 + \bar{v}^2 + \bar{w}^2}$, where $\bar{u}$, $\bar{v}$, and $\bar{w}$ are the time-averaged streamwise, spanwise, and vertical velocity, respectively. Although the four cases shown have pressure gradients applied along the shorter side of the domain, in all simulations of idealized building arrays and most simulations of realistic building arrays the pressure gradient is applied along the longer side. This arrangement might cause persistent energetic streaks in the flow field, but will not be detrimental to the final performance of the novel geometric parameters. Spatially averaged vertical profiles including wind speed $\langle U \rangle$, turbulent kinetic energy (TKE) $k = \frac{1}{2}\left(\overline{u'u'} + \overline{v'v'} + \overline{w'w'}\right)$, total momentum flux (turbulent + dispersive, $\langle \overline{u'w'} \rangle + \langle \tilde{u}\tilde{w} \rangle$), and turbulent transport of TKE are compared. The overbar $\overline{(\cdot)}$ denotes the time average, so that $u' = u - \bar{u}$ is the turbulent fluctuation, that is, the departure from the time-averaged quantity. The angle bracket ⟨⋅⟩ denotes the intrinsic spatial average (Mignot et al., 2008) over air grid cells only, and $\tilde{u} = \bar{u} - \langle \bar{u} \rangle$ is the dispersive fluctuation, that is, the departure of the time-averaged quantity from its spatial average. It is worth noting that the dispersive momentum flux (DMF) in this study and in many previous studies (e.g., Nazarian et al., 2020) has values comparable to the turbulent momentum flux (TMF, $\langle \overline{u'w'} \rangle$), and therefore the two are evaluated together as the total momentum flux. All four pedestrian wind fields agree well between the two resolutions, except for differences that depend on the local upstream geometry rather than the global geometry. Departures from the finer resolution can be attributed to fewer grid cells representing narrow building gaps, causing eddies to be poorly resolved. For vertical profiles, the wind speed, total momentum flux, and turbulent transport of TKE from the coarse grid fall within a small range (δ ≤ ±5%) of the fine-grid results across all four cases.
TKE profiles exhibit less consistency, where the coarse grid produces a lower turbulence level across the whole canopy and the largest departure occurs in the canopy top where the shear stress dominates. Considering the focus of this study is mainly within the urban canyon and departure percentages are reasonably small, the quality of the LES results is acceptable and the 1 m grid resolution will be used in subsequent analyses.
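The time and intrinsic spatial averaging used for these profiles can be sketched in Python for a single horizontal level. The velocity arrays below are synthetic random numbers used only to exercise the operators, not simulation output, and the function names are hypothetical. The sketch computes the turbulent and dispersive momentum fluxes and relies on the identity that the spatially averaged mean of uw equals the product of the mean velocities plus the two fluxes.

```python
import numpy as np

def intrinsic_mean(field, air):
    """Spatial average over air cells only (building cells are excluded)."""
    return float(field[air].mean())

def momentum_fluxes(u, w, air):
    """Return (TMF, DMF) at one height from u, w of shape (time, y, x)."""
    u_bar, w_bar = u.mean(axis=0), w.mean(axis=0)           # time averages
    uw_turb = ((u - u_bar) * (w - w_bar)).mean(axis=0)      # time mean of u'w'
    tmf = intrinsic_mean(uw_turb, air)
    u_til = u_bar - intrinsic_mean(u_bar, air)              # dispersive parts
    w_til = w_bar - intrinsic_mean(w_bar, air)
    dmf = intrinsic_mean(u_til * w_til, air)
    return tmf, dmf

# Synthetic example: 16 x 16 cells with one 4 x 4 building, 200 time samples.
rng = np.random.default_rng(0)
nt, ny, nx = 200, 16, 16
air = np.ones((ny, nx), dtype=bool)
air[:4, :4] = False
u = 1.0 + rng.normal(size=(nt, ny, nx))
w = 0.2 * rng.normal(size=(nt, ny, nx)) - 0.1 * (u - 1.0)   # w anti-correlated with u

tmf, dmf = momentum_fluxes(u, w, air)
print(f"TMF = {tmf:.4f}, DMF = {dmf:.4f}, total = {tmf + dmf:.4f}")
```

Keeping the two fluxes separate until the last step makes it straightforward to report their ratio, which is how their relative contribution is discussed in the results.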

Results and Discussions
LES simulations yield both detailed three-dimensional and time-varying information over the urban boundary layer. A snapshot of the instantaneous wind field over a medium-high density (λ p = 0.342) neighborhood is shown in Figure 8. Here, we focus on the flow field within the urban canopy layer (UCL) and discuss the performance of geometric factors in predicting flow variability among different configurations with spatially averaged flow properties in Section 4.1, canopy-averaged flow properties in Section 4.2.1, and drag parameterization in Section 4.2.2. These analyses further contribute to the development of urban canopy parameterization through the calculation of turbulent length scales (Li et al., 2020) and drag coefficients (Santiago et al., 2013) using both idealized (Blunn et al., 2022) and realistic urban configurations.

Spatially Averaged Flow Properties in Realistic Configurations
Before investigating the performance of the novel geometric parameters, we compare the vertical profiles of flow properties over realistic urban layouts with two idealized configurations commonly used in urban canopy parameterization (namely aligned and staggered configurations shown in Figure 1). Figure 9 shows the vertical profiles of wind speed $\langle U \rangle$ and TKE $\langle k \rangle$ covering both idealized (aligned in red, staggered in black) building arrays and realistic urban layouts (green) with six density ranges λp ∈ [0.06-0  Lin et al., 2016), and exhibit a significant difference within the same density range. The lowest density range λp ∈ [0.06, 0.07] shows a less clear distribution with no clear resemblance to either staggered or aligned configurations, which is attributed to the fact that urban layouts of extremely low density are rarely constructed in an evenly distributed manner in the real world. In other words, urban layouts at such packing densities often consist of a few clusters of buildings placed highly arbitrarily while the rest of the domain remains empty, which is hard to predict with idealized building arrays.
Within the medium density range λp ∈ [0.15, 0.27], flow over realistic layouts agrees well with flow over staggered layouts, which is consistent with the conventional approach of using staggered layouts to calibrate algorithms and parameterize urban climate models (UCMs; e.g., Krayenhoff et al., 2015; Nazarian et al., 2020). However, as the urban density grows beyond λp ∼ 0.25, flow over realistic urban layouts agrees less with staggered building arrays and becomes closer to aligned configurations. This transition can be traced back to real-world constraints: increased density significantly reduces the variability of urban layouts because streets must connect buildings in accordance with local regulations. Although staggered layouts seem realistic for sparsely built neighborhoods, their streets become shortened, narrow, and increasingly artificial as the density increases, which makes them inappropriate for representing a realistic neighborhood. Accordingly, the low (near-zero) wind speed and TKE reported in high-density staggered configurations appear unrealistic when compared with realistic urban layouts.
The turbulent (TMF, ⟨u′w′⟩) and dispersive (DMF, ⟨ũw̃⟩) momentum fluxes over the same density ranges are shown in Figure 10. Differences in TMF among the three types of configurations (i.e., aligned, staggered, and realistic) are less significant across density ranges, except for the high-density cases λp ∈ [0.42, 0.46], where the TMF of realistic cases is more similar to that of aligned configurations. In contrast, the DMF profiles show a transition similar to that seen in Figure 9. The DMF over idealized building arrays remains relatively low (less than 30% of TMF) across density ranges, which is attributed to the highly idealized building arrangements and the cubes representing buildings.
For flow over realistic layouts, however, DMF profiles show an increasing and diverging trend as density increases, where its contribution accounts for up to 50% of TMF in Figure 10. Therefore, the contribution of DMF is not negligible and has to be considered in the flow parameterization, especially for dense layouts.
On the other hand, the distinct behavior of DMF, a measure of momentum transport due to spatial variations in the time-averaged flow, over realistic urban layouts indicates realistic building shapes and their arrangements respond to density changes differently compared to idealized configurations. Idealized urban geometries translate urban densification into narrower building gaps and channels only, which results in an oversimplification of overlaying wake regions and unrealistic spatial flow variability. In contrast, spatial variability of the flow over realistic urban layouts is amplified (the trend is more clear in Figure 11) by the density growth, which further emphasizes the significance of flow modeling based on realistic urban flow.
Figure 11. Scatter plots of canopy-averaged wind speed, TMF, the ratio between DMF to the total momentum flux (DMF/(DMF + TMF)), and turbulent kinetic energy against proposed geometric parameters. Green dots represent cases with realistic urban layouts (UR). Red and black dots represent aligned (UA/VA) and staggered (US/VS) configurations, respectively, with uniform (square markers) and variable (triangle markers) height with standard deviation Hstd = 2.8 m (down-triangle) and Hstd = 5.6 m (up-triangle). Black, blue, and green lines represent semilogarithmic, linear, and regressions over only realistic cases, respectively.
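As an illustration of how a dispersive momentum flux profile can be extracted from time-averaged LES output, the sketch below assumes the usual triple decomposition, in which the dispersive part of a variable is its time mean minus the horizontal average of that time mean. The function name, the array layout (z, y, x), and the choice to average over air cells only (an intrinsic average) are assumptions made for this example and are not taken from the study's post-processing.

```python
import numpy as np

def dispersive_flux(u_mean, w_mean, fluid):
    """Dispersive momentum flux profile from time-averaged fields.

    u_mean, w_mean : time-averaged streamwise and vertical velocity,
                     arrays shaped (nz, ny, nx)
    fluid          : boolean array of the same shape, True in air cells
    Returns an array of length nz (one value per height).
    """
    # Mask out building cells so they do not enter the horizontal averages.
    u = np.where(fluid, u_mean, np.nan)
    w = np.where(fluid, w_mean, np.nan)

    # Horizontal (intrinsic) averages of the time-mean flow at each height.
    u_avg = np.nanmean(u, axis=(1, 2), keepdims=True)
    w_avg = np.nanmean(w, axis=(1, 2), keepdims=True)

    # Spatial deviations of the time-mean flow and their horizontal covariance.
    return np.nanmean((u - u_avg) * (w - w_avg), axis=(1, 2))
```

The ratio plotted in Figure 11, DMF/(DMF + TMF), would then follow from this profile and the corresponding turbulent flux profile after canopy averaging.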

Urban Canopy Parameterization
The vertical structures of flow properties over realistic urban layouts differ from those over idealized configurations (Figure 9), indicating that density parameters and idealized urban arrays alone are inadequate to determine the urban flow. Accordingly, we discuss the contribution of different morphological parameters (such as street connectivity and sheltering) to key flow properties and their application in urban canopy parameterization. Here, we evaluate the performance of the proposed geometric parameters in describing the flow field by considering Multi-Layer Urban Canopy Modeling (MLUCM, e.g., ), which requires canyon integrals of flow parameters (Section 4.2.1), and drag parameterization, which evaluates the roughness of the urban surface (Section 4.2.2). The efficacy of geometric parameters is evaluated based on their performance over a large range of urban layouts and their ability to retain a relatively constant error band across the data range.

Canopy-Averaged Flow and Turbulent Properties
The development of parameterizations for MLUCM involves the calculation of turbulent length scales and drag coefficients, which in turn requires flow properties such as wind speed, TMF, and TKE. Accordingly, we evaluate these canopy-averaged properties, that is, volume averages from the ground to the canopy top, for each simulation case (shown as data points in Figure 11). When testing the performance of geometric parameters, we first check whether flow over realistic urban layouts follows a similar trend to the idealized cases. If not, we test performance over realistic cases only to discuss distinct patterns in realistic urban flow. When appropriate, empirical regressions (semilogarithmic and linear) are obtained for each geometric parameter. The Normalized Root Mean Square Error (NRMSE) and Normalized Mean Absolute Error (NMAE) between the regression and the simulated relationship between geometric and flow parameters are then estimated to evaluate the performance (Figure 13). Multi-variable regressions similar to those from  might be a step toward a more pragmatic parameterization. In this study, however, we apply single-variable regressions because our purpose is to find a geometric parameter that better represents realistic urban geometry.
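The regression and error evaluation described above can be sketched as follows. This is a minimal example, not the study's code: the function names are hypothetical, the fits use ordinary least squares, and the normalization divides each error by the observed range (maximum minus minimum), which is one reading of how the indicators are scaled to [0, 1].

```python
import numpy as np

def fit_linear(x, y):
    """Least-squares fit of y = a + b * x. Returns (a, b)."""
    b, a = np.polyfit(x, y, 1)  # polyfit returns the highest degree first
    return a, b

def fit_semilog(x, y):
    """Least-squares fit of y = a + b * ln(x), valid for x > 0. Returns (a, b)."""
    b, a = np.polyfit(np.log(x), y, 1)
    return a, b

def nrmse(y_pred, y_obs):
    """Root mean square error normalized by the observed range (assumed)."""
    y_pred, y_obs = np.asarray(y_pred, float), np.asarray(y_obs, float)
    span = y_obs.max() - y_obs.min()
    return np.sqrt(np.mean((y_pred - y_obs) ** 2)) / span

def nmae(y_pred, y_obs):
    """Mean absolute error normalized by the observed range (assumed)."""
    y_pred, y_obs = np.asarray(y_pred, float), np.asarray(y_obs, float)
    span = y_obs.max() - y_obs.min()
    return np.mean(np.abs(y_pred - y_obs)) / span
```

For a geometric parameter `x` and a canopy-averaged flow property `y`, one would call `a, b = fit_semilog(x, y)` and then score the fit with `nrmse(a + b * np.log(x), y)`. The semilogarithmic form requires strictly positive `x`.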
Conventional densities characterize only the volume of roughness elements in a neighborhood. They ignore how these elements are arranged and intrinsically assume that every urban and non-urban grid cell affects the bulk flow properties equally. Nonetheless, densities agree well with all four flow properties, across both idealized and realistic urban layouts, for sparsely built urban layouts (λp ≤ 0.25). This is expected because the flow regime over low urban densities is mostly isolated roughness flow, where the incoming flow is mainly altered by individual roughness elements and the overlap of wake regions is negligible. However, as the density increases, differences between realistic and idealized urban layouts become more distinguishable: idealized building arrays maintain similar trends across the whole density range, whereas flow over realistic urban layouts becomes more scattered and less sensitive to density growth. Part of the reason lies in the realistic building arrangements, as explained in Section 4.1.
To further demonstrate how flow responds to realistic building shapes and arrangements, Figure 12 shows two examples of pedestrian-level wind fields from high-density realistic cases that cannot be predicted by regression over conventional densities. Figure 12a shows the wind field over a typical urban layout from Barcelona (Spain) with λp = 0.534 and γm = 0.312, where the layout resembles an aligned configuration except for a diagonal (∼20°) street canyon. The street accounts for roughly 30% of the air grid cells and hosts a large-scale coherent structure with high wind speed and turbulence levels. Realistic streets, especially those penetrating the whole neighborhood, disproportionately shift the bulk flow toward higher wind speeds than in idealized layouts with evenly distributed buildings. As a result, the bulk wind speed over such a domain is three times larger than the value predicted from regression over λp. Figure 12b shows the wind field over a neighborhood from Los Angeles (United States) with λp = 0.368 and γm = 0.443. All buildings are rectangular but vary widely in footprint size: the largest building has a similar area to the building blocks in the aligned configuration of similar density, while the smallest is only 1/20 of the largest. Variable building sizes and their complex arrangement translate into an even more complex flow field, where cavity vortices are deformed, elongated, and thinned by downstream obstacles. In addition, in realistic configurations, the overlap of wake regions is less structured, which leads to higher spatial flow variability than in the idealized configuration and results in higher wind speed and TKE. Such distinctive and less structured spatial variation of flow is common among realistic urban layouts and is manifested in the clear trend of DMF over geometric parameters in Figure 11.
For idealized scenarios, on the other hand, the contribution of DMF remains relatively low and is difficult to characterize using the geometric parameters tested in this study.
Figure 13. … where yp is the predicted quantity from regression, yo is the observed value from the large-eddy simulation, and n is the number of data points (N = 101). Both indicators are normalized to [0,1] with observed maximum and minimum values to allow comparison of different flow parameters over the same geometric parameter. White/black numbers in each lattice show the error estimated from a regression between a flow property and a geometric parameter. In each of the two matrices, every row has a bold black number that indicates the minimum error achieved and hence the best performance.
Canopy-averaged flow properties are then assessed as a function of the novel geometric parameters introduced in Section 2. The alignedness parameters (γm, *) present an improved performance with a trend opposite to that of densities: increased alignedness leads to an increase in TMF (in magnitude, here and below), wind speed, and TKE, and to a decrease in DMF. Regressions over the alignedness parameters agree well with the data across most configurations (Figure 13), with relatively constant error bands as shown in Figure 11. For the canopy-averaged wind speed, linear regressions based on both alignedness parameters show improved performance for all three types of configurations (i.e., aligned, staggered, and realistic). The semilogarithmic regression based on the modified alignedness *, in turn, significantly outperforms all other geometrical parameters in predicting TMF. For canopy-averaged TKE, both densities and alignedness parameters present some shortcomings. λp exhibits the highest NRMSE/NMAE in predicting TKE across the whole density range when all configurations and height arrangements are combined. The modified alignedness * shows the best performance for realistic and staggered cases but underestimates TKE over some aligned cases, so that overall it is only slightly better than the other geometrical parameters. This may be because aligned urban arrays are less sensitive to density changes in sparse configurations, which distinguishes them from staggered and realistic urban layouts. We find no clear correlation between the entropy of facets (ϕ) and the averaged flow properties. This is likely because ϕ characterizes only the drag effect on individual building facets and overlooks the integral effect of all facets on the flow.
The two alignedness parameters, normalized by domain size (γm) and by sheltering building height (*), respectively, predict cases with height variability better than urban densities do. Notably, γm shows good predictability over variable-height cases even though its calculation does not convey any height information (i.e., γm is calculated from horizontal heterogeneity only). The modified alignedness *, normalized by the sheltering building height, is almost fully tolerant of vertical heterogeneity, except for some aligned configurations. Comparison between flow over variable-height cases (triangles in Figure 11) and their uniform counterparts reveals the intertwined nature of vertical and horizontal heterogeneity in the canopy flow. The results suggest that an increase in building height variability (quantified by Hstd) and density growth impose a similar effect on the bulk flow properties. In addition, the two idealized building arrangements respond to height variability differently, with the staggered arrangement being less responsive than the aligned one. Therefore, for realistic urban layouts that exhibit heterogeneity in both directions, geometric-descriptive parameters that account for both (e.g., the alternative mean alignedness *) are expected to provide better performance.
Applicability to theoretical extremes also reflects the performance of regressions. When extending the geometric condition to a scenario where the domain is fully filled with buildings (λp ⇒ 1, γm ⇒ 0 and * ⇒ 0), the regression values match the theoretical estimates well (normalized wind speed, TMF, and TKE all ⇒ 0). Considering that urban layouts with λp ≥ 0.5 are already rare in the real world, the regressions from Figure 11 can be considered applicable to the fully packed extreme. At the other extreme, flow over an empty neighborhood (λp ⇒ 0, γm ⇒ 1 and * ⇒ +∞) forms the horizontally homogeneous turbulent boundary layer (HHTBL) (Richards & Hoxey, 1993), in which flow properties can be directly estimated based on the standard k − ϵ turbulence model from Equation 4.
Here, κ = 0.41 is the von Kármán constant, Cμ = 0.05 is the coefficient of the k − ϵ model based on an estimation from Nazarian et al. (2020), and z0 is the surface roughness length, set to 0.01 m in the simulation setup. The asymptotic regression values of the normalized TKE (⇒ 4.9) and TMF (⇒ 0.78) are roughly those expected for a classical surface layer over flat terrain, for which Equation 4 gives a normalized TKE of 4.47 and a normalized TMF of 1. For wind speed, although the theoretical normalized value of 14.8 does not quite match the regression limit (⇒ 6.4), an empty scenario is non-urban, and the regressions therefore remain valid for urban flow modeling purposes.
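For reference, the HHTBL inlet profiles of Richards & Hoxey (1993) for the standard k − ϵ model are commonly written as below. This is a sketch in generic notation: the friction velocity symbol u_* is an assumption, and the exact form and normalization used in Equation 4 may differ.

```latex
\begin{aligned}
u(z) &= \frac{u_*}{\kappa}\,\ln\!\left(\frac{z + z_0}{z_0}\right), \\
k &= \frac{u_*^{2}}{\sqrt{C_\mu}}, \\
\epsilon(z) &= \frac{u_*^{3}}{\kappa\,(z + z_0)} .
\end{aligned}
```

With Cμ = 0.05, the second relation gives k/u_*² = 1/√0.05 ≈ 4.47, which is the flat-terrain TKE value quoted in the text.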

Drag Parameterization
The promising performance of the two alignedness parameters motivates a closer evaluation of the drag parameterization, which captures the integral geometrical effect of the underlying 3D urban canopy on the upper atmosphere and is commonly adopted in UCMs. Apart from modifying mixing length scales, buildings affect the flow mainly through form drag, which acts as a sink of momentum and a source of TKE. Depending on whether the urban surface is modeled as a single layer, which simplifies urban geometry to a surface with increased roughness, or as multiple layers with the urban surface treated as a porous medium, the form drag can be parameterized as a single quantity that applies to the entire canopy or as a sectional parameter that varies with height as follows (Coceal & Belcher, 2004).
Here, Δp(z) is the pressure deficit (Cheng & Castro, 2002), or pressure drop, at height z across building cubes, and ρ is the air density. Attempts to parameterize the drag coefficient Cd are extensive (Kanda, 2006) but focus mainly on idealized building arrays such as aligned (Buccolieri et al., 2019; Simón-Moral et al., 2014) and staggered (Krayenhoff et al., 2015) arrays or cubes with different sizes. However, evaluations of form drag over realistic urban layouts are scarce due to the lack of detailed 3D flow information.
Among typical drag coefficients appearing in the literature (e.g., Santiago et al., 2013), we select the equivalent drag coefficient Cdeq following , which requires fewer inputs but still yields results similar to other height-dependent drag coefficients and agrees well with LES (Nazarian et al., 2020). In UCMs, Cdeq is usually employed as a factor that accounts for the geometric configuration. It is combined with the wind speed profile U(z), which represents the momentum field experiencing the drag, and the sectional building area density S(z), which represents the surface on which the force acts, to formulate the sink of momentum (Equation 6a) and the source of TKE (Equation 6b) that represent the building effects.
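A form of these building-induced terms that is common in the urban canopy parameterization literature is sketched below. It is given only as an illustration of how Cdeq, U(z), and S(z) combine: the symbols D_u and D_k are hypothetical names for the momentum sink and TKE source, and the expressions are assumed rather than copied from Equations 6a and 6b.

```latex
\begin{aligned}
D_u(z) &= -\,S(z)\,C_{deq}\,U(z)\,\lvert U(z)\rvert, \\
D_k(z) &= S(z)\,C_{deq}\,\lvert U(z)\rvert^{3}.
\end{aligned}
```

In this form the drag always opposes the mean flow, and the work done against it appears as a positive source of TKE.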
The irregular building shapes and arrangements considered in the UR cases, together with the angle of the applied pressure gradient, control the projection onto the cross-wind direction, the pairing of windward and leeward facets, and the building length between them (Sützl, Rooney, Finnenkoetter, et al., 2021). In our study, the wind angle θ = 0° corresponds to a particular case where the projection onto the cross-wind direction coincides with the cross-stream direction (y), and the paired facets are aligned with the streamwise direction (x). In practice, we discretized the paired facets into paired grid points, where the windward and leeward points are defined as the starting and ending points of continuous building blocks in the streamwise direction, to calculate the pressure deficit in Equation 7.
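The pairing of windward and leeward grid points can be illustrated with a short sketch that scans one streamwise row of a rasterized building mask. The function name is hypothetical, flow is assumed to be in the +x direction, and the domain is assumed periodic in x so that a building block may wrap around the domain edge. Where exactly the pressure is sampled relative to these cells is left open here.

```python
import numpy as np

def paired_faces(row):
    """Pair windward and leeward cells of each building block in one row.

    row : 1D boolean array along the streamwise direction (True = building)
    Returns a list of (windward_index, leeward_index) tuples, one per
    contiguous building block, treating the row as periodic.
    """
    row = np.asarray(row, dtype=bool)
    if row.all() or not row.any():
        return []  # no windward/leeward faces in a full or empty row

    # Windward cell: building cell whose upstream neighbour (x - 1) is air.
    starts = np.flatnonzero(row & ~np.roll(row, 1))
    # Leeward cell: building cell whose downstream neighbour (x + 1) is air.
    ends = np.flatnonzero(row & ~np.roll(row, -1))

    pairs = []
    for s in starts:
        later = ends[ends >= s]
        # If no leeward cell lies downstream of s, the block wraps around
        # the periodic boundary and ends at the first leeward cell.
        e = later[0] if later.size else ends[0]
        pairs.append((int(s), int(e)))
    return pairs
```

Applying the function to every (y, z) row of the mask gives the set of paired points over which a pressure deficit can be accumulated.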
The pressure deficit is calculated as Δp(z) = pf(z) − pb(z), where pf(z) and pb(z) are the pressures on the windward and leeward grid points at height z, respectively, H = 16 m is the canopy height, and ⟨u(z)⟩ is the horizontally averaged velocity. Under the periodic boundary condition, as shown in Figure 14, pf and pb are positioned differently in three special cases, depending on whether the starting or ending point of the domain is occupied by a building grid cell. Although the urban geometry is shown in a gridded form in this study, the calculation of either γm or Cdeq is not restricted to rasterized urban forms and can be applied to unstructured urban geometry.
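A minimal numerical sketch of an equivalent drag coefficient is given below. It assumes a commonly used definition in which the height-integrated pressure deficit is divided by the height-integrated dynamic pressure of the horizontally averaged flow. The function name, the default air density, and the trapezoidal integration are choices made for this example, and the exact form of Equation 7 may differ.

```python
import numpy as np

def equivalent_drag_coefficient(z, dp, u, rho=1.2):
    """Equivalent drag coefficient from vertical profiles (assumed form).

    z   : heights from the ground to the canopy top H (m), increasing
    dp  : pressure deficit p_f - p_b at each height (Pa)
    u   : horizontally averaged streamwise velocity at each height (m/s)
    rho : air density (kg/m^3), an assumed default
    """
    z, dp, u = (np.asarray(a, dtype=float) for a in (z, dp, u))

    def integrate(y):
        # Trapezoidal rule written out explicitly.
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(z)))

    numerator = integrate(dp) / rho
    denominator = 0.5 * integrate(u * np.abs(u))
    return numerator / denominator
```

Because both integrals run over the same heights, the result is a single bulk value for the canopy rather than a height-dependent coefficient.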
For a direct comparison, the pressure deficits for idealized building arrays were also evaluated grid by grid. Only uniform-height cases (N = 73) are considered for the drag evaluation to keep the comparison clean. The two alignedness parameters show similar trends against Cdeq. Because the mean alignedness γm shares the data range ([0, 1]) of the plan area density λp, we show only γm in Figure 15.
A clear separation among the three classifications (UA/US/UR) is observed in Figure 15, with Cdeq for the realistic cases falling between the aligned and staggered configurations: the minimum value can be lower than in the aligned configuration, and the maximum value reaches only 60% of that of the staggered cases. Therefore, we perform separate regressions for each classification following Equation 8, modified from Nazarian et al. (2020). Equation 8 depicts the increase of drag as roughness elements are added until the newly introduced drag is balanced by the increased sheltering effects (Hagishima et al., 2009) (Nazarian et al., 2020), where C deq peaks at a similar density deq = 0.42 with similar value C deq,max = 3.52. The evaluation of the pressure deficit from the paired windward and leeward grids is hence validated.
Unlike in staggered configurations, Cdeq over aligned configurations is insensitive to density (as shown in Figure 1), probably because the higher streamwise velocity enters the denominator in Equation 7 while, at the same time, the pressure deficit is reduced by narrower building gaps. Cdeq over realistic urban surfaces also shows a reduced dependence on density. However, the mechanism is different: the streamwise velocities are not significantly large (Figure 11), and building gaps are not systematically narrow. Instead, the drag is lowered by the tilted orientation of building facets relative to the dominant wind direction, which reduces the pressure deficit to only a fraction of the pressure intensity in the direction normal to the approaching wind.
The relationship between Cdeq and γm mirrors that between Cdeq and λp, a pattern already observed in Figure 11. A simple estimation is performed following Equation 9, which ensures applicability at both theoretical extremes: γm ⇒ 0 corresponds to adding roughness elements until a new urban surface is formed (γm = 0), and γm ⇒ 1 corresponds to widening building gaps until the urban layout becomes a plain surface (γm = 1). Regressions of Equation 9 over the three types of urban layouts (US, UA, UR) yield d γ [S, A, R] = [2.92, 0.66, 1.32]. The evaluation of the drag parameterization shows that the mean alignedness γm has some predictive skill for Cdeq but is less effective than the density. This is intuitively expected because alignedness parameters do not directly measure the quantity of roughness elements. Combining density, to characterize building wakes, with alignedness, to characterize flow penetration, may yield better predictability but is beyond the scope of the current study.

Conclusions
Large-eddy simulations were conducted over 101 urban layouts, both idealized and realistic, to evaluate the performance of two novel geometric parameters focusing on horizontal urban heterogeneity. Following the "fitness for purpose" philosophy (Garuma, 2017), which determines model complexity based on the output objectives, this study comprehensively examined the fitness of the novel geometric parameters for canopy flow modeling. The three-step approach introduced in Section 3, together with OSM2LES to prepare realistic urban layouts, could serve as a procedure for testing the performance of prospective geometric parameters.
Similarities and dissimilarities of bulk flow properties between realistic urban geometry and cube arrays were systematically analyzed through vertical profiles of flow properties across densities. The results reveal that the plan area density λp alone does not determine how variations in urban geometry translate into the corresponding canopy flow. The conventional approach of using staggered layouts to calibrate flow models over realistic urban surfaces is found to be applicable only to sparsely built (λp ≤ 0.25) neighborhoods, where the flow retains a similar sensitivity to the variation of density. For more densely built scenarios, however, the realistic canopy flow is less sensitive to density changes and exhibits bulk flow properties similar to those of aligned layouts, although through different mechanisms. For aligned building arrays, the large wind speed and other accompanying flow features result from the contribution of penetrating streets (Figure 1), whereas the remaining space is filled with cavity vortices of low wind speed. For realistic layouts, the large DMF indicates that irregularities of urban geometry (street layouts, building sizes, and shapes) are responsible: building gaps no longer translate entirely into the cavity zones that are common in idealized building arrays. Instead, these vortices enclosing low wind speed are deformed and intruded upon by high-speed flow (e.g., Figure 12b). The results highlight that previous efforts to construct urban layouts with a fixed arrangement (staggered or aligned) in which only the building size (urban density) varies oversimplify the complexity of realistic urban geometry. Further, the quality of urban canopy parameterizations relying on idealized urban flow will be degraded and cannot be restored without acknowledging the heterogeneity of urban geometry.
The discrepancy between realistic and idealized urban layouts indicates that certain urban geometry features affecting the flow are overlooked. We developed two types of novel geometric-descriptive parameters focusing on the orientation of building facets (entropy of facets ϕ) and space connectivity (alignedness γ). The entropy of facets shows no clear correlation with the variation of canopy-averaged flow properties, indicating that the orientations of building facets, although seemingly responsible for the turbulence level in the flow field, do not control the spatially and temporally averaged bulk flow properties. The two alignedness parameters (γm, *) present a promising performance in predicting flow parameters, with estimation errors smaller than or comparable to those of conventional urban densities. The impact of urban heterogeneity on the overall roughness effect on the flow was evaluated through drag parameterization. Again, the staggered configuration previously employed to calibrate model constants is not suitable for aerodynamically smoother realistic neighborhoods.
The impact of vertical variability on the parameterization based on horizontal geometric parameters was evaluated in Section 4.2.1 by including 28 idealized building arrays with non-uniform building heights. The effect of height variability Hstd on the flow is similar to that of density in Figure 11, where increased Hstd leads to lower wind speed, TKE, and TMF. The weight of vertical heterogeneity relative to density differs among the flow properties considered and increases for denser urban layouts.
The alignedness parameters proposed in this study can be easily adapted to depict vertical heterogeneity by evaluating their values at every vertical level to form a 1D profile similar to the flow properties shown in Section 4.1. The vertical alignedness profile γ(z) can then be either used in its original form or further averaged into a single value for potential mesoscale modeling purposes. The implementation and performance of γ(z) in 1D models can be evaluated in future work.
In summary, three key conclusions can be drawn from this study. a) With distinct spatial variability and bulk flow properties over realistic urban layouts identified, a cautious remark is raised on building flow parameterizations based only on idealized urban arrays. b) By challenging conventional urban density parameters with alignedness (γm, *), we found that geometric aspects beyond conventional densities, such as street arrangements, building shapes, and sizes, also fundamentally alter the flow and demand explicit parameterization. c) Discussion of the dominance and interrelationship of horizontal and vertical urban heterogeneity and the excellent
less sensitive to the street width but more sensitive to the abrupt change of the building and street regions. TMF presents an opposite but close relation to both building gaps and long streets as TKE. The connection between alignedness and flow properties is clearer in the overlay of crosswind profiles, where γ(y) captures almost all street-scale flow variation. It is worth noting that the alignedness profile varies as the urban layout rotates, which implies a potential to also account for varying wind direction.

Data Availability Statement
Version 0.1.0 of OSM2LES (Lu et al., 2022), used for the preparation of realistic urban layouts, is preserved at https://doi.org/10.5281/zenodo.6566346, available under the MIT License without registration, and developed openly on GitHub (https://github.com/jiachenlu95/OSM2LES). Building footprint data are extracted from and available via OpenStreetMap contributors (2017).
Figure B1. Wind speed, turbulent kinetic energy, and TMF sampled at pedestrian height (upper). The alignedness profile γ(y) (black line) is overlaid with normalized flow profiles (red line) corresponding to the upper 2D field (lower). The TMF profile is shown as 1 − u′w′(y) to have a positive correlation with γ(y).