PPSurf: Combining Patches and Point Convolutions for Detailed Surface Reconstruction

3D surface reconstruction from point clouds is a key step in areas such as content creation, archaeology, digital cultural heritage and engineering. Current approaches either try to optimize a non-data-driven surface representation to fit the points, or learn a data-driven prior over the distribution of commonly occurring surfaces and how they correlate with potentially noisy point clouds. Data-driven methods enable robust handling of noise and typically focus on either a global or a local prior, which trade off robustness to noise on the global end against surface detail preservation on the local end. We propose PPSURF as a method that combines a global prior based on point convolutions and a local prior based on processing local point cloud patches. We show that this approach is robust to noise while recovering surface details more accurately than the current state-of-the-art. Our source code, pre-trained model and dataset are available at https://github.com/cg-tuwien/ppsurf.


Introduction
3D surface reconstruction from point clouds is a key step for workflows in areas such as content creation, archaeology, digital cultural heritage, and engineering. It converts raw 3D point scan data, like casual RGBD (color and depth) mobile phone images or more accurate range scans (e.g., from a laser range scanner), to surface-based 3D object representations that can be used in downstream applications.
Given the large practical interest, surface reconstruction has become a central problem in computer graphics and vision research. The problem is generally ill-defined, as different surfaces may correspond to similar point clouds. However, several approaches have been proposed to tackle this ambiguity. One research direction attempts to optimize surface representations with strong non-data-driven inductive biases to fit the point cloud [KBH06, WSS*19, PJL*21, BZYSM21]. This resolves the ambiguity, but is susceptible to deteriorating conditions of the input points, such as scan noise or regions with missing points, which cannot easily be corrected using a fixed inductive bias. Another line of research focuses on learning data-driven priors, usually over the distribution of commonly occurring surfaces and how they correlate with potentially noisy point clouds [PFS*19, EGO*20, PJL*21, BM22]. The surface reconstruction ambiguity can then be resolved by finding a surface that has a high probability for the given point cloud under the learned prior. The prior in these data-driven methods can range from global, where the prior captures a distribution over full 3D object surfaces, to local, where the prior captures the distribution over local surface patches. Global priors are the least susceptible to noise and missing points, but have limited capability to capture fine local details. Local priors, on the other hand, can capture such fine details accurately, but are more susceptible to strong noise and missing points. Existing methods mostly focus their prior on a small range in this global-local spectrum. For example, DeepSDF [PFS*19] uses a global prior, Points2Surf [EGO*20] mostly focuses on a local prior, while POCO's point convolutions [BM22] learn a prior in the medium range that is reasonably robust to deteriorating conditions, but still struggles to accurately capture local detail.
We propose PPSURF as a method that covers a wider range in the global-local spectrum of priors, by combining the local prior of a patch-based method like Points2Surf with a more global prior of a point convolution-based method like POCO. For this purpose, we design an architecture that has two branches: the first branch is based on POCO [BM22] and provides a global prior by applying several layers of point convolutions to a sparse set of support points. To reconstruct geometric details more accurately, we merge features from this first branch with features from a second branch, which processes a local patch of points with PointNet [QSMG17]. We additionally discovered that modifying the architecture of PointNet by replacing the sum aggregation with an attention-based aggregation improves performance. This results in a method that is robust to noise and missing points, while preserving details more accurately than previous methods.
In our experiments, we compare PPSURF to several previous state-of-the-art methods, both data-driven and non-data-driven, on synthetic as well as real-world data, and demonstrate improved performance on both in-distribution and out-of-distribution surface reconstruction tasks.

Related Work
Surface reconstruction from point clouds is an active area of research. We distinguish between data-driven methods that train on a large dataset, and non-data-driven methods that do not use machine learning or overfit to a single shape.
Non-data-driven methods. Poisson reconstruction [KBH06, KH13] has for many years been the gold standard of non-data-driven approaches. Recent works have suggested optimizing the parameters of a neural network to predict the signed distance to the surface [AL20, SMB*20, AL21] directly from a single point cloud. In particular, Atzmon and Lipman [AL20] introduced this concept for unoriented point clouds. They optimized the parameters of the neural network with a sign-agnostic loss and a geometric initialization of its parameters. Gropp et al. [GYH*20] and Atzmon and Lipman [AL21] followed up on this work and included a gradient regularization in the loss. Later, Ma et al. [BZYSM21] introduced Neural-Pull, an optimization objective that directly uses the gradient of the optimized SDF to move the query points to the closest point in the input point cloud. In follow-up work, this approach was extended by incorporating a network to classify a point being on the surface or not [CHL23], and an additional loss that aligns the gradient direction between different level sets of the SDF [MZLH23]. In order to improve the quality of the final SDF, Yifan et al.
Non-data-driven methods are sensitive to noise, which is usually present in real 3D scans. In order to address this limitation to some extent, a recent pre-print from Wang et al. [WWW*23] proposed Neural-IMLS, a non-data-driven method that regularizes the smoothness of surface normals using an MLP with limited capacity. While this produces smooth surfaces, it also loses some geometric detail due to this non-data-driven regularization. Noise to Noise Mapping by Ma et al. [MLH23] focuses on the reconstruction of noisy point clouds in an unsupervised overfitting scheme. Additionally, these methods require significant reconstruction times due to the optimization being performed for each shape individually, which can be a limiting factor for large scans.
Data-driven methods. A recent line of research has approached the problem of shape reconstruction in a data-driven manner by using a large dataset to learn a prior over the distribution of commonly occurring surfaces and how they correlate with the input point cloud. These approaches are typically fast and robust to noisy inputs compared to non-data-driven approaches. However, in such methods, the resulting reconstruction highly depends on the quality of such priors.
submitted to COMPUTER GRAPHICS Forum (2/2024).
Several works have proposed to use a global prior to capture the distribution over full 3D object surfaces [CZ19, MON*19, PFS*19]. These methods define such a prior as a single latent vector representing the shape, which is then used as a condition in a fully connected network to decode the SDF of a given query point. Usually, the decoder is trained on large data sets with a point-cloud encoder [MON*19, CZ19]. However, Park et al. [PFS*19] proposed to train the decoder directly on such data sets and then optimize the latent vector to match the noisy point cloud during inference. Recently, Zhang et al. [ZTNW23] proposed to use richer global priors. They introduced an encoder-decoder network that encodes the input point cloud using attention modules into a set of latent vectors representing the shape, which are then used to predict the SDF for a set of query points using cross-attention modules.
Other works have opted to condition their models with local priors. Siddiqui et al. [STM*21] encoded the input point clouds in a set of latent scene patches. These latent vectors are used to query a database of latent vectors from patches obtained from the training set. The obtained patches are then blended together using an attention mechanism. Ma et al. [BYSZ22] incorporated local priors by including a network pre-trained on a large number of surface patches which classifies a point as being on the surface or not. This network is used to guide an optimization process that learns the shape's SDF using another neural network. Jiang et al. [JSM*20] pre-trained an SDF encoder-decoder on a large data set of object parts. Then, during the optimization process, only the latent codes of the different parts of the object are optimized. Chen et al. [CTFZ22] propose a dual contouring method learned on a small local prior.
Since global and local priors provide complementary information about the shape, a common approach is to use a prior in the medium range using a hierarchical encoder-decoder network. These approaches reduce the input point cloud to a simplified representation, e.g., voxelization or subsampled point cloud, which is then enriched by the global information provided by the bottleneck of the encoder-decoder architecture. Chibane et al. proposed a CNN that works directly on an octree, from which the model was able to predict the SDF. Wang et al. [WLT22] also represented the input point cloud with an octree, from which they constructed a graph. This graph was further processed by a GCN encoder-decoder to generate an embedding for each octree node, from where the final SDF is predicted. Dai et al. [DDN20] instead used a 3D sparse encoder-decoder network to complete partial 3D scans and predict a complete SDF. Lionar et al. [LESP21] also developed an encoder-decoder network but instead used the projection of the input point cloud to a set of arbitrary 2D planes, from which the final SDF was predicted. Boulch and Marlet [BM22] recently proposed to use an encoder-decoder network that directly worked with points, avoiding discretization artifacts from voxel-based representations. Although all these methods work relatively well when compared with methods that use global or local priors alone, they struggle to accurately capture fine local details of the shapes. While the local branch was able to capture high-frequency details relatively well, they used a weak global prior due to the small subset of points selected to represent the shape. Our approach addresses the limitations of all these methods by incorporating strong global and local priors.

Method
The goal of our method is to take as input an unoriented point cloud $P = \{p_1, p_2, \dots, p_n\}$ that was sampled from an unknown watertight surface $S_{gt}$ with a noisy sampling process, and output a surface $S$ that approximates $S_{gt}$ as closely as possible. Similar to several previous approaches, we define the surface $S$ using an implicit representation, since this guarantees watertightness and naturally handles arbitrary surface topology in a smooth and differentiable way. More specifically, $S$ is defined as the 0.5-level set of an occupancy field $o(x)$:
$$S = \{x \in \mathbb{R}^3 \mid o(x) = 0.5\}.$$
We train a network $f_\theta(x, P)$ with parameters $\theta$ to model the field $o$ given a point cloud $P$:
$$o(x) \approx f_\theta(x, P).$$
The network $f_\theta$ uses two branches: i) a global branch $f_g(x, P')$ that performs point convolutions [BPM20] on a sparse random subset of points $P' \subseteq P$ and effectively learns a global prior over the coarse shape of $S$ given the input points $P$, and ii) a local branch $f_l(x, P_x)$ that processes a small local patch $P_x \subset P$ around $x$ and effectively learns a local prior over the detailed shape of local surface patches. Each branch outputs a feature vector for a given query point $x$. The two vectors are combined into a single feature vector before being processed by a small MLP $f_o$ that outputs the occupancy probability $o(x)$:
$$f_\theta(x, P) = f_o\big(f_g(x, P') \oplus f_l(x, P_x)\big),$$
where $\oplus$ is the operation used to combine the two feature vectors, a sum in our experiments. Here, we omit the parameters of the networks $f_o$, $f_g$, and $f_l$ to avoid a cluttered notation. Figure 2 illustrates our architecture.
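To make the merge step concrete, the following minimal Python sketch sums two branch feature vectors and maps the result to an occupancy probability. It is an illustration only: `occupancy_head` is a hypothetical single linear layer with a sigmoid standing in for the MLP $f_o$, and the feature values and weights are made up.

```python
import math

def combine_sum(global_feat, local_feat):
    """Element-wise sum of the global and local feature vectors (the merge operation)."""
    if len(global_feat) != len(local_feat):
        raise ValueError("sum merging needs feature vectors of equal length")
    return [g + l for g, l in zip(global_feat, local_feat)]

def occupancy_head(feat, weights, bias):
    """Hypothetical stand-in for f_o: one linear layer followed by a sigmoid."""
    logit = sum(w * f for w, f in zip(weights, feat)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

def is_inside(prob, level=0.5):
    """A query point counts as inside when its occupancy exceeds the 0.5 level."""
    return prob > level

# Made-up 3-dimensional features for a single query point.
feat = combine_sum([0.2, -0.1, 0.4], [0.1, 0.3, -0.2])
prob = occupancy_head(feat, weights=[1.0, 1.0, 1.0], bias=0.0)
```

Summation requires both branches to emit vectors of the same length, which is why the sketch checks the lengths before merging.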
In the following, we describe the architecture of PPSURF, including the global and local branches in Section 3.1, followed by a description of the training and inference setups in Sections 3.2 and 3.3, respectively.

Architecture
Global Branch
The global branch $f_g(x, P')$ takes as input a random subset $P' \subseteq P$ and a 3D query point $x$ and outputs a global feature vector for the point $x$, which encodes information about the coarse shape of the point cloud. We implement the global branch using POCO [BM22], which consists of two main components: i) a point convolution module that computes a feature vector $z'_i$ for each sparse point $p'_i \in P'$, followed by ii) an interpolation module that interpolates the feature vectors $z'_i$ to get the global feature vector at point $x$.
The point convolution module uses FKAConv [BPM20] to process the sparse point cloud $P'$ into a set of feature vectors $z'_i$, one for each sparse point $p'_i \in P'$. Due to limitations both in performance and network capacity, convolutions can only be performed on the sparse subset $P'$ instead of the full point cloud $P$, with $|P'| = 10\text{k}$ in our experiments. This module consists of 10 layers of convolutions. Each layer uses a convolution kernel that operates over the 16 nearest neighbors of each point.
Given a query point $x$, the interpolation module interpolates the feature vectors $z'_i$ at the nearest neighbors $N'_x$ of the query point to get the global feature vector using an attention-based weighting: where $\Vert$ denotes concatenation, $f_{ga}$, $f_{gb}$ are two MLPs that transform the feature vectors before and after the weighted sum, and $f_{gw_k}$ are learned weighting functions, each implemented as a single linear layer. Analogous to the attention heads in multi-head attention, multiple different weighting functions are used as a form of ensemble learning, 64 in our experiments. Note that when evaluating multiple query points $x$ for a point cloud, the point convolution module only needs to be evaluated once, while the interpolation module needs to be evaluated once per query point.
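The following Python sketch illustrates the general idea of an attention-weighted interpolation with several weighting functions. It is a generic illustration, not the exact formulation used here: the MLPs before and after the weighted sum are omitted, each head is a hypothetical linear scoring vector, and concatenating the per-head results is an assumption.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_interpolate(neighbor_feats, head_weights):
    """Pool neighbor features once per head and concatenate the pooled vectors.

    neighbor_feats: one feature vector per nearest neighbor of the query point.
    head_weights:   one linear scoring vector per head (hypothetical parameters).
    """
    dim = len(neighbor_feats[0])
    out = []
    for w in head_weights:
        # Linear score per neighbor, normalized over the neighborhood.
        scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in neighbor_feats]
        att = softmax(scores)
        # Attention-weighted sum of the neighbor features.
        out.extend(sum(a * f[d] for a, f in zip(att, neighbor_feats)) for d in range(dim))
    return out

neighbors = [[1.0, 0.0], [0.0, 1.0]]
pooled = attention_interpolate(neighbors, head_weights=[[0.0, 0.0], [10.0, 0.0]])
```

A head with zero weights reduces to plain averaging, while a strongly peaked head approaches selecting a single neighbor, so several heads can cover both behaviors.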

Local Branch
The local branch $f_l(x, P_x)$ processes a local patch $P_x$ around the query point $x$ and outputs a local feature vector for the point $x$, which encodes information about the detailed shape of the point cloud near $x$. We base the local branch on the popular PointNet [QSMG17] architecture, which has been successfully applied in various methods that process local point cloud patches [GKOM18, RLBG*19]. We modify the architecture with an attention-based aggregation, instead of the original max- or sum-based aggregation, which we found to improve performance.
We define the local patch $P_x$ as the 50 nearest neighbors of the query point $x$. We normalize the patch by centering it at the origin and scaling it to fit into a unit sphere, obtaining the normalized patch $\bar{P}_x$. Subsequently, we apply PointNet with attention-based aggregation similar to Eqs. 4 and 5, but without using multiple attention heads:
$$f_l(x, P_x) = f_{lb}\Big(\sum_{\bar{p}_j \in \bar{P}_x} v_j \, f_{la}(\bar{p}_j)\Big), \quad \text{with } v_j := \operatorname{softmax}_j \, f_{lv}\big(f_{la}(\bar{p}_j)\big),$$
where $f_{lv}$ is a learned weighting function implemented as a linear layer, and $f_{la}$, $f_{lb}$ are two MLPs that transform the feature vectors before and after the weighted aggregation.
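The patch extraction and normalization can be sketched in a few lines of plain Python. This is a minimal brute-force illustration, not the released implementation: it assumes that centering means translating the query point to the origin and that scaling divides by the largest distance in the patch. A practical version would use a spatial index such as a k-d tree.

```python
import math

def knn_patch(points, query, k=50):
    """Return the k points closest to the query (brute force, for illustration)."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]

def normalize_patch(patch, query):
    """Move the query to the origin and scale the patch into the unit sphere.

    Assumption: the patch is centered on the query point and scaled by the
    distance to its farthest point.
    """
    centered = [tuple(c - q for c, q in zip(p, query)) for p in patch]
    radius = max(math.hypot(*p) for p in centered)
    if radius == 0.0:
        return centered  # Degenerate patch: all points coincide with the query.
    return [tuple(c / radius for c in p) for p in centered]

# Toy example with k=3 instead of 50.
points = [(float(i), 0.0, 0.0) for i in range(10)]
patch = knn_patch(points, query=(0.0, 0.0, 0.0), k=3)
normalized = normalize_patch(patch, query=(0.0, 0.0, 0.0))
```

After normalization the farthest patch point lies exactly on the unit sphere, so patches from objects of different scale look alike to the local branch.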

Training Setup
The training takes about 5 hours.

Inference Setup
We use the inference setup from POCO [BM22], which differs from the training setup in two main aspects: First, we perform test-time augmentation in our global branch to obtain more reliable results. Second, we sample query points in a grid and use a variant of marching cubes to reconstruct a mesh. We describe both in more detail below.
Test-time augmentation. The sparse subsample $P' \subseteq P$ used for the global branch may miss important geometric detail. To improve robustness, we compute the per-point feature vectors $z'_i$ for multiple different random subsamples $P'_1, P'_2, \dots$, until each point in $P$ is included in at least 10 subsamples. The $\geq 10$ different feature vectors for each point in $P$ are then averaged before performing the interpolation step.
Mesh reconstruction. We place query points in a $257^3$ grid and use a variant of marching cubes [LC87] proposed in POCO to obtain a mesh from the occupancy field $o(x)$. That marching cubes variant uses a region-growing strategy starting from the input points to avoid the costly evaluation at all grid points, and super-samples marching-cube edges that intersect a surface to get a more accurate estimate of the intersection point.
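The coverage loop described above can be sketched as follows. This is a simplified illustration that assumes uniform random subsampling with a fixed subset size. The function names are hypothetical, and the actual POCO implementation may select subsamples differently.

```python
import random

def subsample_until_covered(num_points, subset_size, min_count=10, seed=0):
    """Draw random subsets until every point index appears in at least min_count subsets."""
    rng = random.Random(seed)
    counts = [0] * num_points
    subsets = []
    while min(counts) < min_count:
        subset = rng.sample(range(num_points), min(subset_size, num_points))
        for i in subset:
            counts[i] += 1
        subsets.append(subset)
    return subsets, counts

def average_features(num_points, subsets, feats):
    """Average per-point features over all subsets that contain the point.

    feats[s][j] is the feature vector computed for point subsets[s][j].
    Every point must appear at least once, which the coverage loop guarantees.
    """
    dim = len(feats[0][0])
    sums = [[0.0] * dim for _ in range(num_points)]
    counts = [0] * num_points
    for subset, subset_feats in zip(subsets, feats):
        for idx, f in zip(subset, subset_feats):
            counts[idx] += 1
            for d in range(dim):
                sums[idx][d] += f[d]
    return [[s / c for s in row] for row, c in zip(sums, counts)]

subsets, counts = subsample_until_covered(num_points=100, subset_size=20)
```

Because the loop stops only once the least-covered point reaches the threshold, most points end up in more than 10 subsamples.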
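To illustrate why refining edge crossings helps, the sketch below locates the 0.5 crossing of an occupancy function along a single grid edge by repeated bisection. Bisection is used here as an assumed stand-in for the edge super-sampling of the POCO marching cubes variant, and `sphere_occupancy` is a made-up test field.

```python
import math

# A full 257^3 grid has about 17 million query points, which motivates
# evaluating only the cells near the surface.
num_grid_points = 257 ** 3

def refine_crossing(occupancy, a, b, level=0.5, steps=8):
    """Bisect the edge a-b to approximate where occupancy crosses the level."""
    inside_a = occupancy(a) > level
    if inside_a == (occupancy(b) > level):
        raise ValueError("edge does not cross the level set")
    for _ in range(steps):
        mid = tuple((x + y) / 2.0 for x, y in zip(a, b))
        if (occupancy(mid) > level) == inside_a:
            a = mid  # Crossing lies between mid and b.
        else:
            b = mid  # Crossing lies between a and mid.
    return tuple((x + y) / 2.0 for x, y in zip(a, b))

def sphere_occupancy(p, radius=0.3):
    """Made-up occupancy field: 1 inside a sphere around the origin, 0 outside."""
    return 1.0 if math.hypot(*p) < radius else 0.0

crossing = refine_crossing(sphere_occupancy, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

Each bisection step halves the uncertainty along the edge at the cost of one extra network evaluation.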

Results
We evaluate PPSURF by comparing our surface reconstruction performance to several state-of-the-art methods, both data-driven and non-data-driven. We show both quantitative and qualitative comparisons in Section 4.1. Additionally, we provide an ablation to empirically validate our main design choices in Section 4.2.
Metrics. We use three well-known metrics to evaluate the error of our reconstructed surfaces: the Chamfer distance, the F1 score, and the normal error. We evaluate each metric at 100k random surface samples for the Chamfer distance and normal error, or volume samples for the IoU. This results in roughly ±0.5% variance between different runs.
The Chamfer distance [BTBW77, FSG17] measures the distance between two point sets. We use it to measure the distance between reconstructed and GT surface samples. It is defined as:
where $A$ and $B$ are point sets of size 100k sampled on the surface of the GT object and the reconstructed object.
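As an illustration, one common variant of the Chamfer distance can be computed as below. The choice of variant is an assumption: this sketch sums the two mean nearest-neighbor Euclidean distances, while other definitions use squared distances or average the two directions. The brute-force search is only practical for small point sets.

```python
import math

def one_sided_distance(src, dst):
    """Mean distance from each point in src to its nearest neighbor in dst."""
    return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)

def chamfer_distance(a_pts, b_pts):
    """Symmetric Chamfer distance (assumed variant: sum of both one-sided means)."""
    return one_sided_distance(a_pts, b_pts) + one_sided_distance(b_pts, a_pts)

a_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
dist = chamfer_distance(a_pts, b_pts)
```

For point sets of size 100k, the quadratic nearest-neighbor search would typically be replaced with a k-d tree query.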
The F1 score [TH15] measures the overlap between the ground truth surface and the region enclosed by the reconstructed surface, similar to the IoU. It weights precision and recall equally.
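A minimal sketch of the F1 computation follows, assuming that each sample carries a boolean inside/outside label for the prediction and for the ground truth. How the samples are drawn is outside the scope of this example.

```python
def f1_score(pred_inside, gt_inside):
    """Harmonic mean of precision and recall for boolean inside/outside labels."""
    tp = sum(1 for p, g in zip(pred_inside, gt_inside) if p and g)
    fp = sum(1 for p, g in zip(pred_inside, gt_inside) if p and not g)
    fn = sum(1 for p, g in zip(pred_inside, gt_inside) if not p and g)
    if tp == 0:
        return 0.0  # No overlap: avoids division by zero.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

score = f1_score([True, True, False, False], [True, False, True, False])
```

In this toy example one of two predicted-inside samples is correct and one of two true-inside samples is found, so precision, recall and F1 are all 0.5.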
The normal error measures the difference between the normals of the reconstructed surface and the ground truth normals. We sample 100k points uniformly on the ground truth mesh $A$ and the reconstructed mesh $B$, storing the normals of their originating faces. Then, we find the closest neighbor of each point $b \in B$ in $A$. We report the average angle between the normals of these point pairs.
All synthetic point clouds were created with the simulated scanner BlenSor [GKUP11] with a scanner resolution of 176 × 144, using a random number of scans between 5 and 30. Each dataset comes in up to five variants:
• no noise: A version without noise.
• med. noise: A version with noise using a standard deviation of $0.01L$, where $L$ is the largest side of the object's bounding box.
• high noise: A version with noise using a standard deviation of $0.05L$.
• var. noise: A version with variable noise, where the amount of noise used for a given shape is sampled uniformly in $[0, 0.05L]$ and the number of scans in $[5, 30]$.
• sparse: A version with medium noise where all shapes use only 5 scans, resulting in point clouds between 2k and 22k points.
• dense: A version with medium noise where all shapes use 30 scans, resulting in point clouds between 5k and 112k points.
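The noise levels above are defined relative to the bounding box of each object. The sketch below shows how such a relative standard deviation translates to absolute units. It assumes isotropic per-coordinate Gaussian noise, which is a simplification: BlenSor simulates the scanning process, so its noise model differs from this illustration.

```python
import random

# Relative noise levels taken from the dataset variants (fractions of L).
NOISE_LEVELS = {"no noise": 0.0, "med. noise": 0.01, "high noise": 0.05}

def longest_bbox_side(points):
    """Length L of the largest side of the axis-aligned bounding box."""
    dims = range(len(points[0]))
    return max(max(p[d] for p in points) - min(p[d] for p in points) for d in dims)

def add_gaussian_noise(points, rel_sigma, seed=0):
    """Perturb every coordinate with Gaussian noise of std rel_sigma * L (assumed model)."""
    rng = random.Random(seed)
    sigma = rel_sigma * longest_bbox_side(points)
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]

def sample_variable_noise(rng):
    """Draw a per-shape relative noise level uniformly from [0, 0.05]."""
    return rng.uniform(0.0, NOISE_LEVELS["high noise"])

points = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.5)]
noisy = add_gaussian_noise(points, NOISE_LEVELS["med. noise"])
```

Scaling by $L$ keeps the noise level comparable across objects of different size.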
For a fair comparison, we train all data-driven methods on the ABC var.noise dataset and evaluate them with each test set.Some point cloud examples of these data sets are illustrated in Figure 3.

Comparisons
We compare PPSURF to several recent data-driven and non-data-driven reconstruction methods. PGR [LXSW22], Neural-IMLS (IMLS) [WWW*23] and Shape as Points (SAP-O) are non-data-driven methods that do not train on a large dataset and instead directly fit a surface to the input point cloud. Shape as Points also has a data-driven variant (SAP) that uses a trained network. Additionally, we use Points2Surf (P2S) [EGO*20] and POCO [BM22] as data-driven methods. We took the best available variants and settings for each method: For PGR, we use the default parameters wmin=0.0015, alpha=1.05 for no noise, med. noise and var. noise. We use the following adapted parameters for the other datasets: wmin=0.03, alpha=2.0 for high noise, and wmin=0.03, alpha=1.5 for dense and sparse. We use thingi-noisy for SAP-O, vanilla for P2S, and 10k-FKAConv-InterpAttentionKHeadsNet for POCO. We used the provided noise-large configuration for SAP. For IMLS, we used the results provided by the authors (high noise datasets were not provided by the authors). Note that IMLS was developed concurrently with our work.
Qualitative Comparison. Figure 4 shows comparisons for one example of each dataset variant. While non-data-driven methods give competitive results on low-noise point clouds, PPSURF has a clear advantage with sparse and noisy point clouds.
We show examples on real-world point clouds in Figure 5, where PPSURF produces clearer edges and finer details.
Quantitative Comparison. Table 1 shows the performance of PPSURF on all dataset variants. We report the average over all shapes in the test set. Similar to the qualitative results, POCO, PPSURF and the non-data-driven methods share the first place in most low-noise dataset variants, but PPSURF 50NN takes the lead in almost all other dataset variants. This confirms that adding the local branch does indeed improve the local reconstruction.

Computation Time and Memory Consumption
Training PPSURF on the ABC var-noise training set took 5 hours on 4 NVIDIA A40 GPUs and 48 AMD EPYC-Milan cores. We reconstruct all shapes in our test sets on a single A40 and 48 CPU cores. See the timings and memory consumption in Table 2. While non-data-driven methods tend to be faster than data-driven ones, SAP is a lightning-fast exception. PPSURF with small patch sizes has a negligible impact on resources compared to POCO. Neural IMLS does not report timings. As it is concurrent work, we could not do our own measurements. While it is fast, PGR's memory usage varies a lot with point cloud size, from a few GB to running out of memory with >46 GB on 21 shapes.
Discussion. For dense and noise-free point clouds, non-data-driven methods such as PGR, SAP-O and especially IMLS are a good option. However, their performance is limited in the presence of typical point-cloud artifacts, due to missing data-driven priors. Data-driven methods such as SAP, P2S, POCO and PPSURF can better deal with such artifacts. SAP is the fastest method but lacks accuracy, possibly due to its very small network. A bigger version could perhaps produce competitive results but would require non-trivial changes to the method. P2S employs a relatively simple PointNet for global shape encoding, which results in a weak global prior that cannot reach the quality of a more efficient encoder such as FKAConv. Furthermore, it reconstructs noisy surfaces, which is reflected in the relatively high normal error, even with noise-free inputs.
Apart from some noise-free datasets, only POCO is close to PPSURF's quality. PPSURF achieves similar results on low-noise point clouds, but significantly better reconstructions for noisy point clouds. When predicting the occupancy at the query points, POCO has no direct access to the full point cloud, only to a coarse latent representation. This inability to accurately represent local information is likely the reason why POCO tends to produce blobby structures and over-smooth the reconstructed surfaces. We avoid this by providing a latent code that captures local detail more accurately by adding a local branch that directly encodes dense local patches of the point cloud.

Ablation
We investigate several design choices in an ablation study on the ABC var-noise test set. Most importantly, Table 3 shows that having both global and local branches gives a major advantage. Referring to Table 4, the optimal local patch size lies in the range of 25NN to 100NN. Further, attention is a better symmetric operation than max, and concatenating features is similar to summing them. This can be seen in Table 5. Please see the supplementary for an evaluation of the most relevant variants on all datasets. We compare the following variants of our method:
• Full is the full method as described in Section 3.
• For Only Local, we set the global features to zeros, disabling this branch. Based on the results of this experiment, we conclude that this model cannot reliably encode any surface since it lacks global knowledge of the surface to reconstruct.
• Only Global is similar to POCO as it omits the local branch. The results show that a global prior can help to obtain reliable reconstructions but with lower performance due to the missing fine details.
• In Merge Cat, we concatenate the features of both branches instead of summing them, which leads to twice the input size for the final MLP. Results show that this is slightly worse than Full.
• The QPoints variant is the same as Merge Cat, but additionally, we concatenate query point coordinates to the input of the learned weighting function $f_{lv}$. However, this results in a slightly worse performance than Full and even Merge Cat.
• For the xNN variants, we take the x nearest neighbors for the local subsample. Full is equal to 50NN.

Limitations
Reconstruction times are still non-interactive, due to the need to evaluate the occupancy at a large number of samples. Possibilities for speed-ups include more efficient sampling strategies to use fewer query points.
As our learned priors were trained on noisy data to make PPSURF more robust to noise, they also bias the reconstructed surface to some extent towards the distributions learned by the priors. This results in some loss of accuracy when applied to noise-free point clouds, compared to the non-data-driven methods (see Figure 6). Learning a prior that is specialized to noise-free point clouds, or including more noise-free point clouds in our training set would alleviate this issue.
While PPSURF is better than the baselines at filling scan shadows, it is not a generative method and cannot generate new geometric detail in large missing regions. This limits the size of missing regions that can be filled with plausible geometry. Combining PPSURF with a generative model would be an interesting direction for future work. See Figure 7 for an example of inaccurately filled scan shadows.

Conclusion
In this paper, we have introduced PPSURF as a method for surface reconstruction from raw, unoriented point clouds. In contrast to previous methods, PPSURF incorporates strong local and global priors learned from data. Whilst our global prior is based on a point convolutional neural network that processes the point cloud as a whole, fine details are preserved through the local prior based on dense local point cloud patches. We have shown in extensive studies that PPSURF is able to achieve better surface reconstructions than previous data-driven and non-data-driven methods, being more robust to noise in the input point cloud and preserving fine details at the same time. In the future, we would like to investigate how modern techniques borrowed from generative models could improve the obtained reconstruction from sparse point clouds where large parts of the shape are missing.
[YWOSH20] and Zhou et al. [ZML*22] proposed to iteratively augment the input point cloud with points sampled from the optimized SDF in the previous iteration. A different approach was proposed by Peng et al. [PJL*21] (also used in LION [ZVW*22]), based on a differentiable Poisson Surface Reconstruction operation that could be used for optimization-based or learned reconstructions. Differently from previous methods, the set of points in the surface is optimized through the differentiable reconstruction instead of a neural network representing the SDF. Lin et al. proposed a parametric Gauss formula for reconstruction [LXSW22], which has quadratic complexity in memory, leading to prohibitive costs for larger point clouds. VIPSS by Huang et al. [HCJ19] formulates reconstruction as a constrained quadratic optimization problem. iPSR by Hou et al. [HWW*22] uses an iterative approach to Poisson reconstruction that progressively improves the surface, while removing the need to be given point normals. IsoPoisson by Xiao et al. [XSL*23] incorporates an isovalue constraint into the Poisson equation, which helps with consistent normal orientation and consequently improves reconstruction.
[CAPM20] and Peng et al. [PNM*20] proposed a 3D CNN encoder-decoder network to encode the sparse or noisy point cloud to later predict the SDF for an arbitrary point around the surface. Chibane et al. [CMPM20] extended this work to predict an unsigned distance field, which allowed them to represent complex open surfaces. Tang et al. [TLX*21] extended the work of Peng et al. [PNM*20] to include test-time optimization to improve out-of-distribution point clouds. Ummenhofer and Koltun [UK21]
Erler et al. [EGO*20] proposed to explicitly model global and local priors directly from point clouds using two different branches. Each branch used a PointNet [QSMG17] architecture, to process the local patch around the query point in the local branch, and a point cloud representing the complete shape in the global branch.

Figure 2: PPSURF computes the occupancy probability at a query point $x$ given a noisy point cloud $P$. A global branch processes a sparse subset $P' \subseteq P$ using point convolutions, followed by an attention-based interpolation to get features at $x$ that capture the coarse shape of the point cloud. A local branch processes a local patch $P_x \subset P$ using a PointNet [QSMG17] with attention-based aggregation to get features at $x$ that capture the detailed shape of the point cloud near $x$. Global and local features are aggregated to compute the occupancy probability at $x$.
We train our network with a binary cross-entropy loss $\mathrm{BCE}(o(x), o_{gt}(x))$ supervised by the ground-truth occupancy $o_{gt}(x)$ on query points defined by the Points2Surf ABC var-noise training set [EGO*20]. We train with AdamW (lr=0.001, betas=(0.9, 0.999), eps=1e-5, weight_decay=1e-2, amsgrad=False) for 150 epochs with scheduler steps at 75 and 125 epochs. On our training machine, we can fully utilize all 4 NVIDIA A40 GPUs with distributed data-parallel training using a total batch size of 50 and 48 workers. The other hyperparameters are mostly based on POCO, namely 10k manifold points, a network decoder k of 64 and 2 output classes. One change is the increased latent size of 128, which was 32 in POCO. The additional hyperparameters for the local branch are a PointNet latent size of 256 and a patch size of 50.
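The loss and the learning-rate schedule can be illustrated with a small, framework-free Python sketch. The decay factor `gamma=0.1` is an assumption, since only the step epochs are stated, and the clamp `eps` in `bce` is a numerical safeguard unrelated to the optimizer's eps.

```python
import math

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy between a predicted probability and a 0/1 target."""
    p = min(max(pred, eps), 1.0 - eps)  # Clamp to avoid log(0).
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def lr_at_epoch(epoch, base_lr=0.001, milestones=(75, 125), gamma=0.1):
    """Step schedule: multiply the rate by gamma at each milestone (gamma is assumed)."""
    return base_lr * gamma ** sum(epoch >= m for m in milestones)

# Learning rate at the start of each of the 150 training epochs.
schedule = [lr_at_epoch(e) for e in range(150)]
loss_uncertain = bce(0.5, 1.0)
```

An uncertain prediction of 0.5 yields a loss of ln 2 regardless of the target, and the loss approaches zero as the prediction approaches the ground-truth occupancy.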

Figure 3: Point cloud examples of the data sets used in our evaluation.

Figure 4: Qualitative comparison to all baselines. We evaluate one example from each dataset variant (except for the no-noise variants, where we only show one example due to space constraints). Colors show the distance of the reconstructed surface to the ground-truth surface. Due to our combined local and global branches, PPSURF reconstructs details more accurately than the baselines, especially in the presence of strong input noise. Note that results for Neural IMLS are not provided by the authors for the high-noise dataset variants. See the supplementary material for a qualitative comparison on all shapes in our test sets.

Figure 6: Limitations. Our method has difficulties recovering the edges of clean point clouds due to training with noisy point clouds.

Figure 7: Limitations. Our method struggles with reconstructions of large missing areas in the input point cloud since we did not incorporate any generative model capabilities.
• In Sym Max, we replace the attention-based interpolation used in the local branch with the max, effectively making this branch a PointNet [QSMG17]. The results show an advantage for attention.

Table 1: Comparison of reconstruction errors. We show the Chamfer distance, F1 score and normal error between reconstructed and ground-truth surfaces averaged over all shapes in a dataset. Apart from a few noise-free datasets, PPSURF consistently performs similar to or better than the baselines. Note that the mean performance of Neural IMLS does not include results of the high noise datasets, which are likely to favour PPSURF. Due to out-of-memory errors, PGR could not reconstruct all shapes, which are ignored here. Best results per row are marked in bold and the second-best results are underlined.

Table 2: Comparison of reconstruction times and memory usage. We show the mean reconstruction time per shape and the maximum GPU-memory consumption for each method on the ABC var-noise dataset. 200NN uses reconstruction batch size 25k instead of 50k. PGR went out of memory on 21 shapes.

Table 3: Branch Ablation Study. Using the ABC var-noise test set, we compare PPSURF Full to variants with disabled branches. The only-local variant failed to produce some meshes, which are ignored in the metrics. The best results per column are marked in bold.

Table 4: Patch Size Ablation Study. Using the ABC var-noise test set, we compare PPSURF Full (which is 50NN) to variants with different patch sizes. The best results per column are marked in bold.

Table 5: Miscellaneous Ablation Study. Using the ABC var-noise test set, we compare PPSURF Full (which uses Merge Sum and Sym Att) to more variants. The best results per column are marked in bold.