Enumerating Calabi-Yau Manifolds: Placing bounds on the number of diffeomorphism classes in the Kreuzer-Skarke list

The diffeomorphism class of simply-connected smooth Calabi-Yau threefolds with torsion-free cohomology is determined via certain basic topological invariants: the Hodge numbers, the triple intersection form, and the second Chern class. In the present paper, we shed some light on this classification by placing bounds on the number of diffeomorphism classes present in the set of smooth Calabi-Yau threefolds constructed from the Kreuzer-Skarke list of reflexive polytopes up to Picard number six. The main difficulty arises from the comparison of triple intersection numbers and divisor integrals of the second Chern class up to basis transformations. By using certain basis-independent invariants, some of which appear here for the first time, we are able to place lower bounds on the number of classes. Upper bounds are obtained by explicitly identifying basis transformations, using constraints related to the index of line bundles. Extrapolating our results, we conjecture that the favourable entries of the Kreuzer-Skarke list of reflexive polytopes lead to some $10^{400}$ diffeomorphically distinct Calabi-Yau threefolds.


I. INTRODUCTION AND SUMMARY
The purpose of this paper is to distinguish the diffeomorphism classes of smooth, simply connected Calabi-Yau threefolds, defined as compact Kähler threefolds $X$ with trivial canonical bundle and vanishing $H^1(X, \mathcal{O}_X)$. We assume that certain basic topological invariants are known, namely the Hodge numbers, the symmetric trilinear intersection form, and the second Chern class, given explicitly relative to an integral basis of $H^2(X, \mathbb{Z})$. The key problem addressed here is to decide when two sets of intersection forms and second Chern classes are the same up to a basis transformation.
The pair of Hodge numbers $(h^{1,1}(X), h^{2,1}(X))$ specifies what is known in the literature as the topological type of a Calabi-Yau threefold $X$. Certainly, if two Calabi-Yau threefolds are homeomorphic to each other, their topological types must agree. The converse, however, is not true. Two Calabi-Yau threefolds of the same topological type may differ with respect to other invariants, such as the trilinear form and second Chern class. For simply connected manifolds with torsion-free cohomology, Wall's theorem [1] states that the isomorphism class of the system of invariants mentioned above, including the Hodge numbers, specifies uniquely the diffeomorphism type. Two non-singular diffeomorphic Calabi-Yau threefolds are also deformation equivalent [2,3], that is, their homotopy types agree, provided that the diffeomorphism types agree; however, this fails to hold in general [4].
The question addressed in this paper is of importance to both pure mathematics and physics. Since CY manifolds form one of the building blocks in the classification of algebraic varieties up to birational isomorphisms, their classification is an important and open problem in algebraic geometry. From a differential geometric perspective, the question of classifying CY manifolds up to diffeomorphisms is natural. In dimension one, all CY manifolds, that is, genus-one curves, are diffeomorphic to each other. The same is true in dimension two: all CY2 surfaces, a.k.a. K3 surfaces, are diffeomorphic as smooth 4-manifolds. In dimension three, the picture is much more diverse and, to a large extent, unresolved.
For instance, it is not known whether the number of distinct topological types is finite, except in the case of elliptically-fibered CY threefolds, for which the answer is positive [5,6]. It is also not known what kind of trilinear intersection forms can occur on CY threefolds, unlike the case of K3 surfaces, where the possible intersection forms are classified by the even self-dual lattices of signature (3,19). Another interesting question is whether the number of distinct Hodge pairs that can arise for a given isomorphism class of triple intersection numbers and divisor integrals of the second Chern class is bounded or not. It turns out that this is true for CY threefolds containing no rigid non-movable surfaces [7]; however, it is unclear whether this statement generalises.
In physics, the classification of CY threefolds is closely related to the classification of supergravity theories derived from string theory or M-theory. For instance, the low-energy limit of M-theory compactified on a CY threefold $X$ is a five-dimensional $N = 1$ supergravity theory with $h^{1,1}(X) - 1$ vector multiplets and $h^{2,1}(X) + 1$ hypermultiplets. The triple intersection numbers of $X$ determine the field-space metric for the vector multiplets and certain Chern-Simons couplings, while the integrated second Chern class fixes certain higher-curvature terms. A recent research avenue in this context, known as the swampland program [8][9][10], aims to study certain classes of field theories derived from string and M-theory in order to extrapolate their common features to general statements about field theories that can be embedded into quantum gravity in the ultraviolet. The triple intersection numbers also arise in the computation of A-model Yukawa couplings, given by an infinite series that collects contributions from holomorphic maps of all possible degrees from genus-0 curves into $X$. Such infinite series correspond to the 'quantum intersection numbers' of the manifold, which reduce to their classical counterparts in the limit when $X$ has large radius.
Many constructions of CY threefolds have been developed both in algebraic and complex geometry. The largest set of known examples arises from Batyrev's construction in the context of mirror symmetry [11]. Given a four-dimensional reflexive polytope $\Delta$, the construction associates with $\Delta$ a Gorenstein toric Fano variety and a generic anticanonical section therein, whose crepant resolution produces a CY threefold $X$. The Hodge numbers of $X$ are determined by the combinatorial data of the polytope, while the triple intersection numbers and the divisor integrals of the second Chern class are computed from the triangulation of $\Delta$ associated with the crepant resolution. The classification of four-dimensional reflexive polytopes undertaken by Kreuzer and Skarke [12] includes 473,800,776 polytope isomorphism classes, with 30,108 distinct pairs of Hodge numbers. The large degeneracy of the Hodge numbers can be partly understood as a consequence of the K3 fibration structures that abound among the Kreuzer-Skarke (KS) manifolds. A distinctive feature of these fibrations is that the K3 polyhedron is contained in the four-dimensional polytope as a slice, dividing it into two parts, a top and a bottom. The Hodge numbers satisfy an additivity relation with respect to the operation of assembling reflexive polytopes from tops [13], which is largely the source of the Hodge number degeneracy. Another feature of the manifolds constructed from the KS list is that these populate a region with $h^{1,1}(X) + h^{1,2}(X) \geq 22$. The complementary region $h^{1,1}(X) + h^{1,2}(X) < 22$ is much less populated, and all known examples arise from other constructions [14]. The KS list includes 16 polytopes that lead to non-simply connected manifolds [15]. Throughout this work, these have been removed from the KS list, as in this case Wall's theorem does not apply.
The number of diffeomorphism types present in the set of CY threefolds derivable from the KS list of four-dimensional reflexive polytopes is not known. In principle, this can be very large, since the number of triangulations grows exponentially with $h^{1,1}(X)$ and the KS list includes examples for which $h^{1,1}(X)$ is as large as 491. An upper bound on the number of fine, regular, star triangulations of four-dimensional polytopes was given in Ref. [16] as $1.53 \times 10^{928}$. In the same paper, a weak upper bound on the number of diffeomorphism classes of CY threefolds derivable from the KS list was given as $1.65 \times 10^{428}$. Both of these bounds are dominated by the single polytope with the largest number of points, which is associated to manifolds $X$ with $h^{1,1}(X) = 491$. The number of distinct Hodge number pairs, namely 30,108, gives a weak lower bound. Our aim here will be to provide bounds for the number of diffeomorphism types present in this class of CY threefolds up to Picard number 6. The main difficulty arises from the comparison of triple intersection numbers and divisor integrals of the second Chern class up to basis transformations on the second integral cohomology $H^2(X, \mathbb{Z})$.
To be more specific, we denote the triple intersection form $\kappa$ on $H^2(X, \mathbb{Z})$ by $\kappa(A, B, C) = A \cdot B \cdot C$, where $A, B, C \in H^2(X, \mathbb{Z})$, and the second Chern class of $X$ by $c_2(X)$. Let us consider two smooth, simply connected CY threefolds $X$ and $X'$ with torsion-free cohomology, Hodge numbers $h := h^{1,1}(X)$, $h' := h^{1,1}(X')$ and intersection forms and second Chern classes denoted by $\kappa$, $c_2(X)$ and $\kappa'$, $c_2(X')$, respectively. Further, we introduce integral bases¹ $(D_i)$, where $i = 1, \ldots, h$, of $H^2(X, \mathbb{Z})$, and $(D'_i)$, where $i = 1, \ldots, h'$, of $H^2(X', \mathbb{Z})$. Relative to these bases, we can define the intersection numbers and components of the second Chern class by

$$d_{ijk} = \kappa(D_i, D_j, D_k), \quad c_{2,i} = c_2(X) \cdot D_i, \qquad d'_{ijk} = \kappa'(D'_i, D'_j, D'_k), \quad c'_{2,i} = c_2(X') \cdot D'_i.$$

Then, the two manifolds are diffeomorphic if and only if there exists an invertible integral transformation such that, relative to the given bases,

$$d'_{ijk} = P_i{}^r P_j{}^s P_k{}^t\, d_{rst}, \qquad c'_{2,i} = P_i{}^r\, c_{2,r}, \qquad \text{(I.3)}$$

where $P \in GL(h, \mathbb{Z})$ is an invertible $h \times h$ matrix with integer entries. A somewhat simpler question is to determine whether a solution for $P$ exists over the real numbers. Of course, if the answer turns out to be negative, then the same must be true over the integers and the two manifolds cannot be diffeomorphic to each other. A classification of the triple intersection forms in Picard number two up to real basis transformations has been carried out in Ref. [18], revealing four different classes. Attempting a similar classification in higher Picard number is non-trivial. However, the problem of solving (I.3) for a real matrix $P$ can be turned into an optimisation problem, as discussed in Ref. [19].
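The action of a basis change on the data can be illustrated with a minimal Python sketch. It assumes that (I.3) takes the form $d'_{ijk} = P_i{}^r P_j{}^s P_k{}^t d_{rst}$, $c'_{2,i} = P_i{}^r c_{2,r}$; the function names and the toy $h = 2$ data below are hypothetical and are not taken from the KS list.

```python
from itertools import product

def transform(d, c2, P):
    """Apply d'_{ijk} = P_i^r P_j^s P_k^t d_{rst} and c'_{2,i} = P_i^r c_{2,r}."""
    h = len(c2)
    idx = range(h)
    d_new = [[[sum(P[i][r] * P[j][s] * P[k][t] * d[r][s][t]
                   for r, s, t in product(idx, repeat=3))
               for k in idx] for j in idx] for i in idx]
    c_new = [sum(P[i][r] * c2[r] for r in idx) for i in idx]
    return d_new, c_new

def symmetric_array(h, entries):
    """Build a fully symmetric h x h x h array from {sorted index triple: value}."""
    d = [[[0] * h for _ in range(h)] for _ in range(h)]
    for i, j, k in product(range(h), repeat=3):
        d[i][j][k] = entries.get(tuple(sorted((i, j, k))), 0)
    return d

# Toy data with h = 2 (hypothetical values, chosen only for illustration).
d = symmetric_array(2, {(0, 0, 0): 1, (1, 1, 1): 1})
c2 = [2, 4]
P = [[1, 1], [0, 1]]  # det P = 1, so P lies in GL(2, Z)
d_new, c_new = transform(d, c2, P)
```

The two data sets `(d, c2)` and `(d_new, c_new)` look different entry by entry, although by construction they describe the same forms in two bases; this is the comparison problem in miniature.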
The computational complexity of an exhaustive search for integral transformations satisfying (I.3) with the entries of $P$ in a range $|P_i{}^r| \leq k_{\max}$ grows as $O(k_{\max}^{h^2})$, since $P$ has $h^2$ entries. In Section III, we introduce an alternative search algorithm with computational complexity $O(k_{\max}^{h})$ and, in Section IV, we apply this algorithm to obtain an upper bound on the number of diffeomorphism classes present in the list of KS CY threefolds up to Picard number six. For each $h$, the value of $k_{\max}$ is increased until either the bound stabilises or the computational time becomes impractical (for example, for $h = 6$ we did not attempt to extend the search beyond $k_{\max} = 5$). Even in the cases where the upper bound did not stabilise, the dependence on $k_{\max}$ strongly indicates that the actual number of diffeomorphism classes is very close to the obtained bound.
The existence of an integral transformation satisfying (I.3) implies diffeomorphism equivalence. On the other hand, necessary criteria for equivalence can be derived from certain basis-independent quantities, including the Hodge numbers, the GCD invariants found by Hübsch in Ref. [20], as well as other invariants, to be discussed in Section II. The latter invariants are derived from polynomial invariant theory and, to our knowledge, are discussed here for the first time, at least in the context of CY threefolds. Applied to the list of KS CY threefolds up to Picard number six, the invariants define a number of distinct classes, which leads to a lower bound on the total number of diffeomorphism classes. Moreover, these invariants inform the application of the proposed search algorithm, since the existence of transformations satisfying (I.3) only has to be tested for pairs of manifolds for which all invariants coincide. Clearly, the fewer elements we have in the classes of manifolds with the same invariants, the more efficient the search algorithm is.
A different set of invariants arises from the study of limiting mixed Hodge structures (approximately speaking, these describe the behaviour of the Hodge decomposition of the third cohomology at the boundaries of the complex structure moduli space), as recently pointed out in Ref. [21]. However, we will not make use of these latter invariants in our analysis. Attempts to machine learn certain basis-independent invariants have appeared in Ref. [22].
The bounds on the number of diffeomorphism classes present in the list of KS CY threefolds up to Picard number six are discussed in Section IV, and summarised in Table II and Fig. 3. The lower and upper bounds are close to each other, with a maximal difference of about 5% for Picard number $h = 6$. Moreover, plotted on a logarithmic scale as functions of the Picard number $h$, the bounds align on almost straight lines. Consequently, the number of diffeomorphism classes contained in the favourable entries of the KS CY threefolds list, at least for $h \leq 6$, follows the simple formula

$$n_{\rm diff} = (0.68 \pm 0.02)\, e^{(1.87 \pm 0.01) h}, \qquad \text{(I.4)}$$

which amounts to an approximate scaling factor of 6.5 per unit increment in $h$. By comparison, the number of fine, regular, star triangulations (FRSTs) for $1 \leq h \leq 6$ follows a more precise exponential law:

$$n_{\rm triang} = (0.48698 \pm 0.00005)\, e^{(2.33295 \pm 0.00002) h}, \qquad \text{(I.5)}$$

which amounts to an approximate scaling factor of 10.3 per unit increment in $h$.
It is tempting to extrapolate these formulae beyond $h = 6$. By summing up the contributions from all Picard numbers in the range $1 \leq h \leq 491$, a naïve extrapolation leads to an estimate of $(1.6 \pm 0.1) \times 10^{497}$ FRSTs and a total number of diffeomorphism classes between $10^{396}$ and $10^{401}$ in the (favourable) KS list, both of which fall within the bounds of Ref. [16]. These estimates are dominated by the contributions of the single polytope which leads to manifolds $X$ with $h^{1,1}(X) = 491$.
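The quoted scaling factors and extrapolated totals can be reproduced with a few lines of arithmetic. The sketch below takes the fitted laws to be $n(h) = a\, e^{b h}$ and sums them over $1 \leq h \leq 491$ as a geometric series in log space, since $e^{b \cdot 491}$ overflows double precision; the helper name is hypothetical, and taking the end points of the quoted error bars for the diffeomorphism count is an assumption about how the range was obtained.

```python
import math

def log10_geometric_sum(a, b, h_max):
    """log10 of sum_{h=1}^{h_max} a*exp(b*h), evaluated in log space."""
    # sum = a * exp(b*h_max) * (1 - exp(-b*h_max)) / (1 - exp(-b))
    tail = (1.0 - math.exp(-b * h_max)) / (1.0 - math.exp(-b))
    return (math.log(a) + b * h_max + math.log(tail)) / math.log(10)

# Growth per unit increment in h.
scale_diff = math.exp(1.87)        # roughly 6.5
scale_triang = math.exp(2.33295)   # roughly 10.3

# Extrapolated totals over 1 <= h <= 491, as base-10 logarithms.
log_triang = log10_geometric_sum(0.48698, 2.33295, 491)
log_diff_low = log10_geometric_sum(0.66, 1.86, 491)
log_diff_high = log10_geometric_sum(0.70, 1.88, 491)

print(f"scaling factors: {scale_diff:.2f}, {scale_triang:.2f}")
print(f"log10(FRSTs) ~ {log_triang:.2f}")
print(f"log10(n_diff) between {log_diff_low:.1f} and {log_diff_high:.1f}")
```

Because the series is geometric with a large ratio, the sum exceeds its last term, the $h = 491$ contribution, by only a factor $1/(1 - e^{-b})$, which is about 1.1 to 1.2 for these fits.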

II. A PLETHORA OF INVARIANTS
In this section, we introduce a number of diffeomorphism invariants constructed from the second Chern class and the intersection numbers. These invariants are used in Section IV to place lower bounds on the number of CY threefolds which can be constructed from the KS list. In Section II A, we review the GCD invariants found in Ref. [20] and, in Section II B, we generalise these using the representation theory of $GL(h, \mathbb{Z})$. In addition, there are also polynomial invariants, which we introduce and discuss in Section II C. With the exception of the simplest GCD invariants, to the best of our knowledge, none of the others presented here have been used in the context of CY threefolds. Subsection II D explores the discriminative potential of each of the invariants.
Ideally, one would like to produce a number of invariants that are sufficiently low in computational complexity to be of practical use and at the same time sufficiently powerful to distinguish between diffeomorphically inequivalent CY threefolds. Below, we construct many invariants that almost completely distinguish between the diffeomorphism classes present in the set of KS CY threefolds up to Picard number 6. The discussion is regrettably labyrinthine and, in order to streamline the presentation, we have delegated many of the details to footnotes and appendices.
The classification of the divisor integrals of the second Chern class up to integral basis transformations is trivial, as vectors in the fundamental of $GL(h, \mathbb{Z})$ are classified by the GCD of their entries. In order to gain some appreciation for the difficulty of the problem, let us focus for a moment on the classification of the triple intersection numbers alone, that is, we consider symmetric trilinear forms up to integral basis transformations. Since a general result for trilinear forms is not known, we take a step back and recall what is known about the classification of symmetric bilinear forms. Sylvester's theorem gives the invariants which fully determine equivalence up to $GL(h, \mathbb{R})$ transformations, namely the rank and the signature. By specifying an additional (polynomial) invariant, the determinant, equivalence up to $SL(h, \mathbb{R})$ transformations can be determined. A complete classification of quadratic forms (both definite and indefinite) up to rational equivalence exists, and this makes use of $p$-adic invariants. Restricting further to $SL(h, \mathbb{Z})$ transformations, more invariants are needed, namely the GCDs of the integers specifying the bilinear form in a basis, as well as other more complex invariants, which have a long history in the literature [23]. In dimension two, Gauss' Disquisitiones arithmeticae [24] gives a complete classification of all quadratic forms. In higher dimensions, the 'spinor genus' introduced by Eichler [25] gives a complete classification for indefinite forms and a partial classification for definite forms. There exist algorithms that can completely classify all definite quadratic forms in low dimensions, but these become unworkable [23] beyond dimension 24.
Given these complications, we will not attempt a complete classification of trilinear forms, but rather identify a number of simple and powerful invariants. Unlike for bilinear forms, the number of independent entries in a trilinear form is polynomially larger than the number of entries specifying a basis transformation. This suggests that a large number of invariants can be expected for trilinear forms.

A. GCDs of intersection numbers and divisor integrals of the second Chern class
We begin with a discussion of the simplest GCD invariants identified in Ref. [20]. The starting point is the observation that the GCD of the entries of the symmetric array $d_{ijk}$ is preserved under $GL(h, \mathbb{Z})$ transformations, and likewise the GCD of the integers $c_{2,r}$. This is a simple consequence of the generalised Bézout's identity and the fact that the determinant of the transformation is $\pm 1$ (see Appendix A for more details).
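The argument can be spelled out in two lines. The following is a sketch of the standard divisibility reasoning, assuming the transformation law $d'_{ijk} = P_i{}^r P_j{}^s P_k{}^t d_{rst}$ and writing $g = \gcd(d_{rst})$, $g' = \gcd(d'_{ijk})$ (symbols introduced here for illustration only):

```latex
\begin{align*}
d'_{ijk} &= P_i{}^r P_j{}^s P_k{}^t\, d_{rst}
  &&\Rightarrow\quad g \mid d'_{ijk}\ \text{for all } i,j,k
  &&\Rightarrow\quad g \mid g', \\
d_{rst} &= (P^{-1})_r{}^i (P^{-1})_s{}^j (P^{-1})_t{}^k\, d'_{ijk}
  &&\Rightarrow\quad g' \mid d_{rst}\ \text{for all } r,s,t
  &&\Rightarrow\quad g' \mid g .
\end{align*}
```

The second line uses that $P^{-1}$ has integer entries, which holds precisely because $\det P = \pm 1$. Hence $g = g'$, and the same reasoning applied to $c'_{2,i} = P_i{}^r c_{2,r}$ shows that $\gcd(c_{2,r})$ is preserved.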
In Ref. [20] Hübsch identifies a total of eight GCD invariants. The first four of these depend on either $d_{ijk}$ or $c_{2,r}$ and are given in Eq. (II.1). The remaining four invariants involve tensor products of $d_{ijk}$ and $c_{2,r}$. To simplify notation, it is useful to introduce the symmetric quadrilinear form $b_{ijkl} = c_{2,(i} d_{jkl)}$, in which case the other four invariants are defined in Eq. (II.2). We note that $S_2$ and $S_3$ 'descend' from $S_1$ similarly to how $S_5$, $S_6$ and $S_7$ 'descend' from $S_4$, and this is explained in detail in Appendix A. This by no means exhausts the list of GCD invariants. Two further examples, similar in form to $S_3$ and $S_7$, are given in Eq. (II.3). Yet further examples can be obtained by using tensor products of $d_{ijk}$ and $c_{2,i}$ other than the completely symmetric one, $b_{ijkl}$. For example, a modification of $S_4$ leads to the invariant $S'_4$ given in Eq. (II.4). Examples of this kind fit into a representation-theoretic approach, which we now discuss.

B. Tensor powers and representation theory
The vector $c_{2,i}$ transforms in the fundamental $\mathbf{H}$ of $GL(h, \mathbb{Z})$, and the tensor $d_{ijk}$ transforms in the representation $\mathbf{R} = \mathrm{Sym}^3(\mathbf{H})$. Each case leads to GCD invariants, but further such invariants can be obtained from tensor products of $\mathbf{R}$ and $\mathbf{H}$.
It turns out that symmetric tensor powers of $\mathbf{H}$, which correspond to polynomials in the integers $c_{2,i}$, do not lead to new invariants: the GCDs factorise and, as a result, there is only one independent GCD invariant for vectors in the fundamental (see Appendix A). On the other hand, symmetric tensor powers $\mathrm{Sym}^\delta(\mathbf{R})$, which correspond to polynomials in $d_{ijk}$, do lead to further GCD invariants (note that the entries of the intersection form fill out the full representation $\mathbf{R}$).
Polynomials in both $d_{ijk}$ and $c_{2,i}$ correspond to tensor products which involve both representations $\mathbf{H}$ and $\mathbf{R}$, such as $\mathrm{Sym}^\delta(\mathbf{R} \otimes \mathbf{H})$. In the simplest case of $\delta = 1$, these correspond to the invariants $S_4$ and $S'_4$ above. In general, the analysis is more complicated, as expressions of the form $d_{ijk} c_{2,l}$ do not fill out the entire representation $\mathbf{R} \otimes \mathbf{H}$. As a result, it is not easy to confirm the existence of invariants, as many of the singlets actually evaluate to zero² when expressed in terms of $c_{2,i}$ and $d_{ijk}$. Nonetheless, some nontrivial invariants can be identified.
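We can analyse these invariants in detail using the inclusion of the discrete subgroup³ $SL(h, \mathbb{Z}) \subset SL(h, \mathbb{C})$. For illustration, we list the following branching rules (which can be obtained using computer algebra packages such as LiE [26]) for the case $h = 2$, $\mathbf{H} = \mathbf{2}$, $\mathbf{R} = \mathbf{4}$, focussing on the intersection numbers alone. The first few decompositions for polynomials up to degree $\delta = 4$ are given by:

$$\begin{aligned} \mathbf{R} &= \mathbf{4}, \\ \mathrm{Sym}^2\, \mathbf{R} &= \mathbf{7} \oplus \mathbf{3}, \\ \mathrm{Sym}^3\, \mathbf{R} &= \mathbf{10} \oplus \mathbf{6} \oplus \mathbf{4}, \\ \mathrm{Sym}^4\, \mathbf{R} &= \mathbf{13} \oplus \mathbf{9} \oplus \mathbf{7} \oplus \mathbf{5} \oplus \mathbf{1}. \end{aligned} \qquad \text{(II.5)}$$

Each representation in these decompositions has an associated GCD invariant, which comes with a related, finite family of further GCD invariants derived in a manner analogous to how $S_2$ and $S_3$ are derived from $S_1$. The simplest of these new 'polynomial GCD invariants' is discussed in Appendix A, along with the general algorithm for generating the higher degree cases.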
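For $h = 2$ the branching of $\mathrm{Sym}^\delta(\mathbf{4})$ under $SL(2, \mathbb{C})$ can be cross-checked without LiE by counting weights. The sketch below is an independent check rather than the method used in the paper: it uses the standard fact that the irreducible representation of dimension $n + 1$ occurs with multiplicity $m(n) - m(n + 2)$, where $m(w)$ is the multiplicity of the weight $w$; the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

def sym_power_irreps(delta, weights=(3, 1, -1, -3)):
    """Dimensions of the SL(2) irreps contained in Sym^delta of the
    representation with the given weights (default: the 4 = Sym^3 of the 2)."""
    # Weights of Sym^delta are sums over multisets of delta weights.
    mult = Counter(sum(combo)
                   for combo in combinations_with_replacement(weights, delta))
    dims = []
    for n in range(max(mult), -1, -1):
        # Number of highest-weight vectors of weight n.
        count = mult[n] - mult[n + 2]
        dims.extend([n + 1] * count)
    return dims

for delta in range(1, 5):
    print(delta, sym_power_irreps(delta))
```

The first singlet appears at $\delta = 4$, which is the degree of the Cayley hyperdeterminant discussed in the next subsection.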
We note that singlets in tensor representations, such as the singlet in the last row of Eq. (II.5), lead to 'polynomial invariants' rather than polynomial GCD invariants (since the GCD of one number is simply the absolute value of this number). The representation theory will inform us where to look for these polynomial invariants, which, it turns out, can then always be written by appropriate contractions of $d_{ijk}$, $c_{2,i}$ and the Levi-Civita tensor $\epsilon_{i_1 \cdots i_h}$. We discuss such polynomial invariants and their construction in the following subsection.

C. Polynomial invariants
The singlet in the fourth row of Eq. (II.5) corresponds to the Cayley hyperdeterminant [27] of a trilinear form in two dimensions⁴. This can be written explicitly as $\Delta_{4,2}$, where the subscripts denote the degree and Picard number of the invariant:

$$\Delta_{4,2} = a^2 d^2 - 6abcd - 3b^2 c^2 + 4ac^3 + 4b^3 d, \qquad \text{(II.6)}$$

where $d_{111} = a$, $d_{112} = b$, $d_{122} = c$, $d_{222} = d$. In two dimensions, the ring of invariants of symmetric trilinear forms is generated by the Cayley hyperdeterminant⁵, but the situation is more complicated in higher dimensions.
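As a quick sanity check, the invariance of the hyperdeterminant under integral basis changes can be tested numerically. The sketch below uses the standard Cayley hyperdeterminant of a $2 \times 2 \times 2$ array specialised to symmetric entries, $a^2 d^2 - 6abcd - 3b^2c^2 + 4ac^3 + 4b^3 d$; the overall sign and normalisation convention is an assumption, and the helper names are hypothetical.

```python
from itertools import product

def cayley_hyperdet(a, b, c, d):
    """Cayley hyperdeterminant of the symmetric 2x2x2 array with
    d_111 = a, d_112 = b, d_122 = c, d_222 = d."""
    return a*a*d*d - 6*a*b*c*d - 3*b*b*c*c + 4*a*c**3 + 4*b**3*d

def transform_entries(entries, P):
    """Map (a, b, c, d) under d'_{ijk} = P_i^r P_j^s P_k^t d_{rst}, h = 2."""
    # Entry d_{rst} depends only on how many indices equal the second value.
    def d_old(r, s, t):
        return entries[r + s + t]
    def d_new(i, j, k):
        return sum(P[i][r] * P[j][s] * P[k][t] * d_old(r, s, t)
                   for r, s, t in product((0, 1), repeat=3))
    return (d_new(0, 0, 0), d_new(0, 0, 1), d_new(0, 1, 1), d_new(1, 1, 1))

entries = (2, 1, -1, 3)                      # hypothetical sample data
P = [[2, 1], [1, 1]]                         # det P = +1
Q = [[0, 1], [1, 0]]                         # det Q = -1
before = cayley_hyperdet(*entries)
after_P = cayley_hyperdet(*transform_entries(entries, P))
after_Q = cayley_hyperdet(*transform_entries(entries, Q))
```

Under a general real transformation the hyperdeterminant rescales by $(\det P)^6$, so it is strictly invariant for $\det P = \pm 1$.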
In the three-dimensional case, there are two independent invariants of degrees four and six generating the ring⁶. No method to predict the degree of all invariants is known, although it is relatively easy to establish some constraints (see Appendices B and C for details). We can, however, use the computer algebra package LiE to confirm exactly when they do occur.
The number $N_{\rm gen}(h)$ of algebraically independent polynomial invariants (after taking into account possible syzygies) is bounded by $N_{\rm gen}(h) \leq N_{\rm upper}(h)$, where the number $N_{\rm upper}(h)$ is the naïve count for the number of basis-independent degrees of freedom, given by

$$N_{\rm upper}(h) = \binom{h+2}{3} - (h^2 - 1). \qquad \text{(II.7)}$$

For the cases $h = 1, 2, 3$, we have verified that this bound is saturated, that is, $N_{\rm gen}(h) = N_{\rm upper}(h)$. For $h > 3$, the number $N_{\rm gen}(h)$ is not known, but we expect, without strong evidence, that the bound continues to be saturated.

TABLE I. The maximum possible number of invariants and the lowest degrees of invariants of symmetric trilinear forms for $1 \leq h \leq 7$. Starred degrees have not been evaluated for the KS data, and daggered degrees correspond to polynomials for which we have not constructed closed form expressions. Other invariants are discussed later, but have much higher degrees or involve the second Chern class.

⁴ The Cayley hyperdeterminant is a special case of a very general and powerful invariant, which we term the 'Gelfand hyperdeterminant' [28], related to the degeneracy of the multilinear form associated to a (not necessarily symmetric) array. This point is discussed in Appendix B, but we note that the degree of this polynomial is too high for any practical use beyond dimension 2.

⁵ We note that the sign of the hyperdeterminant and the rank of the array viewed as a map $\mathrm{Sym}^2\, \mathbf{2}^* \to \mathbf{2}$ suffice to distinguish between symmetric three-arrays up to $GL(2, \mathbb{R})$ equivalence. We can connect this classification to the complete classification of [18] in terms of (class, rank): if the hyperdeterminant is zero, the (cl, rk)-type is (1, 1) or (2, 2). If the hyperdeterminant is positive, it is (3, 2), while if the hyperdeterminant is negative, the type is (1, 2). This is because the hyperdeterminant is the discriminant of the curve in $\mathbb{RP}^1$ induced by the cubic form associated to the trilinear form.

⁶ The ring includes the Gelfand hyperdeterminant, which in this case has degree 36.
In Table I, we show the upper bound for the number of independent invariants from Eq. (II.7), and we list the degrees of the simplest invariants for each h, as determined with LiE.
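The naïve count can be tabulated directly. The sketch below assumes that the count is the number of independent entries of a symmetric three-index array, $\binom{h+2}{3}$, minus the dimension $h^2 - 1$ of $SL(h)$; this reading of Eq. (II.7) is an assumption, although it reproduces the values 1, 1, 2 stated in the text for $h = 1, 2, 3$. The function name is hypothetical.

```python
from math import comb

def n_upper(h):
    """Naive count of basis-independent degrees of freedom: independent
    entries of a symmetric h x h x h array minus dim SL(h) = h^2 - 1."""
    return comb(h + 2, 3) - (h * h - 1)

counts = [n_upper(h) for h in range(1, 8)]
print(counts)
```

The count grows cubically with $h$, in line with the earlier remark that a trilinear form has polynomially more independent entries than a basis transformation.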
For an $h$-dimensional trilinear form, there always exists an invariant polynomial with degree $2h$ (see Appendix C for more details), which can be seen to be generically nonvanishing. Such an invariant can, in fact, be realised using the rarely-discussed 'Pascal hyperdeterminant' (PDET), defined in Eq. (II.8).

Graphs and Levi-Civita tensors
We would now like to explain how some of the other invariants indicated in Table I, in particular the quartic invariant for $h = 3$ and the order 10 invariant for $h = 6$, arise. It is clear from Eq. (II.8) that the Pascal hyperdeterminant can be written in terms of Levi-Civita symbols. This feature turns out to be more general, and it is a result from classical invariant theory, known as the first fundamental theorem of invariant theory, that all polynomial invariants of $SL(h, \mathbb{C})$ can be constructed in a similar fashion [30]. We can therefore construct polynomial invariants by forming appropriate contractions between Levi-Civita symbols and the cubic form $d_{ijk}$. One immediate combinatorial requirement for a degree $\delta$ invariant to be non-vanishing is that $3\delta/h$ be an integer, since the $3\delta$ indices carried by the $\delta$ copies of $d_{ijk}$ must be contracted with Levi-Civita symbols carrying $h$ indices each. Such invariants can be represented as bipartite graphs, with edges indicating the contractions between the $3\delta/h$ Levi-Civita tensors and the $\delta$ copies of $d_{ijk}$ (for more details, see Appendix B). We refer to these as contraction graphs. For example, the $h = 3$, $\delta = 4$ invariant $I_{4,3}$, given in Eq. (II.10), is represented by the contraction graph in Fig. 1. The rules for constructing these graphs have to be slightly modified if the second Chern class $c_{2,i}$ is included, in addition to $d_{ijk}$. Evidently, any given Levi-Civita symbol can only connect to one Chern class vector, or else the associated invariant vanishes. A simple method which guarantees a non-vanishing result is to start with a degree $\delta$ invariant in $h - 1$ dimensions for $d_{ijk}$ and then replace the Levi-Civita symbols according to $\epsilon_{i_1 \cdots i_{h-1}} \to \eta_{i_1 \cdots i_{h-1}} = \epsilon_{i_1 \cdots i_{h-1} l}\, c_{2,l}$. This leads to an invariant of bi-degree $(\delta, 3\delta/(h-1))$ in the trilinear form and the second Chern class. For example, to obtain an $h = 4$ invariant which involves $d_{ijk}$ and $c_{2,i}$, we can start with the invariant $I_{4,3}$ for $d_{ijk}$ in Eq. (II.10) and carry out the replacement $\eta_{ijk} = \epsilon_{ijkl}\, c_{2,l}$. This leads to the bi-degree (4, 4) invariant given in Eq. (II.11). Using the same method, we can start with the Pascal hyperdeterminants (II.9) for dimension $h - 1$ and obtain bi-degree $(2h - 2, 6)$ invariants in dimension $h$. Other ways of introducing the Chern class vector are discussed below.
Determinant/rank/signature-type invariants
Although all polynomial invariants can be written using contraction graphs, their evaluation quickly becomes unmanageable in large degree and/or dimension. We now discuss a subclass of polynomial invariants at large degree that can be evaluated using a different approach. This new approach allows us to introduce further new non-polynomial invariants corresponding to the rank and signature of certain linear maps.
We begin the discussion by reconsidering the symmetric quadrilinear form $b_{ijkl}$, introduced in Section II A, as a linear map $b : \mathrm{Sym}^2\, \mathbf{H}^* \to \mathrm{Sym}^2\, \mathbf{H}$. By Sylvester's law of inertia, we have a number of invariants, namely the rank, signature, and determinant⁹ of the matrix $b$. Unfortunately, the rank becomes non-maximal when $h > 3$, and consequently the determinant vanishes. More precisely, in the vast majority (≳ 99%) of cases considered in the list of KS CY manifolds (up to Picard number 6), the signature is zero and the rank is $2h$. It follows that these provide very little discriminative power.
However, we have identified another polynomial invariant, for $h$ even, also built from a linear map. This is much too large to be reasonably constructed using contraction graphs, since it has degree $h \binom{h + (h/2) - 1}{h/2}$, which for $h = 6$ is 336. Instead, we first introduce an auxiliary variable $z_i$ and consider the cubic polynomial $d_{ijk} z_i z_j z_k$. Taking the determinant of the Hessian, we are led to a degree $h$ polynomial in $z_i$, given in Eq. (II.12). By taking appropriate derivatives, we can form an $h$-array $A_{i_1 \cdots i_h}$ that is independent of $z_i$. Now, if $h$ is even, we can view $A_{i_1 \cdots i_h}$ as a linear map $\hat{A} : \mathrm{Sym}^{h/2}\, \mathbf{H}^* \to \mathrm{Sym}^{h/2}\, \mathbf{H}$. As was the case with $b$, the determinant, rank, and signature of $\hat{A}$ are invariants. We call the polynomial invariant coming from the determinant of $\hat{A}$ the 'SymDet' of $d_{ijk}$.
For a given $d_{ijk}$, computing $\mathrm{SymDet}(d_{ijk})$ has a computational complexity of $O\left(\binom{h + (h/2) - 1}{h/2}^3\right)$, given that the second determinant is computed with Gaussian elimination. In practice, however, this is still more accessible than the Pascal hyperdeterminants (for small $h$), precisely because it can be computed using Gaussian elimination.
The SymDet invariant is frequently zero when applied to the KS CY data, presumably because there are many more zeros in an intersection form than there are in an arbitrary symmetric array. In any case, the rank and signature of $\hat{A}$ are still useful as, unlike $b$, its rank and signature take many different values on the KS data.
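The smallest even case, $h = 2$, can be worked out by hand and makes the construction concrete. The sketch below assumes that the Hessian is taken of $f = d_{ijk} z_i z_j z_k$ and that $A_{ij}$ is the plain second derivative of the Hessian determinant, with no additional normalisation; under these assumptions the SymDet reduces to $-1296$ times the Cayley hyperdeterminant, with the same sign-convention caveat as for the latter. The function names are hypothetical.

```python
def hessian_quadratic(a, b, c, d):
    """Coefficients (alpha, beta, gamma) of
    det Hess(f) = alpha*z1^2 + beta*z1*z2 + gamma*z2^2 for the binary cubic
    f = a*z1^3 + 3*b*z1^2*z2 + 3*c*z1*z2^2 + d*z2^3."""
    # Hess(f) = 6 * [[a*z1 + b*z2, b*z1 + c*z2], [b*z1 + c*z2, c*z1 + d*z2]]
    return 36 * (a*c - b*b), 36 * (a*d - b*c), 36 * (b*d - c*c)

def symdet_h2(a, b, c, d):
    """Determinant of A_ij = d^2(det Hess f)/dz_i dz_j for h = 2."""
    alpha, beta, gamma = hessian_quadratic(a, b, c, d)
    A = [[2 * alpha, beta], [beta, 2 * gamma]]
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def cayley_hyperdet(a, b, c, d):
    """Cayley hyperdeterminant of the symmetric 2x2x2 array."""
    return a*a*d*d - 6*a*b*c*d - 3*b*b*c*c + 4*a*c**3 + 4*b**3*d

sample = (2, 1, -1, 3)   # hypothetical (d_111, d_112, d_122, d_222)
print(symdet_h2(*sample), cayley_hyperdet(*sample))
```

The resulting degree is 4, in agreement with $h \binom{h + (h/2) - 1}{h/2} = 2 \cdot 2$ for $h = 2$. In higher even dimensions the matrix $\hat{A}$ has size $\binom{h + (h/2) - 1}{h/2}$ and its determinant would be evaluated by elimination rather than in closed form.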
Finally, there are similar invariants that vanish after some critical value of $h$ (for the same reason as the determinant of $b$) and others that can be written down for $h$ satisfying other conditions. These are discussed in Appendix B. One can also consider the rank of the intersection form, viewed as a map $\mathrm{Sym}^2\, \mathbf{H}^* \to \mathbf{H}$. However, this rank is maximal for all data used in this paper, and so provides no discriminative power.
Having introduced a large number of invariants, we now discuss their applicability and relative power to discriminate diffeomorphism types.
Whilst polynomial invariants necessarily capture a considerable amount of information, GCD invariants are less useful. If we take the GCD of $n$ numbers selected uniformly in some range $[1, N]$, a reasonably elementary argument yields that the probability that the GCD is 1, in the limit of large $N$, is $1/\zeta(n)$ [31]. Using this $\zeta$ function argument as a rough approximation¹⁰, it follows that the GCD invariants become less useful as the dimension of the representation increases.
High-degree polynomial representations can still be relatively small in dimension. For example, in the $h = 4$ case, there is a 4-dimensional irreducible representation in the branching of $\mathrm{Sym}^{11}\, \mathbf{R}$. This has a fair chance (approximately $1 - 1/\zeta(4) \approx 8\%$) of having a non-trivial GCD. Actually finding the explicit form of those polynomials is completely unfeasible, for the reasons mentioned above. Fortunately, all non-singlets have dimension greater than or equal to $h$, and so for higher $h$ even these 'coincidental' low-dimensional irreducible representations should become less powerful. However, we note that the zeta-function calculation is an exceedingly poor approximation, as typical values in the data are far from being uniformly distributed.
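The size of the effect is easy to quantify. The sketch below evaluates $1/\zeta(n)$ from a truncated sum and compares it with an exhaustive count over a small range; the function names, the truncation, and the range $N = 12$ are illustrative choices, and uniformly distributed entries are assumed, which, as noted above, is a poor model for the actual data.

```python
import math
from functools import reduce
from itertools import product

def inverse_zeta(n, terms=100_000):
    """Approximate 1/zeta(n) for an integer n >= 2 by a truncated sum."""
    return 1.0 / sum(k ** -n for k in range(1, terms + 1))

def coprime_fraction(n, N):
    """Exact fraction of n-tuples in [1, N]^n whose overall GCD equals 1."""
    tuples = product(range(1, N + 1), repeat=n)
    hits = sum(1 for t in tuples if reduce(math.gcd, t) == 1)
    return hits / N ** n

for n in (2, 3, 4, 6):
    p_trivial = inverse_zeta(n)
    print(f"n = {n}: P(gcd = 1) ~ {p_trivial:.4f}, "
          f"P(gcd > 1) ~ {1 - p_trivial:.4f}")

empirical = coprime_fraction(4, 12)
```

For $n = 4$ the chance of a non-trivial GCD is about 7.6%, and it falls below 2% by $n = 6$, which is the quantitative content of the statement that GCD invariants lose power as the dimension of the representation grows.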
To exemplify how GCD invariants perform, we consider the Picard number $h = 6$ case. There are 128 distinct Hodge numbers, and GCD invariants suffice to delineate 2092 diffeomorphism classes. All non-GCD invariants delineate 51,330 classes. All invariants combined increase this lower bound to 52,361.
We note that the GCDs are the only invariants used here which guard against $SL(h, \mathbb{Q})$ relations. We do not expect them to provide a total classification. There may well exist other invariants, such as the analogues of the 'spinor genus' for quadratic forms, which would provide more effective protection. These are unfortunately beyond the scope of this paper, but we note that, for example, it is possible that the techniques developed for quadratic forms could be applied to either of the linear maps $b$ or $\hat{A}$ created in the previous subsection.
At each level $h$, then, there is a finite number of polynomial invariants, as well as an infinite number of GCD invariants. All of these are, in general, difficult to construct. None have particularly favourable complexity for evaluation. Computationally, the first (lowest degree) invariants are reasonably tractable¹¹ for individual cases up to $h \simeq 8$. Due to the large number of manifolds descending from the KS list, we avoid evaluating invariants that take ≳ 5 s to compute for $h = 6$.

E. Real equivalence of manifold data
While not of relevance for the problem at hand, we briefly discuss what the situation for invariants would be if we were interested in GL(h, R) equivalence of manifold data. For some other purposes, the real equivalence of the intersection form can be of use [18]. In this case, GCD invariants are no longer useful. The polynomial invariants identified above become relative invariants, transforming with a suitable power of the determinant of the transformation matrix. If this is an even power, the sign of the relative invariant is an invariant.
Furthermore, a suitable ratio of two different relative invariants becomes itself invariant. For example, for the Picard number 3 singlets under the special linear group (the quartic and sextic invariants $I_{4,3}$ and $I_{6,3}$), the ratio $I_{4,3}^3 / I_{6,3}^2$ is an invariant. This now represents the only remaining continuous and basis-independent degree of freedom. As before, ranks and signatures remain invariant.
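To spell out why this particular ratio works, one can use the transformation law quoted in Appendix C, namely that a degree-δ invariant of the 3-tensor picks up a factor $(\det P)^{3\delta/h}$ under P ∈ GL(h, R). For h = 3 and $I_{6,3} \neq 0$ this gives:

```latex
\[
\begin{aligned}
I_{4,3} &\;\longmapsto\; (\det P)^{4}\, I_{4,3}, \qquad
I_{6,3} \;\longmapsto\; (\det P)^{6}\, I_{6,3}, \\[4pt]
\frac{I_{4,3}^{3}}{I_{6,3}^{2}} &\;\longmapsto\;
\frac{(\det P)^{12}\, I_{4,3}^{3}}{(\det P)^{12}\, I_{6,3}^{2}}
= \frac{I_{4,3}^{3}}{I_{6,3}^{2}} .
\end{aligned}
\]
```

Both exponents are even, so the signs of $I_{4,3}$ and $I_{6,3}$ are separately invariant as well, in line with the remark on even powers above.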

III. DIRECT IDENTIFICATION OF BASIS TRANSFORMATIONS
As before, we consider two smooth simply connected CY threefolds X and X′ with torsion-free cohomology. Recall that we have introduced bases $(D_i)$, where i = 1, ..., h, of $H^2(X, \mathbb{Z})$ and $(D'_i)$, where i = 1, ..., h′, of $H^2(X', \mathbb{Z})$, and that the intersection forms and second Chern classes relative to these bases are denoted by $(d_{ijk}, c_{2,i})$ and $(d'_{ijk}, c'_{2,i})$, respectively. If h ≠ h′ or if any of the invariants introduced in the previous section differ, the manifolds X and X′ are clearly not diffeomorphic, so let us instead assume h = h′ and identical values for all invariants. In this case, we have to decide whether a basis transformation P ∈ GL(h, Z) which satisfies Eqs. (I.3) exists. Clearly, this is difficult (except, possibly, for small values of h), so instead we ask the following related and simpler question.
Problem: For a given $k_{\max} > 0$, is there a P ∈ GL(h, Z) with all $|P_i{}^r| \le k_{\max}$ that satisfies Eqs. (I.3)?
In this section, we describe an algorithm for solving this simplified problem. In Section IV we will employ this algorithm to obtain upper bounds on the numbers of diffeomorphism classes in the Kreuzer-Skarke list for all cases with h ≤ 6, by removing equivalences within the classes of manifolds with the same invariants.
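The algorithm in question relies on the observation that $H^2(X, \mathbb{Z})$ is also the group of isomorphism classes of holomorphic line bundles over X (with the tensor product of line bundles as the group operation). We denote the line bundle L with first Chern class $c_1(L) = k^i D_i$ by $O_X(k)$. The Hirzebruch-Riemann-Roch theorem states that its index can be computed from
$$\mathrm{ind}(L) = \frac{1}{6}\, d_{ijk} k^i k^j k^k + \frac{1}{12}\, c_{2,i} k^i . \qquad \text{(III.1)}$$
Suppose two manifolds X and X′ are diffeomorphic via a matrix P ∈ GL(h, Z) which satisfies Eqs. (I.3). Then, for a line bundle $L = O_X(k)$ there exists a unique line bundle $L' = O_{X'}(k')$ such that
$$\mathrm{ind}(L) = \mathrm{ind}(L') . \qquad \text{(III.2)}$$
Moreover, for related line bundles, $k' = (P^{-1})^T k$, it is clear that the two terms in Eq. (III.1) must be invariant separately, so that, defining
$$\chi_d(k) = d_{ijk} k^i k^j k^k , \qquad \chi_c(k) = c_{2,i} k^i , \qquad \text{(III.3)}$$
together with the analogous primed quantities on X′, we have $\chi_d(k) = \chi'_d(k')$ and $\chi_c(k) = \chi'_c(k')$. These conditions are stronger than Eq. (III.2) and, in fact, imply that $\mathrm{ind}(L^{n}) = \mathrm{ind}(L'^{\,n})$ for all n ≥ 1. Moreover, if we have line bundles $L_a = O_X(k_a)$, where a = 1, ..., m, and the corresponding line bundles $L'_a = O_{X'}(k'_a)$, related under the diffeomorphism (so that $k'_a = (P^{-1})^T k_a$), the linear combinations
$$L = \bigotimes_{a=1}^{m} L_a^{n_a} , \qquad L' = \bigotimes_{a=1}^{m} L_a'^{\,n_a} ,$$
where $n_a \in \mathbb{Z}$, are related under the diffeomorphism as well.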
The basic idea of the algorithm, described below, is that the invariance of $\chi_d$ and $\chi_c$ in Eq. (III.3) (and linear combinations preserving this invariance) places strong constraints on viable basis transformations P ∈ GL(h, Z).
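As a concrete check of the line bundle quantities, the following sketch assumes the standard Hirzebruch-Riemann-Roch form on a CY threefold, $\mathrm{ind}(L) = \chi_d(k)/6 + \chi_c(k)/12$ with $\chi_d(k) = d_{ijk} k^i k^j k^k$ and $\chi_c(k) = c_{2,i} k^i$. The normalisation of $\chi_d$ and $\chi_c$ is an assumption here. The example uses the quintic threefold (h = 1, $d_{111} = 5$, $c_2 \cdot D = 50$).

```python
from fractions import Fraction
from itertools import product

def chi_d(d, k):
    """Cubic term d_ijk k^i k^j k^k for a nested-list tensor d."""
    idx = range(len(k))
    return sum(d[i][j][l] * k[i] * k[j] * k[l] for i, j, l in product(idx, repeat=3))

def chi_c(c2, k):
    """Linear term c_{2,i} k^i."""
    return sum(c * x for c, x in zip(c2, k))

def index(d, c2, k):
    """Index of O_X(k) from Hirzebruch-Riemann-Roch on a CY threefold."""
    return Fraction(chi_d(d, k), 6) + Fraction(chi_c(c2, k), 12)

if __name__ == "__main__":
    d_quintic, c2_quintic = [[[5]]], [50]
    for k in (1, 2, -1):
        print(k, index(d_quintic, c2_quintic, [k]))
```

For the quintic, the values for k = 1 and k = 2 reproduce the numbers of linear and quadratic monomials in five variables, which is a convenient sanity check of the normalisation.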

A. The algorithm
We are now ready to present the algorithm which solves the problem stated above, that is, which decides whether a basis transformation P ∈ GL(h, Z) with entries bounded by $|P_i{}^r| \le k_{\max}$ exists.
Step 1: For a 'basis' of line bundles $O_X(k_i)$, where i = 1, ..., h and $c_1(O_X(k_i)) = D_i$, evaluate the quantities $\chi_d(k_i)$ and $\chi_c(k_i)$ from Eq. (III.3).
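Step 2: Evaluate $\chi'_d(k')$ and $\chi'_c(k')$ for all line bundles $O_{X'}(k')$ with $|k'_i| \le k_{\max}$. Then, for each i = 1, ..., h, select from these k′ the subset
$$S_i = \{\, k' \;:\; \chi'_d(k') = \chi_d(k_i),\ \chi'_c(k') = \chi_c(k_i) \,\} .$$
Steps 3 and 4: Combine these subsets successively into sets $S_{1,2}$, $S_{1,2,3}$, ... of tuples of candidates, retaining only those tuples for which the linear combinations $\sum_a n_a k_a$ and $\sum_a n_a k'_a$ also have equal values of $\chi_d$ and $\chi_c$. In practice, checking for a single pair of (sufficiently large) integers $(n_1, n_2)$ is enough.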
Step 5: If repeating Step 4 proceeds to the construction of a non-empty set $S = S_{1,2,\dots,h}$, then S contains a number of potential integral basis transformations, each assembled from an h-tuple $(k'_1, \dots, k'_h) \in S$. The large computational cost of the algorithm is due to the number $(2k_{\max} + 1)^h$ of line bundles within the search box $|k'_i| \le k_{\max}$. For large values of h, this becomes infeasible even for small values of $k_{\max}$.
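The steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (which was written in the Wolfram language). It assumes that Eqs. (I.3) read $d'_{ijk} = P_i{}^r P_j{}^s P_k{}^t d_{rst}$ and $c'_{2,i} = P_i{}^r c_{2,r}$, so that the candidate vectors $k'_i = (P^{-1})^T k_i$ form the rows of $Q = P^{-1}$. It also assumes that the pairwise test uses one fixed pair $(n_1, n_2)$, here defaulted to (2, 3). All function names are hypothetical.

```python
from itertools import product

def tri(d, a, b, c):
    """Trilinear form d_ijk a^i b^j c^k for a nested-list tensor d."""
    idx = range(len(a))
    return sum(d[i][j][l] * a[i] * b[j] * c[l] for i, j, l in product(idx, repeat=3))

def cube(d, k):
    """chi_d(k) = d_ijk k^i k^j k^k."""
    return tri(d, k, k, k)

def lin(c2, a):
    """chi_c(a) = c_{2,i} a^i."""
    return sum(c * x for c, x in zip(c2, a))

def det(M):
    """Integer determinant by Laplace expansion (adequate for small h)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def search(d, c2, dp, c2p, kmax, n=(2, 3)):
    """Return all h-tuples (k'_1, ..., k'_h) in the box that pass Steps 1-5."""
    h = len(c2)
    basis = [tuple(int(i == j) for j in range(h)) for i in range(h)]
    # Step 1: invariants of the basis line bundles on X.
    targets = [(cube(d, e), lin(c2, e)) for e in basis]
    # Step 2: invariants of every line bundle on X' inside the box.
    box = list(product(range(-kmax, kmax + 1), repeat=h))
    colour = {k: (cube(dp, k), lin(c2p, k)) for k in box}
    S = [[k for k in box if colour[k] == t] for t in targets]

    def mix(u, v):
        return tuple(n[0] * x + n[1] * y for x, y in zip(u, v))

    # Steps 3 and 4: grow tuples, testing one linear combination per pair.
    tuples = [(k,) for k in S[0]]
    for i in range(1, h):
        tuples = [tup + (k,) for tup in tuples for k in S[i]
                  if all(cube(d, mix(basis[a], basis[i])) == cube(dp, mix(tup[a], k))
                         for a in range(i))]
    # Step 5: keep unimodular tuples that reproduce (d, c2) exactly.
    idx = range(h)
    return [Q for Q in tuples
            if abs(det(list(Q))) == 1
            and all(lin(c2p, Q[r]) == c2[r] for r in idx)
            and all(tri(dp, Q[r], Q[s], Q[t]) == d[r][s][t]
                    for r, s, t in product(idx, repeat=3))]

if __name__ == "__main__":
    # Trivial h = 1 check with the quintic data on both sides.
    print(search([[[5]]], [50], [[[5]]], [50], kmax=2))
```

The dictionary `colour` has $(2k_{\max}+1)^h$ entries, which makes the cost of Step 2 explicit. Note that in this convention the box bounds the entries of $Q = P^{-1}$.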

B. Other approaches and extensions
Since the above 'classical' algorithm becomes too slow for large values of h, it is worth trading completeness for speed by employing heuristic methods. Such methods, including genetic algorithms, reinforcement learning and quantum annealing, have been used to solve similar Diophantine equations in Refs. [32-41]. These results show that a good fraction of the solutions can be found by checking a tiny sample of the search space, provided a non-trivial number of solutions exists. The fundamental feature of heuristic searches is to rank the possible alternatives at each branching step based on the available information in order to decide which branch to follow. This approach can considerably speed up the identification of solutions. However, in case of an empty search result, the absence of solutions is by no means guaranteed but only established with a certain level of confidence. In the following, we briefly discuss in turn the various heuristic methods we have considered.
Newton-Raphson minimisation. Following Ref. [19], direct Newton-Raphson minimisation of a 'loss' which measures the failure to satisfy Eqs. (I.3) is reasonably successful in finding rational transformations P ∈ GL(h, Q). The method involves the computation of the inverse of an $h^2$-dimensional matrix, which is the main limiting factor. We did not pursue this method further as the result is stochastic (dependent on initialisation), and it produces matrices in GL(h, Q), rather than the required ones in GL(h, Z).
Genetic algorithms. It is possible to search for GL(h, Z) matrices satisfying Eqs. (I.3) using a genetic algorithm. Given the intersection numbers and second Chern classes of two manifolds, the environment consists of h × h integer matrices P with entries restricted as $k_{\min} \le P_i{}^r \le k_{\min} + 2^{n_{\rm bits}} - 1$. Then every entry of P can be represented by $n_{\rm bits}$ bits and the entire matrix by a bit list of length $h^2 n_{\rm bits}$. The fitness function needs to measure the failure to satisfy Eqs. (I.3) as well as incorporate the condition det(P) = ±1, and an obvious choice is a weighted sum of the corresponding penalty terms. Here, $w_1$, $w_2$ and $w_3$ are positive weights that can be adjusted to optimise performance. We have implemented and extensively tested such an algorithm but, it turns out, while its performance is much better than random search, it is inferior to the algorithm described in Section III A.
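The bit encoding and a possible fitness function for the genetic search can be sketched as follows. The decoding follows the entry range stated above. The fitness shown is only one plausible choice, assumed here for illustration: it takes Eqs. (I.3) in the form $d'_{ijk} = P_i{}^r P_j{}^s P_k{}^t d_{rst}$, $c'_{2,i} = P_i{}^r c_{2,r}$ and penalises absolute deviations. The names `decode` and `fitness` are hypothetical.

```python
from itertools import product

def decode(bits, h, n_bits, k_min):
    """Map a bit list of length h*h*n_bits to an h x h integer matrix."""
    assert len(bits) == h * h * n_bits
    entries = []
    for a in range(h * h):
        chunk = bits[a * n_bits:(a + 1) * n_bits]
        entries.append(k_min + int("".join(map(str, chunk)), 2))
    return [entries[r * h:(r + 1) * h] for r in range(h)]

def det(M):
    """Integer determinant by Laplace expansion (adequate for small h)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def fitness(P, d, c2, dp, c2p, w=(1, 1, 1)):
    """Zero for an exact unimodular solution, negative otherwise (assumed form)."""
    idx = range(len(P))
    err_d = sum(abs(dp[i][j][k] - sum(P[i][r] * P[j][s] * P[k][t] * d[r][s][t]
                                      for r, s, t in product(idx, repeat=3)))
                for i, j, k in product(idx, repeat=3))
    err_c = sum(abs(c2p[i] - sum(P[i][r] * c2[r] for r in idx)) for i in idx)
    err_det = abs(abs(det(P)) - 1)
    return -(w[0] * err_d + w[1] * err_c + w[2] * err_det)

if __name__ == "__main__":
    # Two bits per entry, entries in the range -1..2.
    print(decode([1, 0, 1, 0, 0, 1, 1, 0], h=2, n_bits=2, k_min=-1))
```

A genetic algorithm would then evolve a population of such bit lists, selecting on `fitness`. The selection, crossover and mutation steps are omitted here.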
Neural nets. Another approach is to train a neural network to learn GL(h, Z) transformations. We present a possible architecture in Figure 2. The input consists of vectors $k \in \mathbb{Z}^h$ which describe line bundles $O_X(k)$.
The upper branch of the network computes the invariants $\chi_d(k)$ and $\chi_c(k)$ for the manifold X from Eq. (III.3). The lower branch first transforms to $k' = (P^{-1})^T k$, where the matrix $(P^{-1})^T$ represents the weights of the layer, before computing the invariants $\chi'_d(k')$ and $\chi'_c(k')$ for the manifold X′. The final loss layer measures the difference between these invariants. The idea is that, after training this network with a training set {k} of integer vectors, the weights in $(P^{-1})^T$ have settled to a viable transformation matrix. This does indeed work for many cases, but it is, unfortunately, a very time-consuming process, as the network has to be trained for every pair of manifolds. Of course, it naturally leads to real rather than integer transformation matrices.
Figure 2. Neural network that learns the required GL(h, Z) transformation P.
Image recognition. Another approach we have investigated is inspired by image processing techniques. Point-set registration is the process of aligning two point sets which are misaligned by some spatial transformation. Working again with line bundle vectors, we consider two copies of $\mathbb{Z}^h \subset \mathbb{R}^h$ and 'colour' the line bundle vectors k of the first set with the values $(\chi_d(k), \chi_c(k))$ from Eq. (III.3), and the line bundle vectors k′ of the second set with the values $(\chi'_d(k'), \chi'_c(k'))$. The result is two 'coloured' copies of (some finite subset of) $\mathbb{Z}^h$, and we wish to find a point-set registration of these copies consistent with the colouring. We have used a coherent point drift (CPD) point-set registration algorithm [42], which maximises the likelihood of a Gaussian mixture model via an expectation-maximisation algorithm. We modified the code in Ref. [43] to use colours. In the language of Ref. [42], we have adjusted the code to block-diagonalise the alignment probability matrix, and restrict the transformation matrix to be in GL(h, R). The 'noise' parameter, which works by adding a uniform distribution to the Gaussian mixture model, partially accounts for the fact that the finite subsets of the two copies of $\mathbb{Z}^h$ are not mapped precisely to each other under the GL(h, Z) transformation. Tuning of the noise parameter is required, and it is a good idea to restrict to a small number of colours. Somewhat surprisingly, this algorithm performs reasonably well, despite being heuristic: for a given box size, it would typically identify the same transformation matrix as the systematic search algorithm of Section III A. However, whilst the systematic search is more memory-intensive, it is entirely deterministic. For this reason, we did not pursue the image recognition approach further.

IV. APPLICATION TO THE KREUZER-SKARKE LIST
The largest known list of CY threefolds descends from the KS list [12], where the manifolds are represented by hypersurfaces in toric varieties corresponding to reflexive polytopes enumerated by Kreuzer and Skarke. Each of the 473,800,776 reflexive polytopes can be used to generate potentially many topologically inequivalent CY manifolds by triangulation, which corresponds to a choice of particular hypersurface.
Distinct triangulations of the same polytope can correspond to distinct manifolds, but they must have the same Hodge numbers, as that property descends from the polytope. Any two triangulations of different polytopes might be diffeomorphic, but manifolds from the same polytope often seem to have numerically similar or even identical data and so are particularly likely to be equivalent. We use cytools [44] to generate all manifolds corresponding to favourable$^{12}$ polytopes for h ≤ 6, and compute their corresponding data. Note that many of these triangulations have exactly the same data and are therefore clearly isomorphic. It suffices, then, to remove exactly duplicate data, and thereby state the naïve upper bound on the number of manifolds: we call this the number of trivially distinct FRSTs. The results of this section are summarised in Table II and in Figs. 3 and 4.

A. Evaluating invariants
For each value of h, we evaluate some of the invariants identified in Section II; precisely which invariants were used is explained in Table III and Appendix D.
We note that the polynomial invariants become increasingly difficult to compute with larger h. Given the exponential increase in the number of FRSTs at each Picard number, we only computed invariants for h ≤ 6. If we do not compute sufficiently powerful invariants, some equal-invariant classes become impractically large, and we cannot reasonably run a pairwise-comparison upper bound algorithm. Consequently, in this work, we present only the bounds for h ≤ 6.
The lower bounds, provided by these invariants, are presented in Table II.

B. Running the systematic search algorithm
Within each class, we use a union-find method to identify equivalent manifolds, searching for GL(h, Z) transformations using the algorithm described in Section III A. For all h ≤ 6, we iteratively apply the algorithm at increasing values of $k_{\max}$ = 1, 2, 3, ..., up to the maximum given in the last column of Table II. The systematic search algorithm was compiled to the Wolfram virtual machine, and the rest of the program was realised in the Wolfram language in Mathematica, and run on a laptop for h = 1, 2, 3, 4 and an HPC cluster for h = 5, 6. For h > 5, we found it advisable to set a limit of ∼ 1 GB on memory usage (in any given instance of the search algorithm), as in rare cases the sets $S_{1,2,\dots,k}$ (for k < h) become extremely large.$^{13}$
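The union-find bookkeeping can be sketched as below. This is a generic illustration rather than the authors' Wolfram-language code. The callable `equivalent(a, b)` is a hypothetical stand-in for a run of the systematic search between manifolds a and b at a given $k_{\max}$.

```python
class UnionFind:
    """Disjoint-set structure with path halving."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def count_classes(n, equivalent):
    """Upper bound on the number of classes among n manifolds with equal invariants."""
    uf = UnionFind(n)
    for a in range(n):
        for b in range(a + 1, n):
            # Skip the expensive search when a and b are already linked.
            if uf.find(a) != uf.find(b) and equivalent(a, b):
                uf.union(a, b)
    return len({uf.find(x) for x in range(n)})

if __name__ == "__main__":
    # Toy equivalence: indices are 'equivalent' when they agree modulo 3.
    print(count_classes(7, lambda a, b: a % 3 == b % 3))
```

Because a failed search does not prove inequivalence, the count returned this way can only overestimate the true number of classes, which is why it serves as an upper bound.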
The evolution of the upper bound on the number of diffeomorphism classes, upon changing $k_{\max}$, is plotted in Figure 5. We stop increasing $k_{\max}$ either after observing saturation or after the runtime exceeded 48 CPU hours. We see that saturation occurs for $k_{\max} \sim 5$, providing good evidence that most manifold equivalences have been identified in this search. The final result for each h is given in Table II.
Looking at the resulting data, the nature of the equivalence classes arising in the KS list appears difficult to predict. For example, there were exactly 648 occurrences of one particular manifold in Picard number h = 6, from 17 different polytopes. 627 of them have numerically distinct data, but all transition matrices between those 627 were found after searching for basis transformations with entries of up to 5. By contrast, another manifold is realised 373 times with precisely the same numerical data.

C. Exact determination of the number of manifolds for h ≤ 3, and rational equivalence
For h ≤ 3, Mathematica's symbolic solver is able to decide equivalence by directly solving Eqs. (I.3). It follows that, for those cases, we should be able to determine the true number of diffeomorphism classes. We do this by running the union-find algorithm coupled to Mathematica's Reduce function, only demanding that the Hodge numbers should be equal. At Picard numbers h = 1, 2, and 3, we find that there are exactly 4, 27, and 183 classes of simply connected manifolds with data related under GL(h, Z).
$^{13}$ These typically corresponded to cases where many of the integers $d_{ijk}$ and $c_{2,r}$ vanished.
How close can we get to the exact number of 183 diffeomorphism classes for h = 3, using the invariants from Section II? Polynomial invariants and the Hodge numbers only discriminate 150 diffeomorphism types. In fact, $I_{4,3}$, $I_{6,3}$ and det b (all from Section II C) suffice to achieve this lower bound. By using GCD invariants, we are able to increase the bound from 150 to 171. However, our invariants are insufficient to raise the lower bound to the exact value of 183.
Interestingly, if we allow GL(h, Q) equivalence, there are precisely 4, 22, and 150 distinct sets of topological data respectively, and all the GL(h, Q) transformations have determinant ±1. Somewhat surprisingly, this means that the polynomial invariants are still preserved, even though rational transformations need not have determinant ±1 and we should really only trust relative invariants, which exist for h = 3 as $I_4^3 / I_6^2$ and do not exist for h = 1 and 2. This explains the 150 topological types discriminated by polynomial invariants.

D. Extrapolation to the entire KS list
Looking at Figs. 3 and 4, the number of FRSTs, as well as the number of diffeomorphism classes, appears to grow exponentially with h. An exponential fit for h ≤ 6 gives the number of FRSTs as Eq. (I.5) and the number of diffeomorphism classes as Eq. (I.4). We therefore predict $(6.023 \pm 0.002) \times 10^{6}$ FRSTs, with $(3.3 \pm 0.4) \times 10^{5}$ diffeomorphism classes, at Picard number h = 7. By way of comparison, using cytools, we find 5,990,333 favourable FRSTs for h = 7, which is very close to our estimate. Of these, 522,388 are numerically distinct, which is compatible with our prediction of the number of diffeomorphism classes.
A naïve extrapolation of the above fits to the entire (favourable) KS list leads to $(1.6 \pm 0.1) \times 10^{497}$ FRSTs and to between $10^{396}$ and $10^{401}$ diffeomorphism classes. As might be expected, these estimates are dominated by the single polytope which leads to manifolds X with $h^{1,1}(X) = 491$.

E. CY manifold class data
The list of CY manifolds from the KS list, up to h = 6, along with their evaluated invariants can be found on the site specified in Appendix D. These data files also include the explicit maps constructed between different manifolds with equal invariants when available.
Two files are provided for each Picard number $h = h^{1,1}(X)$. The first is a list of the manifolds and invariants, and is contained in the file ManifoldData.zip. The second is a list of equivalences, contained in the file EquivalenceData.zip. The equivalence data is designed to be handled in Mathematica, and comprises three parts. Firstly, we have a list of classes of FRSTs which all have the same invariants and are not linked to each other by GL(h, Z) transformation. Secondly, we have an association taking a pair of manifolds to a list of GL(h, Z) transformation matrices which realise the transformation between the two. Thirdly, we have an association describing the FRSTs with numerically duplicate data, taking a single manifold in the first or second list to a list of other manifolds. For a full explanation, see Appendix D.
Finally, in EquivalenceData.zip, we also include two files which contain the GL(h, Q) relations between manifold data for h = 2, 3. These files use exactly the same formatting as the other equivalence data.

V. CONCLUSIONS
In this work, we have placed bounds on the number of diffeomorphism classes present in the list of CY threefolds of low Picard number derived from the KS list. Wall's theorem asserts that two CY threefolds are diffeomorphic iff their Hodge numbers, intersection forms and second Chern classes coincide. In practice, the intersection form and the second Chern class are specified relative to an integral basis of $H^2(X, \mathbb{Z})$, so checking equality involves finding suitable GL(h, Z) basis transformations, where $h = h^{1,1}(X)$. This can be difficult, particularly for larger h, and, as a result, it is not a priori clear how many inequivalent CY threefolds are contained in the dataset.
We have tackled this problem using three complementary methods: (i) GL(h, Z) invariants constructed from the intersection numbers and the components of the second Chern class; (ii) an algorithm, based on line bundle invariants, to find suitable GL(h, Z) transformations; and (iii) directly solving, for low Picard numbers h ≤ 3, the relevant Eqs. (I.3) for the transformation matrix P ∈ GL(h, Z) using computer algebra methods.
For the first method, we have relied on known GL(h, Z) invariants from the literature [20] as well as on novel invariants constructed here for the first time. These invariants, discussed in detail in Section II, split manifolds with the same Hodge numbers into subclasses, thereby providing lower bounds on the number of diffeomorphism classes in the KS list. For higher Picard number, some of these invariants, specifically the polynomial ones, become too expensive to compute for every manifold in the list. As a result, we were able to determine these lower bounds only up to h = 6.
For the second, computational method, we have used the fact that there must be an identification of line bundles on two diffeomorphic manifolds which leaves certain quantities, related to the line bundle index, invariant. This observation allows searching for suitable GL(h, Z) transformations, using the algorithm described in Section III. This algorithm is applied only to pairs of manifolds that agree at the level of the invariants calculated under method (i). Fortunately, the invariants used were strong enough to partition the data into small enough classes for h ≤ 6. The result is an upper bound for the number of diffeomorphism classes for h ≤ 6.
The results obtained from applying these methods to the set of simply connected CY threefolds from the KS list are described in Section IV and summarised in Table II and Figs. 3 and 4. For h ≤ 3, where computer algebra methods are feasible, the exact number of diffeomorphism classes can be determined. Specifically, for h = 1, 2, 3 we find 4, 27 and 183 diffeomorphism classes, respectively. For h = 4, 5, 6 we find tight lower and upper bounds for the number of classes, using methods (i) and (ii), with the precise numbers given in Table II.
An interesting observation from Fig. 3 is that the logarithms of both the number of FRSTs and the number of diffeomorphism classes depend linearly on h. Boldly extrapolating to h ≤ 491 assuming this dependence, we estimate that a total of $(1.6 \pm 0.1) \times 10^{497}$ FRSTs and a total of between $10^{396}$ and $10^{401}$ diffeomorphism classes can be obtained from the KS list.
There are a number of natural extensions of this work, which we leave for further study. One remaining issue is to find an explanation for the GL(h, Q) transformations discovered between various manifolds. We have no good explanation for the presence of these relations, nor an explanation for why the determinants of these maps are always ±1.
It is clearly desirable to extend our results to higher Picard numbers and to find further invariants which might help to achieve this. On the latter point, we can ask a number of questions. Are there generalisations of the spinor genus to cubic forms? Can one apply spinor genus techniques to the symmetric matrices generated in the course of evaluating invariants in this work? In particular, are there enough, yet unknown, invariants to fully determine the number of diffeomorphism classes? Could these invariants be relevant to questions of boundedness for CY manifolds? It would also be interesting to understand if there is a geometric interpretation for the invariants presented in this work.$^{14}$

ACKNOWLEDGEMENTS
Aditi Chandra was supported by the David Brink fund and Balliol College during the course of this work. Andrei Constantin's research is supported by a Stephen Hawking Fellowship, EPSRC grant EP/T016280/1. Cristofero Fraser-Taliente is supported by the Gould-Watson Scholarship and Lady Margaret Hall. Thomas Harvey is supported by an STFC studentship. The authors would like to thank Steve Abel, Giulio Gambuti, Naomi Gendler,

$^{14}$ For example, the degree 4 and 6 invariants at h = 3 (usually understood in the context of ternary cubic equations, and known as the Aronhold S and T invariants [45]) partially specify the rank and border rank of the associated real ternary cubic. If the degree four invariant vanishes, the polynomial corresponding to the intersection form is a sum of cubes of three independent complex linear forms. If the sextic form is negative, the linear forms are actually real, and if it is positive one is real and two are complex conjugates. In the former case, this means the tensor is 'diagonal' over R: in some (potentially non-integral) basis, there are only three intersections $d_{111}$, $d_{222}$, $d_{333}$. It is possible that this has a deeper meaning for our context. Both cases occur in the list, and intriguingly we find integral basis transformations. As an explicit example, we are able to transform the intersection form of the h = 3 manifold with (polytope #, triangulation #) = (1, 0) to a diagonal 3-tensor with entries (−3, −1, 1).

[...] mentioned in the main text, various different generalisations of the Cayley hyperdeterminant are possible which suggest various strategies for constructing invariants, but at least three of these coincide for 2 × 2 × 2 arrays.$^{15}$ This must happen, as there is only one polynomial invariant. The exact form of the Gelfand hyperdeterminant is only known for two-dimensional 3-tensors (and partially known for three-dimensional 3-tensors [49]). However, it is known to exist for any h × h × h array, with a known degree. The degree of this polynomial grows exponentially with h, taking the following values up to Picard number five: 1, 4, 36, 272, and 2070. Anything beyond h = 3 is unworkable.
We can think of these contraction patterns as bipartite graphs with edges running from the 3δ/h h-valent ϵ-type vertices to the δ trivalent array-type nodes. Due to the symmetries of the 3-tensor and Levi-Civita symbol, the topology of this graph uniquely specifies (up to sign) a contraction pattern and hence an invariant. It is easy to exhaustively construct all such graphs for a given δ, h, and it is particularly easy when there is just one possible contraction pattern (when the bound is saturated). If we attempt to detect the Pascal hyperdeterminant for h = 3, we generate 330 different contraction patterns, of which just four are topologically distinct with approximately equal incidences. Two of those four yield the Pascal hyperdeterminant, and the others yield an invariant identically equal to zero. For h = 2, δ = 4, there are two graphs, leading to one zero invariant and one non-zero. One can then directly evaluate the invariants, which is still practicable up to h = 6. All invariants of low degree mentioned above are constructible with this method, but if the degree gets too high, it becomes unworkable. Nevertheless, sparse array methods in Mathematica, coupled with optimised index contraction paths, ensure that these invariants remain calculable for the Picard numbers considered in this paper.
We include for interest the 6 different graph topologies for h = 3. Of the corresponding invariants, three (shown in Fig. 6) evaluate to zero and three to non-zero (shown in Fig. 7). As before, they are presented in directed and undirected form, and ϵ-type vertices are coloured red. For higher tensors (e.g. 4-tensors), or higher degrees/h, the initial construction of all possible graphs becomes hard. However, as non-zero contraction patterns are not 'rare' in the space of admissible bipartite graphs, it is often sufficient to sample the space of bipartite graphs of fixed valence, rejecting graphs that vanish by symmetry.

Additional polynomial invariants generated from the Hessian-determinant representation
From the Hessian-determinant representation we can construct the four-array $C_{IJKL}$, with indices I valued in $\mathrm{Sym}^{n/4} H$, and then take its PDET. We did not evaluate this invariant as, in doing so, we regain the computational complexity issues associated with taking the PDET for large dimension. Similarly, for h divisible by 3 we could symmetrise appropriately and then take the PDET of a new 3-tensor.

Additional polynomial invariants for small Picard number
Finally, we consider some other invariants which can be constructed but vanish for h larger than 3, 4, or 5 due to a non-maximal generic rank or generalised tensor rank. The first two invariants are algebraically related to the other known polynomial invariants. We include these to indicate that there are some constructions which give invariants but become zero after some point.
1. An invariant which vanishes for h ≥ 3: This coincides with the hyperdeterminant 2 for its one nontrivial case. One can also calculate the rank and signature of B.

Appendix C: Generating polynomial invariants
It is interesting to note that GL(h, Z) is finitely generated [50] by two (or three) matrices S, T, U (depending on dimension). S and T suffice for h even, and all three are required for h odd.$^{16}$ If we find the matrix representations $R^d_S$, $R^d_T$, $R^d_U$ of the generators in the relevant degree-δ polynomial representation, the non-zero simultaneous eigenvectors correspond to singlets. This immediately enables us to find the lowest order invariants for small h. Unfortunately, these matrices become exponentially large.
Firstly, one can consider the 'torus invariants'. Consider the subgroup D of SL(h, R) given by the invertible diagonal matrices. The action of D, known as the maximal torus, maps monomials into monomials. We know that under P ∈ GL(h, R) a degree-δ invariant $I_{\delta,h}$ of the 3-tensor must be mapped to $(\det P)^{3\delta/h} I_{\delta,h}$. It follows that the monomials present in the invariant must be those which transform with a factor of $\prod_{i=1}^{h} t_i^{\,g}$ under D, for g = 3δ/h. This heavily restricts the number of admissible monomials, and when considered using the Lie algebra formalism these 'torus monomials' are precisely those with weight zero. Moreover, each monomial should come with its entire permutation orbit, and thus we can combine multiple torus monomials.
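The torus condition is easy to enumerate for small cases. The sketch below lists the degree-δ monomials in the independent components $d_{ijk}$ (with i ≤ j ≤ k) in which every index value occurs exactly g = 3δ/h times, which is the condition for transforming with the factor $\prod_i t_i^{\,g}$. The function name and the representation of a monomial as a tuple of index triples are illustrative choices.

```python
from itertools import combinations_with_replacement

def torus_monomials(h, delta):
    """Degree-delta monomials in d_ijk (i <= j <= k) with every index used 3*delta/h times."""
    if (3 * delta) % h:
        return []  # 3*delta/h is not an integer, so no invariant can exist
    g = 3 * delta // h
    variables = list(combinations_with_replacement(range(h), 3))
    result = []
    for mono in combinations_with_replacement(variables, delta):
        counts = [0] * h
        for var in mono:
            for idx in var:
                counts[idx] += 1
        if all(c == g for c in counts):
            result.append(mono)
    return result

if __name__ == "__main__":
    for mono in torus_monomials(2, 4):
        print(mono)
```

For h = 2 and δ = 4 this yields five monomials, matching the five monomials that appear in the discriminant of a binary cubic.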
Consideration of the torus-invariant monomials also lets us determine which plethysm degrees could actually support invariants. If invariants transform as $(\det P)^{3\delta/h}$, as they are polynomial representations, we must have 3δ/h an integer, k. Then the only admissible degrees are those δ given by kh/3 for integer k, a condition which is naturally very closely related to the Levi-Civita graph discussion above. As determined above, not all of these admissible degrees actually support invariants. The eigenvector operation eigenvectors($R_S$) can be converted to a nullSpace operation, and we can then project one side of the matrix ($R_S$ − Id) onto the admissible weight-zero monomials. It turns out to be sufficient to consider only the eigenvectors of the S matrix (we can neglect T and U). For actual evaluation of polynomial invariants, it is much faster to directly evaluate them using the PDET/Levi-Civita formulae, rather than substituting into a symbolic expression.

Name                   Explanation       h
"2h-invariant-X-ic"    Eq. (II.9)        2-5
"GCD-dijk"             S1 in Eq.II.

Table III. Names given to each invariant in the data, an explanation of what each name refers to, and the Picard numbers for which the invariant was evaluated. We note that not all obvious or accessible invariants were evaluated, particularly for h = 2, 3 where we had perfect discrimination.
polytope and triangulation numbers. For ease of use, all FRSTs are referred to by a triple of indices: {manifoldIndex, polytope, triangulation} = {m, p, t}. Any invariants appearing in these files (for each h) were used in the determination of the lower bounds in Table II. We list which invariants were calculated in Table III. In this work, we only considered manifolds with SimplyConnected = True.
The next file (contained in EquivalenceData.zip) is composed of three lists and gives the equivalence data. They are formatted as Mathematica lists. The first is a list of classes of FRSTs which all have the same invariants (for the cases h = 2, 3, these classes are all singlets, as GL(h, Z) relations have been explicitly ruled out). Each FRST is listed using our three-index {m, p, t}. The next is an association of linear maps between FRSTs, taking pairs of FRSTs to lists of transformation matrices (along with their determinants). The last is an association describing the FRSTs with numerically duplicate data, taking single 3-indices {m, p, t} to lists of 3-indices which also have the same data. To illustrate this, we describe a simple dataset of five manifolds, where we have searched for transformation matrices with entries ≤ 3. We keep the labelling indices {m, p, t} generic.

Figure 1. The bipartite graph denoting the index pattern for the δ = 4, h = 3 'coincidental' invariant, firstly displayed undirected and then directed. Levi-Civita-type vertices are coloured red. Uniquely, for h = 3 both the Levi-Civita and array vertices are trivalent.

Table II. [...] bounds have been obtained by explicitly identifying basis transformations, using constraints related to the index of line bundles. The bracketed lower bounds denote the exact numbers of classes decided by symbolic solutions in Mathematica. The fourth column indicates the number of FRSTs with trivially distinct topological data. The fifth column indicates the lower bounds given by the Hodge numbers alone. The last column indicates the range of line bundles used in the derivation of the upper bounds.

Figure 3. Plot corresponding to Table II on a logarithmic scale. Here, we plot the lower (from the invariants) and upper (from the systematic search) bounds on the number of distinct manifolds. We also plot the bare numbers of polytopes and triangulations, as well as the number of numerically distinct triangulations (i.e. triangulations with exactly the same topological data).

Figure 4. Average number of distinct manifolds per polytope from Table II.

Figure 5. The solid lines are upper bounds on the number of diffeomorphism classes per distinct triangulation, for h ≤ 6, identified by the algorithm introduced in Section III A as kmax is increased. For h = 2, 3, 4, 5, 6, kmax = ∞, ∞, 15, 12, 5, respectively. Lower bounds for each value of h are indicated with dashed lines in the appropriate colour.

Figure 6. Contraction patterns for degree 6, h = 3 invariants which evaluate to the non-zero Pascal/2h-invariant. Each graph is shown once in directed and once in undirected form. Levi-Civita-type vertices are coloured red.

Figure 7. Contraction patterns for degree 6, h = 3 invariants which evaluate to zero. Each graph is shown once in directed and once in undirected form. Levi-Civita-type vertices are coloured red.
8)(i) , (II.8)
where $a_{i_1,\dots,i_n}$ is a tensor in $D$ dimensions. It is non-zero only for even tensor rank $n$ and it has polynomial degree $D$, much like the standard determinant. One is tempted to discard it, as we are mainly interested in three-index tensors. However, we can consider the six-index tensor obtained by symmetrising $d_{(ijk} d_{lmn)}$ and then evaluate the Pascal hyperdeterminant [29]
$\mathrm{PDET}\big(d_{(ijk} d_{lmn)}\big)$ , (II.9)
which is indeed of degree $2h$ in $d_{ijk}$ and is non-vanishing⁷. Unfortunately, the above formula for the PDET has exponential complexity and, for this reason, we use a recursive algorithm, presented in Ref. [29], to evaluate it⁸. This recursive algorithm has complexity $O(2^{Dn} D^n) = O(2^{6h} h^6)$, as $D = h$ and $n = 2 \times 3$.
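To make the cost of direct evaluation concrete, the following Python sketch evaluates a hyperdeterminant by brute force. It assumes that the formula referred to above is the signed sum over n-tuples of permutations (the combinatorial hyperdeterminant, normalised by 1/D!), which is consistent with the stated properties (vanishing for odd rank, polynomial degree D) but is not restated in full in this section. It is not the recursive algorithm of Ref. [29]. The function names are hypothetical and integer tensor entries are assumed.

```python
from itertools import permutations, product
from math import factorial


def perm_sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    sign = 1
    seen = [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        # Walk the cycle containing `start`; an even-length cycle flips the sign.
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if length % 2 == 0:
            sign = -sign
    return sign


def entry(tensor, index):
    """Look up tensor[i1][i2]...[in] in a nested list."""
    for i in index:
        tensor = tensor[i]
    return tensor


def brute_force_hyperdet(tensor, D, n):
    """Assumed formula: (1/D!) * sum over (s_1..s_n) of prod sgn(s_k) * prod_i a[s_1(i),...,s_n(i)]."""
    perms = [(p, perm_sign(p)) for p in permutations(range(D))]
    total = 0
    for combo in product(perms, repeat=n):  # (D!)^n terms: exponential cost
        term = 1
        for _, s in combo:
            term *= s
        for i in range(D):
            term *= entry(tensor, tuple(p[i] for p, _ in combo))
        total += term
    # For integer entries the sum is divisible by D!, so integer division is exact.
    return total // factorial(D)
```

For n = 2 the sum reduces to the ordinary determinant, which provides a simple sanity check. The number of terms grows as (D!)^n, so for the six-index tensor built from d_{ijk} the direct sum is practical only for very small h.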
For each such P, check if Eqs. (I.3) hold. In general, if a basis transformation is found, it is not unique. If h = 1, Steps 3 and 4 are omitted and if h = 2, Step 4 is omitted. The slowest part of the above algorithm is Step 2. It involves a search over all line bundles O

Table II. Upper and lower bounds on the number of diffeomorphism classes present in the KS list of simply connected CY threefolds up to Picard number 6. Lower bounds have been obtained using the invariants summarised in Table III. Upper