Hamiltonian Analysis of f(Q)$f(\mathbb {Q})$ Gravity and the Failure of the Dirac–Bergmann Algorithm for Teleparallel Theories of Gravity

In recent years, f(Q)$f(\mathbb {Q})$ gravity has enjoyed considerable attention in the literature and important results have been obtained. However, the question of how many physical degrees of freedom the theory propagates—and how this number may depend on the form of the function f—has not been answered satisfactorily. In this article it is shown that a Hamiltonian analysis based on the Dirac‐Bergmann algorithm—one of the standard methods to address this type of question—fails. The source of the failure is isolated and it is shown that other commonly considered teleparallel theories of gravity are affected by the same problem. Furthermore, it is pointed out that the number of degrees of freedom obtained in Phys. Rev. D 106 no. 4, (2022) by K. Hu, T. Katsuragawa, and T. Qui (namely eight), based on the Dirac‐Bergmann algorithm, is wrong. Using a different approach, it is shown that the upper bound on the degrees of freedom is seven. Finally, a more promising strategy for settling this important question is proposed.

Although f (ℚ) gravity has been subject of extensive research, certain aspects of this theory remain elusive.One such issue is the determination of its physical degrees of freedom.This is a matter of crucial importance as it directly affects the theory's cosmological and astrophysical perturbations and reveals any implicit symmetries that may have gone unnoticed.Determining these degrees of freedom was exactly the objective in ref. [23], where the Hamiltonian formulation and subsequently the Dirac-Bergmann procedure was used to analyze the constraints and obtain the number of propagating degrees of freedom.The conclusion was that f (ℚ) gravity exhibits precisely eight distinct physical degrees of freedom.
In our investigation, it is revealed that a basic assumption of the Dirac-Bergmann algorithm is violated by f (ℚ) gravity, as well as other teleparallel gravity theories based on torsion or non-metricity, and that the analysis can therefore not be carried through.The deficiency is demonstrated explicitly in the case of f (ℚ) gravity, and the flaws in the analysis of ref. [23] are brought to attention.Moreover, we use a completely independent method to establish that f (ℚ) propagates at most seven physical degrees of freedom.
The article is organized as follows: In section 2 we review the basics of the Symmetric Teleparallel Equivalent of GR (STEGR).This serves the purpose of introducing basic concepts as well as fixing notations and conventions.Furthermore, we introduce two commonly studied extensions of STEGR: A five-parameter family of quadratic Lagrangians which define Symmetric Teleparallel Gravity (STG) (subsection 2.2) and f (ℚ) gravity (subsection 2.3).Subsection 2.4 briefly touches upon the main features of teleparallel theories of gravity based on torsion.
Section 3 is devoted to reviewing the basics of the Hamiltonian formalism and, in particular, the Dirac-Bergmann algorithm.This algorithm is summarized in subsection 3.1.In subsection 3.2 we show under which conditions the Dirac-Bergmann algorithm is not applicable to field theories.The main part of this article are subsections 3.3, 3.4, and section 4. Subsection 3.3 is devoted to the Dirac-Bergmann analysis of f (ℚ) gravity.We explicitly show how and why the analysis fails and where the work of ref. [23] went wrong.Other teleparallel theories of gravity, based on either torsion or non-metricity, are scrutinized in subsection 3.4, where we argue that almost all teleparallel theories are affected by the same problem which leads to a failure of the Dirac-Bergmann approach.In section 4 we employ an entirely different method to analyze f (ℚ) gravity and we establish that it propagates at least four and at most seven degrees of freedom.
Finally, we draw our conclusions in section 5 and we propose methods to circumvent the limitations of the Dirac-Bergmann algorithm.A complete analysis of f (ℚ) gravity and a determination of its physical degrees of freedom is left for future work.

The Symmetric Teleparallel Equivalent of GR
The STEGR, [2][3][4]24,25] or STEGR for short, is a formulation of GR which is rooted in the mathematical framework of metric-affine geometry. That i, the effects of gravity are described by the triple (, g, Γ), where  is a real, connected, four-dimensional manifold, g is a metric tensor of Lorentzian signature, and Γ is an affine connection.Standard GR can also be described within this mathematical framework.The only difference between GR and STEGR are a different set of geometric postulates, which are imposed to restrict the form of Γ, and a different form of the action.
To make this more precise, we first recall that the purpose of a connection is to define a notion of parallel transport or, equivalently, define a notion of covariant differentiation on .The action of the covariant derivative ∇ on vector fields V and 1-forms  is defined as Notice that it is always necessary to choose an affine connection in order to properly define the action of the covariant derivative operator ∇ on vectors, 1-forms, and ultimately on any type of tensor (density).
Given an affine connection Γ, one can define three types of tensors which characterize metric-affine geometries (, g, Γ).These tensors are Curvature tensor: Torsion tensor: Non-metricity tensor: where the round and square brackets symbolize symmetrization and anti-symmetrization of indices, respectively.Metric-affine geometries can now be classified according to which of these tensors (or which combination of these tensors) is not zero.Riemannian geometry, which constitutes the mathematical framework of GR, emerges as special case of a metric-affine geometry, where the torsion and non-metricity tensors vanish.Observe that this amounts to restrictions on the choice of Γ, since the connection has to satisfy the equations Γ [] = 0 and   g  = 2Γ  ( g ) .As is well-known, these conditions completely determine the connection, leaving us no free choice.In fact, the connection then has to be the Levi-Civita connection, which we denote by {𝛼 } and which is explicitly given by (2.3) STEGR, however, is based on a different set of geometric postulates.Concretely, the connection is postulated to have vanishing curvature and vanishing torsion: These postulates leave the non-metricity tensor Q  as the only non-vanishing 1 tensor and the effects of gravity can be attributed to this tensor.Moreover, these postulates do not completely fix the connection.In fact, one can show [2][3][4]25] that a connection which gives rise to a vanishing curvature tensor necessarily has the form where Λ   are the components of a matrix which belongs to the general linear group GL(4, ℝ), i.e., the set of invertible 4 × 4 matrices.Using (2.5), the postulate of vanishing torsion takes the form This condition is solved by matrices of the form where   denotes four arbitrary 2 functions of the spacetime coordinates, not a vector field!By combining (2.5) and (2.7), one finds that a flat and torsionless connection has the form where x  ∕  should be read as the inverse of the Jacobian matrix     :=   ∕x  .We emphasize that any affine connection of the form (2.8) is guaranteed to be flat and torsionless.Moreover, this form of the connection uncovers a particular feature of gravity theories based on flat and torsionless metric-affine geometries: The connection can be trivialized globally by a particular gauge choice.In fact, by choosing   = M   x  +   0 , where M   is an invertible matrix with constant components and   0 denotes four arbitrary constants, one finds       = M       x  = 0 . (2.9) Thus, if the four functions   are chosen to have the form   = M   x  +   0 , the connection vanishes globally, Γ   = 0.25] We also emphasize that the coincident gauge is a feature common to every theory based on a metric-affine geometry (, g, Γ) with a flat and torsionless connection, independent of what the Lagrangian of the theory might be.
Having established the geometric foundations of symmetric teleparallel theories of gravity, we now turn to the action which defines STEGR and its dynamics.There are actually two action functionals one can consider.The first, which is based on a flat and torsionless connection of the form (2.8), is given by refs.[3,  4] consider non-trivial spacetimes, one needs Q  ≠ 0 when curvature and torsion vanish. 2 The only condition one has to impose on these functions is that the determinant of the matrix     is not zero.That is because matrices belonging to GL(4, ℝ) are invertible and thus have a non-zero determinant.
where g denotes the determinant of the metric,  matter stands for the action of minimally coupled matter fields, and ℚ is the so-called non-metricity scalar, and where are the two independent traces of the non-metricity tensor.Observe that the action (2.10) is a functional of the metric g  and the four functions   .A different action functional, based on the Palatini principle, which regards the metric and the connection as two independent fields, is given by refs.[4, 25]    STEGR [g, Γ; Π, χ] where Π  and χ  are tensor densities3 of weight w = +1.These tensor densities act as Lagrange multipliers and by varying the action (2.13) with respect to these multipliers, one obtains the equations Thus, the difference between the actions (2.10) and (2.13) is that in (2.10) one uses the parametrization (2.8), which directly implements the postulates of vanishing curvature and vanishing torsion, whereas in (2.13) the connection Γ   is a general affineconnection.Only after varying with respect to the Lagrange multipliers and solving the Equations (2.14) does one obtain a flat and torsionless connection.Ultimately, the actions (2.10) and (2.13) define the same theory.Namely, GR.
One can show [2][3][4] that the action (2.10) is equal to the Einstein-Hilbert action of GR, up to a boundary term.This result follows from the geometric identity where  and   are the Ricci scalar and covariant derivative operator with respect to the Levi-Civita connection (2.3) (not with respect to the affine connection Γ).From this identity it follows at once that STEGR and GR give rise to the same field equations for the metric.It is therefore also no surprise that GR and STEGR are equivalent and propagate the same number of physical degrees of freedom: two.This fact has also been checked by an explicit Hamiltonian analysis of the STEGR action (2.10) in the coincident gauge. [26]owever, flat and torsionless metric-affine geometries can also be taken as starting point for constructing modified theories of gravity.There are two commonly considered modifications in the literature and for these theories it is in general no longer true that there are two physical degrees of freedom.In the following two subsections we define these two theories and discuss some of their properties.

Five-Parameter Family of Quadratic Non-Metricity Lagrangians
In the previous subsection, we saw that the action of STEGR is based on the non-metricity scalar.This scalar is quadratic in the non-metricity tensor and one can naturally ask, what is the most general scalar quantity one can construct which is quadratic in the non-metricity tensor?27] Thus, the most general nonmetricity tensor one can consider reads where c 1 , c 2 , c 3 , c 4 , and c 5 are real but otherwise arbitrary constants.The non-metricity scalar (2.16) gives rise to a fiveparameter family of gravity theories, which we dub STG theories, described by the action functional ) . (2.17) One can check that the choice reproduces the non-metricity scalar (2.11) and thus the STG action (2.17) reduces to the STEGR action (2.13).Furthermore, it is important to realize that the geometric identity (2.15) holds only for this choice of parameters, i.e., only for the choice which corresponds to STEGR.For any other choice of parameters, the most general non-metricity scalar (2.16) differs from the Ricci scalar  of the Levi-Civita connection by more than just a boundary term.Thus, any STG theory based on a choice of parameters different from the one of STEGR leads to a modification of GR.However, not every choice of parameters might lead to a viable theory.The field equations derived from (2.17) are clearly at most second order field equations for the metric, since the non-metricity tensor only contains first order derivatives of the same.This excludes instabilities due to higher derivatives, but it does not preclude the presence of ghosts.It is also possible to choose the parameters such that one obtains a non-trivial action but either one or zero propagating degrees of freedom.A detailed analysis of the five-parameter action (2.17) based on the coincident gauge (recall that the coincident gauge is common to all flat and torsionless metric-affine geometries, irrespective of any Lagrangian) was performed in ref. [27].
The analysis was based on the first step of the Dirac-Bergmann algorithm, which consists in finding all so-called primary constraints (see subsection 3.1 for a definition of this terminology).Primary constraints give us an upper bound on the number of propagating degrees of freedom and in ref. [27] it was found that the number of primary constraints depends on how one chooses the parameters c 1 , c 2 , c 3 , c 4 , and c 5 .It was found that the fiveparameter space of theories compartmentalizes into nine different sectors, as summarized in Figure 1.Each sector is characterized by a different number of primary constraints, or, put differently, is characterized by a different upper bound on the number of propagating degrees of freedom.These insights can already be used to conclude that certain sectors in the five-parameter space can only contain unphysical theories.We refer the reader to [27] for more details on this.
For the remaining sectors it is not possible to draw any conclusions from the primary constraints alone.One therefore has to proceed to the second step of the Dirac-Bergmann algorithm, which is to determine the so-called secondary constraints, as we will review in subsection 3.1.This step has not yet been carried out and, as we will show in subsection 3.2, it is unlikely that one will succeed.The problem is that the Dirac-Bergmann algorithm suffers from a little known shortcoming and unless one is lucky, it is not possible to carry out the second step of the algorithm for field theories (there is no obstruction for constrained Lagrangians describing point-particles).

Non-Linear Extension: f (ℚ) Gravity
Another possible modification of gravity based on flat and torsionless metric-affine geometries is f (ℚ) gravity.This theory represents a non-linear extension of STEGR, where the nonmetricity scalar (2.11) (not the scalar (2.16)) in the STEGR action (2.13) is replaced by f (ℚ), where f is an arbitrary function solely subjected to the condition df dℚ ≠ 0. This condition ensures that the resulting theory has a non-trivial dynamics, as one can determine from looking at the field equations for the metric derived from the action functional [2,4,25]  f (ℚ) [g, Γ; Π, χ] ) . (

2.19)
This non-linear extension of STEGR has received considerable attention in the literature.7][8][9][10][11][12][13][14][15] However, the question of how many degrees of freedom propagate in f (ℚ) gravity has received relatively little attention.In ref. [26], the Hamiltonian analysis of STEGR in the coincident gauge was carried out and an educated guess about the number For brevity, and following, [27] we have defined c := c 1 + c 2 + c 3 + c 4 + c 5 , ĉ := 2c 1 + c 2 + c 4 , and c 35 := 2c 3 + c 5 .The vanishing of these parameters (or the vanishing of combinations thereof) corresponds to the appearance of one or more primary constraints (the number of independent primary constraints is shown in the last column).STEGR is contained in sector V.
of degrees of freedom in f (ℚ) was attempted.The argument given in that reference is based on (a) findings from cosmological perturbation theory obtained in ref. [5], where it was found that f (ℚ) propagates at least two additional degrees of freedom compared to GR; (b) the expectation that the primary constraints of f (ℚ) are all second class, since the theory is generally covariant.
Putting (a) and (b) together leads to the conclusion that f (ℚ) propagates six degrees of freedom.Of course, this argument is purely heuristic and needs to be followed up by a rigorous analysis of the theory.[30] The algorithm allegedly determined the number of physical degrees of freedom to be eight, thereby challenging the guess ventured in ref. [26].Moreover, the authors of ref. [23] argued that this large number has to be explained with the breaking of diffeomorphism symmetry of the theory.The reason for breaking this symmetry is the use of the coincident gauge during the analysis.
It seems that the Dirac-Bergmann analysis of ref. [23] supersedes the heuristic argument of ref. [26] and therefore settles the question.However, as we will show in the next section, the situation is not as clear-cut and simple as it might seem.In fact, we will point out an error in the analysis of, [23] which will reopen the question.Unfortunately, even after identifying the error, we are not able to give a definitive answer regarding the number of physical degrees of freedom.Rather, we uncover a shortcoming of the Dirac-Bergmann algorithm which afflicts field theories.We explicitly show how this shortcoming prevents one from completing the Dirac-Bergmann analysis of f (ℚ) gravity and we show that many other teleparallel theories of gravity are potentially affected by this problem-this includes theories based on non-metricity as well as theories based on torsion.Therefore, before reviewing the Dirac-Bergmann algorithm and isolating the problem, we will briefly introduce torsion theories of gravity.

Teleparallel Theories of Gravity based on Torsion
We have seen that metric-affine geometries provide the mathematical framework for two distinct, but ultimately equivalent formulations of gravity: GR and STEGR.These two formulations are distilled from the most general metric-affine geometry (, g, Γ), where Γ is a connection which gives rise to non-trivial curvature, torsion, and non-metricity tensors, by imposing certain postulates.In GR one demands that Γ is chosen such that torsion and non-metricity vanish, which unambiguously fixes the connection to be the Levi-Civita connection.STEGR, in turn, is based on a flat and torsionless connection.
By demanding a flat and metric-compatible connection with non-trivial torsion, one establishes the basis for a third formulation of gravity: The so-called Teleparallel Equivalent of GR (TEGR).This approach to gravity complements the other two and together GR, STEGR, and TEGR form what is known as the geometric trinity of GR. [2,3] Of course, the postulates of vanishing curvature and nonmetricity do not yet fully define a theory.One has to stipulate an action principle, which in the case of TEGR reads (2.20) Here, Π  and Ξ are again tensor densities of weight +1, which act as Lagrange multipliers in order to reduce the general affine connection Γ   to a flat and metric-compatible one.Moreover,  denotes the so-called torsion scalar and it is explicitly given by  The vanishing of one or more than one of these parameter combinations corresponds to the appearance of one or more than one primary constraints.The number of independent primary constraints is listed in the last column.TEGR is contained in sector V.This table was inspired by, [34] but uses a slightly different nomenclature for the primary sectors.
The tensor T  := T   denotes the only possible trace of the torsion tensor.Moreover, just as in the case of STEGR, one can prove [2,3,25] a geometric identity which relates the Ricci scalar of the Levi-Civita connection to the torsion scalar: ( From this identity, one can infer that the action (2.20) is equal to the Einstein-Hilbert action of GR, up to a boundary term.Thus, it follows that GR and TEGR possess the same field equations and the theories are equivalent in the sense that they possess the same solutions.The equivalence is reinforced by the fact that TEGR propagates two degrees of freedom, as has been shown in ref. [31].
The framework of metric-affine geometries with flat and metric-compatible connection can be taken as starting point for defining modified theories of gravity.Just as with the extensions of STEGR, there are two classes of theories which are commonly studied in the literature.The first class is based on an extension of the torsion scalar.Due to the anti-symmetry of the lower indices of the torsion tensor T   , there are only three independent contractions one can build which are quadratic in torsion.Thus, the most general quadratic torsion scalar one can construct reads which obviously reduces to (2.21) for the parameter values c 1 = − 1 4 , c 2 = − 1 2 , and c 3 = 1.We refer to the three-parameter family of gravity theories based on the most general quadratic torsion scalar (2.23) as Teleparallel Gravity, or TG for short.The corresponding action functional is given by where  (g, Γ; c i ) indicates that  is constructed from the metric, the affine-connection, and that it depends on the parameters c 1 , c 2 , and c 3 .Depending on how one chooses these parameters, one can get more or less degrees of freedom compared to GR and it needs to be explored, which theories are healthy and viable modifications of gravity.A first step in this direction has been taken in refs.[32, 33].The methodology employed in refs.[32,  33] is similar to the one used in ref. [26], which led to the compartmentalization of the five-parameter family of quadratic nonmetricity theories into nine sectors.That is, the authors of refs.[32, 33] determined how many primary constraints can appear from the action (2.24) of TG depending on how one chooses the parameters c i .Remarkably, it was found that TG compartmentalizes into nine sectors.This is the same number as in STG, [26] even though TG has two parameters less than STG.A summary of the different categories of theories is given in Figure 2. The analysis of TG has not been extended beyond cataloging its possible primary constraints.In particular, the consistency of these constraints and the possible occurrence of secondary constraints has not yet been investigated. [34]As we will discuss in subsection 3.4, TG as well STG may suffer from the same shortcoming of the Dirac-Bergmann algorithm and completing this analysis may therefore be difficult, if not impossible.Finally, the non-linear extension of TEGR, which is defined by the action functional where  is the torsion scalar (2.21), not the most general scalar (2.23), and f is a function which is subjected to the sole condition that df d ≠ 0. This condition follows from an inspection of the field equations derived from the action (2.25) and the requirement of non-trivial dynamics.
The question about how many degrees of freedom propagate in f ( ) gravity has received more attention than the analogous question in f (ℚ) gravity.However, the situation is similarly unsatisfying because it led to disputed results. [34]For instance, the analyses in refs.[35, 36] reached a different conclusion (five degrees of freedom) than the analysis in ref. [37] (three degrees of freedom), which may be traced back to problems in computing the Poisson brackets between constraints. [36]Furthermore, the authors of ref. [36] also point toward the possibility of having four degrees of freedom or none at all, even if these scenarios seem unlikely.

The Dirac-Bergmann Algorithm for Constrained Hamiltonian Systems
The theory of constrained Hamiltonians was systematically developed in the 1950s by Dirac, and Bergmann and collaborators (see eg. [28][29][30] and [38] and references therein for a more modern account).These developments culminated in the Dirac-Bergmann algorithm, which is used to uncover all constraints of a given Hamiltonian system.This is a necessary step to (a) check the internal consistency of a theory (e.g.check whether there are ghosts), (b) define the physical (i.e., Dirac) observables of the theory, and (c) to perform the canonical quantization of the system.
Interestingly though, the algorithm was mostly developed by studying finite-dimensional system, not field theories.The transition from a finite-dimensional system to a field theory is usually regarded as a straightforward process with only minor technical complications [30,38,39] (see [40] for a more critical assessment of this transition).
We follow the traditional path and present the Dirac-Bergmann algorithm for finite-dimensional systems, only to then point out a little-known complication which occurs in field theories.This complication has far reaching consequences, as it has the power to invalidate the whole approach.

Summary of the Algorithm
We follow roughly [34,39] in what follows and the diagram 3, which shows all steps of the Dirac-Bergmann algorithm, was inspired by ref. [34].
Let  be a N-dimensional smooth manifold coordinatized by q i with i ∈ {1, … , N}.We refer to this manifold as configuration space.Given , we can construct the tangent bundle T which consists of pairs (q i , ̇qi ) and which is therefore referred to as velocity phase space.A Lagrangian is a function L : T → ℝ which we shall assume to be at least twice smoothly differentiable, i.e., L ∈ C 2 (T).To define the theory we work with, we introduce the action functional  : T → ℝ defined by The starting point for defining the Hamiltonian theory, and also step 1 in diagram 3, is to determine the conjugate momenta p i , defined as The goal is to perform a Legendre transformation of the Lagrangian, which replaces the velocities ̇qi by the momenta p i .However, this is only possible if the Hessian matrix , then the equations p i = L(q, ̇q)∕ ̇qi can all be solved for ̇qi , which means we can express all ̇qi in terms of q i and p i .However, if one finds rank( ij ) = R < N, it is not possible to replace all velocities ̇qi by momenta p i .In fact, if the Hessian has less than full rank, it means that not all conjugate momenta are independent and/or that the right hand side of (3.2) contains no velocities for some of the momenta.In either case, we can solve for only R velocities, We follow [34] and label the solvable velocities ̇qι by the index ι, while the velocities one can not solve for are labeled by the index ῑ, i.e., ̇qῑ .
The right hand side of (3.2) is in general a function  i of q i , ̇qι , and ̇qῑ .We can thus write where we used Equation (3.4) and the fact that  i can no longer depend on ̇qῑ after using (3.4), since otherwise it would be possible to solve for more velocities.Notice that for i ∈ {1, … , R}, the Equations (3.5) reduce to identities of the form p i ≡ p i .This is a direct consequence of the definition of the momenta and the fact that we could solve for R velocities.However, a different situation presents itself for the momenta with indices i ∈ {R + 1, … , N}.
Here we find that the right hand side gives us a relation between configuration space variables q i and canonical momenta p i .Thus, we are forced to introduce the so-called primary constraints The number M of primary constraints is determined by the dimension of the configuration space and the rank of the Hessian as M = N − R. We stress that these M relations between the configuration variables q and the momenta p are purely a consequence of the definition (3.2), which in turn depends on the precise form of the Lagrangian.No equations of motions were used up to this point.However, the presence of M constraints (or relations) between q and p implies that the dynamics of our system does not take place in the 2N-dimensional momentum phase space Γ coordinatized by (q i , p i ), but rather on the submanifold Γ P ⊆ Γ of dimension 2N − M. Concretely, the so-called primary constraint surface Γ P is defined as the region of Γ in which the Equations (3.6) are satisfied.
Step 3b is to introduce the so-called canonical Hamiltonian H 0 (q, p) := ̇qi p i − L(q, ̇q) , ( where a summation over the index i ∈ {1, … , N} is implied.As one would expect, the Legendre transform guarantees that this Hamiltonian is a function of q i and p i , i.e., it does not depend on the velocities ̇qi .However, this Hamiltonian knows nothing about the primary constraints  A and thus does not generate the same dynamics as the Euler-Lagrange equations derived from the functional (3.1).To remedy this shortcoming, step 4 in the Dirac-Bergmann algorithm consists in introducing N − R Lagrange multipliers  A and to define the primary Hamiltonian It should be stressed that  A are arbitrary functions of time and their sole purpose is to enforce the primary constraints.Furthermore, one can now show that H P generates the same dynamics as the Euler-Lagrange equations.In particular, the time evolution of any phase space function F : Γ → ℝ is given by where the symbol "≈" stands for "weak equality".The concept of weak equality is introduced to say that the left and right hand side are equal up to terms involving arbitrary linear combinations of constraints.In Equation (3.9), the curly brackets denote the usual Poisson bracket, and we dropped the term {F,  A } A , because it represents a linear combination of constraints.
For the time evolution to be consistent, i.e., for the evolution to always take place on the constraint surface Γ P and never leave it, it is necessary to impose the consistency condition Then one can show that one can solve for P Lagrange multipliers if and only if If this condition is not satisfied, then the inhomogeneous system of linear equations is inconsistent in the sense that after a Gaussian row-reduction one finds at least one equation of the form 0

and u A
h A ≈ 0, it is possible that these consistency conditions are trivially satisfied or that they lead to new constraints, which we call secondary constraints  B (q, p) = 0. (c) Step 6c: Finally, if det(C AB ) ≉ 0, Equation (3.11) can be solved by inverting C AB and one finds that the Lagrange multipliers are given by We stress that the secondary constraints which arise in (b) require the validity of the equations of motion, unlike the primary constraints.If secondary constraints are encountered, their consistency under time evolution has to be checked.This can give rise to tertiary constraints and so on.In practice, the verification of consistency conditions has to be iterated, as indicated in Figure 3 in step 6b, until no new constraints appear (or until one encounters an inconsistency).
Assuming that no inconsistencies appeared, one is now left with a full set of constraints after step 7: This set of constraints defines a submanifold Γ phys of the initial phase space Γ.The time evolution of physical degrees of freedom takes place entirely in Γ phys .In order to be able to count the degrees of freedom, it is convenient to group the above set of constraints into two classes, as indicated in 8a and 8b: The set of first class constraints is the maximal set consisting of all constraints Φ I from (3.15) for which one has {Φ I , Φ J } ≈ 0 for all I, J in the set of first class constraints . (3.16) We use the abbreviation FCC for first class constraints.If a constraint is not first class, it is automatically second class (SCC).We denote the set of all second class constraints by  K .One can show that the Lagrange multipliers for SCC can all be solved for.This means they are explicitly given in terms of known functions of q i and p i .The same is not true for FCC, which in fact correspond to gauge transformation.This means that the Lagrange multipliers associated with FCC can be seen as the generators of such gauge transformations.It is possible to choose these Lagrange multipliers at will, without affecting physical observables.Thus, one finds that the number of degrees of freedom is given by Notice that FCC "kill twice": Each FCC removes one degree of freedom because it is a constraint and it removes a second degree of freedom because there is an arbitrarily specifiable Lagrange multiplier associated with that FCC.Also, the number of SCC must be even for the interpretation of FCC constraints (and consequently Equation (3.17)) to be valid (see [39] for details).This concludes the summary of the Dirac-Bergmann algorithm for particle theories.In the next subsection, we discuss a shortcoming of this algorithm which occurs in field theories.

Field Theories: Cases in which the Algorithm Fails
In the previous subsection we summarized the Dirac-Bergmann algorithm for systems defined over a finite dimensional configuration space.The transition to field theories is typically regarded as a task which poses only minor technical complications. [30,38,39]ndeed, if mathematical subtleties are ignored, as can be done for most field theories of physical interest, it is straightforward to generalize steps 1 through 4.However, when imposing the consistency conditions in step 5 and when attempting to analyze these conditions, one can encounter serious obstacles which invalidate the whole approach.
To give a first idea of what can go wrong in the Dirac-Bergmann analysis, we give a qualitative description of the problem and rel-egate an explicit example to subsection 3.3.First of all, the phase space variables of field theories depend on time and space, unlike their point-particle theory relatives, where the phase space variables can only depend on time.It follows that although primary constraints cannot contain any time derivatives, they can (but do not necessarily do) contain spatial derivatives of configuration space variables.If such spatial derivatives are present in primary constraints, this can affect the mathematical form of the consistency conditions obtained in step 5.For a field theory, the Poisson bracket is defined as ) , (3.18)   where Ψ k denotes the configuration space variables (i.e., the fundamental fields of the theory in question), Πk are the conjugate momentum densities associated with these fields, and ⃗ x, ⃗ y, ⃗ z refer to three distinct points on the spacelike hypersurface Σ t (a surface of "constant time t").Similarly, the primary Hamiltonian is schematically defined as where  denotes the Lagrangian density of the theory in question, φA (Ψ, Ψ, Π) are the constraint densities which we allow to explicitly depend on spatial derivatives of the fields Ψ k , and  A are arbitrary Lagrange multipliers which depend on space and time.
In terms of the Poisson bracket and primary Hamiltonian we have just defined, the consistency condition for the primary constraints reads Notice the crucial difference to Equation (3.11):The Lagrange multiplier appears inside an integral.Since φ depends on Ψ, it is necessary to perform an integration by parts in order to be able to compute the variation Ψ(x)∕Ψ(z) inside the Poisson bracket.However, such an integration by parts will inevitably move the spatial derivative from Ψ onto .The consequence is that we no longer end up with an inhomogeneous system of linear equations for the Lagrange multipliers.Rather, we are now confronted with the much more difficult task of having to solve an inhomogeneous system of first order partial differential equations for the Lagrange multipliers.Recall that the original identification of secondary and tertiary constraints in the Dirac-Bergmann algorithm relies on the fact that one has to analyze a system of linear algebraic equations.Therefore, in the case we just considered, we can no longer proceed with the steps 6 through 8.
Of course, we do not actually need to solve the first order system of PDEs.We only need to determine whether or not a unique solution exists.This is formally possible, but conceptual problems arise nevertheless: In order to obtain a unique solution we also need to impose initial value conditions.What do these conditions physically correspond to?Do different choices of initial conditions manifest themselves physically?That is, do different initial conditions represent physically inequivalent situations?
The situation is even worse when it can be shown that no unique solution exists and that additional constraints arise.In addition to the conceptual problems alluded to above, one faces also considerable technical difficulties in identifying secondary constraints and checking the consistency of the combined set of primary and secondary constraints.
It is conceivable that there are theories which are simple enough such that these problems can be solved in a consistent way.However, in general, the Dirac-Bergmann algorithm breaks down at this point.Ignoring this issue likely leads to wrong conclusions.
In the next subsection we show explicitly how this issue invalidates the Dirac-Bergmann analysis of f (ℚ) gravity.

Dirac-Bergmann Analysis of f (ℚ) Gravity in the Coincident Gauge
As evident from Figure 3, the first step in the Dirac-Bergmann algorithm consists of determining the conjugate momenta starting from the action functional.To that end, one first needs to define the time derivative of configuration space variables, from which the momenta can then be constructed.In any covariant theory, this means that spacetime (, g  ) needs to be arbitrarily foliated by spacelike leaves Σ t .Concretely, we assume that  has the topology  ≅ ℝ × Σ, where Σ is a three-dimensional manifold of arbitrary topology representing "space", while ℝ represents "time".
Any coordinate system x  on  can then be split as (t, x a ), where t is the "time" coordinate and x a are the "spatial" coordinates. 5The leaves Σ t simply correspond to t = const.hypersurfaces.Moreover, it is convenient to write the metric in a form adapted to the arbitrarily chosen foliation of spacetime.This socalled Arnowitt-Deser-Misner (ADM) form of the metric is given by where N denotes the lapse function 6 , N a is the shift vector field, while h ab denotes the intrinsic spatial metric on the spacelike leaves of our foliation.
In what follows we work with the ADM variables {N, N a , h ab } and we employ the coincident gauge, i.e., we exploit the fact that we can choose Γ   = 0 everywhere. 7These choices allow us to write the f (ℚ) action functional as where we used the easy to prove fact that and where we introduced an auxiliary field  in order to "pull out" the non-metricity scalar ℚ from the function f .This has the advantage that we can use results obtained in ref. [26] regarding variations of ℚ with respect to the velocities Ṅ, Ṅa , and ḣab , which leads to particularly simple forms of the conjugate momenta.We stress that a) the action (3.22) is equivalent to the original f (ℚ) action (2.19) written in ADM variables and in coincident gauge and b) that we need to add a boundary term in order to use the ADM form of, [26] with the crucial difference that  will inevitably enlarge the phase space and derivatives will act on it.The addition of this boundary term has no effect on the counting of degrees of freedom since it does not modify the field equations.It will just simplify the expressions.Furthermore, the construction has been explicitly carried out in ref. [23], which thus prompts us to refrain from going through the individual steps.
Step 1 (conjugate momenta): After having introduced the necessary framework for a Hamiltonian analysis, defined variables adapted to our purpose, and molded the action functional into a more convenient form, we can finally execute the first step of the Dirac-Bergmann algorithm: Determining the momenta conjugate to N, N a , h ab , and .Notice that these momenta are necessarily tensor densities, since the Lagrangian from which they are derived is a scalar density.We place a tilde on top of the momenta to emphasize this fact and we find that they have the form one would expect from the results obtained in ref. [26], namely Here, K ab denotes the extrinsic curvature, which is defined as and where  a is the unique torsion-free covariant derivative operator compatible with h ab .
Step 2 (solve for velocities): Observe that only the momentum density πab contains a velocity, namely ḣab , which is "hidden" inside the extrinsic curvature tensor K ab .The velocities Ṅ, Ṅa , and φ do not appear in any of the momenta.Therefore, we can only solve for ḣab , for which we obtain

.26)
Step 3a (primary constraints): Since the momentum densities (3.24) do not depend on the velocities Ṅ, Ṅa , and φ, we obtain a total of 1 + 3 + 1 = 5 primary constraints, which we denote by Step 3b (canonical Hamiltonian): To define the canonical Hamiltonian H 0 , it is convenient to first introduce its density: The canonical Hamiltonian is then defined as the integral of this density over one of the leaves Σ t in the foliation of spacetime, i.e., The "≈" means that we used the primary constraints (3.27) to bring the integrand in the above form.Moreover, we performed a partial integration and dropped the boundary term, as it will not affect the counting of degrees of freedom.We remark that if one chooses f (ℚ) = ℚ, which then also implies  = 1, Equation (3.29) reduces to the canonical Hamiltonian of Coincident GR which was derived in ref. [26].This is the same canonical Hamiltonian as in GR, as one would expect.
Step 4 (primary Hamiltonian): Given that we now know all primary constraints and the canonical Hamiltonian, we can turn to the task of defining the primary Hamiltonian.To do so, we need to introduce the Lagrange multipliers  0 ,  a , and   .The primary Hamiltonian can then be written as

.30)
Step 5 (consistency conditions): Before checking the consistency conditions, we wish to highlight a peculiarity which arises in the context of field theories.Namely, that the Poisson bracket between two constraint densities produces a -function or even spatial derivatives of a function.Expressions involving -functions which do not occur inside an integral have a rather formal character and they can easily lead to confusion, as we will discuss further below.
A simple way to deal with this issue is to introduce so called smeared constraints.Concretely, one integrates a constraint density against a test function F, i.e., one defines and for any test function F. When computing Poisson brackets between constraint densities, one encounters a product of two -functions, but because of the presence of the two smearing integrals, these -functions can be "integrated out", and one is left with mathematically well-defined expressions.Applying this (standard) strategy to our constraints results in the following Poisson brackets: ) The integral which appears on the right hand side stems from the integral in the definition of the Poisson bracket and has nothing to do with the smearing we introduced above.
The consistency conditions we have to check are and our task is to compute the quantities bI and the integral over the Poisson bracket.For the vector densities bI we find the following results: where we have introduced as well as Notice that a is precisely the expression for the vector density constraint of standard GR, while  reduces to GR's scalar density constraint for  = 1.
The integral over the Poisson bracket is easy to compute using the brackets (3.32).In fact, using definition (3.31), one can write this integral as where C I [] means we choose a -function to smear CI and one immediately sees that C I [] = CI .Using (3.32) with F 1 =  and F 2 =  J , we now find the following expressions for the Poisson brackets: We make the important observation that these expressions contain derivatives of the Lagrange multipliers.As explained in the previous subsection, the Dirac-Bergmann algorithm tacitly assumes that the consistency conditions lead to a system of linear equations for the Lagrange multipliers, which can be analyzed using standard tools of linear algebra.
Here, however, we end up with a first order system of partial differential equations (PDEs).We now explicitly show that this prevents us from making any progress in the Dirac-Bergmann analysis.First of all, the consistency conditions can now be written concisely as Equation (i) is purely algebraic and it can be used to eliminate   .This results in After plugging this expression for   into equations (ii) and (iii), we are left with three PDEs for the three functions  a and an equation for the function  0 .Equation (iii) can be solved for  0 , giving us We recall that in the standard Dirac-Bergmann algorithm, we are mostly interested in determining whether the system of linear equations for the Lagrange multipliers is solvable or not (see steps 6a, 6b, and 6c).In our case, we can ask the same question: Is the first order system of PDEs solvable or not?This question can be answered using standard methods from the mathematical theory of partial differential equations, which is summarized in the Appendix.Here, we simply remark that the PDE (ii), after having plugged in By introducing the matrices The question whether this system admits a unique solution for  a can now be answered by checking whether the matrix ∑ 3 i=1 M (i) n i , where n i are the components of the normal vector to the initial value surface, has a non-vanishing determinant.However, one trivially finds det for any nowhere vanishing vector field ⃗ n.A direct computation further reveals that the above matrix has rank two.Two consequences follow from this fact: (a) Since the above matrix has rank two, it is possible to bring the PDE into such a form that only two equations contain derivatives of  a , while the third equation will be purely algebraic.In other words, one finds a new constraint.(This is the analogue of Equation (3.13) in the Dirac-Bergmann algorithm) (b) Because there are only two algebraically independent differential equations in the system, it is only possible to determine two of the functions  a , while the third one has to be specified by hand.This could represent a gauge freedom.
However, the interpretation of this freedom is not straightforward.Moreover, should the freedom to prescribe initial data for  a on, say, a z = const.surface also be interpreted as gauge freedom?What are the consequences of having this freedom?In what way does this affect the counting of degrees of freedom?These questions are far from trivial and we will not attempt to answer them here.Rather, we point out that the Dirac-Bergmann procedure breaks down at this point.The reason for the breakdown is the presence of spatial derivatives of configuration space variables in the primary constraints, which lead to PDEs, rather than linear equations, for the Lagrange multipliers.The silent assumption of the Dirac-Bergmann algorithm that one always obtains linear equations for the Lagrange multipliers is therefore broken.
Before concluding this subsection, we point out where the analysis of ref. [23] went wrong.As alluded to above, a peculiarity of field theories is that the Poisson bracket of two constraints produces a -function or even spatial derivatives of -functions.This is precisely what happened in ref. [23] (see equation (43) in ref. [23]).The consistency conditions found in ref. [23] are then only apparently linear equations for the Lagrange multipliers.In fact, the matrix C n ′ n in equation ( 49) of ref. [23] contains -functions and spatial derivatives thereof, while the vector { n ′ ,  0 } in equation (48) only contains a -function.We recall that -functions acquire a well-defined mathematical meaning when integrated over.If the apparently linear equation (48) of ref. [23] is integrated over a Σ t surface, the -functions and their derivatives can be "integrated out", leaving behind a system of first order PDEs, rather than linear equations for the 's.
The argument presented here casts considerable doubts on the validity of the Dirac-Bergmann algorithm in the case of f (ℚ) gravity and the results obtained in ref. [23].In the next subsection, we show that (almost) every teleparallel theory of gravity is affected by the same problem and in section 4 we show explicitly, using a completely independent method, that the number of degrees of freedom in f (ℚ) gravity is at most seven.

Potential Obstacles in Other Teleparallel Theories of Gravity
In the case of f (ℚ) gravity, we have seen how the presence of spatial derivatives in constraint densities can invalidate the whole Dirac-Bergmann approach.Here, we show that other teleparallel theories of gravity are potentially afflicted by the same problem.Concretely, we consider the five-parameter family of theories described in 2.2, as well as f ( ) gravity and the three-parameter family of torsion theories mentioned in 2.4.
Starting with the most general non-metricity scalar (2.16), one finds, after choosing the coincident gauge and applying a subsequent ADM decomposition of the metric, that the momentum densities conjugate to lapse, shift, and intrinsic metric can be written as ) , (3.46)   where n  = (−N, ⃗ 0) is the unit timelike normal 1-form to Σ t and P   is the so-called non-metricity conjugate ) . (3.47) As shown in ref. [27], these momenta can be written more explicitly as where we used the shorthand notations Notice that only non-metricity tensor components of the form Q 0 and Q 0 contain time derivatives.All other non-metricity tensor components contain spatial derivatives.The nine sectors defined in Figure 1 correspond to choices of c i such that the time derivatives disappear from the momenta. 8However, none of these choices leads to a complete absence of spatial derivatives.Therefore, each primary sector described in Figure 1 contains constraints with spatial derivatives of lapse, shift, and/or intrinsic metric.It follows that potentially every single theory in this five-parameter family of Lagrangians is affected by the same problem we encountered in f (ℚ) gravity, namely that the Dirac-Bergmann algorithm cannot be carried through to completion.
Notice that this is even the case for STEGR, which is contained in sector V.The momenta conjugate to lapse and shift in the case of STEGR are non-trivial and involve spatial derivatives.In ref. [26], the Hamiltonian analysis could only be carried out after the STEGR Lagrangian had been modified by partial integrations, which is the same as adding boundary terms.The effect was such that the momenta conjugate to lapse and shift trivialised.However, this strategy is likely not always successful.The authors of ref. [23] attempted the same kind of simplifications in the case of f (ℚ) gravity.While they managed to trivialise the momentum conjugate to N, the other momenta retained a non-trivial form and they feature spatial derivatives.
Next, we turn our attention to the three parameter family of torsion theories summarized in Figure 2. Explicit expressions for the primary constraints9 can be found in refs.[32-34].It can be verified that all primary sectors, with the exception of the pathological sector 0, harbour primary constraints with spatial derivatives of configuration space variables.Thus, for all of these theories it is potentially not possible to carry out the Dirac-Bergmann analysis to its completion.
The situation for f ( ) gravity is similar.Interestingly, the authors of ref. [37] fell victim to the same error as the authors of ref. [23]: When determining the consistency conditions, they overlooked the presence of spatial derivatives of -functions and arrived at seemingly linear equations for the Lagrange multipliers.An analysis of this erroneous linear system led them to conclude that there are three propagating degrees of freedom in f ( ) gravity.
The error was recognized in ref. [36], but it could not be corrected.The authors of ref. [36] expressed puzzlement at finding a system of linear PDEs for the Lagrange multipliers and discussed three possible scenarios, giving them different numbers of degrees of freedom for f ( ).In any case, it is clear that f ( ) suffers from exactly the same problem as f (ℚ) and that the question about the number of physical degrees of freedom remains unsettled in both theories.
Before concluding this subsection, we remark that even though it is true that all teleparallel theories of gravity discussed here contain primary constraints with spatial derivatives, we do not claim that every single one of them necessarily leads to first order partial differential equations for its Lagrange multipliers.We believe that it is likely that this happens in every case, but the presence of spatial derivatives in the constraints is only a necessary and not a sufficient condition for this to happen.In fact, there are two important examples where constraints with spatial derivatives do appear without leading to problems in the Dirac-Bergmann algorithm: Electromagnetism and GR à la Einstein.
In electromagnetism one finds that the momentum Π 0 conjugate to A 0 vanishes, which thus constitutes a primary constraint.Clearly, this constraint is unproblematic as it does not contain any derivatives.However, running through the Dirac-Bergmann algorithm up to Step 5, one finds that this constraint generates the secondary constraint where Π a is the momentum conjugate to A a .This is the wellknown Gauss constraint and it clearly contains a spatial derivative.The reason this does not create any problems is because the Poisson bracket between the primary and the secondary constraint vanishes: This prevents the spatial derivative from being moved onto a Lagrange multiplier.In GR, one encounters a very similar but slightly more complicated situation.As is well-known, the momentum densities conjugate to lapse and shift both vanish, These constraints give rise to secondary constraints, namely the so-called scalar and vector constraints, Both of these constraints contain spatial derivatives.The spatial derivatives of the scalar constraints are hidden inside (3) , which contains terms of the form  c h ab , while the spatial derivatives of the vector constraint are plainly visible.Just as in electromagnetism, this does not cause any issues.However, the reason is slightly different.Whereas in electromagnetism the Poisson bracket between the primary and the secondary constraint is exactly zero, in GR one finds for the smeared primary and secondary constraints where f i and ⃗ v i are test functions.Notice that the Poisson brackets between the primary and secondary constraints vanish identically, while the Poisson brackets between the secondary constraints vanish on the constraints surface for any choice of test functions.Since in Step 5 of the Dirac-Bergmann algorithm the consistency condition is evaluated on the constraint surface, this is sufficient to guarantee that the field theoretic analogue of Equation (3.11) does not lead to differential equations for the Lagrange multipliers.
These two examples clearly teach us that, as already stated, the presence of spatial derivatives in constraints is not sufficient to conclude that the Dirac-Bergmann algorithm will fail.If the Poisson brackets between the constraints vanish exactly or on the constraint surface, then the algorithm stands a chance of succeeding (it is still possible that the consistency condition generates a new constraint with spatial derivatives and which has a non-vanishing Poisson bracket with at least one of the old constraints, thus leading to a failure at a later point in the algorithm).

Deriving a Lower Bound from Cosmology and an Upper Bound from the Kinetic Matrix
Even though the Hamiltonian analysis of f (ℚ) gravity fails to answer the question of how many physical degrees of freedom propagate, it is still possible to give an upper and lower bound for this number.
A lower bound is actually already known since the first work on f (ℚ) cosmology. [5]Using a perturbative argument, it was shown that f (ℚ) gravity propagates at least two additional degrees of freedom compared to standard GR.Thus, the lower bound for the total number of physical degrees of freedom is four.
To obtain an upper bound for this number, we will analyze the so-called kinetic matrix of the theory.We recall that given a system of second order partial differential equations for N fields Ψ := (Ψ 1 (t, x), … , Ψ N (t, x)) T , the kinetic matrix  is the matrix which multiplies the second order time derivatives, Ψ.By determining the rank R := rank() we can infer the smallest amount of constraints that are present in the theory, which in turn allows us to give an upper bound on the number of constraints and degrees of freedom, namely N − R and R, respectively.These facts are explained in more detail in the Appendix.
Here, we dive right into the determination of  and its rank in the case of f (ℚ) gravity.To begin with, we recall that the field equations of f (ℚ) gravity can be written as [12,18] The first set of equations is referred to as metric field equations, while the second set are the connection field equations.The tensor G  is the standard Einstein tensor with respect to the Levi-Civita connection, G  :=   − 1 2  g  , while the non-metricity conjugate reads 10 ) , ( and ℚ is the non-metricity scalar as defined in (2.11).Recall that the affine connection Γ   is postulated to be flat and torsionless, which means that it can be written as where   are four arbitrary functions of the spacetime coordinates.Notice that the metric field equations are symmetric in  and  and that the connection field equations have only one free 10 This follows from (3.47) for the parameter choice (2.18).
index, namely .Thus, we have a total of 10 + 4 field equations for 10 metric components and 4 connection functions   .Since these are tensorial equations, we have the freedom to apply diffeomorphisms which act simultaneously on   and g  .This realisation helps us in simplifying the discussion, because there is a particularly simple choice of gauge: The coincident gauge.As we have explained in section 2, the coincident gauge is always a viable choice within any theory based on a flat and torsionless connection, independent of any action principle or field equations.In using this gauge freedom, we have essentially fixed   and we are left with 10 + 4 equations for the ten metric components.
It seems that now we run into the problem of having an overdetermined system.However, this is not the case.Because the action functional (2.19) which defines f (ℚ) gravity is diffeomorphism invariant, one can show that the following Bianchi identities hold [41] : where  is the covariant derivative operator with respect to the Levi-Civita connection,   stands for the metric field equations (i.e., the left hand side of (4.1)), and   stands for the connection field equations (i.e., the left hand side of (4.2)).We conclude that when the ten metric field equations are satisfied,   = 0, the connection field equations are automatically satisfied by virtue of the Bianchi identity (4.5).It follows that the use of the coincident gauge "moves" all physical degrees of freedom into the ten metric components and that we only need to study the metric field equations in order to determine these degrees of freedom.
It is convenient to introduce a foliation of (, g  ), as explained in subsection 3.3, and to work with the ADM variables {N, N a , h ab }.Our task is now to isolate all second order time derivatives of lapse N, shift N a , and the intrinsic metric h ab in the metric field equations (4.1).Given the above form of the field equations, this is straightforward: (a) The non-metricity scalar ℚ is constructed solely from the non-metricity tensor Q  := ∇  g  and contains therefore only first order derivatives of the ADM variables.(b) The first term of the metric field equations, f ′ (ℚ) G  , contains only second order time derivatives of h ab , but not of N and N a .This is a well-known property of the Einstein tensor.(c) The second term, − 1 2 (f (ℚ) − f ′ (ℚ) ℚ)g  , is highly non-linear but it contains only first order derivatives of all variables.(d) The third term, 2f ′′ (ℚ)P 0   0 ℚ, contains second order time derivatives of all ADM variables.(Here we have omitted the terms P a   a ℚ since they are irrelevant to the present discussion).
Thus, the kinetic matrix  receives contributions from G  and from P 0   0 ℚ.Given this fact, it is convenient to write the kinetic matrix as the sum of two matrices, where  GR is the kinetic matrix of GR and  M encodes the modifications to the dynamics coming from P 0   0 ℚ.
To explicitly construct these matrices, we choose the following conventions: The second order time derivatives of our basic metric variables are collected into a column vector Ψ, defined as (4.7) The 10 × 10 matrix  acts on Ψ and generates the terms containing second order time derivatives in the metric field equations.
The components of the 10-dimensional vector  Ψ correspond to the 00, 01, 02, 03, 11, 12, 13, 22, 23, 33 components (in this order) of the field equations.It is now evident that  GR has the form where 0 m×n denotes a matrix of dimension m × n whose entries are all zero, and D 6×6 is a placeholder for the actual non-trivial 6 × 6 matrix which describes the second order time derivatives of h ab in the field equations.The upper left and upper right blocks of  GR simply express the fact that there are no second order times derivatives in the components G 0 , while the lower left block tells us that there are no second order time derivatives of lapse and shift in the space-space components G ab .A direct inspection of the space-space components of the Einstein tensor finally reveals that the lower right block matrix in  GR is generated by the term 1 2N which, using our conventions, translates into the matrix which has rank six.In order to determine the matrix  M , it is convenient to use the result obtained in ref. [26], namely that the non-metricity scalar ℚ in terms of ADM variables can be written as where the conjugate momenta are explicitly given by 11 ) 11 The momenta given here differ from the momenta given in (3.24) even for the case  = 1 because here we did not use any integrations by parts to simplify ℚ.In fact, we are not allowed to, since we are merely rewriting ℚ in terms of ADM variables, rather than the action itself.
Observe that the momenta conjugate to lapse and shift do not depend on any velocities, while  ab is linear in ḣcd .Therefore, the action of the differential operator P 0   0 on ℚ reduces to + terms involving first order time derivatives , (4.13) where we have introduced In this form, it is particularly simple to read off the matrix  M .First off, one notices that the ten rows of  M can be written as a single row vector multiplied by ten different pre-factors.Concretely, one finds This immediately tells us that  M has less than full rank.In fact, since all rows are linearly dependent, the rank of  M is just one!We can now put together all pieces and we find that the kinetic matrix can schematically be written as where R  and R  denote the row vector composed of the first four and last six entries of R, respectively, and the arrow symbolizes the process of eliminating linearly dependent rows, which is a necessary step in the determination of 's rank.If f ′′ (ℚ) = 0 with f ′ (ℚ) ≠ 0, the contribution of  M is absent and the kinetic matrix has rank six.This is what one would expect, since the case f ′′ (ℚ) = 0 with f ′ (ℚ) ≠ 0 simply corresponds to recovering GR and rank six means there are 10 − 6 = 4 constraints.These are precisely the constraints that arise from having non-dynamical lapse and shift fields.The more interesting case is when f ′′ (ℚ) ≠ 0, since this leads to modifications.In this case it follows that the rank of  is seven, which conversely implies the presence of three constraints and it places an upper bound on the degrees of freedom: There are at most seven propagating degrees of freedom in f (ℚ) gravity.
In conclusion, we point out that the analysis performed in this section is completely independent of the Hamiltonian analysis which we attempted in section 3. The presence of partial differential equations for the Lagrange multipliers we encountered there cast doubt on the results obtained in ref. [23].The analysis of this section finally conclusively shows that the result obtained in ref. [23], namely that there are eight degrees of freedom, can not be correct.

On a Subtlety Related to Integration by Parts in f (ℚ) Gravity
It might seem puzzling at first that the non-metricity scalar (4.11)  written in ADM variables is only linear in Ṅ and Ṅa , while the metric field equations (4.1), as we have shown in the previous subsection, contain the second order time derivatives N and Na .One would naively expect that a Lagrangian which is quadratic in some velocities but only linear in others does not lead to field equations which contain second order time derivatives of all variables.In fact, in STEGR, where the Lagrangian is  = √ −g ℚ, the field equations contain second order time derivatives of h ab , but not of N and N a .This is consistent with the usual expectations.
Why then do we obtain second order time derivatives of all variables in f (ℚ) gravity?The simple and qualitative answer is that even tough Ṅ and Ṅa appear only linearly in ℚ, the Lagrangian  = √ −g f (ℚ) is in general a non-linear function of ℚ.Thus, in general, Ṅ and Ṅa do in fact not appear linearly in the Lagrangian and it has to be expected that the field equations contain second order time derivatives of lapse and shift.Quantitatively, this can be seen as follows: The Lagrangian of f (ℚ) gravity can schematically be written as where the dots stand collectively for the other ADM variables and their derivatives.If we vary the action with respect to N, we obtain where f ′ stands for df dℚ , as always.In order to properly derive the field equations, we have to perform an integration by parts which removes the time derivative from  Ṅ.As usual, we can drop boundary terms arising from this integration since N vanishes on the boundary and we simply find (4.21) By demanding that this variation vanishes for all N in the bulk, we obtain the field equation which contains second order time derivatives of N because of the term f ′′ Q.It is also evident that this is due to the non-linear nature of f , as explained before.For f ′′ = 0, i.e., if f is a linear function, then the second order time derivatives disappear from the field equations.

Conclusion
The concept of symmetry lies at the very core of theoretical physics, although the specific geometric representation employed to illustrate it is subject to conventional choices.The true triumph of GR never resided solely in the geometrization of gravity, but rather in its unification with inertia.This unification, epitomized by the equivalence principle, signifies that gravity can always be locally eradicated through a change in coordinate systems.Simultaneously, a fundamental characteristic of gauge theories emerges, wherein a gauge field force can be made to locally vanish when its field strength is zero.We may ponder that the coincident GR, which purges both torsion and curvature from gravity, could be the theory of gravity, embracing the astounding accomplishments of the gauge principle in the realm of particle physics.This formulation of GR based on non-metricity represents also a unique extension of gravity with interesting cosmological and astrophysical implications.Theories based on f (ℚ) give rise to accelerating universes and hairy black hole solutions.As a modification of GR, it naturally contains additional degrees of freedom.How many these are still remains an open question.Cosmological perturbation analysis puts a lower bound of four degrees of freedom.Previous studies in the literature unfortunately blindly applied the Dirac-Bergmann algorithm, not realizing that f (ℚ) gravity breaks one of the basic assumptions of the algorithm, and erroneously claimed that the theory propagates eight degrees of freedom.In this work, we explicitly showed that the Hamiltonian analysis based on the Dirac-Bergmann algorithm fails in f (ℚ) gravity.We have identified the reason for this failure and we argued that other teleparallel theories of gravity based on either torsion or non-metricity likely suffer from the same shortcoming.As an alternative approach, we studied the kinetic matrix of f (ℚ) gravity in ADM variables.This allowed us to put an upper bound of seven degrees of freedom.

Appendix: Elements of the Mathematical Theory of 1 st and 2 nd Order PDEs
In the main text we made use of two tools stemming from the theory of partial differential equations (PDEs): The condition (3.45), which tells us when a first order PDE has a unique solution, and the kinetic matrix and its various properties.In this appendix we provide the necessary mathematical background to comprehend these tools and how to work with them, however, without being mathematically too precise.Intuition plays a more central role for us here.More details on the subject of PDEs and, in particular, a rigorous formulation which can withstand the scrutiny of the mathematically inclined reader can be found for instance in the book. [42]1.First Order PDEs and the Solvability Criterion We are concerned with first order PDEs for N variables in D spacetime dimensions.Let us denote the N variables by the Ndimensional vector Ψ := (Ψ 1 , … , Ψ N ) ⊺ .Its components are functions of the D spacetime coordinates x  = (x 1 , … , x D ).What we are interested in is the so-called initial value problem for first order PDEs.Specifically, we want to prescribe initial data on a (D − 1)dimensional Cauchy or initial value hypersurface  and answer the question, whether a given PDE together with initial data on  uniquely determines Ψ everywhere in the spacetime.Qualitatively speaking, if we have a first order PDE, we have to integrate once to find Ψ, which means there will be one "integration constant".Thus, the initial data we have to prescribe in order to fix the "integration constant" is the "value" of Ψ on .We write this symbolically as Ψ|  = F, where F is a vector field which represents the data we have chosen.
From a physicist's perspective, we can think of the initial value problem as follows: We are interested in a certain field Ψ and we assume we have measured, or that we know through some other means, that at a certain instant of time this field is given by F. This instant of time is the (D − 1)-dimensional hypersurface .Given this knowledge, can we predict how Ψ will evolve into the future of ? Similarly, can we retrodict how Ψ behaved in the past of ?In other words, can we determine Ψ everywhere in the spacetime given that we know its field equations and given that we know that on  it is equal to F?
There is a surprisingly simple criterion, which we will derive.The starting point is the fact that so-called quasi-linear first order partial differential equations 12 for Ψ can be compactly written as where M (i) with i ∈ {1, … , D} are D matrices of dimension N × N, L is also a N × N matrix, and V is a N-dimensional vector.All i.e., because we do not know the value of Θ( 1 , … ,  a , … ,  D−1 , ), since Θ is evaluated at a point which is not on .Equation (A.5) also gives us another insight: If we are able to determine Θ  |  , then we can determine Θ in a neighbourhood of .In fact, if we know Θ  |  , then it follows from (A.5) that Thus, we can formally integrate the equation and determine Θ also away from .This is where the PDE (A.1) enters the game.In fact, in the new coordinate system, we can re-write the PDE as where M(i) , L, and Ṽ are the same matrices and vectors as before, but expressed in the new coordinate system.Now observe that if we evaluate this equation on , we obtain schematically It follows that if we are able to solve this equation for Θ  D |  , we can integrate the PDE (A.1) and determine Θ (which is the same as determining Ψ) also away from the surface .Thus, determining whether the initial value problem (A.1) has a unique solution or not is reduced to a problem of linear algebra: One has to determine whether the matrix ∑ D i=1 M(i)  x i is invertible.Recall that  has a normal vector n := ∇, whose components are simply given by n i =  x i for i ∈ {1, … , D}.Furthermore, a matrix is invertible if and only if its determinant is not zero.It follows that the initial value problem admits a unique solution in a neighborhood of  if an only if det This is precisely the solvability criterion (3.45) we have used when analyzing the PDEs (3.42) for the Lagrange multipliers, which emerged from the Dirac-Bergmann algorithm.In (3.45), we found that for that particular PDE the determinant is zero for all x, no matter how one chooses the Cauchy surface.Hence, it is never possible to find a unique solution to that PDE.One can show that the rank of the matrix ∑ 3 i=1 M (i) n i is two, which means that after performing linear operations on the PDE (3.42) (such as adding/subtracting equations from each other or multiply/divide them by certain functions) one can bring the PDE into the form Evidently, in order to determine  z  1 and  z  2 , one has to prescribe  z  3 by hand.This holds in full generality: If the rank of the matrix ∑ D i=1 M(i) n i is r < D, then it is necessary to prescribe D − r of the partial derivatives Θ  in order to be able to determine the remaining r derivatives from the PDE and the initial data.
We close in remarking that this is a common feature in gauge theories.Also, further examples of how to apply the tools introduced here can be found in the Appendix A.2 and A.3 of ref. [43].

A.2. Second Order PDEs and the Kinetic Matrix
The results of the previous subsection can be readily generalized to higher order PDEs.In particular, the Cauchy or initial value problem for a quasi-linear second order PDE can be written as Since now it is necessary to perform two integrations in order to determine Ψ, we have to specify two initial value functions on .
As before, we have to specify Ψ on , but we also have to say what the derivative of Ψ in the direction orthogonal to  is.That is the meaning of n    Ψ|  , where n  is the unit normal vector to .By following the same ideas and steps as in the previous subsection, one can prove that the initial value problem (A.11) has a unique solution if and only if det This is the obvious analogue of Equation (A.9) which we found for first order PDEs.In practice, it is often more convenient to write the initial value problem (A.11) in a slightly different and more intuitive form.First of all, we assume that we can pick out a time coordinate from x  = (x 1 , … , x D ).Our convention is to set x D = t and call it the time coordinate.The coordinates (x 1 , … , x D−1 ) are consequently referred to as spatial coordinates.Furthermore, we define Ψ :=  t Ψ and we take the Cauchy surface  to be a t = const.surface.This latter choice in particular implies that n    Ψ|  = Ψ|  = G.This is essentially the physicist's version of the mathematically more general formulation presented in the previous subsection.Given this split of spacetime into space and time, we can also divide the matrices M (ij) into matrices with time-time indices, space-time, and space-space indices.Concretely, the initial value matrices  (ij) of dimension N × N, which we call potential matrices.
It is easy to see that in the coordinates we have chosen, the solvability condition (A.12) simply reads det ( (x) ) ≠ 0 f o r a l l x ∈  .(A.14) Thus, we immediately see why the kinetic matrix is so important in determining the number of degrees of freedom propagated by second order PDEs: If  has full rank, it also has a non-vanishing determinant and we can thus solve for all Ψ's, i.e., we can integrate the second order PDEs.If, however,  does not have maximal rank, its determinant is also zero and the PDEs cannot be integrated.This means we are not able to determine all the fields Ψ just from the initial data and the PDEs.More input is needed.
To see this, we simply notice that we can always perform linear operations (addition/subtraction of equations, multiplication by real numbers/functions) on our system of second order PDEs (A.13).Using such operations, we can always bring the system into a form in which the kinetic matrix has a row-echelon form.We denote the row-echelon form of  by .If the rank of  is r := rank() < N, its row-echelon form will consist of r nontrivial and linearly independent rows, while the last N − r rows are filled with zeros: This means we can solve the PDEs for the first r second order time derivatives of Ψ, namely for Ψ1 , Ψ2 , … , Ψr .However, there are no equations which determine Ψr+1 , Ψr+2 , … , ΨN .Rather, it is possible that these second order time derivatives appear on the right hand side of the expressions one obtaines for Ψ1 , Ψ2 , … , Ψr .Thus, the initial data and the PDEs are not sufficient to determine Ψ1 , Ψ2 , … , Ψr , we also have to specify the fields Ψ r+1 , Ψ r+2 , … , Ψ N everywhere by hand.The row-echelon form of  also tells us that we are not completely free in specifying the initial data.In fact, since the last N − r rows of  are zero, this implies that the last N − r equations of our system of PDEs contain at most first order time derivatives or no time derivatives at all.This follows from the fact that after performing the linear operations necessary to bring  into its row-echelon form, the matrices  (i) will in general not have their last N − r rows filled with zeros.Thus, these equations represent constraint equations.The initial data have to respect these constraints.
Moreover, we have to study how these constraints behave under time evolution.Concretely, we are allowed to take time derivatives of our PDEs and this can potentially generate new constraint equations.The mechanism can be sketched as follows: We assume that the system (A.13) has been manipulated such that  is in row-echelon form.Its last N − r equations are thus constraint equations.By taking time derivatives of these constraints, we obtain new equations of the form That is, we obtain second order time derivatives of Ψ.By taking spatial derivatives of the first r equations of (A.13) we obtain schematically   i Ψ + terms linear in  t Ψ = 0 .(A.17) Under the right circumstances, it can happen that Equation (A.17) can be used to eliminate  i Ψ terms in Equation (A. 16) and that we are left with new equations which are first order in time derivatives.Hence, it can happen that we encounter new constraint equations.Even tough we have not worked out the precise conditions under which new constraints can appear, the argument we sketched above hinges on the presence of  i Ψ terms in the time derivative of the constraint equations.These terms can only appear if  (i) are non-trivial matrices.If, however, the matrices  (i) are absent, it is not possible to generate new constraints.This suggests the following approach: Find a field-redefinition such that the PDEs for the new fields contain no  (i) matrices.Then, the number of dynamical degrees of freedom can be directly determined from the rank of .
Notice that these field redefinitions can not be linear transformations, since no linear transformation can ever eliminate a matrix.These redefinitions are highly non-trivial, but they can be found as follows: Assuming that the PDEs (A.13) stem from an action principle, write the corresponding Lagrangian in the form This can be written even more compactly using a single "master matrix", namely  (1) 2 (11)  (12) ⋯  (1d)  (2)  (12) 2 (22) ⋯  (2d) ⋮ ⋯ ⋮  (d)  (1d)  (2d) ⋯ 2 (dd) where we have introduced d := D − 1 for brevity.Diagonalizing this master matrix is tantamount to finding a field redefinition such that the mixing matrices  (i) are all absent.Moreover, the kinetic matrix is then automatically diagonal and the number of propagating degrees of freedom can be read off by counting the non-zero diagonal elements.

Figure 1 .
Figure 1.Each primary sector (first column) is defined by the vanishing of certain combinations of the parameters c 1 , c 2 , c 3 , c 4 , and c 5 (columns two through six).For brevity, and following,[27] we have defined c := c 1 + c 2 + c 3 + c 4 + c 5 , ĉ := 2c 1 + c 2 + c 4 , and c 35 := 2c 3 + c 5 .The vanishing of these parameters (or the vanishing of combinations thereof) corresponds to the appearance of one or more primary constraints (the number of independent primary constraints is shown in the last column).STEGR is contained in sector V.

Figure 2 .
Figure2.Each primary sector (first column) is defined by the vanishing of certain parameter combinations (columns two through five).The vanishing of one or more than one of these parameter combinations corresponds to the appearance of one or more than one primary constraints.The number of independent primary constraints is listed in the last column.TEGR is contained in sector V.This table was inspired by,[34] but uses a slightly different nomenclature for the primary sectors.


(ij)  (x)  i  j Ψ +lower order terms = 0Ψ|  = F Ψ| | = G , (A.13)where  is the N × N kinetic matrix,  (i) are D − 1 matrices of dimension N × N which we refer to as mixing matrices (since they mix spatial and temporal derivatives of Ψ, i.e., derivatives of the form  i Ψ), and finally we have theD(D−1) 2 Fortschr.Phys.2023, 71, 2300185