Ising Machines for Diophantine Problems in Physics

Diophantine problems arise frequently in physics, in for example anomaly cancellation conditions, string consistency conditions and so forth. We present methods to solve such problems to high order on annealers that are based on the quadratic Ising Model. This is the intrinsic framework for both quantum annealing and for common forms of classical simulated annealing. We demonstrate the method on so-called Taxicab numbers (discovering some apparently new ones), and on the realistic problem of anomaly cancellation in $U(1)$ extensions of the Standard Model.

shall do this by focusing on two Diophantine tasks.
The first is the purely number theoretic one of finding what we will refer to generally as "Taxicab" numbers, namely those numbers that can be expressed in more than one way as sums of equal powers. The most famous example is the number of Hardy and Ramanujan's eponymous taxi, Ta(2) = 1729. This is the smallest of the following list of numbers, all of which are expressible as the sum of two cubes in two different ways: We shall use the notation (k, m, n), to refer to such numbers, where k is the power, while m and n are the number of terms on each side. Thus Ta(2) = 1729 is defined to be the smallest (3, 2, 2) number, while Fermat's theorem is the statement that (k, 1, 2) numbers only exist for k = 2. Here we will develop annealing methods to determine the above list of (3, 2, 2) numbers (where we consider all numbers in the list to be of interest not just the smallest). We also test our methods on several variations, namely (4,3,3), (3,1,5), (3,1,7), (3,6,6), (3,7,7), (3,8,8). Examples of most of these are known, and can be found in Refs. [17][18][19][20][21], although some were discovered only with the advent of high performance computing and appeared relatively recently. However some, such as (3,7,7) and (3,8,8) numbers, do not seem to have been known before. (Indeed as we shall see the latter represent solutions in a search space of size ∼ 10 24 .) The second task that we will consider is the physical one of finding solutions to the anomaly cancellation conditions of a typical extension of the Standard Model of particle physics. In four space-time dimensions this is as mentioned also a cubic (i.e. homogeneous, third order) Diophantine problem, in which the integers correspond to the numerators of rational gauge charges. (In 2d dimensions the equations are instead order d + 1). In this case as well, we will find that a sufficiently well-crafted encoding onto an annealer allows one to solve for anomaly cancellation in the systems considered in Refs. [3][4][5], at an already relatively advanced level, and in a very short time, and certainly without having to perform any kind of exhaustive scan.
Although we will not consider 't Hooft anomaly matching explicitly in our discussion, it is worth mentioning that this nonperturbative procedure is morally a generalisation of the Ta(2) Taxicab problem. That is, when two particle theories are duals of each other, then by 't Hooft's argument they must realise the same set of global anomalies in two different ways. Similarly physical trialities and quadralities are generalisations of the Ta (3) and Ta(4) Taxicab problems (which realise the same number in 3 and 4 different ways respectively). Such systems exist but are comparatively rare [22][23][24].
Constructing suitably efficient encodings for these sorts of problems requires significant advancement. Firstly as we shall review we are interested in the kinds of set-ups found in quantum annealers, in which problems are encoded in the Hamiltonian of a quadratic Ising model, which must then be minimised to solve the problem. Thus the crux of the matter in encoding a non-trivial system of cubic-and higher-order Diophantine equations is to implement a reduction procedure that can represent the complete system as a single loss-function represented by a spin-Hamiltonian that is at most quadratic. Of course Diophantine problems, in particular factorisation, have been considered on Ising model annealers before [25][26][27][28][29][30], along with quadratic systems of polynomial equations [31,32]. However both these problems can be mapped into the optimisation of an order four Hamiltonian which can in turn be reduced to a quadratic Ising model suitable for a quantum annealer, with only two layers of reduction. By contrast, here we will be considering problems of order much higher than two. To accomplish this, we use a procedure to automate the reduction of an arbitrary order Hamiltonian to a quadratic one. This procedure, which iterates that first appearing in Ref. [33] and then more recently in Refs. [15,26,34,35], is completely problem-independent and therefore potentially applicable to any set of Diophantine equations. It can perform the many layers of reduction required to reach a quadratic spin-Hamiltonian representation of the high order problems we will be considering.
Furthermore, the integers in the Diophantine equations are naturally encoded using a binary representation, which is what ultimately will enable the procedure to be much more efficient than a systematic scan. However this in turn leads to large and highly connected Ising model encodings of the Diophantine system of equations. Thus to improve efficiency we introduce a technique we refer to as solution-mining. This innovation begins by finding one solution in the conventional manner. A random perturbation from this solution then serves as the new starting point for the next run. From there the system often tunnels to a new solution nearby, and when it does so this serves as the next starting point, and so on. This approach appears to be effective for problems, like anomaly cancellation, that contain many coupled Diophantine equations, and it allows the domain of solutions to wander in the parameter space. We shall describe these techniques in Section II.
The rest of the paper is organised as follows. In Section III we begin with a warm-up problem which is to solve a single Diophantine polynomial equation with two variables. We use this simple problem to discuss how annealing is implemented on a quantum annealer, and to compare the performance of currently available quantum annealers with classical simulated annealers. Our conclusion here is that quantum annealers are only just beginning to catch up with classical simulated annealing in terms of the complexity of problems that can be embedded and solved, but when they do become competitive the potential speed-ups could be substantial for this kind of problem. Then Sections IV and V consider the Taxicab problems, and anomalies respectively.

II. METHODS FOR ENCODING DIOPHANTINE PROBLEMS
The general principle behind an Ising machine is to solve problems by reformulating their solution as the minimisation of a function H of spin variables σ = ±1, where labels the spin sites. On a quantum annealer this so-called problem-Hamiltonian, H, is forced (by the physical architecture of the annealer) to be quadratic in the spins, where the spins would of course correspond to physical qubits. We can postpone further discussion of the physical realisation of the system (which will be described later) and focus on the embedded abstract problem which will apply to any annealer of this kind. Specialising to the present case, we are interested in solving a set of polynomial Diophantine equations f A (t i ) = 0, where t i ∈ Z are the would-be integer solutions to the problem of interest. These integers will be encoded on the annealer in a binary format, namely we will use the following encoding: We use τ to denote the binary variables corresponding to a given spin, with τ i,k ∈ {0, 1}, and where we allow classical integer shifts, s i ∈ Z. These shifts can for example be negative to allow the domain to include negative integers or, as will be the case for solution-mining, they can be adjusted iteratively to explore the search space.
We would like to use such an encoding to solve the Diophantine equations, and this can in principle be done by finding the minimum of a loss-function Hamiltonian, In addition one may wish to add several constraints: for example the Taxicab numbers are usually defined as the smallest numbers expressible in two different ways. Such conditions can be included with a constraint Hamiltonian, H C , which might in the case of the Taxicab numbers be simply the numbers themselves (since they are positive). Thus we begin with an idealised (i.e. nonquadratic) system, Note that solutions to the Diophantine equations all have H D = 0, so that constraints imposed by H C would independently select the preferred solution. However the converse is generally not true: that is one should avoid over-weighting the constraints H C such that competing minima appear that have lower H C but H D = 0. Of course in many cases the desired solutions are very rare, so it is often much more efficient (or more precisely not an NP-hard problem) to simply apply any desired constraints by post-processing the solutions (e.g. for the Taxicab numbers, one could simply select by hand the smallest number).

A. Reduction
For a Diophantine system containing order-d polynomials in t i , the raw HamiltonianH in Eq. (6) is an order-2d polynomial in the spins, σ . For example the Ta(2) problem yields an order-6 spin-polynomial. Therefore we now arrive at the task of transformingH into an equivalent quadratic problem-Hamiltonian, H, that is guaranteed to have the same minima, but which can be implemented on the annealer.
For this task we shall use the reduction method described in the Appendix of Ref. [35]. This method works by introducing auxilliary spins 1 to represent pairs of spins in the original Hamiltonian of Eq. (6), and is one of the many methods in the comprehensive survey of Ref. [34]. (We should remark that there exist "qubit-saving" reduction methods that do not require auxilliary spins, but these are more task specific and currently appear to be restricted to reduction of terms in the Hamiltonian with products of 3 or 4 spins [26].) The method works as follows. We begin with the raw polynomialH(σ ) written as a function of binary variables using Eq. (4). SupposeH has terms involving products of two binary variables τ 1 and τ 2 . Now consider adding to the polynomialH a quadratic term that involves the binary variables together with a new auxiliary variable τ 12 , which is of the form Q(τ12; τ1, τ2) = Λ(τ 1 τ 2 − 2τ 12 (τ 1 + τ 2 ) + 3τ 12 ) . (7) Inspection shows that a sufficiently large and positive overall coupling Λ enforces τ 12 = τ 1 τ 2 . Importantly the minimum at this point has Q = 0. Therefore we may replace the product τ 1 τ 2 with τ 12 wherever it appears withinH, and the new Hamiltonian is guaranteed to have the same set of minima as the originalH. Therefore the process can be iterated until one arrives at the desired problem-Hamiltonian which is quadratic in spins, and which is schematically of the form with the constraints imposed by the Q terms ensuring that this quadratic Hamiltonian has the same minima as the original order-2d polynomial. We can check that the reduction works correctly with the example order-3 Hamiltoniañ where we drop the constant −1 in translating to the binary variables. This Hamiltonian has 4 minima at σ 1 σ 2 σ 3 = −1 (which corresponds in binary language to any one of the τ being zero, or all of them), as opposed to the seven solutions to τ 1 τ 2 τ 3 = 0. As described above we can reduce the trilinear term by trading τ 1 τ 2 for an auxiliary binary τ 12 and adding the Hamiltonian Q(τ12; τ1, τ2). The quadratic problem-Hamiltonian (in QUBO language) is then It is easy to verify that provided Λ > 2 the original 4 degenerate solutions hold in the new combined Hamiltonian as required.
With the increasing complexity of the raw Hamiltoniañ H and a limited number of physical qubits at our disposal, we of course aim to find a reduction procedure that minimises the number of auxiliary variables. Therefore, the central question is how can we choose the smallest set of spin pairs that correctly collapses all the higher order terms to quadratics? In the case of cubic to quadratic, finding a spin-optimised procedure is equivalent to the set cover problem which can in turn be cast as 0-1 ILP [36]. Both set cover and 0-1 ILP are well known to be NPcomplete by analogy with vertex cover [37]. Therefore, generalising this to arbitrary order Hamiltonians would recast our spin-optimised problem into a task which is at least equivalent to solving k − 2 NP-complete problems, where k is the order of the Hamiltonian and therefore k − 2 are the required layers of reduction. For this reason, we shall use a different approach based on a simple greedy algorithm which works as follows: at each reduction stage it finds the pair of binary variables τ i τ j that appears most often in the Hamiltonian; wherever the pair appears, we replace τ i τ j with the auxilliary logical spin τ ij , and add the penalty term in Eq. (7). The quadratised Hamiltonian is constructed by repeating these three steps iteratively. In the language of set covering, this is equivalent to the greedy heuristic algorithm first proposed in Ref. [38]. In Fig. 1 we have collected three plots which show how the average number of required auxilliary variables grows as we increase the number of cubic interactions, the rate of this growth in the linear central region and the time required to perform the reduction as a function of the number of cubic couplings.  These results are very similar to those obtained in Ref. [36] where the optimal reduction is found by solving exactly the equivalent 0-1 ILP. Two important remarks are in order. First, as we can see in Fig. 1a, both methods saturate approximately when the Hamiltonian contains all possible cubic interactions with n qubits, namely when N 3-couplins ≈ n 3 . Second, we see that in either case the number of auxilliary spins increases linearly with the square root of the number of cubic interactions and the growth rate is given by the square root of the total number of spins, as shown in Fig. 1b. Nevertheless, as we can see in Fig 1a, in some regions our procedure uses a number of auxiliaries that is larger than the value at saturation, especially in the n = 12 and n = 13 cases. This is of course due to the fact that we are not seeking an exact solution of the spin-optimised reduction problem. Indeed, a local optimal choice of our reduction algorithm does not necessarily lead to a global minimum in terms of the number of auxilliary spins. More precisely, in the language of the equivalent set covering problem, it has been shown in Ref. [38] that this greedy algorithm returns to an approximate solution which cannot be bigger than H(n) times the minimum one, where H(n) is the n-th harmonic number and n the size of the set to be covered (namely in our case the set of all higher order couplings). However, in the problems treated below, the greedy algorithm we adopt returns quadratised Hamiltonians with at most ∼ 300 logical spins, far below the limit imposed by for example the number of available qubits in the currently accessible quantum annealers, making it unnecessary to solve the problem exactly. Finally, we should remark that this procedure is straightforwardly generalisable to Hamiltonians of arbitrary order, requiring a number of steps which grows roughly linearly with the size of the problem, as we can see in Fig. 1c and also in Ref. [38]. As expected, this is in contrast with the exact method discussed in Ref. [36] which shows an exponentially increasing amount of time with the increasing complexity of the Hamiltonian. Reduced Hamiltonians can be represented using connected graphs in which nodes correspond to spins and links to couplings. As an example, Fig. 2 is a representation of the quadratised Hamiltonian associated to the first Diophantine equation in Table II.

B. Solution mining for improved performance
Given the increase in difficulty with bitnumber, β, we will utilise a method for improving the performance. This method allows one to explore larger regions of parameter space (i.e. larger integers) without increasing β, yielding in turn solutions with larger values.
The method operates iteratively, by at each run constructing a brand new Hamiltonian from the previously found solutions. At say the k-th iteration, we minimise the Hamiltonian looking for solutions of the form where k = 0, ..., N (with N being the total number of anneal runs), and where s k i is a classical shift that centres the new search, which is determined from a solution found in the (k − 1)-th run: if we designate the previous solutiont k−1 i , then the {s k i } are chosen such that based on a random choice. This procedure finds new solutions by performing a kind of "random tunnelling" from previously found solutions (hence the name "solution-mining"). It generally operates well when there are many variables in the system and many different equations, because in such systems the solutions can be relatively close in each dimension of the search space (even though the total Hamming distance could be very large due to the large number of dimensions). For the two specific example problems we are discussing here, it is not a useful enhancement for finding Taxicab numbers because there one is seeking the smallest numbers, and (as we shall see) the solutions to the Diophantine system are very widely spaced. However for solving anomaly equations the method is a significant improvement. In such systems, new solutions to the anomaly equations tend to appear with consistent frequency when the allowed charge size is increased, and it is the sheer number of anomaly equations and charges that makes the problem difficult.
It should be noted that there is no additional cost for solution-mining because even though a brand new Hamiltonian must be constructed at each stage, the embedding graph remains the same if the values of β do not change. This means we construct an entirely new HamiltonianH, but do not need to perform a new reduction of the solution. On a quantum annealer we perform reverse annealing (to be explained below) in order to collect the solution and construct the new Hamiltonian at each stage, which then simply has to be translated into new couplings via the updated {s k i } values.

III. QUANTUM VERSUS CLASSICAL ANNEALERS FOR DIOPHANTINE EQUATIONS
In this section we shall compare quantum and classical annealers to solve Diophantine equations, using the encoding methods described above on a simple warm-up problem. It has been already shown in Ref. [31] that a quantum annealer can successfully be used to solve second order systems of polynomial equations. On the other hand, simulated annealing and variations thereof have been extensively applied for similar purposes (see Ref [39,40]) along with other techniques such as genetic algorithms, particle swarm optimisation and so on (see for example [41][42][43][44]). In the following we shall see that quantum annealing is also successful in solving equations of order higher than two, retrieving most of the results found using Particle Swarm Optimisation in Ref. [42] and Fuzzy Adaptive Simulated Annealing in Ref. [39]. We shall find that simulated annealing on the quadratised Hamiltonian can be used to solve problems which are still hard to embed in a quantum annealer due to technological limitations (restricted number of qubits and limited connectivity). Such problems will realistically be solvable in the near future with annealers that have higher connectivity but still with at most quadratic interactions.
Let us first outline the central features of the quantum annealer that we will need for this study (for a comprehensive review see Ref. [15]). The quantum annealer is defined by a Hamiltonian of physical qubits of the form where H is the problem-Hamiltonian in Eq. 2, σ , Since H(1) = H(σ ), the usual quantum annealing strategy is adiabatically to adjust the pre-factors A(s) and B(s) using the parameter s, such that the system ends up in a global minimum of H. The time dependence of s(t) is defined by the user in a so-called annealschedule. Thus during a reverse anneal for example one begins at s = 1 and with the system set in any eigenstate of σ ,z . Then one allows the system to evolve quantum-mechanically by bringing s to small values using a piecewise-linear function s(t) of time t, which completes back at s(t final ) = 1 when the measurement of final spins is made.
A technological limitation is that the connectivity of the quantum annealing device in terms of the allowed non-vanishing couplings J m between qubits is limited. Let us be specific to the architecture we will be using in this work, namely D-Wave's [45] Advantage_system4.1: this annealer contains 5627 qubits, connected in a Pegasus structure, but only has a total of 40279 couplings between them.
The warm-up problem that we will use to compare this kind of annealer with classical simulated annealing is simply to find solutions of a generic Diophantine equation where x 1 , ..., x N ∈ Z and f : Z N → Z is a generic order k polynomial function. In order to solve such an equation, we again square it to form the problem-Hamiltonian: The next step is to binary encode x 1 , . . . , x N as in Eq. (3), choosing the values of s i and β depending on the specific problem.
The above Hamiltonian is therefore an order-2k polynomial in τ i,j . Nevertheless, regardless of how high the order is, it can always be reduced to a quadratic Ising Hamiltonian making use of the procedure described in Sec. II A. Using the D-Wave's quantum annealer we find all the solutions listed in Table 2 of Ref. [42] (which have also been found using Fuzzy Adaptive Simulated Annealing in Ref. [39]). We report some of them in Table I.

Equation
Solution Naux β  Table I. A selection of Diophantine equations and corresponding solutions found using the D-Wave's quantum annealer.
Naux is the number of auxilliary qubits necessary to quadratise the Hamiltonian, and β is the number of qubits used to encode each variable (see Eq. (3)).
To test this method further, we move on to Diophantine equations with increasingly many variables. As one would expect, the higher the number of variables, the higher will be the average number of interactions per qubit. This often means that when the connectivity of the problem exceeds the native connections supported by the D-Wave Quantum Processor Unit (QPU), a single binary variable in the quadratic optimization problem needs to be represented by two (or more) qubits (called a 'chain') instead of one. This procedure, known as embedding is carried out by an embedding algorithm, and should be carefully monitored as it can lead to so-called broken-chains that have two or more physical qubits in the same chain taking different values. This ultimately limits the size of problems that can be solved on quantum annealers, while performing classical annealing on the same Ising Hamiltonian turns out to be successful in all the examples treated in Table II, where we list the equations solved using both classical (cyan) and quantum (blue) annealing.

IV. RAMANUJAN (TAXICAB) NUMBERS
Having demonstrated that problems such as these can in principle be already solved on a quantum annealer, we are now ready to move on to the more complicated class of problems outlined in the introduction, beginning in this section with Taxicab numbers. In line with the above discussion, classical simulated annealing turns out to be superior currently for these problems, so we will use that annealing method in this and the next section.
In general, finding Taxicab numbers is not a trivial task and for higher Taxicabs, such as Ta (7), Ta(8) etc., only an upper bound is known [46]. Indeed (using the (k, m, n) notation for these numbers), it is interesting to note that no (5, 2, 2) numbers have been found, despite searches up to 10 26 (see Ref. [17]).
Let us show explicitly how we use the reduction technique of Subsection II A to construct an Ising Model Hamiltonian whose ground states are precisely the Taxicab numbers we want to find. As a first example we focus on Ta(2), i.e. we want to find four non-negative integer numbers such that We again use binary encoding (see Eq. (3)) with β = 5, s i = 1 (numbers from 1 to 32) and t i ∈ {a, b, c, d}. To impose the equality between the two sums of cubes we define the following Hamiltonian However, this is not the end of the story as must also encode the constraint a = {c, d} to avoid all the trivial minima of the above Hamiltonian, which occur when a = c and b = d or vice versa. In other words we want to construct the H C Hamiltonian such that it has its global minimum when a = c and a = d. It is more straightforward to write such a constraint Hamiltonian directly in terms of binary variables τ i,k , where i ∈ {a, b, c, d} and k = 0, ..., β − 1. It is easy to see that the Hamiltonian achieves this. Explicitly, one finds that The Hamiltonian we shall use is then the sum Written is terms of τ 's, this Hamiltonian is a polynomial of order 2β for β ≥ 3. Again, setting β = 5 and using the technique described in Sec. II we can reduce it to a quadratic Hamiltonian by adding 98 auxilliary spins. In Fig. 3 we represent the reduced Hamiltonian for β = 4. We see that stronger couplings are rare among the interactions, which mostly form a very complex network of weaker couplings in the background. Classical annealing on the reduced Hamiltonian yields all the solutions written in Eq. (1), namely all the Taxicab numbers with a, b, c, d ≤ 32.
Let us now push this further and attempt to solve more complicated generalisation of the Taxicab problem, (k, m, n) where where {a 1 , ..., a m } = {b 1 , ..., b n }. Beginning with (4, 3, 3) numbers, we define the following Hamiltonians and to impose the equality in Eq. (22) and also to force a = d, e, f . The order of the complete Hamiltonian, which is the sum of Eq. 24 and Eq. 23, is 2β for β ≥ 4. Again, it can be reduced to a quadratic one by adding 66 auxilliary variables in the case with β = 4 and 154 in the case with β = 5. Several anneal runs (each with 10000 reads) with β = 4, 5 yield the following results  Table III. List of (4, 3, 3) numbers found using β = 3, 4, 5 and 10000 reads per anneal run.
To comment on the efficacy of the method: the search space is of order 32 6 ∼ 10 9 , and yet these solutions are found after order 10 5 reads.
For the remainder of this section we consider the (3, n, m) numbers, where n, m ∈ N + . For this purpose we define the following Hamiltoniañ where in this case we do not add any constraint Hamiltonian to enforce {a i } = {b i } when n = m, because here it is sufficient to simply check at the end of each anneal run if the minimum is trivial or not.
(3,1,7) a1 b1 b2 b3 b4 b5 b6 b7 2744 14 2 3 5 7 8 9 10  13824 24 3 5 8 9 13 15 19  32768 32 1 6 15 16 17 20 23  148877 53 3 21 24 28 29 32 36  205379 59 5 12 13 18 23 43 47  238328 62 17 20 22 31 32 38 46   Table V. A list of (3, 1, 7) numbers found using β = 5, 6. The reduction needs 160 and 288 auxilliary spins respectively.  Note that all the above solutions are non-trivial (3, n, m) numbers, in that they are not sums of smaller solutions. Indeed, it may happen that a (3, n, m) number is actually the sum of (3, p, q) and (3, k, l) numbers with p + k = n and q + l = m. In order to avoid such trivial solutions, we have simply removed them by hand at the end of each anneal run. Having road-tested our reduction methods on simple problems, we now turn to a physical application, namely the anomaly cancellation conditions in the Standard Model extended by an extra U (1) gauge symmetry. This is one of the simplest and most studied extensions of the Standard Model (see Ref. [47] for a review of Z physics), and it has been the target of numerous experimental searches [48].
The generalities of anomaly cancellation for such systems have been discussed in Refs. [2][3][4][5][49][50][51][52][53][54][55][56][57][58][59]. In this work we will for concreteness specialise to the models studied in Ref. [4]. Here, the main assumption is that the chiral fermions appear in the usual 3 families of quarks and leptons, together with 3 right-handed neutrinos. The charges under the additional U (1) are labelled by indicating the generation number. Under this assumption the anomaly cancellation condition yields the following set of Diophantine equations for the charges: A general solution to the above equations has already been found analytically in Ref. [4]. However, we shall demonstrate here that these problems can be also tackled using Ising model annealing (in practice here we use simulated annealing, but ultimately quantum annealers will be practicable).
As for the Taxicab problem, we begin constructing the Hamiltonian by simply squaring and summing the left hand side of all the above equations. We encode all the variables involved as in Eq. (3) with t i ∈ {Q i , U i , D i , L i , E i , N i } and take s i = −1 for all the charges. We set β = 2, thus looking for solutions with entries from −1 to 2. Note that although the number of bits we use to represent each variable is relatively low, the number of possible configurations of these 3 × 6 = 18 charges with possible values in [−1, 2] is already quite high: 4 18 ∼ 10 10 . It is worth mentioning that in this particular case a comprehensive scan can be completed with far fewer attempts due to generation permutation symmetry in the equations. Indeed, it is easy to see that the anomaly equations are invariant under arbitrary permutations of {A 1 , A 2 , A 3 }, where A ∈ {Q, U, D, L, E, N }, giving an (S 3 ) 6 permutational symmetry that could be exploited if we were looking for solutions by exhaustive scanning over all the different configurations.
Of course our goal here is to avoid using such tricks, but to instead find solutions using annealing on the reduced Ising Hamiltonian. For β = 2 the reduction requires only 18 auxiliaries. We have performed several anneal runs with 10000 reads obtaining an average of 60 distinct solutions per anneal run. In the following table we present a sample of three of them.  Table IX. A sample of three solutions found using β = 2 and 10000 reads in each anneal run.
One might expect higher values of β to lead to new solutions with bigger entries, along with those previously found. However, for this specific problem, classical annealing turns out to be unfruitful for β > 2. To explain why it is useful to inspect how the energy gap ∆ between the ground state and the first excited state scales as a function of the size of the problem. It can be shown (see Ref. [32]) that where α and µ are the number of additions and multiplications respectively in the Hamiltonian written in terms of binary variables, m is the number of equations we want to solve and n is the the effective precision, which is the difference between the largest and smallest nonzero absolute values representable among all the variables in the system. Increasing β makes all these parameter bigger, including n and µ, causing an exponential shrinkage of the energy gap between the ground states and the first excited states, which in turn considerably affects the algorithm's performance. To improve our results and find solutions with bigger entries we use the solution-mining method described in Sec. II B. After 30 anneal runs this yields 153 solutions with entries between −13 and 13. Note that a complete scan on all possible such configurations, even exploiting the (S 3 ) 6 permutational symmetry, would involve 13×2+1 3 6 ∼ 10 20 trials, which is infeasible with conventional computing methods. It should be noted that we do not make use of the permutational symmetry and the Ising machine is in principle succeeding within a search space of 26 18 ∼ 3 × 10 25 , although it is not yet clear how exhaustive the method of small β plus solution mining can eventually be.
In Table X we present a sample of ten of the solutions found.  Table X. A sample of ten solutions found using the solutionmining method.
The first of these solutions is equivalent to one found previously in Table IX. This is because in the first anneal run the algorithm looks for solutions centered around zero, i.e. with entries between [−1, 2]. Then it starts exploring the neighborhood of the solution found in the previous anneal run, gradually finding solutions with larger entries.

VI. CONCLUSIONS
Diophantine problems in physics and beyond are often computationally hard. In this paper we have investigated the use of Ising machines for finding solutions to such problems, and have shown that they can succeed within search spaces that are vast. For example finding the (3,8,8) number 50139, the lowest number we find that can be written as the sum of eight cubes in two different ways, represents a search in a space of size 10 24 . Furthermore using "solution mining" for the task of anomaly cancellation, we find a proven ability to find solutions in search spaces of order 10 26 .
The methods described here are valid for any Ising machine, including both quantum and simulated annealers. We found that currently available quantum annealers can already solve a large variety of Diophantine problems at a relatively advanced level, but they are not yet competitive with their simulated counterparts. Nevertheless there are several reasons to believe that ultimately quantum annealers will become dominant for such tasks. The first is to do with the current obstacle to performing higher order tasks on a quantum annealer which is the fact that with the increasing complexity of the equations, finding a suitable embedding (i.e. an embedding such that the connectivity and the number of auxilliary qubits required are within the limits imposed by the quantum processing unit) is a non trivial task. Indeed it is worth noting that deciding whether a graph G can be embedded in a graph H is itself an NP-complete problem (when H is arbitrary but as in current annealers G is built out of Chimera or Pegasus sub-graphs) [60]. Thus one expects significant improvement in embedding as the overall size of the annealer, and the connectivity of its sub-graphs is continually increased.
The second reason to be optimistic about quantum annealers is in the potential speed-up in the way that they find global minima. For example there are many techniques open to quantum annealers such as diabatic annealing that have the potential to avoid anneal times increasing exponentially with the difficulty of the problem [61], an issue that is seen in both adiabatic quantum annealing and in simulated annealing. These physical aspects of quantum annealing which were also crucial in the quantum field theory tunneling studies in Refs. [62][63][64] cannot be efficiently simulated classically. Indeed discrete problems generally favour quantum annealers because in a sense these machines can operate by performing a quantum gradient descent by tunnelling. By contrast any simulation method requires a defined dynamical process for its evolution, in order to hop between potential solutions, and this tends to become increasingly delicate with the difficulty of the problem. Thus for certain problems quantum tunneling could be an enormous advantage. For example, it has already been shown that quantum annealing overcomes simulated annealing in a large variety of cases (see for example Refs. [65][66][67]). Particularly interesting are the results found in Ref. [67] showing quantum annealing outperforming classical annealing by a factor of 10 8 in finding the minimum of a particular crafted problem with tall and narrow energy barriers separating local minima. This is precisely the kind of configuration one expects when solving Diophantine problems.
In summary, we believe that the methods presented here should be an effective heuristic search method for the many discrete problems one encounters in physics.