An overview of Saharon Shelah’s contributions to mathematical logic, in particular to model theory∗

∗This overview is based on a lecture the author gave in the Rolf Schock Prize Symposium in Logic and Philosophy in Stockholm, 2018. The author is grateful to Andrew Arana, John Baldwin, Mirna Dzamonja and Juliette Kennedy for comments concerning preliminary versions of this paper.
†While writing this overview the author was supported by the Fondation Sciences Mathématiques de Paris Distinguished Professor Fellowship, the Faculty of Science of the University of Helsinki, and grant 322795 of the Academy of Finland.

sometimes tease Saharon by asking him what is in paper, say 716, and usually he knows it. Somebody once asked about a result and Saharon said it is in paper number 3. Three hundred what, asked the person, being used to the fact that Shelah has hundreds of papers, and Saharon answered, no, paper number 3. In fact paper number three [10] is a very influential paper for the development of model theory.
Shelah made essentially three transformative contributions to the field of mathematical logic: stability theory, proper forcing and PCF theory, the first in model theory and the other two in set theory. He started as a model theorist and I think he still considers himself mainly as a model theorist, but he has extended his interest and work to set theory.

Model theory: Stability theory
Model theory is a branch of mathematics that deals with the relationship between descriptions or "axioms" in the so-called first-order languages (sometimes also in extensions) and the structures that satisfy these descriptions. This is a very general characterization of model theory going all the way back to Tarski. "First order language" means that the quantifiers "for all" and "exists" range over elements (not subsets) of the domain. An example is provided by the group axioms ∀x∀y∀z(x · (y · z) = (x · y) · z), ∀x(x · 1 = x ∧ 1 · x = x), ∀x∃y(x · y = 1 ∧ y · x = 1), which talk about the group elements but not about sets of group elements. If I wanted to say that the group is free, I would have to talk about subsets of the group in order to say that there is a free basis. Other examples of first order axioms are the field axioms, the axioms of order, the (first order) Peano axioms, and the axioms of set theory.
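The point that first order axioms quantify only over elements can be made concrete on a finite structure, where each quantifier becomes a loop over the domain. The following is a minimal sketch, not part of the original lecture; the function name `is_group` and the two example structures are chosen here purely for illustration.

```python
from itertools import product

def is_group(elements, op, identity):
    """Check the three first order group axioms by brute force.

    Every quantifier ranges over elements of the domain only,
    never over subsets of it.
    """
    elems = list(elements)
    # Associativity: for all x, y, z: x.(y.z) = (x.y).z
    associative = all(
        op(x, op(y, z)) == op(op(x, y), z)
        for x, y, z in product(elems, repeat=3)
    )
    # Identity: for all x: x.1 = x and 1.x = x
    has_identity = all(
        op(x, identity) == x and op(identity, x) == x for x in elems
    )
    # Inverses: for all x there exists y: x.y = 1 and y.x = 1
    has_inverses = all(
        any(op(x, y) == identity and op(y, x) == identity for y in elems)
        for x in elems
    )
    return associative and has_identity and has_inverses

def mul_mod_5(x, y):
    return (x * y) % 5

# The nonzero residues modulo 5 under multiplication satisfy the axioms.
units_mod_5 = [1, 2, 3, 4]
# Adding 0 breaks the inverse axiom, since 0 has no multiplicative inverse.
residues_mod_5 = [0, 1, 2, 3, 4]

print(is_group(units_mod_5, mul_mod_5, 1))     # True
print(is_group(residues_mod_5, mul_mod_5, 1))  # False
```

No analogous loop expresses freeness: that would require searching over subsets of the group for a free basis, which is exactly what a first order quantifier cannot do.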
Originally, more than 100 years ago, there was an idea, advocated for example by Hilbert, although Hilbert seemed not to make a great distinction between first and second order axiomatizations, that mathematical structures can be understood through their axiomatizations. There is a fundamental philosophical question, how is it possible that we understand infinite objects such as real or complex numbers, with the finite means that we have. How can we be certain about properties of infinitary things? The idea of at least Hilbert was that we write down axioms and the axioms characterize their models completely. It turned out to be not quite so. Skolem already in the 1920s and then Gödel in the 1930s showed that there are certain limitations to these attempts. In the 1960s model theory developed quite strongly but mainly using set-theoretic methods. The limitations of the extent to which first order axiomatisations capture mathematical concepts and structures were exposed in very manifest ways.
When Shelah entered the model theory scene he isolated an instability phenomenon in certain first order theories. It is something that people like Hilbert, Skolem and Gödel, who came to logic earlier, did not consider and had no idea about. It transformed model theory from the set-theoretic approach into a more geometric and algebraic form. In the 1960s, when set-theoretic model theory had become quite complicated, the more geometric approach brought new hope that we can understand models of first order theories by building on the long history of geometry and algebra. In this respect we can think of the weakness of first order logic, revealed by Skolem and Gödel, as a strength in the hands of modern model theorists, e.g. Shelah.
First order descriptions of structures are at the same time sufficiently strict, keeping the structure from being "too general", and sufficiently tolerant to allow a rich theory and interesting constructions. First order logic (i.e. language) strikes a kind of very successful balance.
In 1978 Shelah's book "Classification Theory" [12] appeared. This is a fundamental book that everybody in model theory rushed to read. It was not an easy book to read but it had everything that you needed at that time and long after. In particular, it contained the basics of stability theory. Stability theory is now the accepted state-of-the-art and focus of research for all who are working in model theory.
In the June 1982 issue of the Abstracts of the American Mathematical Society Shelah published a paper with the title "Why am I so happy". He had made a landmark breakthrough leading to the so-called Main Gap Theorem (See below). Hodges [6] writes: He had just brought to a successful conclusion a line of research which had cost him fourteen years of intensive work and not far off a hundred published books and papers. In the course of this work he had established a new range of questions about mathematics with implications far beyond mathematical logic.
I will now explain what this is about. For any consistent first order theory T and any cardinal number κ the spectrum function $I(\kappa, T)$ is defined as the number of non-isomorphic models of T of cardinality κ. By early results of Skolem, Tarski, Gödel and Maltsev, the spectrum function is defined for every infinite cardinal κ. For any given T we have a function $\kappa \mapsto I(\kappa, T)$ which has its values in the interval $[1, 2^{\kappa}]$. If T is the theory of vector spaces over a fixed finite field, the value is constant 1. If T is the theory of linear order, the value is always $2^{\kappa}$.
Morley [9] had shown in 1965 that if the spectrum is 1 for one uncountable κ it is 1 for all uncountable κ, and proposed what became known as the Morley Conjecture: The spectrum function is always non-decreasing in κ for uncountable κ.
The name "Main Gap" refers to the gap between $\beth_{\omega_1}(|\omega + \alpha|)$ and $2^{\aleph_\alpha}$. Depending on α this may be no gap at all, but in general $\beth_{\omega_1}(|\omega + \alpha|)$ grows only moderately compared to $2^{\aleph_\alpha}$. The theorem says that for α ≥ 1 no theory T can have its number of non-isomorphic models of size $\aleph_\alpha$ strictly less than the maximum $2^{\aleph_\alpha}$, but still as big as $\beth_{\omega_1}(|\omega + \alpha|)$. But that is not all. The two cases of Theorem 1 can be separated from each other by strictly model theoretic properties of T with no reference to the number of models.
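For reference, the dichotomy described in the preceding paragraph can be written out as follows. This is only a sketch of the usual formulation of the Main Gap, under the assumption (not stated in the text above) that T is a countable complete first order theory; the symbol $\beth$ denotes the iterated power set (beth) function, defined in the last two lines.

```latex
\textbf{Main Gap (sketch).} Let $T$ be a countable complete first order theory.
Then either
\[
  I(\aleph_\alpha, T) = 2^{\aleph_\alpha} \quad \text{for all } \alpha \ge 1,
\]
or
\[
  I(\aleph_\alpha, T) < \beth_{\omega_1}(|\omega + \alpha|) \quad \text{for all } \alpha \ge 1,
\]
where $\beth_0(\mu) = \mu$, $\beth_{\beta+1}(\mu) = 2^{\beth_\beta(\mu)}$, and
$\beth_\delta(\mu) = \sup_{\beta < \delta} \beth_\beta(\mu)$ for limit ordinals $\delta$.
```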
The case $I(\aleph_\alpha, T) = 2^{\aleph_\alpha}$ is called the "non-structure case" because in this case the theory not only has maximally many models but among the models there are necessarily many that are verifiably extremely difficult to distinguish from each other. In this case we have a kind of chaos. The second case, viz. the case where there are relatively few non-isomorphic models, is called the "structure case" because closer analysis reveals that in this case every model can be characterized up to isomorphism in terms of certain invariants, a little bit as vector spaces are characterized by their dimension and an algebraically closed field by its degree of transcendence. To see what this means we have to go an iota deeper into stability theory.
Whether a theory T is stable or not depends on its "types", i.e. sets of formulas consistent with the theory, essentially describing generalized notions of elements. If M is a model of T, every element of M has a certain type, namely the set of formulas that the element satisfies with parameters from the model. But there may also be types that are not types of any element of the model. For example, in the ordered model of the natural numbers there is the type of an infinitely big number. In the field of real numbers there is the type of an infinitely small positive number. In the field of rational numbers there is the type of √2, etc. The types form a kind of topological "Stone" space.
The stability of the theory means that the spaces of types in different cardinalities are not too big. Typically models with an infinite linear order have too many types to be stable. On the other hand, algebraically closed fields are stable. If the space of types is not too big, one can define a pregeometry on so-called strongly minimal subsets and you get a situation which resembles the case of vector spaces or algebraically closed fields in the sense that you get a concept of dimension. You can do geometry on models of a stable theory although a priori the theory can be quite arbitrary, with nothing to do with geometry, as long as it is stable. Eventually this leads to a complete characterization of all uncountable models of the theory in terms of geometric invariants.
On the other hand, if the theory is unstable, i.e. it has "many" types, we can build the maximal number of models which are non-isomorphic but indistinguishable in a strong sense, manifesting the impossibility of characterizing all models by means of geometric invariants, as was possible in the stable case.
The Main Gap dividing line is internally characterized by a combination of four properties, which I mention without going into details: Superstability, NDOP (lack of dimensional order property), NOTOP (lack of omitting types order property), and shallowness (no infinite branches of the decomposition trees of models of the theory).
It would seem that it is too simple-minded to look at only the cardinality of the space of types. How can we get geometric information from mere cardinality information? Surely it is too simple-minded. But no, the cardinality matters. If we can place a bound on the cardinality, we can define the geometry.

Set theory: proper forcing
Let us look at a problem of which it is not clear whether it is model theory or set theory, or what it is. Well, it is algebra, not logic at all. It is the Whitehead Problem. Suppose A is an abelian group. It is called a Whitehead group if the following holds: if B is another abelian group and π : B → A is an onto homomorphism such that ker(π) ≅ ℤ, then there exists a homomorphism ρ : A → B such that πρ = id. It is not hard to see that every free abelian group is a Whitehead group.
The Whitehead Problem asks whether the converse is true, i.e. whether every Whitehead group is free. Shelah [11] showed that the problem cannot be solved in ZFC.
Eklof and Mekler write in the beginning of their book "Almost free modules" [2]: "The modern era in set-theoretic methods in algebra can be said to have begun on July 11, 1973 when Saharon Shelah borrowed Laszlo Fuchs' Infinite Abelian Groups from the Hebrew University library. Soon thereafter, he showed that Whitehead's Problem - to which many talented mathematicians had devoted much creative energy - was not solvable in ordinary set theory (ZFC)." We can think of the Whitehead Problem as an isolated problem and in fact Shelah is very good at solving problems, isolated or not, but what happens here is a kind of transformation of an area. Shelah proved from V=L that every Whitehead group is free, and from Martin's Axiom and the negation of the Continuum Hypothesis that there is a non-free Whitehead group of cardinality $\aleph_1$. There is now a whole subarea of abelian group theory using set-theoretic methods. We are moving here from 1970 to 1973, so very soon after starting in model theory Shelah had to use also set theory, an area where he became a leading figure.
Set theory studies the mathematics of infinite sets, e.g. sets of real numbers. Its Zermelo-Fraenkel axiomatisation (ZFC) is also one of the possible established bases not only of set theory, but of all of mathematics, in the sense that all mathematically accepted proofs could in principle be derived from the ZFC axioms.
Still, ZFC may seem quite weak when it comes to deep set-theoretical questions. This was manifested by independence results (e.g. Shelah's results on the Whitehead Problem) i.e. results, usually based on forcing, demonstrating that ZFC cannot solve a particular set-theoretical question. After Paul Cohen's forcing method it seemed that if you ask almost anything non-trivial about sets themselves, especially about the arithmetic of cardinal numbers, the ZFC axioms cannot solve it.
Forcing adds some new sets to the universe, or more exactly to a countable model of ZFC. It became quickly clear that you cannot do everything you want in one step. More involved applications of the method involve iterating the basic method of forcing. The first step was taken by Solovay and Tennenbaum in the late 1960s who introduced "finite support" iterations and obtained the consistency of Martin's Axiom (eventually used by Shelah in his proof of the undecidability of the Whitehead Problem) together with the negation of the Continuum Hypothesis. While Martin's Axiom was very successful in settling a number of problems in set theory, measure theory, general topology, and so forth, it is insufficient for more sophisticated applications.
Shelah needed forcing for the solution of the Whitehead Problem and for different variants of his solution. For example, his first solution gave a non-free Whitehead group in a model of set theory where the Continuum Hypothesis is false, raising the question, whether we can have a non-free Whitehead group in the presence of the Continuum Hypothesis. He needed new ideas in forcing.
Coming from model theory in the late 1970s Shelah formulated the notion of "proper" forcing and showed that it is possible to iterate it using suitable supports, leading to hundreds of striking results in set theory as well as applications to other areas of mathematics. This is witnessed by Shelah's massive book "Proper forcing" from 1982 [13] and the second edition "Proper and improper forcing" from 1998 [16], as well as hundreds of papers by Shelah and others on the subject. Again, not a particularly easy book to read but it has become something of a bible for researchers using forcing.
What is proper forcing? The definition of properness ("preserves stationarity of subsets of $[\lambda]^{\omega}$ for every uncountable λ") is somewhat technical. However, it has many equivalent definitions, manifesting remarkable robustness. For example, the so-called CCC forcing (every antichain of forcing conditions is countable), used originally by Paul Cohen to explode the size of the continuum, rendering the Continuum Hypothesis false, is proper. Also countably closed forcing (every descending ω-sequence of forcing conditions has a lower bound), which can be used to collapse the cardinality of the continuum to $\aleph_1$, forcing the Continuum Hypothesis to be true, is proper. Most importantly, properness can be iterated.
Perhaps the most striking of the applications of proper forcing is a joint paper of Foreman, Magidor and Shelah from 1988 [3]. In this paper the authors establish the consistency, modulo certain large cardinal assumptions, of Martin's Maximum, a very natural strengthening of Martin's Axiom, expressing, in a sense, the maximality of the universe under the largest possible class of forcings. It has become a standard point of reference. Just like V=L, Martin's Maximum solves many problems but is in a sense the opposite of V=L. Martin's Maximum has a number of striking consequences, such as the saturation of the non-stationary ideal on $\omega_1$ as well as $2^{\aleph_0} = \aleph_2$. It is remarkable that maximality with respect to pushing the continuum up by (e.g.) CCC forcing and collapsing it down by (e.g.) countably closed forcing somehow reaches a "balance" at $\aleph_2$, not $\aleph_1$. Some have taken this as an indication that we should accept the continuum "really" being of cardinality $\aleph_2$, but this is, by the way, not Shelah's view on the question of the Continuum Hypothesis, as he makes clear in his "Reflecting on logical dreams" [18].

Set theory: PCF theory
The emergence of powerful forcing techniques created a feeling that set theory, and cardinal arithmetic in particular, would be essentially all about independence results. Shelah has come forward very strongly with his own ideology that the axioms may be stronger than we think, and we should not give up too easily. Shelah has advocated that we should try to solve problems by proving them from ZFC rather than resorting to proving just independence results, even if we have to reformulate the problems.
With his powerful PCF theory (possible cofinality theory), launched in a sequence of papers starting in 1978 and culminating in the monograph "Cardinal Arithmetic" [15], Shelah showed that if you ask the right questions, then independence begins to recede. His idea is that thinking only of the cardinality of a set is too simple-minded; we should rather think of cofinalities of various infinite reduced products of sets. He was able to reintroduce the idea that the ZFC axioms of set theory are able to decide questions about cardinal arithmetic, and he indeed proved straight from the axioms surprising results such as the bound $2^{\aleph_\omega} < \aleph_{\omega_4}$, valid whenever $2^{\aleph_n} < \aleph_\omega$ for all n < ω. This was at the time a shocking result because it was provable from ZFC and still it was a new fact about cardinal arithmetic. The received ideology had been that we cannot prove anything about such matters, independence is everywhere. This brought hope that if we formulate questions in the right way, much more can be proved from ZFC alone than was anticipated. PCF theory has been subsequently used by Shelah and others to prove results in set theory, model theory, algebra and topology, especially about singular cardinals. (A set is of singular cardinality if it is a union of fewer sets of smaller cardinality.) Typically in each case, there was no hope of proving the result in ZFC alone, before the emergence of PCF theory.
I will now briefly sketch PCF theory. There is something in set theory that may bother us. In model theory we have the universe of a model and then some structure, be it order structure, algebraic structure, tree structure, whatever, but there are relations that give structure. When you have structure, no wonder that you have structure theorems. Some models are isomorphic, some are not. But in set theory we have this bold starting point that we look at sets only as sets with no other internal structure than the membership-relation. It may look too simple-minded. PCF theory takes a step away from this view. We look at Cartesian products, more exactly reduced products. Elements of such sets are functions and not even just functions but equivalence classes of functions. So there is more structure, for there is the product structure and there is equivalence structure arising from the filter used. This extra structure gives new results that otherwise do not seem possible.
Suppose a is an infinite set of regular cardinals such that min(a) > |a|. Then pcf(a) is the set of possible cofinalities of ultraproducts of a, i.e. the set of regular cardinals λ such that λ is the cofinality (the least cardinality of an unbounded set) of the ultraproduct $\prod a / D$ for some ultrafilter D on a.
Theorem 2 ([15]). If a is the set of all regular cardinals on the interval [min(a), sup(a)), then $|\mathrm{pcf}(a)| \le |a|^{+3}$. In consequence, if $2^{\aleph_0} < \aleph_\omega$, then $\aleph_\omega^{\aleph_0} < \aleph_{\omega_4}$.
The surprising thing about this was that according to the common view ZFC alone could not possibly put a bound on $\aleph_\omega^{\aleph_0}$. In fact, Easton had proved in 1970 that exponentiation $2^{\aleph_\alpha}$ of regular cardinals $\aleph_\alpha$ ($\aleph_\omega$ is singular!) can consistently manifest any pattern whatsoever as long as the two principles (1) α ≤ β implies $2^{\aleph_\alpha} \le 2^{\aleph_\beta}$, (2) the cofinality of $2^{\aleph_\alpha}$ is greater than $\aleph_\alpha$, are respected [1].
Now a few words about something else which brings us back to model theory, namely finite models. Shelah has during his career proved many results about finite models. Here is another result about finite sets. Again, it is not clear whether it is logic but let us not pay attention to that. Van der Waerden's Theorem says that if you fix a natural number k and divide the numbers 1, 2, . . . , n, for large enough n, into two parts, then one of the parts contains an arithmetic progression of length k.
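For small k the statement of Van der Waerden's Theorem can be checked by exhaustive search. The sketch below is an illustration added here, not Shelah's argument and not taken from the lecture; the function names and the search limit are hypothetical choices. It finds the least n such that every division of 1, . . . , n into two parts contains an arithmetic progression of length k in one part.

```python
from itertools import product

def has_mono_ap(coloring, k):
    """coloring[i] is the part (0 or 1) containing the number i + 1.

    Return True if some arithmetic progression of length k lies
    entirely inside one part.
    """
    n = len(coloring)
    for start in range(n):
        for step in range(1, n + 1):
            if start + (k - 1) * step >= n:
                break  # the progression would leave {1, ..., n}
            if len({coloring[start + j * step] for j in range(k)}) == 1:
                return True
    return False

def every_division_has_ap(n, k):
    """Check all 2**n divisions of {1, ..., n} into two parts."""
    return all(has_mono_ap(c, k) for c in product((0, 1), repeat=n))

def least_n(k, limit=12):
    """Least n <= limit that works for k, or None if there is none."""
    for n in range(1, limit + 1):
        if every_division_has_ap(n, k):
            return n
    return None

print(least_n(3))  # 9
# A division of 1..8 with no progression of length 3 in either part:
print(has_mono_ap((0, 0, 1, 1, 0, 0, 1, 1), 3))  # False
```

The search space doubles with every increase of n, so this approach stops being feasible almost immediately; it says nothing about how the least n grows with k, which is the question the bounds discussed in this section address.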
Graham, Rothschild and Spencer write in "Ramsey Theory" (1990) [4]: In 1987 the Israeli logician Saharon Shelah shocked the combinatorial world by finding a fundamentally new proof of the Hales-Jewett theorem, and hence of van der Waerden's theorem.
The theorem is of the form "for all k there is n . . .". Previous proofs gave an explicit bound for n in terms of k, which was enormously large, or technically: grew faster than any primitive recursive function (sum, product, exponentiation and their iterations are primitive recursive). Shelah's new proof, inspired by a model-theoretic insight, gave the first primitive recursive upper bound.
Finite model theory is a hot topic nowadays. For example, one can define stability theoretic concepts in the finite context. We can define what it means for a finite graph to be stable and prove interesting results about such graphs. It would be natural but far from the truth to think that finite models are easier than infinite models. In fact nothing is as complicated as finite models. What is emerging now is a kind of stability theory for finite models along the lines of Shelah's stability theory of infinite models [8].
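To give a flavour of what stability can mean for a finite graph, here is a minimal sketch. It assumes one common formalization that is not spelled out in the text above: a graph has the k-order property if there are vertices a_0, ..., a_{k-1} and b_0, ..., b_{k-1} with a_i adjacent to b_j exactly when i < j, and a graph is called k-stable if it does not have the k-order property. The function names are hypothetical and the search is plain brute force.

```python
from itertools import combinations, product

def has_order_property(vertices, edges, k):
    """Search for a_0..a_{k-1}, b_0..b_{k-1} (repetitions allowed)
    such that a_i is adjacent to b_j exactly when i < j."""
    edge_set = {frozenset(e) for e in edges}

    def adjacent(u, v):
        return u != v and frozenset((u, v)) in edge_set

    for a in product(vertices, repeat=k):
        for b in product(vertices, repeat=k):
            if all(adjacent(a[i], b[j]) == (i < j)
                   for i in range(k) for j in range(k)):
                return True
    return False

def half_graph(k):
    """The standard witness: a_i is joined to b_j exactly when i < j."""
    vertices = [("a", i) for i in range(k)] + [("b", j) for j in range(k)]
    edges = [(("a", i), ("b", j))
             for i in range(k) for j in range(k) if i < j]
    return vertices, edges

def complete_graph(n):
    vertices = list(range(n))
    return vertices, list(combinations(vertices, 2))

print(has_order_property(*half_graph(3), 3))      # True
print(has_order_property(*complete_graph(4), 2))  # False
```

Under this assumed definition a complete graph is 2-stable, while the half graph on 2k vertices fails to be k-stable by construction.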

Back to model theory: A new Lindström Theorem
Per Lindström (1969) proved that first order logic is maximal with respect to a Downward Löwenheim-Skolem Theorem and the Compactness Theorem [7]. This is a very famous result that every model theorist, maybe even every logician, knows about. If you aspire to have another nice logic with these two properties, it is hopeless, you cannot find such an extension, as any proper extension violates one of these two properties. When this result became more widely known in the early 1970s many people tried to find characterizations of other logics. A whole new field of model theory called abstract (or soft) model theory was born. Although many extensions of first order logic were known or introduced in the 1970s and the 1980s, no new "Lindström Theorems" were found. The situation with the Craig Interpolation Theorem was similar. The area of abstract model theory almost started to die because of this lack of new characterizations. In 2011 Shelah found a class of new infinitary logics $L^1_\kappa$ which have a Lindström Theorem and satisfy the Craig Interpolation Theorem.
To understand $L^1_\kappa$ it is helpful to go back to the Ehrenfeucht-Fraïssé characterization of elementary equivalence in first order logic. This is in terms of a game. Suppose A and B are two models of the same vocabulary. In the game, which we denote by $G_n(A, B)$, two players I and II pick one at a time elements from A ∪ B. During round i of the game I picks an element $a_i$ from A and then II picks an element $b_i$ from B, and vice versa: if I picks an element $b_i$ from B then II picks an element $a_i$ from A. After n rounds the pairs of played elements $\{(a_i, b_i) : 0 \le i < n\}$ form a binary relation R on A × B. If this relation is a partial isomorphism, i.e. preserves atomic formulas and their negations, we say that II has won. In a finite vocabulary a model class K is definable in first order logic if and only if there is an n such that if A ∈ K and Player II has a winning strategy in the above $G_n(A, B)$, then B ∈ K. In this sense the games $G_n(A, B)$ completely determine first order logic. The standard proof of Lindström's Theorem uses these games in an essential way.
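On small finite models the game described above can be decided by exhaustive search. The following is a minimal sketch added for illustration, restricted to the assumption of a vocabulary with a single binary relation; the names `player_two_wins` and `linear_order` are hypothetical. Checking the partial isomorphism condition after every round rather than only at the end gives the same winner, since a relation that fails at some stage cannot be repaired by later moves.

```python
def linear_order(m):
    """The linear order 0 < 1 < ... < m-1 as (domain, relation)."""
    domain = list(range(m))
    less = {(i, j) for i in domain for j in domain if i < j}
    return domain, less

def is_partial_isomorphism(pairs, rel_a, rel_b):
    """Do the played pairs preserve equality and the binary relation?"""
    for a1, b1 in pairs:
        for a2, b2 in pairs:
            if (a1 == a2) != (b1 == b2):
                return False
            if ((a1, a2) in rel_a) != ((b1, b2) in rel_b):
                return False
    return True

def player_two_wins(n, A, B, pairs=()):
    """True if Player II has a winning strategy in the n-round game."""
    dom_a, rel_a = A
    dom_b, rel_b = B
    if not is_partial_isomorphism(pairs, rel_a, rel_b):
        return False
    if n == 0:
        return True
    # Player I moves in A; Player II needs some answer in B.
    for a in dom_a:
        if not any(player_two_wins(n - 1, A, B, pairs + ((a, b),))
                   for b in dom_b):
            return False
    # Player I moves in B; Player II needs some answer in A.
    for b in dom_b:
        if not any(player_two_wins(n - 1, A, B, pairs + ((a, b),))
                   for a in dom_a):
            return False
    return True

L2, L3, L4 = linear_order(2), linear_order(3), linear_order(4)
print(player_two_wins(2, L3, L4))  # True: two rounds cannot separate them
print(player_two_wins(3, L3, L4))  # False: three rounds suffice for Player I
print(player_two_wins(2, L2, L3))  # False: I plays the middle element of L3
```

The running time is exponential in the number of rounds, so this is only practical for very small models.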
Let us try to do the same for the infinitary logic $L_{\kappa\kappa}$, a kind of straightforward generalization of first order logic to the realm of infinite operations, in which one can form conjunctions and disjunctions of length < κ and quantify over sequences of variables of length < κ. In the respective game, which we denote by $G^\kappa_\alpha(A, B)$, two players I and II pick one at a time sequences of length < κ of elements from A ∪ B. During round i of the game I picks a sequence $a^\xi_i$, $\xi < \lambda_i$, where $\lambda_i < \kappa$, from A, then II picks a sequence $b^\xi_i$, $\xi < \lambda_i$, from B, and vice versa. Additionally, when player I moves, he has to also pick $\alpha_i < \alpha$ such that move by move the $\alpha_i$ form a descending sequence. The game ends when $\alpha_n = 0$. After this the pairs of played elements $(a^\xi_i, b^\xi_i)$ form a binary relation on A × B. If this relation is a partial isomorphism, we say that II has won. This game seems natural enough but it has serious weaknesses, known since the 1960s, which prevent it from being used e.g. to prove the Craig Interpolation Theorem for $L_{\kappa\kappa}$, let alone a Lindström Theorem for it.
Shelah introduced a new version $DG^\kappa_\alpha(A, B)$ of the game $G^\kappa_\alpha(A, B)$. This game resembles $G^\kappa_\alpha(A, B)$ in all respects except that there is a twist in each round of moves. When I picks a sequence $a^\xi_i$, $\xi < \lambda_i$, from A, Player II picks a function $f_i : \lambda_i \to \omega$ and a sequence $b^\xi_i$, $f_i(\xi) = 0$, from B. During the next round of moves Player II gives a sequence $b^\xi_i$, $f_i(\xi) = 1$, then a sequence $b^\xi_i$, $f_i(\xi) = 2$, etc., until eventually $\alpha_n = 0$ and the game ends. So after the first round of moves Player II has a "debt": she has not yet revealed (or decided) what $b^\xi_0$ for $f_0(\xi) > 0$ are. During the game more and more of this "debt" is paid but in the end, when $\alpha_n = 0$, the remaining debt remains unpaid. Player I will never know what $b^\xi_0$ for $f_0(\xi) > n$ might be, but he can choose how big n is. The same happens with each $f_i$. A model class K is said to be definable in $L^1_\kappa$ if (roughly) there are a θ < κ and an α < κ such that if A ∈ K and Player II has a winning strategy in the above game $DG^\theta_\alpha(A, B)$, then B ∈ K. It is noteworthy that this definition of $L^1_\kappa$ gives no hint as to what the syntax of $L^1_\kappa$ might be. Thus the logic $L^1_\kappa$ is merely a family of model classes with some closure properties, just as in Lindström's original paper [7].
Theorem 3 ([17]). For $\kappa = \beth_\kappa$ the logic $L^1_\kappa$ is a maximal logic with respect to a Downward Löwenheim-Skolem property and the property of not being able (in a strong sense) to define the concept of well-ordering. For such κ the logic $L^1_\kappa$ also satisfies the Craig Interpolation Theorem.
This result opens the door to the possibility that a new kind of infinitary model theory can be developed along the lines of first order model theory. This was the hope in the past but it took 40 years to become a reality. Once again, the impossible has become possible.