Learning Model Theory: Why Is It So Hard?

阿新 • • Published: 2018-12-14

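Forty years on, studying pure mathematics (model theory, for example) in China is still beset with difficulties, and the subject attracts almost no interest.
What is mathematical model theory? The academic community in China says nothing on the matter, which is deeply frustrating.
For this reason we recommend an expository text; see the attachment to this post.
袁萌, 陳啟清, December 14
Attachment: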
Fundamentals of Model Theory
William Weiss and Cherie D’Mello
Department of Mathematics University of Toronto
© 2015 W. Weiss and C. D’Mello

Introduction
Model Theory is the part of mathematics which shows how to apply logic to the study of structures in pure mathematics. On the one hand it is the ultimate abstraction; on the other, it has immediate applications to every-day mathematics. The fundamental tenet of Model Theory is that mathematical truth, like all truth, is relative. A statement may be true or false, depending on how and where it is interpreted. This isn’t necessarily due to mathematics itself, but is a consequence of the language that we use to express mathematical ideas. What at ﬁrst seems like a deﬁciency in our language, can actually be shaped into a powerful tool for understanding mathematics. This book provides an introduction to Model Theory which can be used as a text for a reading course or a summer project at the senior undergraduate or graduate level. It is also a primer which will give someone a self contained overview of the subject, before diving into one of the more encyclopedic standard graduate texts. Any reader who is familiar with the cardinality of a set and the algebraic closure of a ﬁeld can proceed without worry. Many readers will have some acquaintance with elementary logic, but this is not absolutely required, since all necessary concepts from logic are reviewed in Chapter 0. Chapter 1 gives the motivating examples; it is short and we recommend that you peruse it ﬁrst, before studying the more technical aspects of Chapter 0. Chapters 2 and 3 are selections of some of the most important techniques in Model Theory. The remaining chapters investigate the relationship between Model Theory and the algebra of the real and complex numbers. Thirty exercises develop familiarity with the deﬁnitions and consolidate understanding of the main proof techniques. Throughout the book we present applications which cannot easily be found elsewhere in such detail. Some are chosen for their value in other areas of mathematics: Ramsey’s Theorem, the Tarski-Seidenberg Theorem. 
Some are chosen for their immediate appeal to every mathematician: existence of infinitesimals for calculus, graph colouring on the plane. And some, like Hilbert’s Seventeenth Problem, are chosen because of how amazing it is that logic can play an important role in the solution of a problem from high school algebra. In each case, the derivation is shorter than any which tries to avoid logic. More importantly, the methods of Model Theory display clearly the structure of the main ideas of the proofs, showing how theorems of logic combine with theorems from other areas of mathematics to produce stunning results. The theorems here are all more than thirty years old and due in great part to the cofounders of the subject, Abraham Robinson and Alfred Tarski. However, we have not attempted to give a history. When we attach a name to a theorem, it is simply because that is what mathematical logicians popularly call it. The bibliography contains a number of texts that were helpful in the preparation of this manuscript. They could serve as avenues of further study and, in addition, they contain many other references and historical notes. The more recent titles were added to show the reader where the subject is moving today. All are worth a look.
This book began life as notes for William Weiss’s graduate course at the University of Toronto. The notes were revised and expanded by Cherie D’Mello and William Weiss, based upon suggestions from several graduate students. The electronic version of this book may be downloaded and further modified by anyone for the purpose of learning, provided this paragraph is included in its entirety and so long as no part of this book is sold for profit.
Contents
Chapter 0. Models, Truth and Satisfaction
    Formulas, Sentences, Theories and Axioms
    Prenex Normal Form
Chapter 1. Notation and Examples
Chapter 2. Compactness and Elementary Submodels
    The Compactness Theorem
    Isomorphisms, elementary equivalence and complete theories
    The Elementary Chain Theorem
    The Löwenheim-Skolem Theorem
    The Łoś-Vaught Test
    Every complex one-to-one polynomial map is onto
Chapter 3. Diagrams and Embeddings
    Diagram Lemmas
    Every planar graph can be four coloured
    Ramsey’s Theorem
    The Leibniz Principle and infinitesimals
    The Robinson Consistency Theorem
    The Craig Interpolation Theorem
Chapter 4. Model Completeness
    Robinson’s Theorem on existentially complete theories
    Lindström’s Test
    Hilbert’s Nullstellensatz
Chapter 5. The Seventeenth Problem
    Positive definite rational functions are the sums of squares
Chapter 6. Submodel Completeness
    Elimination of quantifiers
    The Tarski-Seidenberg Theorem
Chapter 7. Model Completions
    Almost universal theories
    Saturated models
    Blum’s Test
Bibliography
Index
CHAPTER 0
Models, Truth and Satisfaction
We will use the following symbols:
• logical symbols:
  – the connectives ∧, ∨, ¬, →, ↔ called “and”, “or”, “not”, “implies” and “iff” respectively
  – the quantifiers ∀, ∃ called “for all” and “there exists”
  – an infinite collection of variables indexed by the natural numbers ℕ: v0, v1, v2, ...
  – the two parentheses ), (
  – the symbol = which is the usual “equal sign”
• constant symbols: often denoted by the letter c with subscripts
• function symbols: often denoted by the letter F with subscripts; each function symbol is an m-placed function symbol for some natural number m ≥ 1
• relation symbols: often denoted by the letter R with subscripts; each relation symbol is an n-placed relation symbol for some natural number n ≥ 1.
We now define terms and formulas.
Definition 1. A term is defined as follows:
(1) a variable is a term
(2) a constant symbol is a term
(3) if F is an m-placed function symbol and t1,...,tm are terms, then F(t1 ...tm) is a term.
(4) a string of symbols is a term if and only if it can be shown to be a term by a finite number of applications of (1), (2) and (3).
Remark. This is a recursive definition.
Definition 2. A formula is defined as follows:
(1) if t1 and t2 are terms, then (t1 = t2) is a formula.
(2) if R is an n-placed relation symbol and t1,...,tn are terms, then (R(t1 ...tn)) is a formula.
(3) if ϕ is a formula, then (¬ϕ) is a formula
(4) if ϕ and ψ are formulas then so are (ϕ∧ψ), (ϕ∨ψ), (ϕ → ψ) and (ϕ ↔ ψ)
(5) if vi is a variable and ϕ is a formula, then (∃vi)ϕ and (∀vi)ϕ are formulas
(6) a string of symbols is a formula if and only if it can be shown to be a formula by a finite number of applications of (1), (2), (3), (4) and (5).
Remark. This is another recursive definition. ¬ϕ is called the negation of ϕ; ϕ∧ψ is called the conjunction of ϕ and ψ; and ϕ∨ψ is called the disjunction of ϕ and ψ.
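The recursive shape of Definitions 1 and 2 can be made concrete with a small sketch. The nested-tuple encoding and the names is_term and is_formula are assumptions made for illustration only; they are not notation from the book.

```python
# Hypothetical encoding (not from the book): terms and formulas are nested tuples.
#   terms:    ("var", i) | ("const", name) | ("fn", name, t1, ..., tm)
#   formulas: ("eq", t1, t2) | ("rel", name, t1, ..., tn) | ("not", f)
#             | ("and"|"or"|"imp"|"iff", f, g) | ("exists"|"forall", i, f)

def is_term(t, consts, fn_arity):
    """Clauses (1)-(3) of Definition 1, applied recursively."""
    if not isinstance(t, tuple) or not t:
        return False
    tag = t[0]
    if tag == "var":
        return len(t) == 2 and isinstance(t[1], int) and t[1] >= 0
    if tag == "const":
        return len(t) == 2 and t[1] in consts
    if tag == "fn":
        # an m-placed function symbol must be applied to exactly m terms
        return (len(t) >= 2 and fn_arity.get(t[1]) == len(t) - 2
                and all(is_term(s, consts, fn_arity) for s in t[2:]))
    return False

def is_formula(f, consts, fn_arity, rel_arity):
    """Clauses (1)-(5) of Definition 2, applied recursively."""
    if not isinstance(f, tuple) or not f:
        return False
    tag = f[0]
    term = lambda t: is_term(t, consts, fn_arity)
    sub = lambda g: is_formula(g, consts, fn_arity, rel_arity)
    if tag == "eq":
        return len(f) == 3 and term(f[1]) and term(f[2])
    if tag == "rel":
        return (len(f) >= 2 and rel_arity.get(f[1]) == len(f) - 2
                and all(term(t) for t in f[2:]))
    if tag == "not":
        return len(f) == 2 and sub(f[1])
    if tag in ("and", "or", "imp", "iff"):
        return len(f) == 3 and sub(f[1]) and sub(f[2])
    if tag in ("exists", "forall"):
        return len(f) == 3 and isinstance(f[1], int) and f[1] >= 0 and sub(f[2])
    return False

# F(v3) with F unary, and ((∃v3)(v0 = v3) ∧ (∀v0)(v0 = v0))
F_v3 = ("fn", "F", ("var", 3))
ex1 = ("and",
       ("exists", 3, ("eq", ("var", 0), ("var", 3))),
       ("forall", 0, ("eq", ("var", 0), ("var", 0))))
```

Clause (4) of Definition 1 and clause (6) of Definition 2 correspond to the final `return False`: anything not built by the earlier clauses is rejected.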
Definition 3. A subformula of a formula ϕ is deﬁned as follows: (1) ϕ is a subformula of ϕ (2) if (¬ψ) is a subformula of ϕ then so is ψ (3) if any one of (θ∧ψ), (θ∨ψ), (θ → ψ) or (θ ↔ ψ) is a subformula of ϕ, then so are both θ and ψ (4) if either (∃vi)ψ or (∀vi)ψ is a subformula of ϕ for some natural number i, then ψ is also a subformula of ϕ (5) A string of symbols is a subformula of ϕ, if and only if it can be shown to be such by a ﬁnite number of applications of (1), (2), (3) and (4).
Definition 4. A variable vi is said to occur bound in a formula ϕ iﬀ for some subformula ψ of ϕ either (∃vi)ψ or (∀vi)ψ is a subformula of ϕ. In this case each occurrence of vi in (∃vi)ψ or (∀vi)ψ is said to be a bound occurrence of vi. Other occurrences of vi which do not occur bound in ϕ are said to be free.
Example 1.
F(v3) is a term, where F is a unary function symbol. ((∃v3)(v0 = v3)∧(∀v0)(v0 = v0)) is a formula. In this formula the variable v3 only occurs bound but the variable v0 occurs both bound and free.
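As a rough check of Definition 4 against Example 1, the following sketch computes which variables occur free and which occur bound. The tuple encoding of formulas and the function names are assumptions for illustration, not part of the book.

```python
# Hypothetical encoding: ("var", i), ("const", name), ("fn", name, t1, ...),
# ("eq", t1, t2), ("rel", name, t1, ...), ("not", f),
# ("and"|"or"|"imp"|"iff", f, g), ("exists"|"forall", i, f).

BINARY = ("and", "or", "imp", "iff")

def term_vars(t):
    """Indices of the variables occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(*(term_vars(s) for s in t[2:]))

def free_vars(f):
    """Indices i such that v_i has a free occurrence in f."""
    tag = f[0]
    if tag == "eq":
        return term_vars(f[1]) | term_vars(f[2])
    if tag == "rel":
        return set().union(*(term_vars(t) for t in f[2:]))
    if tag == "not":
        return free_vars(f[1])
    if tag in BINARY:
        return free_vars(f[1]) | free_vars(f[2])
    # quantifier: occurrences of the quantified variable become bound
    return free_vars(f[2]) - {f[1]}

def bound_vars(f):
    """Indices i such that (∃v_i)ψ or (∀v_i)ψ is a subformula of f."""
    tag = f[0]
    if tag in ("eq", "rel"):
        return set()
    if tag == "not":
        return bound_vars(f[1])
    if tag in BINARY:
        return bound_vars(f[1]) | bound_vars(f[2])
    return bound_vars(f[2]) | {f[1]}

# ((∃v3)(v0 = v3) ∧ (∀v0)(v0 = v0)) from Example 1
ex1 = ("and",
       ("exists", 3, ("eq", ("var", 0), ("var", 3))),
       ("forall", 0, ("eq", ("var", 0), ("var", 0))))
print(free_vars(ex1), bound_vars(ex1))
```

For the formula of Example 1 this reports v0 as the only free variable and both v0 and v3 as bound, in agreement with the text.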
Exercise 1. Using the previous deﬁnitions as a guide, deﬁne the substitution of a term t for a variable vi in a formula ϕ. In particular, demonstrate how to substitute the term for the variable v0 in the formula of the example above. Definition 5. A language L is a set consisting of all the logical symbols with perhaps some constant, function and/or relational symbols included. It is understood that the formulas of L are made up from this set in the manner prescribed above. Note that all the formulas of L are uniquely described by listing only the constant, function and relation symbols of L. We use t(v0,...,vk) to denote a term t all of whose variables occur among v0,...,vk. We use ϕ(v0,...,vk) to denote a formula ϕ all of whose free variables occur among v0,...,vk.
Example 2. These would be formulas of any language:
• for any variable vi: (vi = vi)
• for any term t(v0,...,vk) and other terms t1 and t2: ((t1 = t2) → (t(v0,...,vi−1,t1,vi+1,...,vk) = t(v0,...,vi−1,t2,vi+1,...,vk)))
• for any formula ϕ(v0,...,vk) and terms t1 and t2: ((t1 = t2) → (ϕ(v0,...,vi−1,t1,vi+1,...,vk) ↔ ϕ(v0,...,vi−1,t2,vi+1,...,vk)))
Note the simple way we denote the substitution of t1 for vi.
Definition 6. A model (or structure) 𝔄 for a language L is an ordered pair ⟨A, I⟩ where A is a nonempty set and I is an interpretation function with domain the set of all constant, function and relation symbols of L such that:
(1) if c is a constant symbol, then I(c) ∈ A; I(c) is called a constant
(2) if F is an m-placed function symbol, then I(F) is an m-placed function on A
(3) if R is an n-placed relation symbol, then I(R) is an n-placed relation on A.
A is called the universe of the model 𝔄. We generally denote models with Gothic letters and their universes with the corresponding Latin letters in boldface. One set may be involved as a universe with many different interpretation functions of the language L. The model is both the universe and the interpretation function.
Remark. The importance of Model Theory lies in the observation that mathematical objects can be cast as models for a language. For instance, the real numbers with the usual ordering < and the usual arithmetic operations, addition + and multiplication •, along with the special numbers 0 and 1, can be described as a model. Let L contain one two-placed (i.e. binary) relation symbol R0, two two-placed function symbols F1 and F2 and two constant symbols c0 and c1. We build a model by letting the universe A be the set of real numbers. The interpretation function I will map R0 to <, i.e. R0 will be interpreted as <. Similarly, I(F1) will be +, I(F2) will be •, I(c0) will be 0 and I(c1) will be 1. So ⟨A, I⟩ is an example of a model for the language described by {R0,F1,F2,c0,c1}. We now wish to show how to use formulas to express mathematical statements about elements of a model. We first need to see how to interpret a term in a model.
Definition 7. The value t[x0,...,xq] of a term t(v0,...,vq) at x0,...,xq in the universe A of the model 𝔄 is defined as follows:
(1) if t is vi then t[x0,...,xq] is xi,
(2) if t is the constant symbol c, then t[x0,...,xq] is I(c), the interpretation of c in 𝔄,
(3) if t is F(t1 ...tm) where F is an m-placed function symbol and t1,...,tm are terms, then t[x0,...,xq] is G(t1[x0,...,xq],...,tm[x0,...,xq]) where G is the m-placed function I(F), the interpretation of F in 𝔄.
Definition 8. Suppose 𝔄 is a model for a language L. The sequence x0,...,xq of elements of A satisfies the formula ϕ(v0,...,vq), all of whose free and bound variables are among v0,...,vq, in the model 𝔄, written 𝔄 |= ϕ[x0,...,xq], provided we have:
(1) if ϕ(v0,...,vq) is the formula (t1 = t2), then 𝔄 |= (t1 = t2)[x0,...,xq] means that t1[x0,...,xq] equals t2[x0,...,xq],
(2) if ϕ(v0,...,vq) is the formula (R(t1 ...tn)) where R is an n-placed relation symbol, then 𝔄 |= (R(t1 ...tn))[x0,...,xq] means S(t1[x0,...,xq],...,tn[x0,...,xq]) where S is the n-placed relation I(R), the interpretation of R in 𝔄,
(3) if ϕ is (¬θ), then 𝔄 |= ϕ[x0,...,xq] means not 𝔄 |= θ[x0,...,xq],
(4) if ϕ is (θ∧ψ), then 𝔄 |= ϕ[x0,...,xq] means both 𝔄 |= θ[x0,...,xq] and 𝔄 |= ψ[x0,...,xq],
(5) if ϕ is (θ∨ψ), then 𝔄 |= ϕ[x0,...,xq] means either 𝔄 |= θ[x0,...,xq] or 𝔄 |= ψ[x0,...,xq],
(6) if ϕ is (θ → ψ), then 𝔄 |= ϕ[x0,...,xq] means that 𝔄 |= θ[x0,...,xq] implies 𝔄 |= ψ[x0,...,xq],
(7) if ϕ is (θ ↔ ψ), then 𝔄 |= ϕ[x0,...,xq] means that 𝔄 |= θ[x0,...,xq] iff 𝔄 |= ψ[x0,...,xq],
(8) if ϕ is (∀vi)θ, then 𝔄 |= ϕ[x0,...,xq] means for every x ∈ A, 𝔄 |= θ[x0,...,xi−1,x,xi+1,...,xq],
(9) if ϕ is (∃vi)θ, then 𝔄 |= ϕ[x0,...,xq] means for some x ∈ A, 𝔄 |= θ[x0,...,xi−1,x,xi+1,...,xq].
Exercise 2. Each of the formulas of Example 2 is satisfied in any model 𝔄 for any language L by any (long enough) sequence x0,x1,...,xq of A. This is where you test your solution to Exercise 1, especially with respect to the term and formula from Example 1.
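For a model with a finite universe, Definitions 7 and 8 can be followed clause by clause by a short program. The sketch below is only an illustration under assumptions of its own: the tuple encoding, the names value and satisfies, and the three-element example model are all hypothetical, and the quantifier clauses work only because the universe is finite.

```python
# Hypothetical encoding: ("var", i), ("const", name), ("fn", name, t1, ...),
# ("eq", t1, t2), ("rel", name, t1, ...), ("not", f),
# ("and"|"or"|"imp"|"iff", f, g), ("exists"|"forall", i, f).
# interp plays the role of I: it maps symbol names to elements or callables.

def value(t, xs, interp):
    """Definition 7: the value t[x0,...,xq] of a term."""
    if t[0] == "var":
        return xs[t[1]]
    if t[0] == "const":
        return interp[t[1]]
    G = interp[t[1]]
    return G(*(value(s, xs, interp) for s in t[2:]))

def satisfies(universe, interp, f, xs):
    """Definition 8: does the tuple xs satisfy f in the (finite) model?"""
    tag = f[0]
    sat = lambda g, ys=xs: satisfies(universe, interp, g, ys)
    if tag == "eq":
        return value(f[1], xs, interp) == value(f[2], xs, interp)
    if tag == "rel":
        return bool(interp[f[1]](*(value(t, xs, interp) for t in f[2:])))
    if tag == "not":
        return not sat(f[1])
    if tag == "and":
        return sat(f[1]) and sat(f[2])
    if tag == "or":
        return sat(f[1]) or sat(f[2])
    if tag == "imp":
        return (not sat(f[1])) or sat(f[2])
    if tag == "iff":
        return sat(f[1]) == sat(f[2])
    i, body = f[1], f[2]
    # replace the i-th entry of xs by each element of the universe in turn
    variants = [xs[:i] + (x,) + xs[i + 1:] for x in universe]
    if tag == "forall":
        return all(sat(body, ys) for ys in variants)
    if tag == "exists":
        return any(sat(body, ys) for ys in variants)
    raise ValueError(f"unknown formula tag: {tag!r}")

# Example model (an assumption): universe {0,1,2}, usual <, addition mod 3.
U = (0, 1, 2)
I = {"lt": lambda a, b: a < b, "add": lambda a, b: (a + b) % 3, "zero": 0}

# (∀v0)(∃v1)(v0 < v1) and (∀v0)(∃v1)(v0 + v1 = 0)
no_max = ("forall", 0, ("exists", 1, ("rel", "lt", ("var", 0), ("var", 1))))
inverses = ("forall", 0, ("exists", 1,
            ("eq", ("fn", "add", ("var", 0), ("var", 1)), ("const", "zero"))))
```

The tuple xs must be long enough to have an entry for every variable of the formula, as Definition 8 requires. For sentences such as the two above, the choice of entries makes no difference to the result.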
We now prove two lemmas which show that the preceding concepts are well-defined. In the first one, we see that the value of a term only depends upon the values of the variables which actually occur in the term. In this lemma the equal sign = is used, not as a logical symbol in the formal sense, but in its usual sense to denote equality of mathematical objects: in this case, the values of terms, which are elements of the universe of a model.
Lemma 1. Let 𝔄 be a model for L and let t(v0,...,vp) be a term of L. Let x0,...,xq and y0,...,yr be sequences from A such that p ≤ q and p ≤ r, and let xi = yi whenever vi actually occurs in t(v0,...,vp). Then t[x0,...,xq] = t[y0,...,yr].
Proof. We use induction on the complexity of the term t.
(1) If t is vi then xi = yi and so we have t[x0,...,xq] = xi = yi = t[y0,...,yr] since p ≤ q and p ≤ r.
(2) If t is the constant symbol c, then t[x0,...,xq] = I(c) = t[y0,...,yr] where I(c) is the interpretation of c in 𝔄.
(3) If t is F(t1 ...tm) where F is an m-placed function symbol, t1,...,tm are terms and I(F) = G, then t[x0,...,xq] = G(t1[x0,...,xq],...,tm[x0,...,xq]) and t[y0,...,yr] = G(t1[y0,...,yr],...,tm[y0,...,yr]). By the induction hypothesis we have that ti[x0,...,xq] = ti[y0,...,yr] for 1 ≤ i ≤ m since t1,...,tm have all their variables among {v0,...,vp}. So we have t[x0,...,xq] = t[y0,...,yr].
In the next lemma the equal sign = is used in both senses: as a formal logical symbol in the formal language L and also to denote the usual equality of mathematical objects. This is common practice where the context allows the reader to distinguish the two usages of the same symbol. The lemma confirms that satisfaction of a formula depends only upon the values of its free variables.
Lemma 2. Let 𝔄 be a model for L and ϕ a formula of L, all of whose free and bound variables occur among v0,...,vp. Let x0,...,xq and y0,...,yr (q,r ≥ p) be two sequences such that xi and yi are equal for all i such that vi occurs free in ϕ. Then 𝔄 |= ϕ[x0,...,xq] iff 𝔄 |= ϕ[y0,...,yr].
Proof. Let 𝔄 and L be as above. We prove the lemma by induction on the complexity of ϕ.
(1) If ϕ(v0,...,vp) is the formula (t1 = t2), then we use Lemma 1 to get: 𝔄 |= (t1 = t2)[x0,...,xq] iff t1[x0,...,xq] = t2[x0,...,xq] iff t1[y0,...,yr] = t2[y0,...,yr] iff 𝔄 |= (t1 = t2)[y0,...,yr].
(2) If ϕ(v0,...,vp) is the formula (R(t1 ...tn)) where R is an n-placed relation symbol with interpretation S, then again by Lemma 1, we get: 𝔄 |= (R(t1 ...tn))[x0,...,xq] iff S(t1[x0,...,xq],...,tn[x0,...,xq]) iff S(t1[y0,...,yr],...,tn[y0,...,yr]) iff 𝔄 |= (R(t1 ...tn))[y0,...,yr].
(3) If ϕ is (¬θ), the inductive hypothesis gives that the lemma is true for θ. So, 𝔄 |= ϕ[x0,...,xq] iff not 𝔄 |= θ[x0,...,xq] iff not 𝔄 |= θ[y0,...,yr] iff 𝔄 |= ϕ[y0,...,yr].
(4) If ϕ is (θ∧ψ), then using the inductive hypothesis on θ and ψ we get 𝔄 |= ϕ[x0,...,xq] iff both 𝔄 |= θ[x0,...,xq] and 𝔄 |= ψ[x0,...,xq] iff both 𝔄 |= θ[y0,...,yr] and 𝔄 |= ψ[y0,...,yr] iff 𝔄 |= ϕ[y0,...,yr].
(5) If ϕ is (θ∨ψ) then 𝔄 |= ϕ[x0,...,xq] iff either 𝔄 |= θ[x0,...,xq] or 𝔄 |= ψ[x0,...,xq] iff either 𝔄 |= θ[y0,...,yr] or 𝔄 |= ψ[y0,...,yr] iff 𝔄 |= ϕ[y0,...,yr].
(6) If ϕ is (θ → ψ) then 𝔄 |= ϕ[x0,...,xq] iff 𝔄 |= θ[x0,...,xq] implies 𝔄 |= ψ[x0,...,xq] iff 𝔄 |= θ[y0,...,yr] implies 𝔄 |= ψ[y0,...,yr] iff 𝔄 |= ϕ[y0,...,yr].
(7) If ϕ is (θ ↔ ψ) then 𝔄 |= ϕ[x0,...,xq] iff (𝔄 |= θ[x0,...,xq] iff 𝔄 |= ψ[x0,...,xq]) iff (𝔄 |= θ[y0,...,yr] iff 𝔄 |= ψ[y0,...,yr]) iff 𝔄 |= ϕ[y0,...,yr].
(8) If ϕ is (∀vi)θ, then 𝔄 |= ϕ[x0,...,xq] iff for every z ∈ A, 𝔄 |= θ[x0,...,xi−1,z,xi+1,...,xq] iff for every z ∈ A, 𝔄 |= θ[y0,...,yi−1,z,yi+1,...,yr] iff 𝔄 |= ϕ[y0,...,yr]. The inductive hypothesis uses the sequences x0,...,xi−1,z,xi+1,...,xq and y0,...,yi−1,z,yi+1,...,yr with the formula θ.
(9) If ϕ is (∃vi)θ, then 𝔄 |= ϕ[x0,...,xq] iff for some z ∈ A, 𝔄 |= θ[x0,...,xi−1,z,xi+1,...,xq] iff for some z ∈ A, 𝔄 |= θ[y0,...,yi−1,z,yi+1,...,yr] iff 𝔄 |= ϕ[y0,...,yr]. The inductive hypothesis uses the sequences x0,...,xi−1,z,xi+1,...,xq and y0,...,yi−1,z,yi+1,...,yr with the formula θ.
Definition 9. A sentence is a formula with no free variables. If ϕ is a sentence, we can write 𝔄 |= ϕ without any mention of a sequence from A since, by the previous lemma, it doesn’t matter which sequence from A we use. In this case we say:
• 𝔄 satisfies ϕ
• or 𝔄 is a model of ϕ
• or ϕ holds in 𝔄
• or ϕ is true in 𝔄
If ϕ is a sentence of L, we write |= ϕ to mean that 𝔄 |= ϕ for every model 𝔄 for L. Intuitively then, |= ϕ means that ϕ is true under any relevant interpretation (model for L). Alternatively, no relevant example (model for L) is a counterexample to ϕ, so ϕ is true.
Lemma 3. Let ϕ(v0,...,vq) be a formula of the language L. There is another formula ϕ′(v0,...,vq) of L such that
(1) ϕ′ has exactly the same free and bound occurrences of variables as ϕ.
(2) ϕ′ can possibly contain ¬, ∧ and ∃ but no other connective or quantifier.
(3) |= (∀v0)...(∀vq)(ϕ ↔ ϕ′)
Exercise 3. Prove the above lemma by induction on the complexity of ϕ. As a reward, note that this lemma can be used to shorten future proofs by induction on complexity of formulas.
Definition 10. A formula ϕ is said to be in prenex normal form whenever (1) there are no quantiﬁers occurring in ϕ, or (2) ϕ is (∃vi)ψ where ψ is in prenex normal form and vi does not occur bound in ψ, or
(3) ϕ is (∀vi)ψ where ψ is in prenex normal form and vi does not occur bound in ψ.
Remark. If ϕ is in prenex normal form, then no variable occurring in ϕ occurs both free and bound and no bound variable occurring in ϕ is bound by more than one quantifier. In the written order, all of the quantifiers precede all of the connectives.
Lemma 4. Let ϕ(v0,...,vp) be any formula of a language L. There is a formula ϕ∗ of L which has the following properties:
(1) ϕ∗ is in prenex normal form
(2) ϕ and ϕ∗ have the same free occurrences of variables, and
(3) |= (∀v0)...(∀vp)(ϕ ↔ ϕ∗)
Exercise 4. Prove this lemma by induction on the complexity of ϕ.
There is a notion of rank on prenex formulas: the number of alternations of quantifiers. The usual formulas of elementary mathematics have prenex rank 0, i.e. no alternations of quantifiers. For example: (∀x)(∀y)(2xy ≤ x^2 + y^2). However, the ε−δ definition of a limit of a function has prenex rank 2 and is much more difficult for students to comprehend at first sight: (∀ε)(∃δ)(∀x)((0 < ε ∧ 0 < |x−a| < δ) → |F(x)−L| < ε). A formula of prenex rank 4 would make any mathematician look twice.
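Counting quantifier alternations is mechanical once a formula is in prenex normal form. A minimal sketch follows, assuming a hypothetical tuple encoding in which a quantified formula is ("forall", i, body) or ("exists", i, body) and the quantifier-free part is treated as opaque; the function names are illustrative.

```python
def quantifier_prefix(f):
    """Leading quantifiers of a prenex formula, outermost first."""
    prefix = []
    while f[0] in ("forall", "exists"):
        prefix.append(f[0])
        f = f[2]
    return prefix

def prenex_rank(f):
    """Number of alternations between ∀ and ∃ in the quantifier prefix."""
    p = quantifier_prefix(f)
    return sum(1 for a, b in zip(p, p[1:]) if a != b)

# Opaque placeholder for a quantifier-free part (an assumption of this sketch).
MATRIX = ("rel", "P")

# (∀x)(∀y)(...) has no alternation; (∀ε)(∃δ)(∀x)(...) has two.
inequality = ("forall", 0, ("forall", 1, MATRIX))
limit_def = ("forall", 0, ("exists", 1, ("forall", 2, MATRIX)))
print(prenex_rank(inequality), prenex_rank(limit_def))
```

The function assumes its input is already in prenex normal form; quantifiers buried inside the quantifier-free placeholder would not be counted.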
CHAPTER 1
Notation and Examples
Although the formal notation for formulas is precise, it can become cumbersome and difficult to read. Confident that the reader would be able, if necessary, to put formulas into their formal form, we will relax our formal behaviour. In particular, we will write formulas any way we want using appropriate symbols for variables, constant symbols, function and relation symbols. We will omit parentheses or add them for clarity. We will write binary function and relation symbols between the arguments rather than in front, as is usual for “plus”, “times” and “less than”.
Whenever a language L has only finitely many relation, function and constant symbols we often write, for example: L = {<,R0,+,F1,c0,c1} omitting explicit mention of the logical symbols (including the infinitely many variables) which are always in L. Correspondingly we may denote a model 𝔄 for L as: 𝔄 = ⟨A, <, S0, +, G1, a0, a1⟩ where the interpretations of the symbols in the language L are given by I(<) = <, I(R0) = S0, I(+) = +, I(F1) = G1, I(c0) = a0 and I(c1) = a1. Here the < and + listed in 𝔄 denote the actual relation and function on A, not the symbols of L.
Example 3. R = ⟨ℝ, <, +, •, 0, 1⟩ and Q = ⟨ℚ, <, +, •, 0, 1⟩, where ℝ is the reals and ℚ the rationals, are models for the language L = {<,+,•,0,1}. As symbols of L, < is a binary relation symbol, + and • are binary function symbols, and 0 and 1 are constant symbols, whereas the <, +, •, 0, 1 listed in the models are the well known relation, arithmetic functions and constants. Similarly, C = ⟨ℂ, +, •, 0, 1⟩, where ℂ is the complex numbers, is a model for the language L = {+,•,0,1}. Note the exceptions to the boldface convention for these popular sets.
Example 4. Here L = {<,+,•,0,1}, where < is a binary relation symbol, + and • are binary function symbols and 0 and 1 are constant symbols. The following formulas are sentences.
(1) (∀x)¬(x < x)
(2) (∀x)(∀y)¬(x < y ∧ y < x)
(3) (∀x)(∀y)(∀z)(x < y ∧ y < z → x < z)
(4) (∀x)(∀y)(x < y ∨ y < x ∨ x = y)
(5) (∀x)(∀y)(x < y → (∃z)(x < z ∧ z < y))
(6) (∀x)(∃y)(x < y)
(7) (∀x)(∃y)(y < x)
(8) (∀x)(∀y)(∀z)(x + (y + z) = (x + y) + z)
(9) (∀x)(x + 0 = x)
(10) (∀x)(∃y)(x + y = 0)
(11) (∀x)(∀y)(x + y = y + x)
(12) (∀x)(∀y)(∀z)(x•(y•z) = (x•y)•z)
(13) (∀x)(x•1 = x)
(14) (∀x)(x = 0 ∨ (∃y)(y•x = 1))
(15) (∀x)(∀y)(x•y = y•x)
(16) (∀x)(∀y)(∀z)(x•(y + z) = (x•y) + (x•z))
(17) 0 ≠ 1
(18) (∀x)(∀y)(∀z)(x < y → x + z < y + z)
(19) (∀x)(∀y)(∀z)(x < y ∧ 0 < z → x•z < y•z)
(20) for each n ≥ 1 we have the formula (∀x0)(∀x1)⋯(∀xn)(∃y)(xn•y^n + xn−1•y^(n−1) + ⋯ + x1•y + x0 = 0 ∨ xn = 0)
where, as usual, y^k abbreviates the product y•y•⋯•y with k factors.
The latter formulas express that each polynomial of degree n has a root. The following formulas express the intermediate value property for polynomials of degree n: if the polynomial changes sign from w to z, then it is zero at some y between w and z.
(21) for each n ≥ 1 we have (∀x0)...(∀xn)(∀w)(∀z)[(xn•w^n + xn−1•w^(n−1) + ⋯ + x1•w + x0)•(xn•z^n + xn−1•z^(n−1) + ⋯ + x1•z + x0) < 0 → (∃y)(((w < y ∧ y < z) ∨ (z < y ∧ y < w)) ∧ (xn•y^n + xn−1•y^(n−1) + ⋯ + x1•y + x0 = 0))]
The most fundamental concept is that of a sentence σ being true when interpreted in a model 𝔄. We write this as 𝔄 |= σ, and we extend this concept in the following definitions.
Definition 11. If Σ is a set of sentences, 𝔄 is said to be a model of Σ, written 𝔄 |= Σ, whenever 𝔄 |= σ for each σ ∈ Σ. Σ is said to be satisfiable iff there is some 𝔄 such that 𝔄 |= Σ.
Definition 12. A theory T is a set of sentences. If T is a theory and σ is a sentence, we write T |= σ whenever we have that for all 𝔄, if 𝔄 |= T then 𝔄 |= σ. We say that σ is a consequence of T. A theory is said to be closed whenever it contains all of its consequences.
Definition 13. If 𝔄 is a model for the language L, the theory of 𝔄, denoted by Th 𝔄, is defined to be the set of all sentences of L which are true in 𝔄, {σ of L : 𝔄 |= σ}. This is one way that a theory can arise. Another way is through axioms.
Definition 14. Σ ⊆ T is said to be a set of axioms for T whenever Σ |= σ for every σ in T; in this case we write Σ |= T.
Remark. We will generally assume our theories are closed and we will often describe theories by specifying a set of axioms Σ. The theory will then be all consequences σ of Σ.
Example 5. We will consider the following theories and their axioms.
(1) The theory of Linear Orderings (LOR) is a theory in the language {<} which has as axioms sentences 1-4 from Example 4.
(2) The theory of Dense Linear Orders (DLO) is a theory in the language {<} which has as axioms all the axioms of LOR, and sentences 5, 6 and 7 of Example 4.
(3) The theory of Fields (FLD) is a theory in the language {0,1,+,•} which has as axioms sentences 8-17 from Example 4.
(4) The theory of Ordered Fields (ORF) is a theory in the language given by {<,0,1,+,•} which has as axioms all the axioms of FLD, LOR and sentences 18 and 19 from Example 4.
(5) The theory of Algebraically Closed Fields (ACF) is a theory in the language {0,1,+,•} which has as axioms all the axioms of FLD and all sentences from 20 of Example 4, i.e. infinitely many sentences, one for each n ≥ 1.
(6) The theory of Real Closed Ordered Fields (RCF) is a theory in the language {<,0,1,+,•} which has as axioms all the axioms of ORF, and all sentences from 21 of Example 4, i.e. infinitely many sentences, one for each n ≥ 1.
Exercise 5. Show that:
(1) Q |= DLO
(2) R |= RCF using the Intermediate Value theorem
(3) C |= ACF using the Fundamental Theorem of Algebra
where Q, R and C are as in Example 3.
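For a finite candidate model, being a model of FLD can be checked by brute force over the universe. The sketch below is an illustration only: the function name and the choice of the integers modulo n as test models are assumptions, not part of the text. It tests sentences (8)-(17) of Example 4 directly and reports which of them fail.

```python
from itertools import product

def failing_field_axioms(n):
    """Numbers of the sentences (8)-(17) that fail in the integers modulo n."""
    U = range(n)
    zero, one = 0, 1 % n              # interpretations of the constants 0 and 1
    add = lambda x, y: (x + y) % n    # interpretation of +
    mul = lambda x, y: (x * y) % n    # interpretation of •
    pairs = list(product(U, repeat=2))
    triples = list(product(U, repeat=3))
    checks = {
        8: all(add(x, add(y, z)) == add(add(x, y), z) for x, y, z in triples),
        9: all(add(x, zero) == x for x in U),
        10: all(any(add(x, y) == zero for y in U) for x in U),
        11: all(add(x, y) == add(y, x) for x, y in pairs),
        12: all(mul(x, mul(y, z)) == mul(mul(x, y), z) for x, y, z in triples),
        13: all(mul(x, one) == x for x in U),
        14: all(x == zero or any(mul(y, x) == one for y in U) for x in U),
        15: all(mul(x, y) == mul(y, x) for x, y in pairs),
        # (16) distributivity: x•(y + z) = (x•y) + (x•z)
        16: all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z)) for x, y, z in triples),
        17: zero != one,
    }
    return [k for k, ok in checks.items() if not ok]

print(failing_field_axioms(5))   # integers modulo a prime
print(failing_field_axioms(6))   # integers modulo a composite number
```

Modulo 5 every sentence holds, so that structure is a model of FLD; modulo 6 only sentence (14) fails, because 2 has no multiplicative inverse. The infinite models of Exercise 5 are out of reach of this kind of exhaustive check.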
Remark. The theory of Real Closed Ordered Fields is sometimes axiomatised differently. All the axioms of ORF are retained, but the sentences from 21 of Example 4, which amount to an Intermediate Value Property, are replaced by the sentences from 20 for odd n and the sentence (∀x)(0 < x → (∃y)(y^2 = x)) which states that every positive element has a square root. A significant amount of algebra would then be used to verify the Intermediate Value Property from these axioms.
CHAPTER 2
Compactness and Elementary Submodels
Theorem 1. The Compactness Theorem (Malcev) A set of sentences is satisﬁable iﬀ every ﬁnite subset is satisﬁable.
Proof. There are several proofs. We only point out here that it is an easy consequence of the following theorem which appears in all elementary logic texts:
Proposition. The Completeness Theorem (Gödel, Malcev) A set of sentences is consistent if and only if it is satisfiable.
Although we do not here formally deﬁne “consistent”, it does mean what you think it does. In particular, a set of sentences is consistent if and only if each ﬁnite subset is consistent.
Remark. The Compactness Theorem is the only one for which we do not give a complete proof. For the reader who has not previously seen the Completeness Theorem, there are other proofs of the Compactness Theorem which may be more easily absorbed: set theoretic (using ultraproducts), topological (using compact spaces, hence the name) or Boolean algebraic. However these topics are too far afield to enter into the proofs here. We will use the Compactness Theorem as a starting point; in fact, all that follows can be seen as its corollaries.
Exercise 6. Suppose T is a theory for the language L and σ is a sentence of L such that T |= σ. Prove that there is some finite T′ ⊆ T such that T′ |= σ. Recall that T |= σ iff T ∪ {¬σ} is not satisfiable.
Definition 15. If L and L′ are two languages such that L ⊆ L′ we say that L′ is an expansion of L and L is a reduction of L′. Of course when we say that L ⊆ L′ we also mean that the constant, function and relation symbols of L remain (respectively) constant, function and relation symbols of the same type in L′.
Definition 16. Given a model 𝔄 for the language L, we can expand it to a model 𝔄′ of L′, where L′ is an expansion of L, by giving appropriate interpretations to the symbols in L′\L. We say that 𝔄′ is an expansion of 𝔄 to L′ and that 𝔄 is a reduct of 𝔄′ to L. We also use the notation 𝔄′|L for the reduct of 𝔄′ to L.
Theorem 2. If a theory T has arbitrarily large finite models, then it has an infinite model.
Proof. Consider new constant symbols ci for i ∈ ℕ, the usual natural numbers, and expand from L, the language of T, to L′ = L ∪ {ci : i ∈ ℕ}. Let Σ = T ∪ {¬ci = cj : i ≠ j, i,j ∈ ℕ}.
We first show that every finite subset of Σ has a model by interpreting the finitely many relevant constant symbols as different elements in an expansion of some finite model of T. Then we use compactness to get a model 𝔄′ of Σ. The model that we require is for the language L, so we take 𝔄 to be the reduct of 𝔄′ to L. It is infinite because the constant symbols ci are interpreted in 𝔄′ as pairwise distinct elements of the universe.
Definition 17. Two models 𝔄 and 𝔄′ for L are said to be isomorphic whenever there is a bijection f : A → A′ such that
(1) for each n-placed relation symbol R of L and corresponding interpretations S of 𝔄 and S′ of 𝔄′ we have S(x1,...,xn) iff S′(f(x1),...,f(xn)) for all x1,...,xn in A
(2) for each n-placed function symbol F of L and corresponding interpretations G of 𝔄 and G′ of 𝔄′ we have f(G(x1,...,xn)) = G′(f(x1),...,f(xn)) for all x1,...,xn in A
(3) for each constant symbol c of L and corresponding constant elements a of 𝔄 and a′ of 𝔄′ we have f(a) = a′.
We write 𝔄 ≅ 𝔄′. This is an equivalence relation.
Example 6. Number theory is Th⟨ℕ,+,•,<,0,1⟩, the set of all sentences of L = {+,•,<,0,1} which are true in ⟨ℕ,+,•,<,0,1⟩, the standard model which we all learned in school. Any model not isomorphic to the standard model of number theory is said to be a non-standard model of number theory.
Theorem 3. (T. Skolem) There exist non-standard models of number theory.
Proof. Add a new constant symbol c to L. Consider Th⟨ℕ,+,•,<,0,1⟩ ∪ {1 + 1 + ⋯ + 1 < c : n ∈ ℕ}, where the sum 1 + 1 + ⋯ + 1 has n terms, and use the Compactness Theorem. The interpretation of the constant symbol c will not be a natural number.
Definition 18. Two models 𝔄 and 𝔄′ for L are said to be elementarily equivalent whenever we have that for each sentence σ of L, 𝔄 |= σ iff 𝔄′ |= σ. We write 𝔄 ≡ 𝔄′. This is another equivalence relation.
Exercise 7. Suppose f : A → A′ is an isomorphism and ϕ is a formula such that 𝔄 |= ϕ[a0,...,ak] for some a0,...,ak from A; prove 𝔄′ |= ϕ[f(a0),...,f(ak)]. Use this to show that 𝔄 ≅ 𝔄′ implies 𝔄 ≡ 𝔄′.
Definition 19. A model 𝔄′ is called a submodel of 𝔄, and we write 𝔄′ ⊆ 𝔄, whenever ∅ ≠ A′ ⊆ A and
(1) each n-placed relation S′ of 𝔄′ is the restriction to A′ of the corresponding relation S of 𝔄, i.e. S′ = S ∩ (A′)^n
(2) each m-placed function G′ of 𝔄′ is the restriction to A′ of the corresponding function G of 𝔄, i.e. G′ = G ↾ (A′)^m
(3) each constant of 𝔄′ is the corresponding constant of 𝔄.
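Definition 19 has a practical consequence for finite models: a nonempty subset of the universe carries a submodel exactly when it contains every constant and is closed under every function, since relations can always be restricted. The sketch below checks this; the function name, the dictionary layout and the example of addition modulo 6 are assumptions made for illustration.

```python
from itertools import product

def carries_submodel(subset, functions, constants):
    """Can `subset` be the universe of a submodel?

    functions: dict name -> (arity, callable); constants: dict name -> element.
    Relations need no check, because they are simply restricted to the subset.
    """
    sub = set(subset)
    if not sub:
        return False                      # universes are nonempty
    if any(c not in sub for c in constants.values()):
        return False                      # constants must stay the same
    for arity, g in functions.values():
        for args in product(sub, repeat=arity):
            if g(*args) not in sub:
                return False              # the restriction must map into the subset
    return True

# Example model (an assumption): integers modulo 6 with addition and the constant 0.
functions = {"add": (2, lambda x, y: (x + y) % 6)}
constants = {"zero": 0}
print(carries_submodel({0, 2, 4}, functions, constants))
print(carries_submodel({0, 1}, functions, constants))
```

The subset {0, 2, 4} is closed under addition modulo 6 and contains 0, so it is the universe of a submodel; {0, 1} is not, because 1 + 1 = 2 falls outside it.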
Definition 20. Let 𝔄 and 𝔅 be two models for L. We say 𝔄 is an elementary submodel of 𝔅 and 𝔅 is an elementary extension of 𝔄, and we write 𝔄 ≺ 𝔅, whenever
(1) 𝔄 ⊆ 𝔅 and
(2) for all formulas ϕ(v0,...,vk) of L and all a0,...,ak ∈ A, 𝔄 |= ϕ[a0,...,ak] iff 𝔅 |= ϕ[a0,...,ak].
Exercise 8. Prove that:
• if 𝔄 ⊆ 𝔅 and 𝔅 ⊆ ℭ then 𝔄 ⊆ ℭ,
• if 𝔄 ≺ 𝔅 and 𝔅 ≺ ℭ then 𝔄 ≺ ℭ,
• if 𝔄 ≺ 𝔅 then 𝔄 ⊆ 𝔅 and 𝔄 ≡ 𝔅.
Example 7. Let ℕ be the usual natural numbers with < as the usual ordering. Let 𝔅 = ⟨ℕ,<⟩ and 𝔄 = ⟨ℕ\{0},<⟩ be models for the language with one binary relation symbol <. Then 𝔄 ⊆ 𝔅 and 𝔄 ≡ 𝔅; in fact 𝔄 ≅ 𝔅. But we do not have 𝔄 ≺ 𝔅; 1 satisfies the formula describing the least element of the ordering in 𝔄 but not so in 𝔅. So we see that being an elementary submodel is a very strong condition indeed. Nevertheless, later in the chapter we will obtain many examples of elementary submodels.
Definition 21. A chain of models for a language L is an increasing sequence of models 𝔄0 ⊆ 𝔄1 ⊆ ⋯ ⊆ 𝔄n ⊆ ⋯, n ∈ ℕ. The union of the chain is defined to be the model 𝔄 = ∪{𝔄n : n ∈ ℕ} where the universe of 𝔄 is A = ∪{An : n ∈ ℕ} and:
(1) each relation S on A is the union of the corresponding relations Sn of 𝔄n; S = ∪{Sn : n ∈ ℕ}, i.e. the relation extending each Sn
(2) each function G on A is the union of the corresponding functions Gn of 𝔄n; G = ∪{Gn : n ∈ ℕ}, i.e. the function extending each Gn
(3) all the models 𝔄n and 𝔄 have the same constant elements.
Note that each 𝔄n ⊆ 𝔄.
Remark. To be sure, what is defined here is a chain of models indexed by the natural numbers ℕ. More generally, a chain of models could be indexed by any ordinal. However we will not need the concept of an ordinal at this point.
Example 8. For each n ∈ ℕ, let An = {−n,−n + 1,−n + 2,...,0,1,2,3,...} ⊆ ℤ. Let 𝔄n = ⟨An,≤⟩. Each 𝔄n ≡ 𝔄0, but we don’t have 𝔄0 ≡ ∪{𝔄n : n ∈ ℕ}.
Definition 22. An elementary chain is a chain of models {𝔄n : n ∈ ℕ} such that for each m < n we have 𝔄m ≺ 𝔄n.
Theorem 4. (Tarski’s Elementary Chain Theorem) Let {𝔄n : n ∈ ℕ} be an elementary chain. For all n ∈ ℕ we have 𝔄n ≺ ∪{𝔄n : n ∈ ℕ}.
Proof. Denote the union of the chain by 𝔄. We have 𝔄k ⊆ 𝔄 for each k ∈ ℕ.
Claim. If t is a term of the language L and a0,...,ap are in Ak, then the value of the term t[a0,...,ap] in A is equal to the value in Ak.
Proof of Claim. We prove this by induction on the complexity of the term. (1) If t is the variable vi then both values are just ai. (2) If t is the constant symbol c then the values are equal because c has the same interpretation in A and in Ak. (3) If t is F(t1 ...tm) where F is a function symbol and t1,...,tm are terms such that each value ti[a0,...,ap] is the same in both A and Ak, then the value F(t1 ...tm)[a0,...,ap] in A is G(t1[a0,...,ap],...,tm[a0,...,ap]) where G is the interpretation of F in A and the value of
F(t1 ...tm)[a0,...,ap]
in Ak is
Gk(t1[a0,...,ap],...,tm[a0,...,ap]) where Gk is the interpretation of F in Ak. But Gk is the restriction of G to Ak so these values are equal.
In order to show that each Ak ≺ A it will suffice to prove the following statement for each formula ϕ(v0,...,vp) of L: “For all k ∈ N and all a0,...,ap in Ak: A |= ϕ[a0,...,ap] iff Ak |= ϕ[a0,...,ap].”

Claim. The statement is true whenever ϕ is t1 = t2 where t1 and t2 are terms.
Proof of Claim. Fix k ∈ N and a0,...,ap in Ak.
A |= (t1 = t2)[a0,...,ap]
iff t1[a0,...,ap] = t2[a0,...,ap] in A
iff t1[a0,...,ap] = t2[a0,...,ap] in Ak (by the claim about terms)
iff Ak |= (t1 = t2)[a0,...,ap].

Claim. The statement is true whenever ϕ is R(t1 ...tn) where R is a relation symbol and t1,...,tn are terms.
Proof of Claim. Fix k ∈ N and a0,...,ap in Ak. Let S be the interpretation of R in A and Sk be the interpretation in Ak; Sk is the restriction of S to Ak.
A |= R(t1 ...tn)[a0,...,ap]
iff S(t1[a0,...,ap],...,tn[a0,...,ap])
iff Sk(t1[a0,...,ap],...,tn[a0,...,ap])
iff Ak |= R(t1 ...tn)[a0,...,ap].
Claim. If the statement is true when ϕ is θ, then the statement is true when ϕ is ¬θ.
Proof of Claim. Fix k ∈ N and a0,...,ap in Ak.
A |= (¬θ)[a0,...,ap]
iff not A |= θ[a0,...,ap]
iff not Ak |= θ[a0,...,ap]
iff Ak |= (¬θ)[a0,...,ap].
Claim. If the statement is true when ϕ is θ1 and when ϕ is θ2 then the statement is true when ϕ is θ1 ∧θ2. Proof of Claim. Fix k ∈N and a0,...,ap in Ak. A |= (θ1 ∧θ2)[a0,...,ap] iﬀ A |= θ1[a0,...,ap] and A |= θ2[a0,...,ap] iﬀ Ak |= θ1[a0,...,ap] and Ak |= θ2[a0,...,ap] iﬀ Ak |= (θ1 ∧θ2)[a0,...,ap].
Claim. If the statement is true when ϕ is θ then the statement is true when ϕ is ∃viθ.
Proof of Claim. Fix k ∈ N and a0,...,ap in Ak. Note that A = ∪{Aj : j ∈ N}.
A |= ∃viθ[a0,...,ap]
iff A |= ∃viθ[a0,...,aq] where q is the maximum of i and p (by Lemma 2),
iff A |= θ[a0,...,ai−1,a,ai+1,...,aq] for some a ∈ A,
iff A |= θ[a0,...,ai−1,a,ai+1,...,aq] for some a ∈ Al for some l ≥ k,
iff Al |= θ[a0,...,ai−1,a,ai+1,...,aq] since the statement is true for θ,
iff Al |= ∃viθ[a0,...,aq],
iff Ak |= ∃viθ[a0,...,aq] since Ak ≺ Al,
iff Ak |= ∃viθ[a0,...,ap] (by Lemma 2).

By induction on the complexity of ϕ, we have proven the statement for all formulas ϕ which do not contain the connectives ∨, → and ↔ or the quantifier ∀. To verify the statement for all ϕ we use Lemma 3. Let ϕ be any formula of L. By Lemma 3 there is a formula ψ which does not use ∨, →, ↔ nor ∀ such that |= (∀v0)...(∀vp)(ϕ ↔ ψ). Now fix k ∈ N and a0,...,ap in Ak. We have A |= (ϕ ↔ ψ)[a0,...,ap] and Ak |= (ϕ ↔ ψ)[a0,...,ap]. Hence
A |= ϕ[a0,...,ap]
iff A |= ψ[a0,...,ap]
iff Ak |= ψ[a0,...,ap]
iff Ak |= ϕ[a0,...,ap],
which completes the proof of the theorem.
Lemma 5. (The Tarski-Vaught Condition) Let A and B be models for L with A ⊆ B. The following are equivalent:
(1) A ≺ B
(2) for any formula ψ(v0,...,vq) and any i ≤ q and any a0,...,aq from A: if there is some b ∈ B such that B |= ψ[a0,...,ai−1,b,ai+1,...,aq] then we have some a ∈ A such that B |= ψ[a0,...,ai−1,a,ai+1,...,aq].

Proof. Only the implication (2) ⇒ (1) requires a lot of proof. We will prove that for each formula ϕ(v0,...,vp) and all a0,...,ap from A we will have: A |= ϕ[a0,...,ap] iff B |= ϕ[a0,...,ap], by induction on the complexity of ϕ using only the negation symbol ¬, the connective ∧ and the quantifier ∃ (recall Lemma 3).
(1) The cases of formulas of the form t1 = t2 and R(t1 ...tn) come immediately from the fact that A ⊆ B.
(2) For negation: suppose ϕ is ¬ψ and we have it for ψ; then A |= ϕ[a0,...,ap] iff not A |= ψ[a0,...,ap] iff not B |= ψ[a0,...,ap] iff B |= ϕ[a0,...,ap].
(3) The ∧ case proceeds similarly.
(4) For the ∃ case we consider ϕ as ∃viψ. If A |= ∃viψ[a0,...,ap], then the inductive hypothesis for ψ and the fact that A ⊆ B ensure that B |= ∃viψ[a0,...,ap]. It remains to show that if B |= ϕ[a0,...,ap] then A |= ϕ[a0,...,ap]. Assume B |= ∃viψ[a0,...,ap]. By Lemma 2, B |= ∃viψ[a0,...,aq] where q is the maximum of i and p. By the definition of satisfaction, there is some b ∈ B such that B |= ψ[a0,...,ai−1,b,ai+1,...,aq]. By (2), there is some a ∈ A such that B |= ψ[a0,...,ai−1,a,ai+1,...,aq]. By the inductive hypothesis on ψ, for that same a ∈ A, A |= ψ[a0,...,ai−1,a,ai+1,...,aq]. By the definition of satisfaction, A |= ∃viψ[a0,...,aq]. Finally, by Lemma 2, A |= ϕ[a0,...,ap].

Recall that |B| is used to represent the cardinality, or size, of the set B. Note that since any language L contains infinitely many variables, |L| is always infinite, but may be countable or uncountable depending on the number of other symbols. We often denote an arbitrary infinite cardinal by the lower case Greek letter κ.
Theorem 5. (Löwenheim-Skolem Theorem) Let B be a model for L and let κ be any cardinal such that |L| ≤ κ < |B|. Then B has an elementary submodel A of cardinality κ. Furthermore if X ⊆ B and |X| ≤ κ, then we can also have X ⊆ A.
Proof. Without loss of generality assume |X| = κ. We recursively define sets Xn for n ∈ N such that X = X0 ⊆ X1 ⊆ ··· ⊆ Xn ⊆ ··· and such that for each formula ϕ(v0,...,vp) of L and each i ≤ p and each a0,...,ap from Xn such that B |= ∃viϕ[a0,...,ap] we have x ∈ Xn+1 such that B |= ϕ[a0,...,ai−1,x,ai+1,...,ap]. Since |L| ≤ κ and each formula of L is a finite string of symbols from L, there are at most κ many formulas of L. So there are at most κ elements of B that need to be added to each Xn and so, without loss of generality, each |Xn| = κ. Let A = ∪{Xn : n ∈ N}; then |A| = κ. Since A is closed under functions from B and contains all constants from B, A gives rise to a submodel A ⊆ B. The Tarski-Vaught Condition is used to show that A ≺ B.

An interesting consequence of this theorem is that the ordered field of real numbers R has a countable elementary submodel containing π and e.

Definition 23. A theory T for a language L is said to be complete whenever for each sentence σ of L either T |= σ or T |= ¬σ.

Lemma 6. A theory T for L is complete iff any two models of T are elementarily equivalent.
Proof. (⇒) easy. (⇐) easy.

Definition 24. A theory T is said to be categorical in cardinality κ whenever any two models of T of cardinality κ are isomorphic. We also say that T is κ-categorical. The most interesting cardinalities in the context of categorical theories are ℵ0, the cardinality of countably infinite sets, and ℵ1, the first uncountable cardinal.

Exercise 9. Show that DLO is ℵ0-categorical. There are two well-known proofs. One uses a back-and-forth construction of an isomorphism. The other constructs, by recursion, an isomorphism from the set of dyadic rational numbers between 0 and 1: {n/2^m : m is a positive integer and n is an integer with 0 < n < 2^m}, onto a countable dense linear order without endpoints. Now use the following theorem to show that DLO is complete.
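The back-and-forth idea mentioned in Exercise 9 can be illustrated for finitely many steps. The following is only a sketch under stated assumptions, not a proof: it matches the dyadic rationals in (0, 1) against the rationals Q, and the names `witness`, `back_and_forth`, `DYADIC` and `RATIONAL` are hypothetical helpers introduced here. Each step extends a finite order-preserving map by one point, using density and the absence of endpoints to find a partner.

```python
from fractions import Fraction

# Each side supplies: a point below y, a point above y, and a starting point.
# DYADIC stays inside the dyadic rationals of (0, 1); RATIONAL is all of Q.
DYADIC = (lambda y: y / 2, lambda y: (y + 1) / 2, Fraction(1, 2))
RATIONAL = (lambda y: y - 1, lambda y: y + 1, Fraction(0))

def witness(x, pairs, side):
    """Return a point that sits among the second coordinates of `pairs`
    exactly as x sits among the first coordinates."""
    below_fn, above_fn, start = side
    below = [q for p, q in pairs if p < x]
    above = [q for p, q in pairs if p > x]
    if below and above:          # density: take a point strictly in between
        return (max(below) + min(above)) / 2
    if below:                    # no right endpoint: go above everything
        return above_fn(max(below))
    if above:                    # no left endpoint: go below everything
        return below_fn(min(above))
    return start

def back_and_forth(a_list, b_list):
    """Alternate 'forth' (handle the next a) and 'back' (handle the next b)."""
    pairs = []
    for a, b in zip(a_list, b_list):
        if all(p != a for p, _ in pairs):            # forth step
            pairs.append((a, witness(a, pairs, RATIONAL)))
        if all(q != b for _, q in pairs):            # back step
            swapped = [(q, p) for p, q in pairs]
            pairs.append((witness(b, swapped, DYADIC), b))
    return sorted(pairs)

a_list = [Fraction(1, 2), Fraction(1, 4), Fraction(3, 4), Fraction(1, 8)]
b_list = [Fraction(0), Fraction(1), Fraction(-1), Fraction(1, 3)]
for a, b in back_and_forth(a_list, b_list):
    print(a, "->", b)
```

Running the two steps alternately is what guarantees, in the limit, that every element of both orders is eventually matched; the finite run above only shows the first few stages.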
Theorem 6. (The Łoś-Vaught Test) Suppose that a theory T has only infinite models for a language L and that T is κ-categorical for some cardinal κ ≥ |L|. Then T is complete.
Proof. We will show that any two models of T are elementarily equivalent. Let A of cardinality λ1, and B of cardinality λ2, be two models of T. If λ1 = κ simply take A′ = A. If λ1 > κ use the Löwenheim-Skolem Theorem to get A′ such that |A′| = κ and A′ ≺ A. If λ1 < κ use the Compactness Theorem on the set of sentences Th A ∪ {cα ≠ cβ : α ≠ β} where {cα : α ∈ κ} is a set of new constant symbols of size κ, to obtain a model C for this expanded language such that |C| ≥ κ; each finite subset is satisfiable in an expansion of A because A is infinite. The reduct C′ of C to the language L has the property that C′ |= Th A and hence A ≡ C′. Now use the Löwenheim-Skolem Theorem to get A′ such that |A′| = κ and A′ ≺ C′. Either way, we can get A′ such that |A′| = κ and A′ ≡ A. Similarly, we can get B′ such that |B′| = κ and B′ ≡ B. Since T is κ-categorical, A′ ≅ B′. Hence A ≡ B.

Recall that the characteristic of a field is the prime number p such that 1 + 1 + ··· + 1 = 0 (with p summands), provided that such a p exists; if no such p exists the field has characteristic 0. All of our best-loved fields: Q, R and C have characteristic 0. On the other hand, fields of characteristic p include the finite field of size p (the prime Galois field).
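The definition of characteristic just recalled can be tried out directly by adding 1 to itself. The sketch below is illustrative only: the function name `characteristic` and the search bound are assumptions made here, and for a field of characteristic 0 a finite search can of course only report that no p was found up to the bound.

```python
from fractions import Fraction

def characteristic(add, zero, one, bound):
    """Return the least p >= 1 with 1 + ... + 1 (p summands) equal to zero,
    searching p = 1..bound; return 0 if no such p is found in that range."""
    total = zero
    for p in range(1, bound + 1):
        total = add(total, one)
        if total == zero:
            return p
    return 0

# The prime Galois field of size 7: integers with addition modulo 7.
print(characteristic(lambda x, y: (x + y) % 7, 0, 1, bound=100))
# The rationals: no finite sum of 1's vanishes (checked only up to the bound).
print(characteristic(lambda x, y: x + y, Fraction(0), Fraction(1), bound=100))
```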
Theorem 7. The theory of algebraically closed fields of characteristic 0 is complete.
Proof. We use the Łoś-Vaught Test and the following Lemma.

Lemma 7. Any two algebraically closed fields of characteristic 0 and cardinality ℵ1 are isomorphic.
Proof. Let A be such a field containing the rationals Q = ⟨Q, +, ·, 0, 1⟩ as a prime subfield. In a manner completely analogous to finding a basis for a vector space, we can find a transcendence basis for A, that is, an indexed subset {a_α : α ∈ I} ⊆ A such that A is the algebraic closure of the subfield A0 generated by {a_α : α ∈ I} but no a_β is in the algebraic closure of the subfield generated by the rest: {a_α : α ∈ I and α ≠ β}. Since the subfield generated by a countable subset would be countable and the algebraic closure of a countable subfield would also be countable, we must have that the transcendence basis is uncountable. Since |A| = ℵ1, the least uncountable cardinal, we must have in fact that |I| = ℵ1.
Now let B be any other algebraically closed field of characteristic 0 and size ℵ1. As above, obtain a transcendence basis {b_β : β ∈ J} with |J| = ℵ1 and its generated subfield B0. Since |I| = |J|, there is a bijection g : I → J which we can use to build an isomorphism from A to B. Since B has characteristic 0, a standard theorem of algebra gives that the rationals are isomorphically embedded into B. Let this embedding be f : Q ↪ B. We extend f as follows: for each α ∈ I, let f(a_α) = b_g(α), which maps the transcendence basis of A into the transcendence basis of B.
We now extend f to map A0 onto B0 as follows. Each element of A0 is given by
p(a_α1,...,a_αm) / q(a_α1,...,a_αm),
where p and q are polynomials with rational coefficients and the a’s come, of course, from the transcendence basis. Let f map such an element to
p̄(b_g(α1),...,b_g(αm)) / q̄(b_g(α1),...,b_g(αm)),
where p̄ and q̄ are polynomials whose coefficients are the images under f of the rational coefficients of p and q. The final extension of f to all of A and B comes from the uniqueness of algebraic closures.
Remark. Lemma 7 is also true when 0 is replaced by any fixed characteristic and ℵ1 by any uncountable cardinal.

Theorem 8. Let H be a set of sentences in the language of field theory which are true in algebraically closed fields of arbitrarily high characteristic. Then H holds in some algebraically closed field of characteristic 0.
Proof. A field is a model in the language {+, ·, 0, 1} of the axioms of field theory. Let ACF be the set of axioms for the theory of algebraically closed fields; see Example 5. For each n ≥ 2, let τn denote the sentence
¬(1 + 1 + ··· + 1 = 0) (with n summands).
Let Σ = ACF ∪ H ∪ {τn : n ≥ 2}. Let Σ0 be any finite subset of Σ and let m be the largest natural number such that τm ∈ Σ0, or let m = 1 by default. Let A be an algebraically closed field of characteristic p > m such that A |= H; then in fact A |= Σ0. So by compactness there is B such that B |= Σ. B is the required field.

Corollary 1. Let C denote, as usual, the complex numbers. Every one-to-one polynomial map f : C^m → C^m is onto.
Proof. A polynomial map is a function of the form
f(x1,...,xm) = ⟨p1(x1,...,xm),...,pm(x1,...,xm)⟩
where each pi is a polynomial in the variables x1,...,xm. We call max{degree of pi : i ≤ m} the degree of f. Let L be the language of field theory and let θm,n be the sentence of L which expresses that “each polynomial map of m variables of degree < n which is one-to-one is also onto”. We wish to show that there are algebraically closed fields of arbitrarily high characteristic which satisfy H = {θm,n : m,n ∈ N}. We will then apply Theorem 8, Theorem 7, Lemma 6 and Exercise 5 and be finished.
Let p be any prime and let Fp be the prime Galois field of size p. The algebraic closure F̃p is the countable union of a chain of finite fields
Fp = A0 ⊆ A1 ⊆ A2 ⊆ ··· ⊆ Ak ⊆ Ak+1 ⊆ ···
obtained by recursively adding roots of polynomials. We finish the proof by showing that each ⟨F̃p, +, ·, 0, 1⟩ satisfies H. Given any polynomial map f : (F̃p)^m → (F̃p)^m which is one-to-one, we show that f is also onto. Given any elements b1,...,bm ∈ F̃p, there is some Ak containing b1,...,bm as well as all the coefficients of f. Since f is one-to-one, the restriction f↾(Ak)^m : (Ak)^m → (Ak)^m is a one-to-one polynomial map. Hence, since (Ak)^m is finite, f↾(Ak)^m is onto and so there are a1,...,am ∈ Ak such that f(a1,...,am) = ⟨b1,...,bm⟩. Therefore f is onto. Thus, for each prime number p and each m,n ∈ N, θm,n holds in a field of characteristic p, i.e. ⟨F̃p, +, ·, 0, 1⟩ satisfies H.

The above corollary is the famous Ax-Grothendieck Theorem. It is a significant problem to replace “one-to-one” with “locally one-to-one”.
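The finite step of this argument (a one-to-one map of a finite set into itself is onto) can be checked by brute force on a small prime field. This is only an illustrative sketch: the helper name `injective_and_onto` and the two sample maps over the field of size 5 are choices made here, not taken from the text.

```python
from itertools import product

def injective_and_onto(f, p, m):
    """Test a map f on (F_p)^m, with points given as m-tuples of ints mod p.
    Returns the pair (is one-to-one, is onto)."""
    points = list(product(range(p), repeat=m))
    image = {f(pt) for pt in points}
    injective = len(image) == len(points)   # no two points share a value
    onto = image == set(points)             # every point is a value
    return injective, onto

p = 5
# A one-to-one polynomial map on (F_5)^2: (x, y) -> (x + y^2, y).
shear = lambda pt: ((pt[0] + pt[1] ** 2) % p, pt[1])
# A polynomial map that is not one-to-one: (x, y) -> (x^2, y).
square = lambda pt: ((pt[0] ** 2) % p, pt[1])

print(injective_and_onto(shear, p, 2))
print(injective_and_onto(square, p, 2))
```

On a finite set the two answers always agree, which is exactly the counting fact used for the finite fields Ak in the proof.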
CHAPTER 3
Diagrams and Embeddings
Let A = ⟨A, I⟩ be a model for a language L. Expand L to the language L_A = L ∪ {c_a : a ∈ A} by adding new constant symbols to L. We can expand A to a model A_A = ⟨A, I′⟩ for L_A by choosing I′ extending I such that I′(c_a) = a for each a ∈ A. More generally, if f : X → A, we can expand L to L_X = L ∪ {c_x : x ∈ X} and expand A = ⟨A, I⟩ to ⟨A, I′⟩ where I′ extends I with each I′(c_x) = f(x). We denote the resulting model as ⟨A, f(x)⟩_{x∈X}, or A_X = ⟨A, x⟩_{x∈X} if f is the identity function.

Definition 25. Let A be a model for L.
(1) The elementary diagram of A is Th(A_A), the set of all sentences of L_A which hold in A_A.
(2) The diagram of A, denoted by Δ_A, is the set of all those sentences in Th(A_A) without quantifiers.

Remark. There is a notion of atomic formula, which is a formula of the form (t1 = t2) or (R(t1 ...tn)) where t1,...,tn are terms. Sometimes Δ_A is defined to be the set of all atomic formulas and negations of atomic formulas which occur in Th(A_A). However this is not substantially different from Definition 25, since the reader can quickly show that for any model B, B |= Δ_A in one sense iff B |= Δ_A in the other sense.

Exercise 10. Let A and B be models for L with X ⊆ A ⊆ B. Prove:
(i) A ⊆ B iff A_X ⊆ B_X iff B_A |= Δ_A.
(ii) A ≺ B iff A_X ≺ B_X iff B_A |= Th(A_A).
Hint: A |= ϕ[a1,...,ap] iff A_A |= ϕ* where ϕ* is the sentence of L_A formed by replacing each free occurrence of vi with c_ai.

Definition 26. A is said to be isomorphically embedded into B whenever
(1) there is a model C such that A ≅ C and C ⊆ B or
(2) there is a model D such that A ⊆ D and D ≅ B.

Exercise 11. Prove that, in fact, (1) and (2) are equivalent conditions.

Definition 27. A is said to be elementarily embedded into B whenever
(1) there is a model C such that A ≅ C and C ≺ B or
(2) there is a model D such that A ≺ D and D ≅ B.

Exercise 12. Again, prove that, in fact, (1) and (2) are equivalent.
The next result is extremely useful; the first part is called the Diagram Lemma and the second part is called the Elementary Diagram Lemma.

Theorem 9. Let A and B be models for L.
(1) A is isomorphically embedded into B if and only if B can be expanded to a model of Δ_A.
(2) A is elementarily embedded into B if and only if B can be expanded to a model of Th(A_A).
Proof. We sketch the proof of (1).
(⇒) If f is the isomorphism as in (1) of Definition 26 above, then ⟨B, f(a)⟩_{a∈A} |= Δ_A.
(⇐) If ⟨B, b_a⟩_{a∈A} |= Δ_A, then the set {b_a : a ∈ A} is the universe of a submodel C ⊆ B with C ≅ A.

Exercise 13. Give a complete proof of (2).

Exercise 14. Show that if A is a model for the language L and C is a model for the language L_A such that C |= Δ_A then there is a model B such that A ⊆ B and B_A ≅ C.

Exercise 15. The Löwenheim-Skolem Theorem is sometimes called the Downward Löwenheim-Skolem Theorem. Its partner is the Upward Löwenheim-Skolem Theorem: if A is an infinite model for L and κ is any cardinal such that |L| ≤ κ and |A| < κ, then A has an elementary extension of cardinality κ. Prove it.

We now apply these notions to graph theory and to calculus. The natural language for graph theory has one binary relation symbol which we call E (to suggest the word “edge”). Graph Theory has the following two axioms:
• (∀x)(∀y)[E(x,y) ↔ E(y,x)]
• (∀x)¬E(x,x).
A graph is, of course, a model of graph theory.
Corollary 2. Every planar graph can be four coloured.
Proof. We will have to use the famous result of Appel and Haken that every finite planar graph can be four coloured. Model Theory will take us from the finite to the infinite. We recall that a planar graph is one that can be embedded, or drawn, in the usual Euclidean plane and to be four coloured means that each vertex of the graph can be assigned one of four colours in such a way that no edge has the same colour for both endpoints. Let A be an infinite planar graph. Introduce four new unary relation symbols: R, G, B, Y (for red, green, blue and yellow). We wish to prove that there is some expansion A′ of A such that A′ |= σ where σ is the sentence in the expanded language:
(∀x)[R(x) ∨ G(x) ∨ B(x) ∨ Y(x)]
∧ (∀x)[R(x) → ¬(G(x) ∨ B(x) ∨ Y(x))] ∧ ...
∧ (∀x)(∀y)¬(R(x) ∧ R(y) ∧ E(x,y)) ∧ ···
which will ensure that the interpretations of R, G, B and Y will four colour the graph.
Let Σ = Δ_A ∪ {σ}. Any finite subset of Σ has a model, based upon the appropriate finite subset of A. By the compactness theorem, we get B |= Σ. Since B |= σ, the interpretations of R, G, B and Y four colour it. By the diagram lemma A is isomorphically embedded in the reduct of B, and this isomorphism delivers the four-colouring of A.
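For a finite graph, what the sentence σ demands can be checked mechanically. The sketch below is a hypothetical illustration only: `satisfies_sigma` tests that a proposed assignment gives every vertex one of the four colours and leaves no edge with equally coloured endpoints, and `find_four_colouring` is a naive exhaustive search that has nothing to do with the Appel and Haken proof.

```python
from itertools import combinations, product

COLOURS = ("R", "G", "B", "Y")

def satisfies_sigma(vertices, edges, colour):
    """colour maps each vertex to one colour, so 'exactly one colour per
    vertex' holds by construction once every vertex has a legal value."""
    every_vertex_coloured = all(colour.get(v) in COLOURS for v in vertices)
    return every_vertex_coloured and all(
        colour[x] != colour[y] for x, y in edges
    )

def find_four_colouring(vertices, edges):
    """Try all assignments; return one satisfying sigma, or None."""
    for choice in product(COLOURS, repeat=len(vertices)):
        colour = dict(zip(vertices, choice))
        if satisfies_sigma(vertices, edges, colour):
            return colour
    return None

# K4 is planar and needs all four colours; K5 is not planar and has none.
k4 = list(range(4))
k5 = list(range(5))
print(find_four_colouring(k4, list(combinations(k4, 2))))
print(find_four_colouring(k5, list(combinations(k5, 2))))
```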
A graph with the property that every pair of vertices is connected with an edge is called complete. At the other extreme, a graph with no edges is called discrete. A very important theorem in ﬁnite combinatorics says that most graphs contain an example of one or the other as a subgraph. A subgraph of a graph is, of course, a submodel of a model of graph theory.
Corollary 3. (Ramsey’s Theorem) For each n ∈N there is an r ∈N such that if G is any graph with r vertices, then either G contains a complete subgraph with n vertices or a discrete subgraph with n vertices.
Proof. We follow F. Ramsey who began by proving an inﬁnite version of the theorem (also called Ramsey’s Theorem).
Claim. Each inﬁnite graph G contains either an inﬁnite complete subgraph or an inﬁnite discrete subgraph.
Proof of Claim. By force of logical necessity, there are two possibilities:
(1) there is an infinite X ⊆ G such that for all x ∈ X there is a finite Fx ⊆ X such that E(x,y) for all y ∈ X \ Fx,
(2) for all infinite X ⊆ G there is an x ∈ X and an infinite Y ⊆ X such that ¬E(x,y) for all y ∈ Y.
If (1) occurs, we recursively pick x1 ∈ X, x2 ∈ X \ Fx1, x3 ∈ X \ (Fx1 ∪ Fx2), etc., to obtain an infinite complete subgraph. If (2) occurs we pick x0 ∈ G and Y0 ⊆ G with the property and then recursively choose x1 ∈ Y0 and Y1 ⊆ Y0, x2 ∈ Y1 and Y2 ⊆ Y1 and so on, to obtain an infinite discrete subgraph.

We now use Model Theory to go from the infinite to the finite. Let σ be the sentence, of the language of graph theory, asserting that there is no complete subgraph of size n:
(∀x1 ...∀xn)[¬E(x1,x2) ∨ ¬E(x1,x3) ∨ ··· ∨ ¬E(xn−1,xn)].
Let τ be the sentence asserting that there is no discrete subgraph of size n:
(∀x1 ...∀xn)[E(x1,x2) ∨ E(x1,x3) ∨ ··· ∨ E(xn−1,xn)].
Let T be the set consisting of σ, τ and the axioms of graph theory. If there is no r as Ramsey’s Theorem states, then T has arbitrarily large finite models. By Theorem 2, T has an infinite model, contradicting the claim.

Ramsey’s Theorem says that for each n there is some r. The proof does not, however, let us know exactly which r corresponds to any given n. Considerable effort has been made to find a more constructive proof. In particular we would
like to know, for each n, the smallest value of r which would satisfy Ramsey’s Theorem, called the Ramsey Number of n. The Ramsey number of 3 is 6; the Ramsey number of 4 is 18; the Ramsey number of 5 is ...unknown; but it’s somewhere between 40 and 50. Even less is known about the Ramsey numbers for higher values of n. Determining the Ramsey numbers may be the most mysterious problem in all of mathematics.
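The smallest case quoted above, that the Ramsey number of 3 is 6, is small enough to confirm by exhaustive search. The following sketch does so under its own naming assumptions (`has_homogeneous_set` and `ramsey_property` are hypothetical helpers): it enumerates every graph on r labelled vertices and looks for n vertices that are pairwise joined or pairwise unjoined. The search grows far too quickly to say anything about n = 5.

```python
from itertools import combinations, product

def has_homogeneous_set(r, edge_set, n):
    """True if some n of the vertices 0..r-1 form a complete or a discrete
    subgraph of the graph whose edges are the pairs in edge_set."""
    for subset in combinations(range(r), n):
        pairs = list(combinations(subset, 2))
        if all(pr in edge_set for pr in pairs):
            return True
        if all(pr not in edge_set for pr in pairs):
            return True
    return False

def ramsey_property(r, n):
    """True if every graph with r vertices contains a complete or discrete
    subgraph with n vertices."""
    all_pairs = list(combinations(range(r), 2))
    for bits in product([False, True], repeat=len(all_pairs)):
        edge_set = {pr for pr, bit in zip(all_pairs, bits) if bit}
        if not has_homogeneous_set(r, edge_set, n):
            return False
    return True

print(ramsey_property(5, 3))  # the 5-cycle is a counterexample
print(ramsey_property(6, 3))  # all 2**15 graphs on 6 vertices pass
```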
The following theorem of A. Robinson finally solved the centuries old problem of infinitesimals in the foundations of calculus.

Theorem 10. (The Leibniz Principle) There is an ordered field ∗R called the hyperreals, containing the reals R and a number larger than any real number such that any statement about the reals which holds in R also holds in ∗R.
Proof. Let R also denote the model ⟨R, +, ·, <, 0, 1⟩. We will make the statement of the theorem precise by proving that there is some model H, in the same language L as R and with the universe called ∗R, such that R ≺ H and there is b ∈ ∗R such that a < b for each a ∈ R. For each real number a, we introduce a new constant symbol c_a. In addition, another new constant symbol d is introduced. Let Σ be the set of sentences in the expanded language given by:
Th(R_R) ∪ {c_a < d : a is a real}.
We can obtain a model C |= Σ by the compactness theorem. Let C′ be the reduct of C to L. By the elementary diagram lemma R is elementarily embedded in C′, and so there is a model H for L such that C′ ≅ H and R ≺ H. Take b to be the interpretation of d in H.

Remark. The element b ∈ ∗R gives rise to an infinitesimal 1/b ∈ ∗R. An element x ∈ ∗R is said to be infinitesimal whenever −1/n < x < 1/n for each n ∈ N. 0 is infinitesimal. Two elements x, y ∈ ∗R are said to be infinitely close, written x ≈ y, whenever x − y is infinitesimal, so that x is infinitesimal iff x ≈ 0. An element x ∈ ∗R is said to be finite whenever −r < x < r for some positive r ∈ R. Else it is infinite. Each finite x ∈ ∗R is infinitely close to some real number, called the standard part of x, written st(x).

This idea is extremely useful in understanding calculus. To differentiate f, for each Δx ∈ ∗R generate Δy = f(x + Δx) − f(x). Then f′(x) = st(Δy/Δx) whenever this exists and is the same for each infinitesimal Δx ≠ 0. This legitimises the intuition of the founders of the differential calculus and allows us to use that intuition to move from the (finitely) small to the infinitely small. Proofs of the usual theorems of calculus are now much easier. More importantly, refinements of these ideas, now called non-standard analysis, form a powerful tool for applying calculus, just as its founders envisaged.

The following theorem is considered one of the most fundamental results of mathematical logic. We give a detailed proof.

Theorem 11. (Robinson Consistency Theorem) Let L1 and L2 be two languages with L = L1 ∩ L2. Suppose T1 and T2 are satisfiable
theories in L1 and L2 respectively. Then T1 ∪ T2 is satisfiable iff there is no sentence σ of L such that T1 |= σ and T2 |= ¬σ.
Proof. The direction ⇒ is easy and motivates the whole theorem. We begin the proof in the ⇐ direction. Our goal is to show that T1 ∪ T2 is satisfiable. The following claim is a first step.

Claim. T1 ∪ {sentences σ of L : T2 |= σ} is satisfiable.
Proof of Claim. Using the compactness theorem and considering conjunctions, it suffices to show that if T1 |= σ1 and T2 |= σ2 with σ2 a sentence of L, then {σ1, σ2} is satisfiable. But this is true, since otherwise we would have σ1 |= ¬σ2 and hence T1 |= ¬σ2 and so ¬σ2 would be a sentence of L contradicting our hypothesis. This proves the claim.

The basic idea of the proof from now on is as follows. In order to construct a model of T1 ∪ T2 we construct models A |= T1 and B |= T2 and an isomorphism f : A|L → B|L between the reducts of A and B to the language L, witnessing that A|L ≅ B|L. We then use f to carry over interpretations of symbols in L1 \ L from A to B, giving an expansion B^* of B to the language L1 ∪ L2. Then, since B^*|L1 ≅ A and B^*|L2 = B we get B^* |= T1 ∪ T2. The remainder of the proof will be devoted to constructing such an A, B and f. A and B will be constructed as unions of elementary chains of An’s and Bn’s while f will be the union of fn : An ↪ Bn. We begin with n = 0, the first link in the elementary chain.

Claim. There are models A0 |= T1 and B0 |= T2 with an elementary embedding f0 : A0|L ↪ B0|L.
Proof of Claim. Using the previous claim, let A0 |= T1 ∪ {sentences σ of L : T2 |= σ}. We first wish to show that Th((A0|L)_{A0}) ∪ T2 is satisfiable. Using the compactness theorem, it suffices to prove that if σ ∈ Th((A0|L)_{A0}) then T2 ∪ {σ} is satisfiable. For such a σ let c_a0,...,c_an be all the constant symbols from L_{A0} \ L which appear in σ. Let ϕ be the formula of L obtained by replacing each constant symbol c_ai by a new variable ui. We have A0|L |= ϕ[a0,...,an] and so A0|L |= ∃u0 ...∃unϕ. By the definition of A0, it cannot happen that T2 |= ¬∃u0 ...∃unϕ and so there is some model D for L2 such that D |= T2 and D |= ∃u0 ...∃unϕ. So there are elements d0,...,dn of D such that D |= ϕ[d0,...,dn]. Expand D to a model D^* for L2 ∪ L_{A0}, making sure to interpret each c_ai as di. Then D^* |= σ, and so D^* |= T2 ∪ {σ}.
Let B0^* |= Th((A0|L)_{A0}) ∪ T2. Let B0 be the reduct of B0^* to L2; clearly B0 |= T2. Since B0|L can be expanded to a model of Th((A0|L)_{A0}), the Elementary Diagram Lemma gives an elementary embedding f0 : A0|L ↪ B0|L and finishes the proof of the claim.
The other links in the elementary chain are provided by the following result.

Claim. For each n ≥ 0 there are models An+1 |= T1 and Bn+1 |= T2 with an elementary embedding fn+1 : An+1|L ↪ Bn+1|L such that An ≺ An+1, Bn ≺ Bn+1, fn+1 extends fn and Bn ⊆ range of fn+1.
A0 ≺ A1 ≺ ··· ≺ An ≺ An+1 ≺ ···
↓f0  ↓f1       ↓fn  ↓fn+1
B0 ≺ B1 ≺ ··· ≺ Bn ≺ Bn+1 ≺ ···
The proof of this claim will be discussed shortly. Assuming the claim, let A = ∪{An : n ∈ N}, B = ∪{Bn : n ∈ N} and f = ∪{fn : n ∈ N}. The Elementary Chain Theorem gives that A |= T1 and B |= T2. The proof of the theorem is concluded by simply verifying that f : A|L → B|L is an isomorphism.

The proof of the claim is long and quite technical; it would not be inappropriate to omit it on a first reading. The proof, of course, must proceed by induction on n. The case of a general n is no different from the case n = 0 which we state and prove in some detail.

Claim. There are models A1 |= T1 and B1 |= T2 with an elementary embedding f1 : A1|L ↪ B1|L such that A0 ≺ A1, B0 ≺ B1, f1 extends f0 and B0 ⊆ range of f1.
A0 ≺ A1
↓f0  ↓f1
B0 ≺ B1
Proof of Claim. Let A0^+ be the expansion of A0 to the language L1^+ = L1 ∪ {c_a : a ∈ A0} formed by interpreting each c_a as a ∈ A0; A0^+ is just another notation for (A0)_{A0}. The elementary diagram of A0^+ is Th((A0^+)_{A0^+}). Let B0^* be the expansion of B0|L to the language L^* = L ∪ {c_a : a ∈ A0} ∪ {c_b : b ∈ B0} formed by interpreting each c_a as f0(a) ∈ B0 and each c_b as b ∈ B0. We wish to prove that Th((A0^+)_{A0^+}) ∪ Th(B0^*) is satisfiable. By the compactness theorem it suffices to prove that Th((A0^+)_{A0^+}) ∪ {σ} is satisfiable for each σ in Th(B0^*). For such a sentence σ, let c_a0,...,c_am, c_b0,...,c_bn be all those constant symbols occurring in σ but not in L. Let ϕ(u0,...,um,w0,...,wn) be the formula of L obtained from σ by replacing each constant symbol c_ai by a new variable ui and each constant symbol c_bi by a new variable wi.
We have B0^* |= σ, so B0|L |= ϕ[f0(a0),...,f0(am),b0,...,bn]. So B0|L |= ∃w0 ...∃wnϕ[f0(a0),...,f0(am)]. Since f0 is an elementary embedding we have: A0|L |= ∃w0 ...∃wnϕ[a0,...,am].
Let ϕ̂(w0,...,wn) be the formula of L1^+ obtained by replacing occurrences of ui in ϕ(u0,...,um,w0,...,wn) by c_ai; then A0^+ |= ∃w0 ...∃wn ϕ̂. So, of course, (A0^+)_{A0^+} |= ∃w0 ...∃wn ϕ̂ and this means that there are d0,...,dn in A0^+ = A0 such that (A0^+)_{A0^+} |= ϕ̂[d0,...,dn]. We can now expand (A0^+)_{A0^+} to a model D by interpreting each c_bi as di to obtain D |= σ and so Th((A0^+)_{A0^+}) ∪ {σ} is satisfiable.
Let E |= Th((A0^+)_{A0^+}) ∪ Th(B0^*). By the elementary diagram lemma A0^+ is elementarily embedded into E|L1^+. So there is a model A1^+ for L1^+ with A0^+ ≺ A1^+ and an isomorphism g : A1^+ → E|L1^+. Using g we expand A1^+ to a model A1′ isomorphic to E. Let A1^* denote A1′|L^*; we have A1^* |= Th(B0^*).
We now wish to prove that Th((A1^*)_{A1^*}) ∪ Th((B0^+)_{B0^+}) is satisfiable, where B0^+ is the common expansion of B0 and B0^* to the language L2^+ = L2 ∪ {c_a : a ∈ A0} ∪ {c_b : b ∈ B0}. By the compactness theorem, it suffices to show that Th((B0^+)_{B0^+}) ∪ {σ} is satisfiable for each σ in Th((A1^*)_{A1^*}). Let c_x0,...,c_xn be all those constant symbols which occur in σ but are not in L^*. Let ψ(u0,...,un) be the formula of L^* obtained from σ by replacing
