Meditations on Probability Theory: Plausible Reasoning

Note: René Descartes is used as the cover image for this article, not only because Descartes' book Meditations on First Philosophy^[1] is the source of the title of this book's Chinese translation, but even more because Descartes represents the subjective turn in the history of Western philosophy, and his rationalist philosophy is one of the sources of the Bayesian school of thought (of which the author of this book is an avowed supporter).

Introduction

The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind.

--James Clerk Maxwell (1850)

Recently Kouhei Academic has been hosting a weekly reading group on Probability Theory Meditations^[2][3], which happened to coincide with my own reading of the Chinese translation of the book. Through the reading group I learned how teachers from different disciplines (mathematics, physics, statistics, computer science) understand the book, and my own understanding has gradually deepened in the process. So I plan to keep updating my reading notes every week.

The author of this book is a physicist. Unlike Kolmogorov's axiomatization of probability theory, which is built on the definitions of probability spaces and measures, this book treats probability theory as an extended logic (it does not rely on measures, or even on set theory and Venn diagrams). It introduces the concepts of plausible reasoning and degree of plausibility in a real-world empirical context, states the qualitative conditions (the desiderata) that the degree of plausibility needs to satisfy, and finally derives on this basis the quantitative rules that plausible reasoning must obey, namely the product rule and the sum rule (corresponding to Chapters 1 and 2 of the book). First, let's look at what we mean by plausible reasoning.

1 Deductive Reasoning and Plausible Reasoning

In our study of mathematical theory, we often use deductive reasoning, which can be decomposed into repeated applications of two kinds of Aristotelian strong syllogisms:

\[\begin{aligned} &\text{If } A \text{ is true, then } B \text{ is true} \\ &\underline{A \text{ is true}\qquad\qquad\qquad\quad}\\ &B \text{ is true} \end{aligned}\tag{1} \]

and its inverse

\[\begin{aligned} &\text{If } A \text{ is true, then } B \text{ is true} \\ &\underline{B \text{ is false}\qquad\qquad\qquad\quad}\\ &A \text{ is false} \end{aligned}\tag{2} \]

This kind of reasoning is characterized by the fact that we need completely certain information as the major premise (e.g., "if A is true, then B is true" above), which in mathematics is often presented in the form of an axiom; in addition, we need appropriate information as the minor premise (e.g., "A is true" or "B is false" above); its conclusion is that some proposition is true (or false). For example, if you take "All men die" as the major premise and "Socrates is a man" as the minor premise, then you can conclude that "Socrates will die".

However, the real world is very complex and we often do not have enough information to apply strong syllogisms. Sometimes we have difficulty observing the appropriate information for the minor premise, and sometimes our major premise is not completely certain and reliable.

For example, suppose we see a cloudy sky at 9 a.m. and therefore expect rain; the reasoning here is not deductive. Let the proposition \(A\) be "it starts to rain at 10 a.m." and \(B\) be "the sky becomes cloudy before 10 a.m.". Then the major premise of our reasoning is "if \(A\) is true, then \(B\) is true" and the minor premise is "\(B\) is true", which matches neither of the two strong syllogisms above. We call this form of reasoning a weak syllogism:

\[\begin{aligned} &\text{If } A \text{ is true, then } B \text{ is true} \\ &\underline{B \text{ is true}\qquad\qquad\qquad\quad}\\ &A \text{ becomes more plausible} \end{aligned}\tag{3} \]

In a weak syllogism, the evidence that \(B\) is true does not prove that \(A\) is true, but verifying one of the consequences of \(A\) gives us more confidence in \(A\).

The weak syllogism likewise has its inverse:

\[\begin{aligned} &\text{If } A \text{ is true, then } B \text{ is true} \\ &\underline{A \text{ is false}\qquad\qquad\qquad\quad}\\ &B \text{ becomes less plausible} \end{aligned}\tag{4} \]

Similarly, the evidence that \(A\) is false does not prove that \(B\) is false, but eliminating one of the possible reasons for \(B\) being true leaves us with less confidence in \(B\).

It is worth mentioning here that the reasoning by which a scientist accepts or rejects a theory consists almost entirely of syllogisms of the second kind \((2)\) and the third kind \((3)\), meaning that scientists tend to draw further inferences from evidence measured in experiments.

We call the method of reasoning that employs weak syllogisms plausible reasoning. Plausible reasoning is not deductive reasoning, but we still consider it to have some degree of validity. Its conclusions are no longer certain (either \(\text{True}\) or \(\text{False}\)) as in deductive reasoning, yet they still look plausible.

Let's look at a more complex example of plausible reasoning. A policeman on patrol in the dark of night notices a masked man climbing out of a broken window with a bag full of jewelry on his back, and immediately concludes that the man is a criminal. Again, this reasoning is not deductive, because an innocent explanation is entirely possible: for instance, the man may be the owner of the jewelry store, on his way home from a costume party without his keys, who broke the glass with a rock and took the jewelry out through the window in order to protect his own property.

Let the proposition \(A\) be "this man is a criminal" and \(B\) be "the man behaved as described above". Then the major premise of the policeman's reasoning is "if \(A\) is true, then \(B\) becomes more plausible", and the minor premise is "\(B\) is true":

\[\begin{aligned} &\text{If } A \text{ is true, then } B \text{ becomes more plausible} \\ &\underline{B \text{ is true}\qquad\qquad\qquad\qquad\qquad\qquad\quad}\\ &A \text{ becomes more plausible} \end{aligned}\tag{5} \]

Summarizing the characteristics of the plausible reasoning above, we see that all plausible reasoning relies on prior information to assess the plausibility of a new problem; this reasoning process is largely unconscious, and we call it common sense. Unlike the major premise in deductive reasoning, which is laid down as an axiom, the common sense behind plausible reasoning comes from a person's past experience and is subject to constant revision. In the policeman example, common sense is the policeman's past experience of law enforcement. To change his judgment, we only need to change that experience: for example, if this sort of thing happened many times every night and the man turned out to be completely innocent each time, the policeman would soon learn to ignore such incidents.

2 Analogies to physical theories

Above we mentioned that the policeman can revise the prior information behind his plausible reasoning as experience accumulates, and so eventually reach different conclusions. This way of thinking is in fact widely used in physics and across the whole of science. In physics, for example, we often explain some phenomena by building a mathematical model (such models are called physical theories), and then improve the model to explain additional phenomena. The progression

\[\text{Newtonian mechanics} \rightarrow \text{Special relativity} \rightarrow \text{General relativity} \]

embodies this idea. No one knows whether this process has a natural end or whether it will continue indefinitely.

We take a similar approach to understanding common sense, i.e., we build a mathematical model to reproduce some features of human common sense, and then in the future there may be a more comprehensive model to be used as a replacement for the current model. Again, we do not know if there is a natural end point to this process.

3 Thinking Computer

Indeed, whenever we construct a mathematical model that reproduces part of our common sense through a set of explicit operations, we are in effect showing how to build a thinking computer that operates on incomplete information and reasons plausibly rather than deductively, by applying a quantitative version of the weak syllogisms described above (the specific quantitative rules are given in Chapter 2).

Once we move from the qualitative to the quantitative, we can press the question further: in the policeman's weak syllogism \((5)\), what determines whether the plausibility of \(A\) increases dramatically, to the point of near certainty, or only by a negligible amount, making the data \(B\) almost irrelevant? The goal of this book is to develop a mathematical theory that answers this question with the greatest depth and generality currently possible.

We know that human reasoning is tinged with emotion and all sorts of grotesque misconceptions; these are psychological matters that we do not take into account. In fact, the actual workings of the human brain are so complex that we cannot hope to explain all of their mysteries. Therefore, instead of asking "How can we build a mathematical model of human common sense?", we should ask: "How can we build a machine that carries out useful plausible reasoning, following clearly defined principles that express an idealized common sense?"

4 Reasoning robots

We will invent an imaginary being whose brain is designed by us, so that it reasons according to certain definite rules. These rules are derived from simple desiderata that, in our view, a human brain reasoning plausibly should also satisfy. We believe that a rational person would want to revise his or her thinking upon realizing that it had violated one of these desiderata.

Our robot will reason about propositions. As mentioned earlier, we use italicized capital letters (A, B, C, etc.) to denote various propositions, and we tentatively require that the propositions used must have a clear meaning to the robot, and must belong to a simple, unambiguous type of logic (either true or false). That is, we focus only on binary logic.

5 Boolean algebra

To state these ideas more formally, we introduce some common notation of symbolic logic (or Boolean algebra), which includes the following three basic operations:

- Logical product (conjunction), denoted by \(AB\): the proposition "both A and B are true". Obviously, \(AB\) and \(BA\) mean the same thing.
- Logical sum (disjunction), denoted by \(A+B\): the proposition "at least one of A and B is true". Obviously, \(A + B\) and \(B + A\) mean the same thing.
- Negation, denoted by \(\overline{A}\): the proposition "\(A\) is false".

Given two propositions \(A\) and \(B\), if one is true if and only if the other is true, we say that they have the same truth value. This may be a simple tautology (i.e., \(A\) and \(B\) obviously say the same thing), or it may be that a difficult mathematical derivation finally proves that \(A\) is a necessary and sufficient condition for \(B\). From a logical point of view it does not matter which is the case: once we have established that \(A\) and \(B\) have the same truth value, they are logically equivalent propositions.

In Boolean algebra, the equals sign "=" indicates the same truth value rather than the same numerical value: the Boolean "equation" \(A=B\) asserts that the proposition on the left has the same truth value as the proposition on the right. As usual, we use the additional symbol "\(\equiv\)" to mean "equal by definition".

From the definitions of the three basic operations above, we obtain the following basic properties of Boolean algebra, stated as equations:

Idempotence:
- \(AA=A\),
- \(A + A = A\).
Commutativity:
- \(AB = BA\),
- \(A + B = B + A\).
Associativity:
- \(A(BC)=(AB)C=ABC\),
- \(A + (B + C) = (A + B) + C = A + B + C\).
Distributivity:
- \(A(B + C) = AB + AC\),
- \(A + (BC) = (A + B)(A + C)\).
Duality:
- if \(C=AB\), then \(\overline{C}=\overline{A} + \overline{B}\),
- if \(D = A + B\), then \(\overline{D}=\overline{A}\space\overline{B}\).
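
Because each proposition can only be true or false, every identity in the list above can be checked by exhausting all truth assignments. The following is a minimal sketch of such a check; the dictionary `IDENTITIES`, the helper `holds`, and the use of Python's `and`/`or`/`not` to stand for product, sum and negation are illustrative choices, not notation from the book.

```python
from itertools import product

# Each entry pairs the left- and right-hand side of one identity,
# written as Python functions of three truth values A, B, C.
IDENTITIES = {
    "AA = A": (lambda A, B, C: A and A, lambda A, B, C: A),
    "A + A = A": (lambda A, B, C: A or A, lambda A, B, C: A),
    "AB = BA": (lambda A, B, C: A and B, lambda A, B, C: B and A),
    "A + B = B + A": (lambda A, B, C: A or B, lambda A, B, C: B or A),
    "A(BC) = (AB)C": (lambda A, B, C: A and (B and C),
                      lambda A, B, C: (A and B) and C),
    "A + (B + C) = (A + B) + C": (lambda A, B, C: A or (B or C),
                                  lambda A, B, C: (A or B) or C),
    "A(B + C) = AB + AC": (lambda A, B, C: A and (B or C),
                           lambda A, B, C: (A and B) or (A and C)),
    "A + BC = (A + B)(A + C)": (lambda A, B, C: A or (B and C),
                                lambda A, B, C: (A or B) and (A or C)),
    "not(AB) = notA + notB": (lambda A, B, C: not (A and B),
                              lambda A, B, C: (not A) or (not B)),
    "not(A + B) = notA notB": (lambda A, B, C: not (A or B),
                               lambda A, B, C: (not A) and (not B)),
}

def holds(lhs, rhs):
    """True if both sides agree on all 2**3 truth assignments."""
    return all(lhs(*v) == rhs(*v) for v in product([True, False], repeat=3))

results = {name: holds(lhs, rhs) for name, (lhs, rhs) in IDENTITIES.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

A truth-table check of this kind only confirms the identities; it does not replace the algebraic derivations used in the text.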

Applying these properties leads to further conclusions, some of which are not obvious. For example, we will use the following proposition as a basic theorem:

\[\text{If } \overline{B} = AD, \text{ then } A\overline{B} = \overline{B} \text{ and } B\overline{A} = \overline{A} \]

Proof:

Multiplying both sides of \(\overline{B}=AD\) by \(A\) and using idempotence gives \(A\overline{B}=AAD=AD=\overline{B}\).
Negating both sides of \(\overline{B}=AD\) and applying duality gives \(B=\overline{A} + \overline{D}\). Multiplying both sides by \(\overline{A}\) gives \(B\overline{A}=\overline{A}\space\overline{A} + \overline{A}\space \overline{D} = \overline{A} + \overline{A}\space \overline{D}\). If \(\overline{A}\) is true this sum is true, and if \(\overline{A}\) is false both of its terms are false, so the sum has the same truth value as \(\overline{A}\); hence \(B\overline{A}=\overline{A}\).
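
As a cross-check that does not depend on any particular algebraic manipulation, the theorem can also be tested over all truth assignments of \(A\), \(B\) and \(D\). This is a small sketch; the function name `check_theorem` is hypothetical, and Python booleans stand in for truth values.

```python
from itertools import product

def check_theorem():
    """Return True if, for every assignment satisfying not-B = A D,
    both A not-B = not-B and B not-A = not-A hold."""
    for A, B, D in product([True, False], repeat=3):
        if (not B) != (A and D):
            continue  # premise fails, so the theorem says nothing here
        if (A and not B) != (not B):
            return False
        if (B and not A) != (not A):
            return False
    return True

print(check_theorem())
```

Only the assignments that satisfy the premise \(\overline{B}=AD\) are tested, which mirrors the "if ... then" form of the theorem.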

Next we look at implication. The proposition

\[A \Rightarrow B \]

is read as "\(A\) implies \(B\)". It does not assert that either \(A\) or \(B\) is true; it only means that \(A\overline{B}\) is false (i.e., \(\overline{A}+B\) is true). This can also be written as the logical equation \(A=AB\). That is, if \(A\) is true, then \(B\) must be true; or if \(B\) is false, then \(A\) must be false. This is exactly what the strong syllogisms \((1)\) and \((2)\) express.

In addition, the implication relation says nothing about \(B\) if \(A\) is false, and nothing about \(A\) if \(B\) is true; yet these are exactly the cases in which the weak syllogisms \((3)\) and \((4)\) do convey some information.

Pitfall: note that in ordinary language "\(A\) implies \(B\)" says that \(B\) can be logically deduced from \(A\), but in formal logic it only means that the propositions \(A\) and \(AB\) have the same truth value (so our use of the term "implication" is actually misleading, as the following example shows).

In addition, \(A \Rightarrow B\) is false only if \(A\) is true and \(B\) is false; in all other cases \(A \Rightarrow B\) is true (including the cases where \(A\) is false). If \(A\) is false, then for any \(Q\), \(AQ\) is also false, so \(A=AB\) and \(A=A\overline{B}\) are both true, and hence \(A \Rightarrow B\) and \(A \Rightarrow \overline{B}\) are both true. Thus a false proposition implies all propositions. If we tried to interpret implication as logical deducibility (i.e., both \(B\) and \(\overline{B}\) can be deduced from \(A\)), then every false proposition would be logically self-contradictory. But the fact that a proposition is false does not mean that it is self-contradictory (i.e., necessarily false). For example, checking Wikipedia shows that the proposition "Beethoven lived longer than Berlioz" is false, but it is not self-contradictory, because the proposition could have been true (Beethoven did live longer than many of Berlioz's contemporaries).
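
The formal meaning of implication discussed above can be tabulated directly. The sketch below (the name `implies` and the variable names are illustrative) prints the truth table of \(A \Rightarrow B\) next to the two equivalent forms mentioned in the text, "\(A\overline{B}\) is false" and the logical equation \(A=AB\), and then checks that a false \(A\) implies both \(B\) and \(\overline{B}\).

```python
from itertools import product

def implies(A, B):
    """A => B in the formal sense: false only when A is true and B is false."""
    return (not A) or B

rows = []
for A, B in product([True, False], repeat=2):
    as_equation = (A == (A and B))      # the logical equation A = AB
    a_notb_false = not (A and not B)    # "A not-B is false"
    rows.append((A, B, implies(A, B), as_equation, a_notb_false))
    print(A, B, implies(A, B), as_equation, a_notb_false)

# A false proposition implies every proposition, B and not-B alike.
false_implies_all = all(
    implies(False, B) and implies(False, not B) for B in (True, False)
)
print(false_implies_all)
```

The last three columns of each printed row agree, which is all that the formal statement \(A \Rightarrow B\) asserts; nothing in the table says that \(B\) is deducible from \(A\).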

6 Complete sets of operations

Using the four logical operations introduced above, namely the logical product (conjunction) \(AB\), the logical sum (disjunction) \(A+B\), implication \(A\Rightarrow B\), and negation \(\overline{A}\), we can start from two propositions \(A\) and \(B\) and generate an arbitrary number of new propositions, such as:

\[C\equiv (A + \overline{B}) (\overline{A} + A\overline{B}) + \overline{A}B(A+B) \]

Our question now is: how many new propositions can be generated in this way? Are there infinitely many, or is there a finite set that is closed under these operations? Are these four operations over-complete, so that some of them can be omitted? More generally, if instead of just two propositions \(A\) and \(B\) we have \(n\) propositions \(A_1,\cdots, A_n\), is this set of operations sufficient to generate every possible logical function of the variables \(\{A_1, \cdots, A_n\}\) (known in the philosophy of language as propositional variables)?

Note: We specify that these logical functions can only take the two values \(\text{T}/\text{F}\), and that all of their arguments can only take these two values; such functions are known in the philosophy of language as truth functions^[4].

Let us first consider the simple case of a logical function \(C=f(A, B)\) of two propositions. How many such functions are there? The domain of this function is a discrete space \(S=\{\text{T}\text{T}, \text{T}\text{F}, \text{F}\text{T}, \text{F}\text{F}\}\) of size \(2^2=4\), and each of its points (a 2-dimensional vector) can be assigned either of the values in \(\{\text{T}, \text{F}\}\), so there are \(2^4=16\) different logical functions \(f(A, B)\) in total.

Generalizing further, a logical function of \(n\) propositions \(\{A_1,\cdots, A_n\}\) has as its domain a discrete space \(S\) of size \(2^n\), so there are \(2^{2^n}\) different logical functions.
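
The counting argument above can be reproduced by direct enumeration for small \(n\). The following sketch represents each logical function as a dictionary from input points to output values; the generator name `truth_functions` is an illustrative choice.

```python
from itertools import product

def truth_functions(n):
    """Enumerate every function from {T, F}^n to {T, F} as a dict
    mapping each input point to an output value."""
    points = list(product([True, False], repeat=n))  # the 2**n points of S
    for outputs in product([True, False], repeat=len(points)):
        yield dict(zip(points, outputs))

for n in (1, 2, 3):
    count = sum(1 for _ in truth_functions(n))
    print(n, count, 2 ** (2 ** n))
```

Each printed row shows \(n\), the number of enumerated functions, and the value of \(2^{2^n}\) for comparison. The enumeration grows doubly exponentially, so it is only practical for very small \(n\).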

For the special case \(n=1\), i.e., \(C=f(A)\) (whose domain is the one-dimensional discrete space \(S=\{\text{T}, \text{F}\}\) of size \(2^1\)), there are \(2^2=4\) different logical functions (call them \(\{f_1(A), \cdots, f_4(A)\}\)). They can be defined by enumeration in a truth table:

\(A\)	\(\text{T}\)	\(\text{F}\)
\(f_1(A)\)	\(\text{T}\)	\(\text{T}\)
\(f_2(A)\)	\(\text{T}\)	\(\text{F}\)
\(f_3(A)\)	\(\text{F}\)	\(\text{T}\)
\(f_4(A)\)	\(\text{F}\)	\(\text{F}\)

By careful observation, we can construct these functions:

\[\begin{aligned} f_1(A) &= A + \overline{A}\\ f_2(A) &= A \\ f_3(A) &= \overline{A}\\ f_4(A) &= A\overline{A} \end{aligned} \]

In this way, we use a constructive proof to show that three operators (disjunction, conjunction and negation) are sufficient to generate all logical functions of a single proposition.

For general \(n\), we first consider some special functions, each of which is \(\text{T}\) at one and only one point of the domain \(S\) and \(\text{F}\) at all other points. For \(n=2\) there are \(2^2=4\) such special functions (one for each point of the domain \(S=\{\text{TT}, \text{TF}, \text{FT}, \text{FF}\}\) of size \(2^2\)):

\(A, B\)	\(\text{TT}\)	\(\text{TF}\)	\(\text{FT}\)	\(\text{FF}\)
\(f_1(A, B)\)	\(\text{T}\)	\(\text{F}\)	\(\text{F}\)	\(\text{F}\)
\(f_2(A, B)\)	\(\text{F}\)	\(\text{T}\)	\(\text{F}\)	\(\text{F}\)
\(f_3(A, B)\)	\(\text{F}\)	\(\text{F}\)	\(\text{T}\)	\(\text{F}\)
\(f_4(A, B)\)	\(\text{F}\)	\(\text{F}\)	\(\text{F}\)	\(\text{T}\)

These functions can be constructed as basic conjunctions:

\[\begin{aligned} f_1(A, B) &= AB\\ f_2(A, B) &= A\overline{B} \\ f_3(A, B) &= \overline{A} B\\ f_4(A, B) &= \overline{A}\space \overline{B} \end{aligned} \]

We can regard these four special functions as a basis of the function space formed by the functions \(S\rightarrow\{\text{T}, \text{F}\}\). We can compose arbitrary functions from these basis functions.

Consider, for example, any function that is true at certain specified points of \(S\), such as the following:

\(A, B\)	\(\text{TT}\)	\(\text{TF}\)	\(\text{FT}\)	\(\text{FF}\)
\(f_5(A, B)\)	\(\text{F}\)	\(\text{T}\)	\(\text{F}\)	\(\text{T}\)
\(f_6(A, B)\)	\(\text{T}\)	\(\text{F}\)	\(\text{T}\)	\(\text{T}\)

We find that \(f_5(A, B)\) and \(f_6(A, B)\) can both be written as logical sums of basis functions:

\[\begin{aligned} f_5(A, B) &= f_2(A, B) + f_4(A, B)\\ & = A\overline{B} + \overline{A}\space \overline{B}\\ & = (A + \overline{A})\overline{B}\\ &= \overline{B} \end{aligned} \\ \begin{aligned} f_6(A, B) &= f_1(A, B) + f_3(A, B) + f_4(A, B)\\ & = AB + \overline{A}B + \overline{A}\space \overline{B}\\ & = (A + \overline{A})B + \overline{A}\space \overline{B}\\ &= B + \overline{A}\space \overline{B}\\ & = \overline{A} + B \end{aligned} \]

(The step from \(B + \overline{A}\space\overline{B}\) to \(\overline{A} + B\) uses distributivity, i.e., \(B + \overline{A}\space \overline{B} = (B + \overline{A}) (B + \overline{B})=\overline{A} + B\).)

In fact, any logical function that is true at at least one point of \(S\) can be constructed as a logical sum of the basis functions above; there are \(2^4 - 1\) such functions. The remaining one, which is false at every point, can simply be defined as the contradiction \(f_{16}(A, B)\equiv A \overline{A}\) (analogous to the zero vector of a vector space).
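
The construction described above, writing a function as the logical sum of the basic conjunctions at its true points, can be sketched in a few lines for \(n=2\). The names `POINTS`, `basic_conjunction` and `dnf` are hypothetical helpers introduced for this illustration.

```python
from itertools import product

POINTS = list(product([True, False], repeat=2))  # TT, TF, FT, FF

def basic_conjunction(point):
    """Function that is true only at `point` (e.g. A not-B for the point TF)."""
    a0, b0 = point
    return lambda A, B: (A == a0) and (B == b0)

def dnf(table):
    """Logical sum of the basic conjunctions at the true points of `table`."""
    terms = [basic_conjunction(p) for p in POINTS if table[p]]
    # An empty sum is always false, matching the contradiction A not-A.
    return lambda A, B: any(term(A, B) for term in terms)

# Rebuild every one of the 16 truth tables from basic conjunctions.
reproduced = 0
for outputs in product([True, False], repeat=len(POINTS)):
    table = dict(zip(POINTS, outputs))
    f = dnf(table)
    if all(f(*p) == table[p] for p in POINTS):
        reproduced += 1
print(reproduced, "of", 2 ** len(POINTS), "truth tables reproduced")

# f_5 from the text is true at TF and FF, so it should agree with not-B.
f5 = dnf(dict(zip(POINTS, [False, True, False, True])))
print(all(f5(A, B) == (not B) for A, B in POINTS))
```

The sketch treats the all-false function as an empty sum rather than defining it separately as \(A\overline{A}\); the two have the same truth table.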

This approach (known in logic textbooks as "reduction to disjunctive normal form (DNF)"^[5]) works for any \(n\). For example, in the case \(n=5\) there are \(2^5=32\) basic conjunctions

\[\{ABCDE, ABCD\overline{E}, ABC\overline{D}E, \cdots, \overline{A}\space \overline{B}\space \overline{C}\space\overline{D}\space\overline{E}\} \]

and \(2^{32}\) different logical functions \(f_i(A, B, C, D, E)\), of which \(2^{32}-1\) can be written as logical sums of the basic conjunctions; the remaining one is the contradiction

\[f_{2^{32}}(A, B, C, D, E) \equiv A\overline{A} \]

Up to this point, we have verified that the three operations {conjunction, disjunction, negation} (i.e., \(\{\text{AND}, \text{OR}, \text{NOT}\}\)) are sufficient to generate all possible logical functions; more succinctly, they form an adequate set. By the duality property of Boolean algebra, we can reduce this further to \(\{\text{AND}, \text{NOT}\}\), because

\[A + B = \overline{\overline{A}\space\overline{B}} \]

At this point, the two operations (\(\text{AND}\) and \(\text{NOT}\)) already constitute an adequate set for deductive logic (this fact will be essential when we ask whether we have an adequate set of rules for plausible reasoning; see Chapter 2 for details).

Although we cannot drop either of these two operations (neither can be expressed in terms of the other), it is still possible that both conjunction and negation can be reduced to some other operation not yet introduced, so that a single logical operation forms an adequate set.

Both NAND and NOR can accomplish this. The NAND operation is defined as the negation of AND:

\[A\uparrow B = \overline{AB} = \overline{A} + \overline{B} \]

It can be read as "A NAND B". We immediately obtain:

\[\begin{aligned} \overline{A} &= A \uparrow A,\\ AB &= (A \uparrow B) \uparrow (A \uparrow B), \\ A + B &= (A \uparrow A) \uparrow (B \uparrow B) \end{aligned} \]

Therefore, every logical function can be constructed using NAND alone. The same holds for the NOR operation \(A\downarrow B \equiv \overline{A + B} = \overline{A}\space \overline{B}\).
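
The three NAND constructions above can be confirmed by exhausting the four truth assignments. In this sketch the function names `nand`, `not_`, `and_` and `or_` are illustrative; the bodies of the last three use nothing but `nand`.

```python
from itertools import product

def nand(A, B):
    """A NAND B: the negation of the conjunction AB."""
    return not (A and B)

def not_(A):
    return nand(A, A)

def and_(A, B):
    return nand(nand(A, B), nand(A, B))

def or_(A, B):
    return nand(nand(A, A), nand(B, B))

nand_ok = all(
    not_(A) == (not A) and and_(A, B) == (A and B) and or_(A, B) == (A or B)
    for A, B in product([True, False], repeat=2)
)
print(nand_ok)
```

Since NOT, AND and OR are all recovered from NAND, and those three operations generate every logical function, NAND alone is enough; the analogous check for NOR follows the same pattern.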

One of the standard components of logic circuits is the "quad NAND gate", i.e., an integrated circuit containing four independent NAND gates on a single semiconductor chip. Given a sufficient number of these, no other circuit components are needed to generate any desired logical function by interconnecting them in various ways.

Note: In the philosophy of language it has also been pointed out that all truth functions of a given set of atomic propositions can be constructed from either of the two functions "not-\(p\) or not-\(q\)" and "not-\(p\) and not-\(q\)". Wittgenstein's proof uses the latter, and runs roughly as follows: select any set of atomic propositions and negate them all (note that "not-\(p\)" is equivalent to "not-\(p\) and not-\(p\)"), then select any set of the propositions now obtained, together with any of the original propositions, and so on indefinitely (cf. Wittgenstein's Tractatus Logico-Philosophicus, Propositions 6 to 6.02)^[4]. In this way, all non-atomic propositions can be derived from the atomic propositions by a uniform process.

7 Basic desiderata

We now turn to our extended logic, which will be derived from certain conditions. We call these conditions "desiderata" rather than axioms, because they do not assert that anything is true; they merely state goals that appear desirable for the extended logic.

As stated above, for each proposition it reasons about, our robot assigns a degree of plausibility based on the evidence we have given it, and revises these assignments whenever it receives new evidence. In order to store and modify these plausibilities in its "brain" circuits, they must be associated with some definite physical quantity, such as a voltage, a pulse duration, or a binary-coded number. These are details for the engineers; for our purposes, this means that there must be some correspondence between degrees of plausibility and real numbers:

\[(Ⅰ) \space \text{Degrees of plausibility are represented by real numbers.} \]

We adopt a natural convention: greater plausibility corresponds to a larger number. For convenience, we also assume a continuity property: an infinitesimally greater plausibility should correspond to only an infinitesimally greater number.

As we said earlier, the plausibility that the robot assigns to some proposition \(A\) will in general depend on whether we tell it that another proposition \(B\) (the evidence) is true. We use the symbol

\[A \mid B \]

to express this, read as "the conditional plausibility that \(A\) is true, given that \(B\) is true" or simply "\(A\) given \(B\)"; it stands for some real number. Thus,

\[A \mid BC \]

(which can be read as "\(A\) given \(BC\)") denotes the plausibility that \(A\) is true, given that both \(B\) and \(C\) are true. Furthermore,

\[A + B \mid CD \]

denotes the plausibility that at least one of \(A\) and \(B\) is true, given that both \(C\) and \(D\) are true, and so on.

We have decided to use larger numbers to represent greater plausibility, thus

\[(A \mid B) > (C \mid B) \]

means "given \(B\), \(A\) is more plausible than \(C\)" (the parentheses are added for clarity).

In order to avoid unsolvable problems, we will not ask the robot to reason from contradictory premises, for which no correct answer could exist. Therefore, whenever the symbol \(A \mid BC\) appears, we assume that \(B\) and \(C\) are compatible propositions.

Also, we do not want this robot to think in a way that is contrary to human thinking, so we design it to reason in a manner that is at least qualitatively similar to human reasoning, as in the weak syllogisms described above.

Thus, suppose the old information \(C\) is updated to \(C^{\prime}\) in such a way that the plausibility of \(A\) increases:

\[(A\mid C^{\prime}) > (A \mid C) \]

but the plausibility of \(B\) given \(A\) does not change:

\[(B\mid A C^{\prime}) = (B\mid AC) \]

This, of course, can only increase the plausibility that \(A\) and \(B\) are both true, and cannot decrease it:

\[(AB\mid C^{\prime}) \geqslant (AB\mid C) \]

In addition, the plausibility that \(A\) is false must decrease:

\[(\bar{A}\mid C^{\prime}) < (\bar{A}\mid C) \]

These qualitative requirements simply give the robot's reasoning a "sense of direction"; they say nothing about how much the plausibilities change, except that our continuity assumption requires that a small change in \(A\mid C\) can induce only small changes in \(AB\mid C\) and \(\overline{A}\mid C\). The specific way in which these qualitative requirements are used will be given in Chapter 2; for now we summarize them briefly as:

\[(Ⅱ) \space \text{Qualitative correspondence with common sense.} \]

Finally, we want the robot to have another desirable property: it always reasons consistently:

\[(Ⅲ) \space \text{The robot reasons consistently.} \]

It should be emphasized that the word "consistent" carries the following three common meanings:

\[\begin{aligned} & (Ⅲ\text{a}) \space \text{If a conclusion can be reasoned out in more than one way, then every possible way must lead to the same result.} \\ & (Ⅲ\text{b}) \space \text{The robot always takes into account all of the evidence it has that is relevant to a question, and does not arbitrarily ignore some of the information.} \\ & (Ⅲ\text{c}) \space \text{The robot always represents the same state of knowledge by assigning the same plausibility.} \end{aligned} \]

Desiderata \((Ⅰ)(Ⅱ)(Ⅲ\text{a})\) are the basic "structural" conditions on the inner workings of the robot's "brain", while \((Ⅲ\text{b})(Ⅲ\text{c})\) are "interface" conditions that specify how the robot's behavior should relate to the external world.

In fact, the desiderata above uniquely determine the robot's rules of reasoning: there is only one set of mathematical operations for manipulating plausibilities that has all of these properties. These rules will be derived in Chapter 2.

8 Commentary

No real numbers?

There is much more to explore about the theory in this chapter. In the end, the human brain and the robot's brain differ. For example, desideratum \((Ⅰ)\) says that the robot's mental state about any proposition is represented by a single real number. For the human brain, however, our attitude toward a given proposition may have more than one "dimension": we care not only about whether it is plausible, but also about whether it is desirable, important, useful, amusing, and so on. If we assume that each of these judgments can be expressed as a number, then a sufficiently complete description of a human mental state would be a vector in a multidimensional space. Not all propositions require multiple dimensions. For example, the proposition "The refractive index of water is less than 1.3" evokes no emotion, so the mental state it produces has very few dimensions. The proposition "Your mother-in-law just wrecked your new car", however, triggers a multidimensional mental response.

We could even dispense with real numbers altogether and represent degrees of plausibility directly by a "comparative" theory based on qualitative ordering relations (e.g., \((A \mid C) > (B \mid C)\)); compare the pairwise and pointwise losses used in recommender systems^[6].

Ordinary language and formal logic

We tend to thinkFormal logic statements must be better than theOrdinary language More accurate, but not really.

In particular, because ordinary language is routinely used for purposes other than stating logic, it can express subtle shades of meaning and hint at things without saying them outright; formal logic cannot. For example, Mr. A, to affirm his objectivity, says, "I believe what I see." (Reduced to an implication: "If I see something, then I believe it.") Mr. B retorts, "He doesn't see what he doesn't believe." (Reduced to an implication: "If he doesn't believe a thing, then he doesn't see it.") From the standpoint of formal logic they appear to have said the same thing, since the two implications are contrapositives of each other; from the standpoint of ordinary language, however, the two sentences are intended to convey, and do convey, opposite meanings (the latter seems to mock the narrowness of the former's field of view).
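That the two implications agree in formal logic can be checked with a truth table. The following is a minimal sketch, assuming the two remarks are modeled as material implications over two Boolean variables; the names `see` and `believe` are introduced here only for illustration.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


rows = []
for see, believe in product([False, True], repeat=2):
    statement_a = implies(see, believe)          # "If I see it, then I believe it."
    statement_b = implies(not believe, not see)  # "If I don't believe it, I don't see it."
    rows.append((see, believe, statement_a, statement_b))

# The two statements have the same truth value in every row.
equivalent = all(a == b for _, _, a, b in rows)
print(equivalent)  # True
```

The table captures only the truth conditions; the difference in intent that ordinary language conveys has no place in it.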

Here is a more instructive example, taken from a mathematics textbook. Let \(L\) be a straight line in a plane and \(S\) an infinite set of points in that plane, each of which is projected onto \(L\). Now consider the following statements:

\[\begin{aligned} & \text{(Ⅰ) The projection of the limit is the limit of the projections.} \\ & \text{(Ⅱ) The limit of the projections is the projection of the limit.} \end{aligned}\]

These have the grammatical structures "\(A\) is \(B\)" and "\(B\) is \(A\)" (in the philosophy of language, "is" here serves as the sign of identity^[4]), and so they appear to be logically equivalent. In that textbook, however, (Ⅰ) is considered correct while (Ⅱ) is in general incorrect, on the grounds that the limit of the projections may exist when the limit of the set does not.
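A concrete case makes the asymmetry visible. The sketch below is an illustration under stated assumptions, not the textbook's own example: it takes \(L\) to be the x-axis, enumerates \(S\) as the hypothetical sequence \(P_n = (1/n, (-1)^n)\), and reads "limit" as the limit of that sequence.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point(n: int) -> Point:
    """Hypothetical n-th point of S: x shrinks to 0 while y alternates between -1 and +1."""
    return (1.0 / n, float((-1) ** n))


def project_onto_x_axis(p: Point) -> Point:
    """Orthogonal projection onto the line L, taken here to be the x-axis."""
    return (p[0], 0.0)


points: List[Point] = [point(n) for n in range(1, 2001)]
projections: List[Point] = [project_onto_x_axis(p) for p in points]

# Over the last 100 terms the projections are all very close to (0, 0).
tail_projection_spread = max(abs(x) for x, _ in projections[-100:])

# Over the same terms, consecutive points still jump by 2 in the y-direction,
# so the points themselves do not settle toward any single limit.
tail = points[-100:]
tail_point_gap = max(abs(tail[i + 1][1] - tail[i][1]) for i in range(len(tail) - 1))
```

Here the limit of the projections exists (the origin), but there is no limit of the points to project, so statement (Ⅱ) has nothing to refer to, while statement (Ⅰ) presupposes that the limit exists before saying anything about it.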

As this example shows, we already read subtle nuances into the exact wording of ordinary language, probably without realizing it. We in effect interpret "\(A\) is \(B\)" as first asserting, as a kind of major premise, that \(A\) exists, and we take the rest of the statement to be conditional on that premise. In other words, in the grammar of ordinary language the verb "is" implies a distinction between subject and object, a distinction that the two sides of the equals sign "=" do not have in formal logic or conventional mathematics (the "=" of computer languages being an exception).
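The exception for computer languages is easy to see in a small sketch. The variable name `j` is just an illustration; the point is that in an assignment the two sides of "=" play different roles, unlike the two sides of a mathematical equation.

```python
j = 1

# Read as a mathematical equation, "j = j + 1" could never hold for a number.
claim_as_equality = (j == j + 1)

# Read as an assignment, it is perfectly sensible:
# evaluate the right-hand side first, then bind the result to the name on the left.
j = j + 1
```

After these lines `claim_as_equality` is `False` and `j` is `2`: the right-hand side is a value, the left-hand side is a place to store it.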

Another interesting example is the old adage "knowledge is power", which is a very cogent truth both in human relations and in thermodynamics (for the thermodynamic meaning, see the well-known Landauer principle, under which erasing 1 bit of information dissipates at least \(kT\ln 2\) of energy^[7]). But the ad copywriter for a chemical trade journal^[8] who turned the phrase into "power is knowledge" was absurdly, outrageously wrong.
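To get a feel for the size of \(kT\ln 2\), the bound can be evaluated numerically. This is a minimal sketch; the function name and the choice of 300 K as "room temperature" are assumptions made for the example.

```python
import math

BOLTZMANN_K = 1.380649e-23  # J/K, exact by definition in the SI since 2019


def landauer_limit(temperature_kelvin: float, bits: float = 1.0) -> float:
    """Minimum energy in joules dissipated when erasing `bits` bits at the given temperature."""
    return bits * BOLTZMANN_K * temperature_kelvin * math.log(2)


energy_per_bit = landauer_limit(300.0)  # assumed room temperature
print(f"{energy_per_bit:.2e} J")  # about 2.87e-21 J
```

The value is tiny on everyday scales, which is why the link between information and energy goes unnoticed in ordinary experience.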

In English, the verb "is", like any other verb, takes a subject and a predicate, but it has two entirely different meanings. Consider the following two sentences:

The room is noisy.
There is noise in the room.

The second sentence is ontological: here "is" appears as an expression of existence, asserting the physical existence of something. The first sentence is epistemological: here "is" appears as a linking verb (copula) and expresses only the speaker's personal perception.

Note: this is not limited to the English "is". The German "sein", besides serving as the sign of identity and as the copula "to be", can also express existence ("to exist", "there is"); cf. Wittgenstein's Tractatus Logico-Philosophicus, Proposition 3.323^[4].

Ordinary language (at least English and German, both of which are Germanic languages) has an almost universal tendency to disguise epistemological statements as ontological ones through their grammatical form. A major source of error in current probability theory is the failure to recognize this. To interpret an epistemological statement in the ontological sense is to assert that one's own thoughts and sensations are facts existing in nature; we call this the "mind projection fallacy". The problem is not confined to probability theory: much of the discourse of philosophers and Gestalt psychologists, as well as the attempts of some physicists to interpret quantum theory, is reduced to nonsense by falling repeatedly into the mind projection fallacy.

References

[1] Descartes R. Meditations on first philosophy: With selections from the objections and replies[M]. Oxford University Press, USA, 2008.
[2] Jaynes E T. Probability theory: The logic of science[M]. Cambridge University Press, 2003.
[3] Jaynes. Liao, H. R. Translation. Meditations on probability theory [M]. People's Posts and Telecommunications Publishing House, 2024.
[4] Wittgenstein L. Tractatus logico-philosophicus[J]. 2023.
[5] Lehman E, Leighton F T, Meyer A R. Mathematics for computer science[M]. Massachusetts Institute of Technology, 2010.
[6] Recommender Systems: A Fine-Tuned Multi-Objective Fusion and Hyperparametric Learning Approach
[7] "Why is information not energy?"
[8] LC-CG Magazine, March 1988, p. 211.