Processing math: 100%

2 Set Theory

Set theory and the first-order predicate logic are pretty much the foundation of modern mathematics. If logic gives us the ability to do reasoning, set theory gives us the objects to reason about. The combination of the two provides us with the strength of constructing rigorous but concise rationales. Set theory starts with some of the simplest ideas and extends them in a reasonably straightforward way to encompass some astonishingly complicated ones.

A set refers to a collection of objects.8 When we say “objects,” it means that they should be logically distinguishable. That is, if x and y are two objects, x=y and xy cannot hold simultaneously ( principle of contradiction and principle of the excluded third). When the definition of an object makes sense, it is known as showing that the object is well defined. The objects could be some elements/members, like red, white, blue, or some numbers 1,2,,n, or some variables, x1,x2,,xn. From now on, we will use the calligraphic letter to denote a set. For example, if X is a set of x1,x2,,xn, then we write the collection as X={x1,,xn}.

A set can have just one member, e.g., x, then this set {x} is called singleton. If an object x is a member of a set X, we write xX or say X contains x. If x is not a member of X, we denote xX. The empty set indicated by has no object. Here are some other definitions for the relations of sets:

  • Subset: AB means that if xA, then xB. It is the set-theoretic version of logical implication.9 Very often for AB, we also say a set A is included in a set B.

  • Union: AB means that if xAB, then xA or xB. It is the set-theoretic version of OR.

  • Intersection: AB means that if xAB, then xA and xB. It is the set-theoretic version of AND.

  • Complement: Ac means that if xAc, then xA. It is the set-theoretic version of logical negation.

  • Equivalence: If AB and BA, then two sets are equivalent A=B. This is set-theoretic version of logical bicondition.

The predicates can be simplified in the set-theoretic setting. When the sets consist of those elements satisfying some condition or possessing some specified property, we can define such a set as X={x|S(x)} where the condition S(x) is precisely the predicate in logic. Sometimes the expression can be written as X={x:S(x)}. The vertical line or the colon means “such that.” This set-theoretic expression makes the sentence more compact. If we have composed predicates such that 1) A(x) is true iff xA; 2) B(x) is true iff xB; then the combination A(x)B(x) is true iff A(x) and B(x) are both true. These composed predicates can be put in a simpler set-theoretic definition AB={x|xA and xB}. As we have seen, set theory is closely intertwined with first-order predicate logic. In general, these two can form a nicely closed formal system: sets provide objects for which the logic can talk about, and logic provides tools for talking about the sets and their objects.

2.1 Function

The predicates give some descriptions of a relationship between the variable x and S(x), the object represented by the variable. The usual way of establishing such a relation for numerical objects is thorough a function. A function is a rule that, for each element in some set X of possible inputs, assigns a potential output. More formally, the definition of a function relies on a set-theoretic operation called the Cartesian product.

  • Cartesian product and ordered pair: Cartesian product is a product of two non-empty sets X and Y, denoted as X×Y, is as follows X×Y={(x,y)|xX,yY}. The pair (x,y) is called ordered pair as the order matters, namely (x,y)(y,x) unless x=y.10 For a set, the order doesn’t matter. The set {x,y} and the set {y,x} contain the same objects, x and y so {x,y}={y,x}. The order pair in the set-theoretic form is (x,y)={{x},{x,y}}. For any distinct objects x and y, there is {{x},{x,y}}{{y},{x,y}}, thus (x,y)(y,x).

Cartesian product X×Y is a set of all ordered pairs (x,y) where x comes from X and y comes from Y. Cartesian product allows the set theory to express the concept of a predicate of more than one variable. The above definition of 2-fold product can be extended to an n-fold product, X1×X2××Xn or ni=1Xi in short. If all the sets in this n-fold product are identical X1==Xn=X, then ni=1Xi can be written as Xn. With the definition of the Cartesian product, we view the function as a rule that describes how to transform one set into another. That is, the Cartesian product characterizes this rule.

  • Function: A function f():XY gives a relation for each object in the set X to a single object in the set Y. The function f() is a set of ordered pair (x,f(x)) in X×Y such that fX×Y.11 In mathematics, map and mapping can be used interchangeably with function. Map can be used as a verb, emphasizing the action of a function on elements or subsets of its domain.

  • Image, domain and co-domain: For y=f(x), the output y is the image of the input x under f. If AX, the image of the input set A under f is the set f(A)={y|y=f(x),xA}. The set X is the domain of f and the set Y is the co-domain of f.

  • Injective, surjective and bijective function: f is called injective or one-to-one when f(x1)=f(x2) implies x1=x2. f:XY is called surjective or onto when yY, xX such that f(x)=y, namely f(X)=Y. The function is called bijective if it is both injective and surjective.

  • Inverse function: For a bijective function f():XY, its inverse function f1():YX has the property f(x)=yf1(y)=x and f1 is unique for each invertible f. The composition f1f(x)=f1(y)=x where denotes the composition of two functions.12 The composition can be used for any other function g as long as the domain of g is Y. Then two functions f():XY and g():YZ have their composition gf(x)=g(y)=z for zZ.

The use of functions is extensive. Different types of functions suit different purposes. Suppose that we need to check if an object x satisfies some desired features a member of the set X. In logic language, we have to define the statements S of the desired features and then use predicate S(x) to see whether it is true or false. With set theory and functions, the procedure of checking becomes simpler. We just need a particular function for testifying. The following function is called the indicator function 1X():X{0,1} such that 1X(x)={0 if xX,1 if xX. In section 1.2, we see the logical results, true and false, can be replaced by binary numbers 1 and 0. Then the indicator function 1X plays the exact role as the predicate for sentences presented by the set X. If the sentences are true under the value x, then the predicate 1X(x) is true or say 1X(x)=1 because the value is contained in the sentences xX. The following example gives another type of function useful for encrypting information.

2.2 Example: Cryptosystem

Caesar was said to have used a cryptosystem. In this system, he replaced each letter with the one three steps forward in the alphabet. That is, the bijective function f() maps each plain text letter to its cyphertext replacement:AD,BE,CF,,WZ,XA,YB,ZC namely, f(A)=D and f1(D)=A, etc. The domain and image of this function are both the alphabet X={A,B,,Z}. We denote the function as f:XX.

Suppose we encrypt the word WAY in a message by the function f(). For those who don’t know the inverse function f1(), the word ZDB makes nonsense. But for those who know f1(), the encrypted word WAY was known. This encryption/decryption procedure is one of the earliest and most straightforward. Some historical conjectures about Caesar’s assassination attributed his tragedy to the use of such a weak cryptosystem.

In WWII, the German crypto machine Enigma deployed double encryption. In a simplified version, one can consider that Enigma doubles the encryption procedures of Caesar’s cryptosystem. Suppose another function g() creates a “cycle” for {A,E} such that g(A)=E and g(E)=A. Assume that g() does nothing for the rest letters. The simple idea about double procedure encryption is the use of composition function gf such that gf(B)=g(E)=A.13 The real Enigma cryptosystem imposed a bijective condition, called permutation. Each permutation can be a product of disjoint cycles. The cycle-structure of g preserves under fgf1, where f is the cyclic permutation. For any g=fgf1, g and g are conjugate. The general idea is to encrypt the function g in its conjugate. If someone wants to decrypt the code, firstly, one needs f and f1 to decrypt g. Then with both f and g, one can proceed on the decryption. Although the double encryption procedure was more complicated than Caser’s, the Enigma cipher machine was “unlocked” by Polish intelligence in 1932, before the Blitzkrieg in Poland. So quite likely, this cryptosystem was not crypto far before its widespread use.

2.3 Order

Functions are sets of ordered pairs. Natural numbers are figures with an order. In real life, we are accustomed to ranking things in order. They are not necessarily numerical. One can give an order in the preference or can even establish some in society. Would it be possible to put everything in order? It seems an exaggerating question, but many people must have asked it. Set theory helps us to observe some features of order so that we can (partially) answer the previous question.

For a set X, if an order exists in the set, then any object xiX can be compared with another object xjX. If xi and xj are comparable, then the result would be either one of them is ranked in front of the other, or they both are equivalent. If we know which one is ranked first, we can put them in an ordered pair, such as (xi,xj) or (xj,xi). If they are equal, then two ordered pairs should be the same (x,y)=(y,x) as the order doesn’t matter for them. Thus the purpose is to find out the conditions that make two things comparable or equivalent. The following definitions are for this purpose.

  • Equivalence relation (): The set X×X with equivalence relation satisfies
  1. Reflexive: xx means (x,x)X×X for any xX.
  2. Symmetric: If xixj, then xjxi.
  3. Transitive: If xixj and xjxk, then xixk.14 2) and 3) can also be written in the following way 2) if (xi,xj)X×X for any xi,xjX, then (xj,xi)X×X. 3) if (xi,xj)X×X and (xj,xk)X×X for any xi,xj,xkX, then (xi,xk)X×X.
  • Total order (): The set X×X with a total order satisfies
  1. Reflexive: xx.
  2. Anti-symmetric: If xixj and xjxi, then xixj.
  3. Transitive: If xixj and xjxk, then xixk.
  4. Connex: Either xixj or xjxi is true.
  • Partial order, pre-order, total pre-order (): With above four conditions, if the set X×X satisfies {1),2),3),then  is a partial order.1),3),then  is a pre-order.1),3),4),then  is a total pre-order.

From the definitions, it is clear that the condition reflexivity along with (anti-)symmetry and transitivity define the comparable relation for a pair. Whether the relation is equivalence or it is in an ordered structure depends on whether the relation is symmetric or anti-symmetric. Connexity extends the partial order structure to any pair in the set so that partial order becomes total order.

It is not difficult to see that the set of natural numbers N has a total order (). The equivalence relation in N is simply equality (=). In Economics or choice theory, the equivalence relation is called indifferent relation. When someone is said to be indifferent between xi and xj, it means that this person may consider choosing either xi or xj, so in the choice set of this person xixj. The partial order is the usual order structure for the preferences over a set of possible choices. If the preferences only have transitivity and connexity, then in economics these preferences are rational.15 We will come back to this point in section []. When a partially ordered set X characterizs the preference, then for all choices xi,xjX, xjxi will be interpreted as “choice xi is at least as good as choice xj.” When a choice xiX is said to be Pareto optimal, then for any other choice xj in the same choice set X, there is xjxi. If more than one choice is said to be Pareto optimal, then the set of these choices is called the Pareto optimal set.

Let’s return to the previous question about putting everything in order. If a set can characterize things and if this set can at least be reflexive and transitive, then such things can indeed be put in (pre-)order. Unfortunately, these conditions are not always true for everything. Here is a counterexample that none of these conditions except reflexivity would hold. Consider the imaginary number i where i2=1. Neither 1i nor i1 makes sense. Thus the order relation should not be expected to exist everywhere. However, order relations, as long as they exist, would make things quantitatively analyzable.

The following section discusses one of the attempts in quantifying human’s decision and choice. The utility function, known as one of the roots in neoclassic economics, is to map one’s preference set to some positive real numbers. It becomes the fundamental of many emerging quantitative tools used in business, management, and related fields.

2.4 Miscellaneous: Utilitarianism, Happiness and Utility Function

The utilitarianism refers typically to the thoughts of Jeremy Bentham, John Stuart Mill, and their followers in the 19th century. The ideology lies in one principle called the greatest happiness principle, also names as the principle of utility, formulated by Bentham. The principle considers that the good is the general happiness. According to the principle, each individual always pursues his interests that are believed to be his happiness. The moral order of the community results from an equilibrium of such interests. Bentham believes that the principle of utility can give a criterion in morals and legislation.

Similar arguments about happiness existed amongst ancient Greek philosophers. Aristotle and Plato consider the good as happiness and also as an activity of the soul. They define the soul to consist of two kinds of virtues, intellectual and moral. Education enhances intellectual while good habits shape the moral. For good habits, Aristotle thinks that one finds happiness from acting “good”. Thus the priority of ethics and legislation is to define the good and the virtue. When legislator makes citizens acquire the habits of acting good, it will eventually lead them to find happiness in virtue.

In the 19th century, many utilitarians believed that pleasure and pain are states that can be, more or less, quantified. When people make choices, the choices reveal their preferences so that the order between the selected choice and the unselected ones becomes visible. The utility function is for numerically measuring the satisfaction or happiness caused by the choices. The input of the function is the choice set attached with an order, and the output of the function is the set of quantitative numerics such as positive real numbers. The utility function is an order-preserving function such that for u():X[0,), xjxi if and only if u(xj)u(xi). The first relation is usually a pre-order or a total pre-order for the choice set X, and the second relation is simply the inequality symbol for positive real numbers. All the values on [0,) are comparable, and inequality is a total order on [0,). The rule of the utility function is to reveal output numerics that preserve the order from the input choice set. The utility function initiated a calculus of pleasures and pains.

Page built: 2021-06-01