## 1.2 Functions

Informally, when we write $$f:X \to Y$$ or say ‘f is a function from X to Y’ we mean that f is a definite rule which associates to each element $$x \in X$$ a single element $$f(x)$$ of Y. Some times the word map is used in place of function - it means exactly the same thing.

There is a formal set-theoretic definition of function: a function f from a set X to a set Y is a subset $$f \subseteq X\times Y$$ with the property that for each $$x \in X$$ there is a unique $$y \in Y$$ such that $$\langle x,y\rangle \in f$$. This avoids the vaguness of “a definite rule” in our informal definition.

Definition 1.3 Let $$f: X \to Y$$ be a function.

• The set X is called the domain of f.
• The set Y is called the codomain of f.

Often people talk of functions in terms of input and output: if $$f: X \to Y$$ then when we give it an input $$x\in X$$, it gives an output $$f(x)$$. The x in f(x) is called the argument of f, and we talk about ‘applying’ a function f to an input x to get an output f(x)’.

Here’s a quick example of how functions are identified with subsets of a Cartesian product. Let $$X= \{0,1,2\}, Y = \{1, 2,3,4,5,6\}$$ and let f be the function from X to Y whose ‘rule’ is that for all $$x\in X$$ we have f(x)=2x+1. The description above was that a function was identified with the subset of $$X\times Y$$ consisting of all the pairs $$\langle x, f(x) \rangle$$, so $f = \{ \langle 0, 1\rangle, \langle 1, 3\rangle, \langle 2, 5\rangle\}.$

Two functions $$f,g : X \to Y$$ are said to be equal if and only if for all $$x \in X$$ we have $$f(x)=g(x)$$. This means that when we describe a function informally, by giving a rule, it isn’t the description of the rule that is important but merely which output f(x) it produces for a given input x. The functions: \begin{align*} f: \{0,1\} \to \{0,1\} & \;\;\; f(x) = x^2 & \text{and} \\ g: \{0,1\} \to \{0,1\} & \;\;\; g(x)=x^3 \end{align*} are equal, even though the rules look different, because f(0)=g(0) and f(1)=g(1).

Rather than use this set theoretic definition of a function, we will tend to define functions by specifying their output f(x) for a given input x. A common alternative notation is to write

\begin{align*} f: & X \to Y \\ & x \mapsto x^2 \end{align*}

to indicate that the function f has domain X and codomain Y, and sends an input x to the output $$x^2$$.

### 1.2.1 Function properties and composition

Definition 1.4 Let $$f:X \to Y$$ be a function.

• f is called injective or one to one if for all $$a,b \in X$$, if $$f(a)=f(b)$$ then $$a=b$$.
• The image of f, written $$\im f$$, is $$\{f(x): x \in X\}$$.
• f is called surjective or onto if $$\im f = Y$$.
• f is called a bijection if it is injective and surjective.

If we have a function $$f: X \to Y$$ and $$g: Y \to Z$$ then ‘do f then do g’ is a rule for getting from X to Z. The associated function is called the composition of f and g:

Definition 1.5 Let $$f:X \to Y$$ and $$g: Y \to Z$$ be functions. The composition of g and f, written $$g \circ f$$ is the function $$g\circ f: X \to Z$$ such that $$(g\circ f)(x)= g(f(x))$$.

Thus $$g\circ f$$ means ‘do f, then do g.’ Be careful to get this the correct way round. Notice that composition only makes sense when the codomain of f is the same as the domain of g.

Function composition is associative: if $$f:X \to Y, g: Y \to Z, h : Z \to W$$ then $h \circ (g \circ f) = (h \circ g) \circ f$ as functions from X to Z. The reason this is true is because both sides send an input $$x\in X$$ to the output $$h(g(f(x)))$$.

One important function from a set X to itself is the identity function $$\id_X$$, which does nothing: it is defined by $$\id_X(x) = x$$ for all $$x \in X$$.

Definition 1.6 Let $$f:X \to Y$$ and $$g: Y \to X$$ be functions.

• We say that g is a left inverse to f, and that f is a right inverse to g, if $$g\circ f = \id_X$$.
• We say that f is invertible if there is a function $$h:Y \to X$$ such that $$f\circ h = \id_Y$$ and $$h \circ f=\id_X$$.

A left inverse to f is a function $$g:Y \to X$$ such that $$g\circ f=\id_x$$, and a right inverse to f is a function $$h:Y \to X$$ such that $$f\circ h = \id_Y$$. In fact, if f has a left inverse g and a right inverse h then it must be that g=h: given any $$y \in Y$$, $g(y) = g(f(h(y)) = h(y)$ with the first equality because f(h(y))=y as h is right inverse to f, and the second equality because g(f(x))=x for any x as g is left inverse to f. This shows that if f has a left inverse and a right inverse, it is invertible.

It also follows that if f is invertible then there is one and only one function which is a left and right inverse to f. We write this function $$f^{-1}$$ and call it the inverse to f.

We can connect the properties of left and right invertibility to those of injectivity and surjectivity:

Proposition 1.1 Let $$f: X \to Y$$ be a function.

1. f has a left inverse if and only if it is injective.
2. f has a right inverse if and only if it is surjective.
3. f is invertible if and only if it is a bijection.

Proof.

• (1, only if). Suppose f has a left inverse g, and that $$f(a)=f(b)$$. Then $$g(f(a))=g(f(b))$$, and since $$g\circ f = \id_X$$ we have $$a=b$$, which shows f is injective.
• (1, if). Suppose f is injective. Notice that if $$y \in \im f$$ then there is a unique $$x \in X$$ such that $$f(x)=y$$, by injectivity. Choose any $$x_0 \in X$$, and define $$g:Y \to X$$ by $g(y) = \begin{cases} \text{the unique } x \text{ such that } f(x)=y & y \in \im f \\ x_0 & y \notin \im f \end{cases}$ Then g is left inverse to f.
• (2, only if) Suppose f has a right inverse g and let $$y \in Y$$. Then $$f(g(y))=y$$ as $$f\circ g = \id_Y$$. This shows $$y \in \im f$$, so $$\im f=Y$$ and f is onto.
• (2, if) Suppose f is surjective. Then every $$y \in Y$$ is in the image of f, so for each $$y \in Y$$ pick an element $$g(y) \in X$$ such that $$f(g(y))=y$$. Then g is right inverse to f.
• (3, only if) If f is invertible then it has a left inverse, so is injective by part 1, and it has a right inverse, so is surjective by part 2. Thus f is a bijection.
• (3, if) If f is a bijection then it has a left inverse by part 1 and a right inverse by part 2, so it is invertible as we showed in the text before this proposition.

Proposition 1.2 If $$f:X \to Y$$ and $$g:Y \to Z$$ are both invertible then so is $$g\circ f$$, and its inverse is $$f^{-1}\circ g^{-1}$$.
Proof. Notice that \begin{align*} (f^{-1}\circ g^{-1}) \circ (g \circ f) &= f^{-1} \circ (g^{-1}\circ g) \circ f &\text{associativity} \\ &= f^{-1}\circ \id_Y \circ f &\text{definition of inverse}\\ &= f^{-1}\circ f &\text{as } \id_Y\circ f=f \\ &= \id_X \end{align*} and similarly $$(g\circ f)\circ (f^{-1}\circ g^{-1})=\id_Z$$, so $$f^{-1}\circ g^{-1}$$ is a two-sided inverse to $$f\circ g$$.

Using this repeatedly, if $$f_1, f_2, \ldots, f_n$$ are invertible and the composition $$f_1\circ f_2 \circ \cdots \circ f_n$$ makes sense, then it is invertible with inverse $$f_n^{-1}\circ f_{n-1}^{-1} \circ \cdots \circ f_1^{-1}$$.

Since a function is invertible if and only if it is a bijection, this tells us that the composition of two bijections is again a bijection.

### 1.2.2 Functions on finite sets

Theorem 1.3 Let X and Y be finite sets.

1. If $$f: X \to Y$$ is injective then $$|X| \leq |Y|$$.
2. If $$f: X \to Y$$ is surjective then $$|X| \geq |Y|$$.
3. If $$f: X \to Y$$ is bijective then $$|X| = |Y|$$.

Proof. Let $$X = \{x_1, \ldots, x_m\}$$ have size m.

1. $$\{f(x_1), \ldots, f(x_m)\}$$ is a subset of $$Y$$ with size $$m$$, because $$f$$ is injective, so $$|Y| \geq m$$.
2. $$\{f(x_1), \ldots, f(x_m)\} = Y$$ has at most $$m$$ different elements.
3. This follows from the first two parts.
Theorem 1.4 Let X be a finite set. A function $$f : X \to X$$ is injective if and only if it is surjective.

This implies that if such an f is injective or surjective, then it is a bijection. This isn’t true for infinite sets: a function from an infinite set to itself can be injective without being surjective, or surjective without being injective.

Proof. Let $$X = \{x_1, \ldots, x_m\}$$ have size m.

If f is not injective then $$\operatorname{im} f = \{f(x_1), \ldots, f(x)\}$$ has a repeated element, so it has size at most m-1, so $$\operatorname{im} f \neq X$$ and f is not surjective.

If f is not surjective then $$\operatorname{im} f = \{f(x_1), \ldots, f(x_m)\}$$ isn’t all of X, so it has size at most m-1, so two of the listed elements must be equal and therefore f is not injective.

### 1.2.3 Well-definedness

This is a topic which often causes confusion. When people talk about whether a certain function is or isn’t “well-defined”, they’re really discussing whether a function has been successfully defined or not. You don’t have “functions” and “well-defined functions”, there are only functions and attempts to define a function that didn’t succeed.

This is easier to understand with an example.

Consider the set $$\mathbb{Q}$$ of all rational numbers. Suppose someone says “I define a function $$f:\mathbb{Q} \to \mathbb{N}$$ by $$f(p/q) = p + q$$.” This doesn’t work, and there’s no such function! The problem is that $$1/2 = 2/4$$, but $$1+2 \neq 2 + 4$$: the proposed definition violates the rule that a function has one and only one output for each input. In this situation we say “f isn’t well-defined.”