1.3 Functions

Informally, when we write \(f:X \to Y\) or say ‘f is a function from X to Y’ we mean that f is a definite rule which associates to each element \(x \in X\) a single element \(f(x)\) of Y. Some times the word map is used in place of function - it means exactly the same thing.

Just like for relations, there is a formal set-theoretic definition of function: a function f from a set X to a set Y is a subset \(f \subseteq X\times Y\) with the property that for each \(x \in X\) there is a unique \(y \in Y\) such that \(\langle x,y\rangle \in f\). This avoids the vaguness of “a definite rule” in our informal definition.

Definition 1.6 Let \(f: X \to Y\) be a function. - The set X is called the domain of f. - The set Y is called the codomain of f.

Often people talk of functions in terms of input and output: if \(f: X \to Y\) then when we give it an input \(x\in X\), it gives an output \(f(x)\). The x in f(x) is called the argument of f, and people talk about ‘applying’ a function f to an input x to get an output f(x).

Here’s a quick example of how functions are identified with subsets of a Cartesian product. Let \(X= \{0,1,2\}, Y = \{1, 2,3,4,5,6\}\) and let f be the function from X to Y whose ‘rule’ is that for all \(x\in X\) we have f(x)=2x+1. The description above was that a function was identified with the subset of \(X\times Y\) consisting of all the pairs \(\langle x, f(x) \rangle\), so \[ f = \{ \langle 0, 1\rangle, \langle 1, 3\rangle, \langle 2, 5\rangle\}.\]

Two functions \(f,g : X \to Y\) are said to be equal if for all \(x \in X\) we have \(f(x)=g(x)\). This means that when we describe a function informally, by giving a rule, it isn’t the description of the rule that is important but merely which output f(x) it produces for a given input x. The functions: \[\begin{align*} f: \{0,1\} \to \{0,1\} & \;\;\; f(x) = x^2 & \text{and} \\ g: \{0,1\} \to \{0,1\} & \;\;\; g(x)=x^3 \end{align*}\] are equal, even though the rules look different, because f(0)=g(0) and f(1)=g(1).

Rather than use this set theoretic definition of a function, we will tend to define functions by specifying their output f(x) for a given input x. A common alternative notation is to write

\[\begin{align*} f: & X \to Y \\ & x \mapsto x^2 \end{align*}\]

to indicate that the function f has domain X and codomain Y, and sends an input x to the output \(x^2\).

1.3.1 Function properties and composition

Definition 1.7 Let \(f:X \to Y\) be a function.

f is called injective or one to one if for all \(a,b \in X\), if \(f(a)=f(b)\) then \(a=b\).
The image of f, written \(\im f\), is \(\{f(x): x \in X\}\).
f is called surjective or onto if \(\im f = Y\).
f is called a bijection if it is injective and surjective.

If we have a function \(f: X \to Y\) and \(g: Y \to Z\) then ‘do f then do g’ is a rule for getting from X to Z. The associated function is called the composition of f and g:

Definition 1.8 Let \(f:X \to Y\) and \(g: Y \to Z\) be functions. The composition of g and f, written \(g \circ f\) is the function \(g\circ f: X \to Z\) such that \((g\circ f)(x)= g(f(x))\).

Thus \(g\circ f\) means ‘do f, then do g.’ Be careful to get this the correct way round. Notice that composition only makes sense when the codomain of f is the same as the domain of g.

Function composition is associative: if \(f:X \to Y, g: Y \to Z, h : Z \to W\) then \[ h \circ (g \circ f) = (h \circ g) \circ f\] as functions from X to Z. The reason this is true is because both sides send an input \(x\in X\) to the output \(h(g(f(x)))\).

One important function from a set X to itself is the identity function \(\id_X\), which does nothing: it is defined by \(\id_X(x) = x\) for all \(x \in X\).

Definition 1.9 Let \(f:X \to Y\) and \(g: Y \to X\) be functions.

We say that g is a left inverse to f, and that f is a right inverse to g, if \(g\circ f = \id_X\).
We say that f is invertible if there is a function \(h:Y \to X\) such that \(f\circ h = \id_Y\) and \(h \circ f=\id_X\).

A left inverse to f is a function \(g:Y \to X\) such that \(g\circ f=\id_x\), and a right inverse to f is a function \(h:Y \to X\) such that \(f\circ h = \id_Y\). In fact, if f has a left inverse g and a right inverse h then it must be that \(g=h\): given any \(y \in Y\), \[ g(y) = g(f(h(y)) = h(y)\] with the first equality because \(f(h(y))=y\) as h is right inverse to f, and the second equality because \(g(f(x))=x\) for any x as g is left inverse to f. This shows that if f has a left inverse and a right inverse, it is invertible.

It also follows that if f is invertible then there is one and only one function which is a left and right inverse to f. We write this function \(f^{-1}\) and call it the inverse to f.

We can connect the properties of left and right invertibility to those of injectivity and surjectivity:

Proposition 1.2 Let \(f: X \to Y\) be a function.

f has a left inverse if and only if it is injective.
f has a right inverse if and only if it is surjective.
f is invertible if and only if it is a bijection.

Proof.

(1, only if). Suppose f has a left inverse g, and that \(f(a)=f(b)\). Then \(g(f(a))=g(f(b))\), and since \(g\circ f = \id_X\) we have \(a=b\), which shows f is injective.
(1, if). Suppose f is injective. Notice that if \(y \in \im f\) then there is a unique \(x \in X\) such that \(f(x)=y\), by injectivity. Choose any \(x_0 \in X\), and define \(g:Y \to X\) by \[ g(y) = \begin{cases} \text{the unique } x \text{ such that } f(x)=y & y \in \im f \\ x_0 & y \notin \im f \end{cases} \] Then g is left inverse to f.
(2, only if) Suppose f has a right inverse g and let \(y \in Y\). Then \(f(g(y))=y\) as \(f\circ g = \id_Y\). This shows \(y \in \im f\), so \(\im f=Y\) and f is onto.
(2, if) Suppose f is surjective. Then every \(y \in Y\) is in the image of f, so for each \(y \in Y\) pick an element \(g(y) \in X\) such that \(f(g(y))=y\). Then g is right inverse to f.
(3, only if) If f is invertible then it has a left inverse, so is injective by part 1, and it has a right inverse, so is surjective by part 2. Thus f is a bijection.
(3, if) If f is a bijection then it has a left inverse by part 1 and a right inverse by part 2, so it is invertible as we showed in the text before this proposition.

Proposition 1.3 If \(f:X \to Y\) and \(g:Y \to Z\) are both invertible then so is \(g\circ f\), and its inverse is \(f^{-1}\circ g^{-1}\).

Proof. Notice that \[\begin{align*} (f^{-1}\circ g^{-1}) \circ (g \circ f) &= f^{-1} \circ (g^{-1}\circ g) \circ f &\text{associativity} \\ &= f^{-1}\circ \id_Y \circ f &\text{definition of inverse}\\ &= f^{-1}\circ f &\text{as } \id_Y\circ f=f \\ &= \id_X \end{align*}\] and similarly \((g\circ f)\circ (f^{-1}\circ g^{-1})=\id_Z\), so \(f^{-1}\circ g^{-1}\) is a two-sided inverse to \(f\circ g\).

Using this repeatedly, if \(f_1, f_2, \ldots, f_n\) are invertible and the composition \(f_1\circ f_2 \circ \cdots \circ f_n\) makes sense, then it is invertible with inverse \(f_n^{-1}\circ f_{n-1}^{-1} \circ \cdots \circ f_1^{-1}\).

Since a function is invertible if and only if it is a bijection, this tells us that the composition of two bijections is again a bijection.

1.3.2 Well-definedness

This is a topic which often causes confusion. When people talk about whether a certain function is or isn’t “well-defined”, they’re really discussing whether a function has been successfully defined or not. You don’t have “functions” and “well-defined functions”, there are only functions and attempts to define a function that didn’t do so successfully.

This is easier to understand with an example.

Consider the set \(\mathbb{Q}\) of all rational numbers. Suppose someone says “I define a function \(f:\mathbb{Q} \to \mathbb{N}\) by \(f(p/q) = p + q\).” This doesn’t work, and there’s no such function! The problem is that \(1/2 = 2/4\), but \(1+2 \neq 2 + 4\): the proposed definition violates the rule that a function has one and only one output for each input. In this situation we say “f isn’t well-defined.”