Suppose that you have two finite sets $X$ and $Y$ and a function $f:X\to Y$. If you know that $f$ is onto then you get some information about $X$ and $Y$: you know that $X$ must be at least as large as $Y$.
But an arbitrary function between two vector spaces doesn’t necessarily give you any information about their relationship as vector spaces. To get such information, we need to restrict to functions that respect the vector space structure — that is, the scalar multiplication and the vector addition.
Functions with this property, which we’re going to define shortly, are called linear maps. They allow us to do something similar to the finite set example above: for example, if you have a surjective linear map from a vector space $X$ to another vector space $Y$, it is true that $\dim X\geqslant \dim Y$.
Let $V$ and $W$ be vector spaces over the same field $\mathbb{F}$. A function $T:V\to W$ is called a linear map or a linear transformation if
1. for all $\lambda \in \mathbb{F}$ and all $\mathbf{v}\in V$ we have $T(\lambda \mathbf{v})=\lambda T(\mathbf{v})$, and
2. for all $\mathbf{v},\mathbf{w}\in V$ we have $T(\mathbf{v}+\mathbf{w})=T(\mathbf{v})+T(\mathbf{w})$.
Point 1 is what it means to say that $T$ respects scalar multiplication and point 2 is what it means to say that $T$ respects vector addition.
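The two conditions can be spot-checked numerically. The following is a minimal Python sketch, assuming vectors in $\mathbb{R}^{n}$ are represented as tuples of numbers; the helper names `add`, `scale`, `respects_scaling`, `respects_addition` and the sample map `double` are hypothetical and chosen only for illustration. Passing such a check on a few samples does not prove that a map is linear, but a single failure shows that it is not.

```python
from fractions import Fraction

def add(v, w):
    """Componentwise sum of two vectors given as tuples."""
    return tuple(a + b for a, b in zip(v, w))

def scale(lam, v):
    """Scalar multiple of a vector given as a tuple."""
    return tuple(lam * a for a in v)

def respects_scaling(T, lam, v):
    """Check T(lam * v) == lam * T(v) for one scalar and one vector."""
    return T(scale(lam, v)) == scale(lam, T(v))

def respects_addition(T, v, w):
    """Check T(v + w) == T(v) + T(w) for one pair of vectors."""
    return T(add(v, w)) == add(T(v), T(w))

# Hypothetical sample map from R^2 to R^2 that doubles every coordinate.
def double(v):
    return scale(2, v)

v, w, lam = (1, -3), (4, 5), Fraction(1, 2)
print(respects_scaling(double, lam, v))   # True
print(respects_addition(double, v, w))    # True
```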
This concept is so common that it has many names. For us,
$T$ is a linear map
$T$ is a linear function
$T$ is a linear transformation
$T$ is linear
all mean exactly the same thing, namely that $T$ satisfies Definition 4.13.1.
1. For any vector space $V$, the identity map $\mathrm{id}:V\to V$ and the zero map $z:V\to V$ given by $z(\mathbf{v})=\mathbf{0}_{V}$ for all $\mathbf{v}\in V$ are linear.
2. Let $A$ be an $m\times n$ matrix with entries in a field $\mathbb{F}$. Then ${T}_{A}:{\mathbb{F}}^{n}\to {\mathbb{F}}^{m}$ defined by ${T}_{A}(\mathbf{x})=A\mathbf{x}$ is linear.
3. $T:{M}_{n\times n}(\mathbb{R})\to {M}_{n\times n}(\mathbb{R}),T(A)={A}^{2}$ is not linear.
4. $T:{\mathbb{R}}^{n}\to \mathbb{R},T\left(\begin{array}{c}{x}_{1}\\ \vdots \\ {x}_{n}\end{array}\right)={\sum}_{i=1}^{n}{x}_{i}$ is linear.
5. $D:{\mathbb{R}}_{\leqslant n}[x]\to {\mathbb{R}}_{\leqslant n}[x]$ given by $D(f)=\frac{df}{dx}$ is linear.
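To see concretely why the squaring map fails to be linear, it is enough to find one scalar and one matrix for which the first condition breaks. A minimal sketch for $n=2$, assuming matrices are stored as lists of rows; the names `mat_mul` and `square` are hypothetical helpers, and the identity matrix is just one convenient choice of counterexample.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def square(A):
    """The map T(A) = A^2."""
    return mat_mul(A, A)

I = [[1, 0], [0, 1]]
two_I = [[2, 0], [0, 2]]

# T(2 I) is 4 I, but 2 T(I) is only 2 I, so scaling is not respected.
print(square(two_I))                                # [[4, 0], [0, 4]]
print([[2 * a for a in row] for row in square(I)])  # [[2, 0], [0, 2]]
```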
Let’s look at why some of these are true, starting with example 2. To show that ${T}_{A}$ is a linear map we have to check the two parts of the definition of being a linear map. Both of these are going to follow from properties of matrix multiplication and addition that you learned in the previous section.
Let $\mathbf{x}\in {\mathbb{F}}^{n}$ and $\lambda \in \mathbb{F}$. Then
$$\begin{aligned}
{T}_{A}(\lambda \mathbf{x}) &= A(\lambda \mathbf{x}) && \text{definition of } {T}_{A} \\
&= \lambda A\mathbf{x} && \text{matrix multiplication properties} \\
&= \lambda {T}_{A}(\mathbf{x}) && \text{definition of } {T}_{A}
\end{aligned}$$
Let $\mathbf{x},\mathbf{y}\in {\mathbb{F}}^{n}$. Then
$$\begin{aligned}
{T}_{A}(\mathbf{x}+\mathbf{y}) &= A(\mathbf{x}+\mathbf{y}) && \text{definition of } {T}_{A} \\
&= A\mathbf{x}+A\mathbf{y} && \text{matrix multiplication properties} \\
&= {T}_{A}(\mathbf{x})+{T}_{A}(\mathbf{y}) && \text{definition of } {T}_{A}
\end{aligned}$$
The properties of matrix multiplication used were proved in Proposition 3.4.1.
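The two identities can also be observed on a concrete matrix. A minimal sketch, assuming matrices are stored as lists of rows and vectors as lists of integers; the function name `mat_vec` and the particular matrix `A` and vectors `x`, `y` are hypothetical examples, not taken from the text.

```python
def mat_vec(A, x):
    """Return the product A x for a matrix A (list of rows) and a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical 2 x 3 matrix, so T_A maps F^3 to F^2.
A = [[1, 2, 0],
     [0, -1, 3]]
x = [1, 0, 2]
y = [-1, 4, 1]
lam = 5

# T_A(lam x) versus lam T_A(x)
left_scaled = mat_vec(A, [lam * xi for xi in x])
right_scaled = [lam * t for t in mat_vec(A, x)]

# T_A(x + y) versus T_A(x) + T_A(y)
left_sum = mat_vec(A, [xi + yi for xi, yi in zip(x, y)])
right_sum = [s + t for s, t in zip(mat_vec(A, x), mat_vec(A, y))]

print(left_scaled == right_scaled)  # True
print(left_sum == right_sum)        # True
```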
Similarly, the fact that the differentiation map $D$ of example 5 is linear follows from standard properties of derivatives: you know, for example, that for any two functions (not just polynomials) $f$ and $g$ we have $\frac{d}{dx}(f+g)=\frac{df}{dx}+\frac{dg}{dx}$, which shows that $D$ satisfies the second part of the linearity definition.
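For polynomials, both parts of the definition can be checked directly on coefficient lists. A minimal sketch, assuming a polynomial $a_0+a_1x+\cdots +a_nx^n$ is stored as the list `[a0, a1, ..., an]` of fixed length $n+1$; the names `poly_add` and `poly_scale` and the sample polynomials are hypothetical.

```python
def D(f):
    """Derivative of a polynomial given by its coefficient list [a0, ..., an].

    The result is padded with a trailing 0 so that it has the same length
    as the input, i.e. it stays in the space of degree-at-most-n polynomials.
    """
    return [k * f[k] for k in range(1, len(f))] + [0]

def poly_add(f, g):
    """Sum of two polynomials of the same length."""
    return [a + b for a, b in zip(f, g)]

def poly_scale(lam, f):
    """Scalar multiple of a polynomial."""
    return [lam * a for a in f]

f = [1, 0, 3, 2]    # 1 + 3x^2 + 2x^3
g = [0, 4, -1, 0]   # 4x - x^2
lam = 7

print(D(f))                                            # [0, 6, 6, 0]
print(D(poly_add(f, g)) == poly_add(D(f), D(g)))       # True
print(D(poly_scale(lam, f)) == poly_scale(lam, D(f)))  # True
```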
As an example where the linearity of a map doesn’t just come from standard facts you already know, consider
$$T:{\mathbb{R}}^{2}\to \mathbb{R},\quad T\left(\begin{array}{c}x\\ y\end{array}\right)=2x-y.$$
To show $T$ is linear we have to show that it has properties 1 and 2 from the definition.
$$\begin{aligned}
T\left(\lambda \left(\begin{array}{c}x\\ y\end{array}\right)\right) &= T\left(\begin{array}{c}\lambda x\\ \lambda y\end{array}\right) \\
&= 2\lambda x-\lambda y \\
&= \lambda (2x-y) \\
&= \lambda T\left(\begin{array}{c}x\\ y\end{array}\right)
\end{aligned}$$
$$\begin{aligned}
T\left(\left(\begin{array}{c}{x}_{1}\\ {y}_{1}\end{array}\right)+\left(\begin{array}{c}{x}_{2}\\ {y}_{2}\end{array}\right)\right) &= T\left(\begin{array}{c}{x}_{1}+{x}_{2}\\ {y}_{1}+{y}_{2}\end{array}\right) \\
&= 2({x}_{1}+{x}_{2})-({y}_{1}+{y}_{2}) \\
&= (2{x}_{1}-{y}_{1})+(2{x}_{2}-{y}_{2}) \\
&= T\left(\begin{array}{c}{x}_{1}\\ {y}_{1}\end{array}\right)+T\left(\begin{array}{c}{x}_{2}\\ {y}_{2}\end{array}\right).
\end{aligned}$$
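As a sanity check on the algebra, the two properties can be tested on a small grid of integer points. This is only a sketch: the grid and the range of scalars are arbitrary choices, and a finite check like this supports the calculation above rather than replacing it.

```python
from itertools import product

def T(v):
    """The map T(x, y) = 2x - y from the text, with v given as a pair."""
    x, y = v
    return 2 * x - y

samples = list(product(range(-2, 3), repeat=2))  # 25 integer points in R^2
scalars = range(-3, 4)

# Property 1: T(lam * v) == lam * T(v) for every sampled scalar and point.
scaling_ok = all(T((lam * x, lam * y)) == lam * T((x, y))
                 for lam in scalars for (x, y) in samples)

# Property 2: T(v + w) == T(v) + T(w) for every pair of sampled points.
addition_ok = all(T((x1 + x2, y1 + y2)) == T((x1, y1)) + T((x2, y2))
                  for (x1, y1) in samples for (x2, y2) in samples)

print(scaling_ok, addition_ok)  # True True
```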
Here are some examples of things which are not linear maps:
$T:\mathbb{R}\to \mathbb{R},T(x)=|x|$ isn’t linear. It satisfies neither linearity property: $T(-2\cdot 3)=6$ but $-2\cdot T(3)=-6$, and $T(-1+1)=0$ but $T(-1)+T(1)=2$.
$T:{\mathbb{R}}^{3}\to {\mathbb{R}}^{3},T(\mathbf{x})=\mathbf{x}+\left(\begin{array}{c}1\\ 0\\ 0\end{array}\right)$ isn’t linear either. Again it satisfies neither part of the definition; you should check that.
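Both failures can be made concrete with a few evaluations. A minimal sketch, assuming vectors in $\mathbb{R}^{3}$ are stored as tuples; the function names `abs_map` and `shift` are hypothetical, and the zero vector is just one convenient input at which the second map breaks both properties.

```python
def abs_map(x):
    """T(x) = |x| on R."""
    return abs(x)

def shift(v):
    """T(x) = x + (1, 0, 0) on R^3, with vectors as tuples."""
    return (v[0] + 1, v[1], v[2])

# |x| fails scalar multiplication: T(-2 * 3) versus -2 * T(3).
print(abs_map(-2 * 3), -2 * abs_map(3))           # 6 -6
# |x| fails addition: T(-1 + 1) versus T(-1) + T(1).
print(abs_map(-1 + 1), abs_map(-1) + abs_map(1))  # 0 2

# The shift fails at the zero vector: T(2 * 0) = T(0 + 0) = T(0),
# whereas 2 * T(0) = T(0) + T(0) is a different vector.
zero = (0, 0, 0)
image_of_zero = shift(zero)
twice_image = tuple(2 * a for a in image_of_zero)
print(image_of_zero)  # (1, 0, 0)
print(twice_image)    # (2, 0, 0)
```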