2.3. Norms
Norms are valuable tools for reasoning about the extent and magnitude of error. We introduce some concepts that we will use extensively later on:
Definition
A vector norm is a function from \(\mathbb R^n\) to \(\mathbb R\). If \(x\in\mathbb R^n\), we symbolize its norm by \(\lVert x\rVert\), and its defining properties are:
- \(\lVert x\rVert\geq 0\) for all \(x\in\mathbb R^n\), also \(\lVert x\rVert=0\) if and only if \(x=0\).
- \(\lVert \alpha x\rVert = \lvert\alpha\rvert\cdot \lVert x\rVert\) for all \(\alpha\in\mathbb R,x\in\mathbb R^n\).
- \(\lVert x+y\rVert\leq \lVert x\rVert + \lVert y\rVert\) for all \(x,y\in\mathbb R^n\).
Note that the properties above do not determine a unique form of a “norm” function. In fact, many different valid norms exist. Typically, we will use subscripts (for example, \(\lVert \cdot\rVert_a\)) to denote different types of norms.
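As a quick numerical sanity check, the minimal sketch below verifies the three defining properties for the Euclidean norm as computed by np.linalg.norm; the test vectors, the scalar, and the seed are arbitrary choices made here for illustration.
import numpy as np

rng = np.random.default_rng(0)          # arbitrary fixed seed
x = rng.standard_normal(5)              # two arbitrary test vectors
y = rng.standard_normal(5)
alpha = -3.7                            # an arbitrary scalar

# Property 1: norms are nonnegative, and the zero vector has norm 0
assert np.linalg.norm(x) >= 0
assert np.linalg.norm(np.zeros(5)) == 0

# Property 2: absolute homogeneity, ||alpha*x|| = |alpha| * ||x||
assert np.isclose(np.linalg.norm(alpha * x), abs(alpha) * np.linalg.norm(x))

# Property 3: triangle inequality, ||x+y|| <= ||x|| + ||y||
assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)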
2.3.1. Why are vector norms needed?
In many applications, such as the solution of a nonlinear equation \(f(x)=0\), the error \(e=x_\textsf{approx}-x_\textsf{exact}\) is a single number. Thus, the absolute value \(\vert e\vert\) gives us a good idea of the “extent” of the error.
When solving a system of linear equations \(Ax=b\), the exact solution \(x_\textsf{exact}\) as well as any approximation \(x_\textsf{approx}\) are vectors, and the error \(e=x_\textsf{approx}-x_\textsf{exact}\) is a vector too. It is not as straightforward to assess the “magnitude” of such a vector-valued error. For example, consider two error vectors \(e_1,e_2\in\mathbb R^{1000}\): \(e_1\) has a modest error of the same small size in every component, while every component of \(e_2\) is exact except for a single one with a huge discrepancy.
Which one is worse? Exactly how we quantify and assess the extent of error is application-dependent. Vector norms are alternative ways to measure this magnitude, and different norms are appropriate for different tasks.
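As a concrete illustration, the snippet below builds two hypothetical error vectors of this kind (the specific values 0.01 and 1.0 are our own choices for illustration, not taken from the original example) and measures them with the \(1\)-norm and the max-norm.
import numpy as np

# Two hypothetical error vectors in R^1000 (values chosen only for illustration):
# e1 has a small error of 0.01 in every component,
# e2 is exact everywhere except for a single large discrepancy.
e1 = np.full(1000, 0.01)
e2 = np.zeros(1000)
e2[0] = 1.0

print(np.linalg.norm(e1, 1), np.linalg.norm(e2, 1))            # 10.0 vs 1.0
print(np.linalg.norm(e1, np.inf), np.linalg.norm(e2, np.inf))  # 0.01 vs 1.0
The two norms disagree about which error is larger: the \(1\)-norm judges \(e_1\) to be worse, while the max-norm judges \(e_2\) to be worse, illustrating why the choice of norm matters.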
2.3.2. Types of Vector Norms
Consider a vector \(v\in\mathbb R^n\). Some commonly used norms, all of which satisfy the defining properties above, are:
- The \(L_1\)-norm (or \(1\)-norm): \(\lVert v\rVert_1 = \sum_{i=1}^n \vert v_i\vert\).
- The \(L_2\)-norm (or \(2\)-norm, or Euclidean norm): \(\lVert v\rVert_2 = \sqrt{\sum_{i=1}^n v_i^2}\).
- The infinity norm (or max-norm): \(\lVert v\rVert_\infty = \max_{1\leq i\leq n} \vert v_i\vert\).
- (Less common) The \(L_p\)-norm: \(\lVert v\rVert_p = \left(\sum_{i=1}^n \vert v_i\vert^p\right)^{1/p}\) for \(p\geq 1\).
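Before turning to NumPy's built-in routine, here is a minimal from-scratch sketch of the first three norms, written directly from the formulas above (the helper function names are our own):
import numpy as np

def norm_1(v):
    # L1-norm: sum of the absolute values of the components
    return np.sum(np.abs(v))

def norm_2(v):
    # L2-norm: square root of the sum of the squared components
    return np.sqrt(np.sum(v**2))

def norm_inf(v):
    # Infinity norm: largest absolute value among the components
    return np.max(np.abs(v))

v = np.array([3, 1, 4, 5, 2, 7])
print(norm_1(v), norm_2(v), norm_inf(v))   # 22, 10.198..., 7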
NumPy provides a module, linalg, that allows us to compute all these different norms directly for a given vector, as shown below. The second argument of np.linalg.norm specifies which norm we wish to compute (in our case, the \(L_1\) norm, \(L_2\) norm, or \(L_\infty\) norm, respectively):
>>> import numpy as np
>>> v = np.array([3,1,4,5,2,7])
>>> np.linalg.norm(v,1)
22.0
>>> np.linalg.norm(v,2)
10.198039027185569
>>> np.linalg.norm(v,np.inf)
7.0
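The same function also handles the general \(L_p\)-norm: passing any \(p\geq 1\) as the second argument computes \(\left(\sum_{i=1}^n\vert v_i\vert^p\right)^{1/p}\). For example (the choice \(p=3\) is arbitrary):
>>> np.linalg.norm(v, 3)   # sum(|v_i|**3) ** (1/3) = 568 ** (1/3), roughly 8.2816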
2.3.3. Equivalence of different norms
Several inequalities relating the \(L_1\), \(L_2\), and \(L_\infty\)-norms can be proved. We start by showing that \(\lVert x\rVert_\infty\leq \lVert x\rVert_1\):
\[\lVert x\rVert_\infty = \max_{1\leq i\leq n}\vert x_i\vert \;\leq\; \sum_{i=1}^n \vert x_i\vert = \lVert x\rVert_1.\]
The inequality above follows from the fact that the sum of the absolute values \(\sum_{i=1}^n \vert x_i\vert\) contains the maximum absolute value \(\max_{1\leq i\leq n} \vert x_i\vert\) as one of its (nonnegative) terms. Furthermore, the absolute value \(\vert x_i\vert\) of each component is less than or equal to the maximum of the absolute values over all components. This allows us to prove an inequality in the reverse direction as well, albeit with a multiplicative factor of \(n\):
\[\lVert x\rVert_1 = \sum_{i=1}^n \vert x_i\vert \;\leq\; \sum_{i=1}^n \max_{1\leq j\leq n}\vert x_j\vert = n\cdot\max_{1\leq j\leq n}\vert x_j\vert = n\cdot\lVert x\rVert_\infty.\]
The above two inequalities can be compactly written as \(\lVert x\rVert_\infty \leq \lVert x\rVert_1\leq n\cdot\lVert x\rVert_\infty\). As can be easily verified, the right-hand equality holds when all components of \(x\) have the same absolute value, and the left-hand equality holds when \(x\) has at most one nonzero component.
Example 1
Consider the vector \(v = (5,2,1,4,3)\). The \(L_1\) and \(L_\infty\)-norms of \(v\) are
\[\lVert v\rVert_1 = 5+2+1+4+3 = 15, \qquad \lVert v\rVert_\infty = \max(5,2,1,4,3) = 5,\]
respectively. Note that \(\lVert v\rVert_\infty = 5\leq \lVert v\rVert_1 = 15 \leq 5\cdot\lVert v\rVert_\infty = 5\cdot 5 = 25\).
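The inequality \(\lVert x\rVert_\infty \leq \lVert x\rVert_1\leq n\cdot\lVert x\rVert_\infty\) is also easy to spot-check numerically; the minimal sketch below tests it on randomly generated vectors (the dimension, seed, and number of trials are arbitrary choices):
import numpy as np

rng = np.random.default_rng(1)       # arbitrary fixed seed
n = 50                               # arbitrary dimension

for _ in range(1000):
    x = rng.standard_normal(n)
    norm_inf = np.linalg.norm(x, np.inf)
    norm_1 = np.linalg.norm(x, 1)
    # ||x||_inf <= ||x||_1 <= n * ||x||_inf must hold for every x
    assert norm_inf <= norm_1 <= n * norm_inf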
Definition
Two vector norms \(\lVert x\rVert_a\) and \(\lVert x\rVert_b\) are called equivalent if there exist real numbers \(c,d>0\), such that
\[c\,\lVert x\rVert_b \;\leq\; \lVert x\rVert_a \;\leq\; d\,\lVert x\rVert_b \qquad\text{for all } x\in\mathbb R^n.\]
It follows from this definition that the \(L_1\) and \(L_\infty\)-norms are equivalent, as proved above. Now consider the square of the \(L_1\)-norm:
\[\lVert x\rVert_1^2 = \left(\sum_{i=1}^n \vert x_i\vert\right)^2 = \sum_{i=1}^n x_i^2 + \sum_{i\neq j}\vert x_i\vert\,\vert x_j\vert \;\geq\; \sum_{i=1}^n x_i^2 = \lVert x\rVert_2^2.\]
The above inequality implies that \(\lVert x\rVert_1\geq \lVert x\rVert_2\). For the reverse direction of the inequality, consider the AM-GM inequality \(\vert x_i\vert\,\vert x_j\vert \leq \tfrac{1}{2}\left(x_i^2 + x_j^2\right)\), which gives
\[\sum_{i\neq j}\vert x_i\vert\,\vert x_j\vert \;\leq\; \sum_{i\neq j}\frac{x_i^2 + x_j^2}{2} \;=\; (n-1)\sum_{i=1}^n x_i^2.\]
The last equality holds because each component \(x_i\) is paired with exactly \(n-1\) other components \(x_j\). Using this bound in the expansion of the square of the \(L_1\)-norm gives
\[\lVert x\rVert_1^2 = \sum_{i=1}^n x_i^2 + \sum_{i\neq j}\vert x_i\vert\,\vert x_j\vert \;\leq\; \sum_{i=1}^n x_i^2 + (n-1)\sum_{i=1}^n x_i^2 = n\sum_{i=1}^n x_i^2 = n\,\lVert x\rVert_2^2.\]
This implies that \(\lVert x\rVert_1\leq\sqrt{n}\,\lVert x\rVert_2\). Combining the two inequalities gives \(\lVert x\rVert_2\leq\lVert x\rVert_1\leq\sqrt{n}\,\lVert x\rVert_2\), proving that the \(L_1\) and \(L_2\)-norms are equivalent as well. Again, the right-hand equality holds when all components of \(x\) have the same absolute value, and the left-hand equality holds when \(x\) has at most one nonzero component.
Example 2
Consider the same vector \(v = (5,2,1,4,3)\). The \(L_1\) and \(L_2\)-norms of \(v\) are
\[\lVert v\rVert_1 = 5+2+1+4+3 = 15, \qquad \lVert v\rVert_2 = \sqrt{5^2+2^2+1^2+4^2+3^2} = \sqrt{55} \approx 7.4162,\]
respectively. Note that \(\lVert v\rVert_2 = 7.4162 \leq \lVert v\rVert_1 = 15 \leq \sqrt{5}\cdot 7.4162 = 16.5831\).
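Likewise, the bound \(\lVert x\rVert_2\leq\lVert x\rVert_1\leq\sqrt{n}\,\lVert x\rVert_2\) can be spot-checked numerically; the sketch below does so on randomly generated vectors, with a tiny tolerance added to absorb floating-point round-off (the dimension, seed, and number of trials are arbitrary choices):
import numpy as np

rng = np.random.default_rng(2)       # arbitrary fixed seed
n = 50                               # arbitrary dimension
tol = 1e-12                          # small slack for floating-point round-off

for _ in range(1000):
    x = rng.standard_normal(n)
    norm_1 = np.linalg.norm(x, 1)
    norm_2 = np.linalg.norm(x, 2)
    # ||x||_2 <= ||x||_1 <= sqrt(n) * ||x||_2 must hold for every x
    assert norm_2 <= norm_1 + tol
    assert norm_1 <= np.sqrt(n) * norm_2 + tol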