Tensors
This post explores the mathematical concept of a tensor.

Linear algebra is based on the concept of a vector, represented as an element of an abstract vector space. Multilinear algebra is based on a generalized concept called a tensor, represented as an element of an abstract tensor product space.
Multilinear Maps
Rather than vectors, their duals, covectors (discussed in this post), are the starting point for the generalization to tensors. Let \(V\) be a finite-dimensional, real vector space (all vector spaces in this post are assumed to be so). Recall that the dual space \(V^*\) is the space of all linear functionals on \(V\), i.e. the space of maps \(\omega : V \rightarrow \mathbb{R}\) that are linear over \(\mathbb{R}\). A (concrete) tensor is a multilinear functional: given vector spaces \(V_1,\dots,V_k\), a tensor is a multilinear map
\[F : V_1 \times \dots \times V_k \rightarrow \mathbb{R}.\]
Multilinearity means that, for a map \(F : V_1 \times \dots \times V_k \rightarrow W\) between the product vector space \(V_1 \times \dots \times V_k \) and a vector space \(W\), the following properties hold:
- \(F(v_1,\dots,v_i + v'_i,\dots,v_k) = F(v_1,\dots,v_i,\dots,v_k) + F(v_1,\dots,v'_i,\dots,v_k)\),
- \(F(v_1,\dots,av_i,\dots,v_k) = aF(v_1,\dots,v_i,\dots,v_k)\).
In other words, the map \(F\) is linear separately in each argument when the other arguments are held fixed.
We will denote the set of all such multilinear maps as \(L(V_1,\dots,V_k;W)\). It is a vector space under the following operations:
- \((F + F')(v_1,\dots,v_k) = F(v_1,\dots,v_k) + F'(v_1,\dots,v_k)\),
- \((aF)(v_1,\dots,v_k) = a(F(v_1,\dots,v_k))\).
Tensors are multilinear functionals, that is, multilinear maps where the codomain is the field \(\mathbb{R}\). They thus directly generalize covectors.
Given two multilinear functionals \(F \in L(V_1,\dots,V_k;\mathbb{R})\) and \(G \in L(W_1,\dots,W_l;\mathbb{R})\), their tensor product \(F \otimes G\) is defined as follows:
\[(F\otimes G)(v_1,\dots,v_k,w_1,\dots,w_l) = F(v_1,\dots,v_k)G(w_1,\dots,w_l).\]
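To make the definition concrete, here is a minimal Python sketch (the helper `tensor_product` and the example functionals are ours, purely for illustration): it represents multilinear functionals as ordinary functions of several vector arguments and forms their tensor product by multiplying the scalar results.

```python
import numpy as np

def tensor_product(F, k, G, l):
    """Tensor product of multilinear functionals.

    F takes k vector arguments and G takes l; the product takes
    k + l arguments and multiplies the two scalar results.
    """
    def FG(*vectors):
        return F(*vectors[:k]) * G(*vectors[k:])
    return FG

# Example: F is the dot product on R^2 (a 2-tensor), G a covector.
F = lambda v, w: float(np.dot(v, w))
G = lambda u: float(u[0] - u[1])
FG = tensor_product(F, 2, G, 1)

v, w, u = np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])
print(FG(v, w, u))     # -11.0
print(F(v, w) * G(u))  # -11.0, the same value by definition
```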
Tensors and Covectors
Next, we want to explore the relationship between tensors and covectors; in particular, we want to express tensors in terms of covectors.
For a sequence of covectors \(\omega^j \in V^*_j\) of length \(k\), we can form the tensor product \(\omega^1 \otimes \dots \otimes \omega^k \in L(V_1,\dots,V_k;\mathbb{R})\) of these covectors as follows:
\[\omega^1 \otimes \dots \otimes \omega^k(v_1,\dots,v_k) = \omega^1(v_1) \dots \omega^k(v_k).\]
Note that
\begin{align}\omega^1 \otimes \dots \otimes \omega^k (v_1,\dots,v_i + v'_i,\dots,v_k) &= \omega^1(v_1) \dots \omega^i(v_i + v'_i) \dots \omega^k (v_k) \\&= \omega^1(v_1) \dots (\omega^i(v_i) + \omega^i(v'_i)) \dots \omega^k (v_k) \\&= (\omega^1(v_1) \dots \omega^i(v_i) \dots \omega^k (v_k)) \\&+ (\omega^1(v_1) \dots \omega^i(v'_i) \dots \omega^k (v_k)) \\&= \omega^1 \otimes \dots \otimes \omega^k (v_1,\dots,v_i,\dots,v_k) \\&+ \omega^1 \otimes \dots \otimes \omega^k (v_1,\dots,v'_i,\dots,v_k),\end{align}
and
\begin{align}\omega^1 \otimes \dots \otimes \omega^k (v_1,\dots,av_i,\dots,v_k) &= \omega^1(v_1) \dots \omega^i(av_i) \dots \omega^k (v_k) \\&= \omega^1(v_1) \dots (a\omega^i(v_i)) \dots \omega^k (v_k) \\&= a(\omega^1(v_1) \dots \omega^i(v_i) \dots \omega^k (v_k)) \\&= a(\omega^1 \otimes \dots \otimes \omega^k (v_1,\dots,v_i,\dots,v_k)).\end{align}
Every tensor product of covectors is a multilinear functional, and is thus a tensor. More generally, any linear combination of tensor products of covectors will be a tensor.
We next examine the converse, namely, whether every tensor can be expressed as a linear combination of tensor products of covectors. Given vector spaces \(V_1,\dots,V_k\) of dimensions \(n_1,\dots,n_k\), respectively, and a basis \((E^j_1,\dots,E^j_{n_j})\) for each vector space \(V_j\) with dual basis \((\varepsilon_j^1,\dots,\varepsilon_j^{n_j})\) for \(V^*_j\), we compute the following for a sequence of vectors \(v_1 \in V_1,\dots,v_k \in V_k\):
\begin{align}F(v_1,\dots,v_k) &= F(v_1^{i_1}E^1_{i_1},\dots,v_k^{i_k}E^k_{i_k}) \\&= v_1^{i_1} \dots v_k^{i_k} F(E^1_{i_1},\dots,E^k_{i_k}) \\&= F(E^1_{i_1},\dots,E^k_{i_k}) v_1^{i_1} \dots v_k^{i_k} \\&= F(E^1_{i_1},\dots,E^k_{i_k}) \varepsilon_1^{i_1}(v_1) \dots \varepsilon_k^{i_k}(v_k) \\&= F(E^1_{i_1},\dots,E^k_{i_k}) \varepsilon_1^{i_1} \otimes \dots \otimes\varepsilon_k^{i_k}(v_1,\dots,v_k).\end{align}
Thus, the situation is analogous to the situation for covectors; every covector \(\omega\) can be expressed as \(\omega = \omega(E_i) \varepsilon^i\). We write \(\omega_i = \omega(E_i)\) and thus \(\omega = \omega_i \varepsilon^i\). Likewise, writing
\[F_{i_1,\dots,i_k} = F(E^1_{i_1},\dots,E^k_{i_k}),\]
we obtain
\[F = F_{i_1,\dots,i_k} \varepsilon_1^{i_1} \otimes \dots \otimes\varepsilon_k^{i_k}.\]
Note that, if \(F = 0\), then, for any selection of basis vectors \((E^1_{j_1},\dots,E^k_{j_k})\)
\begin{align}0 &= F(E^1_{j_1},\dots,E^k_{j_k}) \\&= F_{i_1,\dots,i_k} \varepsilon_1^{i_1} \otimes \dots \otimes\varepsilon_k^{i_k}(E^1_{j_1},\dots,E^k_{j_k}) \\&= F_{i_1,\dots,i_k}.\end{align}
This means that the tensors \(\varepsilon_1^{i_1} \otimes \dots \otimes\varepsilon_k^{i_k}\) are linearly independent. Thus, the set
\[\{\varepsilon_1^{i_1} \otimes \dots \otimes\varepsilon_k^{i_k} : 1 \le i_1 \le n_1,\dots,1 \le i_k \le n_k\}\]
comprises a basis for \(L(V_1,\dots,V_k;\mathbb{R})\). In other words, this basis consists of all possible tensor products of dual basis vectors.
Note that it is not the case that every tensor is equal to a tensor product of covectors \(\omega^1 \otimes \dots \otimes \omega^k\), but, rather, every tensor is equal to a linear combination of such tensor products. For instance, consider the dot product of vectors \(v,w\in \mathbb{R}^2\) in terms of the standard basis \((e_1,e_2)\) and the standard dual basis \((e^1,e^2)\):
\begin{align}v \cdot w &= v^1w^1 + v^2w^2 \\&= e^1(v)e^1(w) + e^2(v)e^2(w) \\&= (e^1 \otimes e^1)(v,w) + (e^2 \otimes e^2)(v,w).\end{align}
Each covector \(\omega \in (\mathbb{R}^2)^*\) is expressed as \(\omega = \omega_1 e^1 + \omega_2 e^2\) in terms of the standard dual basis. In order to express the dot product in the form \(\omega \otimes \omega'\) for two covectors \(\omega\) and \(\omega'\), the following would have to hold:
\begin{align}v \cdot w &= v^1w^1 + v^2w^2 \\&= \omega \otimes \omega'(v,w) \\&= \omega(v) \omega'(w) \\&= (\omega_1 e^1 + \omega_2 e^2)(v)(\omega'_1 e^1 + \omega'_2 e^2)(w) \\&= (\omega_1 e^1(v) + \omega_2 e^2(v))(\omega'_1 e^1(w) + \omega'_2 e^2(w)) \\&= (\omega_1 v^1 + \omega_2 v^2)(\omega'_1 w^1 + \omega'_2 w^2) \\&= \omega_1 v^1 \omega'_1 w^1 + \omega_1 v^1 \omega'_2 w^2 + \omega_2 v^2 \omega'_1 w^1 + \omega_2 v^2 \omega'_2 w^2.\end{align}
This equation must hold for all values of \(v\) and \(w\). In particular, it must hold whenever \(v^1 = w^2 = 0\), which forces \(\omega_2 v^2 \omega'_1 w^1 = 0\) and thus either \(\omega_2 = 0\) or \(\omega'_1 = 0\). Likewise, taking \(v^2 = w^1 = 0\) forces either \(\omega'_2 = 0\) or \(\omega_1 = 0\). Now consider the possible cases. If \(\omega_1 = \omega'_1 = 0\), then setting \(v^2 = 0\) or \(w^2 = 0\) would force \(v^1w^1 = 0\), which, in general, is not true. If \(\omega_2 = \omega'_2 = 0\), then setting \(v^1 = 0\) or \(w^1 = 0\) would force \(v^2w^2 = 0\), which, in general, is not true. If \(\omega_1 = \omega_2 = 0\) or \(\omega'_1 = \omega'_2 = 0\), then \(v \cdot w = 0\) identically, which is also not true in general. Thus, regardless of the choice of coefficients, the equation cannot be satisfied, and the dot product is not a pure tensor product of covectors.
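We can check this numerically. Under the basis expansion above, a bilinear form on \(\mathbb{R}^2\) is determined by its component matrix, and a pure tensor \(\omega \otimes \omega'\) corresponds to a rank-one matrix; the dot product's component matrix is the identity, which has rank \(2\). A small NumPy sketch (illustrative, not tied to any particular library API):

```python
import numpy as np

# A bilinear form F on R^2 is determined by its components F_ij = F(e_i, e_j).
# The dot product has component matrix equal to the identity.
dot_components = np.eye(2)

# A pure tensor w ⊗ w' has components w_i * w'_j, i.e. a rank-one matrix.
w, w_prime = np.array([1.0, 2.0]), np.array([3.0, -1.0])
pure_components = np.outer(w, w_prime)

print(np.linalg.matrix_rank(pure_components))  # 1: pure tensors have rank one
print(np.linalg.matrix_rank(dot_components))   # 2: so the dot product is not pure

# It is, however, a linear combination of pure tensors: e^1 ⊗ e^1 + e^2 ⊗ e^2.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(np.allclose(np.outer(e1, e1) + np.outer(e2, e2), dot_components))  # True
```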
Abstract Tensor Product Spaces
So far, we have succeeded in defining "concrete" tensors as multilinear functionals in \(L(V_1,\dots,V_k;\mathbb{R})\). We could also describe this space as the set of all linear combinations of "pure" tensor products of the form
\[\omega^1 \otimes \dots \otimes \omega^k\]
for \(\omega^1 \in V_1^*, \dots, \omega^k \in V_k^*\). We can therefore adopt the suggestive notation
\[V_1^* \otimes \dots \otimes V_k^* = L(V_1,\dots,V_k;\mathbb{R})\]
for the tensor product space of the vector spaces \(V_1^*,\dots,V_k^*\).
Since every vector space \(V\) is canonically isomorphic to its second dual space \(V^{**}\), we could also define tensor product spaces of any vector spaces \(V_1,\dots,V_k\) in the same manner as
\[V_1 \otimes \dots \otimes V_k = L(V_1^*,\dots,V_k^*;\mathbb{R}).\]
However, we will define the tensor product space \(V_1 \otimes \dots \otimes V_k\) in a more abstract manner, and then prove that it is naturally isomorphic to the space \(L(V_1^*,\dots,V_k^*;\mathbb{R})\).
We now consider more abstract characterizations of tensor product spaces. We begin by abstracting the characteristic properties of covector tensor products. Recall that every multilinear functional \(F \in L(V_1, \dots,V_k; \mathbb{R})\) is a linear combination of covector tensor products of the form \(\omega^1 \otimes \dots \otimes \omega^k\), and that
- \(\omega^1 \otimes \dots \otimes (\omega^i + {\omega'}^i) \otimes \dots \otimes \omega^k = (\omega^1 \otimes \dots \otimes \omega^i \otimes \dots \otimes \omega^k) + (\omega^1 \otimes \dots \otimes {\omega'}^i \otimes \dots \otimes \omega^k)\),
- \(\omega^1 \otimes \dots \otimes a\omega^i \otimes \dots \otimes \omega^k = a(\omega^1 \otimes \dots \otimes \omega^i \otimes \dots \otimes \omega^k)\).
Given vector spaces \(V_1,\dots,V_k\), we want to construct a new vector space \(V_1 \otimes \dots \otimes V_k\) containing "pure" elements denoted \(v_1 \otimes \dots \otimes v_k\) for \(v_1 \in V_1, \dots, v_k \in V_k\), such that every element in the new space is a linear combination of these pure elements, and the following relations are satisfied:
- \(v_1 \otimes \dots \otimes (v_i + v'_i) \otimes \dots \otimes v_k = (v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k) + (v_1 \otimes \dots \otimes v'_i \otimes \dots \otimes v_k)\),
- \(v_1 \otimes \dots \otimes av_i \otimes \dots \otimes v_k = a(v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k)\).
We can use a standard technique from universal algebra for constructing \(V_1 \otimes \dots \otimes V_k\).
First, we specify a set of generators; in this case, we want one generator corresponding to each "pure" element. There is one such "pure" element per tuple of vectors \((v_1,\dots,v_k) \in V_1 \times \dots \times V_k\), so we take the underlying set of such tuples \(V_1 \times \dots \times V_k\) as the set of generators. The generators are not themselves the "pure" elements, but each "pure" element represents a generator.
Next, we generate the free vector space \(\mathcal{F}(V_1 \times \dots \times V_k)\) on this set of generators. By definition, this means that there is a set map \(i : V_1 \times \dots \times V_k \rightarrow \mathcal{F}(V_1 \times \dots \times V_k)\) that maps each generator to its corresponding element within the free vector space, and the following universal property is satisfied: for any vector space \(W\) and any set map \(A : V_1 \times \dots \times V_k \rightarrow W\), there exists a unique linear map \(\bar{A} : \mathcal{F}(V_1 \times \dots \times V_k) \rightarrow W\) such that \(\bar{A} \circ i = A\) (as set maps). In other words, each generator \((v_1,\dots,v_k)\) is represented in the free vector space by an element \(i(v_1,\dots,v_k)\), and any mapping \(A\) of these generators into another vector space \(W\) induces a unique linear map \(\bar{A}\) that preserves the mapping of generators. The free vector space is "free" in the following sense. The existence of the linear map \(\bar{A}\) means that the free vector space can be adapted to any vector space \(W\) once a mapping of generators is specified; it is thus the most generic vector space containing a representation of the generators, satisfying no relations except those required by the axioms for a vector space (otherwise, it would not admit a linear map to vector spaces that do not likewise satisfy these extraneous relations). The uniqueness of \(\bar{A}\) means that there are no extraneous elements in the free vector space, i.e. it contains the minimal set of elements that are absolutely required.
Although the universal property completely characterizes the free vector space, we need to actually exhibit such a space in order to demonstrate that it exists. We will take the set of formal linear combinations of elements of \(V_1 \times \dots \times V_k\) as our model. Given a set \(S\), the set of formal linear combinations of elements of \(S\) is the set of all functions \(f : S \rightarrow \mathbb{R}\) such that \(f(s) = 0\) for all but finitely many elements \(s \in S\). These are "formal" linear combinations in the sense that \(S\) need not admit any algebraic structure that permits us to make sense of actual addition and scalar multiplication of its elements; we may think of this as similar to a term algebra that represents pure syntax without reference to any semantics. Each formal linear combination then represents a syntactic element
\[a_1 \cdot s_1 + \dots + a_k \cdot s_k,\]
where \(a_i = f(s_i)\); note that scalar multiplication and addition are undefined here, since \(S\) need not admit any notion of scalar multiplication and addition; we are simply indicating the sort of syntactic expression that the function \(f\) is meant to represent. Thus, \(f\) specifies the formal scalar coefficients of elements of \(S\) without specifying any ordering for the addition (since addition in a vector space is associative and commutative, the ordering is irrelevant). The set of formal linear combinations is a vector space under the following operations:
- \((f + f')(s) = f(s) + f'(s),\)
- \((a \cdot f)(s) = a \cdot f(s).\)
We also need to represent each generator within the free vector space. We represent each element \(s \in S\) by the function \(\delta_s\), defined as follows:
\[\delta_s(s') = \begin{cases} 1,& \text{if } s = s',\\0, & \text{otherwise.} \end{cases}\]
Note that if \(s_1,\dots,s_k\) are the elements of \(S\) such that \(f(s_i) \ne 0\) for a formal linear combination \(f\), then \(f\) can be uniquely represented as a linear combination as follows:
\[f = \sum_{i=1}^k f(s_i) \cdot \delta_{s_i}.\]
This implies that the set \(\{\delta_s : s \in S\}\) comprises a basis for \(\mathcal{F}(S)\), which is therefore finite-dimensional if and only if \(S\) is finite.
Let's confirm that the vector space of formal linear combinations of elements of \(S\) satisfies the universal property of the free vector space on \(S\). Let \(W\) be any vector space, and suppose that there is a set map \(A : S \rightarrow W\). For each formal linear combination \(f\), we use the following notation:
\[S_f = \{s \in S : f(s) \neq 0\}.\]
Define a linear map \(\bar{A} : \mathcal{F}(S) \rightarrow W\) as follows:
\[\bar{A}(f) = \sum_{s \in S_f} f(s) \cdot A(s).\]
Note the following for the generator mapping \(i : S \rightarrow \mathcal{F}(S)\) defined as \(i(s) = \delta_s\):
\begin{align}(\bar{A} \circ i)(s) &= \bar{A}(i(s)) \\&= \bar{A}(\delta_s) \\&= \sum_{x \in S_{\delta_s}} \delta_s(x) \cdot A(x) \\&= \delta_s(s) \cdot A(s) \\&= A(s).\end{align}
Now, suppose that there is another linear map \(\hat{A} : \mathcal{F}(S) \rightarrow W\) such that \(\hat{A} \circ i = A\). Then:
\begin{align}\hat{A}(f) &= \hat{A}\left(\sum_{i=1}^k f(s_i) \cdot \delta_{s_i}\right) \\&= \sum_{i=1}^k f(s_i) \cdot \hat{A}(\delta_{s_i}) \\&= \sum_{i=1}^k f(s_i) \cdot \hat{A}(i(s_i)) \\&= \sum_{i=1}^k f(s_i) \cdot A(s_i) \\&= \bar{A}(f).\end{align}
Thus, \(\bar{A}\) is unique.
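The construction is easy to model in code. Below is a hedged Python sketch (all names are ours) that represents formal linear combinations as finitely supported dictionaries and implements the induced map \(\bar{A}\):

```python
from collections import defaultdict

# Formal linear combinations over a set S, modeled as finitely supported
# dicts mapping generators to real coefficients (the functions f above).

def delta(s):
    """The basis element delta_s representing the generator s."""
    return {s: 1.0}

def add(f, g):
    h = defaultdict(float)
    for s, a in list(f.items()) + list(g.items()):
        h[s] += a
    return dict(h)

def scale(a, f):
    return {s: a * c for s, c in f.items()}

def extend(A):
    """The induced linear map A-bar : F(S) -> W, given a set map A : S -> W.

    W is modeled by anything supporting + and scalar * (floats, numpy
    arrays, ...); assumes f has at least one nonzero coefficient.
    """
    def A_bar(f):
        terms = [a * A(s) for s, a in f.items() if a != 0]
        total = terms[0]
        for t in terms[1:]:
            total = total + t
        return total
    return A_bar

# Example: S = {"x", "y"}, with A sending generators to real numbers.
f = add(scale(2.0, delta("x")), scale(-3.0, delta("y")))  # 2·x - 3·y
A = {"x": 10.0, "y": 1.0}.get
print(extend(A)(f))  # 2*10 - 3*1 = 17.0
```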
Finally, after specifying generators and generating the free vector space, we compute a quotient of the free vector space that identifies elements according to the relations we require.
Recall that, given a vector space \(V\) and a subspace \(N\), we can define an equivalence relation \(\sim\) on \(V\) such that \(v \sim w\) whenever \(v - w \in N\). In particular, every element \(n \in N\) satisfies \(n \sim 0\), since \(n - 0 \in N\); it follows that the equivalence class \([0]\) of \(0\) is equal to \(N\). We construct the quotient space \(V/N\) by mapping every element \(v \in V\) to its equivalence class \([v] \in V/N\); moreover, this mapping should be linear (i.e. \([v + w] = [v] + [w]\) and \([av] = a[v]\)). It follows that
\[[v] = \{v + n : n \in N\},\]
since \([v + n] = [v] + [n] = [v] + [0] = [v]\). We can define a vector space structure on \(V/N\) as follows:
- \([v] + [w] = [v + w]\),
- \(a[v] = [av]\).
Note that the quotient vector space satisfies the following universal property: given any linear map \(A : V \rightarrow W\) such that \(Av = Av'\) whenever \(v \sim v'\), there exists a unique linear map \(u : V/N \rightarrow W\) such that \(u([v]) = Av\) for all \(v \in V\). Indeed, the formula \(u([v]) = Av\) defines such a map, and it is well-defined: whenever \(v\) and \(v'\) belong to the same equivalence class, i.e. \(v \sim v'\) and hence \([v] = [v']\), we have \(Av = Av'\), and so \(u([v]) = u([v'])\).
We define a subspace \(\mathcal{R}\) of \(\mathcal{F}(V_1 \times \dots \times V_k)\) which is spanned by elements of the following forms:
\[i(v_1, \dots,v_i + v'_i, \dots, v_k) - i(v_1,\dots,v_i,\dots,v_k) - i(v_1,\dots,v'_i,\dots,v_k),\]
\[i(v_1,\dots,av_i,\dots,v_k) - a \cdot i(v_1,\dots,v_i,\dots,v_k).\]
We can now define the tensor product of vector spaces:
\[V_1 \otimes \dots \otimes V_k = \mathcal{F}(V_1 \times \dots \times V_k)/\mathcal{R}.\]
We can also now define the tensor product of vectors:
\[v_1 \otimes \dots \otimes v_k = [i(v_1,\dots,v_k)].\]
This means that the following relations are satisfied:
\begin{align}v_1 \otimes \dots \otimes (v_i + v'_i) \otimes \dots \otimes v_k &= [i(v_1,\dots,v_i+v'_i,\dots,v_k)] \\&= [i(v_1,\dots,v_i,\dots,v_k) + i(v_1,\dots,v'_i,\dots,v_k)] \\&= [i(v_1,\dots,v_i,\dots,v_k)] + [i(v_1,\dots,v'_i,\dots,v_k)] \\&= (v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k) + (v_1 \otimes \dots \otimes v'_i \otimes \dots \otimes v_k),\end{align}
\begin{align}v_1 \otimes \dots \otimes av_i \otimes \dots \otimes v_k &= [i(v_1,\dots,av_i,\dots,v_k)] \\&= [a \cdot i(v_1,\dots,v_i,\dots,v_k)] \\&= a[i(v_1,\dots,v_i,\dots,v_k)] \\&= a(v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k).\end{align}
Thus, we have succeeded in constructing the tensor product space using this new construction.
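In the familiar concrete model where \(v \otimes w\), for \(v \in \mathbb{R}^m\) and \(w \in \mathbb{R}^n\), is the outer product \(vw^T\), these defining relations become ordinary matrix identities, which we can verify numerically (a NumPy sketch):

```python
import numpy as np

# In the concrete model V1 ⊗ V2 ≅ R^{m x n}, the pure tensor v ⊗ w
# is the outer product, and the defining relations become identities.
rng = np.random.default_rng(0)
v, v_prime, w = rng.normal(size=3), rng.normal(size=3), rng.normal(size=4)
a = 2.5

lhs1 = np.outer(v + v_prime, w)
rhs1 = np.outer(v, w) + np.outer(v_prime, w)
print(np.allclose(lhs1, rhs1))  # True: additivity in the first slot

lhs2 = np.outer(a * v, w)
rhs2 = a * np.outer(v, w)
print(np.allclose(lhs2, rhs2))  # True: homogeneity in the first slot

# The quotient is essential: most elements of R^{3x4} are sums of
# outer products but not outer products themselves (rank > 1).
```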
Universal Property of Tensor Product Spaces
We now seek to characterize tensor product spaces using a universal property.
Note that, for any linear map \(\bar{A} : V_1 \otimes \dots \otimes V_k \rightarrow W\)
\begin{align}\bar{A}(v_1 \otimes \dots \otimes (v_i + v'_i) \otimes \dots \otimes v_k) &= \bar{A}((v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k) + (v_1 \otimes \dots \otimes v'_i \otimes \dots \otimes v_k)) \\&= \bar{A}(v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k) + \bar{A}(v_1 \otimes \dots \otimes v'_i \otimes \dots \otimes v_k),\end{align}
and
\begin{align}\bar{A}(v_1 \otimes \dots \otimes av_i \otimes \dots \otimes v_k) &= \bar{A}(a(v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k)) \\&= a \bar{A}(v_1 \otimes \dots \otimes v_i \otimes \dots \otimes v_k).\end{align}
This is precisely analogous to the defining properties of multilinear maps, with tensor products in place of tuples and linear maps in place of multilinear maps. In fact, this is the entire point of the construction of the tensor product space: multilinear maps can be converted to corresponding linear maps. The tensor product space is constructed in such a manner that it internalizes the notion of multilinearity, i.e. the internal relations among its elements mirror the external relations among multilinear maps. The tensor product space "reflects" the properties of multilinear maps. In fact, using the concrete definition of the tensor product space, the elements actually are multilinear maps.
This means that we expect the tensor product space to satisfy the following universal property: for every multilinear map \(A : V_1 \times \dots \times V_k \rightarrow W\), there exists a unique linear map \(\bar{A} : V_1 \otimes \dots \otimes V_k \rightarrow W\) such that \(\bar{A} \circ \pi = A\), where \(\pi : V_1 \times \dots \times V_k \rightarrow V_1 \otimes \dots \otimes V_k\) is the projection map, i.e. the mapping \((v_1,\dots,v_k) \mapsto v_1 \otimes \dots \otimes v_k\).
Thus, tensor products permit us to perform multilinear algebra in the category of vector spaces and linear maps by expressing multilinear maps as linear maps.
There is another way to arrive at this same universal property. First, we define the internal hom space ("hom" abbreviates "homomorphism") \([V,W]\). This is the vector space consisting of all linear maps between \(V\) and \(W\). It has a natural vector space structure under the following operations:
- \((A + B)(v) = A(v) + B(v)\),
- \((sA)(v) = s(A(v))\).
This space does not satisfy the universal property of the exponential object in the category of vector spaces, however, which would require linear maps \(U \times V \rightarrow W\) to be equivalent to linear maps \(U \rightarrow [V,W]\); instead, linear maps \(U \rightarrow [V,W]\) correspond to bilinear maps \(U \times V \rightarrow W\). However, this can be rectified by using the tensor product in place of the product. Thus, there is a natural isomorphism
\[\mathrm{Hom}(U \otimes V, W) \cong \mathrm{Hom}(U, [V, W]).\]
This is an example of an adjoint functor: the functor \(- \otimes V\) is left adjoint to the functor \([V,-]\). Another way to analyze adjoint functors is via the unit of this adjunction, which expresses the following universal property: there exists a map \(\eta_U : U \rightarrow [V,U \otimes V]\) such that, for any linear map \(A : U \rightarrow [V,W]\), there exists a unique linear map \(\bar{A} : U \otimes V \rightarrow W\) such that \([V,\bar{A}] \circ \eta_U = A\).
Due to the correspondence between linear maps into hom spaces and bilinear maps out of product spaces, this is equivalent to the following universal property: there exists a map \(\pi : U \times V \rightarrow U\otimes V\) such that, for any bilinear map \(A : U \times V \rightarrow W\), there exists a unique linear map \(\bar{A} : U \otimes V \rightarrow W\) such that \(\bar{A} \circ \pi = A\). This is precisely the same as the original universal property that we deduced.
Now we want to prove that the abstract tensor product spaces satisfy this universal property. Let \(A : V_1 \times \dots \times V_k \rightarrow W\) be any multilinear map. Let \(i : V_1 \times \dots \times V_k \rightarrow \mathcal{F}(V_1 \times \dots \times V_k)\) be the inclusion of generators. By the universal property of the free vector space \(\mathcal{F}(V_1 \times \dots \times V_k)\), there exists a unique linear map \(\bar{A} : \mathcal{F}(V_1 \times \dots \times V_k) \rightarrow W\) such that \(\bar{A}(i(v_1,\dots,v_k)) = A(v_1,\dots,v_k)\) for all \((v_1,\dots,v_k) \in V_1 \times \dots \times V_k\). Next, note the following:
\begin{align}\bar{A}(i(v_1,\dots,av_i,\dots,v_k)) &= A(v_1,\dots,av_i,\dots,v_k) \\&= aA(v_1,\dots,v_i,\dots,v_k) \\&= a\bar{A}(i(v_1,\dots,v_i,\dots,v_k)) \\&= \bar{A}(a \cdot i(v_1,\dots,v_i,\dots,v_k)),\end{align}
and
\begin{align}\bar{A}(i(v_1,\dots,v_i + v_i',\dots,v_k)) &= A(v_1,\dots,v_i + v_i',\dots,v_k) \\&= A(v_1,\dots,v_i,\dots,v_k) + A(v_1,\dots,v_i',\dots,v_k) \\&= \bar{A}(i(v_1,\dots,v_i,\dots,v_k)) + \bar{A}(i(v_1,\dots,v_i',\dots,v_k)).\end{align}
Thus, \(\bar{A}\) takes the same value on elements that are equivalent modulo the subspace \(\mathcal{R}\), and so, by the universal property of the quotient space \(V_1 \otimes \dots \otimes V_k = \mathcal{F}(V_1 \times \dots \times V_k)/\mathcal{R}\), there exists a unique linear map \(\tilde{A} : V_1 \otimes \dots \otimes V_k \rightarrow W\) such that \(\tilde{A} \circ \Pi = \bar{A}\), where \(\Pi : \mathcal{F}(V_1 \times \dots \times V_k) \rightarrow V_1 \otimes \dots \otimes V_k\) is the projection map. Note that the projection map \(\pi : V_1 \times \dots \times V_k \rightarrow V_1 \otimes \dots \otimes V_k\) is defined such that \(\pi = \Pi \circ i\). Then, since \(\bar{A} \circ i = A\), this means that \(\tilde{A} \circ \pi = A\), as required. Every element of \(V_1 \otimes \dots \otimes V_k\) can be expressed as a linear combination of pure tensor products \(v_1 \otimes \dots \otimes v_k\), and
\[\tilde{A}(v_1 \otimes \dots \otimes v_k) = \bar{A}(i(v_1,\dots,v_k)) = A(v_1,\dots,v_k).\]
This means that \(\tilde{A}\) is unique, since any other similar mapping must also map \(v_1 \otimes \dots \otimes v_k \mapsto A(v_1,\dots,v_k)\).
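Concretely, for bilinear maps on finite-dimensional spaces, the universal property says that a bilinear map \(A(v,w) = v^T M w\) factors through a linear functional on the tensor product space whose coordinates are the entries of \(M\). A small NumPy sketch (modeling \(\mathbb{R}^2 \otimes \mathbb{R}^3\) as \(\mathbb{R}^6\) via flattening, an identification of our choosing):

```python
import numpy as np

# A bilinear map A(v, w) = v^T M w on R^2 x R^3 corresponds to a linear
# functional A-tilde on R^2 ⊗ R^3 ≅ R^6 (flattened outer products).
rng = np.random.default_rng(1)
M = rng.normal(size=(2, 3))

A = lambda v, w: v @ M @ w                 # the bilinear map
pi = lambda v, w: np.outer(v, w).ravel()   # (v, w) -> v ⊗ w
A_tilde = M.ravel()                        # coordinates of the induced linear functional

v, w = rng.normal(size=2), rng.normal(size=3)
print(np.isclose(A(v, w), A_tilde @ pi(v, w)))  # True: A = A-tilde ∘ pi
```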
Tensor Product Space Basis
The basis for an abstract tensor product space \(V_1 \otimes \dots \otimes V_k\) is, in analogy with the basis for \(L(V_1,\dots,V_k;\mathbb{R})\), the set consisting of all possible abstract tensor products of the respective basis vectors of each vector space. Given a basis \((E^i_1,\dots,E^i_{n_i})\) (where \(\mathrm{dim}(V_i) = n_i\)) for each vector space \(V_i\), the basis is the set
\[\mathcal{B} = \{ E^1_{i_1} \otimes \dots \otimes E^k_{i_k} : 1 \le i_1 \le n_1, \dots, 1 \le i_k \le n_k\}.\]
First, recall that, by definition, the tensor product space is spanned by "pure" elements of the form
\[v_1 \otimes \dots \otimes v_k\]
where \(v_i \in V_i\). Consider the expansion of such an element in terms of the respective bases:
\begin{align}v_1 \otimes \dots \otimes v_k &= v_1^{i_1}E^1_{i_1} \otimes \dots \otimes v_k^{i_k}E^k_{i_k} \\&= v_1^{i_1} \dots v_k^{i_k}\,E^1_{i_1} \otimes \dots \otimes E^k_{i_k}.\end{align}
Thus, each "pure" element is a linear combination of tensor products of the respective basis vectors, and hence so is every element.
Next, we need to confirm that the vectors in \(\mathcal{B}\) are linearly independent, so suppose that
\[a^{i_1 \dots i_k}E^1_{i_1} \otimes \dots \otimes E^k_{i_k} = 0.\]
Define a family of multilinear maps, one for each tuple \(j_1,\dots,j_k\) of indices, using the dual bases \((\varepsilon_j^i)\) introduced above:
\[\mu^{j_1 \dots j_k}(v_1,\dots,v_k) = \varepsilon_1^{j_1}(v_1) \dots \varepsilon_k^{j_k}(v_k).\]
By the universal property of tensor product spaces, each of these multilinear maps corresponds to a unique linear map \(\tilde{\mu}^{j_1 \dots j_k} : V_1 \otimes \dots \otimes V_k \rightarrow \mathbb{R}\) such that
\[\tilde{\mu}^{j_1 \dots j_k}(v_1 \otimes \dots \otimes v_k) = \mu^{j_1 \dots j_k}(v_1,\dots,v_k).\]
Thus:
\begin{align}0 &= \tilde{\mu}^{j_1 \dots j_k}(0) \\&= \tilde{\mu}^{j_1 \dots j_k}\left(a^{i_1 \dots i_k}E^1_{i_1} \otimes \dots \otimes E^k_{i_k}\right) \\&= a^{i_1 \dots i_k}\tilde{\mu}^{j_1 \dots j_k}\left(E^1_{i_1} \otimes \dots \otimes E^k_{i_k}\right) \\&= a^{i_1 \dots i_k} \mu^{j_1 \dots j_k}\left(E^1_{i_1}, \dots, E^k_{i_k}\right) \\&= a^{i_1 \dots i_k} \varepsilon_1^{j_1}(E^1_{i_1}) \dots \varepsilon_k^{j_k}(E^k_{i_k}) \\&= a^{i_1 \dots i_k} \delta^{j_1}_{i_1} \dots \delta^{j_k}_{i_k} \\&= a^{j_1 \dots j_k}.\end{align}
This demonstrates that, for any choice of indices \(j_1,\dots,j_k\), \(a^{j_1 \dots j_k} = 0\). Thus, \(\mathcal{B}\) is a linearly independent set and hence a basis.
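In coordinates, the abstract tensor products of basis vectors can be modeled by Kronecker products, and their linear independence checked directly; a brief NumPy sketch (the identification of \(\mathbb{R}^2 \otimes \mathbb{R}^3\) with \(\mathbb{R}^6\) is again our choice):

```python
import numpy as np

# Basis of R^2 ⊗ R^3 ≅ R^6: the six Kronecker products of basis vectors.
E1 = np.eye(2)  # basis of V1
E2 = np.eye(3)  # basis of V2

basis = [np.kron(E1[i], E2[j]) for i in range(2) for j in range(3)]
B = np.stack(basis)
print(B.shape)                   # (6, 6)
print(np.linalg.matrix_rank(B))  # 6: linearly independent, hence a basis
```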
Abstract vs. Concrete Tensors
Now we want to relate the abstract definition of tensor products and tensor product spaces to the concrete definition. For any vector spaces \(V_1,\dots,V_k\), we will use the universal property of tensor product spaces to convert back and forth between abstract and concrete tensor product spaces.
First, we define a map \(\varphi : V_1^* \times \dots \times V_k^* \rightarrow L(V_1,\dots,V_k;\mathbb{R})\) as follows:
\[\varphi(\omega^1,\dots,\omega^k)(v_1,\dots,v_k) = \omega^1(v_1)\dots\omega^k(v_k).\]
Thus the map \(\varphi\) maps a tuple of covectors to the tensor product of the covectors. Since the tensor product of covectors is multilinear, the map \(\varphi\) is multilinear. Thus, by the universal property of tensor product spaces, this map extends to a linear map \(\bar{\varphi} : V_1^* \otimes \dots \otimes V_k^* \rightarrow L(V_1,\dots,V_k;\mathbb{R})\) which satisfies \(\bar{\varphi} \circ \pi = \varphi\), which means that
\[\bar{\varphi}(\omega^1 \otimes \dots \otimes \omega^k)(v_1,\dots,v_k) = \omega^1(v_1)\dots\omega^k(v_k).\]
Thus, the map \(\bar{\varphi}\) maps abstract tensor products to concrete tensor products of covectors. This only defines the action of \(\bar{\varphi}\) on pure tensors, i.e. those of the form \(\omega^1 \otimes \dots \otimes \omega^k\), but this is sufficient, since such tensors span the tensor product space. Another way to view this is to examine the effect of the mapping on basis vectors:
\begin{align}\bar{\varphi}(\varepsilon_1^{i_1} \otimes \dots \otimes \varepsilon_k^{i_k})(v_1,\dots,v_k) &= \varepsilon_1^{i_1}(v_1) \dots \varepsilon_k^{i_k}(v_k).\end{align}
The mapping likewise maps basis vectors of the abstract tensor product space to basis vectors of the concrete tensor product space, and is thus an isomorphism.
Thus, we may identify the abstract and concrete tensor product spaces:
\[V_1^* \otimes \dots \otimes V_k^* \cong L(V_1,\dots,V_k;\mathbb{R}).\]
Likewise, each vector space \(V_i\) is naturally isomorphic to its second dual space \(V_i^{**}\), and thus
\[V_1 \otimes \dots \otimes V_k \cong V_1^{**} \otimes \dots \otimes V_k^{**},\]
which means that
\[V_1 \otimes \dots \otimes V_k \cong L(V_1^*,\dots,V_k^*;\mathbb{R}).\]
Thus, due to this isomorphism, we may use whichever representation is convenient for our purposes.
In general, for any vector spaces \(V_1,\dots,V_k\) and \(V'_1,\dots,V'_k\) such that \(V_i \cong V'_i\), it follows that \(V_1 \otimes \dots \otimes V_k \cong V'_1 \otimes \dots \otimes V'_k\). To see this, denote each isomorphism as \(\varphi_i : V_i \rightarrow V'_i\) and define a multilinear map \(\varphi : V_1 \times \dots \times V_k \rightarrow V'_1 \otimes \dots \otimes V'_k\) as follows:
\[\varphi(v_1,\dots,v_k) = \varphi_1(v_1) \otimes \dots \otimes \varphi_k(v_k).\]
Note that this map is indeed multilinear, since it is the component-wise composition of linear functions. By the universal property of tensor products, there exists a linear map \(\tilde{\varphi} : V_1 \otimes \dots \otimes V_k \rightarrow V'_1 \otimes \dots \otimes V'_k\) such that
\[\tilde{\varphi}(v_1 \otimes \dots \otimes v_k) = \varphi_1(v_1) \otimes \dots \otimes \varphi_k(v_k).\]
Given any basis \((E^j_{i_j})\) for \(V_j\), the map \(\tilde{\varphi}\) maps the basis vectors \(E^1_{i_1} \otimes \dots \otimes E^k_{i_k}\) to corresponding basis vectors for \(V'_1 \otimes \dots \otimes V'_k\), since each map \(\varphi_j\) is an isomorphism and maps basis vectors to basis vectors. Thus, the map \(\tilde{\varphi}\) is an isomorphism.
Tensor product spaces are fully associative, which means that, for any vector spaces \(V_1,V_2,V_3\),
\[V_1 \otimes (V_2 \otimes V_3) \cong V_1 \otimes V_2 \otimes V_3 \cong (V_1 \otimes V_2) \otimes V_3.\]
To see this, define a multilinear map \(\varphi : V_1 \times V_2 \times V_3 \rightarrow (V_1 \otimes V_2) \otimes V_3\) as follows:
\[\varphi(v_1,v_2,v_3) = (v_1 \otimes v_2) \otimes v_3.\]
By the universal property, there is a unique linear map \(\tilde{\varphi} : V_1 \otimes V_2 \otimes V_3 \rightarrow (V_1 \otimes V_2) \otimes V_3\) such that
\[\tilde{\varphi}(v_1 \otimes v_2 \otimes v_3) = (v_1 \otimes v_2) \otimes v_3.\]
Since elements of the form \((v_1 \otimes v_2) \otimes v_3\) span \((V_1 \otimes V_2) \otimes V_3\), the map \(\tilde{\varphi}\) is surjective, and hence, since \(\mathrm{dim}(V_1 \otimes V_2 \otimes V_3) = \mathrm{dim}((V_1 \otimes V_2) \otimes V_3)\), the map is also an isomorphism. The other isomorphism can be proved in a similar manner. Thus, we do not need to specify parentheses when writing tensor product spaces and may freely associate.
The tensor product of multilinear functionals \(\alpha \in L(V_1,\dots,V_k;\mathbb{R})\) and \(\beta \in L(W_1,\dots,W_l;\mathbb{R})\) is a multilinear map \(\varphi : L(V_1,\dots,V_k;\mathbb{R}) \times L(W_1,\dots,W_l;\mathbb{R}) \rightarrow L(V_1,\dots,V_k,W_1,\dots,W_l;\mathbb{R})\) defined such that \(\varphi(\alpha, \beta)(v_1,\dots,v_k,w_1,\dots,w_l) = \alpha(v_1,\dots,v_k)\beta(w_1,\dots,w_l)\), and hence descends to a unique linear map \(\tilde{\varphi} : L(V_1,\dots,V_k;\mathbb{R}) \otimes L(W_1,\dots,W_l;\mathbb{R}) \rightarrow L(V_1,\dots,V_k,W_1,\dots,W_l;\mathbb{R})\) such that \(\tilde{\varphi}(\alpha \otimes \beta)(v_1,\dots,v_k,w_1,\dots,w_l) = \alpha(v_1,\dots,v_k)\beta(w_1,\dots,w_l)\). Given the isomorphisms \(V_1^* \otimes \dots \otimes V_k^* \cong L(V_1,\dots,V_k;\mathbb{R})\) and \(W_1^* \otimes \dots \otimes W_l^* \cong L(W_1,\dots,W_l;\mathbb{R})\), it follows that \((V_1^* \otimes \dots \otimes V_k^*) \otimes (W_1^* \otimes \dots \otimes W_l^*) \cong L(V_1,\dots,V_k;\mathbb{R}) \otimes L(W_1,\dots,W_l;\mathbb{R})\). Composing the latter isomorphism with the linear map \(\tilde{\varphi}\), we see that this composite map takes abstract tensor products \(\alpha \otimes \beta\) to concrete tensor products, i.e.
\[\alpha \otimes \beta \mapsto \left((v_1,\dots,v_k,w_1,\dots,w_l) \mapsto \alpha(v_1,\dots,v_k)\beta(w_1,\dots,w_l)\right).\]
Thus, we can relate abstract tensor products to concrete tensor products. The relation between "pure" abstract tensor products and tensor products of covectors already implies this fact since every tensor is a linear combination of "pure" tensor products. However, this provides another avenue for understanding the correspondence.
Tensors and Linear Maps
The tensor product \(V^* \otimes W\) is naturally isomorphic to the internal hom \([V,W]\). This means that all linear maps can be expressed in terms of tensor products.
This isomorphism is witnessed by the map \(\varphi : V^* \otimes W \rightarrow [V,W]\) defined as follows:
\[\varphi(\omega \otimes w)(v) = \omega(v) \cdot w.\]
Given a basis \((E_1,\dots,E_k)\) for \(V\) and the corresponding dual basis \((\varepsilon^1,\dots,\varepsilon^k)\) for \(V^*\), the inverse map \(\varphi^{-1} : [V,W] \rightarrow V^* \otimes W\) is defined as follows:
\[\varphi^{-1}(A) =\varepsilon^i \otimes A(E_i).\]
Observe that
\begin{align}\varphi(\varphi^{-1}(A))(v) &= \varphi(\varepsilon^i \otimes A(E_i))(v) \\&= \varepsilon^i(v) \cdot A(E_i) \\&= v^i \cdot A(E_i) \\&= A(v^i \cdot E_i) \\&= A(v),\end{align}
and
\begin{align}\varphi^{-1}(\varphi(\omega \otimes w)) &= \varepsilon^i \otimes \varphi(\omega \otimes w)(E_i) \\&= \varepsilon^i \otimes \omega(E_i) \cdot w \\&=\omega(E_i) \cdot (\varepsilon^i \otimes w) \\&= (\omega(E_i) \cdot \varepsilon^i) \otimes w \\&= \omega \otimes w.\end{align}
Since elements of the form \(\omega \otimes w\) span \(V^* \otimes W\), this is sufficient. Let's consider the same calculation, this time using basis vectors. Each tensor \(\alpha\) is represented as \(\alpha = \alpha^i_j \cdot \varepsilon^j \otimes E_i\) in the basis, so we compute the following:
\begin{align}\varphi^{-1}(\varphi(\alpha)) &= \varphi^{-1}(\varphi(\alpha^i_j \cdot (\varepsilon^j \otimes E_i))) \\&= \varepsilon^k \otimes \varphi(\alpha^i_j \cdot \varepsilon^j \otimes E_i)(E_k) \\&= \varepsilon^k \otimes (\alpha^i_j \cdot \varphi(\varepsilon^j \otimes E_i)(E_k)) \\&= \varepsilon^k \otimes \alpha^i_j \cdot (\varepsilon^j(E_k) \cdot E_i) \\&= \varepsilon^k \otimes (\alpha^i_j \cdot \varepsilon^j(E_k) \cdot E_i) \\&= \varepsilon^k \otimes (\alpha^i_j \cdot \delta^j_k \cdot E_i) \\&= \varepsilon^k \otimes (\alpha^i_k \cdot E_i) \\&= \alpha^i_k \cdot \varepsilon^k \otimes E_i \\&=\alpha^i_j \cdot \varepsilon^j \otimes E_i.\end{align}
Thus, this is an isomorphism.
For instance, consider the Cauchy stress tensor, a concept from physics. According to one definition, this is a linear map \(\sigma : \mathbb{R}^3 \rightarrow \mathbb{R}^3\) that maps a vector \(n \in \mathbb{R}^3\) which is normal to a surface embedded in \(\mathbb{R}^3\) to its corresponding traction vector. At first glance, this does not appear to be a tensor since it isn't expressed as a multilinear map, etc. However, any linear map with signature \(\mathbb{R}^3 \rightarrow \mathbb{R}^3\) is equivalent to a tensor in the tensor product space \((\mathbb{R}^3)^* \otimes \mathbb{R}^3\). In fact, we can calculate the tensor as \(e^i \otimes \sigma(e_i)\).
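In coordinates, the formula \(\varphi^{-1}(A) = \varepsilon^i \otimes A(E_i)\) says that a matrix is the sum of the outer products of its columns with the corresponding dual basis vectors; a quick NumPy check (illustrative only):

```python
import numpy as np

# phi^{-1}(A) = sum_i eps^i ⊗ A(E_i): in coordinates, a matrix A is the
# sum of outer products of its columns with the dual basis (row) vectors.
rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))  # e.g. a stress tensor sigma : R^3 -> R^3

E = np.eye(3)  # columns are the basis E_i; rows are the dual basis eps^i
reconstruction = sum(np.outer(A @ E[:, i], E[i, :]) for i in range(3))
print(np.allclose(A, reconstruction))  # True
```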
Note that, since \(V^* \otimes W \cong [V,W]\), this means that the adjunction
\[\mathrm{Hom}(U \otimes V, W) \cong \mathrm{Hom}(U, [V,W])\]
implies the adjunction
\[\mathrm{Hom}(U \otimes V, W) \cong \mathrm{Hom}(U, V^* \otimes W),\]
and indeed any universal property satisfied by one is automatically satisfied by the other. Given a definition for the tensor product space, this provides a universal property that defines the dual vector space.
Tensors on a Vector Space
Often, tensors are expressed with respect to a single vector space \(V\) of interest. For instance, when discussing smooth manifolds, tensors are usually expressed with respect to the tangent space at a point. A covariant \(k\)-tensor is an element of the \(k\)-fold tensor product \(V^* \otimes \dots \otimes V^* \). A contravariant \(k\)-tensor is an element of the \(k\)-fold tensor product \(V \otimes \dots \otimes V\). In any case, the number \(k\) is called the rank of a tensor.
The notation
\[T^k(V^*) = \underbrace{V^* \otimes \dots \otimes V^*}_{k~\text{times}}\]
is used for the vector space of all covariant \(k\)-tensors, and likewise the notation
\[T^k(V) = \underbrace{V\otimes \dots \otimes V}_{k~\text{times}}\]
is used for the vector space of all contravariant \(k\)-tensors.
By convention, a \(0\)-tensor is just a real number. A \(1\)-tensor is the same as a covector. Thus, tensors generalize covectors by extending them to arbitrary rank. The space of mixed tensors on \(V\) of type \((k,l)\) is the space
\[T^{(k,l)} =\underbrace{V \otimes \dots \otimes V}_{k~\text{times}} \otimes \underbrace{V^* \otimes \dots \otimes V^*}_{l~\text{times}}.\]
Thus, given a basis \((E_1,\dots,E_n)\) for \(V\) with \(\mathrm{dim}(V) = n\) and the corresponding dual basis \((\varepsilon^1,\dots,\varepsilon^n)\) for \(V^*\), \(T^k(V^*)\) has the basis
\[\{\varepsilon^{i_1} \otimes \dots \otimes \varepsilon^{i_k} : 1 \leq i_1, \dots, i_k \le n\}.\]
Any such covariant tensor \(\alpha\) is expressed in terms of components as
\[\alpha = \alpha_{i_1 \dots i_k} \varepsilon^{i_1} \otimes \dots \otimes \varepsilon^{i_k},\]
where the components are determined by
\[\alpha_{i_1 \dots i_k} = \alpha(E_{i_1}, \dots, E_{i_k}).\]
Likewise, \(T^k(V)\) has the basis
\[\{E_{i_1} \otimes \dots \otimes E_{i_k} : 1 \leq i_1, \dots, i_k \le n\}.\]
Such a contravariant tensor \(\alpha\) is expressed in terms of components as
\[\alpha = \alpha^{i_1 \dots i_k } E_{i_1} \otimes \dots \otimes E_{i_k}.\]
Furthermore, \(T^{(k,l)}\) has the basis
\[\{E_{i_1} \otimes \dots \otimes E_{i_k} \otimes \varepsilon^{j_1} \otimes \dots \otimes \varepsilon^{j_l} : 1 \leq i_1, \dots, i_k,j_1,\dots,j_l \le n\}.\]
Such a mixed tensor \(\alpha\) is expressed in terms of components as
\[\alpha = \alpha^{i_1 \dots i_k }_{j_1 \dots j_l} E_{i_1} \otimes \dots \otimes E_{i_k} \otimes \varepsilon^{j_1} \otimes \dots \otimes \varepsilon^{j_l}.\]
This shows that \(\mathrm{dim}(T^k(V)) = \mathrm{dim}(T^k(V^*)) = n^k\) and \(\mathrm{dim}(T^{(k,l)}) = n^{k+l}.\) For instance, any tensor \(\alpha \in T^2(V^*)\) has \(n^2\) components \((\alpha_{ij})\), that is, \(\alpha = \alpha_{ij}\, \varepsilon^i \otimes \varepsilon^j\), and these components are often expressed as an \(n \times n\) matrix.
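For example, the components of a covariant \(2\)-tensor can be extracted and used to reconstruct the tensor; a short NumPy sketch using the dot product on \(\mathbb{R}^3\) (names are ours):

```python
import numpy as np

# A covariant 2-tensor alpha on R^3 and its n x n component matrix
# alpha_ij = alpha(E_i, E_j); for the dot product, alpha_ij = delta_ij.
alpha = lambda v, w: float(np.dot(v, w))
E = np.eye(3)

components = np.array([[alpha(E[i], E[j]) for j in range(3)] for i in range(3)])
print(components)  # the 3 x 3 identity matrix

# Reconstruct alpha from its components: alpha(v, w) = alpha_ij v^i w^j.
rng = np.random.default_rng(3)
v, w = rng.normal(size=3), rng.normal(size=3)
print(np.isclose(alpha(v, w), v @ components @ w))  # True
```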
Symmetric Tensors
A covariant \(k\)-tensor \(\alpha\) on a vector space \(V\) is symmetric if its value does not change when any two of its arguments are interchanged, that is
\[\alpha(v_1,\dots,v_i,\dots,v_j,\dots,v_k) = \alpha(v_1,\dots,v_j,\dots,v_i, \dots,v_k)\]
for \(1 \le i \lt j \le k\).
This is equivalent to stating that the value of \(\alpha\) is unchanged under any permutation of its arguments, since every permutation is the composition of interchanges. To see this, define a permutation on the set of indices \(I_k = \{1,\dots,k\}\) to be an automorphism of \(I_k\), i.e. a function \(\sigma : I_k \rightarrow I_k\) such that there exists an inverse function \(\sigma^{-1} : I_k \rightarrow I_k\) and \(\sigma \circ \sigma^{-1} = \sigma^{-1} \circ \sigma = \mathrm{Id}_{I_k}\). The set of all such permutations comprises a group \(S_k\), the symmetric group on \(k\) elements.
Then, an interchange of indexes \(i\) and \(j\) is a special automorphism \(\tau_{i,j}\) defined as follows:
\[\tau_{i,j}(m) = \begin{cases}j, &\text{ if } m = i, \\ i, & \text{ if } m = j, \\ m, & \text{ otherwise.}\end{cases}\]
Note that \(\tau_{i,j} = \tau_{j,i}\) and \(\tau_{i,j} \circ \tau_{i,j} = \mathrm{Id}_{I_k}\). Now, we proceed by induction on the size of the index set \(I_k\). For \(I_0 = \emptyset\), the only permutation is the identity (the empty function), which is the composition of the empty sequence of interchanges. Next, suppose that, for some natural number \(k\), every permutation on \(I_k\) is the composition of a finite sequence of interchanges, and consider an automorphism \(\sigma : I_{k+1} \rightarrow I_{k+1}\) (if \(\sigma(k+1) = k+1\), then \(\tau_{k+1,\sigma(k+1)}\) below is the identity and may simply be omitted). Note that \(\tau_{k+1,\sigma(k+1)} \circ \sigma\) is again an automorphism of \(I_{k+1}\), since it has inverse \(\sigma^{-1} \circ \tau_{k+1,\sigma(k+1)}\). Now, \((\tau_{k+1,\sigma(k+1)} \circ \sigma)(k+1) = k+1\), i.e. \(k+1\) is a fixed point, which implies that its restriction to \(I_k\) is an automorphism of \(I_k\), since automorphisms are injective and thus the only index that maps to \(k+1\) is \(k+1\) itself. By the inductive hypothesis, this restriction is the composition of a finite sequence of interchanges, and thus so is \(\tau_{k+1,\sigma(k+1)} \circ \sigma\), since \(k+1\) is a fixed point and hence no additional interchanges are performed. Finally, \(\sigma = \tau_{k+1,\sigma(k+1)} \circ \tau_{k+1,\sigma(k+1)} \circ \sigma\), and hence \(\sigma\) is the composition of a finite sequence of interchanges.
Finally, note that the components of any covariant tensor are computed as
\[\alpha_{i_1\dots i_k} = \alpha(E_{i_1},\dots,E_{i_k})\]
for a given basis \((E_1,\dots,E_n)\) of the vector space \(V\). But, since any permutation of the arguments does not change the value of \(\alpha(E_{i_1},\dots,E_{i_k})\), the components of a symmetric tensor are likewise unaffected by permutations of their indices.
The set of symmetric, covariant \(k\)-tensors on a vector space \(V\) is denoted \(\Sigma^k(V^*)\). This set is a vector space under the usual operations of addition and scalar multiplication of tensors. It is closed under these operations since the sum of two symmetric tensors is a symmetric tensor and a scalar multiple of a symmetric tensor is a symmetric tensor. This space is a linear subspace of \(T^k(V^*)\). Given a \(k\)-tensor \(\alpha\) and a permutation \(\sigma \in S_k\), we can define a new \(k\)-tensor \(\sigma * \alpha\) as follows:
\[(\sigma * \alpha)(v_1,\dots,v_k) = \alpha\left(v_{\sigma(1)},\dots,v_{\sigma(k)}\right).\]
Note that \(\tau * (\sigma * \alpha) = (\tau \circ \sigma) * \alpha.\) A covariant \(k\)-tensor \(\alpha\) is symmetric if and only if \(\sigma * \alpha = \alpha\) for all \(\sigma \in S_k\). We can define a projection \(\mathrm{Sym} : T^k(V^*) \rightarrow \Sigma^k(V^*)\), called symmetrization, as follows:
\[\mathrm{Sym}(\alpha) = \frac{1}{k!} \sum_{\sigma \in S_k} \sigma * \alpha.\]
Note that there are exactly \(k!\) elements of \(S_k\).
The symmetrization therefore acts on vectors as follows:
\[\mathrm{Sym}(\alpha)(v_1,\dots,v_k) = \frac{1}{k!} \sum_{\sigma \in S_k} \alpha(v_{\sigma(1)},\dots,v_{\sigma(k)}).\]
Given any fixed permutation \(\tau \in S_k\), every permutation \(\eta \in S_k\) can be written as the composition \(\tau \circ \sigma\) for some other permutation \(\sigma \in S_k\), namely \(\sigma = \tau^{-1} \circ \eta\). Also, of course, the composition of two permutations is again a permutation. These facts justify the following equations for any \(\alpha \in T^k(V^*)\) and \(\tau \in S_k\):
\begin{align}\mathrm{Sym}(\alpha)(v_{\tau(1)},\dots,v_{\tau(k)}) &= \frac{1}{k!}\sum_{\sigma \in S_k}\alpha\left(v_{\tau(\sigma(1))},\dots,v_{\tau(\sigma(k))}\right) \\&= \frac{1}{k!}\sum_{\sigma \in S_k}((\tau \circ \sigma) * \alpha)(v_1,\dots,v_k) \\&= \frac{1}{k!} \sum_{\eta \in S_k} (\eta * \alpha)(v_1,\dots,v_k) \\&= \mathrm{Sym}(\alpha)(v_1,\dots,v_k).\end{align}
This demonstrates that \(\mathrm{Sym}(\alpha)\) is indeed symmetric, as desired. Also note that, if \(\alpha\) is symmetric, then, since \(\sigma * \alpha = \alpha\), it follows that
\begin{align}\mathrm{Sym}(\alpha) &= \frac{1}{k!}\sum_{\sigma \in S_k} \sigma * \alpha \\&= \frac{1}{k!}\sum_{\sigma \in S_k} \alpha \\&= \frac{1}{k!} \cdot k! \cdot \alpha \\&= \alpha.\end{align}
Conversely, if \(\mathrm{Sym}(\alpha) = \alpha\), then, since \(\mathrm{Sym}(\alpha)\) is symmetric, so is \(\alpha\). Thus \(\alpha\) is symmetric if and only if \(\mathrm{Sym}(\alpha) = \alpha\).
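Symmetrization is straightforward to implement by averaging over all permutations of the arguments; a Python sketch (the helper `sym` is ours):

```python
import itertools
import numpy as np

def sym(alpha, k):
    """Symmetrization of a covariant k-tensor given as a function of k vectors."""
    perms = list(itertools.permutations(range(k)))  # all of S_k, size k!
    def symmetrized(*v):
        return sum(alpha(*(v[s] for s in sigma)) for sigma in perms) / len(perms)
    return symmetrized

# A non-symmetric 2-tensor on R^2 and its symmetrization.
alpha = lambda v, w: float(v[0] * w[1])  # e^1 ⊗ e^2, not symmetric
s = sym(alpha, 2)

v, w = np.array([1.0, 2.0]), np.array([3.0, 4.0])
print(alpha(v, w), alpha(w, v))              # 4.0 vs 6.0: not symmetric
print(s(v, w), s(w, v))                      # 5.0 and 5.0: symmetric
print(np.isclose(sym(s, 2)(v, w), s(v, w)))  # True: Sym fixes symmetric tensors
```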
Note that, given a symmetric tensor \(\alpha \in \Sigma^k(V^*)\) and a symmetric tensor \(\beta \in \Sigma^l(V^*)\), their tensor product is not symmetric in general. Recall that this tensor product is defined as
\[(\alpha \otimes \beta)(v_1,\dots,v_{k+l}) = \alpha(v_1,\dots,v_k)\beta(v_{k+1},\dots,v_{k+l}).\]
In order for the tensor product to be symmetric, permutations across the entire set of indexes \(v_1,\dots,v_{k+l}\) should not affect the computation, but the symmetry of \(\alpha\) and \(\beta\) do not ensure this.
However, we can define a product \(\alpha\beta\) of tensors called the symmetric product, as follows:
\[\alpha\beta = \mathrm{Sym}(\alpha \otimes \beta).\]
The symmetric product therefore acts as follows:
\[\alpha\beta(v_1,\dots,v_{k+l}) = \frac{1}{(k+l)!}\sum_{\sigma \in S_{k+l}}\alpha\left(v_{\sigma(1)},\dots,v_{\sigma(k)}\right)\beta\left(v_{\sigma(k+1)},\dots,v_{\sigma(k+l)}\right).\]
Define a permutation \(\tau \in S_{k+l}\) as follows:
\[\tau(i) = \begin{cases}l + i, & \text{ if } i \le k\\ i - k, & \text{ if } i \gt k.\end{cases}\]
Since every permutation \(\sigma \in S_{k+l}\) can be expressed in the form \(\sigma = (\sigma \circ \tau^{-1}) \circ \tau\), and since every composition \(\sigma \circ \tau\) is an element of \(S_{k+l}\), it follows that \(\{\sigma \circ \tau : \sigma \in S_{k+l}\} = S_{k+l}\). This justifies the following equivalence:
\begin{align}\alpha\beta(v_1,\dots,v_{k+l}) &= \frac{1}{(k+l)!}\sum_{\sigma \in S_{k+l}} \alpha \left(v_{\sigma(1)}, \dots, v_{\sigma(k)} \right)\beta\left(v_{\sigma(k+1)},\dots,v_{\sigma(k+l)}\right) \\&= \frac{1}{(k+l)!}\sum_{\sigma \in S_{k+l}} \alpha \left(v_{\sigma(\tau(1))}, \dots, v_{\sigma(\tau(k))} \right)\beta\left(v_{\sigma(\tau(k+1))},\dots,v_{\sigma(\tau(k+l))}\right) \\&= \frac{1}{(k+l)!}\sum_{\sigma \in S_{k+l}} \alpha \left(v_{\sigma(l+1)}, \dots, v_{\sigma(l+k)} \right)\beta\left(v_{\sigma(1)},\dots,v_{\sigma(l)}\right) \\&= \frac{1}{(l+k)!}\sum_{\sigma \in S_{k+l}} \beta\left(v_{\sigma(1)},\dots,v_{\sigma(l)}\right)\alpha \left(v_{\sigma(l+1)}, \dots, v_{\sigma(l+k)} \right) \\&= \beta\alpha(v_1,\dots,v_{l+k}).\end{align}
Thus, \(\alpha\beta = \beta\alpha\). In the special case where \(\alpha\) and \(\beta\) are covectors, since there are only two permutations in \(S_2\), namely the identity permutation and the permutation \(1 \mapsto 2\) and \(2 \mapsto 1\), we compute
\begin{align}\alpha\beta(v_1,v_2) &= \frac{1}{2!}(\alpha(v_1)\beta(v_2) + \alpha(v_2)\beta(v_1)) \\&= \frac{1}{2}(\alpha(v_1)\beta(v_2) + \beta(v_1)\alpha(v_2)) \\&= \frac{1}{2}((\alpha \otimes \beta)(v_1,v_2) + (\beta \otimes \alpha)(v_1,v_2)) \\&= \frac{1}{2}(\alpha \otimes \beta + \beta \otimes \alpha)(v_1,v_2).\end{align}
Thus, \(\alpha\beta = \frac{1}{2}(\alpha \otimes \beta + \beta \otimes \alpha)\).
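This identity is easy to verify numerically; a brief sketch (repeating the `sym` helper from the earlier sketch so the snippet is self-contained):

```python
import itertools
import numpy as np

def sym(alpha, k):
    perms = list(itertools.permutations(range(k)))
    def symmetrized(*v):
        return sum(alpha(*(v[s] for s in sigma)) for sigma in perms) / len(perms)
    return symmetrized

# Covectors a = e^1 and b = e^2 on R^2; their symmetric product.
a = lambda v: float(v[0])
b = lambda v: float(v[1])
ab = sym(lambda v, w: a(v) * b(w), 2)  # Sym(a ⊗ b)

v, w = np.array([1.0, 2.0]), np.array([3.0, 4.0])
print(ab(v, w))                         # 5.0
print((a(v) * b(w) + b(v) * a(w)) / 2)  # 5.0: equals (a⊗b + b⊗a)/2
```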
Alternating Tensors
A covariant \(k\)-tensor \(\alpha\) on a vector space \(V\) is alternating (also called antisymmetric or skew-symmetric) if it changes sign whenever any two of its arguments are interchanged, that is, for vectors \(v_1,\dots,v_k \in V\) and distinct indices \(i\) and \(j\),
\[\alpha(v_1,\dots,v_i,\dots,v_j,\dots,v_k) = -\alpha(v_1,\dots,v_j,\dots,v_i,\dots,v_k).\]
Note that this is equivalent to stating that, for any transposition \(\tau_{i,j}\) (defined above),
\[\alpha\left(v_{\tau_{i,j}(1)},\dots,v_{\tau_{i,j}(k)}\right) = -\alpha(v_1,\dots,v_k).\]
The vector space of all alternating covariant \(k\)-tensors on \(V\) is denoted \(\Lambda^k(V^*)\), and this vector space is a subspace of \(T^k(V^*)\).
For a permutation \(\sigma \in S_k\), the sign of \(\sigma\), written \(\mathrm{sgn}(\sigma)\), is \(+1\) if \(\sigma\) is equal to the composition of an even number of interchanges, and \(-1\) if \(\sigma\) is equal to the composition of an odd number of interchanges.
An alternating covariant \(k\)-tensor is equivalently defined as a covariant \(k\)-tensor \(\alpha\) on a vector space \(V\) such that, for all \(v_1,\dots,v_k \in V\) and any permutation \(\sigma \in S_k\),
\[\alpha(v_1,\dots,v_k) = \mathrm{sgn}(\sigma) \cdot \alpha\left(v_{\sigma(1)},\dots,v_{\sigma(k)}\right).\]
We previously demonstrated that every permutation \(\sigma \in S_k\) is equal to the composition of a finite sequence of transpositions. Thus, we proceed by induction on the length of the sequence of transpositions. For a sequence of length \(1\), i.e. a single transposition, \(\alpha(v_1,\dots,v_k) = \mathrm{sgn}(\sigma) \cdot \alpha \left(v_{\sigma(1)},\dots,v_{\sigma(k)} \right)\) holds by definition. Suppose that, for all permutations \(\sigma \in S_k\) that are compositions of a sequence of transpositions of length \(n\), \(\alpha(v_1,\dots,v_k) = \mathrm{sgn}(\sigma) \cdot \alpha \left(v_{\sigma(1)},\dots,v_{\sigma(k)}\right)\), and consider a permutation \(\sigma' \in S_k\) of length \(n+1\), i.e. \(\sigma' = \tau_{i,j} \circ \sigma\) for some transposition \(\tau_{i,j}\) of indices \(i\) and \(j\), so that \(\mathrm{sgn}(\sigma') = -\mathrm{sgn}(\sigma)\). Then
\begin{align}\alpha\left(v_{\sigma'(1)},\dots,v_{\sigma'(k)}\right) &= \alpha\left(v_{\tau_{i,j} \circ\sigma(1)},\dots,v_{\tau_{i,j} \circ \sigma(k)}\right) \\&= -\alpha\left(v_{\sigma(1)},\dots,v_{\sigma(k)}\right) \\&= -\mathrm{sgn}(\sigma) \cdot \alpha(v_1,\dots,v_k) \\&= \mathrm{sgn}(\sigma') \cdot \alpha(v_1,\dots,v_k).\end{align}
Also note that, for any alternating covariant \(k\)-tensor \(\alpha\), since its components relative to a basis \((E_i)\) for \(V\) are computed as
\[\alpha_{i_1 \dots i_k} = \alpha(E_{i_1},\dots,E_{i_k}),\]
it then follows that the components change sign whenever any two component indexes are interchanged (since this corresponds to an interchange of the corresponding basis vectors).
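The standard example of an alternating tensor is the determinant, viewed as a function of the columns of a matrix; a quick NumPy check:

```python
import numpy as np

# The determinant, as a function of its columns, is an alternating
# covariant 3-tensor on R^3: swapping two arguments flips the sign.
det3 = lambda u, v, w: float(np.linalg.det(np.column_stack([u, v, w])))

rng = np.random.default_rng(4)
u, v, w = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)
print(np.isclose(det3(u, v, w), -det3(v, u, w)))  # True: interchange flips sign
print(np.isclose(det3(u, v, w), -det3(u, w, v)))  # True
```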
Multicategory of Tensors
Note that vector spaces and multilinear maps comprise a prime example of a multicategory. The notion of tensor product can be defined for any multicategory (see this link for details).