The Einstein-Hilbert Action
This post describes how to derive the field equation of General Relativity from the Einstein-HIlbert action.

One of the most celebrated equations in all of physics is the field equation of General Relativity:
\[\mathrm{Rc} - \frac{1}{2}Sg = \kappa T.\]
Here, \(\mathrm{Rc}\) is the Ricci curvature tensor, \(S\) is the scalar curvature function, \(g\) is the metric tensor, \(\kappa\) is the Einstein gravitational constant, and \(T\) is the stress-energy tensor. A solution to this equation is a metric \(g\) and the equation thus serves as a constraint on admissible metrics.
The Action Functional
The mathematician David Hilbert proposed that General Relativity be formulated in a manner similar to Lagrangian mechanics, namely, that the metric \(g\) be selected as a critical point (or stationary point) of a suitable functional \(\mathcal{S}[g]\), called an action functional. This functional is often called the Einstein-Hilbert action. Such critical points can be analyzed systematically using variational methods by solving the equation
\[\delta_g \mathcal{S} = 0,\]
where \(\delta_g \mathcal{S}\) is the total derivative of \(\mathcal{S}\) at the point \(g\) (in the sense of the total derivative of a continuous map between normed vector spaces, described in another post).
Hence, such a metric is in some sense optimal. However, the problem of selecting an appropriate functional still remains. One of the basic premises of General Relativity is that curvature is, in some sense, directly proportional to the density and flux of energy and momentum (which generalizes mass density from Newton's theory of gravitation). Thus, one idea is to propose one functional \(\mathcal{S}_C\) for the former, and another functional \(\mathcal{S}_E\) for the latter, and to require that the variation of the former be directly proportional to the variation of the latter, which means that
\[\delta_g\mathcal{S}_C = \pm \kappa \delta_g\mathcal{S}_E,\]
where \(\kappa\) is some constant of proportionality, and the \(\pm\) indicates the chosen sign convention. As it turns out, the standard sign convention in physics is
\[\delta_g\mathcal{S}_C = - \kappa \delta_g\mathcal{S}_E,\]
which means that
\begin{align}0 &= \delta_g\mathcal{S}_C + \kappa \delta_g\mathcal{S}_E \\&= \delta_g(\mathcal{S}_C + \kappa \mathcal{S}_E).\end{align}
Hence, we define the functional \(\mathcal{S}\) as
\[\mathcal{S} = \mathcal{S}_C + \kappa \mathcal{S}_E.\]
In the Lagrangian formulation, the functional \(\mathcal{S}\) is intended to summarize the dynamics of the entire system, and thus is expressed as an integral. In the case of General Relativity, this means that it should be expressed an integral of a suitable function \(\mathcal{L} : M \rightarrow \mathbb{R}\) over the entire spacetime manifold \(M\) with respect to the volume form \(dV_g\) induced by the metric \(g\):
\[\mathcal{S}[g] = \int_M \mathcal{L}~dV_g.\]
We may also write
\[\mathcal{S}_C[g] = \int_M \mathcal{L}_C~dV_g\]
and
\[\mathcal{S}_E[g] = \int_M \mathcal{L}_E~dV_g\]
so that
\begin{align}\mathcal{S}[g] &= \mathcal{S}_C[g] + \kappa \mathcal{S}_E[g] \\&= \int_M \mathcal{L}_C~dV_g + \kappa\int_M \mathcal{L}_E~dV_g \\&= \int_M (\mathcal{L}_C + \kappa\mathcal{L}_E)~dV_g.\end{align}
Volume Forms
It makes no sense to integrate a function \(f : M \rightarrow \mathbb{R}\) over an arbitrary manifold, since integration requires some notion of volume which cannot be expressed in a coordinate-independent manner on an arbitrary manifold, but can be expressed in a canonical, coordinate-independent manner on a manifold equipped with a metric. Every manifold with a (semi-Riemannian) metric \(g\) induces a unique differential form \(dV_g\) called the volume form which satisfies
\[dV_g(E_1,\dots,E_n) = 1\]
for every local, positively oriented orthonormal frame \((E_i)\) on an open subset \(U \subseteq M\). In other words, it corresponds to the expected notion of volume, i.e. one which assigns a volume of \(1\) to the unit \(n\)-cube spanned by the orthonormal tangent vectors \((E_1\rvert_p,\dots,E_n\rvert_p)\) at every point \(p \in M\). Thus, the volume form assigns a measure which accords with the usual notion of volume (unit cubes have unit volume).
Since the volume form is a top-level differential form, it can be written as
\[dV_g = f~\varepsilon^1 \wedge \dots \wedge \varepsilon^n,\]
where \((\varepsilon^i)\) is the dual coframe corresponding to \((E_i)\) and \(f \in C^{\infty}(M)\) is a smooth function on \(M\). The requirement that \(dV_g(E_1,\dots,E_n) = 1\) then implies that \(f=1\), and thus
\[dV_g = \varepsilon^1 \wedge \dots \wedge \varepsilon^n.\]
Now, if \((\tilde{E}_i)\) is another local, positively oriented orthonormal frame with dual coframe \((\tilde{\varepsilon}^i)\), then each vector field \(\tilde{E}_i\) may be expressed in terms of the frame \((E_i)\) as follows:
\[\tilde{E}_i = A^j_i E_j\]
for some matrix of smooth functions \(A^j_i\). Since \(A^j_i\) is an orthogonal \(n \times n\) matrix and each frame is positively oriented, it follows that \(\mathrm{det}(A^j_i) = 1\). It then follows that, if \(d\tilde{V}_g\) is the volume form determined by \((\tilde{E}_i)\), then
\begin{align}dV_g(\tilde{E}_1,\dots,\tilde{E}_n) &= \varepsilon^1 \wedge \dots \wedge \varepsilon^n(\tilde{E}_1,\dots,\tilde{E}_n) \\&= \mathrm{det}(\varepsilon^j(\tilde{E}_i)) \\&= \mathrm{det}(A^j_i) \\&= 1 \\&= d\tilde{V}_g(\tilde{E}_1,\dots,\tilde{E}_n),\end{align}
and thus \(dV_g = d\tilde{V}_g\). This indicates that the volume form is uniquely determined irrespective of the chosen frame, and thus any local, positively oriented orthonormal frame may be chosen.
Now, on any local, positively oriented smooth chart \((U, (x^i))\), the volume form can be expressed in terms of the dual coordinate coframe \((dx^i)\) corresponding to the respective coordinate frame \((\partial_i)\) as
\[dV_g = f dx^1 \wedge \dots \wedge dx^n\]
for some suitable smooth function \(f\). To determine \(f\), first express each coordinate vector field \(\partial_i\) in terms of an arbitrary smooth, positively oriented orthonormal frame \((E_i)\) with dual coframe \((\varepsilon^i)\) as
\[\partial_i = A^j_iE_j\]
and then, compute the following:
\begin{align}f &= dV_g(\partial_1,\dots,\partial_n) \\&= \varepsilon^1 \wedge \dots \wedge \varepsilon^n(\partial_1,\dots,\partial_n) \\&= \mathrm{det}(\varepsilon^j(\partial_i)) \\&= \mathrm{det}(A^j_i).\end{align}
Next, note that
\begin{align}g_{ij} &= g(\partial_i,\partial_j) \\&= g(A^k_iE_k,A^l_jE_l) \\&= A^k_iA^l_jg(E_k,E_l) \\&= \pm\sum_k A^k_iA^k_j\end{align}
since \(g(E_k,E_l) = 0\) whenever \(k \ne l\) and \(g(E_k,E_l) = \pm 1\) whenever \(k = l\).
If \(A\) represents the matrix \(A^j_i\), then
\[(A^TA)_{ij} = \sum_k A^k_iA^k_j,\]
and hence \(g_{ij} = \pm (A^TA)_{ij}\).
It then follows that
\begin{align}\lvert \mathrm{det}(g_{ij}) \rvert &= \lvert \mathrm{det}(\pm A^TA) \rvert \\&= \lvert\pm \mathrm{det}(A^TA) \rvert \\&= \lvert \mathrm{det}(A^TA) \rvert \\&= \lvert \mathrm{det}(A^T)\mathrm{det}(A) \rvert \\&= \lvert (\mathrm{det}(A))^2\rvert \\&= (\mathrm{det}(A))^2, \end{align}
and thus
\[\mathrm{det}(A) = \pm\sqrt{\lvert \mathrm{det}(g_{ij}) \rvert},\]
and, since both frames \((\partial_i)\) and \((E_i)\) are positively oriented, the sign is positive, and so
\[\mathrm{det}(A) = \sqrt{\lvert \mathrm{det}(g_{ij}) \rvert}.\]
This means that
\[dV_g = \sqrt{\lvert \mathrm{det}(g_{ij}) \rvert}~dx^1 \wedge \dots \wedge dx^n.\]
If, however, the frame is negatively oriented, then
\[dV_g = -\sqrt{\lvert \mathrm{det}(g_{ij}) \rvert}~dx^1 \wedge \dots \wedge dx^n.\]
Integration on Semi-Riemannian Manifolds
Now that we have defined a suitable volume form, it remains to define the integration of functions over semi-Riemannian manifolds.
First, we will define the integration of a differential \(n\)-form \(\omega\) (where \(n = \mathrm{dim}(M)\)) over a smooth manifold \(M\). Suppose that \(\omega\) is compactly supported in the domain of a single smooth chart \((U,\varphi)\) with either negative or positive orientation. Then, define the integral of \(\omega\) over \(M\) as
\[\int_M \omega = \pm \int_{\varphi(U)}(\varphi^{-1})^*\omega,\]
with a positive sign for a positively oriented chart and a negative sign for a negatively oriented chart. It can be demonstrated that this definition does not depend on the chosen chart, and thus is well-defined.
If \(\{U_i\}\) is a finite open cover of \(\mathrm{supp}(\omega)\) by oriented smooth charts and \({\psi_i}\) is a subordinate smooth partition of unity, define the integral over the entire manifold as
\[\int_M \omega = \sum_i \int_M \psi_i \omega.\]
It can be demonstrated that this definition does not depend on the choice of open cover or partition of unity, and is thus well-defined.
Next, define the integral of a smooth function \(f \in C^{\infty}(M)\) over a semi-Riemannian manifold \(M\) with metric \(g\) as
\[\int_M f~dV_g.\]
In particular, if \(f\) is compactly supported in a single smooth chart \((U,\varphi)\) with coordinate functions \((x^i)\), then
\begin{align}\int_M f~dV_g &= \pm\int_{\varphi(U)}(\varphi^{-1})^*(f~dV_g) \\&= \int_{\varphi(U)}(\varphi^{-1})^*(f\sqrt{\lvert \mathrm{det}(g_{ij}) \rvert}~dx^1 \wedge \dots \wedge dx^n) \\&= \pm\int_{\varphi(U)}(f\sqrt{\lvert \mathrm{det}(g_{ij}) \rvert}) \circ \varphi^{-1} \mathrm{det}(D\varphi^{-1})~dx^1 \wedge \dots \wedge dx^n & \text{(pullback of differential forms)} \\&= \pm\int_{\varphi(U)}\hat{f}\sqrt{\lvert \mathrm{det}(\hat{g}_{ij}) \rvert} \mathrm{det}(D\varphi^{-1})~dx^1 \dots dx^n,\end{align}
where \(\hat{f} = f \circ \varphi^{-1}\) and \(\hat{g}_{ij} = g_{ij} \circ \varphi^{-1}\) are the coordinate representations of \(f\) and \(g_{ij}\), respectively, and \(D\varphi^{-1}\) is the Jacobian matrix which represents \(\varphi^{-1}\). Note that
\begin{align}(D\varphi^{-1})_{ij}(\hat{p}) &= \frac{\partial (\widehat{\varphi^{-1}})^j}{\partial x^i}(\hat{p}) \\&= \frac{(\varphi \circ \varphi^{-1} \circ \mathrm{Id})^j}{\partial x^i}(\hat{p}) \\&= \frac{\partial x^j}{\partial x^i}(\hat{p}) \\&= \delta^j_i.\end{align}
Thus \(D\varphi^{-1}\) is the identity matrix, so \(\mathrm{det}(D\varphi^{-1})=1\). Additionally, the signs due to the orientation of the chart and the volume form cancel out if both are negative.
It follows that
\[\int_M f~dV_g = \int_{\varphi(U)}\hat{f}\sqrt{\lvert \mathrm{det}(\hat{g}_{ij} \rvert}) ~dx^1 \dots dx^n.\]
Subsequently, we will follow the standard practice of suppressing the "hat" notation and using the same symbols (e.g. \(f\) or \(g\)) for both functions defined on manifolds and their coordinate representations. Context will determine which is intended. Thus, we will simply write
\[\int_M f~dV_g = \int_{\varphi(U)}f(x)\sqrt{\lvert \mathrm{det}(g_{ij}(x)\rvert}) ~dx^1 \dots dx^n.\]
It suffices to consider the case where the integrand (differential form) is compactly supported in a single smooth chart, since every compactly-supported \(n\)-form can be written as a sum of such forms using a partition of unity. Thus, in the ensuing discussion, we will only consider a single smooth chart.
Total Scalar Curvature
The total scalar curvature functional is defined as follows:
\[\mathcal{S}_C[g] = \int_M S~dV_g,\]
where \(S\) is the scalar curvature function on \(M\). In coordinates, the scalar curvature function is the contraction of the Ricci curvature tensor, that is
\[S = g^{ij}R_{ij}.\]
Thus, it follows that
\[\int_M S~dV_g = \int_{\varphi(U)}g^{ij}(x)R_{ij}(x)\sqrt{\lvert \mathrm{det}(g_{ij}(x)) \rvert} ~dx^1 \dots dx^n.\]
Thus, we propose the total scalar curvature as the appropriate curvature functional.
The Variation of the Curvature Functional
Next, we will define the variation of the curvature functional. Since spacetime is a semi-Riemannian manifold with signature \((-,+,+,+)\) or \((+,-,-,-)\), \(\mathrm{det}(g_{ij})\) is negative, so \(\lvert \mathrm{det}(g_{ij}(x)) \rvert = -\mathrm{det}(g_{ij}(x))\). Thus
\[\mathcal{S}_C[g] = \int_{\varphi(U)} g^{ij}(x)R_{ij}(x)~\sqrt{-\mathrm{det}(g_{ij}(x))}) ~dx^1 \dots dx^n,\]
where \(\mathcal{S}_C : \mathcal{M} \rightarrow \mathbb{R}\) and \(\mathcal{M}\) is the space of all metrics with volume \(1\), i.e.
\[\mathrm{Vol}(g) = \int_M dV_g = 1.\]
The first variation (or just variation) of the functional \(\mathcal{S}_C\) is the total derivative \(\delta_g\mathcal{S}_C\) of \(\mathcal{S}_C\) (in the sense of the total derivative of a map between normed vector spaces). If the total derivative exists, then the directional derivative exists and both are equal.
\begin{align}\delta_g\mathcal{S}_C[\eta] &= \frac{d}{dt}\bigg\rvert_0\mathcal{S}_C[g + t \cdot \eta] \\&= \frac{d}{dt}\bigg\rvert_0\int_{\varphi(U)} (g + t \cdot \eta)^{ij}(x)R_{ij}(x)\sqrt{-\mathrm{det}((g + t \cdot \eta)_{ij}(x))} ~dx^1 \dots dx^n \\&= \frac{d}{dt}\bigg\rvert_0\int_{\varphi(U)} (g^{ij}(x) + t \cdot \eta^{ij}(x))R_{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} \frac{d}{dt}\bigg\rvert_0\left[(g^{ij}(x) + t \cdot \eta^{ij}(x))R_{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))}\right] ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} g^{ij}(x)R_{ij}(x)\frac{d}{dt}\bigg\rvert_0\left[\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))}\right] \\&+ \frac{d}{dt}\bigg\rvert_0\left[(g^{ij}(x) + t \cdot \eta^{ij}(x))R_{ij}(x)\right]\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} g^{ij}(x)R_{ij}(x)\frac{d}{dt}\bigg\rvert_0\left[\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))}\right] \\&+ \frac{d}{dt}\bigg\rvert_0\left[g^{ij}(x) + t \cdot \eta^{ij}(x)\right]R_{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} g^{ij}(x)R_{ij}(x)\frac{d}{dt}\bigg\rvert_0\left[\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))}\right] + \eta^{ij}(x)R_{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n\end{align}
Next, we use the following identities:
\[\mathrm{log}(\mathrm{det} A(t)) = \mathrm{tr}(\mathrm{log}~ A(t)),\]
and
\[\frac{d}{dt}\bigg\rvert_0 \mathrm{log}~f(t) = \frac{1}{f(0)}\frac{df}{dt}(0)\]
and
\[\frac{d}{dt}\bigg\rvert_0 \mathrm{log}~A(t) = A^{-1}(0) \frac{d}{dt}\bigg\rvert_0 A(t).\]
It follows that
\[\frac{d}{dt}\bigg\rvert_0 \mathrm{log}(\mathrm{det}~ A(t)) = \frac{1}{\mathrm{det}~A(0)} \frac{d}{dt}\bigg\rvert_0 \mathrm{det}~ A(t)\]
and
\[\frac{d}{dt}\bigg\rvert_0 \mathrm{tr}(\mathrm{log}~ A(t)) = \mathrm{tr}\left(\frac{d}{dt}\bigg\rvert_0 \mathrm{log}~ A(t)\right) = \mathrm{tr}\left(A^{-1}(0)\frac{d}{dt}\bigg\rvert_0 A(t)\right).\]
Thus:
\[\frac{d}{dt}\bigg\rvert_0 \mathrm{det}A(t) = \mathrm{tr}\left(A^{-1}(0)\frac{d}{dt}\bigg\rvert_0 A(t)\right)\mathrm{det}A(0).\]
We then compute
\begin{align}\frac{d}{dt}\bigg\rvert_0 \mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x)) &= \mathrm{tr}\left(g^{ij}(x)\frac{d}{dt}\bigg\rvert_0 (g_{ij}(x) + t \cdot \eta_{ij}(x))\right)\mathrm{det}(g_{ij}(x)) \\&= \mathrm{tr}(g^{ij}(x)\eta_{ij}(x))\mathrm{det}(g_{ij}(x)) \\&= g^{ij}(x)\eta_{ij}(x)\mathrm{det}(g_{ij}(x)).\end{align}
Using the fact that \(-g_{ij}\eta^{ij} = g^{ij}\eta_{ij}\), we further compute that
\begin{align}\frac{d}{dt}\bigg\rvert_0\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))} &= \frac{1}{2\sqrt{-\mathrm{det}(g_{ij}(x))}}\frac{d}{dt}\bigg\rvert_0(-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))) \\&= \frac{1}{2\sqrt{-\mathrm{det}(g_{ij}(x))}}(-g^{ij}(x)\eta_{ij}(x)\mathrm{det}(g_{ij}(x)) ) \\&= \frac{1}{2}g^{ij}(x)\eta_{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} \\&= -\frac{1}{2}g_{ij}(x)\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))}.\end{align}
Then, substituting into the integral above, we obtain
\begin{align}\delta_g\mathcal{S}_C[\eta] &= \int_{\varphi(U)} g^{ij}(x)R_{ij}(x)\frac{d}{dt}\bigg\rvert_0\left[\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x))}\right] + \eta^{ij}(x)R_{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} -\frac{1}{2}S(x)g_{ij}(x)\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} + \eta^{ij}(x)R_{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} (R_{ij}(x) -\frac{1}{2}S(x)g_{ij}(x))\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \end{align}
The Variation of the Stress-Energy Functional
Next we consider the variation of the stress-energy functional:
\[\mathcal{S}_E[g] = \int_{\varphi(U)} \mathcal{L}_E(g_{11}(x),\dots,g_{ij}(x),\dots,g_{nn}(x))~\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n.\]
Here, \(\mathcal{L}_E\) is treated generically as an arbitrary function accepting \(n^2\) arguments, one for each component \(g_{ij}\), with formal parameters denoted as \(y_{11}, \dots, y_{nn}\):
\[\mathcal{L}_E(y_{11},\dots,y_{ij},\dots,y_{nn}).\]
We again compute the total derivative:
\begin{align}\delta_g\mathcal{S}_E[\eta] &= \frac{d}{dt}\bigg\rvert_0\mathcal{S}_E[g + t \cdot \eta] \\&= \frac{d}{dt}\bigg\rvert_0\int_{\varphi(U)} \mathcal{L}_E((g_{11}+t\cdot \eta_{11})(x),\dots,(g_{nn}+t\cdot \eta_{nn})(x))\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x)}) ~dx^1 \dots dx^n \\&= \int_{\varphi(U)}\frac{d}{dt}\bigg\rvert_0\left[\mathcal{L}_E((g_{11}+t\cdot \eta_{11})(x),\dots,(g_{nn}+t\cdot \eta_{nn})(x))\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x)})\right] ~dx^1 \dots dx^n \\&= \int_{\varphi(U)}\mathcal{L}_E(g_{11}(x),\dots,g_{nn}(x))\frac{d}{dt}\bigg\rvert_0\sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta_{ij}(x)}) \\&+ \frac{\partial \mathcal{L}_E}{\partial y_{ij}}(g_{11}(x),\dots,g_{nn}(x))\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))}~dx^1 \dots dx^n \\&= \int_{\varphi(U)}-\frac{1}{2}\mathcal{L}_E(g_{11}(x),\dots,g_{nn}(x))g_{ij}(x)\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))}\\&+ \frac{\partial \mathcal{L}_E}{\partial y_{ij}}(g_{11}(x),\dots,g_{nn}(x))\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)}\left[-\frac{1}{2}\mathcal{L}_E(g_{11}(x),\dots,g_{nn}(x))g_{ij}(x) + \frac{\partial \mathcal{L}_E}{\partial y_{ij}}(g_{11}(x),\dots,g_{nn}(x))\right]\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)}-\frac{1}{2}\left[\mathcal{L}_E(g_{11}(x),\dots,g_{nn}(x))g_{ij}(x) -2 \frac{\partial \mathcal{L}_E}{\partial y_{ij}}(g_{11}(x),\dots,g_{nn}(x))\right]\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n.\end{align}
By definition, the functional derivatives of \(\mathcal{S}_E\) with respect to each \(g_{ij}\) are the functions denoted \(\delta \mathcal{S}_E/\delta g_{ij}\) and defined such that they collectively satisfy the following equation:
\[\delta_g\mathcal{S}_E[\eta] = \int_{\varphi(U)} \frac{\delta \mathcal{S}_E}{\delta g_{ij}}(x)\eta^{ij}(x)~dx^1 \dots dx^n.\]
This implies that
\[\frac{\delta \mathcal{S}_E}{\delta g_{ij}}(x) = -\frac{1}{2}\left[\mathcal{L}_E(g_{11}(x),\dots,g_{nn}(x))g_{ij}(x) -2 \frac{\partial \mathcal{L}_E}{\partial y_{ij}}(g_{11}(x),\dots,g_{nn}(x))\right]\sqrt{-\mathrm{det}(g_{ij}(x))}.\]
The bracketed expression represents a tensor in coordinates, and we define it as the stress-energy tensor \(T\) with component functions \(T_{ij}\):
\[T_{ij}(x) = \mathcal{L}_E(g_{11}(x),\dots,g_{nn}(x))g_{ij}(x) -2 \frac{\partial \mathcal{L}_E}{\partial y_{ij}}(g_{11}(x),\dots,g_{nn}(x)).\]
This means that
\[\frac{\delta \mathcal{S}_E}{\delta g_{ij}}(x) = -\frac{1}{2}T_{ij}\sqrt{-\mathrm{det}(g_{ij}(x))}.\]
Solving for \(T_{ij}\), we obtain an alternative expression:
\[T_{ij} = -\frac{2}{\sqrt{\mathrm{det}(g_{ij}(x))}}\frac{\delta \mathcal{S}_E}{\delta g_{ij}}(x)\]
Regardless, it follows that
\begin{align}\delta_g\mathcal{S}_E[\eta] &= \int_{\varphi(U)}-\frac{1}{2}T_{ij}(x)\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n.\end{align}
The Cosmological Constant
The functional is
\[\mathcal{S}_K[g] = \int_{\varphi(U)}\Lambda \sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n.\]
This means that
\[\mathcal{S}_K[g] = \Lambda \cdot \mathrm{Vol}(g).\]
The variation is
\begin{align}\delta_g\mathcal{S}_K[\eta] &= \frac{d}{dt}\bigg\rvert_0\mathcal{S}_K[g + t \cdot \eta] \\&= \frac{d}{dt}\bigg\rvert_0\int_{\varphi(U)}\Lambda \sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta^{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)}\frac{d}{dt}\bigg\rvert_0\left[\Lambda \sqrt{-\mathrm{det}(g_{ij}(x) + t \cdot \eta^{ij}(x))}\right] ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} -\frac{1}{2}\Lambda g_{ij}(x)\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n.\end{align}
This factor is subtracted from the overall functional, resulting in a combined functional of
\[\mathcal{S} = \mathcal{S}_C + \kappa\mathcal{S}_E - \mathcal{S}_K.\]
The Combined Functional
Combining each of the functionals together yields
\begin{align}\delta_g\mathcal{S}[\eta] &= \delta_g\mathcal{S}_C[\eta] + \kappa\delta_g\mathcal{S}_E[\eta] -\delta_g\mathcal{S}_K[g] \\&= \int_{\varphi(U)} (R_{ij}(x) -\frac{1}{2}S(x)g_{ij}(x))\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&+ \kappa\int_{\varphi(U)}-\frac{1}{2}T_{ij}(x)\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&- \int_{\varphi(U)} -\frac{1}{2}\Lambda g_{ij}(x)\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n \\&= \int_{\varphi(U)} \left[R_{ij}(x) -\frac{1}{2}S(x)g_{ij}(x) -\frac{1}{2}\kappa T_{ij}(x) +\frac{1}{2}\Lambda g_{ij}(x)\right]\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n\end{align}
The Fundamental Lemma
On a closed manifold, that is, a compact manifold without boundary, the Fundamental Lemma of the Calculus of Variations indicates that if \(\delta_g\mathcal{S} = 0\) and hence, by our prior calculations,
\[\int_{\varphi(U)} \left[R_{ij}(x) -\frac{1}{2}S(x)g_{ij}(x) -\frac{1}{2}\kappa T_{ij}(x) +\frac{1}{2}\Lambda g_{ij}(x)\right]\eta^{ij}(x)\sqrt{-\mathrm{det}(g_{ij}(x))} ~dx^1 \dots dx^n = 0\]
for all compactly-supported "test" functions \(\eta\), then
\[R_{ij}(x) -\frac{1}{2}S(x)g_{ij}(x) -\frac{1}{2}\kappa T_{ij}(x) +\frac{1}{2}\Lambda g_{ij}(x) = 0.\]
Otherwise, if the manifold is not closed, this integrand will not be identically \(0\) and a boundary term must be considered; this boundary term is often called the Gibbons-Hawking-York Boundary Term.
The Result
It thus follows that
\[R_{ij}(x) -\frac{1}{2}S(x)g_{ij}(x) -\frac{1}{2}\kappa T_{ij}(x) +\frac{1}{2}\Lambda g_{ij}(x) = 0\]
and hence
\[R_{ij}(x) -\frac{1}{2}S(x)g_{ij}(x) +\frac{1}{2}\Lambda g_{ij}(x) = \frac{1}{2}\kappa T_{ij}(x).\]
Writing in coordinate-free tensor format:
\[Rc -\frac{1}{2}Sg +\frac{1}{2}\Lambda g = \frac{1}{2}\kappa T.\]
Since both \(\kappa\) and \(\Lambda\) are constants, we are free to redefine them such that they incorporate the \(1/2\) terms and thus eliminate the \(1/2\) terms from the equation:
\[Rc -\frac{1}{2}Sg +\Lambda g = \kappa T.\]
Without the cosmological constant, this yields
\[Rc -\frac{1}{2}Sg = \kappa T.\]
The Newtonian Limit
We will not discuss the derivation in this post, but the constant \(\kappa\) can be expressed in terms of the gravitational constant \(G\) as
\[\kappa = \frac{8\pi G}{c^4},\]
or just
\[\kappa = 8\pi G\]
if \(c=1\).