Einstein Field Equation Derivation in about a Dozen Steps

In this blog I will derive the Einstein field equations starting from the Hilbert action. Since there are only two terms in the Hilbert action, one of which is left alone, there is not that much to do. Well, there is always way more to do: how well is each step really understood? Where does that factor come from? What kinds of variations could one do? The core of this blog is an extensive translation of the Wikipedia page on the Einstein-Hilbert action into my own personal style, with minor variations so the steps make more sense to me. Go there if my style confuses you for a step or two.

In the Annus Mirabilis, 1905, one of Einstein's accomplishments was to establish the theory of special relativity. What was special was that all observers must travel at a constant velocity, neither accelerating nor decelerating. For such an observer, the speed of light is a constant. Different observers will see different wavelengths and frequencies, but the product of the wavelength with the frequency is identical. The wavelength and frequency are said to be Lorentz covariant, meaning we know how they change for different observers. The speed of light is Lorentz invariant. It is one of my pet peeves when invariants are not paired with their corresponding covariant quantities, because then an incomplete story is being told.
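The pairing of covariant and invariant quantities can be sketched numerically. The snippet below is only an illustration under stated assumptions: it uses the standard longitudinal relativistic Doppler formula for an observer receding at speed `beta = v/c`, and the function name `doppler_shift` and the 500 nm starting wavelength are made up for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def doppler_shift(frequency, wavelength, beta):
    """Longitudinal relativistic Doppler shift for an observer receding at beta = v/c."""
    factor = math.sqrt((1.0 - beta) / (1.0 + beta))
    # Frequency and wavelength are covariant: each changes with the observer.
    return frequency * factor, wavelength / factor

wavelength = 500e-9            # hypothetical green light, in metres
frequency = C / wavelength     # matching frequency in Hz

for beta in (0.0, 0.1, 0.5, 0.9):
    f_obs, lam_obs = doppler_shift(frequency, wavelength, beta)
    # The product is invariant: it stays equal to C for every observer.
    print(beta, f_obs, lam_obs, f_obs * lam_obs)
```

Each observer sees a different frequency and wavelength, while the last column stays at the speed of light up to rounding.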

Newton's law of gravity does a remarkable job in describing the motion of the planets. It is all that is needed by today's rocket ships unless those devices also carry atomic clocks or other tools of exceptional accuracy. Here is Newton's law in potential form:

$4 \pi G \rho = \nabla^2 \phi$
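As a quick sanity check of this equation, here is a minimal sketch that tests it on one known solution: the potential inside a ball of uniform density, which is $\phi = \frac{2}{3} \pi G \rho r^2$ up to an additive constant. The density value and the helper names `phi` and `laplacian` are assumptions chosen for the illustration.

```python
import math

G = 6.674e-11    # Newton's constant in m^3 kg^-1 s^-2
rho = 5500.0     # hypothetical uniform mass density in kg/m^3

def phi(x, y, z):
    """Potential inside a uniform-density ball, up to an additive constant."""
    return (2.0 * math.pi * G * rho / 3.0) * (x * x + y * y + z * z)

def laplacian(f, x, y, z, h=1.0):
    """Second-order central-difference Laplacian of f at the point (x, y, z)."""
    neighbours = (f(x + h, y, z) + f(x - h, y, z)
                  + f(x, y + h, z) + f(x, y - h, z)
                  + f(x, y, z + h) + f(x, y, z - h))
    return (neighbours - 6.0 * f(x, y, z)) / h**2

source_side = 4.0 * math.pi * G * rho
potential_side = laplacian(phi, 10.0, -20.0, 5.0)
print(source_side, potential_side)
```

Because the potential is quadratic, the finite-difference Laplacian agrees with $4 \pi G \rho$ up to rounding at any interior point.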

From the perspective of special relativity, the equation suffers a fatal flaw: if there is a change in the mass density $\rho$, then the potential $\phi$ must change everywhere instantaneously. Oops.

Einstein set out to fix this flaw. The struggle took him ten years ("Subtle is the Lord..." by Abraham Pais http://www.amazon.com/Subtle-Is-Lord-Einstein-Paperbacks/dp/0192851381 is the way to get the real details on the subject). The math was hard then and remains hard today. From far away, it sounds easy: describe all physics the same way whether one is accelerating or not. It is the details of Riemannian geometry that are daunting. Einstein got a private tutor and collaborator for the subject, his school buddy Marcel Grossmann. He also traded letters on his math struggles with the leading math minds of his day, including David Hilbert. Einstein came to the field equations not from an action, but from thinking all about the physics. Hilbert figured out the action that generates the Einstein field equations. That is where the derivation begins:

1. Start with the Hilbert action:

$S = \int{\sqrt{-g} d^4 x\left( \frac{c^4}{16 \pi G}R + \mathcal{L_M} \right)}$

Note the square root of the negative of the determinant of the metric as part of the volume element. That is required so the volume element can be used in curved spacetime. It plays a critical role in the derivation, so I wish I had a better handle on why that factor in that form is required so that the differential volume element is invariant under a change of coordinates.
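A short sketch of why the factor takes that form may help. The primed coordinates $x'$ and the Jacobian matrix $J$ are labels introduced only for this note. The metric transforms with two factors of the Jacobian, so its determinant picks up $(\det J)^2$, while the coordinate volume picks up one inverse power:

$\begin{align*} g'_{\mu\nu} &= J^{\alpha}_{\;\,\mu} J^{\beta}_{\;\,\nu}\, g_{\alpha\beta}, \qquad J^{\alpha}_{\;\,\mu} \equiv \frac{\partial x^{\alpha}}{\partial x'^{\mu}}\\ g' &= (\det J)^2 \, g \quad \Rightarrow \quad \sqrt{-g'} = |\det J| \sqrt{-g}\\ d^4 x' &= \frac{1}{|\det J|}\, d^4 x\\ \sqrt{-g'}\, d^4 x' &= \sqrt{-g}\, d^4 x \end{align*}$

The Jacobian factors cancel, so the combination $\sqrt{-g}\, d^4 x$ is the same in every coordinate system. The minus sign under the root is there because the determinant of a metric with one timelike direction is negative.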

2. Vary with respect to the inverse metric tensor $g^{\mu\nu}$ :

$\delta S = \int{ d^4 x\left( \frac{c^4}{16 \pi G} \frac{\delta (\sqrt{-g}R)}{\delta g^{\mu\nu}} + \frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} \right)\delta g^{\mu\nu}}$

3. Pull out the factor of the square root of the determinant of the metric, $\sqrt{-g}$, and use the product rule on the term with the Ricci scalar R:

$\delta S = \int{ \sqrt{-g} d^4 x\left( \frac{c^4}{16 \pi G} \left( \frac{\delta R}{\delta g^{\mu\nu}}+ \frac{R}{\sqrt{-g}} \frac{\delta \sqrt{-g}}{\delta g^{\mu\nu}} \right) + \frac{1}{\sqrt{-g}}\frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} \right)\delta g^{\mu\nu}}$

4. Focus on the first term, using the definition of a Ricci scalar as a contraction of the Ricci tensor:

$\begin{align*} \frac{\delta R}{\delta g^{\mu\nu}} &= \frac{\delta(g^{\mu\nu} R_{\mu\nu})}{\delta g^{\mu\nu}}\\ &= R_{\mu\nu} \frac{\delta g^{\mu\nu}}{\delta g^{\mu\nu}} + g^{\mu\nu} \frac{\delta R_{\mu\nu}}{\delta g^{\mu\nu}}\\ &=R_{\mu\nu} + \rm{a \;total \; derivative} \end{align*}$

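A total derivative does not make a contribution to the variation of the functional, so it can be ignored in our quest to find an extremum. This is Stokes' theorem in action: the total derivative integrates to a boundary term, which vanishes when the variation is held to zero on the boundary.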

<SIDEBAR>
Show that the variation in the Ricci tensor is a total derivative.

Since I don't understand this all in detail, I will try to get you in the neighborhood of getting it.

SB1: Start with the Riemann curvature tensor:

$R^{\rho}_{\;\,\sigma\mu\nu}= \partial_{\mu}\Gamma^{\rho}_{\;\,\sigma\nu}-\partial_{\nu}\Gamma^{\rho}_{\;\,\sigma\mu} + \Gamma^{\rho}_{\;\,\lambda \mu}\Gamma^{\lambda}_{\,\;\sigma \nu} - \Gamma^{\rho}_{\;\,\lambda \nu}\Gamma^{\lambda}_{\,\;\sigma \mu}$

Lots of stuff there, but here is a simplifying viewpoint. One is comparing two paths, that is why there is a subtraction here. The two paths are found by switching the order of the mu and the nu. This is a really complicated structure, but that should be obvious :-)
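To see the formula in SB1 produce a number, here is a minimal sketch on the simplest curved example I know of, the unit 2-sphere with coordinates $(\theta, \phi)$. Everything specific to the sphere is an assumption brought in for the illustration: the connection components $\Gamma^{\theta}_{\;\,\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^{\phi}_{\;\,\theta\phi} = \cot\theta$, the function names, and the use of a finite difference for the partial derivatives. The expected Ricci scalar for a unit sphere is 2.

```python
import math

def christoffel(theta):
    """gamma[rho][sigma][nu] on the unit 2-sphere; index 0 is theta, index 1 is phi."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma

def d_christoffel(theta, mu, h=1e-5):
    """Partial derivative of every connection component along coordinate mu."""
    if mu == 1:  # nothing depends on phi
        return [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    plus, minus = christoffel(theta + h), christoffel(theta - h)
    return [[[(plus[r][s][n] - minus[r][s][n]) / (2.0 * h) for n in range(2)]
             for s in range(2)] for r in range(2)]

def riemann(theta, rho, sigma, mu, nu):
    """R^rho_{sigma mu nu}, term by term as in SB1."""
    gamma = christoffel(theta)
    value = d_christoffel(theta, mu)[rho][sigma][nu] - d_christoffel(theta, nu)[rho][sigma][mu]
    for lam in range(2):
        value += gamma[rho][lam][mu] * gamma[lam][sigma][nu]
        value -= gamma[rho][lam][nu] * gamma[lam][sigma][mu]
    return value

def ricci_scalar(theta):
    """Contract twice: R_{sigma nu} = R^rho_{sigma rho nu}, then R = g^{sigma nu} R_{sigma nu}."""
    inverse_metric = [1.0, 1.0 / math.sin(theta) ** 2]  # diagonal entries only
    total = 0.0
    for sigma in range(2):
        ricci = sum(riemann(theta, rho, sigma, rho, sigma) for rho in range(2))
        total += inverse_metric[sigma] * ricci
    return total

print(ricci_scalar(1.0))
```

The two subtractions in `riemann` are the mu and nu exchange described above, and the answer does not depend on which `theta` is used.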

SB2: Vary the Riemann curvature tensor with respect to the metric tensor:

$\begin{align*} \delta R^{\rho}_{\;\,\sigma\mu\nu}=& \partial_{\mu}\delta \Gamma^{\rho}_{\;\,\sigma\nu}-\partial_{\nu}\delta \Gamma^{\rho}_{\;\,\sigma\mu}\\& + \delta \Gamma^{\rho}_{\;\,\lambda \mu}\Gamma^{\lambda}_{\,\;\sigma \nu} - \delta \Gamma^{\rho}_{\;\,\lambda \nu}\Gamma^{\lambda}_{\,\;\sigma \mu}\\& + \Gamma^{\rho}_{\;\,\lambda \mu}\delta\Gamma^{\lambda}_{\,\;\sigma \nu} - \Gamma^{\rho}_{\;\,\lambda \nu}\delta \Gamma^{\lambda}_{\,\;\sigma \mu} \end{align*}$

Lots of terms, but remember the mu <-> nu exchange is responsible for half of them.

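One cannot take a covariant derivative of a connection since it does not transform like a tensor. Apparently the difference of two connections does transform like a tensor: the non-tensorial piece of the transformation law is the same for both connections, so it cancels in the difference. The variation $\delta \Gamma$ is such a difference. I say "apparently" because this is an example where I have to rely on authority; I don't appreciate the details.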

SB3: Calculate the covariant derivative of the variation of the connection:

$\begin{align*} \nabla_{\mu} (\delta \Gamma^{\rho}_{\;\,\sigma\nu})&= \partial_{\mu}(\delta \Gamma^{\rho}_{\;\,\sigma\nu})\\&+ \Gamma^{\rho}_{\;\,\lambda \mu}\delta \Gamma^{\lambda}_{\,\;\sigma \nu} \\&-\delta \Gamma^{\rho}_{\,\;\lambda \sigma} \Gamma^{\lambda}_{\;\,\mu \nu}\\& - \delta\Gamma^{\rho}_{\,\;\lambda \nu}\Gamma^{\lambda}_{\;\,\sigma \mu} \\ \\ \nabla_{\nu} (\delta \Gamma^{\rho}_{\;\,\sigma\mu})&= \partial_{\nu}(\delta \Gamma^{\rho}_{\;\,\sigma\mu})\\&+ \Gamma^{\rho}_{\;\,\lambda \nu}\delta \Gamma^{\lambda}_{\,\;\sigma \mu} \\&-\delta \Gamma^{\rho}_{\,\;\lambda \sigma} \Gamma^{\lambda}_{\;\,\mu \nu}\\& - \delta\Gamma^{\rho}_{\,\;\lambda \mu}\Gamma^{\lambda}_{\;\,\sigma \nu} \end{align*}$

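Notice that the third terms of these two expressions are identical because the connection is symmetric in its two lower indices, so swapping the mu and nu in $\Gamma^{\lambda}_{\;\,\mu \nu}$ changes nothing. Those terms cancel when the two expressions are subtracted.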

Again, this is a step whose details I don't understand enough to clarify should others have questions.

SB4: Rewrite the variation of the Riemann curvature tensor as the difference of two covariant derivatives of the variation of the connection written in step SB3.

$\delta R^{\rho}_{\;\,\sigma\mu\nu} = \nabla_{\mu} (\delta \Gamma^{\rho}_{\;\,\sigma\nu}) -\nabla_{\nu} (\delta \Gamma^{\rho}_{\;\,\sigma\mu})$

SB5: Contract the result of SB4 on the upper index and the first derivative index to get the variation of the Ricci tensor:

$\delta R^{\rho}_{\;\,\mu\rho\nu} = \delta R_{\mu\nu} = \nabla_{\rho} (\delta \Gamma^{\rho}_{\;\,\mu\nu}) -\nabla_{\nu} (\delta \Gamma^{\rho}_{\;\,\rho\mu})$

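SB6: Contract the result of SB5 with the inverse metric. The metric can be moved inside the covariant derivative because $\nabla_{\sigma} g^{\mu\nu} = 0$, and the dummy indices are relabeled in the second line:
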
$\begin{align*}g^{\mu\nu} \delta R_{\mu\nu} &= \nabla_{\rho} \;g^{\mu\nu}(\delta \Gamma^{\rho}_{\;\,\mu\nu}) -\nabla_{\nu} \;g^{\mu\nu} (\delta \Gamma^{\rho}_{\;\,\rho\mu})\\&= \nabla_{\sigma} \;g^{\mu\nu}(\delta \Gamma^{\sigma}_{\;\,\mu\nu}) -\nabla_{\sigma} \;g^{\mu\sigma} (\delta \Gamma^{\rho}_{\;\,\rho\mu}) \\&= \nabla_{\sigma} \left(\;g^{\mu\nu}(\delta \Gamma^{\sigma}_{\;\,\mu\nu}) -\;g^{\mu\sigma} (\delta \Gamma^{\rho}_{\;\,\rho\mu}) \right) \end{align*}$

This now looks to my eye like a total derivative, so will not contribute to the action.
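
One way to make the "total derivative" claim concrete is sketched below, with $V^{\sigma}$ as shorthand (a label introduced only for this note) for the vector inside the parentheses of SB6. The covariant divergence of a vector, multiplied by the $\sqrt{-g}$ from the volume element, is an ordinary divergence:

$\begin{align*} V^{\sigma} &\equiv g^{\mu\nu}\,\delta \Gamma^{\sigma}_{\;\,\mu\nu} - g^{\mu\sigma}\, \delta \Gamma^{\rho}_{\;\,\rho\mu}\\ \sqrt{-g}\; g^{\mu\nu} \delta R_{\mu\nu} &= \sqrt{-g}\; \nabla_{\sigma} V^{\sigma} = \partial_{\sigma} \left( \sqrt{-g}\; V^{\sigma} \right) \end{align*}$

The last equality relies on the identity $\Gamma^{\rho}_{\;\,\rho\sigma} = \partial_{\sigma} \ln \sqrt{-g}$ for the metric-compatible, symmetric connection. Integrated over $d^4 x$, the right-hand side becomes a boundary term, which drops out when the variation vanishes on the boundary.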
<END SIDEBAR>

That was a long sidebar, so here is a recap: the first of the three terms in the variation reduces to the Ricci tensor $R_{\mu\nu}$.

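5. Focus on evaluating the second term in the variation. Transform the coordinate system to one where the metric is diagonal and use the product rule on the determinant, which gives $\delta g = g\, g^{\alpha\beta}\, \delta g_{\alpha\beta} = -g\, g_{\alpha\beta}\, \delta g^{\alpha\beta}$:

$\begin{align*}\frac{R}{\sqrt{-g}} \frac{\delta \sqrt{-g}}{\delta g^{\mu\nu}} &= \frac{R}{\sqrt{-g}} \frac{-1}{2 \sqrt{-g}} \frac{\delta g}{\delta g^{\mu\nu}}\\&= \frac{R}{\sqrt{-g}} \frac{-1}{2 \sqrt{-g}}(-1)\, g\, g_{\mu\nu}\\&= -\frac{1}{2} g_{\mu\nu} R\end{align*}$

Notice there was a flip of the metric in the variation, from $\delta g_{\alpha\beta}$ to $\delta g^{\alpha\beta}$, which required one more sign change. The flip follows from varying $g^{\alpha\beta} g_{\alpha\beta} = 4$, which gives $g^{\alpha\beta}\, \delta g_{\alpha\beta} = -g_{\alpha\beta}\, \delta g^{\alpha\beta}$. The last line uses $(\sqrt{-g})^2 = -g$. That is the kind of detail I always trip on.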
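The sign in step 5 is easy to get wrong, so here is a minimal numerical sketch of the underlying statement $\delta \sqrt{-g} = -\frac{1}{2} \sqrt{-g}\, g_{\mu\nu}\, \delta g^{\mu\nu}$, restricted to a diagonal metric as in the text. The particular metric entries, the size of the perturbation, and the helper name `sqrt_neg_det` are assumptions picked for the illustration.

```python
import math

def sqrt_neg_det(inverse_diagonal):
    """sqrt(-g) for a diagonal metric, given the diagonal entries g^{mu mu} of its inverse."""
    det_g = 1.0
    for g_up in inverse_diagonal:
        det_g *= 1.0 / g_up  # g_{mu mu} = 1 / g^{mu mu} when the metric is diagonal
    return math.sqrt(-det_g)

# Hypothetical diagonal inverse metric with one negative (timelike) entry
inverse_metric = [-1.3, 0.8, 1.1, 2.0]
# Hypothetical small variation of the inverse metric
delta_inverse = [2e-7, -1e-7, 3e-7, 1.5e-7]

perturbed = [g + d for g, d in zip(inverse_metric, delta_inverse)]
numerical = sqrt_neg_det(perturbed) - sqrt_neg_det(inverse_metric)

# Analytic first-order result: -1/2 sqrt(-g) g_{mu nu} delta g^{mu nu},
# where g_{mu mu} delta g^{mu mu} = delta / g^{mu mu} for each diagonal entry
analytic = -0.5 * sqrt_neg_det(inverse_metric) * sum(
    d / g for g, d in zip(inverse_metric, delta_inverse)
)

print(numerical, analytic)
```

The two numbers agree to first order in the perturbation; flipping the sign of the analytic expression makes them disagree, which is a cheap way to catch the sign error.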

6. Define the stress energy tensor as the third term:

$\frac{1}{\sqrt{-g}}\frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} = -\frac{1}{2}T_{\mu\nu}$

That factor of a minus a half? I don't get it. Bet it comes out of some classical limit. Hopefully I can research that later in the week.
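
A worked example may give a feel for that factor. As assumptions not made anywhere above, take the matter to be a single scalar field $\varphi$ with potential $V(\varphi)$, use the metric signature $(-,+,+,+)$, and set $c = 1$. Using the variation of $\sqrt{-g}$ from step 5:

$\begin{align*} \mathcal{L_M} &= -\frac{1}{2} g^{\mu\nu}\, \partial_{\mu}\varphi\, \partial_{\nu}\varphi - V(\varphi)\\ \frac{1}{\sqrt{-g}}\frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} &= \frac{\partial \mathcal{L_M}}{\partial g^{\mu\nu}} - \frac{1}{2} g_{\mu\nu} \mathcal{L_M} = -\frac{1}{2} \left( \partial_{\mu}\varphi\, \partial_{\nu}\varphi + g_{\mu\nu} \mathcal{L_M} \right)\\ T_{\mu\nu} &= \partial_{\mu}\varphi\, \partial_{\nu}\varphi + g_{\mu\nu} \mathcal{L_M}\\ T_{00} &= \frac{1}{2} \dot{\varphi}^2 + \frac{1}{2} (\nabla \varphi)^2 + V(\varphi) \quad \rm{(flat \; spacetime)} \end{align*}$

In this example the minus a half is what turns the variation into the familiar positive energy density of the field. That is at least consistent with the factor being fixed by matching onto known physics, though it is one example and not a proof.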

7. The Hilbert action is at an extremum when its variation vanishes for an arbitrary $\delta g^{\mu\nu}$, which happens when the integrand is equal to zero:

$\frac{c^4}{16 \pi G} \left( R_{\mu\nu} -\frac{1}{2} g_{\mu\nu} R \right) - \frac{1}{2} T_{\mu\nu} = 0$

or

$R_{\mu\nu} -\frac{1}{2} g_{\mu\nu} R = \frac{8 \pi G}{c^4}T_{\mu\nu}$
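
To get a sense of scale for the coupling on the right-hand side, here is a small sketch that evaluates it in SI units. The variable names are made up for the example, and the value of G is the CODATA 2018 recommended value.

```python
import math

G = 6.67430e-11      # Newton's constant in m^3 kg^-1 s^-2 (CODATA 2018)
C = 299_792_458.0    # speed of light in m/s (exact by definition)

# Prefactor of R in the Hilbert action; c^4 / G has units of force
action_prefactor = C**4 / (16.0 * math.pi * G)

# Coupling between curvature and stress-energy in the field equations
kappa = 8.0 * math.pi * G / C**4

# Step 7 moves T/2 to the other side and divides by the prefactor,
# so the two constants multiply to one half
print(f"c^4 / (16 pi G) = {action_prefactor:.4e} N")
print(f"8 pi G / c^4    = {kappa:.4e} 1/N")
print(f"product         = {action_prefactor * kappa:.3f}")
```

The coupling is of order $10^{-43}$ in SI units, which is one way of seeing why it takes an enormous amount of stress-energy to produce noticeable curvature.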

Fini.

But not fini. This was a math exercise. Note how little physics was involved. There are a huge number of physics issues one could go into. As an example, these equations couple to particles with integer spin, which is good for bosons, but there are quite a few fermions that also participate in gravity. To include those, one can consider the metric and the connection to be independent of each other. That is the Palatini approach.

Doug

Next Monday/Tuesday: Dot and cross products, differences and overlaps with quaternions
