Falling through the earth part 2

In my previous post, I showed that an elevator falling from the surface through the center of the earth due to gravity alone would obey the dynamics of a simple harmonic oscillator. I did not know what would happen if the shaft went through some arbitrary chord through the earth. Rick Gerkin believed that it would take the same amount of time for all chords and it turns out that he is correct. The proof is very simple. Consider any chord (straight path) through the earth. Now take a plane and slice the earth through that chord and the center of the earth. This is always possible because it takes three points to specify a plane. Now looking perpendicular to the plane, you can always rotate the earth such that you see

Let the blue dot represent the elevator on this chord and let x be its displacement along the chord from the midpoint. It will fall towards the midpoint. The total force on the elevator is towards the center of the earth along the vector r. From the previous post, we know that the gravitational acceleration is -\omega^2 r, i.e. it has magnitude \omega^2 r and points towards the center. The force driving the elevator is the component along the chord, which has a magnitude given by \omega^2 times r times the cosine of the angle between x and r. But r times this cosine is exactly equal to x! Thus, the acceleration of the elevator along the chord is -\omega^2 x and the equation of motion for the elevator is \ddot x = -\omega^2 x, which is true for all chords and is the same as what we derived before. Hence, it will take the same amount of time to transit the earth. This is a perfect example of how most problems are solved by conceptualizing them in the right way.
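As a rough numerical check of this argument, here is a minimal sketch that integrates the motion along chords at different offsets from the center. It assumes a uniform-density earth with g = 9.8 m/s^2 and R = 6.371 million metres; the function names and the choice of a velocity Verlet integrator are mine and only meant as an illustration.

```python
import math

G_SURFACE = 9.8                 # m/s^2, surface gravity (assumed value)
R_EARTH = 6.371e6               # m, radius of the earth (assumed value)
OMEGA2 = G_SURFACE / R_EARTH    # omega^2 = g/R for a uniform-density earth


def chord_accel(x, d):
    """Acceleration along a chord whose midpoint is a distance d from the center."""
    r = math.hypot(x, d)        # distance of the elevator from the center
    if r == 0.0:
        return 0.0              # no net force exactly at the center
    g_r = OMEGA2 * r            # magnitude of gravity at radius r
    return -g_r * (x / r)       # project onto the chord: cos(angle) = x / r


def chord_transit_time(d, dt=0.1):
    """Integrate from rest at one end of the chord until the elevator stops at the other."""
    x = math.sqrt(R_EARTH**2 - d**2)   # start at the surface
    v, t = 0.0, 0.0
    a = chord_accel(x, d)
    while True:
        x += v * dt + 0.5 * a * dt * dt
        a_new = chord_accel(x, d)
        v += 0.5 * (a + a_new) * dt
        a = a_new
        t += dt
        if v >= 0.0:            # velocity returns to zero at the far end
            return t


if __name__ == "__main__":
    for frac in (0.0, 0.5, 0.9):
        minutes = chord_transit_time(frac * R_EARTH) / 60.0
        print(f"chord offset {frac:.1f} R: {minutes:.1f} min")
```

Under these assumptions all three chords should give essentially the same transit time, namely \pi/\omega.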

Falling through the earth

The 2012 remake of the classic film Total Recall features a giant elevator that plunges through the earth from Australia to England. This trip is called the “fall”, which I presume to mean it is propelled by gravity alone in an evacuated tube. The film states that the trip takes 17 minutes (I don’t remember if this is to get to the center of the earth or the other side). It also made some goofy point that the seats flip around in the vehicle when you cross the center because gravity reverses. This makes no sense because when you fall you are weightless and, if you are strapped in, what difference does it make which direction you are facing? In any case, I was still curious to know if 17 minutes was remotely accurate and the privilege of a physics education is that one is given the tools to calculate the transit time through the earth due to gravity.

The first thing to do is to make an order of magnitude estimate to see if the time is even in the ballpark. For this you only need middle school physics. The gravitational acceleration for a mass at the surface of the earth is g = 9.8 m/s^2. The radius of the earth is 6.371 million metres. Using the formula that distance r = 1/2 g t^2 (which you get by integrating the constant acceleration twice over time), you get t = \sqrt{2 r / g}. Plugging in the numbers gives 1140 seconds or 19 minutes. So it would take 19 minutes to get to the center of the earth if you constantly accelerated at 9.8 m/s^2. It would take the same amount of time to get back to the surface. Given that the gravitational acceleration at the surface should be an upper bound, the real transit time should be longer. I don’t know who they consulted but 17 minutes is not too far off.
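The back of the envelope estimate is easy to reproduce. This is a minimal sketch using the same rounded values of g and the earth's radius; the function name is just for illustration.

```python
import math

g = 9.8          # m/s^2, gravitational acceleration at the surface
R = 6.371e6      # m, radius of the earth


def constant_g_fall_time(distance, accel=g):
    """Time to cover `distance` from rest at constant acceleration: t = sqrt(2 r / g)."""
    return math.sqrt(2.0 * distance / accel)


t_center = constant_g_fall_time(R)
print(f"{t_center:.0f} s = {t_center / 60:.1f} min")  # prints: 1140 s = 19.0 min
```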

We can calculate a more accurate time by including the effect of the gravitational force changing as you transit through the earth but this will require calculus. It’s a beautiful calculation so I’ll show it here. Newton’s law for the gravitational force between a point mass m and a point mass M separated by a distance r is

F = - \frac{G Mm}{r^2}

where G = 6.67\times 10^{-11} m^3 kg^{-1} s^{-2} is the gravitational constant. If we assume that mass M (i.e. earth) is fixed then Newton’s 2nd law of motion for the mass m is given by m \ddot r = F. The equivalence of inertial mass and gravitational mass means you can divide m from both sides. So, if a mass m were outside of the earth, then it would accelerate towards the earth as

\ddot r = F/m = - \frac{G M}{r^2}

and this number is g when r is the radius of the earth. This is the reason that all objects fall with the same acceleration, apocryphally validated by Galileo on the Tower of Pisa. (It is also the basis of the theory of general relativity).

However, the earth is not a point but an extended ball where each point of the ball exerts a gravitational force on any mass inside or outside of the ball. (Nothing can shield gravity). Thus to compute the force acting on a particle we need to integrate over the contributions of each point inside the earth. Assume that the density of the earth \rho is constant so the mass M = \frac{4}{3} \pi  R^3\rho where R is the radius of the earth. The force between two point particles acts in a straight line between the two points. Thus for an extended object like the earth, each point in it will exert a force on a given particle in a different direction. So the calculation involves integrating over vectors. This is greatly simplified because a ball is highly symmetric. Consider the figure below, which is a cross section of the earth sliced down the middle.

The particle/elevator is the red dot, which is located a distance r from the center. (It can be inside or outside of the earth.) We will assume that it travels on an axis through the center of the earth. We want to compute the net gravitational force on it from each point in the earth along this central axis. All distances (positive and negative) are measured with respect to the center of the earth. The blue dot is a point inside the earth with coordinates (x,y). There is also the third coordinate coming out of the page but we will not need it. For each blue point on one side of the central axis there is another point mirrored on the opposite side of the axis. The forces exerted by the two blue points on the red point are symmetrical. Their contributions in the y direction are exactly opposite and cancel, leaving the net force only along the x axis. In fact there is an entire circle of points with radius y (orthogonal to the page) around the central axis where each point on the circle combines with a partner point on the opposite side to yield a force only along the x axis. Thus to compute the net force on the elevator we just need to integrate the contribution from concentric rings over the volume of the earth. This reduces an integral over three dimensions to just two.

The magnitude of the force (density) between the blue and red dot is given by

\frac{G m \rho}{(r-x)^2+y^2}

To get the component of the force along the x direction we need to multiply by the cosine of the angle between the central axis and the line from the red dot to the blue dot, which is

\frac{r-x}{((r-x)^2+y^2)^{1/2}}

(i.e. ratio of the adjacent to the hypotenuse of the relevant triangle). Now, to capture the contributions for all the pairs on the circle we multiply by the circumference, which is 2\pi y. Putting this together gives

F/m = -G\rho \int_{-R}^{R}\int_{0}^{\sqrt{R^2-x^2}}\frac{r-x}{((r-x)^2+y^2)^{3/2}} 2\pi y dy dx

The y integral extends from zero to the edge of the earth, which is \sqrt{R^2-x^2}. (This is R at x=0 (the center) and zero at x=\pm R (the poles) as expected). The x integral extends from one pole to the other, hence -R to R. Completing the y integral gives

2\pi G\rho \int_{-R}^{R}\left. \frac{r-x}{((r-x)^2+y^2)^{1/2}} \right|_{0}^{\sqrt{R^2-x^2}}dx

= 2\pi G\rho \int_{-R}^{R}\left[ \frac{r-x}{((r-x)^2+R^2-x^2)^{1/2}} - \frac{r-x}{|r-x|} \right]dx (*)

The second term comes from the 0 limit of the integral, which is \frac{r-x}{((r-x)^2)^{1/2}}. The square root of a number has positive and negative roots but the denominator here is a distance and thus is always a positive quantity and thus must include the absolute value. The first term of the above integral can be completed straightforwardly (I’ll leave it as an exercise) but the second term must be handled with care because r-x can change sign depending on whether r is greater or less than x. For a particle outside of the earth r-x is always positive and we get

\int_{-R}^{R} \frac{r-x}{|r-x|} dx =  \int_{-R}^{R} dx = 2R, r > R

Inside the earth, we must break the integral up into two parts

\int_{-R}^{R} \frac{r-x}{|r-x|} dx = \int_{-R}^{r}  dx - \int_{r}^{R} dx = r+R - R + r = 2r, -R \le r\le R

The first term of (*) integrates to

\left[ \frac{(r^2-2rx+R^2)^{1/2}(-2r^2+rx+R^2)}{3r^2}  \right]_{-R}^{R}

= \frac{(r^2-2rR+R^2)^{1/2}(-2r^2+rR+R^2)}{3r^2} -   \frac{(r^2+2rR+R^2)^{1/2}(-2r^2-rR+R^2)}{3r^2}

Using the fact that (r \pm R)^2 = r^2 \pm 2rR + R^2, we get

= \frac{|R-r|(-2r^2+rR+R^2)}{3r^2} -   \frac{|R+r|(-2r^2-rR+R^2)}{3r^2}

(We again need the absolute value sign). For r > R, the particle is outside of the earth and |R-r| = r-R, |R+r| = r + R. Putting everything together gives

F/m = 2\pi G\rho \left[ \frac{6r^2R-2R^3}{3r^2} - 2 R\right] = -\frac{4}{3}\pi R^3\rho G\frac{1}{r^2} = - \frac{MG}{r^2}

Thus, we have explicitly shown that the gravitational force exerted by a uniform ball is equivalent to concentrating all the mass in the center. The same result holds for r < - R with the sign reversed, so the force always points towards the center.

For -R \le r  \le R we have

F/m = 2\pi G\rho \left[ \frac{4}{3} r - 2r\right] =-\frac{4}{3}\pi\rho G r = -\frac{G M}{R^3}r
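For anyone who would rather not do the exercise by hand, the x integral in (*) can be checked numerically. The following is a minimal sketch that uses a simple midpoint rule in units where R = 1 and drops the prefactor 2\pi G\rho; the function name and the number of grid points are arbitrary choices of mine.

```python
import math


def bracket(r, R=1.0, n=100_000):
    """Midpoint-rule estimate of the x integral in (*), without the 2*pi*G*rho prefactor."""
    h = 2.0 * R / n
    total = 0.0
    for i in range(n):
        x = -R + (i + 0.5) * h
        first = (r - x) / math.sqrt(r * r - 2.0 * r * x + R * R)
        second = math.copysign(1.0, r - x)   # (r - x) / |r - x|
        total += (first - second) * h
    return total


if __name__ == "__main__":
    # Inside the earth the closed form is (4/3) r - 2 r = -(2/3) r.
    print(bracket(0.5), -2.0 * 0.5 / 3.0)
    # Outside the earth the closed form is -2 R^3 / (3 r^2).
    print(bracket(2.0), -2.0 / (3.0 * 2.0**2))
```

The two numbers on each printed line should agree closely, which is a useful sanity check on the signs and the absolute values.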

Remarkably, the gravitational force on a particle inside the earth is just the force on the surface scaled by the ratio r/R. The equation of motion of the elevator is thus

\ddot r = - \omega^2 r with \omega^2 = GM/R^3 = g/R

(Recall that the gravitational acceleration at the surface is g = GM/R^2 = 9.8 m/s^2). This is the classic equation for a harmonic oscillator with solutions of the form \sin \omega t. Thus, a period (i.e. round trip) is given by 2\pi/\omega. Plugging in the numbers gives 5062 seconds or 84 minutes. A transit through the earth once would be half that at 42 minutes and the time to fall to the center of the earth would be 21 minutes, which I find surprisingly close to the back of the envelope estimate.
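The harmonic oscillator numbers can be reproduced in a few lines. This sketch assumes the rounded values g = 9.8 m/s^2 and R = 6.371 million metres, which give a period a few seconds different from the 5062 seconds quoted above (presumably computed from slightly different constants), but the same answer in minutes.

```python
import math

g = 9.8          # m/s^2, gravitational acceleration at the surface (rounded)
R = 6.371e6      # m, radius of the earth (rounded)

omega = math.sqrt(g / R)             # omega^2 = GM/R^3 = g/R
period = 2.0 * math.pi / omega       # full round trip, in seconds

print(f"round trip:     {period / 60:.0f} min")
print(f"one-way:        {period / 2 / 60:.0f} min")
print(f"fall to center: {period / 4 / 60:.0f} min")
```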

Now Australia is not exactly antipodal to England so the tube in the movie did not go directly through the center, which would make the calculation much harder. This would be a shorter distance but the gravitational force would be at an angle to the tube so there would be less acceleration and something would need to keep the elevator from rubbing against the walls (wheels or magnetic levitation). I actually don’t know if it would take a shorter or longer time than going through the center. If you calculate it, please let me know.

The dynamics of inflation

Inflation, the steady increase of prices and wages, is a nice example of what is called a marginal mode, line attractor, or invariant manifold in dynamical systems. What this means is that the dynamical system governing wages and prices has an equilibrium that is not a single point but rather a line or curve in price and wage space. This is easy to see because if we suddenly one day decided that all prices and wages were to be denominated in some other currency, say Scooby Snacks, nothing should change in the economy. Instead of making 15 dollars an hour, you now make 100 Scooby Snacks an hour and a Starbucks Frappuccino will now cost 25 Scooby Snacks, etc. As long as wages and prices are in balance, it does not matter what they are denominated in. That is why the negative effects of inflation are more subtle than simply having everything cost more. In a true inflationary state your inputs should always balance your outputs but at an ever increasing price. Inflation is bad because it changes how you think about the future and because adjustments to the economy always take time and have costs.

This is why our current situation of price increases does not yet constitute inflation. We are currently experiencing a supply shock that has made goods scarce and thus prices have increased to compensate. Inflation will only take place when businesses start to increase prices and wages in anticipation of future increases. We can show this in a very simple mathematical model. Let P represent some average of all prices and W represent average wages (actually they will represent the logarithm of both quantities but that will not matter for the argument). So in equilibrium P = W. Now suppose there is some supply shock and prices now increase. In order to get back into equilibrium wages should increase so we can write this as

\dot{W} = P - W

where the dot indicates the first derivative (i.e. rate of change of W is positive if P is greater than W). Similarly, if wages are higher than prices, prices should increase and we have

\dot{P} = W- P

Now notice that the equilibrium (where there is no change in W or P) is given by W=P but given that there is only one equation and two unknowns, there is no unique solution. W and P can have any value as long as they are the same. W – P = 0 describes a line in W-P space and thus it is called a line attractor. (Mathematicians would call this an invariant manifold because a manifold is a smooth surface and the rate of change does not change (i.e. is invariant) on this surface. Physicists would call this a marginal mode because if you were to solve the eigenvalue equation governing this system, it would have a zero eigenvalue, which means that its eigenvector (called a mode) is on the margin between stable and unstable.) Now if you add the two equations together you get

\dot{P} + \dot{W} = \dot{S} = 0

which implies that the rate of change of the sum of P and W, which I call S, is zero. i.e. there is no inflation. Thus if prices and wages respond immediately to changes then there can be no inflation (in this simple model). Now suppose we have instead

\ddot{W} = P - W

\ddot{P} = W-P

The second derivatives of W and P now respond to differences. This is like having a delay or some momentum. Instead of the rate of change of S responding to price-wage differences, the rate of change of the momentum of S responds. Now when we add the two equations together we get

\ddot{S} = 0

If we integrate this we now get

\dot{S} = C

where C is a constant of integration, which I take to be nonnegative. So in this situation, if C is positive then the rate of change of S is positive and S will just keep on increasing forever. Now what is C? Well, it is the anticipatory increase in S. If you were lucky enough that C was zero (i.e. no anticipation) then there would be no inflation. Remember that W and P are logarithms so C is the rate of inflation. Interestingly, the way to combat inflation in this simple toy model is to add a first derivative term. This changes the equation to

\ddot{S} + \dot{S} = 0

which is analogous to adding friction to a mechanical system (used differently to what an economist would call friction). The first derivative counters the anticipatory effect of the second derivative. The solution to this equation will return to a state of zero inflation (exercise to the reader).
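The toy model is simple enough to simulate directly. Here is a minimal sketch of the second-order model with an optional friction term applied to both P and W, so that their sum obeys \ddot{S} + k\dot{S} = 0. The initial values, the friction coefficient k, and the semi-implicit Euler time stepping are all assumptions made for illustration, not part of the argument above.

```python
def simulate(friction, c0=0.02, t_end=20.0, dt=0.001):
    """Integrate P'' = W - P - k P' and W'' = P - W - k W' with k = friction.

    Returns the final S = P + W and the final inflation rate dS/dt.
    """
    p, w = 0.1, 0.0                 # log-prices jump above log-wages (supply shock)
    vp, vw = c0 / 2.0, c0 / 2.0     # anticipatory momentum, so dS/dt starts at c0
    for _ in range(int(round(t_end / dt))):
        ap = (w - p) - friction * vp
        aw = (p - w) - friction * vw
        vp += ap * dt
        vw += aw * dt
        p += vp * dt
        w += vw * dt
    return p + w, vp + vw


if __name__ == "__main__":
    print("no friction:  ", simulate(friction=0.0))   # dS/dt stays at c0
    print("with friction:", simulate(friction=1.0))   # dS/dt decays toward zero
```

Without friction the inflation rate stays at its initial value C, and with friction it decays exponentially (for k = 1 the exact solution is \dot{S} = C e^{-t}), so S settles at a new, higher level and stops growing.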

Now of course this model is too simple to actually describe the real economy but I think it gives an intuition to what inflation is and is not.

2022-05-18: Typos corrected.

The probability of extraterrestrial life

Since the discovery of exoplanets nearly 3 decades ago, most astronomers, at least the public facing ones, seem to agree that it is just a matter of time before they find signs of life such as the presence of volatile gases in the atmosphere associated with life like methane or oxygen. I’m an agnostic on the existence of life outside of earth because we don’t have any clue as to how easy or hard it is for life to form. To me, it is equally possible that the visible universe is teeming with life or that we are alone. We simply do not know.

But what would happen if we find life on another planet? How would that change our expected probability for life in the universe? MIT astronomer Sara Seager once made an offhand remark in a podcast that finding another planet with life would make it very likely there were many more. But is this true? Does the existence of another planet with life mean a dramatic increase in the probability of life in the universe? We can find out by doing the calculation.

Suppose you believe that the probability of life on a planet is f (i.e. fraction of planets with life) and this probability is uniform across the universe. Then if you search n planets, the probability for the number of planets with life you will find is given by a Binomial distribution. The probability that x of them have life is given by the expression P(x | f) = C(x,n) f^x(1-f)^{n-x}, where C is a factor (the binomial coefficient) such that the sum over x from zero to n is 1. By Bayes Theorem, the posterior probability for f (yes, that would be the probability of a probability) is given by

P(f | x) = \frac{ P(x | f) P(f)}{P(x)}

where P(x) = \int_0^1 P(x | f) P(f)  df. As expected, the posterior depends strongly on the prior. A convenient way to express the prior probability is to use a Beta distribution

P(f |\alpha, \beta) = B(\alpha,\beta)^{-1} f^{\alpha-1} (1-f)^{\beta-1} (*)

where B is again a normalization constant (the Beta function). The mean of a beta distribution is given by E(f) = \alpha/(\alpha + \beta) and the variance, which is a measure of uncertainty, is given by Var(f) = \alpha \beta /[(\alpha + \beta)^2 (\alpha + \beta + 1)]. The posterior distribution for f after observing x planets with life out of n will be

P(f | x) = D f^{\alpha + x -1} (1-f)^{n+\beta - x -1}

where D is a normalization factor. This is again a Beta distribution. The Beta distribution is called the conjugate prior for the Binomial because its form is preserved in the posterior.

Applying Bayes theorem with the prior in equation (*), we see that the mean and variance of the posterior become (\alpha+x)/(\alpha + \beta +n) and (\alpha+x)( \beta+n-x) /[(\alpha + \beta + n)^2 (\alpha + \beta + n + 1)], respectively. Now let’s consider how our priors have updated. Suppose our prior was \alpha = \beta = 1, which gives a uniform distribution for f on the range 0 to 1. It has a mean of 1/2 and a variance of 1/12. If we find one planet with life after checking 10,000 planets then our expected f becomes 2/10002 with variance 2\times 10^{-8}. The observation of a single planet has greatly reduced our uncertainty and we now expect about 1 in 5000 planets to have life. Now what happens if we find no planets? Then, our expected f only drops to about 1 in 10000 and the variance is of the same order (about 1\times 10^{-8}). So, the difference between finding a planet versus not finding a planet only changes our posterior expectation by a factor of two if we had no prior bias. But suppose we are really skeptical and have a prior with \alpha =0 and \beta = 1 so our expected probability is zero with zero variance. The observation of a single planet increases our posterior expectation to 1 in 10001 with about the same small variance. However, if we find a single planet out of much fewer observations like 100, then our expected probability for life would be even higher but with more uncertainty. In any case, Sara Seager’s intuition is correct: finding a planet would be a game changer and not finding one shouldn’t really discourage us that much.
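These numbers are quick to verify. The following is a minimal sketch of the Beta-Binomial update; the function name is mine and the cases are the ones discussed above.

```python
def beta_posterior(alpha, beta, x, n):
    """Posterior mean and variance of f after finding x planets with life out of n."""
    a = alpha + x          # updated alpha
    b = beta + n - x       # updated beta
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


if __name__ == "__main__":
    cases = {
        "uniform prior, 1 of 10000":   (1, 1, 1, 10_000),
        "uniform prior, 0 of 10000":   (1, 1, 0, 10_000),
        "skeptical prior, 1 of 10000": (0, 1, 1, 10_000),
        "uniform prior, 1 of 100":     (1, 1, 1, 100),
    }
    for label, args in cases.items():
        mean, var = beta_posterior(*args)
        print(f"{label}: mean = {mean:.3g}, variance = {var:.3g}")
```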

Why middle school science should not exist

My 8th grade daughter had her final (distance learning) science quiz this week on work, or as it is called in her class, the scientific definition of work. I usually have no idea what she does in her science class since she rarely talks to me about school but she so happened to mention this one tidbit because she was proud that she didn’t get fooled by what she thought was a trick question. I’ve always believed that work, as in force times displacement (not the one where you produce economic value), is one of the most useless concepts in physics and should not be taught to anyone until they reach graduate school, if then. It is a concept that has long outlived its usefulness and all it does now is to convince students that science is just a bunch of concepts invented to confuse you. The problem with science education in general is that it is taught as a set of facts and definitions when the only thing that kids need to learn is that science is about trying to show something is true using empirical evidence. My daughter’s experience is evidence that science education in the US has room for improvement.

Work, as defined in science class, is just another form of energy, and the only physics that should be taught to middle school kids is that there are these quantities in the universe called energy and momentum and they are conserved. Work is just the change in energy of a system due to a force moving something. For example, the work required to lift a mass against gravity is the distance the mass was lifted multiplied by the force used to move it. This is where it starts to get a little confusing because there are actually two reasons you need force to move something. The first is because of Newton’s First Law of inertia – things at rest like to stay at rest and things in motion like to stay in motion. In order to move something from rest you need to accelerate it, which requires a force and from Newton’s second law, force equals mass times acceleration, or F = ma. However, if you move something upwards against the force of gravity then even to move at a constant velocity you need to use a force that is equal to the gravitational force pulling the thing downwards, which from Newton’s law of gravitation is given by F = G M m/r^2, where G is the universal gravitational constant, M is the mass of the earth, m is the mass of the object and r is the distance between the objects. By a very deep property of the universe, the mass in Newton’s law of gravitation is the exact same mass as that in Newton’s second law, called inertial mass. So that means if we let GM/r^2 = g, then we get F = mg, and g = 9.8 m/s^2 is the gravitational acceleration constant if we set r to be the radius of the earth, which is much bigger than the height of things we usually deal with in our daily lives. All things dropped near the earth will accelerate to the ground at 9.8 m/s^2. If gravitational mass and inertial mass were not the same, then objects of different masses would not fall with the same acceleration. Many people know that Galileo showed this fact in his famous experiment where he dropped a big and small object from the Leaning Tower of Pisa. However, many probably also cannot explain why, including my grade 7 (or was it 8) science teacher who thought it was because the earth’s mass was much bigger than the two objects so the difference was not noticeable. The equivalence of gravitational and inertial mass was what led Einstein to his General Theory of Relativity.

In the first part of my daughter’s quiz, she was asked to calculate the energy consumed by several appliances in her house for one week. She had to look up how much power was consumed by the refrigerator, computer, television and so forth on the internet. Power is energy per unit time so she computed the amount of energy used by multiplying the power used by the total time the device is on per week. In the second part of the quiz she was asked to calculate how far she must move to power those devices. This is actually a question about conservation of energy and to answer the question she had to equate the energy used with the work definition of force times distance traveled. The question told her to use gravitational force, which implies she had to be moving upwards against the force of gravity, or accelerating at g if moving horizontally, although this was not specifically mentioned. So, my daughter took the energy used to power all her appliances and divided it by the force, i.e. her mass times g, and got a distance. The next question was, and I don’t recall exactly how it was phrased but something to the effect of: “Did you do scientifically defined work when you moved?”
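To make the arithmetic concrete, here is a minimal sketch of the calculation the quiz was asking for. All of the appliance wattages, usage hours and the body mass below are hypothetical numbers I made up for illustration; they are not the values from the quiz.

```python
G_ACCEL = 9.8  # m/s^2, gravitational acceleration near the surface

# Hypothetical appliances: (power in watts, hours used per week)
appliances = {
    "refrigerator": (150.0, 168.0),
    "computer": (100.0, 20.0),
    "television": (100.0, 10.0),
}


def weekly_energy_joules(devices):
    """Energy = power x time, summed over devices (hours converted to seconds)."""
    return sum(watts * hours * 3600.0 for watts, hours in devices.values())


def climb_height(energy_joules, mass_kg):
    """Height you must lift mass_kg against gravity so that work = m g h = energy."""
    return energy_joules / (mass_kg * G_ACCEL)


energy = weekly_energy_joules(appliances)
height = climb_height(energy, mass_kg=50.0)   # hypothetical 50 kg student
print(f"energy = {energy:.3g} J, climb = {height / 1000:.0f} km")
```

With these made up numbers the climb comes out to a couple of hundred kilometres, which gives a feel for how much energy household appliances use compared to what a person can do mechanically.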

Now, in her class, she probably spent a lot of time examining situations to distinguish work from non-work. Lifting a weight is work, a cat riding a Roomba is not work. She learned that you did no work when you walked because the force was perpendicular to your direction of motion. I find these types of gotcha exercises to be useless at best and in my daughter’s case completely detrimental. If you were to walk by gliding along completely horizontally with absolutely no vertical motion at a constant speed then yes you are technically not doing mechanical work. But your muscles are contracting and expanding and you are consuming energy. It’s not your weight times the distance you moved but some very complicated combination of metabolic rate, muscle biochemistry, energy losses in your shoes, etc. Instead of looking at examples and identifying which are work and which are not, it would be so much more informative if they were asked to deduce how much energy would be consumed in doing these things. The cat on the Roomba is not doing work but the Roomba is using energy to turn an electric motor that has to turn the wheel to move the cat. It has to accelerate from standing still and also gets warm, which means some of the energy is wasted to heat. A microwave oven uses energy because it must generate radio waves. Boiling water takes energy because you need to impart random kinetic energy to the water molecules. A computer uses energy because it needs to send electrons through transistors. Refrigerators work by using work energy to pump the heat energy from the inside to the outside. You can’t cool a room by leaving the refrigerator door open because you will just pump heat around in a circle and some of the energy will be wasted as extra heat.

My daughter’s answer to the question of whether work was done was that no work was done because she interpreted movement to be walking horizontally and she knew from all the gotcha examples that walking was not work. She read to me her very legalistically parsed paragraph explaining her reasoning, which made me think that while science may not be in her future, law might be. I tried to convince her that in order for the appliances to run, energy had to come from somewhere so she must have done some work at some point in her travels but she would have no part of it. She said it must be a trick question so the answer has to not make sense. She proudly submitted the quiz convinced more than ever that her so-called scientist Dad is a complete and utter idiot.

 

 

Audio of SIAM talk

Here is an audio recording synchronized to slides of my talk a week and a half ago in Pittsburgh. I noticed some places where I said the wrong thing such as conflating neuron with synapse. I also did not explain the learning part very well. I should point out that we are not applying a control to the network. We train a set of weights so that given some initial condition, the neuron firing rates follow a specified target pattern. I also made a joke that implied that the Recursive Least Squares algorithm dates to 1972. That is not correct. It goes back much further than that. I also take a pot shot at physicists. It was meant as a joke of course and describes many of my own papers.

Talk at Maryland

I gave a talk at the Center for Scientific Computing and Mathematical Modeling at the University of Maryland today.  My slides are here.  I apologize for the excessive number of pages but I had to render each build in my slides, otherwise many would be unreadable.  A summary of the work and links to other talks and papers can be found here.

The problem with sci fi movies

I, like many people, enjoy science fiction films. The biggest problem I find in these fictional universes is not that sounds can propagate through space, that people can travel at the speed of light with no relativistic effects and then decelerate to a stop in a few seconds without even being knocked to the floor, that artificial gravity can be generated everywhere, that power sources rarely need refueling, and so forth. I accept that these are convenient plot devices that keep the story moving forward. Although I do have to say that successful films like 2001: A Space Odyssey and more recently Interstellar and The Martian show that trying to be faithful to science can often provide an even better plot device. I am still impressed by the special effects in 2001 and the amazing attention to detail of director Stanley Kubrick, e.g. near the beginning of the movie when they are on the rotating space station you can see the subtle curvature of the floor inside the rim. I hope the success of these movies leads to more realistic science fiction and even realistic action movies where the violence is realistically portrayed: people can’t be hit by a brick and then get up.

No, the thing that most irks me about science fiction movies is that the film makers either refuse or are too lazy to make their universes self-consistent. This list is in no particular order and is by no means exhaustive.

  1. Why do storm troopers in Star Wars movies wear plastic suits if they don’t protect them from anything?
  2. In an age with extremely powerful computers and communication devices, why should various control systems only be accessed at specific locations in a building or space craft. Do you really need to go to the engine room to fix the engine? Haven’t they progressed beyond a WWII aircraft carrier?
  3. Why are weapons in the future so bad? Why do people ever miss? There is self-aiming, self-guided bullet technology now and in a future universe with flying cars no one has thought of making this? This also goes for spacecraft still engaging in dog fights like the Battle of Britain in 1940.
  4. In the Avengers movies, Iron Man Tony Stark invents a fusion reactor that can fit in his chest and power a flying suit for at least the duration of the movie without ever refueling. Shouldn’t this have transformed the world? This could solve global warming if not end global poverty. Even if he is not making the invention public shouldn’t the rest of the world be working on this?
  5. In the Hunger Games series they have technology to make mutant animals and plants so why is there hunger? They have a ban on GMO’s for food? Why do they still need coal mining or at least need people to do it?
  6. My very first blog post was about the thermodynamic impossibility of the premise of the Matrix movies. Stupid premises seem to be a major problem with the Wachowski siblings’ films that I have seen. They have this pretense for being intellectual and try to infuse their films with a social consciousness but unfortunately fail. The theme in both the Matrix and the more recent film Jupiter Ascending (JA) is that there is an evil future society that treats humans as commodities: as energy in the Matrix and as a source for an immortal elixir in JA. That could be fine if in JA there was something mystical about humans that could not be reproduced elsewhere but what the Wachowskis do instead is try to infuse some science in it so it is not magic. There is a proto-human race that caused the dinosaurs on earth to go extinct so that humans could arise and then waited 65 million years before they could harvest them for the elixir. That was the easiest way to create a farm for humans? A second premise is that the heroine of the movie is an exact genetic replica of a former Queen who owns earth and who bequeathed her wealth to anyone who is a genetic replica. Again, the Wachowskis forgot to do their math. The probability of an exact genetic replica coming from chance, which is what they insisted on, would be at most 1 in 2^{10,000,000} (if differences are only biallelic common variants), which is unimaginably small. The proto-humans are also billions of years old but have not evolved in any way over that time even though squirrel-like creatures turned into humans in 65 million years on earth.
  7. Even in the movie Interstellar, there is a future race of humans that have the technology to tame a black hole and send messages to the past but they can’t send back instructions for making crops that will grow on earth?

I appreciate that some of these movies are not about science or the future but remakes of old western, adventure, or war movies. However, some are really trying to portray a possible future. If that is the case then some amount of self-consistency is necessary to make the story compelling. One very possible future that I don’t see being explored in popular movies is that unlike dystopian futures where there is a return to feudalism and people are exploited by evil overlords or capitalists, a real problem we may face is that people will become obsolete. People should make movies about what a world where machines can replace almost everything people do would look like. In fact a better premise for the Matrix is that we chose to live in a big simulacrum and a subset of us rebelled. Now that would be an interesting movie.

Are we in a fusion renaissance?

Fusion is a potentially unlimited source of non-carbon emitting energy. It requires the mashing together of small nuclei such as deuterium and tritium to make another nucleus and a lot of leftover energy. The problem is that nuclei do not want to be mashed together and thus to achieve fusion you need something to confine high energy nuclei for a long enough time. Currently, there are only two methods that have successfully demonstrated fusion: 1) gravitational confinement as in the center of a star, and 2) inertial confinement as in a nuclear bomb. In order to get nuclei at high enough energy to overcome the energy barrier for a fusion reaction, electrons can no longer be bound to nuclei to form atoms. A gas of quasi-neutral hot nuclei and electrons is called a plasma and has often been dubbed the fourth state of matter. Hence, the physics of fusion is mostly the physics of plasmas.

My PhD work was in plasma physics and although my thesis ultimately dealt with chaos in nonlinear partial differential equations, my early projects were tangentially related to fusion. At that time there were two approaches to attaining fusion, one was to try to do controlled inertial confinement by using massive lasers to implode a tiny pellet of fuel and the second was to use magnetic confinement in a tokamak reactor. Government sponsored research has been focused almost exclusively on these two approaches for the past forty years. There is a huge laser fusion lab at Livermore and an even bigger global project for magnetic confinement fusion in Cadarache France, called ITER. As of today, neither has proven that they will ever be viable sources of energy although there is evidence of break even where the reactors produce more energy than is put in.

However, these approaches may not ultimately be viable and there really has not been much research funding to pursue alternative strategies. This recent New York Times article reports on a set of privately funded efforts to achieve fusion backed by some big names in technology including Paul Allen, Jeff Bezos and Peter Thiel. Although there is well deserved skepticism for the success of these companies (I’m sure my thesis advisor Abe Bers would have had some insightful things to say about them), the time may be ripe for new approaches. In an impressive talk I heard many years ago, roboticist Rodney Brooks remarked that Moore’s Law has allowed robotics to finally be widely available because you could use software to compensate for hardware. Instead of requiring cost prohibitive high precision motors, you could use cheap ones and use software to control them. The hybrid car is only possible because of the software to decide when to use the electric motor and when to use the gas engine. The same idea may also apply to fusion. Fusion is so difficult because plasmas are inherently unstable. Most of the past effort has been geared towards designing physical systems to contain them. However, I can now imagine using software instead.

Finally, government attempts have mostly focused on using a Deuterium-Tritium fusion reaction because it has the highest yield. The problem with this reaction is that it produces a neutron, which then destroys the reactor. However, there are reactions that do not produce neutrons (see here). Abe used to joke that we could mine the moon for Helium 3 to use in a Deuterium-Helium 3 reactor. So, although we may never have viable fusion on earth, it could be a source of energy on Elon Musk’s moon base, although solar would probably be a lot cheaper.

Abraham Bers, 1930 – 2015

I was saddened to hear that my PhD thesis advisor at MIT, Professor Abraham Bers, passed away last week at the age of 85. Abe was a fantastic physicist and mentor. He will be dearly missed by his many students. I showed up at MIT in the fall of 1986 with the intent of doing experimental particle physics. I took Abe’s plasma physics course as a breadth requirement for my degree. When I began, I didn’t know what a plasma was but by the end of the term I had joined his group. Abe was one of the best teachers I have ever had. His lectures exemplified his extremely clear and insightful mind. I still consult the notes from his classes from time to time.

Abe also had a great skill in finding the right problem for students. I struggled to get started doing research but one day Abe came to my desk with this old Russian book and showed me a figure. He said that it didn’t make sense according to the current theory and asked me to see if I could understand it. Somehow, this lit a spark in me and pursuing that little puzzle resulted in my first three papers. However, Abe also realized, even before I did I think, that I actually liked applied math better than physics. Thus, after finishing these papers and building some command in the field, he suggested that I completely switch my focus to nonlinear dynamics and chaos, which was very hot at the time. This turned out to be the perfect thing for me and it also made me realize that I could always change fields. I have never been afraid of going outside of my comfort zone since. I am always thankful for the excellent training I received at MIT.

The most eventful experience of those days was our weekly group meetings. These were famous no holds barred affairs where the job of the audience was to try to tear down everything the presenter said. I would prepare for a week to get ready when it was my turn. I couldn’t even get through the first slide my first time but by the time I graduated, nothing could faze me. Although the arguments could get quite heated at times, Abe never lost his cool. He would also come to my office after a particularly bad presentation to cheer me up. I don’t ever have any stress when giving talks or speaking in public now because I know that there could never be a sharper or tougher audience than Abe.

To me, Abe will always represent the gentleman scholar to which I’ve always aspired. He was always impeccably dressed with his tweed jacket, Burberry trench coat, and trademark bow tie. Well before good coffee became de rigueur in the US, Abe was a connoisseur and kept his coffee in a freezer in his office. He led a balanced life. He took work very seriously but also made sure to have time for his family and other pursuits. I visited him at MIT a few years ago and he was just as excited about what he was doing then as he was when I was a graduate student. Although he is gone, he will not be forgotten. The book he had been working on, Plasma Waves and Fusion, will be published this fall. I will be sure to get a copy as soon as it comes out.

2015-9-16: Here is a link to his MIT obituary.

Hopfield on the difference between physics and biology

Here is a short essay by theoretical physicist John Hopfield of the Hopfield net and kinetic proofreading fame among many other things (hat tip to Steve Hsu). I think much of the hostility of biologists towards physicists and mathematicians that Hopfield talks about has dissipated over the past 40 years, especially amongst the younger set. In fact these days, a good share of Cell, Science, and Nature papers have some computational or mathematical component. However, the trend is towards brute force big data type analysis rather than the simple elegant conceptual advances that Hopfield was famous for. In the essay, Hopfield gives several anecdotes and summarizes them with pithy words of advice. The one that everyone should really heed and one I try to always follow is “Do your best to make falsifiable predictions. They are the distinction between physics and ‘Just So Stories.’”

Talk in Göttingen

I’m currently in Göttingen, Germany at the Bernstein Sparks Workshop: Beyond mean field theory in the neurosciences, a topic near and dear to my heart.  The slides for my talk are here.  Of course no trip to Göttingen would be complete without a visit to Gauss’s grave and Max Born’s house. Photos below.

[Photos: Gauss’s grave and Max Born’s house]

New paper on path integrals

Carson C. Chow and Michael A. Buice. Path Integral Methods for Stochastic Differential Equations. The Journal of Mathematical Neuroscience,  5:8 2015.

Abstract: Stochastic differential equations (SDEs) have multiple applications in mathematical neuroscience and are notoriously difficult. Here, we give a self-contained pedagogical review of perturbative field theoretic and path integral methods to calculate moments of the probability density function of SDEs. The methods can be extended to high dimensional systems such as networks of coupled neurons and even deterministic systems with quenched disorder.

This paper is a modified version of our arXiv paper of the same title.  We added an example of the stochastically forced FitzHugh-Nagumo equation and fixed the typos.

Talk at Jackfest

I’m currently in Banff, Alberta for a Festschrift for Jack Cowan (webpage here). Jack is one of the founders of theoretical neuroscience and has infused many important ideas into the field. The Wilson-Cowan equations that he and Hugh Wilson developed in the early seventies form a foundation for both modeling neural systems and machine learning. My talk will summarize my work on deriving “generalized Wilson-Cowan equations” that include both neural activity and correlations. The slides can be found here. References and a summary of the work can be found here. All videos of the talks can be found here.

 

Addendum: 17:44. Some typos in the talk were fixed.

Addendum: 18:25. I just realized I said something silly in my talk.  The Legendre transform is an involution because the transform of the transform is the original function, i.e. the transform is its own inverse. I said something completely inane instead.

Analytic continuation continued

As I promised in my previous post, here is a derivation of the analytic continuation of the Riemann zeta function to negative integer values. There are several ways of doing this but a particularly simple way is given by Graham Everest, Christian Rottger, and Tom Ward at this link. It starts with the observation that you can write

\int_1^\infty x^{-s} dx = \frac{1}{s-1}

if the real part of s>1. You can then break the integral into pieces with

\frac{1}{s-1}=\int_1^\infty x^{-s} dx =\sum_{n=1}^\infty\int_n^{n+1} x^{-s} dx

=\sum_{n=1}^\infty \int_0^1(n+x)^{-s} dx=\sum_{n=1}^\infty\int_0^1 \frac{1}{n^s}\left(1+\frac{x}{n}\right)^{-s} dx      (1)

For x\in [0,1], you can expand the integrand in a binomial expansion

\left(1+\frac{x}{n}\right)^{-s} = 1 -\frac{sx}{n}+sO\left(\frac{1}{n^2}\right)   (2)

Now substitute (2) into (1) to obtain

\frac{1}{s-1}=\zeta(s) -\frac{s}{2}\zeta(s+1) - sR(s)  (3)

or

\zeta(s) =\frac{1}{s-1}+\frac{s}{2}\zeta(s+1) +sR(s)   (3′)

where the remainder R is an analytic function when Re s > -1 because the resulting series is absolutely convergent. Since the zeta function is analytic for Re s >1, \zeta(s+1) is analytic for Re s >0, and so the right hand side of (3′) is a new definition of \zeta that is analytic for Re s >0 aside from a simple pole at s=1. Now multiply (3) by s-1 and take the limit as s\rightarrow 1 to obtain

\lim_{s\rightarrow 1} (s-1)\zeta(s)=1

which implies that

\lim_{s\rightarrow 0} s\zeta(s+1)=1     (4)

Taking the limit of s going to zero from the right of (3′) gives

\zeta(0^+)=-1+\frac{1}{2}=-\frac{1}{2}

Hence, the analytic continuation of the zeta function to zero is -1/2.
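To make the step from (1) and (2) to (3) explicit, here is the substitution written out. It uses the first-order term -sx/n of the binomial expansion and, as shorthand, collects everything of order 1/n^2 and higher into the remainder -sR(s):

```latex
\begin{aligned}
\frac{1}{s-1}
&=\sum_{n=1}^\infty \frac{1}{n^s}\int_0^1\left(1-\frac{sx}{n}\right)dx - sR(s)\\
&=\sum_{n=1}^\infty \frac{1}{n^s}\left(1-\frac{s}{2n}\right) - sR(s)\\
&=\zeta(s)-\frac{s}{2}\zeta(s+1)-sR(s)
\end{aligned}
```

The factor of 1/2 is just \int_0^1 x\,dx, and the second sum is \zeta(s+1) because of the extra power of n in the denominator.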

The analytic domain of \zeta can be pushed further into the left hand plane by extending the binomial expansion in (2) to

\left(1+\frac{x}{n}\right)^{-s} = \sum_{r=0}^{k+1} \left(\begin{array}{c} -s\\r\end{array}\right)\left(\frac{x}{n}\right)^r + (s+k)O\left(\frac{1}{n^{k+2}}\right)

 Inserting into (1) yields

\frac{1}{s-1}=\zeta(s)+\sum_{r=1}^{k+1} \left(\begin{array}{c} -s\\r\end{array}\right)\frac{1}{r+1}\zeta(r+s) + (s+k)R_{k+1}(s)

where R_{k+1}(s) is analytic for Re s>-(k+1).  Now let s\rightarrow -k^+ and extract out the last term of the sum with (4) to obtain

\frac{1}{-k-1}=\zeta(-k)+\sum_{r=1}^{k} \left(\begin{array}{c} k\\r\end{array}\right)\frac{1}{r+1}\zeta(r-k) - \frac{1}{(k+1)(k+2)}    (5)

Rearranging (5) gives

\zeta(-k)=-\sum_{r=1}^{k} \left(\begin{array}{c} k\\r\end{array}\right)\frac{1}{r+1}\zeta(r-k) -\frac{1}{k+2}     (6)

where I have used

\left( \begin{array}{c} -s\\r\end{array}\right) = (-1)^r \left(\begin{array}{c} s+r -1\\r\end{array}\right)

The righthand side of (6) is now defined for Re s > -k.  Rewrite (6) as

\zeta(-k)=-\sum_{r=1}^{k} \frac{k!}{r!(k-r)!} \frac{\zeta(r-k)(k-r+1)}{(r+1)(k-r+1)}-\frac{1}{k+2}

=-\sum_{r=1}^{k} \left(\begin{array}{c} k+2\\ k-r+1\end{array}\right) \frac{\zeta(r-k)(k-r+1)}{(k+1)(k+2)}-\frac{1}{k+2}

=-\sum_{r=1}^{k-1} \left(\begin{array}{c} k+2\\ k-r+1\end{array}\right) \frac{\zeta(r-k)(k-r+1)}{(k+1)(k+2)}-\frac{1}{k+2} - \frac{\zeta(0)}{k+1}

Collecting terms, substituting for \zeta(0) and multiplying by (k+1)(k+2)  gives

(k+1)(k+2)\zeta(-k)=-\sum_{r=1}^{k-1} \left(\begin{array}{c} k+2\\ k-r+1\end{array}\right) \zeta(r-k)(k-r+1) - \frac{k}{2}

Reindexing gives

(k+1)(k+2)\zeta(-k)=-\sum_{r'=2}^{k} \left(\begin{array}{c} k+2\\ r'\end{array}\right) \zeta(-r'+1)r'-\frac{k}{2}

Now, note that the Bernoulli numbers satisfy the condition \sum_{r=0}^{N-1} \left(\begin{array}{c} N\\ r\end{array}\right) B_r = 0 for N\ge 2.  Hence,  let \zeta(-r'+1)=-\frac{B_{r'}}{r'}

and obtain

(k+1)(k+2)\zeta(-k)=\sum_{r'=0}^{k+1} \left(\begin{array}{c} k+2\\ r'\end{array}\right) B_{r'}-B_0-(k+2)B_1-(k+2)B_{k+1}-\frac{k}{2}

which using B_0=1 and B_1=-1/2 gives the self-consistent condition

\zeta(-k)=-\frac{B_{k+1}}{k+1},

which is the analytic continuation of the zeta function for integers k\ge 1.
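As a sanity check on the algebra, the recurrence (6) and the closed form can be compared in exact rational arithmetic. This is only a sketch: the function names are made up for illustration, the Bernoulli numbers are generated from the standard binomial recurrence \sum_{r=0}^{N-1} \binom{N}{r} B_r = 0 (N\ge 2) with the convention B_1=-1/2, and \zeta(0)=-1/2 is taken as the starting value.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2.

    Uses sum_{r=0}^{N-1} C(N, r) B_r = 0 for N >= 2, solved for B_{N-1}.
    """
    B = [Fraction(1)]
    for N in range(2, count + 1):
        partial = sum(comb(N, r) * B[r] for r in range(N - 1))
        B.append(-partial / N)  # the coefficient of B_{N-1} is C(N, N-1) = N
    return B


def zeta_negative_integers(kmax):
    """Return [zeta(0), zeta(-1), ..., zeta(-kmax)] from recurrence (6)."""
    zeta = [Fraction(-1, 2)]  # zeta(0) = -1/2 is the starting value
    for k in range(1, kmax + 1):
        # zeta(r - k) is stored at index k - r
        total = sum(comb(k, r) * zeta[k - r] / (r + 1) for r in range(1, k + 1))
        zeta.append(-total - Fraction(1, k + 2))
    return zeta


if __name__ == "__main__":
    kmax = 8
    B = bernoulli_numbers(kmax + 2)
    zeta = zeta_negative_integers(kmax)
    for k in range(1, kmax + 1):
        closed_form = -B[k + 1] / (k + 1)
        print(k, zeta[k], closed_form, zeta[k] == closed_form)
```

Because everything is a `Fraction`, agreement between the two columns is exact rather than approximate.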

Analytic continuation

I have received some skepticism that there are possibly other ways of assigning the sum of the natural numbers to a number other than -1/12 so I will try to be more precise. I thought it would be also useful to derive the analytic continuation of the zeta function, which I will do in a future post.  I will first give a simpler example to motivate the notion of analytic continuation. Consider the geometric series 1+s+s^2+s^3+\dots. If |s| < 1 then we know that this series is equal to

\frac{1}{1-s}                (1)

Now, while the geometric series is only convergent and thus analytic inside the unit circle, (1) is defined everywhere in the complex plane except at s=1. So even though the sum doesn’t really exist outside of the domain of convergence, we can assign a number to it based on (1). For example, if we set s=2 we can make the assignment of 1 + 2 + 4 + 8 + \dots = -1. So again, the sum of the powers of two doesn’t really equal -1, only (1) is defined at s=2. It’s just that the geometric series and (1) are the same function inside the domain of convergence. Now, it is true that the analytic continuation of a function is unique. However, although the value of -1 for s=2 is the only value for the analytic continuation of the geometric series, that doesn’t mean that the sum of the powers of 2 needs to be uniquely assigned to negative one because the sum of the powers of 2 is not an analytic function. So if you could find some other series that is a function of some parameter z that is analytic in some domain of convergence and happens to look like the sum of the powers of two for some z value, and you can analytically continue the series to that value, then you would have another assignment.
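The geometric series example is easy to play with numerically. The sketch below (function names are illustrative only) compares partial sums with 1/(1-s) inside the unit circle, then shows that at s=2 the partial sums blow up while the closed form still returns a finite number.

```python
def geometric_partial_sum(s, terms):
    """Sum of 1 + s + s**2 + ... with `terms` terms."""
    return sum(s**n for n in range(terms))


def closed_form(s):
    """The function 1/(1 - s), defined for every s except s = 1."""
    return 1 / (1 - s)


if __name__ == "__main__":
    # Inside the unit circle the partial sums approach the closed form.
    for s in (0.5, -0.9, 0.3 + 0.4j):
        print(s, geometric_partial_sum(s, 400), closed_form(s))

    # Outside the unit circle the partial sums grow without bound,
    # but the closed form still returns a finite number.
    print([geometric_partial_sum(2, terms) for terms in (5, 10, 20)])
    print(closed_form(2))
```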

Now consider my example from the previous post. Consider the series

\sum_{n=1}^\infty \frac{n-1}{n^{s+1}}  (2)

This series is absolutely convergent for s>1.  Also note that if I set s=-1, I get

\sum_{n=1}^\infty (n-1) = 0 +\sum_{n'=1}^\infty n' = 1 + 2 + 3 + \dots

which is the sum of the natural numbers. Now, I can write (2) as

\sum_{n=1}^\infty\left( \frac{1}{n^s}-\frac{1}{n^{s+1}}\right)

and when the real part of s is greater than 1,  I can further write this as

\sum_{n=1}^\infty\frac{1}{n^s}-\sum_{n=1}^\infty\frac{1}{n^{s+1}}=\zeta(s)-\zeta(s+1)  (3)

All of these operations are perfectly fine as long as I’m in the domain of absolute convergence.  Now, as I will show in the next post, the analytic continuation of the zeta function to the negative integers is given by

\zeta (-k) = -\frac{B_{k+1}}{k+1}

where B_k are the Bernoulli numbers, which are given by the Taylor expansion of

\frac{x}{e^x-1} = \sum_{n=0}^\infty B_n \frac{x^n}{n!}   (4)

The first few Bernoulli numbers are B_0=1, B_1=-1/2, B_2 = 1/6. Thus using these in the formula for \zeta(-k) gives \zeta(-1)=-1/12. A similar proof will give \zeta(0)=-1/2.  Using these values in (3) at s=-1 then gives \zeta(-1)-\zeta(0)=-1/12+1/2=5/12, the desired result that the sum of the natural numbers is (also) 5/12.
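The identity (3) is an ordinary equality wherever both series converge, and that part can be checked numerically. Here is a minimal sketch at s=3, where the reference values \zeta(3)\approx 1.2020569031595942 and \zeta(4)=\pi^4/90 are quoted rather than computed; the function name and the number of terms are arbitrary choices.

```python
import math


def series_2_partial(s, terms):
    """Partial sum of series (2): sum over n of (n - 1) / n**(s + 1)."""
    return math.fsum((n - 1) / n ** (s + 1) for n in range(1, terms + 1))


# Known values, quoted rather than computed here:
ZETA_3 = 1.2020569031595942   # Apery's constant
ZETA_4 = math.pi ** 4 / 90    # closed form for zeta(4)

if __name__ == "__main__":
    s = 3
    lhs = series_2_partial(s, 200_000)
    rhs = ZETA_3 - ZETA_4
    print(lhs, rhs, abs(lhs - rhs))
```

The small leftover difference is the truncated tail of the series, which shrinks roughly like 1/(2N^2) for N terms. Nothing like this check is possible at s=-1, which is the whole point: there only the continued functions have values.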

Now this is not to say that all assignments have the same physical value. I don’t know the details of how -1/12 is used in bosonic string theory but it is likely that the zeta function is crucial to the calculation.

Nonuniqueness of -1/12

I’ve been asked in the comments to my previous post to give an example of how the sum of the natural numbers could lead to another value, and I thought it may be of general interest. Consider again S=1+2+3+4\dots to be the sum of the natural numbers.  The video in the previous post gives a simple proof by combining divergent sums. In essence, the manipulation is doing renormalization by subtracting away infinities and the leftover of this renormalization is -1/12. There is another video that gives the proof through analytic continuation of the Riemann zeta function

\zeta(s)=\sum_{n=1}^\infty \frac{1}{n^s}

The zeta function sum is only convergent when the real part of s is greater than 1. However, you can use analytic continuation to extend the zeta function to values of s where the sum is divergent. What this means is that the zeta function is no longer the “same sum” per se, but a version of the sum taken to a domain where it was not originally defined but smoothly (analytically) connected to the sum. Formally, the sum of the natural numbers is \zeta(-1) and the infinite sum over ones is \zeta(0)=\sum_{n=1}^\infty 1. By analytic continuation, we obtain the values \zeta(-1)=-1/12 and \zeta(0)=-1/2.

Now notice that if I subtract the sum over ones from the sum over the natural numbers I still get the sum over the natural numbers, e.g.

1+2+3+4\dots - (1+1+1+1\dots)=0+1+2+3+4\dots.

Now, let me define a new function \xi(s)=\zeta(s)-\zeta(s+1) so \xi(-1) is the sum over the natural numbers and by analytic continuation \xi(-1)=-1/12+1/2=5/12 and thus the sum over the natural numbers is now 5/12. Again, if you try to do arithmetic with infinity, you can get almost anything. A fun exercise is to create some other examples.
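One way to generate other examples, offered only as an illustration of the same formal manipulation, is to subtract the sum over ones m times instead of once. Formally \sum_{n\ge 1}(n-m) = S - m(m-1)/2, since the first m terms are -(m-1),\dots,-1,0 and the rest is 1+2+3+\dots, while the corresponding series is assigned \zeta(-1)-m\zeta(0). The function name below is hypothetical and m=1 reproduces the 5/12 above.

```python
from fractions import Fraction

ZETA_MINUS_1 = Fraction(-1, 12)  # analytic continuation value quoted in the text
ZETA_0 = Fraction(-1, 2)         # analytic continuation value quoted in the text


def assigned_sum(m):
    """Value assigned to 1 + 2 + 3 + ... after subtracting the sum over ones m times.

    Formally, sum_{n>=1} (n - m) = S - m(m - 1)/2, and the same series is
    assigned zeta(-1) - m * zeta(0), so we solve for S.
    """
    regularized = ZETA_MINUS_1 - m * ZETA_0
    return regularized + Fraction(m * (m - 1), 2)


if __name__ == "__main__":
    for m in range(4):
        print(m, assigned_sum(m))
```

Each m gives a different "value" for the same divergent sum, which is just another way of saying that arithmetic with infinity can give almost anything.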

The sum of the natural numbers is -1/12?

This wonderfully entertaining video giving a proof for why the sum of the natural numbers is -1/12 has been viewed over 1.5 million times. It just shows that there is a hunger for interesting and well explained math and science content out there. Now, we all know that the sum of all the natural numbers is infinite but the beauty (insidiousness) of infinite numbers is that they can be assigned to virtually anything. The proof for this particular assignment considers the subtraction of the divergent oscillating sum S_1=1-2+3-4+5 \dots from the divergent sum of the natural numbers S = 1 + 2 + 3+4+5\dots to obtain 4S.  Then by similar trickery it assigns S_1=1/4. Solving for S gives you the result S = -1/12.  Hence, what you are essentially doing is dividing infinity by infinity and that, as any school child should know, can be anything you want. The most astounding thing to me about the video was learning that this assignment was used in string theory, which makes me wonder if the calculations would differ if I chose a different assignment.
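The bookkeeping in that argument can be spelled out in a few lines. This sketch (names are illustrative) lists the first terms of S - S_1 taken term by term, which are zeros interleaved with 4, 8, 12, ..., and then solves S - S_1 = 4S using the assigned value S_1 = 1/4. It takes the term-by-term subtraction at face value, which is exactly the step that is not justified for divergent sums.

```python
from fractions import Fraction
from itertools import count, islice


def difference_terms(how_many):
    """First terms of S - S_1, taken term by term: n - (-1)**(n + 1) * n."""
    return [n - (-1) ** (n + 1) * n for n in islice(count(1), how_many)]


def solve_for_S(S1):
    """Solve S - S1 = 4 * S for S."""
    return -S1 / 3


if __name__ == "__main__":
    print(difference_terms(8))           # zeros interleaved with multiples of 4
    print(solve_for_S(Fraction(1, 4)))
```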

Addendum: Terence Tao has a nice blog post on evaluating such sums.  In a “smoothed” version of the sum, it can be thought of as the “constant” in front of an asymptotic divergent term.  This constant is equivalent to the analytic continuation of the Riemann zeta function. Anyway, the -1/12 seems to be a natural way to assign a value to the divergent sum of the natural numbers.

Talk in Taiwan

I’m currently at the National Center for Theoretical Sciences, Math Division, on the campus of the National Tsing Hua University, Hsinchu for the 2013 Conference on Mathematical Physiology.  The NCTS is perhaps the best run institution I’ve ever visited. They have made my stay extremely comfortable and convenient.

Here are the slides for my talk on Correlations, Fluctuations, and Finite Size Effects in Neural Networks.  Here is a list of references that go with the talk

E. Hildebrand, M.A. Buice, and C.C. Chow, ‘Kinetic theory of coupled oscillators,’ Physical Review Letters 98, 054101 (2007) [PRL Online] [PDF]

M.A. Buice and C.C. Chow, ‘Correlations, fluctuations and stability of a finite-size network of coupled oscillators’. Phys. Rev. E 76 031118 (2007) [PDF]

M.A. Buice, J.D. Cowan, and C.C. Chow, ‘Systematic Fluctuation Expansion for Neural Network Activity Equations’, Neural Comp., 22:377-426 (2010) [PDF]

C.C. Chow and M.A. Buice, ‘Path integral methods for stochastic differential equations’, arXiv:1009.5966 (2010).

M.A. Buice and C.C. Chow, ‘Effective stochastic behavior in dynamical systems with incomplete information.’ Phys. Rev. E 84:051120 (2011).

MA Buice and CC Chow. Dynamic finite size effects in spiking neural networks. PLoS Comp Bio 9:e1002872 (2013).

MA Buice and CC Chow. Generalized activity equations for spiking neural networks. Front. Comput. Neurosci. 7:162. doi: 10.3389/fncom.2013.00162, arXiv:1310.6934.

Here is the link to relevant posts on the topic.