The probability of extraterrestrial life

Since the discovery of exoplanets nearly three decades ago, most astronomers, at least the public-facing ones, seem to agree that it is just a matter of time before they find signs of life, such as the presence in the atmosphere of gases associated with life like methane or oxygen. I’m an agnostic on the existence of life outside of earth because we don’t have any clue as to how easy or hard it is for life to form. To me, it is equally possible that the visible universe is teeming with life or that we are alone. We simply do not know.

But what would happen if we find life on another planet? How would that change our expected probability for life in the universe? MIT astronomer Sara Seager once made an offhand remark in a podcast that finding another planet with life would make it very likely there were many more. But is this true? Does the existence of another planet with life mean a dramatic increase in the probability of life in the universe? We can find out by doing the calculation.

Suppose you believe that the probability of life on a planet is f (i.e. the fraction of planets with life) and that this probability is uniform across the universe. Then if you search n planets, the number of planets with life you will find follows a Binomial distribution. The probability that x of the n planets have life is given by the expression P(x | f) = C(x,n) f^x(1-f)^{n-x}, where C is a factor (the binomial coefficient) such that the sum over x from zero to n is 1. By Bayes’ theorem, the posterior probability for f (yes, that would be the probability of a probability) is given by

P(f | x) = \frac{ P(x | f) P(f)}{P(x)}

where P(x) = \int_0^1 P(x | f) P(f)  df. As expected, the posterior depends strongly on the prior. A convenient way to express the prior probability is to use a Beta distribution

P(f |\alpha, \beta) = B(\alpha,\beta)^{-1} f^{\alpha-1} (1-f)^{\beta-1} (*)

where B is again a normalization constant (the Beta function). The mean of a Beta distribution is given by E(f) = \alpha/(\alpha + \beta) and the variance, which is a measure of uncertainty, is given by Var(f) = \alpha \beta /[(\alpha + \beta)^2 (\alpha + \beta + 1)]. The posterior distribution for f after observing x planets with life out of n will be

P(f | x) = D f^{\alpha + x -1} (1-f)^{n+\beta - x -1}

where D is a normalization factor. This is again a Beta distribution. The Beta distribution is called the conjugate prior for the Binomial because its form is preserved in the posterior.

Applying Bayes’ theorem with the prior in equation (*), we see that the mean and variance of the posterior become (\alpha+x)/(\alpha + \beta +n) and (\alpha+x)( \beta+n-x) /[(\alpha + \beta + n)^2 (\alpha + \beta + n + 1)], respectively. Now let’s consider how our priors have updated. Suppose our prior was \alpha = \beta = 1, which gives a uniform distribution for f on the range 0 to 1. It has a mean of 1/2 and a variance of 1/12. If we find one planet with life after checking 10,000 planets then our expected f becomes 2/10002 with variance 2\times 10^{-8}. The observation of a single planet has greatly reduced our uncertainty and we now expect about 1 in 5000 planets to have life. Now what happens if we find no planets? Then, our expected f only drops to about 1 in 10000 and the variance is of the same order. So, the difference between finding a planet versus not finding a planet only halves our posterior mean if we had no prior bias. But suppose we are really skeptical and have a prior with \alpha =0 and \beta = 1 so our expected probability is zero with zero variance. The observation of a single planet increases our posterior mean to 1 in 10001 with about the same small variance. However, if we find a single planet out of far fewer observations, like 100, then our expected probability for life would be even higher but with more uncertainty. In any case, Sara Seager’s intuition is correct – finding a planet would be a game changer and not finding one shouldn’t really discourage us that much.
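For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conjugate update. The function name beta_posterior and the scenario labels are my own choices for illustration; the formulas are just the posterior mean and variance given above.

```python
def beta_posterior(alpha, beta, x, n):
    """Mean and variance of the Beta posterior after finding
    x planets with life among n searched, given a Beta(alpha, beta) prior."""
    a = alpha + x          # updated first shape parameter
    b = beta + n - x       # updated second shape parameter
    total = a + b
    mean = a / total
    var = a * b / (total**2 * (total + 1))
    return mean, var


# Scenarios discussed in the text: (alpha, beta, x, n)
scenarios = {
    "uniform prior, 1 hit in 10000": (1, 1, 1, 10_000),
    "uniform prior, 0 hits in 10000": (1, 1, 0, 10_000),
    "skeptical prior, 1 hit in 10000": (0, 1, 1, 10_000),
    "uniform prior, 1 hit in 100": (1, 1, 1, 100),
}

for label, params in scenarios.items():
    mean, var = beta_posterior(*params)
    print(f"{label}: mean = {mean:.3e}, variance = {var:.3e}")
```

Note that \alpha = 0 is an improper prior, so the sketch only makes sense for that case once at least one planet with life has been observed.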

Nobel Prize has outlived its usefulness

The Nobel Prize in Physiology or Medicine was awarded for the discovery of Hepatitis C today. The work is clearly deserving of recognition but this is another case where there were definitely more than three people who played an essential role in the work. I really think that the Nobel Prize should change its rules to allow for more winners. Below is my post from 2013, when one of the winners of this year’s prize, Michael Houghton, turned down the Gairdner Award:

Hepatitis C and the folly of prizes

The scientific world was set slightly aflutter when Michael Houghton turned down the prestigious Gairdner Award for the discovery of Hepatitis C. Harvey Alter and Daniel Bradley were the two other recipients. Houghton, who had previously received the Lasker Award with Alter, felt he could not accept one more award because two colleagues, Qui-Lim Choo and George Kuo, did not receive either of these awards, even though their contributions were equally important.

Hepatitis, which literally means inflammation of the liver, was characterized by Hippocrates and known to be infectious since the 8th century. The disease had been postulated to be viral at the beginning of the 20th century and by the 1960’s two viruses termed Hepatitis A and Hepatitis B had been established. However, there still seemed to be another unidentified infectious agent, which was termed Non-A Non-B Hepatitis (NANBH).

Michael Houghton, George Kuo and Qui-Lim Choo were all working at the Chiron corporation in the early 1980’s. Houghton started a project to discover the cause of NANBH in 1982 with Choo joining a short time later. They made significant progress in generating mouse monoclonal antibodies with some specificity to NANBH infected materials from chimpanzee samples received from Daniel Bradley at the CDC. They used the antibodies to screen cDNA libraries from infected materials but they had not isolated an agent. George Kuo had his own lab at Chiron working on other projects but would interact with Houghton and Choo. Kuo suggested that they try blind cDNA immunoscreening on serum derived from actual NANBH patients. This approach was felt to be too risky but Kuo made a quantitative assessment that showed it was viable. After two years of intensive and heroic screening by the three of them, they identified one clone that was clearly derived from the NANBH genome and not from human or chimp DNA. This was definitive proof that NANBH was a virus, which is now called Hepatitis C. Kuo then developed a prototype of a clinical Hepatitis C antibody detection kit and used it to screen a panel of NANBH blood provided by Harvey Alter of the NIH. Kuo’s test was a resounding success and the blood test that came out of that work has probably saved 300 million or more people from Hepatitis C infection.

The question then is who deserves the prizes. Is it Bradley and Alter, who did careful and diligent work obtaining samples or is it Houghton, Choo, and Kuo, who did the heroic experiments that isolated the virus? For completely unknown reasons, the Lasker was awarded to just Houghton and Alter, which primed the pump for more prizes to these two. Now that the Lasker and Gairdner prizes have been cleared, that leaves just the Nobel Prize. The scientific community could get it right this time and award it to Kuo, Choo, and Houghton.

Addendum added 2013-5-2: I should add that many labs from around the world were also trying to isolate the infective agent of NANBH and all failed to identify the correct samples from Alter’s panel. It is not clear how long it would have taken and how many more people would have been infected if Kuo, Choo, and Houghton had not succeeded when they did.

Why middle school science should not exist

My 8th grade daughter had her final (distance learning) science quiz this week on work, or as it is called in her class, the scientific definition of work. I usually have no idea what she does in her science class since she rarely talks to me about school but she so happened to mention this one tidbit because she was proud that she didn’t get fooled by what she thought was a trick question. I’ve always believed that work, as in force times displacement (not the one where you produce economic value), is one of the most useless concepts in physics and should not be taught to anyone until they reach graduate school, if then. It is a concept that has long outlived its usefulness and all it does now is to convince students that science is just a bunch of concepts invented to confuse you. The problem with science education in general is that it is taught as a set of facts and definitions when the only thing that kids need to learn is that science is about trying to show something is true using empirical evidence. My daughter’s experience is evidence that science education in the US has room for improvement.

Work, as defined in science class, is just another form of energy, and the only physics that should be taught to middle school kids is that there are these quantities in the universe called energy and momentum and they are conserved. Work is just the change in energy of a system due to a force moving something. For example, the work required to lift a mass against gravity is the distance the mass was lifted multiplied by the force used to move it. This is where it starts to get a little confusing because there are actually two reasons you need force to move something. The first is because of Newton’s First Law of inertia – things at rest like to stay at rest and things in motion like to stay in motion. In order to move something from rest you need to accelerate it, which requires a force, and from Newton’s second law, force equals mass times acceleration, or F = ma. However, if you move something upwards against the force of gravity then even to move at a constant velocity you need to use a force that is equal to the gravitational force pulling the thing downwards, which from Newton’s law of gravitation is given by F = G M m/r^2, where G is the universal gravitational constant, M is the mass of the earth, m is the mass of the object and r is the distance between the centres of the two objects. By a very deep property of the universe, the mass in Newton’s law of gravitation is the exact same mass as that in Newton’s second law, called inertial mass. So that means if we let GM/r^2 = g, then we get F = mg, and g = 9.8 m/s^2 is the gravitational acceleration constant if we set r to be the radius of the earth, which is much bigger than the height of things we usually deal with in our daily lives. All things dropped near the earth will accelerate to the ground at 9.8 m/s^2. If gravitational mass and inertial mass were not the same, then objects of different masses would not fall with the same acceleration.
Many people know that Galileo showed this fact in his famous experiment where he dropped a big and small object from the Leaning Tower of Pisa. However, many probably also cannot explain why, including my grade 7 (or was it 8?) science teacher, who thought it was because the earth’s mass was much bigger than the two objects so the difference was not noticeable. The equivalence of gravitational and inertial mass was what led Einstein to his General Theory of Relativity.
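As a quick sanity check on g = GM/r^2, here is a small Python sketch. The numerical values of G, the earth's mass and the earth's mean radius are standard reference values that I am supplying (they are not part of the quiz), and the function name is made up for illustration.

```python
# Standard reference values (assumptions supplied for illustration)
G = 6.674e-11        # universal gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the earth, kg
R_EARTH = 6.371e6    # mean radius of the earth, m


def gravitational_acceleration(height_m=0.0):
    """g = G M / r^2 evaluated at a given height above the earth's surface."""
    r = R_EARTH + height_m
    return G * M_EARTH / r**2


g_surface = gravitational_acceleration()
g_rooftop = gravitational_acceleration(100.0)  # 100 m up, a tall building

print(f"g at the surface: {g_surface:.3f} m/s^2")
print(f"g at 100 m:       {g_rooftop:.3f} m/s^2")
print(f"relative change:  {(g_surface - g_rooftop) / g_surface:.1e}")
```

The mass m of the falling object never appears, which is the whole point: as long as gravitational and inertial mass are equal, it cancels out of the acceleration.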

In the first part of my daughter’s quiz, she was asked to calculate the energy consumed by several appliances in her house for one week. She had to look up how much power was consumed by the refrigerator, computer, television and so forth on the internet. Power is energy per unit time so she computed the amount of energy used by multiplying the power used by the total time the device is on per week. In the second part of the quiz she was asked to calculate how far she must move to power those devices. This is actually a question about conservation of energy and to answer the question she had to equate the energy used with the work definition of force times distance traveled. The question told her to use gravitational force, which implies she had to be moving upwards against the force of gravity, or accelerating at g if moving horizontally, although this was not specifically mentioned. So, my daughter took the energy used to power all her appliances and divided it by the force, i.e. her mass times g, and got a distance. The next question was, and I don’t recall exactly how it was phrased but something to the effect of: “Did you do scientifically defined work when you moved?”
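The calculation in the quiz can be sketched in a few lines of Python. The appliance wattages, hours of use and body mass below are hypothetical numbers I made up for illustration, not the ones my daughter used, and the function names are mine.

```python
G_ACCEL = 9.8  # gravitational acceleration near the earth's surface, m/s^2


def weekly_energy_joules(power_watts, hours_per_week):
    """Energy = power x time, with hours converted to seconds."""
    return power_watts * hours_per_week * 3600


def climb_distance_m(energy_joules, mass_kg):
    """Height you must lift your own mass so that m * g * d equals the energy."""
    return energy_joules / (mass_kg * G_ACCEL)


# Hypothetical appliances: name -> (power in watts, hours on per week)
appliances = {
    "refrigerator": (150, 168),
    "computer": (100, 20),
    "television": (100, 10),
}

total_energy = sum(weekly_energy_joules(p, h) for p, h in appliances.values())
distance = climb_distance_m(total_energy, mass_kg=50)  # hypothetical 50 kg student

print(f"total energy: {total_energy:.3e} J")
print(f"equivalent climb: {distance / 1000:.1f} km")
```

The distance comes out absurdly large for any realistic set of appliances, which is itself a nice lesson about how much energy a household uses compared to what a human body can deliver.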

Now, in her class, she probably spent a lot of time examining situations to distinguish work from non-work. Lifting a weight is work, a cat riding a Roomba is not work. She learned that you did no work when you walked because the force was perpendicular to your direction of motion. I find these types of gotcha exercises to be useless at best and in my daughter’s case completely detrimental. If you were to walk by gliding along completely horizontally with absolutely no vertical motion at a constant speed then yes you are technically not doing mechanical work. But your muscles are contracting and expanding and you are consuming energy. It’s not your weight times the distance you moved but some very complicated combination of metabolic rate, muscle biochemistry, energy losses in your shoes, etc. Instead of looking at examples and identifying which are work and which are not, it would be so much more informative if they were asked to deduce how much energy would be consumed in doing these things. The cat on the Roomba is not doing work but the Roomba is using energy to turn an electric motor that has to turn the wheel to move the cat. It has to accelerate from standing still and also gets warm, which means some of the energy is wasted to heat. A microwave oven uses energy because it must generate radio waves. Boiling water takes energy because you need to impart random kinetic energy to the water molecules. A computer uses energy because it needs to send electrons through transistors. Refrigerators work by using work energy to pump the heat energy from the inside to the outside. You can’t cool a room by leaving the refrigerator door open because you will just pump heat around in a circle and some of the energy will be wasted as extra heat.

My daughter’s answer to the question of whether work was done was that no work was done because she interpreted movement to be walking horizontally and she knew from all the gotcha examples that walking was not work. She read to me her very legalistically parsed paragraph explaining her reasoning, which made me think that while science may not be in her future, law might be. I tried to convince her that in order for the appliances to run, energy had to come from somewhere so she must have done some work at some point in her travels but she would have no part of it. She said it must be a trick question so the answer has to not make sense. She proudly submitted the quiz convinced more than ever that her so-called scientist Dad is a complete and utter idiot.



Duality and computation in the MCU

I took my kindergartener to see Avengers: Endgame recently. My son was a little disappointed, complaining that the film had too much talking and not enough fighting. To me, the immense popularity of the Marvel Cinematic Universe series and so-called science fiction/fantasy in general is an indicator of how people think they like science but really want magic. Popular science-fictiony franchises like MCU and Star Wars are couched in scientism but are often at odds with actual science as practiced today. Arthur C Clarke famously stated in his third law that “Any sufficiently advanced technology is indistinguishable from magic,” a sentiment captured in these films.

Science fiction should extrapolate from current scientific knowledge to the possible. Otherwise, it should just be called fiction. There have been a handful of films that try to do this like 2001: A Space Odyssey or more recently Interstellar and The Martian. I think there is a market for these types of films but they are certainly not as popular as the fantasy films. To be fair, neither Marvel nor Star Wars (both now owned by Disney) market themselves as science fiction as I defined it. They are intended to be mythologies a la Joseph Campbell’s Hero’s Journey. However, they do have a scientific aesthetic with worlds dominated by advanced technology.

Although I find the MCU films not overly compelling, they do bring up two interesting propositions. The first is dualism. The superhero character Ant-Man has a suit that allows him to change size and even shrink to sub-atomic scales, called the quantum realm in the films. (I won’t bother to discuss whether energy is conserved in these near instantaneous size changes, an issue that affects the Hulk as well). The film was advised by physicist Spiros Michalakis and is rife with physics terminology and concepts like quantum entanglement. One crucial concept it completely glosses over is how Ant-Man maintains his identity as a person, much less his shape, when he is smaller than an atom. Even if one were to argue that one’s consciousness could be transferred to some set of quantum states at the sub-atomic scale, it would be overwhelmed by quantum fluctuations. The only self-consistent premise of Ant-Man is that the essence, or soul if you wish, of a person is not material. The MCU takes a definite stand for dualism on the mind-body problem, a sentiment with which I presume the public mostly agrees.

The second is that magic has immense computational power. In the penultimate Avengers movie, the villain Thanos snaps his fingers while in possession of the complete set of infinity stones and eliminates half of all living things. (Setting aside the issue that Thanos clearly does not understand the concept of exponential growth. If you are concerned about overpopulation, it is pointless to shrink the population and do nothing else because it will just return to its original size in a short time.) What I’d like to know is who or what does the computation to carry out the command. There are at least two hard computational problems that must be solved. The first is to identify all lifeforms. This is clearly no easy task as we to this day have no precise definition of life. Do viruses get culled by the snap? Does the population of silicon-based lifeforms of Star Trek get halved or is it only biochemical life? What algorithm does the snap use to find all the life forms? Living things on earth range in size from single cells (or viruses if you count them) all the way to 35 metre behemoths, which are composed of over 10^{23} atoms. How do the stones know what scales they span in the MCU? Do photosynthetic lifeforms get spared since they don’t use many resources? What about fungi? Is the MCU actually a simulated universe where there is a continually updated census of all life? How accurate is the algorithm? Was it perfect? Did it aim for high specificity (i.e. reduce false positives so you only kill lifeforms and not non lifeforms) or high sensitivity (i.e. reduce false negatives and thus don’t miss any lifeforms)? I think it probably favours sensitivity over specificity – who cares if a bunch of ammonia molecules accidentally get killed.
The find-all-life problem is made much easier by proposition 1 because if all life were material then the only way to detect them would be to look for multiscale correlations between atoms (or find organic molecules if you only care about biochemical life). If each lifeform has a soul then you can simply search for “soulfulness”. The lifeforms were not erased instantly but only after a brief delay. What was happening over this delay? Is magic propagation limited by the speed of light or some other constraint? Or did the computation take time? In Endgame, the Hulk restores all the Thanos-erased lifeforms and Tony Stark then snaps away Thanos and all of his allies. Where were the lifeforms after they were erased? In Heaven? In a soul repository somewhere? Is this one of the Nine Realms of the MCU? How do the stones know who is a Thanos ally? The second computation is to then decide which half to extinguish. The movie seems to imply that the choice was random so where did the randomness come from? Do the infinity stones generate random numbers? Do they rely on quantum fluctuations? Finally, in a world with magic, why is there also science? Why does the universe follow the laws of physics sometimes and magic other times? Is magic a finite resource as in Larry Niven’s The Magic Goes Away? So many questions, so few answers.

Mosquito update

It’s been about two weeks since I first set out my bucket, although I had to move it to a less obtrusive location. Still no signs of mosquito larvae, although judging from my bite frequency even with mosquito repellant, mosquito activity is still high in my garden. I see the occasional insect trapped (they are not really floating since at their size water is highly viscous) in the surface and there is a nice collection of plant debris at the bottom. The water level seems a little bit higher. It has rained at least once every two days since my first post although it has also been very hot so the input seems mostly balanced by the evaporative loss. I’m starting to believe that mosquitos have their preferred gestation grounds that they perpetually use and only exploit new locales when necessary.

How to be a scientific dilettante

I have worked in a lot of disparate fields from plasma physics, to nonlinear dynamics, to posture control, to neuroscience, to inflammation, to obesity, to gene transcription, and population genetics. I have had a lot of fun doing this but I definitely do not recommend it as a career path for a young person. I may have gotten away with it but I was really lucky and it certainly carried a cost. First of all, by working in so many fields, I definitely lack the deep knowledge that specialists have. Thus, I don’t always see the big picture and am often a step behind others. Secondly, while I am pretty well cited, my citations are diluted over multiple fields. Thus, while my total H index (number of papers where number of citations exceeds rank) is pretty decent, my H index in each given field is relatively small. I thus do not have much impact in any given area. To be taken seriously as a scientist, one must be a world expert in something. The system is highly nonlinear; being pretty good in a lot of things is much worse than being really good in one thing. There is a threshold for relevance and if you don’t cross it then it is like you don’t exist.
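To make the dilution point concrete, here is a small Python sketch of the H index, using the standard definition (the largest h such that h papers each have at least h citations). The citation counts for the two fields are invented purely for illustration.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the paper at this rank still has enough citations
        else:
            break         # every later paper has even fewer citations
    return h


# Hypothetical citation counts for papers in two unrelated fields
field_a = [30, 12, 9, 4, 2]
field_b = [20, 11, 7, 5, 1]

print("field A:", h_index(field_a))
print("field B:", h_index(field_b))
print("combined:", h_index(field_a + field_b))
```

With these made-up numbers the combined H index is higher than either single-field one, which is the situation I am describing: respectable in total but modest within any one community.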

However, if you do want to work in a lot of fields, the worst thing to do is to say, “Hey I really find field X to be interesting so I’m just going to read some books and papers on it and try to do something.” I have reviewed quite a few papers where some mathematician or physicist has read some popular science book or newspaper article on the topic and then tried to publish a paper on a problem mentioned in the book. I then have to tell them to read up on four decades of previous work first, and then resubmit. The way I have managed to meander through multiple fields is that someone will either contact me directly about some specific question or mention something to me either in a casual setting or at a conference. I could not possibly have made any progress at all if I didn’t have great collaborators who really knew the field and the literature. Still, people constantly ask me if I still work in neuroscience, to which I can only respond “Just because you don’t cite me doesn’t mean I don’t publish!”


The Drake equation and the Cambrian explosion

This summer billionaire Yuri Milner announced that he would spend upwards of 100 million dollars to search for extraterrestrial intelligent life (here is the New York Times article). This quest to see if we have company started about fifty years ago when Frank Drake pointed a radio telescope at some stars. To help estimate the number of possible civilizations, N, Drake wrote down his celebrated equation,

N = R_*f_p n_e f_l f_i f_c L

where R_* is the rate of star formation, f_p is the fraction of stars with planets, n_e is the average number of planets per star that could support life, f_l is the fraction of those planets that develop life, f_i is the fraction of those planets that develop intelligent life, f_c is the fraction of civilizations that emit signals, and L is the length of time civilizations emit signals.
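The equation is just a product, so it is easy to play with. Here is a minimal Python sketch; the parameter values in the example are placeholders I picked to show how the function is called, not estimates that I or anyone else would defend.

```python
def drake(R_star, f_p, n_e, f_l, f_i, f_c, L):
    """Number of signalling civilizations N = R_* f_p n_e f_l f_i f_c L."""
    return R_star * f_p * n_e * f_l * f_i * f_c * L


# Placeholder inputs for illustration only (not real estimates)
example = dict(
    R_star=1.0,   # stars formed per year
    f_p=1.0,      # fraction of stars with planets
    n_e=1.0,      # habitable planets per star with planets
    f_l=1e-3,     # fraction of those that develop life
    f_i=1e-3,     # fraction of those that develop intelligence
    f_c=0.1,      # fraction of those that emit signals
    L=10_000,     # years a civilization keeps emitting
)

print(f"N = {drake(**example):.3g}")
```

Because every factor multiplies, N is only as well known as the least well known term, which is where the biological factors come in.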

The past few years have demonstrated that planets in the galaxy are likely to be plentiful and although the technology to locate earth-like planets does not yet exist, my guess is that they will also be plentiful. So does that mean that it is just a matter of time before we find ET? I’m going to come on record here and say no. My guess is that life is rare and intelligent life may be so rare that there could only be one civilization at a time in any given galaxy.

While we are now filling in the numbers for the leftmost factors in Drake’s equation (the astronomical ones), we have absolutely no idea about the rightmost factors (the biological ones). However, I have good reason to believe that their product is astronomically small and that reason is statistical independence. Although Drake factored the probability of intelligent life into the probability of life forming times the probability it goes on to develop extra-planetary communication capability, there are actually a lot of factors in between. One striking example is the probability of the formation of multi-cellular life. In earth’s history, for the better part of three and a half billion years we had mostly single cellular life and maybe a smattering of multicellular experiments. Then suddenly about half a billion years ago, we had the Cambrian Explosion where multicellular animal life from which we are descended suddenly came onto the scene. This implies that forming multicellular life is extremely difficult and it is easy to envision an earth where it never formed at all.

We can continue. If it weren’t for an asteroid impact, the dinosaurs may never have gone extinct and mammals may not have developed. Even more recently, there seem to have been many species of advanced primates yet only one invented radios. Agriculture only developed ten thousand years ago, which meant that modern humans took about a hundred thousand years to discover it and only in one place. I think it is equally plausible that humans could have gone extinct like all of our other Australopithecus and Homo cousins. Life in the sea has existed much longer than life on land and there is no technologically advanced sea creature although I do think octopuses, dolphins and whales are intelligent.

We have around 100 billion stars in the galaxy and let’s just say that each has a habitable planet. Well, if the probability of each stage of life is one in a billion and if we need say three stages to attain technology then the probability of finding ET is one in 10^{16}. I would say that this is an optimistic estimate. Probabilities get small really quickly when you multiply them together. The probability of single cellular life will be much higher. It is possible that there could be a hundred planets in our galaxy that have life but the chance that one of those is within a hundred light years will again be very low. However, I do think it is a worthwhile exercise to look for extraterrestrial life, especially for oxygen or other life-associated gases in the atmosphere of exoplanets. It could tell us a lot about biology on earth.
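Here is the back-of-the-envelope multiplication written out as a short Python sketch. I use exact fractions so that no rounding creeps in; the function name is made up, and the inputs are just the round numbers from the paragraph above, assuming the stages are statistically independent.

```python
from fractions import Fraction


def expected_civilizations(n_planets, p_stage, n_stages):
    """Expected number of planets that pass every stage, assuming each
    stage succeeds independently with the same probability p_stage."""
    return n_planets * p_stage**n_stages


n_planets = 10**11               # one habitable planet per star in the galaxy
p_stage = Fraction(1, 10**9)     # one in a billion per stage
expected = expected_civilizations(n_planets, p_stage, n_stages=3)

print(f"expected number in the galaxy: 1 in {expected.denominator:.0e}")
```

Each extra independent stage knocks off another nine orders of magnitude, which is why the number of stages matters far more than the number of stars.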

2015-10-1: I corrected a factor of 10 error in some of the numbers.

The world of Gary Taubes

Science writer Gary Taubes has a recent New York Times commentary criticizing Kevin Hall’s recent paper on the differential metabolic effects of low fat vs low carbohydrate diets. See here for my recent post on the experiment. Taubes is probably best known for his views on nutrition and as an advocate for low carb diets although he has two earlier books on the sociology of physics. The main premise running through his four books is that science is susceptible to capture by the vanity, ambition, arrogance, and plain stupidity of scientists. He is pro-science but anti-scientist.

His first book on nutrition – Good Calories, Bad Calories, was about how the medical establishment and in particular nutritionists have provided wrong and potentially dangerous advice on diets for decades. He takes direct aim at Ancel Keys as one of the main culprits for pushing the reduction of dietary fat to prevent heart disease. The book is a great read and clearly demonstrates Taubes’s sharp mind and gifts as a storyteller. In the course of researching the book, Taubes also discovered the biological mechanisms of insulin and this is what has mostly shaped his thinking about carbohydrates and obesity. He spells it out in more detail in his subsequent book – Why We Get Fat. I think that these two books are a perfect demonstration of why having a little knowledge and a high IQ can be a dangerous thing.

Most people know of insulin as the hormone that goes awry in diabetes. When we fast, our insulin levels are low and our body, except for our brain, burns fat. If we then ingest carbohydrates, our insulin levels rise, which induces our body to utilize glucose (the main source of fuel in carbs) in preference to fat. Exercise will also cause a switch in fuel choice from fat to glucose. What is less well known is that insulin also suppresses the release of fat from fat cells (adipocytes), which is something I have modeled (see here). This seems to have been a revelation to Taubes – Clearly, if you eat lots of carbs, you will have lots of insulin, which will sequester fat in fat cells. Ergo, eating carbs makes you fat! Nutritionists were so focused on their poorly designed studies that they missed the blatantly obvious. This is just another example of how arrogant scientists get things wrong.

Taubes then proposed a simple experiment – take two groups of people and put one group on a high carb diet and the other on a low carb diet with the same caloric content, and see who loses weight. Well, Kevin Hall anticipated this request with basically the same experiment although for a different purpose. What Kevin noticed in his model was that if you cut carbs and keep everything else the same, insulin goes down and the body responds by burning much more fat. However, if you cut fat, there is nothing in the model that told the body that the fat was missing. Insulin didn’t change and thus the body just burned the same amount of carbs as before. He found this puzzling. Surely there must be a fat detector that we don’t know about so he went about to test it. I remember he and his fellows labouring diligently for what seemed like years writing the protocol and getting the necessary approval and resources to do the experiment. The result was exactly as the model predicted. We really don’t have a fat sensor. However, the subjects lost more fat on the low fat diet than they did on the low carb diet. This is not exactly the experiment Taubes wanted to do, which was to change the macronutrient composition but keep the calories the same. He then hypothesized that those on the low carb diet would lose weight and those on the low fat, high carb diet would gain weight. Kevin and a consortium of top obesity researchers have since done that experiment and the results will come out shortly.

Now is this surprising? Well not really, for while Taubes is absolutely correct in that insulin suppresses fat utilization, the net outcome of insulin reduction is a quantitative and not a qualitative question. You cannot deduce the outcome with formal logic. The reason is that insulin cannot be elevated all the time. Even a continuous grazer must sleep at some point, whereupon insulin falls. You then must consider the net effect of high and low insulin over a day or longer to assess the outcome. This can only be determined empirically and this is what Taubes fails to see or accept. He also commits a logical fallacy – Just because a scientist is stupid doesn’t mean he is wrong.

Taubes’s recent commentary criticizes Kevin’s experiment by saying that it 1) uses a diet that is impossible to follow and 2) ignores appetite. The response to the first point is that the experiment was meant to test a metabolic hypothesis and was not meant to test the effect of a diet. My response to his second point is to stare agape. When Taubes visited NIH a few years ago after his Good Calories, Bad Calories book came out, I offered the hypothesis that low carb diets could suppress appetite and this could be why they may be effective in reducing weight. However, he had no interest in this idea and Kevin has told me that he has repeatedly shown no interest in it. (I don’t need to give details on how people have been interested in appetite for decades since it is well done in this post.) I came to the conclusion that appetite control was the primary driver of the obesity epidemic shortly after arriving at NIH. In fact my first BSC presentation was on this topic. The recommendation by the committee was that I should do something else and that NIH was a bad fit for me. However, I am still here and I still believe appetite control is the key.

Hopfield on the difference between physics and biology

Here is a short essay by theoretical physicist John Hopfield of the Hopfield net and kinetic proofreading fame among many other things (hat tip to Steve Hsu). I think much of the hostility of biologists towards physicists and mathematicians that Hopfield talks about has dissipated over the past 40 years, especially amongst the younger set. In fact these days, a good share of Cell, Science, and Nature papers have some computational or mathematical component. However, the trend is towards brute force big data type analysis rather than the simple elegant conceptual advances that Hopfield was famous for. In the essay, Hopfield gives several anecdotes and summarizes them with pithy words of advice. The one that everyone should really heed and one I try to always follow is “Do your best to make falsifiable predictions. They are the distinction between physics and ‘Just So Stories.’”

Jobs at the Allen Institute

The Allen Institute for Brain Science is currently recruiting. The positions are listed on this link but in particular see below:


The Modeling, Analysis and Theory team at the Allen Institute is seeking a candidate with strong mathematical and computational skills who will work closely with both the team as well as experimentalists in order to both maximize the potential of datasets as well as realize that potential via analysis and theory. The successful candidate will be expected to develop analysis for populations of neurons as well as establish theoretical results on cortical computation, object recognition, and related areas in order to aid the Institute in understanding the most complex piece of matter in the universe.


Michael Buice, Scientist II


Why science is hard to believe

Here is an excerpt from a well written opinion piece by Washington Post columnist Joel Achenbach:

Washington Post: We live in an age when all manner of scientific knowledge — from the safety of fluoride and vaccines to the reality of climate change — faces organized and often furious opposition. Empowered by their own sources of information and their own interpretations of research, doubters have declared war on the consensus of experts. There are so many of these controversies these days, you’d think a diabolical agency had put something in the water to make people argumentative.

Science doubt has become a pop-culture meme. In the recent movie “Interstellar,” set in a futuristic, downtrodden America where NASA has been forced into hiding, school textbooks say the Apollo moon landings were faked.

I recommend reading the whole piece.

Open source software for math and science

Here is a list of open source software that you may find useful.  Some, I use almost every day, some I have not yet used, and some may be so ubiquitous that you have even forgotten that it is software.

1. XPP/XPPAUT. Bard Ermentrout wrote XPP in the 1980’s as a dynamical systems tool for himself. It’s now the de facto tool for the Snowbird community.  I still find it to be the easiest and fastest way to simulate and visualize differential equations.  It includes the equally excellent bifurcation continuation software tool AUTO originally written by Eusebius Doedel with contributions from a who’s who list of mathematicians.  XPP is also available as an iPad and iPhone App.

2. Julia. I only learned about Julia this spring and now I use it for basically anything I used to use Matlab for. Its syntax is very similar to Matlab and it’s very fast. I think it is quickly gaining a large following and may be as comprehensive as Python some day.

3. Python often seems more like a way of life than a software tool. I would probably be using Python if it were not for Julia and the fact that Julia is faster. Python has packages for everything. There is SciPy and NumPy for scientific computing, Pandas for statistics, Matplotlib for making graphs, and many more that I don’t yet know about.  I must confess that I still don’t know my way around Python but my fellows all use it.

4. R. For statistics, look no further than R, which is what academic statisticians use. It’s big in Big Data. So big that I heard that Microsoft is planning to write a wrapper for it. I also heard that billionaire mathematician James Simons’s hedge fund Renaissance Technologies uses it. For Bayesian inference there is now Stan, which implements Hamiltonian Monte Carlo. We tried using it for one of our projects and had trouble getting it to work but it’s improving very fast.

5. AMS-LaTeX. The great computer scientist Donald Knuth wrote the typesetting language TeX in 1978 and he changed scientific publication forever. If you have ever had to struggle putting equations into MS Word, you’ll realize what a genius Knuth is. Still, TeX was somewhat technical and thus LaTeX was invented as a simplified interface for TeX with built-in environments that are commonly used. AMS-LaTeX is a form of LaTeX that includes commands for any mathematical symbol you’ll ever need. It also has very nice equation and matrix alignment tools.

6. Maxima. Before Mathematica and Maple there was Macsyma. It was a symbolic mathematics system developed over many years at MIT starting in the 60’s. It was written in the programming language Lisp (another great open source tool but I have never used it) and was licensed by MIT to a company called Symbolics that made dedicated Lisp machines that ran Macsyma. My thesis advisor at MIT bought one of these machines (I think it cost him something like 20 thousand dollars, which was a lot of money back then) and I used it for my thesis. I really loved Macsyma and got quite adept at it. However, as you can imagine the Symbolics business plan really didn’t pan out and Macsyma kind of languished after the company failed. However, after many trials and tribulations, Macsyma was reborn as the open source software tool Maxima and it’s great. I’ve been running wxMaxima and it can do everything that I ever needed Mathematica for with the bonus that I don’t have to find and re-enter my license number every few months.

7. OpenOffice. I find it reprehensible that scientific journals force me to submit my papers in Microsoft Word. But MS Office is a monopoly and all my collaborators use it.  Data always comes to me in Excel and talks are in PowerPoint. For my talks, I use Apple Keynote, which is not open source. However, Apple likes to completely overhaul their software so my old talks are not even compatible with the most recent version. I also dislike the current version. The reason I went to Keynote is because I could embed PDFs of equations made in LaTeXiT (donation ware). However, the new version makes this less convenient. PDFs looked terrible in PowerPoint a decade ago. I have no idea if this has changed or not.  I have flirted with using OpenOffice for many years but it was never quite 100% compatible with MS Office so I could never fully dispense with Word.  However, in my push to open source, I may just write my next talk in OpenOffice.

8. Plink. The standard GWAS analysis tool is Plink, originally written by Shaun Purcell. It’s nice but kind of slow for some computations and was not being actively updated. It also couldn’t do some of the calculations we wanted. So in steps my collaborator Chris Chang, who took it upon himself to write a software tool that could do all the calculations we needed. His code was so fast and good that we started to ask him to add more and more to it. Eventually, it did almost everything that Plink and gcta (a tool for estimating heritability) could do and thus he asked Purcell if he could just call it Plink. It’s currently called Plink 1.9.

9. C/C++. We tend to forget that computer languages like C, Java, Javascript, Ruby, etc. are all open source software tools.

10. Inkscape is a very nice drawing program, an open source Adobe Illustrator if you will.

11. GNU Project. Computer scientist Richard Stallman kind of invented the concept of open software. He started the Free Software Foundation and the GNU Project, which includes GNU/Linux, the editor Emacs, and gnuplot among many other things.

Probably the software tools you use most that are currently free (but may not be forever) are the browser and email. People forget how much these two ubiquitous things have completely changed our lives.  When was the last time you went to the library or wrote a letter in ink?

NIH Stadtman Investigator

The US National Institutes of Health is divided into an Extramural Program (EP), where scientists in universities and research labs apply for grants, and an Intramural Program (IP), where investigators such as myself are provided with a budget to do research without having to write grants. Intramural Investigators are reviewed fairly rigorously every four years, which affects budgets for the next four years, but this is less stressful than trying to run a lab on NIH grants. This funding model difference is particularly salient in the face of budget cuts because for the IP a 10% cut is a 10% cut whereas for the EP, it means that 10% fewer grants are funded. When a lab cannot renew a grant, people lose their jobs. This problem is further exacerbated by medical schools loading up with “soft money” positions, where researchers must pay their own salaries from grants. The institutions also extract fairly large indirect costs from these grants, so in essence, the investigators write grants to both pay their salaries and fill university coffers. I often nervously joke that since the IP is about 10% of the NIH budget, an easy way to implement a 10% budget cut is to eliminate the IP.

However, I think there is value in having something like the IP where people have the financial security to take some risks. It is the closest thing we have these days to the old Bell Labs, where the transistor, information theory, C, and Unix were invented. The IP has produced 18 Nobel Prizes and can be credited with breaking the genetic code (Marshall Nirenberg), the discovery of fluoride to prevent tooth decay, lithium for bipolar disorder, and vaccines against multiple diseases (see here for a list of past accomplishments). What the IP needs to ensure its survival is a more rigorous and transparent procedure for entry into the IP where the EP participates. An IP position should be treated like a lifetime grant to which anyone at any stage in their career can apply. Not everyone may want to be here. Research groups are generally smaller and there are lots of rules and regulations to deal with, particularly for travel. But if someone just wants to close their door and do high risk high reward research, this is a pretty good place to be and they should get a shot at it.

The Stadtman Tenure-track Investigator program is a partial implementation of this idea. For the past five years, the IP has conducted institute-wide searches to identify young talent in a broad set of fields. I am co-chair of the Computational Biology search this year. We have invited five candidates to come to a “Stadtman Symposium”, which will be held tomorrow at NIH. Details are here along with all the symposia. Candidates that strike the interest of individual scientific directors of the various institutes will be invited back for a more traditional interview. Most of the hires at NIH over the past five years have been through the Stadtman process. I think this has been a good idea and has brought some truly exceptional people to the IP. What I would do to make it even more transparent is to open up the search to people at all stages in their career and to have EP people participate in the searches and eventual selection of the investigators.



It takes a team

Here is a letter (reposted with permission) from Michael Gottesman, Deputy Director for Intramural Research of the NIH, telling the story of how the NIH intramural research program was instrumental in helping Eric Betzig win this year’s Nobel Prize in Chemistry. I think it once again shows how great breakthroughs rarely occur in isolation.

Dear colleagues,

The NIH intramural program has placed its mark on another Nobel Prize. You likely heard last week that Eric Betzig of HHMI’s Janelia Farm Research Campus will share the 2014 Nobel Prize in Chemistry “for the development of super-resolved fluorescence microscopy.”  Eric’s key experiment came to life right here at the NIH, in the lab of Jennifer Lippincott-Schwartz.

In fact, Eric’s story is quite remarkable and highlights the key strengths of our intramural program: freedom to pursue high-risk research, opportunities to collaborate, and availability of funds to kick-start such a project.

Eric was “homeless” from a scientist’s viewpoint. He was unemployed and working out of a cottage in rural Michigan with no way of turning his theory into reality.  He had a brilliant idea to isolate individual fluorescent molecules by a unique optical feature to overcome the diffraction limit of light microscopes, which is about 0.2 microns. He thought that if green fluorescent proteins (GFPs) could be switched on and off a few molecules at a time, it might be possible using Gaussian fitting to synthesize a series of images based on point localization that, when stacked, provide extraordinary resolution.

Eric chanced to meet Jennifer, who heads the NICHD’s Section on Organelle Biology. She and George Patterson, then a postdoc in Jennifer’s lab and now a PI in NIBIB, had developed a photoactivatable version of GFP with these capabilities, which they were already applying to the study of organelles. Jennifer latched on to Eric’s idea immediately; she was among the first to understand its significance and saw that her laboratory had just the tool that Eric needed.

So, in mid-2005, Jennifer offered to host Eric and his friend and colleague, Harald Hess, to collaborate on building a super-resolution microscope based on the use of photoactivatable GFP. The two had constructed key elements of this microscope in Harald’s living room out of their personal funds.

Jennifer located a small space in her lab in Building 32. She and Juan Bonifacino, also in NICHD, then secured some centralized IATAP funds for microscope parts to supplement the resources that Eric and Harald brought to the lab.  Owen Rennert, then the NICHD scientific director, provided matching funds. By October 2005, Eric and Harald became affiliated with HHMI, which also contributed funds to the project.

Eric and Harald quickly got to work with their new NICHD colleagues in their adopted NIH home.  The end result was a fully operational microscope married to GFP technology capable of producing super-resolution images of intact cells for the first time. Called photoactivated localization microscopy (PALM), the new technique provided 10 times the resolution of conventional light microscopy.

Another postdoc in Jennifer’s lab, Rachid Sougrat, now at King Abdullah University of Science and Technology in Saudi Arabia, correlated the PALM images of cell organelles to electron micrographs to validate the new technique, yet another important contribution.

Upon hearing of Eric’s Nobel Prize, Jennifer told me: “We didn’t imagine at the time how quickly the point localization imaging would become such an amazing enabling technology; but it caught on like wildfire, expanding throughout many fields of biology.”

That it did! PALM and all its manifestations are at the heart of extraordinary discoveries.  We think this is a quintessential intramural story. We see the elements of high-risk/high-reward research and the importance of collaboration and the freedom to pursue ideas, as well as NIH scientists with the vision to encourage and support this research.

Read the landmark 2006 Science article by Eric, Harald, and the NICHD team, “Imaging Intracellular Fluorescent Proteins at Nanometer Resolution,” at

The story of the origins of Eric Betzig’s Nobel Prize in Jennifer Lippincott-Schwartz’s lab is one that needs to be told. I feel proud to work for an organization that can attract such talent and enable such remarkable science to happen.

Kudos to Eric and to Jennifer and her crew.

Michael M. Gottesman

Deputy Director for Intramural Research

Nobel Prize in Physiology or Medicine

The Nobel Prize for Physiology or Medicine was awarded this morning to John O’Keefe and to May-Britt Moser and Edvard Moser for the discovery of place cells and grid cells, respectively. O’Keefe discovered in 1971 that there were cells in the hippocampus that fired when a rat was in a certain location. He called these place cells and a whole generation of scientists, including the Mosers, have been studying them ever since then. In 2005, the Mosers discovered grid cells in the entorhinal cortex, which feed into the hippocampus. Grid cells fire whenever rats pass through periodically spaced locations in a given area such as a room, dividing the room into a triangular lattice. Different grid cells have different frequencies, phases and orientations.
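To make the triangular lattice concrete, here is a minimal sketch of an idealized grid cell in Python. It assumes a common textbook-style caricature in which the firing rate is the sum of three plane waves whose directions are 60 degrees apart; this is an illustration only and not the analysis used by the Mosers. The function names and the parameters spacing, orientation and phase are hypothetical, chosen to mirror the frequency, orientation and phase mentioned above.

```python
import math


def grid_cell_rate(x, y, spacing=0.5, orientation=0.0, phase=(0.0, 0.0)):
    """Idealized grid-cell firing rate in [0, 1] at position (x, y).

    Assumed toy model: three cosine plane waves with directions 60 degrees
    apart. Firing fields then sit on a triangular lattice with the given
    spacing, rotated by `orientation` and shifted by `phase`.
    """
    # Wave number that puts neighbouring firing fields `spacing` apart.
    k = 4.0 * math.pi / (math.sqrt(3.0) * spacing)
    dx, dy = x - phase[0], y - phase[1]
    total = 0.0
    for j in range(3):
        angle = orientation + j * math.pi / 3.0
        total += math.cos(k * (dx * math.cos(angle) + dy * math.sin(angle)))
    # The sum of the three cosines lies in [-1.5, 3]; rescale it to [0, 1].
    return (total + 1.5) / 4.5


def neighbouring_field(spacing=0.5, orientation=0.0, phase=(0.0, 0.0)):
    """Centre of one firing field adjacent to the field located at `phase`."""
    angle = orientation - math.pi / 6.0
    return (phase[0] + spacing * math.cos(angle),
            phase[1] + spacing * math.sin(angle))
```

Evaluating grid_cell_rate over a mesh of positions in a room gives peaks on a triangular lattice, and changing spacing, orientation or phase produces a different grid cell with the same kind of pattern.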

For humans, the hippocampus is an area of the brain known to be associated with memory formation. Much of what we know about the hippocampus in humans was learned by studying Henry Molaison, known as H.M. in the scientific literature, who had both of his hippocampi removed as a young man because of severe epileptic fits. H.M. could carry on a conversation but could not remember any of it if he was distracted. He had to be re-introduced to the medical staff that treated and observed him every day. H.M. showed us that memory comes in at least three forms. There is very short term or working memory, necessary to carry a conversation or remember a phone number long enough to dial it. Then there is long term explicit or declarative memory for which the hippocampus is essential. This is the memory of episodic events in your life and random learned facts about the world. People without a hippocampus, as depicted in the film Memento, cannot form explicit memories. Finally, there is implicit long term memory, such as how to ride a bicycle or use a pencil. This type of memory does not seem to require the hippocampus as evidenced by the fact that H.M. could become more skilled at certain games that he was taught to play daily even though he professed to never having played the game each time. The implication of the hippocampus for spatial location for humans is more recent. There was the famous study that showed London cab drivers had an enlarged hippocampus compared to controls and neural imaging has now shown something akin to place fields in humans.

While the three new laureates are all excellent scientists and deserving of the prize, this is still another example of how the Nobel prize singles out individuals at the expense of other important contributors. O’Keefe’s coauthor on the 1971 paper, Jonathan Dostrovsky, was not awarded. I’ve also been told that my former colleague at the University of Pittsburgh, Bill Skaggs, was the one who pointed out to the Mosers that the patterns in their data corresponded to grid cells. Bill was one of the most brilliant scientists I have known but did not secure tenure and is not directly involved in academic research anymore as far as I know. The academic system should find a way to maximize the skills of people like Bill and Douglas Prasher.

Finally, the hype surrounding the prize announcement is that the research could be important for treating Alzheimer’s disease, which is associated with a loss of episodic memory and navigational ability. However, if we use the premise that there must be a neural correlate of anything an animal can do, then place cells must necessarily exist given that rats have the ability to discern spatial location. What we did not know was where these cells are and O’Keefe showed us that it is in the hippocampus but we could have also associated the hippocampus with the memory loss of Alzheimer’s disease from H.M. The existence of grid cells is perhaps less obvious since it is not inherently obvious that we can naturally divide a room into a triangular lattice. It is plausible that grid cells do the computation giving rise to place cells but we still need to understand the computation that gives rise to grid cells. It is not obvious to me that grid cells are easier to compute than place cells.

Tim’s Vermeer

Jan Vermeer has been one of my favourite painters ever since I saw his famous “The Girl with the Pearl Earring” painting that was on display in Toronto in the 1980’s. I’ve been on a quest to see all of his paintings although it’s been on hiatus for the past ten years. Here is the list of what I’ve seen so far (I have five left). You only need to stand in front of a Vermeer for a few seconds to be mesmerized. I stood in front of “The Music Lesson” in Buckingham Palace for at least an hour. The guard started joking with me because I was so transfixed. This is why I’ve been intrigued by recent suggestions by artist David Hockney and others that some great old masters like Vermeer and van Eyck may have used optical aids like the camera obscura. Well, inventor Tim Jenison has taken this theory to another level by attempting to completely recreate Vermeer’s Music Lesson using a setup of mirrors and lenses that he (re)invented. The endeavor is documented in the film Tim’s Vermeer directed by Teller of Penn and Teller fame. Whether you believe the theory or not (I actually do and it doesn’t detract at all from my love of Vermeer), what this film does so well is to show what dedication, thought, patience, and careful execution can accomplish. I got tired just watching him paint the threads in a Persian rug using his optical tool.

What is the difference between math, science and philosophy?

I’ve been listening to the Philosophy Bites podcast recently. One from a few years ago consisted of answers from philosophers to the question posed on the spot and without time for deep reflection: What is Philosophy? Some managed to give precise answers, but many struggled. I think one source of conflict they faced as they answered was that they didn’t know how to separate the question of what philosophers actually do from what they should be doing. However, I think that a clear distinction between science, math and philosophy as methodologies can be specified precisely. I also think that this is important because practitioners in each subject should be aware of what methodology they are actually using and what is appropriate for whatever problem they are working on.

Here are my definitions: Math explores the consequences of rules or assumptions, science is the empirical study of measurable things, and philosophy examines things that cannot be resolved by mathematics or empiricism. With these definitions, practitioners of any discipline may use any of math, science, or philosophy to help answer whatever question they may be addressing. Scientists need mathematics to work out the consequences of their assumptions and philosophy to help delineate phenomena. Mathematicians need science and philosophy to provide assumptions or rules to analyze. Philosophers need mathematics to sort out arguments and science to test hypotheses experimentally.

Those skeptical of philosophy may suggest that anything that cannot be addressed by math or science has no practical value. However, with these definitions, even the most hardened mathematician or scientist may be practicing philosophy without even knowing it. Atheists like Richard Dawkins should realize that part of their position is based on philosophy and not science. The only truly logical position to take with respect to God is agnosticism. It may be probable that there is not a God that intervenes directly in our lives and that probability may be high but it is not a provable fact. To be an atheist is to put some cutoff on the posterior probability for the existence of God and that cutoff is based on philosophy not science.

While most scientists and mathematicians are cognizant that moral issues may be pertinent to their work (e.g. animal experimentation), they may be less cognizant of what I believe is an equally important philosophical issue, which is the ontological question. Ontology is a philosophical term for the study of what exists. To many pragmatically minded people, this may sound like an ethereal topic (or worse adjective) that has no place in the hard sciences. However, as I pointed out in an earlier post, we can put labels on at most a countably infinite number of things out of an uncountable number of possibilities and for most purposes, our ontological list of things is finite. We thus have to choose and although some of these choices are guided by how we as human agents interact with the world, others will be arbitrary. Determining ontology will involve aspects of philosophy, science and math.

Mathematicians face the ontological problem daily when they decide on what areas to work in and what theorems to prove. The possibilities in mathematics are infinite so it is almost certain that if we were to rerun history some if not many fields would not be reinvented. While scientists may have fewer degrees of freedom to choose from they are also making choices and these choices tend to be confined by history. The ontological problem shows up anytime we try to define a phenomenon. The classification of cognitive disorders is a pure exercise in ontology. Authors of the DSM IV have attempted to be as empirical and objective as possible but there is still plenty of philosophy in their designations of psychiatric conditions. While most string theorists accept that their discipline is mostly mathematical, they should also realize that it is very philosophical. A theory of everything includes the ontology by definition.

Subjects traditionally within the realm of philosophy also have mathematical and scientific aspects. Our morals and values have certainly been shaped by evolution and biological constraints. We should completely rethink our legal philosophy based on what we now know about neuroscience (e.g. see here). The same goes for any discussion of consciousness, the mind-body problem, and free will. To me the real problem with free will isn’t whether or not it exists but rather who or what exactly is exercising that free will and this can be looked at empirically.

So next time when you sit down to solve a problem, think about whether it is one of mathematics, science or philosophy.

What counts as science?

Ever since the financial crisis of 2008 there has been some discussion about whether or not economics is a science. Some, like Russ Roberts of Econtalk, channelling Friedrich Hayek, do not believe that economics is a science. They think it’s more like history where we come up with descriptive narratives that cannot be proven. I think that one thing that could clarify this debate is to separate the goal of a field from its practice. A field could be a science although its practice is not scientific.

To me what defines a science is whether or not it strives to ask questions that have unambiguous answers. In that sense, most of economics is a science. We may never know what caused the financial crisis of 2008 but that is still a scientific question. Now, it is quite plausible that the crisis of 2008 had no particular cause just like there is no particular cause for a winter storm. It could have been just the result of a collection of random events but knowing that would be extremely useful. In this sense, parts of history can also be considered to be a science. I do agree that the practice of economics and history is not always scientific and can never be as scientific as a field like physics because controlled experiments usually cannot be performed. We will likely never find the answer for what caused World War I but there certainly was a set of conditions and events that led to it.

There are parts of economics that are clearly not science such as what constitutes a fair system. Likewise in history, questions regarding who was the best president or military mind are certainly  not science. Like art and ethics these questions depend on value systems. I would also stress that a big part of science is figuring out what questions can be asked. If it is true that recessions are random like winter storms then the question of when the next crisis will hit does not have an answer. There may be a short time window for some predictability but no chance of a long range forecast. However, we could possibly find some necessary conditions for recessions just like cold weather is necessary for a snow storm.

The cost of the shutdown and sequester

People may be wondering how the US government shutdown is affecting the NIH. I can’t speak for the rest of the institutes but I was instructed to not come to work and to not use my NIH email account or NIH resources. Two new fellows, who were supposed to begin on Oct 1, now have to wait and they will not be compensated for the missed time even if Congress does decide to give back pay to the furloughed employees. I really was hoping for them to start in August or September but that was pushed back because of the Sequester (have people forgotten about that?), which cut our budgets severely. In fact, because of the Sequester, I wasn’t able to hire one fellow because the salary requirements for their seniority exceeded my budget. We were just starting to get some really interesting psychophysics results on ambiguous stimuli but that had to be put on hold because we couldn’t immediately replace fellow Phyllis Thangaraj, who was running the experiments and left this summer to start her MD/PhD degree at Columbia. Now it will be delayed even further. I have several papers in the revision process that have also been delayed by the shutdown. All travel has been cancelled and I heard that people at conferences were ordered to return immediately, including those who were on planes on Oct 1. My quadrennial external review this week has now been postponed. All the flights for the committee and ad hoc members have to be cancelled and we now have to find another date that 20 or more people can agree on. All NIH seminars and the yearly NIH research festival have been cancelled. I was supposed to review an external NIH research proposal this week and that has been postponed indefinitely along with all other submitted proposals awaiting review. Academic labs, students and postdocs depending on their NIH grants this fiscal year will be without funding until the government is reopened. Personally, I will probably come out of this reasonably intact. However, I do worry how this will affect young people, who are the future.