# More on why most results are wrong

Science writer Jonah Lehrer has a nice article in the New Yorker on the “Decline Effect”, where the significance of many published results, mostly but not exclusively in clinical and psychological fields, tends to decline with time or disappear entirely.   The article cites the work of John Ioannidis, which I summarized here, on why most published results are false.  The article posits several explanations that basically boil down to selection bias and multiple comparisons.

I believe this problem stems from the fact that the reward structures in science are biased towards positive results and the more surprising and hence unlikely the result, the greater the impact and attention.   Generally, a result is deemed statistically significant if the probability of that result arising by random chance (technically if the null hypothesis were true) is less than 5%.  By this fact alone, 5% of published results are basically noise.  However, the actual amount is much higher because of selection bias and multiple comparisons.

In order to estimate significance, a sample set must be defined, and it is quite easy for conscious and unconscious selection bias to arise when deciding which data to include. For example, in clinical trials, some subjects or data points will be excluded for various reasons and this could bias the results towards a positive result. Multiple comparisons may be even harder to avoid. For every published result of an investigator, there are countless “failed” results. For example, suppose you want to show pesticides cause cancer and you test different pesticides until one shows an effect. Most likely you will assess significance only for that particular pesticide, not for all the others that didn’t show an effect. In some sense, to be truly fair, one should include all experiments that one has ever conducted when assessing significance, with the odd implication that the criterion for significance will become more stringent as you age.

This is a tricky thing.  Suppose you do a study and measure ten things but you blind yourself to the results.  You then pick one of the items and test for significance.  In this case, have you done ten measurements or one?  The criterion for significance will be much more stringent if  it is the former. However, there are probably a hundred other things you could have measured but didn’t.  Should significance be tested against a null hypothesis that includes all potential events including those you didn’t measure?
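To put rough numbers on the multiple-comparisons problem, here is a minimal Python sketch. It assumes the tests are independent and that every null hypothesis is true, and it uses the standard Bonferroni correction as one possible way of making the criterion more stringent. The function names are hypothetical and chosen only for illustration.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # One measurement versus ten versus a hundred, all at the usual 5% level.
    for n in (1, 10, 100):
        print(n, round(familywise_error_rate(0.05, n), 3), bonferroni_threshold(0.05, n))
```

Under these assumptions, ten measurements at the 5% level give roughly a 40% chance of at least one spurious “significant” result, which is why it matters whether you count ten measurements or one.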

I think the only way this problem will be solved is with an overhaul of the way science is practiced.  First of all, negative results must be taken as seriously as positive ones.  In some sense, the results of all experiments need to be published or at least made public in some database.  Second, the concept of statistical significance needs to be abolished.   There cannot be some artificial dividing line between significance and nonsignificance.  Adopting a Bayesian approach will help.  People will just report the probability that the result is true given some prior and likelihood.  In fact, P values used for assessing significance could be easily converted to Bayesian probabilities.   However, I doubt very much these proposals will be adopted any time soon.
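As a rough illustration of reporting a probability instead of a significance verdict, the sketch below applies Bayes’ rule to a result that has crossed a significance threshold. It is not a general conversion of P values: it assumes a prior probability that the hypothesis is true and a statistical power for the study, both of which are illustrative inputs that do not come from this post, and the function name is hypothetical.

```python
def prob_true_given_significant(prior: float, power: float, alpha: float = 0.05) -> float:
    """Posterior probability that an effect is real, given that the test
    came out significant at level alpha.

    prior: assumed probability the hypothesis is true before the experiment
    power: assumed probability of a significant result when the effect is real
    alpha: probability of a significant result when the null hypothesis is true
    """
    true_positive = power * prior
    false_positive = alpha * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # A surprising hypothesis (assumed prior 0.1) tested with assumed power 0.8.
    print(prob_true_given_significant(prior=0.1, power=0.8))
```

With these assumed numbers the posterior is 0.64, so a surprising result that clears the 5% threshold would still be wrong more than a third of the time.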

# The power of small

Here’s a nice story for Christmas in the New York Times. African villagers can now get electrical power using cheap solar cell units. It can make a significant difference for their lives.

# Arsenic and old life

The big science news last week was the announcement and publication in Science of the discovery of a strain of bacteria that lives on arsenic instead of phosphorus. Arsenic, which appears below phosphorus in the periodic table, is toxic to most life forms mostly because it is chemically similar to phosphorus. It had thus been postulated that there could be life forms that utilize arsenic instead of phosphorus. In fact, astrophysicist Paul Davies had long suggested that a proof of principle of the possibility of alien life could be obtained by finding an alternative form of life on earth. The new bacterium comes from Mono Lake in California, which is very rich in arsenic. The authors put some samples from the lake into a medium rich in arsenic but devoid of phosphorus to see what would grow and found a strain that grew robustly. They then found that arsenic was actually incorporated into the proteins and DNA within the cells. In a post from five years ago, I speculated that we might find some new organism living on toxic waste some day, although this cell is probably of ancient origin. However, there have been strong criticisms of the paper since the announcement. For example see here. Hence, the jury may still be out on arsenic-loving microbes.

# Talk at NYU

I was in New York yesterday and gave a talk at NYU in a joint Center for Neural Science and Courant Institute seminar. My slides are here. The talk is an updated version of the talk I gave before and summarized here.  The new parts include recent work on applying the model to Autism (see here) and some new work on resolving why mutual inhibition models of binocular rivalry do not reproduce Levelt’s fourth proposition, which states that as the contrast is decreased to both eyes, the dominance time of the percepts increases.  I will summarize the results of that work in detail when we finish the paper.

# The population conundrum

The world’s population is nearing 7 billion and will perhaps hit 9 billion by 2050. In a previous post, I estimated that the earth could feed up to 15 billion based on the amount of arable land and current farm yields. Ever since Thomas Malthus, people have predicted that we would eventually reach saturation, resulting in massive famine and global unrest. However, technology keeps coming along to make farming more and more efficient, pushing off the Malthusian crisis into the future. The green revolution, led by Norman Borlaug, saved over a billion people from starvation in the mid-twentieth century. In fact, food production is so efficient now that it has led to an obesity epidemic in the developed world (e.g. see here).

The question then is what a sustainable population is. Currently, food production requires a lot of fossil fuels, which come with their own set of issues. Scientific American recently published an article arguing that phosphorus, one of the three components of fertilizer along with nitrogen and potassium, may run out by the end of this century. Obviously, our current means of food production is not sustainable indefinitely. I think our situation is like the person walking across a railroad bridge with a train bearing down on her. She can either run towards the train or away from it to get off the bridge. Depending on the speed of the train and how fast she can run, there is a critical point on the bridge that determines which direction is optimal. For us, going backwards away from the train is to try to reduce population growth and try to find a sustainable level. Running towards the train is to rely on technological progress to increase food production. Given that good ideas seem to grow linearly with the population and possibly slower (e.g. see here), going towards the train actually means we should keep growing as fast as we can and hope that another Norman Borlaug comes along. Where we are on that bridge is anybody’s guess.
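The bridge analogy can be made concrete with a small sketch. It assumes a straight bridge, constant speeds, and a train approaching from one end; all names and numbers are hypothetical. The safety margin at each end is the time the train arrives there minus the time the runner gets there, and setting the two margins equal shows that the critical point does not depend on how far away the train is.

```python
def critical_point(bridge_length: float, runner_speed: float, train_speed: float) -> float:
    """Distance from the end nearest the train at which both directions
    leave the same safety margin. Closer than this, run toward the train."""
    return 0.5 * bridge_length * (1.0 - runner_speed / train_speed)


def margins(position: float, bridge_length: float, runner_speed: float,
            train_speed: float, train_distance: float) -> tuple:
    """Safety margins (toward, away) in time units; negative means the train wins.

    position is measured from the end nearest the train, and train_distance
    is how far the train is from that same end.
    """
    toward = train_distance / train_speed - position / runner_speed
    away = ((train_distance + bridge_length) / train_speed
            - (bridge_length - position) / runner_speed)
    return toward, away


if __name__ == "__main__":
    # Hypothetical numbers: 100 m bridge, runner at 5 m/s, train at 20 m/s.
    print(critical_point(100.0, 5.0, 20.0))
    print(margins(10.0, 100.0, 5.0, 20.0, 200.0))
```

In this toy version the critical point sits short of the middle of the bridge, so the faster the train is relative to the runner, the closer the break-even point gets to the halfway mark.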

# The cost of commuting

It is about 45 miles (70 km) from Baltimore to the NIH campus in Bethesda, MD. If I were to travel the entire distance using public transit it would cost over 20 dollars for a return trip (one-way bus fare in Baltimore is $1.60, commuter rail (MARC train) fare is $7.00, and Metro fare in DC is $3.65 ($3.85 during peak hours)). That amounts to over $100 per week and $5000 per year. If I bought a monthly rail pass, then I could cut the cost down by 75% or so. Now if instead I were to drive every day, ninety miles per day is equivalent to 22,500 miles per year. A car that could travel 30 miles per gallon of gasoline would use 750 gallons a year. At the current price of $3 per gallon, this would be $2250 per year. If I drive my car for ten years and it cost twenty thousand dollars then that is an additional $2000 per year. Insurance, fees, maintenance and repairs probably cost another $2000 per year, so driving would cost about $6000 per year. If I drove a cheaper and more efficient car then I could bring this cost down to $5000 per year. Thus, driving is economically competitive with public transit. Add in the fact that I would own a car anyway even if I didn’t use it to commute to work and driving is the less expensive choice.
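The arithmetic above can be reproduced with a short sketch. The fares, mileage, and prices are the ones quoted in this post; the figure of 250 commuting days per year is an assumption inferred from 90 miles per day giving 22,500 miles per year, and the function names are hypothetical.

```python
WORK_DAYS_PER_YEAR = 250  # assumption: 22,500 miles per year / 90 miles per day


def transit_cost_per_year(bus: float = 1.60, rail: float = 7.00, metro: float = 3.65,
                          days: int = WORK_DAYS_PER_YEAR) -> float:
    """Full-fare return trips every working day, with no monthly pass."""
    return 2 * (bus + rail + metro) * days


def driving_cost_per_year(miles_per_day: float = 90, mpg: float = 30,
                          gas_price: float = 3.00, car_price: float = 20000,
                          car_years: float = 10, other_costs: float = 2000,
                          days: int = WORK_DAYS_PER_YEAR) -> float:
    """Fuel plus straight-line depreciation plus insurance, fees and repairs."""
    fuel = miles_per_day * days / mpg * gas_price
    depreciation = car_price / car_years
    return fuel + depreciation + other_costs


if __name__ == "__main__":
    print(round(transit_cost_per_year()), round(driving_cost_per_year()))
```

With the quoted numbers this gives about $6125 per year for full-fare transit and $6250 per year for driving, consistent with the rounded figures in the text. Parking and subsidies are deliberately left out of this sketch.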

How is this possible? Well, one cost that I didn’t account for is parking. The NIH happens to have a large campus where parking is nominally free. However, if I chose not to drive, I could receive a public transit subsidy of up to $110 per month or $1320 per year. If the NIH were located in downtown Washington DC, parking could cost over $400 per month or $5000 per year. So the real reason driving is competitive with public transit is that parking is subsidized. If I worked in an urban center where parking is expensive then driving would be much more expensive than public transit. Driving is further subsidized because roads and highways are funded by tax dollars while the cost of maintaining transit stations and tracks is only partially funded by taxes. If transportation infrastructure were fully publicly funded or if subsidies for roads and parking did not exist then public transit would be by far the more cost-effective option.

# Bayesian parameter estimation

This is the third post on Bayesian inference.  The other two are here and here. This probably should be the first one to read if you are completely unfamiliar with the topic.  Suppose you are trying to model some system and you have a model that you want to match some data.  The model could be a function, a set of differential equations, or anything with parameters that can be adjusted.  To make this concrete, consider this classic differential equation model of the response of glucose to insulin in the blood:

$\dot G = -S_I X G - S_G(G-G_b)$

$\dot X = c_X[I(t) - X -I_b]$

where $G$ is glucose concentration, $I$ is insulin, $X$ is the action of insulin in some remote compartment, and there are five free parameters $S_I, S_G, G_b, c_X, I_b$.  If you include the initial values of $G$ and $X$ there are seven free parameters.  The data consist of measurements of glucose and insulin at discrete time points $\{t_1,t_2,\dots\}$.  The goal is to find a set of free parameters so that the model fits the data at those time points.
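As a minimal sketch of what fitting the model to the data involves, the code below integrates the two equations with a simple forward-Euler step and computes a sum of squared differences between the model and glucose measurements. The Euler scheme, the squared-error misfit, the function names, and any parameter values passed in are assumptions made for illustration and are not taken from this post; a real fit would likely use a better integrator.

```python
from typing import Callable, List, Sequence


def simulate_glucose(params: Sequence[float], g0: float, x0: float,
                     insulin: Callable[[float], float],
                     times: Sequence[float], dt: float = 0.01) -> List[float]:
    """Forward-Euler integration of the glucose model.

    params = (S_I, S_G, G_b, c_X, I_b); insulin(t) supplies I(t).
    times must be increasing and non-negative. Returns G at each time.
    """
    s_i, s_g, g_b, c_x, i_b = params
    g, x, t = g0, x0, 0.0
    out = []
    for t_obs in times:
        while t < t_obs - 1e-12:
            h = min(dt, t_obs - t)  # shorten the last step to land on t_obs
            dg = -s_i * x * g - s_g * (g - g_b)
            dx = c_x * (insulin(t) - x - i_b)
            g, x, t = g + h * dg, x + h * dx, t + h
        out.append(g)
    return out


def sum_squared_error(params: Sequence[float], g0: float, x0: float,
                      insulin: Callable[[float], float],
                      times: Sequence[float],
                      glucose_data: Sequence[float]) -> float:
    """Misfit between model and data; smaller means a better fit."""
    model = simulate_glucose(params, g0, x0, insulin, times)
    return sum((m - d) ** 2 for m, d in zip(model, glucose_data))
```

Here `insulin` would typically interpolate the measured insulin values, and the seven free parameters are the five entries of `params` together with `g0` and `x0`.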

# High energy physics

I was asked a while ago what I thought of the Large Hadron Collider at CERN. Although I’ve been critical of high energy physics in the past (see for example here), I strongly support the LHC and think it is a worthwhile endeavor. My reason is that I think it will be important for future technology. By this I don’t just mean spin-offs, like the World Wide Web, which was invented at CERN by Tim Berners-Lee. What I mean is that knowledge gained at the high energy scale could be useful for saving the human race one day.

Let me elaborate. My criticism of high energy or particle physics in the past was mostly because of the claim that it was more “fundamental” than other areas of science like condensed matter physics or psychology. Following Nobel laureate Philip Anderson’s famous article “More is Different” (Science 177:393-396, 1972), what is fundamental to me is a matter of perspective. For example, the fact that I can’t find a parking spot at the mall a week before Christmas is not because of particle physics but because of the pigeonhole principle (i.e. if you have more things than boxes, then if you try to put the things into the boxes at least one box must contain more than one thing). This is as fundamental to me as any high energy theory. The fact that you can predict an election using polling data from a small sample of the electorate is because of the central limit theorem (i.e. the sum of a bunch of random events tends to obey a normal distribution), and is also independent of which particles comprise the electorate. Ironically, the main raison d’etre of the LHC is to look for the Higgs boson, which is thought to give masses to some subatomic particles. The Higgs mechanism is based on the idea of spontaneous symmetry breaking, which came from none other than Phil Anderson, who was studying properties of magnets.

So how could high energy physics be pertinent to our existence some day?  Well, some day in the very distant future the sun will expand into a red giant and swallow the earth.  If humans, or whatever our descendants will be called, are to survive they are going to need to move.  This will take space faring technology that could rely on some yet unknown principle of high energy physics that could be discovered by the LHC.  And in the very, very distant future the universe will end either in a big crunch or by expanding so much that matter won’t be able to persist.  If and when that time comes and life forms still exist, then to survive they’ll have to figure out how to “tunnel” into a new universe or new existence.  This will take real science fiction-like stuff that will likely depend on knowledge of high energy physics.  So although high energy physics does not hold a monopoly on fundamental concepts, it may still be absolutely necessary for life saving future technology.

# Biomass

Since the rise of human civilization, life forms larger than 10 centimeters to a meter have been systematically culled or eliminated from the ecosystem. Almost all land megafauna that used to roam wild a few thousand or even a few hundred years ago are either extinct or reside in small numbers in protected parks and reserves. Macroscopic sea creatures that were reasonably plentiful just two or three decades ago may all disappear shortly. In the meantime, the populations of humans and domesticated plants and animals have exploded.

So, has there been a net gain or loss of total biomass? I think the conventional wisdom would be that we have replaced large tracts of forest with pavement, lawns and farmland, which would seem like a huge net loss of biomass. However, we have added extra nutrients (i.e. fertilizer) and carbon (i.e. fossil fuels) into the system. The energy flux from the sun has also not changed significantly in the last millennium. Hence, the capacity to support life has probably not changed or maybe has even increased. Removing all of the large wild animals may also create more opportunities for small animals. Perhaps there are more small and microscopic creatures than there would have been had humans not existed. I have no idea what the answer is.

# Paulos in the Times

Mathematician John Allen Paulos, author of Innumeracy and other popular books on math, has a beautifully written column in the New York Times.  He articulates a dichotomy, which most people probably have never thought of,  between stories and statistics.  Here is a small excerpt from the article:

Despite the naturalness of these notions, however, there is a tension between stories and statistics, and one under-appreciated contrast between them is simply the mindset with which we approach them. In listening to stories we tend to suspend disbelief in order to be entertained, whereas in evaluating statistics we generally have an opposite inclination to suspend belief in order not to be beguiled. A drily named distinction from formal statistics is relevant: we’re said to commit a Type I error when we observe something that is not really there and a Type II error when we fail to observe something that is there. There is no way to always avoid both types, and we have different error thresholds in different endeavors, but the type of error people feel more comfortable may be telling. It gives some indication of their intellectual personality type, on which side of the two cultures (or maybe two coutures) divide they’re most comfortable.

People who love to be entertained and beguiled or who particularly wish to avoid making a Type II error might be more apt to prefer stories to statistics. Those who don’t particularly like being entertained or beguiled or who fear the prospect of making a Type I error might be more apt to prefer statistics to stories. The distinction is not unrelated to that between those (61.389% of us) who view numbers in a story as providing rhetorical decoration and those who view them as providing clarifying information.

I highly recommend reading the whole article.

# The Genographic Project

National Geographic is conducting a research project (The Genographic Project) to analyze historical genome patterns by sampling DNA from people all over the world. The main aim is to sample from various indigenous peoples from around the world but the public can participate as well. The website for the project is here. There is a charge for a participation kit, which is used to defray the costs of the study. You can have your mitochondrial DNA, which follows your maternal lineage, or if you are a male, the Y-chromosome, which follows your paternal line, analyzed. I recently did the test for my Y-chromosome and I am a member of Haplogroup O with the M175 marker. My earliest male ancestor emerged roughly 50,000 years ago in Africa and is the common ancestor of every non-African male alive today. The man with the M175 marker emerged about 35,000 years ago during the ice age somewhere in Central or East Asia. There were probably something like 100,000 Homo sapiens alive at that time.

# On tunnels and civilizations

One of the recurring themes in my posts is that seemingly irrational or self-defeating behavior has an underlying logic.  There are always trade offs, so if one behavior or tendency allows us to succeed in some aspect then it may impede in some other.  Paul Krugman laments in today’s New York Times the demise of a second rail tunnel between New Jersey and New York.  He’s been voicing his frustrations about our inability to make the correct choices to move forward for the past few years.  Below is a post I wrote in 2005 that gives my argument for why civilizations have finite life spans.  I think it is still relevant today.

Scientific Clearing House Sep 22, 2005:

Every time I feel kind of optimistic about the future, I think back to the Roman Empire and realize that it could all end pretty quickly. It may be no accident that civilizations tend to have finite lives and our brains may be responsible. Jared Diamond (in his book Collapse) posits a framework for a society’s demise but he basically believes it is some combination of bad decision making and management that leads to failure. I’m proposing that it may actually be embedded in how our brains work and how it reacts to success. What allows us to build great civilizations may ultimately be responsible for our undoing.

As has been written in countless columns and blogs, manufacturing, software development, clerical work and so forth are being or will soon be outsourced to offshore locations where labour costs are so much lower. Many have argued that the US can retain world dominance by remaining a source of innovation and ideas. However, Thomas Friedman and others have been screaming lately that the US is losing its lead in technology and science and American students are falling behind the rest of the world in technical subjects.

The reason is not just that we’ve become lazy or stupid. The Flynn effect shows that average IQs have actually been rising every generation and in the recent book Everything Bad is Good For You: How Today’s Popular Culture Is Actually Making Us Smarter, Steven Johnson argues that video games and popular culture are actually making us smarter. So why is it that we are becoming less intellectual even though we are getting smarter?

I think it is related to the fact that it takes effort to concentrate on something. This effort is not because we’re using more energy. Although it may seem that thinking hard burns more calories, there is in fact little evidence for this. So if there is no metabolic cost then why is it so difficult to think? The reason may be that the brain is a novelty machine that constantly seeks new stimuli. Advertising and marketing people know that they need to change a scene every 10 or 15 seconds in a commercial or people’s attention will be lost. Our brains are designed to wander and seek new stimuli. This constant novelty seeking probably helps in the early stages of a civilization where things need to be built and everyone sees open opportunities for growth.

As a civilization matures, it takes longer and longer for the citizens to acquire and digest the accumulated knowledge required just to keep it running, much less advance it. Years of training are necessary before anyone can make a contribution. Given our current comfortable circumstances, there is little incentive to undertake such an ordeal when there are so many other distractions to occupy us. In the past, scholastic learning might have been the most cognitively stimulating thing one could engage in. Now, our lives are filled with leisure activities that are much more interesting and entertaining than what we learn in school. For every high school kid with his nose stuck in an analysis textbook, there are hundreds or thousands of other kids who are playing video games, surfing the web, reading a Harry Potter novel or solving a Sudoku puzzle.

Is there a way out? I’m pessimistic. While it is true that those on the cutting edge are doing very interesting and stimulating things, the journey to get there is so long and arduous that fewer and fewer are likely to take it. No matter how appealing you may make calculus or organic chemistry, they just will never be able to compete with the endless variety of distractions in modern society. There will still be an educated elite but there won’t be enough of them to keep the engine going.

The decline of the US could be very rapid. Even now, much of science and technology is being driven by foreigners. However, as the balance of power starts to shift overseas and the US remains xenophobic, that spigot could be shut off quickly. The incentive to come here will diminish and people may return to their native countries as things decline here accelerating the process.

It may be that the only hope for humanity is to maintain uneven economic development. If the entire world became comfortable simultaneously, it might completely collapse all at once. However, if the decline of the US is accompanied by the rise of China and India then at least some order in the world could be maintained. After a century or so, the US could rise again in a perpetual cycle of localized growth and decay.

# Scotch tape and flying frogs

This year’s Nobel prize in physics went to Andre Geim and Konstantin Novoselov for making single layer graphite or graphene using scotch tape. However, Geim is also famous for having won an Ig Nobel prize for demonstrating diamagnetic levitation using a frog. You can see a video of a flying frog and tomato here. Sir Michael Berry of Berry’s phase fame and Geim wrote a paper demonstrating that diamagnetic but not paramagnetic objects can be levitated stably in a solenoidal magnetic field. This was somewhat surprising because there is a theorem (Earnshaw’s theorem) that says you cannot stably suspend an object with fixed charges and magnets in any combination of static electric, magnetic, and gravitational fields. The reason diamagnetic levitation works is that Earnshaw’s theorem does not apply to induced magnetism.

# Path Integral Methods for SDEs

I’ve just uploaded a review paper to arXiv on the use of path integral and field theory methods for solving stochastic differential equations. The paper can be obtained here. Most books on field theory and path integrals are geared towards applications in particle physics or statistical mechanics. This paper shows how you can adapt these methods to solving everyday problems in applied mathematics and theoretical biology. The nice thing about these methods is that they provide an organized way to do perturbative expansions and explicitly compute quantities like moments. The paper was originally written for a special issue of the journal Methods that fell through. Our goal is to collate the papers intended for that issue into a book, which will include an expanded version of this paper.

# Tononi in the Times

The New York Times had a fun article on neuroscientist  Giulio Tononi last week.  Tononi is one of the most creative researchers in cognitive science right now.   Many of my views on consciousness, which I partly summarized here,  have been strongly influenced by his ideas.

Here is an excerpt from the article:

New York Times: Consciousness, Dr. Tononi says, is nothing more than integrated information. Information theorists measure the amount of information in a computer file or a cellphone call in bits, and Dr. Tononi argues that we could, in theory, measure consciousness in bits as well. When we are wide awake, our consciousness contains more bits than when we are asleep.

For the past decade, Dr. Tononi and his colleagues have been expanding traditional information theory in order to analyze integrated information. It is possible, they have shown, to calculate how much integrated information there is in a network. Dr. Tononi has dubbed this quantity phi, and he has studied it in simple networks made up of just a few interconnected parts. How the parts of a network are wired together has a big effect on phi. If a network is made up of isolated parts, phi is low, because the parts cannot share information.

But simply linking all the parts in every possible way does not raise phi much. “It’s either all on, or all off,” Dr. Tononi said. In effect, the network becomes one giant photodiode.

Networks gain the highest phi possible if their parts are organized into separate clusters, which are then joined. “What you need are specialists who talk to each other, so they can behave as a whole,” Dr. Tononi said. He does not think it is a coincidence that the brain’s organization obeys this phi-raising principle.

Dr. Tononi argues that his Integrated Information Theory sidesteps a lot of the problems that previous models of consciousness have faced. It neatly explains, for example, why epileptic seizures cause unconsciousness. A seizure forces many neurons to turn on and off together. Their synchrony reduces the number of possible states the brain can be in, lowering its phi.

Tononi is an NIH Pioneer Award winner this year and his talk this coming Thursday at 9:00 EDT will be webcast. The whole slate of Pioneer Award winners is quite impressive.

# Rethinking clinical trials

Today’s New York Times has a poignant article about the cold side of randomized clinical trials.  It describes the case of two cousins with melanoma and a promising new drug to treat it.  One cousin was given the drug and is still living while the other was assigned to the control arm of the trial and is now dead.  The new treatment seems to work better but the drug company and trial investigators want to complete the trial to prove that it actually extends life and that implies that the control arm patients need to die before the treatment arm patients.

Ever since my work on modeling sepsis a decade ago, I have felt that we need to come up with a new paradigm for testing the efficacy of treatments. Aside from the ethical concern of depriving a patient of a treatment just to get better statistics, I felt that we would hit a combinatorial limit where it would just be physically impossible to test a new generation of treatments. Currently, a drug is tested in three phases before it is approved for use. Phase I is a small trial that tests the safety of the drug in humans. Phase II then tests for the efficacy of the drug in a larger group. If the drug passes these two phases then it goes to Phase III, which is a randomized clinical trial with many patients and at multiple centers. It takes a long time and a lot of money to make it through all of these stages.

# The push hypothesis for obesity

My blog post on the summary of my SIAM talk on obesity was picked up by Reddit.com.  There is also a story by mathematics writer Barry Cipra in SIAM news (not yet available online).  I thought I would explicitly clarify the “push” hypothesis here and reiterate that this is my opinion and not NIH policy.  What we had done previously was to derive a model of human metabolism that gives a prediction of how much you would weigh given how much you eat.  The model is fully dynamic and can capture how much you gain or lose weight depending on changes in diet or physical activity.  The parameters in the model have been calibrated with physiological measurements and validated in several independent studies of people undergoing weight change due to diet changes.

We then applied this model to the US population.  We used data from the National Health and Nutrition Examination Survey, which has kept track of the body weights of a representative sample of the US population for the past several decades and food availability data from the USDA.  Since the 1970’s, the average US body weight has increased linearly.  The US food availability per person has also increased linearly.  However, when we used the food availability data in the model, it predicted that the weight gain would grow linearly at a faster rate.  The USDA has used surveys and other investigative techniques to try to account for how much food is wasted.  If we calibrate the wastage to 1970 then we predict that the difference between the amount consumed and the amount available progressively increased from 1970 to 2005.  We interpreted this gap to be a progressive increase of food waste.  An alternative hypothesis would be that everyone burned more energy than the model predicted.

This also makes a prediction for the cause of the obesity epidemic although we didn’t make this the main point of the paper. In order to gain weight, you have to eat more calories than you burn. There are three possibilities for how this could happen: 1) We could decrease energy expenditure by reducing physical activity and thus increase weight even if we ate the same amount of food as before, 2) There could be a pull effect where we became hungrier and started to eat more food, and 3) There could be a push effect where we eat more food than we would have previously because of increased availability. Now the data rule out hypothesis 1) since we assumed that physical activity stayed constant and still showed an increasing gap between energy intake and energy expenditure. If anything, we may be exercising more than expected. Hypothesis 2) would predict that the gap between intake and expenditure should fall and waste should decrease as we utilize more of the available food. This then leaves us with hypothesis 3) where we are being supplied more food than we need to maintain our body weight and while we are eating some of this excess food, we are wasting more and more of it as well.

The final question, which is outside my realm of expertise, is why food supply increased. The simple answer is that food policy changed dramatically in the 1970’s. Earl Butz was appointed to be the US Secretary of Agriculture in 1971. At that time food prices were quite high so he decided to change farm policy and vastly increase the production of corn and soybeans. As a result, the supply of food increased dramatically and the price of food began to drop. The story of Butz and the consequences of his policy shift are documented in the film King Corn.

# Talk at Pitt

I visited the University of Pittsburgh today to give a colloquium. I was supposed to have come in February but my flight was cancelled because of a snow storm. This was not the really big snow storm that closed Washington, DC and Baltimore for a week but a smaller one that hit New England and not the DC area. My flight was on Southwest and I presume that they have such a tightly correlated flight system, where planes circulate around the country in a “just in time” fashion, that a disturbance in one part of the country affects the rest of the country. So while other airlines just had cancellations in New England, Southwest flights were cancelled for the day all across the US. It seems that there is a trade-off between business efficiency and robustness. I drove this time. My talk was on the finite size effects in the Kuramoto model, which I’ve given several times already. However, I have revised the slides on pedagogical grounds and they can be found here.