Technology and inference

In my previous post, I gave an example of how fake news could lead to a scenario in which posterior probabilities are never updated. However, this situation could arise just from knowing what technology can do. When I was a child, fantasy and science fiction movies always had a campy feel because the special effects looked unrealistic. When Godzilla came out of Tokyo Harbour it looked like little models in a bathtub. The Creature from the Black Lagoon looked like a man in a rubber suit. I think the first science fiction movie that looked astonishingly real was Stanley Kubrick’s 1968 masterpiece 2001: A Space Odyssey, which adhered to physics like no others before and only a handful since. The simulation of weightlessness in space was marvelous and to me the ultimate attention to detail was the scene in the rotating space station where a mild curvature in the floor could be perceived. The next groundbreaking moment was the 1993 film Jurassic Park, which truly brought dinosaurs to life. The first scene of a giant sauropod eating from a tree top was astonishing. The distinction between fantasy and reality was forever gone.

The effect of this essentially perfect rendering of anything into a realistic image is that we now have a plausible reason to reject any evidence. Photographic evidence can be completely discounted because the technology exists to create completely fabricated versions. This is equally true of audio tapes and anything you read on the Internet. In Bayesian terms, we now have an internal model or likelihood function in which any data could be false, with some constant probability that a given piece of evidence is fabricated. The more cynical you are, the closer this constant is to one. Once the likelihood becomes insensitive to data, we are in the same situation as before. Technology alone, in the absence of fake news, could lead to a world where no one ever changes their mind. The irony could be that this will force people to evaluate truth the way they did before such technology existed, which is that you believe people (or machines) that you trust through building relationships over long periods of time.
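One way to make the "cynicism constant" concrete is a mixture likelihood. The sketch below is only an illustration under assumptions that are not in the argument above: `c` is the probability that a piece of evidence is fabricated, `q` is the (hypothetical) probability of seeing that evidence when it is fabricated, taken to be the same whether the claim is true or false, and `cynical_posterior` is a made-up name.

```julia
# Posterior P(T|D) when any piece of evidence is fabricated with probability c.
# lT, lF : likelihood of the data under "true" and "false" if the data is genuine
# q      : assumed probability of the data if it is fabricated (same in both worlds)
function cynical_posterior(prior, lT, lF, c; q = 0.5)
    likeT = c * q + (1 - c) * lT   # effective P(D|T)
    likeF = c * q + (1 - c) * lF   # effective P(D|F)
    return likeT * prior / (likeT * prior + likeF * (1 - prior))
end

# Sweep the cynicism constant from fully trusting (0) to fully cynical (1).
for c in (0.0, 0.5, 0.9, 1.0)
    println("c = ", c, "  posterior = ", cynical_posterior(0.5, 0.9, 0.1, c))
end
```

As `c` approaches one, the two effective likelihoods converge and the posterior falls back to the prior, whatever the data says.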

Fake news and beliefs

Much has been written about the role of fake news in the US presidential election. While we will never know how much it actually contributed to the outcome, as I will show below, it could certainly affect people’s beliefs. Psychology experiments have found that humans often follow Bayesian inference – the probability we assign to an event or action is updated according to Bayes’ rule. For example, suppose $P(T)$ is the probability we assign to climate change being real; $P(F) = 1-P(T)$ is our probability that climate change is false. In the Bayesian interpretation of probability, this would represent our level of belief in climate change. Given new data $D$ (e.g. news), we will update our beliefs according to

$P(T|D) = \frac{P(D|T) P(T)}{P(D)}$

What this means is that our posterior probability or belief that climate change is true given the new data, $P(T|D)$, is equal to the probability that the new data came from our internal model of a world with climate change (i.e. our likelihood), $P(D|T),$ multiplied by our prior probability that climate change is real, $P(T),$ divided by the probability of obtaining such data in all possible worlds, $P(D)$. According to the rules of probability, the latter is given by $P(D) = P(D|T)P(T) + P(D|F)P(F)$, which is the sum of the probability the data came from a world with climate change and that from one without.
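The update rule is short enough to write as a function. This is a minimal sketch for the binary case only; the name `bayes_update` and the numbers in the example are illustrative assumptions, not values from any study.

```julia
# Posterior P(T|D) from Bayes' rule for a binary hypothesis.
# prior = P(T), likeT = P(D|T), likeF = P(D|F)
function bayes_update(prior, likeT, likeF)
    evidence = likeT * prior + likeF * (1 - prior)   # P(D)
    return likeT * prior / evidence
end

# Illustrative numbers: a 50% prior and data twice as likely under T as under F.
posterior = bayes_update(0.5, 0.8, 0.4)
println(posterior)
```

Only the ratio of the two likelihoods matters here, since scaling both by the same factor cancels between numerator and denominator.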

This update rule can reveal what will happen in the presence of new data including fake news. The first thing to notice is that if $P(T)$ is zero, then there is no update. In this binary case, this means that if we believe that climate change is absolutely false or absolutely true then no data will change our mind. In the case of multiple outcomes, any outcome with zero prior (i.e. no support) will never change. So if our priors are exactly zero or one, fake news is not having an impact because no news is having an impact. If we have nonzero priors for both true and false, then if the data is more likely under our true model our posterior for true will increase, and vice versa. Our posteriors will tend towards the direction of the data and thus fake news could have a real impact.
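The direction of the update is easiest to see in odds form. This rearrangement is standard but is my addition here: dividing Bayes’ rule for $T$ by the same rule for $F$ cancels $P(D)$ and gives

$\frac{P(T|D)}{P(F|D)} = \frac{P(D|T)}{P(D|F)} \cdot \frac{P(T)}{P(F)}$

The posterior odds are the prior odds multiplied by the likelihood ratio. A ratio above one raises the belief in $T$, a ratio below one lowers it, and prior odds of zero stay at zero whatever the ratio is.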

For example, suppose we have an internal model where we expect the mean annual temperature to be 10 degrees Celsius with a standard deviation of 3 degrees if there is no climate change and a mean of 13 degrees with climate change. Thus if the reported data is mostly centered around 13 degrees then our belief in climate change will increase and if it is mostly centered around 10 degrees then it will decrease. However, if we get data that is spread uniformly over a wide range then both models could be equally likely and we would get no update. Mathematically, if $P(D|T)=P(D|F)$ then $P(D) = P(D|T)(P(T)+P(F))= P(D|T)$, so from the Bayesian update rule the posterior will be identical to the prior. In a world of lots of misleading data, there is no update. Thus, obfuscation and sowing confusion are a very good strategy for preventing updates of priors. You don’t need to refute data, just provide fake examples and bury the data in a sea of noise.
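The temperature example can be checked numerically. The sketch below assumes Gaussian likelihoods with the means and standard deviation given above, treats each reading as independent, and uses made-up readings. The function names are hypothetical, and the density is written out by hand to avoid depending on a statistics package.

```julia
# Gaussian probability density.
normal_pdf(t, mu, sd) = exp(-(t - mu)^2 / (2 * sd^2)) / (sd * sqrt(2 * pi))

# Posterior belief in climate change after seeing the readings in `temps`.
# mu_F and mu_T are the expected means without and with climate change.
function climate_posterior(temps, prior; mu_F = 10.0, mu_T = 13.0, sd = 3.0)
    likeT = prod(normal_pdf.(temps, mu_T, sd))
    likeF = prod(normal_pdf.(temps, mu_F, sd))
    return likeT * prior / (likeT * prior + likeF * (1 - prior))
end

println(climate_posterior([13.0, 12.5, 13.5], 0.5))  # data near 13: belief rises
println(climate_posterior([10.0, 9.5, 10.5], 0.5))   # data near 10: belief falls
println(climate_posterior([8.5, 11.5, 14.5], 0.5))   # spread evenly about 11.5: stays at the prior
```

The third data set is spread symmetrically around the midpoint of the two means, so both models explain it equally well and the posterior equals the prior up to rounding.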

Code platform update

It’s a week away from 2015, and I have transitioned completely away from Matlab. Julia is winning the platform attention battle. It is very easy to code in and it is very fast. I just haven’t gotten around to learning Python, much less pyDSTool (sorry Rob). I find the syntax of Python (with all the periods between words) kind of annoying. Wally Xie and I have also been trying to implement some of our MCMC runs in Stan but we have had trouble making it work. Our model requires integrating ODEs and the ODE solutions from Stan (using our own solver) do not match our Julia code or the gold standard XPP. Maybe we are missing something obvious in the Stan syntax but our code is really very simple. Thus, we are going back to doing our Bayesian posterior estimates in Julia. However, I do plan to revisit Stan if they (or we) can write a debugger for it.

MCMC for linear models

I’ve been asked in a comment to give a sample of pseudocode for an MCMC algorithm to fit a linear model $ax + b$ to some data. See here for the original post on MCMC. With a linear model, you can write down the answer in closed form (see here), so it is a good model to test your algorithm and code. Here it is in pseudo-Julia code:

using LinearAlgebra  # provides norm

# D, x, sigma, asig, bsig and total are assumed to be defined beforehand
# initial guess for parameters a and b
a = 0.0
b = 0.0
# construct chi squared, where D is the data vector and x is the vector of the
# independent quantity
chi = norm(D - (a*x .+ b))^2
for n = 1:total
    global a, b, chi  # needed when this runs at the top level of a script
    # Make random guesses for new normally distributed a and b with mean old a and b
    # and standard deviation asig and bsig
    at = a + asig * randn()
    bt = b + bsig * randn()

    chit = norm(D - (at*x .+ bt))^2
    # Take ratio of likelihoods, sigma is the data uncertainty
    ratio = exp((-chit + chi)/(2*sigma^2))
    # Compare the ratio to a uniform random number between 0 and 1,
    # keep new parameters if ratio exceeds random number
    if rand() < ratio
        a = at
        b = bt
        chi = chit
    end
end
# Keep running until convergence
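To test a sampler like this against the closed-form answer, it helps to wrap it in a function and run it on synthetic data. The following is a sketch under stated assumptions rather than a tuned implementation: the true parameters ($a = 2$, $b = 1$), the noise level, the proposal widths, the chain length, and the burn-in are all arbitrary choices of mine, and `mcmc_linear` is a hypothetical name.

```julia
using LinearAlgebra, Random, Statistics

# Metropolis sampler for the linear model a*x + b with Gaussian noise of known sigma.
# Returns the full chains of a and b so that posterior means can be computed afterwards.
function mcmc_linear(x, D, sigma; asig = 0.02, bsig = 0.1, total = 50_000)
    a, b = 0.0, 0.0
    chi = norm(D - (a*x .+ b))^2
    a_chain = zeros(total)
    b_chain = zeros(total)
    for n = 1:total
        at = a + asig * randn()
        bt = b + bsig * randn()
        chit = norm(D - (at*x .+ bt))^2
        # Accept with probability min(1, likelihood ratio)
        if rand() < exp((chi - chit) / (2*sigma^2))
            a, b, chi = at, bt, chit
        end
        a_chain[n] = a
        b_chain[n] = b
    end
    return a_chain, b_chain
end

Random.seed!(1)
x = collect(0:0.1:10)
sigma = 0.5
D = 2.0 .* x .+ 1.0 .+ sigma .* randn(length(x))   # synthetic data

a_chain, b_chain = mcmc_linear(x, D, sigma)
burn = 10_000                                       # discard the initial transient
a_mcmc, b_mcmc = mean(a_chain[burn+1:end]), mean(b_chain[burn+1:end])
a_ls, b_ls = [x ones(length(x))] \ D                # closed-form least squares fit
println("MCMC:          a = ", a_mcmc, ", b = ", b_mcmc)
println("Least squares: a = ", a_ls, ", b = ", b_ls)
```

With a flat prior the posterior mean coincides with the least squares estimate, so the two printed lines should agree to within the sampling error of the chain.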

Big Data backlash

I predicted that there would be an eventual pushback on Big Data and it seems that it has begun. Gary Marcus and Ernest Davis of NYU had an op-ed in the Times yesterday outlining nine issues with Big Data. I think one way to encapsulate many of the critiques is that you will never be able to do true prior-free data modeling. The number of combinations in a data set grows as the factorial of the number of elements, which grows faster than an exponential. Hence, Moore’s law can never catch up. At some point, someone will need to exercise some judgement, in which case Big Data is not really different from the ordinary data that we deal with all the time.
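The claim that a factorial outruns any exponential follows from Stirling’s approximation, which I add here as a supporting step:

$n! \approx \sqrt{2\pi n} \left(\frac{n}{e}\right)^n$

For any fixed base $c$, this gives $\frac{n!}{c^n} \approx \sqrt{2\pi n} \left(\frac{n}{ce}\right)^n$, which grows without bound once $n > ce$. Exponential growth in computing power, whatever its doubling time, therefore eventually falls behind.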

The myth of the single explanation

I think one of the things that tends to lead us astray when we try to understand complex phenomena like evolution, disease, or the economy, is that we have this idea that they must have a single explanation. For example, recently two papers have been published in high profile journals trying to explain mammal monogamy. Although monogamy is quite common in birds, it only occurs in 5% of mammals. Here is Carl Zimmer’s summary. The study in Science, which surveyed 2545 mammal species, argued that monogamy arises when females are solitary and sparse. Males must then commit to one since dates are so hard to find. The study in PNAS examined 230 primate species, for which monogamy occurs at the higher rate of 27%, and used Bayesian inference to argue that monogamy arises to prevent male infanticide. It’s better to help out at home rather than go around killing other men’s babies. Although both of these arguments are plausible, there need not be a single universal explanation. Each species could have its own set of circumstances that led to monogamy involving these two explanations and others. However, while we should not be biased towards a single explanation, we also shouldn’t throw up our hands like Hayek and argue that no complex phenomenon can be understood. Some phenomena will have simpler explanations than others but since the Kolmogorov complexity is uncomputable there is no algorithm that can tell you which is which. We will just have to struggle with each problem as it comes.

Talk at GRC

I’m currently in Mt. Snow, Vermont to give a talk at the Gordon Research Conference on Computer Aided Drug Design. Yes, I know nothing about drug design. I am here because the organizer, Anthony Nicholls, asked me to give a pedagogical talk on Bayesian Inference. My slides are here. I only arrived yesterday but the few talks I’ve seen have been quite interesting. One interesting aspect of this conference is that many of the participants are from industry. The evening sessions are meant to be of more general interest. Last night there were two talks about how to make science more reproducible. As I’ve posted before, many published results are simply wrong. The very enterprising Elizabeth Iorns has started something called the Reproducibility Initiative. I am not completely clear about how it works but it is part of another entity she started called Science Exchange, which helps to facilitate collaborations with a fee-for-service model. The Reproducibility Initiative piggybacks on Science Exchange by providing a service (for a fee) to validate any particular result. Papers that pass validation get a stamp of approval. It is expected that pharma would be interested in using this service so they can inexpensively check if possible drug targets actually hold up. Many drugs fail at phase three of clinical trials because they are shown to be ineffective, and this may be due to the target being wrong to start with.

On a final note, I flew to Albany and drove here. Unlike in the past when I would have printed out a map, I simply assumed that I could use Google Maps on my smartphone to get here. However, Google Maps doesn’t really know where Mt. Snow is. It tried to take me up a dirt road to the back of the ski resort. Also, just after I turned up the road, the phone signal disappeared so I was blind and had no paper backup. I doubted that this was the right way to go so I turned back to the main highway in hopes of finding a signal or a gas station to ask for directions. A few miles down Route 9, I finally did get a signal and also found a sign that showed me the way. Google Maps still tried to take me the wrong way. I should have followed what I always tell my daughter – always have a backup plan.