Probability of gun death

The tragedy in Oregon has reignited the gun debate. Gun control advocates argue that fewer guns mean fewer deaths while gun supporters argue that if citizens were armed then shooters could be stopped through vigilante action. These arguments can be quantified in a simple model of the probability of gun death, p_d:

p_d = p_gp_u(1-p_gp_v) + p_gp_a

where p_g is the probability of having a gun, p_u is the probability of being a criminal or  mentally unstable enough to become a shooter, p_v is the probability of effective vigilante action, and p_a is the probability of accidental death or suicide.  The probability of being killed by a gun is given by the probability of someone having a gun times the probability that they are unstable enough to use it. This is reduced by the probability of a potential victim having a gun times the probability of acting effectively to stop the shooter. Finally, there is also a probability of dying through an accident.

The first derivative of p_d with respect to p_g is p_u - 2 p_u p_g p_v + p_a and the second derivative is negative. Thus, the minimum of p_d cannot be in the interior 0 < p_g < 1 and must be at the boundary. Given that p_d = 0 when p_g=0 and p_d = p_u(1-p_v) + p_a when p_g = 1, the absolute minimum is found when no one has a gun. Even if vigilante action was 100% effective, there would still be gun deaths due to accidents. Now, some would argue that zero guns is not possible so we can examine if it is better to have fewer guns or more guns. p_d is maximal at p_g = (p_u + p_a)/(2p_u p_v). Thus, unless p_v is greater than one half then even in the absence of accidents there is no situation where increasing the number of guns makes us safer. The bottom line is that if we want to reduce gun deaths we should either reduce the number of guns or make sure everyone is armed and has military training.




Information content of the brain revisited

My post – The gigabit machine, was reposted on the web aggregator site recently.  Aside from increasing traffic to my blog by tenfold for a few days, the comments on reddit made me realize that I wasn’t completely clear in my post.  The original post was about a naive calculation of the information content in the brain and how it dwarfed the information content of the genome.  Here, I use the term information in the information theoretical sense, which is about how many bits must be specified to define a system.  So a single light switch that turns on and off has one bit of information while ten light switches have 10 bits.  If we suppose that the brain has about 10^{11} neurons, with about 10^4 connections each, then there are 10^{15} total connections.  If we make the very gross assumption that each connection can be either “on” or “off”, then we arrive at 10^{15} bits.  This would be a lower bound on the amount of information required to specify the brain and it is already a really huge number.  The genome has 3 billion bases and each base can be one of four types or two bits, so this gives a total of 6 billion bits.  Hence, the information contained in the genome is just rounding noise compared to the potential information contained in the brain.  I then argued that education and training was insufficient to make up this shortfall and that most of the brain must be specified by uncontrolled events.

The criticism I received in the comments on reddit was that this doesn’t imply that the genome did not specify the brain. An example that was brought up was the Mandelbrot set where highly complex patterns can arise from a very simple dynamical system.  I thought this was a bad example because it takes a countably infinite amount of information to specify the Mandelbrot set but I understood the point which is that a dynamical system could easily generate complexity that appears to have higher information content.  I even used such an argument to dispel the notion that the brain must be simpler than the universe in this post.  However, the key point is that the high information content is only apparent; the actual information content of a given state is no larger than that contained in the original dynamical system and initial conditions.   What this would mean for the brain is that the genome alone could in principle set all the connections in the brain but these connections are not independent.  There would be correlations or other high order statistical relationships between them.  Another way to say this is that while in principle there are 2^{10^{15}} possible brains, the genome can only specify 2^{6\times10^{9}} of them, which is still a large number.  Hence, I believe that the conclusions of my original post still hold – the connections in the brain are either set mostly by random events or they are highly correlated (statistically related).

Evolution of overconfidence

A new paper on the evolution of overconfidence (arXiv:0909.4043v2) will appear shortly in Nature. (Hat tip to J.L. Delatre). It is well known in psychology that people generally overvalue themselves and it has always been a puzzle as to why.  This paper argues that under certain plausible conditions, it may have been evolutionarily advantageous to be overconfident.  One of the authors is James Fowler who has garnered recent fame for claiming with Nicholas Christakis that medically noninfectious phenomena such as obesity and divorce are socially contagious.  I have always been skeptical of these social network results and it seems like  there has been some recent push back.  Statistician and blogger Andrew Gelman has a summary of the critiques here.  The problem with these papers  fall in line with the same problems of many other clinical papers that I have posted on before (e.g. see here and here).  The evolution of overconfidence paper does not rely on statistics but on a simple evolutionary model.

The model  considers competition between two parties for some scarce resource.  Each party possess some heritable attribute and the one with the higher value of that attribute will win a contest and obtain the resource.   The model allows for three outcomes in any interaction: 1) winning a competition and obtaining the resource with value W-C (where C is the cost of competing), 2) claiming the resource without a fight with value W, and 3) losing a competition with a value -C.    The parties assess their own and their opponents attributes before deciding to compete.  If both parties had perfect information, participating in a contest would be unnecessary.  Both parties would realize who would win and the stronger of the two would claim the prize. However,  because of the error and biases in assessing attributes, resources will be contested. Overconfidence is represented as a positive bias in assessing oneself.  The authors chose a model that was simple enough to explicitly evaluate the outcomes of all possible situations and show that when the reward for winning is sufficiently large compared to the cost, then overconfidence is evolutionarily stable.

Here I will present a simpler toy model of why the result is plausible. Let P be the probability that a given party will win a competition on average and let Q be the probability that they will engage in a competition. Hence, Q is a measure of overconfidence.  Using these values, we can then compute the expectation value of an interaction:

E = Q^2P (W-C) + Q(1-Q) W - Q^2(1-P) C

(i.e. the probability of a competition and winning is Q^2P, the probability of  winning and not having to fight is Q(1-Q), the probability of  losing a competition is Q^2(1-P), and it doesn’t cost anything to not compete.)  The derivative of E with respect to Q is

E' = 2 QP(W-C) + (1-2Q)W-2Q(1-P)C=2Q[(1-P)W-C]+W

Hence, we see that if (1-P)W > C, i.e. the reward of winning sufficiently exceeds the cost of competing, then the expectation value is guaranteed to increase with increasing confidence. Of course this simple demonstration doesn’t prove that overconfidence is a stable strategy but it does affirm Woody Allen’s observation that “95% of life is just showing up.”

Productivity and ability

What makes some people more productive then others?  Is it innate ability, better training, hard work?  Although the meaning of productivity is subjective,  there are quantifiable differences between researchers in measures of productivity such as the  h-index.    Here I will argue that a small difference in ability or efficiency can lead to great differences in output.

Let’s consider a simple and admittedly flawed model of productivity.  Suppose we consider productivity to be the number of tasks you can complete and let P represent the probability that you can accomplish a  task (i.e. efficiency).  A task could be anything from completing an integral, to writing a program, to sticking an electrode into a cell, or to finishing a paper.  The probability of completing N independent tasks is T=P^N.  Conversely, the number of steps that can be completed with probability T is N = \log T/\log P.  Now let P = 1-\epsilon, where \epsilon is the failure probability.  Hence, for high efficiency (i.e. low failure rate),  we can expand the logarithm for small \epsilon and obtain N \propto \epsilon^{-1}.  The number of tasks you can complete for a given probability  is inversely proportional to your failure rate.

The rate of change in productivity with respect to efficiency increases even faster with

\frac{dN}{d P}\propto \epsilon^{-2}

Hence, small differences in efficiency can lead to large differences in the number of tasks that can be completed and the gain is more dramatic if you have higher efficiency.  For example, if you go from being 90\% efficient (i.e. \epsilon = .1) to 95\% efficient (i.e. \epsilon = .05) then you will double the number of tasks you can complete. Going from 98\% to 99\% is also a doubling in productivity.  The model clearly disregards the fact that tasks are often correlated and have different probabilities for success.  I know  some people who have great trouble in revising and resubmitting papers to get published and thus they end up having low measured productivity even though they have accomplished a lot.   However, it seems to indicate that it is always worth improving your efficiency even by a small amount.

Some numbers for the BP leak

The Deepwater Horizon well is situated 1500 m below the surface of the Gulf of Mexico.  The hydrostatic pressure is approximately given by  the simple formula of P_a+ g\rho h where P_a = 100 \ kPa is the pressure of the atmosphere, \rho = 1 \ g/ml = 1000 \ kg/m^3   is the density of water, and g = 10 \ m/s^2 is the gravitational acceleration.  Putting the numbers together gives 1.5\times 10^7 \ kg/m s^2, which is 15000 \ kPa or about 150 times atmospheric pressure.  Hence, the oil and natural gas must be under tremendous pressure to be able to leak out of the well at all.  It’s no wonder the Top Kill operation, where mud was pumped in at high pressure, did not work.

Currently, it is estimated that the leak rate is somewhere between 10,000 and 100,000 barrels of oil per day.  A barrel of oil is 159 litres or 0.159 cubic metres.  So basically 1600 to 16000 cubic metres of oil is leaking each day.  This amounts to a cube with sides of about 11 metres for the lower value and 25 metres for the upper one, which is about the length of a basketball court.  However, assuming that the oil forms a layer on the surface of the ocean that is 0.001 mm thick, this then corresponds to a slick with an area between 1,600 to 16,000 square kilometres.  Given that the leak has been going on for almost two months and the Gulf of Mexico is 160,000 square kilometres, this implies that the slick is either very thick, oil has started to wash up on shore, or a lot of the oil is still under the surface.

Cost of health care

The New York Times  had a nice summary of what is known as Baumol’s cost disease for an explanation of why health care costs will always rise faster than inflation.  The explanation is quite elegant in my opinion and can also explain why costs for education and arts will also keep increasing at a rapid rate.  The example Baumol (and his colleague Bowen) use is that it takes the same number of people to play a Mozart string quartet as it did in the 18th century and yet musicians are paid so much more now.  Hence, the cost of music has increased with no corresponding increase in productivity.  Generally, wages should only increase because of a net gain in productivity.  Hence, a manufacturing plant  today has far fewer people than a century ago but they get paid more and produce more.  However, a violinist today is exactly the same efficient as she was a century ago.  Baumol argued that it was competition with other sectors of the economy that allowed the wages of artists to go up.  If you didn’t give musicians a living wage then there would be no musicians.

Applied to the health care industry, the implication is that medicine is just as labour intensive  and no more productive as it was before yet the salaries keep going up.   I think this is not quite correct and it is the complement or corollary of cost disease, which I’ll call productivity disease, that is the culprit for health care cost increases.  Health care is substantially more productive and efficient than before but this increase in productivity does not decrease cost but increases it.  For example, consider the victims of  a car crash.  Fifty years ago, they would probably just die and there would be no health care costs.  Now, they are evacuated by emergency personnel who resuscitate them on the way to hospital where they are given blood transfusions, undergo surgery, etc.  If they survive, they may then require months or years of physical therapy, expensive medication and so forth.  The increase in productivity leads to more health care and an increase in cost.  Hence, the better the health care industry gets at keeping you alive, the more expensive it becomes.

I feel that the panic over the rapid increase in health care costs is misplaced.  Why shouldn’t a civilized society be spending 90% of its GDP on health care?  After all, what is more important,  being healthy or having a flat screen TV?  I do think that eventually, cost disease or productivity disease, will saturate.  Perhaps we are at the steepest part of the cost curve right now because our health care system is good at keeping people alive but does not make them well enough so that they don’t need extended and expensive care.  If technology increased to the point that illness and injury could be treated instantly then costs would surely level off or even decrease.  For example, a minor cut a century ago could lead to a severe infection that could require hospitalization,  expensive treatment and result in death.  Now, you can treat a cut at home with some antibiotic ointment and a bandage.   We can certainly try to restrain some abuses and overuse of the health cares system by changing the incentive structure but perhaps we should also accept that a continuous increase in health care costs is inevitable and even desirable.

Connecting the dots

As I posted previously, I am highly skeptical that any foolproof system can be developed to screen for potential threats.  My previous argument was that in order to have a zero false negative rate for identifying terrorists, it would be impossible to not also have a relatively significant false positive rate.  In other words, the only way to guarantee that a terrorist doesn’t board a plane is to not let anyone board.  A way to visualize this graphically is with what is called a receiver operating characteristic or ROC curve, which is a plot of the true positive rate versus the false positive rate for a binary classification test as some parameter, usually a discrimination threshold, is changed.  Ideally, one would like  a curve that jumps to a true positive rate of 1 for zero false positive rate.  The area under the ROC curve (AROC) is the usual measure for how good a discriminator is.  So a perfect discriminator has AROC = 1.  In  my experience with biological systems,  it is pretty difficult to make a test with an AROC of greater than 90%.    Additionally, ROC curves are usually somewhat smooth so that they only reach true positve rate = 1  at false positive rate = 1.

Practicalities aside, is there any mathematical reason why a perfect or near perfect discriminator couldn’t be designed?  This to me is the more interesting question.  My guess is that deciding if a person is a terrorist is an NP hard question, which is why it is so insidious.   For any NP problem, it is simple to verify the answer but hard to find one.   Connecting all the dots to show that someone is a terrorist is a straightforward matter if you already know that they are a terrorist.  This  is also true of proving the Riemann Hypothesis or solving the 3D Ising model.  The  solution is obvious if you know the answer. If terrorist finding is NP hard, then that means for a large enough population and I think 5 billion qualifies, then no method nor achievable amount of computational power is sufficient to do the job perfectly.