The relevant Covid-19 fatality rate

Much has been written in the past few days about whether the case fatality rate (CFR) for Covid-19 is actually much lower than the original estimate of about 3 to 4%. Globally, the CFR is highly variable, ranging from half a percent in Germany to nearly 10% in Italy. The difference could be due to underlying differences in the populations or to the extent of testing. South Korea, which has done very wide-scale testing, has a CFR of around 1.5%. However, whether the CFR is high or low is not the important question. The number we must determine is the population fatality rate. Even if most of the people who become infected with SARS-CoV-2 have mild or no symptoms, so that the CFR is low, if most people are susceptible and the entire world population gets the virus then even a tenth of a percent of 7 billion is still a very large number.

What we don’t know yet is how much of the population is susceptible. Data from the cruise ship Diamond Princess showed that about 20% of the passengers and crew became infected, but there were still some social distancing measures in place after the first case was detected, so this does not necessarily imply that 80% of the world population is innately immune. A recent paper from Oxford argues that about half of the UK population may already have been infected and is no longer susceptible. However, I redid their analysis and find that widespread infection, although possible, is not very likely (details to follow, famous last words), but this can and should be verified by testing for antibodies in the population. The bottom line is that we need to test, test and test, both for the virus and for antibodies, before we will know how bad this will be.

How many Covid-19 cases are too many?

The US death rate is approximately 900 per 100,000 people per year. Thus, for a medium-sized city of a million there are on average 25 deaths per day. Not all of these deaths will be preceded by hospital care of course, but that gives an idea of the scale of the normal case load of the health care system. The doubling time for the number of cases of Covid-19 is about 5 days. At this moment, the US has over 25 thousand cases, with 193 cases in Maryland, where I live, and over 11 thousand in New York. If the growth rate is unabated then in 5 days there will be almost 400 cases in Maryland and over 50 thousand in the US. The case fatality rate for Covid-19 is still not fully known but let’s suppose it is 1% and let’s say 5% of those infected need hospital care. This means that 5 days from now there will be an extra 20 patients in Maryland and 2500 patients in the US. New York will have an extra thousand patients. Many of these patients will need ventilators and most hospitals only have a few. It is easy to see that it will not take too long until every ventilator in the state and the US is in use. Also, with the shortage of protective gear, some of the hospital staff will contract the virus and add to the problem. As conditions in hospitals deteriorate, the virus will spread to non-Covid-19 patients. This is where northern Italy is now and the US is about 10 days behind them. This is the scenario that has been presented to the policy makers and they have chosen to take what may seem like extreme social distancing measures now. We may not be able to stop this virus but if we can lengthen the doubling time, which is related to how many people are infected by a person with the virus, then we can give the health care system a chance to catch up.
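The arithmetic in this post is easy to reproduce. Here is a minimal sketch in Python, assuming pure exponential growth with a fixed doubling time and a fixed hospitalization fraction; the function names are mine and purely illustrative, and the inputs are just the figures quoted above.

```python
def baseline_deaths_per_day(population, annual_deaths_per_100k=900):
    """Average daily deaths from all causes for a population of the given size."""
    return population * annual_deaths_per_100k / 100_000 / 365


def projected_cases(current_cases, days, doubling_time_days=5.0):
    """Case count after `days` of unabated exponential growth (an assumption)."""
    return current_cases * 2 ** (days / doubling_time_days)


def hospital_patients(cases, hospitalization_fraction=0.05):
    """Patients needing hospital care, assuming a fixed fraction of all cases."""
    return cases * hospitalization_fraction


if __name__ == "__main__":
    print(f"Deaths per day in a city of a million: {baseline_deaths_per_day(1_000_000):.1f}")
    # Case counts quoted in the text.
    for region, cases in [("Maryland", 193), ("New York", 11_000), ("US", 25_000)]:
        future = projected_cases(cases, days=5)
        print(f"{region}: {future:,.0f} cases in 5 days, "
              f"{hospital_patients(future):,.0f} needing hospital care")
```

Changing `doubling_time_days` shows how much breathing room a slower spread would buy.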

Probability of gun death

The tragedy in Oregon has reignited the gun debate. Gun control advocates argue that fewer guns mean fewer deaths while gun supporters argue that if citizens were armed then shooters could be stopped through vigilante action. These arguments can be quantified in a simple model of the probability of gun death, p_d:

p_d = p_g p_u (1 - p_g p_v) + p_g p_a

where p_g is the probability of having a gun, p_u is the probability of being a criminal or  mentally unstable enough to become a shooter, p_v is the probability of effective vigilante action, and p_a is the probability of accidental death or suicide.  The probability of being killed by a gun is given by the probability of someone having a gun times the probability that they are unstable enough to use it. This is reduced by the probability of a potential victim having a gun times the probability of acting effectively to stop the shooter. Finally, there is also a probability of dying through an accident.

The first derivative of p_d with respect to p_g is p_u - 2 p_u p_g p_v + p_a and the second derivative, -2 p_u p_v, is negative. Thus, the minimum of p_d cannot be in the interior 0 < p_g < 1 and must be at the boundary. Given that p_d = 0 when p_g = 0 and p_d = p_u(1-p_v) + p_a when p_g = 1, the absolute minimum is found when no one has a gun. Even if vigilante action were 100% effective, there would still be gun deaths due to accidents. Now, some would argue that zero guns is not possible so we can examine if it is better to have fewer guns or more guns. p_d is maximal at p_g = (p_u + p_a)/(2p_u p_v). In the absence of accidents this is p_g = 1/(2p_v), which is below one only if p_v is greater than one half. Thus, unless p_v is greater than one half, even in the absence of accidents there is no situation where increasing the number of guns makes us safer. The bottom line is that if we want to reduce gun deaths we should either reduce the number of guns or make sure everyone is armed and has military training.
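For anyone who wants to play with the model, here is a small Python sketch. The parameter values in the example are made up for illustration and are not estimates of anything; the function names are likewise my own.

```python
def p_death(p_g, p_u, p_v, p_a):
    """Probability of gun death: p_g p_u (1 - p_g p_v) + p_g p_a."""
    return p_g * p_u * (1 - p_g * p_v) + p_g * p_a


def p_g_at_maximum(p_u, p_v, p_a):
    """Gun ownership level where p_death peaks (may exceed 1, i.e. no interior peak)."""
    return (p_u + p_a) / (2 * p_u * p_v)


if __name__ == "__main__":
    # Illustrative, assumed values only.
    p_u, p_v, p_a = 0.01, 0.8, 0.001
    for p_g in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"p_g = {p_g:.2f}  ->  p_d = {p_death(p_g, p_u, p_v, p_a):.5f}")
    print(f"p_d peaks at p_g = {p_g_at_maximum(p_u, p_v, p_a):.4f}")
```

With these assumed numbers the peak sits inside the interval, so adding guns only starts to help once ownership is already past that level, and p_d at p_g = 1 is still above zero.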

Information content of the brain revisited

My post, The gigabit machine, was reposted on the web aggregator site reddit.com recently.  Aside from increasing traffic to my blog tenfold for a few days, the comments on reddit made me realize that I wasn’t completely clear in my post.  The original post was about a naive calculation of the information content in the brain and how it dwarfed the information content of the genome.  Here, I use the term information in the information theoretical sense, which is about how many bits must be specified to define a system.  So a single light switch that turns on and off has one bit of information while ten light switches have 10 bits.  If we suppose that the brain has about 10^{11} neurons, with about 10^4 connections each, then there are 10^{15} total connections.  If we make the very gross assumption that each connection can be either “on” or “off”, then we arrive at 10^{15} bits.  This would be a lower bound on the amount of information required to specify the brain and it is already a really huge number.  The genome has 3 billion bases and each base can be one of four types or two bits, so this gives a total of 6 billion bits.  Hence, the information contained in the genome is just rounding noise compared to the potential information contained in the brain.  I then argued that education and training were insufficient to make up this shortfall and that most of the brain must be specified by uncontrolled events.

The criticism I received in the comments on reddit was that this doesn’t imply that the genome did not specify the brain. An example that was brought up was the Mandelbrot set where highly complex patterns can arise from a very simple dynamical system.  I thought this was a bad example because it takes a countably infinite amount of information to specify the Mandelbrot set but I understood the point which is that a dynamical system could easily generate complexity that appears to have higher information content.  I even used such an argument to dispel the notion that the brain must be simpler than the universe in this post.  However, the key point is that the high information content is only apparent; the actual information content of a given state is no larger than that contained in the original dynamical system and initial conditions.   What this would mean for the brain is that the genome alone could in principle set all the connections in the brain but these connections are not independent.  There would be correlations or other high order statistical relationships between them.  Another way to say this is that while in principle there are 2^{10^{15}} possible brains, the genome can only specify 2^{6\times10^{9}} of them, which is still a large number.  Hence, I believe that the conclusions of my original post still hold – the connections in the brain are either set mostly by random events or they are highly correlated (statistically related).
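The bit counting is simple enough to write down explicitly. This is only a sketch of the naive estimate, with the same gross assumptions as in the text (binary connections, two bits per base); the constant and function names are mine.

```python
NEURONS = 10**11                 # rough number of neurons in the brain
CONNECTIONS_PER_NEURON = 10**4   # rough number of connections per neuron
BASES_IN_GENOME = 3 * 10**9      # bases in the human genome
BITS_PER_BASE = 2                # four possible bases = log2(4) bits


def brain_bits():
    """Bits needed if every connection is independently 'on' or 'off'."""
    return NEURONS * CONNECTIONS_PER_NEURON


def genome_bits():
    """Bits needed to write down the genome."""
    return BASES_IN_GENOME * BITS_PER_BASE


if __name__ == "__main__":
    print(f"brain:  {brain_bits():.1e} bits")
    print(f"genome: {genome_bits():.1e} bits")
    print(f"ratio:  about {brain_bits() // genome_bits():,} to 1")
```

The numbers of distinguishable brains and genomes are 2 raised to these bit counts, which is why only the exponents are worth comparing.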

Evolution of overconfidence

A new paper on the evolution of overconfidence (arXiv:0909.4043v2) will appear shortly in Nature. (Hat tip to J.L. Delatre). It is well known in psychology that people generally overvalue themselves and it has always been a puzzle as to why.  This paper argues that under certain plausible conditions, it may have been evolutionarily advantageous to be overconfident.  One of the authors is James Fowler who has garnered recent fame for claiming with Nicholas Christakis that medically noninfectious phenomena such as obesity and divorce are socially contagious.  I have always been skeptical of these social network results and it seems like there has been some recent pushback.  Statistician and blogger Andrew Gelman has a summary of the critiques here.  The problems with these papers are in line with those of many other clinical papers that I have posted on before (e.g. see here and here).  The evolution of overconfidence paper does not rely on statistics but on a simple evolutionary model.

The model considers competition between two parties for some scarce resource.  Each party possesses some heritable attribute and the one with the higher value of that attribute will win a contest and obtain the resource.   The model allows for three outcomes in any interaction: 1) winning a competition and obtaining the resource with value W-C (where C is the cost of competing), 2) claiming the resource without a fight with value W, and 3) losing a competition with a value -C.    The parties assess their own and their opponent’s attributes before deciding to compete.  If both parties had perfect information, participating in a contest would be unnecessary.  Both parties would realize who would win and the stronger of the two would claim the prize. However, because of the errors and biases in assessing attributes, resources will be contested. Overconfidence is represented as a positive bias in assessing oneself.  The authors chose a model that was simple enough to explicitly evaluate the outcomes of all possible situations and show that when the reward for winning is sufficiently large compared to the cost, then overconfidence is evolutionarily stable.

Here I will present a simpler toy model of why the result is plausible. Let P be the probability that a given party will win a competition on average and let Q be the probability that they will engage in a competition. Hence, Q is a measure of overconfidence.  Using these values, we can then compute the expectation value of an interaction:

E = Q^2P (W-C) + Q(1-Q) W - Q^2(1-P) C

(i.e. the probability of a competition and winning is Q^2P, the probability of  winning and not having to fight is Q(1-Q), the probability of  losing a competition is Q^2(1-P), and it doesn’t cost anything to not compete.)  The derivative of E with respect to Q is

E' = 2QP(W-C) + (1-2Q)W - 2Q(1-P)C = W - 2Q[(1-P)W + C]

Hence, E increases with Q as long as Q < W/(2[(1-P)W + C]), and it increases over the whole range 0 ≤ Q ≤ 1 if (2P-1)W > 2C, i.e. if the party wins more often than not and the reward of winning sufficiently exceeds the cost of competing. In that case the expectation value is guaranteed to increase with increasing confidence. Of course this simple demonstration doesn’t prove that overconfidence is a stable strategy but it does affirm Woody Allen’s observation that “95% of life is just showing up.”
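A quick way to get a feel for the toy model is to evaluate the expectation E on a grid of Q values and see where it peaks. The sketch below does just that in Python; the values of P, W and C are arbitrary choices for illustration and the function names are mine.

```python
def expected_value(Q, P, W, C):
    """E = Q^2 P (W - C) + Q (1 - Q) W - Q^2 (1 - P) C from the toy model."""
    return Q**2 * P * (W - C) + Q * (1 - Q) * W - Q**2 * (1 - P) * C


def best_Q(P, W, C, steps=1000):
    """Grid search for the Q in [0, 1] that maximizes the expectation."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: expected_value(q, P, W, C))


if __name__ == "__main__":
    # Illustrative, assumed parameter values.
    for P, W, C in [(0.8, 10.0, 1.0), (0.5, 10.0, 1.0), (0.5, 10.0, 8.0)]:
        q = best_Q(P, W, C)
        print(f"P={P}, W={W}, C={C}: best Q = {q:.3f}, "
              f"E = {expected_value(q, P, W, C):.3f}")
```

With these assumed values, the grid search puts the best Q at the upper limit Q = 1 when the party usually wins (P = 0.8) and the reward is large, and below 1 when the contest is a coin flip.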

Productivity and ability

What makes some people more productive than others?  Is it innate ability, better training, hard work?  Although the meaning of productivity is subjective, there are quantifiable differences between researchers in measures of productivity such as the h-index.    Here I will argue that a small difference in ability or efficiency can lead to great differences in output.

Let’s consider a simple and admittedly flawed model of productivity.  Suppose we consider productivity to be the number of tasks you can complete and let P represent the probability that you can accomplish a  task (i.e. efficiency).  A task could be anything from completing an integral, to writing a program, to sticking an electrode into a cell, or to finishing a paper.  The probability of completing N independent tasks is T=P^N.  Conversely, the number of steps that can be completed with probability T is N = \log T/\log P.  Now let P = 1-\epsilon, where \epsilon is the failure probability.  Hence, for high efficiency (i.e. low failure rate),  we can expand the logarithm for small \epsilon and obtain N \propto \epsilon^{-1}.  The number of tasks you can complete for a given probability  is inversely proportional to your failure rate.

The rate of change in productivity with respect to efficiency increases even faster with

\frac{dN}{d P}\propto \epsilon^{-2}

Hence, small differences in efficiency can lead to large differences in the number of tasks that can be completed and the gain is more dramatic if you have higher efficiency.  For example, if you go from being 90\% efficient (i.e. \epsilon = .1) to 95\% efficient (i.e. \epsilon = .05) then you will double the number of tasks you can complete. Going from 98\% to 99\% is also a doubling in productivity.  The model clearly disregards the fact that tasks are often correlated and have different probabilities for success.  I know  some people who have great trouble in revising and resubmitting papers to get published and thus they end up having low measured productivity even though they have accomplished a lot.   However, it seems to indicate that it is always worth improving your efficiency even by a small amount.
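The doubling claims are easy to check with the exact formula N = \log T/\log P rather than the small-\epsilon approximation. A minimal Python sketch, where the target probability T = 0.5 is an arbitrary choice of mine (the ratios do not depend on it):

```python
import math


def tasks_completed(P, T=0.5):
    """Number of independent tasks N with P**N == T, i.e. N = log T / log P."""
    return math.log(T) / math.log(P)


if __name__ == "__main__":
    for low, high in [(0.90, 0.95), (0.98, 0.99)]:
        n_low, n_high = tasks_completed(low), tasks_completed(high)
        print(f"P = {low:.2f} -> N = {n_low:.1f};  "
              f"P = {high:.2f} -> N = {n_high:.1f};  ratio = {n_high / n_low:.2f}")
```

Both ratios come out close to two, as the 1/\epsilon scaling predicts.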

Some numbers for the BP leak

The Deepwater Horizon well is situated 1500 m below the surface of the Gulf of Mexico.  The hydrostatic pressure is approximately given by  the simple formula of P_a+ g\rho h where P_a = 100 \ kPa is the pressure of the atmosphere, \rho = 1 \ g/ml = 1000 \ kg/m^3   is the density of water, and g = 10 \ m/s^2 is the gravitational acceleration.  Putting the numbers together gives 1.5\times 10^7 \ kg/m s^2, which is 15000 \ kPa or about 150 times atmospheric pressure.  Hence, the oil and natural gas must be under tremendous pressure to be able to leak out of the well at all.  It’s no wonder the Top Kill operation, where mud was pumped in at high pressure, did not work.
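For reference, the pressure estimate as a few lines of Python, using the same rounded constants as above (the function name is just for illustration):

```python
def hydrostatic_pressure_pa(depth_m, rho=1000.0, g=10.0, p_atm=100e3):
    """Pressure in pascals at a given depth: P_a + g * rho * h."""
    return p_atm + g * rho * depth_m


if __name__ == "__main__":
    p = hydrostatic_pressure_pa(1500.0)
    print(f"{p / 1e3:,.0f} kPa, or about {p / 100e3:.0f} atmospheres")
```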

Currently, it is estimated that the leak rate is somewhere between 10,000 and 100,000 barrels of oil per day.  A barrel of oil is 159 litres or 0.159 cubic metres.  So basically 1600 to 16000 cubic metres of oil is leaking each day.  This amounts to a cube with sides of about 12 metres for the lower value and 25 metres for the upper one, which is about the length of a basketball court.  However, assuming that the oil forms a layer on the surface of the ocean that is 0.001 mm thick, each day’s leak corresponds to a slick with an area between 1,600 and 16,000 square kilometres.  Given that the leak has been going on for almost two months, a layer that thin would by now cover roughly 100,000 to 1,000,000 square kilometres.  The Gulf of Mexico is about 1.6 million square kilometres, so this implies that the slick is either much thicker than that, oil has started to wash up on shore, or a lot of the oil is still under the surface.
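Here is the leak arithmetic as a short Python sketch so the assumptions can be varied. The 0.001 mm film thickness is the assumption from the text; the function names are mine.

```python
BARREL_M3 = 0.159  # cubic metres per barrel of oil


def daily_volume_m3(barrels_per_day):
    """Volume of oil leaked per day in cubic metres."""
    return barrels_per_day * BARREL_M3


def cube_side_m(volume_m3):
    """Side length of a cube with the given volume."""
    return volume_m3 ** (1.0 / 3.0)


def slick_area_km2(volume_m3, thickness_m=1e-6):
    """Area covered if the oil spreads into a uniform film of the given thickness."""
    return volume_m3 / thickness_m / 1e6  # 1e6 square metres per square kilometre


if __name__ == "__main__":
    for rate in (10_000, 100_000):
        v = daily_volume_m3(rate)
        print(f"{rate:,} barrels/day: {v:,.0f} m^3, cube side {cube_side_m(v):.1f} m, "
              f"film area {slick_area_km2(v):,.0f} km^2 per day")
```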

Cost of health care

The New York Times had a nice summary of what is known as Baumol’s cost disease as an explanation of why health care costs will always rise faster than inflation.  The explanation is quite elegant in my opinion and can also explain why costs for education and arts will also keep increasing at a rapid rate.  The example Baumol (and his colleague Bowen) use is that it takes the same number of people to play a Mozart string quartet as it did in the 18th century and yet musicians are paid so much more now.  Hence, the cost of music has increased with no corresponding increase in productivity.  Generally, wages should only increase because of a net gain in productivity.  For example, a manufacturing plant today has far fewer people than a century ago but they get paid more and produce more.  However, a violinist today is exactly as efficient as she was a century ago.  Baumol argued that it was competition with other sectors of the economy that allowed the wages of artists to go up.  If you didn’t give musicians a living wage then there would be no musicians.

Applied to the health care industry, the implication is that medicine is just as labour intensive and no more productive than it was before, yet the salaries keep going up.   I think this is not quite correct and it is the complement or corollary of cost disease, which I’ll call productivity disease, that is the culprit for health care cost increases.  Health care is substantially more productive and efficient than before but this increase in productivity does not decrease cost but increases it.  For example, consider the victims of a car crash.  Fifty years ago, they would probably just die and there would be no health care costs.  Now, they are evacuated by emergency personnel who resuscitate them on the way to hospital where they are given blood transfusions, undergo surgery, etc.  If they survive, they may then require months or years of physical therapy, expensive medication and so forth.  The increase in productivity leads to more health care and an increase in cost.  Hence, the better the health care industry gets at keeping you alive, the more expensive it becomes.

I feel that the panic over the rapid increase in health care costs is misplaced.  Why shouldn’t a civilized society be spending 90% of its GDP on health care?  After all, what is more important, being healthy or having a flat screen TV?  I do think that eventually cost disease, or productivity disease, will saturate.  Perhaps we are at the steepest part of the cost curve right now because our health care system is good at keeping people alive but does not make them well enough so that they don’t need extended and expensive care.  If technology increased to the point that illness and injury could be treated instantly then costs would surely level off or even decrease.  For example, a minor cut a century ago could lead to a severe infection that could require hospitalization and expensive treatment, and result in death.  Now, you can treat a cut at home with some antibiotic ointment and a bandage.   We can certainly try to restrain some abuses and overuse of the health care system by changing the incentive structure but perhaps we should also accept that a continuous increase in health care costs is inevitable and even desirable.

Connecting the dots

As I posted previously, I am highly skeptical that any foolproof system can be developed to screen for potential threats.  My previous argument was that in order to have a zero false negative rate for identifying terrorists, it would be impossible to not also have a relatively significant false positive rate.  In other words, the only way to guarantee that a terrorist doesn’t board a plane is to not let anyone board.  A way to visualize this graphically is with what is called a receiver operating characteristic or ROC curve, which is a plot of the true positive rate versus the false positive rate for a binary classification test as some parameter, usually a discrimination threshold, is changed.  Ideally, one would like a curve that jumps to a true positive rate of 1 for zero false positive rate.  The area under the ROC curve (AROC) is the usual measure for how good a discriminator is.  So a perfect discriminator has AROC = 1.  In my experience with biological systems, it is pretty difficult to make a test with an AROC of greater than 0.9.    Additionally, ROC curves are usually somewhat smooth so that they only reach true positive rate = 1 at false positive rate = 1.
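To make the ROC curve concrete, here is a small Python sketch that sweeps a threshold over test scores and computes the area under the curve with the trapezoid rule. The Gaussian score distributions in the example are an assumption chosen purely for illustration, and the function names are mine.

```python
import random


def roc_points(neg_scores, pos_scores):
    """(false positive rate, true positive rate) pairs as the threshold is lowered."""
    thresholds = sorted(set(neg_scores) | set(pos_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        # Everything scoring at or above the threshold is flagged as positive.
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    random.seed(0)
    # Assumed score distributions: positives are shifted up by 1.5 standard deviations.
    negatives = [random.gauss(0.0, 1.0) for _ in range(200)]
    positives = [random.gauss(1.5, 1.0) for _ in range(200)]
    print(f"AROC = {area_under_curve(roc_points(negatives, positives)):.3f}")
```

The more the two score distributions overlap, the closer the area gets to 0.5, which is what a coin flip achieves.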

Practicalities aside, is there any mathematical reason why a perfect or near perfect discriminator couldn’t be designed?  This to me is the more interesting question.  My guess is that deciding if a person is a terrorist is an NP-hard question, which is why it is so insidious.   For the hard problems in NP, it is simple to verify an answer but hard to find one.   Connecting all the dots to show that someone is a terrorist is a straightforward matter if you already know that they are a terrorist.  This is also true of proving the Riemann Hypothesis or solving the 3D Ising model.  The solution is obvious if you know the answer. If terrorist finding is NP-hard, then for a large enough population, and I think 5 billion qualifies, no method or achievable amount of computational power is sufficient to do the job perfectly.

Was the stimulus too slow?

Paul Krugman seems to think so.  As I posted five months ago, if we use the analogy of persistent activity in the brain for the economy then the total amount of stimulus would only be one variable that is important in knocking the economy out of a recession and into a new equilibrium point.  Another variable is how fast the money is spent.  Thus far, only 30% of the stimulus has been disbursed and Krugman thinks that the impact of it has already been maximized.  The rest of the money will then be dissipated without a stimulatory effect.

Screening for terrorists

The recent tragedy at Fort Hood has people griping about missed signals that could have been used to prevent the attack.  However, I will argue that it is likely to be impossible to ever have a system that can screen out all terrorists without also flagging a lot of innocent people.  The calculation is a simple exercise in probability theory that is often given in first year statistics classes.

Suppose we have a system in place that gives a yes (Y) or no response to whether or not a person is a terrorist (T).  Let P(T) be the probability that a given person is a terrorist and P(T|Y) be the probability that a person is a terrorist given that the test said yes.  Thus P(~T|Y)=1-P(T|Y) is the probability that one is not a terrorist even though the test said so.  Using Bayes’ theorem we have that

P(~T|Y)=P(Y|~T) P(~T)/P(Y)  (*)

where P(Y)=P(Y|T)P(T) + P(Y|~T)P(~T) is the probability of getting a yes result.   Now, the probability of being a terrorist is very low.   Out of the 300 million or so people in the US a small number are probably potential terrorists.  The US military has over a million people on active service.   Hence, the probability of not being a terrorist is very high.

From (*),  we see that in order to have a low probability of flagging an innocent person we need to have  P(Y|~T)P(~T)<< P(Y|T)P(T), or P(Y|~T)<< P(Y|T) P(T)/P(~T).  Since  P(T) is very small, P(T)/P(~T)~ P(T),   so if the true positive probability P(Y|T) was near one (i.e. a test that catches all terrorists), we need the false positive probability P(Y|~T) to be much smaller than the probability of having a terrorist, which means we need a test that gives false positives at a rate of less than 1 in a million.  The problem is that the true positive and false positive probabilities will be correlated.  The more sensitive the test the more likely it is to get a false positive.  So if you set your threshold to be very low so P(Y|T) is very high (i.e. make sure you never miss a terrorist), you’ll most certainly have P(Y|~T) to also be high.  I doubt you’ll ever have a test where P(Y|T) is near one while P(Y|~T) is less than one in a million.   So basically, if we want to catch all the terrorists, we’ll also have to flag a lot of innocent people.
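Plugging some numbers into (*) makes the point vivid. In the Python sketch below the prior of one in a million and the two false positive rates are assumed values for illustration only, and the function name is mine.

```python
def prob_innocent_given_flag(p_terrorist, true_positive_rate, false_positive_rate):
    """P(~T|Y) from Bayes' theorem, equation (*)."""
    p_not = 1.0 - p_terrorist
    # P(Y) = P(Y|T) P(T) + P(Y|~T) P(~T)
    p_yes = true_positive_rate * p_terrorist + false_positive_rate * p_not
    return false_positive_rate * p_not / p_yes


if __name__ == "__main__":
    p_t = 1e-6  # assumed prior: one person in a million
    for fpr in (1e-2, 1e-6):
        p = prob_innocent_given_flag(p_t, true_positive_rate=1.0, false_positive_rate=fpr)
        print(f"false positive rate {fpr:g}: P(innocent | flagged) = {p:.4f}")
```

Even with a test that never misses, a one percent false positive rate means almost everyone flagged is innocent, and the false positive rate has to fall to the level of the prior before a flag is even a coin flip.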

Energy efficiency and boiling water

I’ve noticed that my last few posts have been veering towards the metaphysical so I thought today I would talk about some kitchen science, literally. The question is what is the most efficient way to boil water.  Should one turn the heat on the stove to the maximum or is there some mid-level that should be used?  I didn’t know what the answer was so I tried to calculate it.  The answer turned out to be more subtle than I anticipated.

Being wrong may be rational

This past Sunday, economist Paul Krugman was lamenting in a book review of Justin Fox’s book “Myth of the Rational Market” (which he liked very much) that despite this current financial crisis and previous crises, like the failure of the hedge fund Long Term Capital Management, people still believe in efficient markets as strongly as ever. The efficient market hypothesis is the basis of most of modern finance and assumes that the price of a security is always correct and that you can never beat the market.  So artificial bubbles should never occur.  Krugman wonders what it will take to ever change people’s minds.

I want to show here that there might be no amount of evidence that will ever change their minds and they can still be perfectly rational in the Bayesian sense.  The argument can also apply to all other controversial topics.  I think it is generally believed in intellectual circles that the reason there is so much disagreement on these issues is that the other side is either stupid, deluded or irrational.   I want to point out that believing in something completely wrong even in the face of overwhelming evidence may arise in perfectly rational beings.  That is not to say that faulty reasoning does not exist or cannot be dangerous.  It just explains why two perfectly reasonable and intelligent people can disagree so alarmingly.

The mass of humanity

Ever since Malthus, there has been a concern about overpopulation.  I thought it would be an interesting exercise to see how much space the human population actually takes up.  For example, how many oil tankers would it take to carry around the volume of humanity if converted to liquid?  Let’s say there are 6 billion people on the planet and the average mass per person is 100 kg (this is an overestimate).  Hence, the upper bound on the mass of humanity is 10^{12} kg, or a billion metric tons.  Given that we are mostly water, we can assume that this is about 10^{12} litres.  Taking the cube root gives 10^4\times 0.1 metres or a kilometre (a litre is a cube 0.1 metres on a side).  Thus, if we liquefied the mass of all humans, it would fit in a cube whose sides are a kilometre long.   The largest oil tankers can carry about five hundred thousand metric tons, so two thousand oil tankers could cart around all of humanity.  To put that into perspective, according to Wikipedia, the current fleet of oil tankers moves around 2 billion metric tons a year, so all of humanity amounts to about half of what the fleet carries in a year.

Now, how much area would we take up if we were to stand side by side?  Let’s say 6 people can fit into a square metre of space, then we would all be able to fit into a billion square metres or 1000 square kilometres, about the size of Hong Kong (according to Wolfram Alpha), or we could all fit 4 to a square metre onto the island of Oahu in Hawaii.  If we each wanted about 100 square metres of space, then we would take up about 600,000 square kilometres or roughly the area of France.  Wolfram Alpha also tells me that there is about 1.5\times 10^7 square kilometres of arable land in the world.  If we assume that a square kilometre can feed 1000 people (10 people per hectare), then that puts the capacity of the earth at 15 billion people.
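All of these estimates fit in a few lines of Python. This sketch uses 6 billion people at 100 kg each directly, without rounding the total mass up to 10^{12} kg as I did above, so the cube and tanker numbers come out a bit smaller. The function names are mine.

```python
POPULATION = 6e9  # people, as assumed in the text


def cube_side_km(mass_per_person_kg=100.0):
    """Side of a cube holding everyone, treating 1 kg of person as 1 litre."""
    volume_m3 = POPULATION * mass_per_person_kg / 1000.0  # 1000 litres per cubic metre
    return volume_m3 ** (1.0 / 3.0) / 1000.0


def tankers_needed(mass_per_person_kg=100.0, tanker_capacity_tonnes=5e5):
    """Number of the largest oil tankers needed to carry the total mass."""
    total_tonnes = POPULATION * mass_per_person_kg / 1000.0
    return total_tonnes / tanker_capacity_tonnes


def standing_area_km2(people_per_m2):
    """Area needed to stand everyone side by side."""
    return POPULATION / people_per_m2 / 1e6


def living_area_km2(m2_per_person=100.0):
    """Area needed if everyone gets a fixed plot."""
    return POPULATION * m2_per_person / 1e6


def carrying_capacity(arable_km2=1.5e7, people_fed_per_km2=1000.0):
    """People who can be fed from the world's arable land."""
    return arable_km2 * people_fed_per_km2


if __name__ == "__main__":
    print(f"cube side: {cube_side_km():.2f} km")
    print(f"tankers: {tankers_needed():,.0f}")
    print(f"standing, 6 per m^2: {standing_area_km2(6):,.0f} km^2")
    print(f"standing, 4 per m^2: {standing_area_km2(4):,.0f} km^2")
    print(f"100 m^2 each: {living_area_km2():,.0f} km^2")
    print(f"carrying capacity: {carrying_capacity():.1e} people")
```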

Why so slow?

John Tierney of the New York Times shows a figure from Ray Kurzweil of a log-log plot of the time between major changes in history, from the appearance of life and multicellular organisms to new technologies like televisions and computers. His graph shows power law scaling with an exponent of negative one, which I obtained by eyeballing the curve. In other words, if dT is the time to the appearance of the next great change then it scales as 1/T where T is the time. I haven’t read Kurzweil’s book so maybe I’m misinterpreting the graph. The fact that there is scaling over such a long time is interesting but I want to discuss a different point. Let’s take the latter part of the curve regarding technological innovation. Kurzweil’s argument is that the pace of change is accelerating so we’ll soon be enraptured in the Singularity (see previous post). However, the rate of appearance of new ideas seems to be only increasing linearly with T. So the number of new ideas is accumulating as T^2, which is far from exponential. Additionally, the population is increasing exponentially (at least in the last few hundred years). Hence the number of ideas per person is obeying T^2 \exp(-T). I’m not sure where we are on the curve but after an initial increase, the number of ideas per person actually decreases exponentially. I was proposing in the last post that the number of good ideas was scaling with the population but according to Kurzweil I was being super optimistic. Did I make a mistake somewhere?
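To see the shape of the ideas-per-person curve, here is a tiny Python sketch. It assumes time is measured in units of the population e-folding time and sets all proportionality constants to one, neither of which comes from Kurzweil's graph; the function name is mine.

```python
import math


def ideas_per_person(t):
    """Cumulative ideas (~ t^2) divided by population (~ e^t), in arbitrary units."""
    return t**2 * math.exp(-t)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.0, 5.0, 10.0):
        print(f"t = {t:4.1f}  ideas per person = {ideas_per_person(t):.4f}")
```

Under these assumptions the curve rises, peaks at t = 2 (where the derivative 2t e^{-t} - t^2 e^{-t} vanishes), and then falls off.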