Archive for the ‘Back of the envelope’ Category

Information content of the brain revisited

March 14, 2012

My post, The gigabit machine, was reposted on the web aggregator site reddit.com recently. Aside from increasing traffic to my blog tenfold for a few days, the comments on reddit made me realize that I wasn’t completely clear in my post. The original post was about a naive calculation of the information content in the brain and how it dwarfed the information content of the genome. Here, I use the term information in the information theoretical sense, which is about how many bits must be specified to define a system. So a single light switch that turns on and off has one bit of information while ten light switches have 10 bits. If we suppose that the brain has about 10^{11} neurons, with about 10^4 connections each, then there are 10^{15} total connections. If we make the very gross assumption that each connection can be either “on” or “off”, then we arrive at 10^{15} bits. This would be a lower bound on the amount of information required to specify the brain and it is already a really huge number. The genome has 3 billion bases and each base can be one of four types or two bits, so this gives a total of 6 billion bits. Hence, the information contained in the genome is just rounding noise compared to the potential information contained in the brain. I then argued that education and training were insufficient to make up this shortfall and that most of the brain must be specified by uncontrolled events.
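The arithmetic in this estimate can be written out as a short sketch. The constant and function names below are illustrative choices, and the one-bit-per-connection assumption is the same gross simplification made in the text.

```python
import math

NEURONS = 1e11                 # rough number of neurons in the brain
CONNECTIONS_PER_NEURON = 1e4   # rough number of connections per neuron
GENOME_BASES = 3e9             # bases in the genome
BITS_PER_BASE = math.log2(4)   # four possible bases -> 2 bits each


def brain_bits():
    """Bits needed if every connection is simply on or off."""
    return NEURONS * CONNECTIONS_PER_NEURON


def genome_bits():
    """Bits needed to write down the genome base by base."""
    return GENOME_BASES * BITS_PER_BASE


if __name__ == "__main__":
    ratio = brain_bits() / genome_bits()
    print(f"brain:  {brain_bits():.1e} bits")
    print(f"genome: {genome_bits():.1e} bits")
    print(f"brain / genome: {ratio:.1e}")
```

Under these assumptions the brain estimate exceeds the genome by more than five orders of magnitude.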

The criticism I received in the comments on reddit was that this doesn’t imply that the genome did not specify the brain. An example that was brought up was the Mandelbrot set where highly complex patterns can arise from a very simple dynamical system.  I thought this was a bad example because it takes a countably infinite amount of information to specify the Mandelbrot set but I understood the point which is that a dynamical system could easily generate complexity that appears to have higher information content.  I even used such an argument to dispel the notion that the brain must be simpler than the universe in this post.  However, the key point is that the high information content is only apparent; the actual information content of a given state is no larger than that contained in the original dynamical system and initial conditions.   What this would mean for the brain is that the genome alone could in principle set all the connections in the brain but these connections are not independent.  There would be correlations or other high order statistical relationships between them.  Another way to say this is that while in principle there are 2^{10^{15}} possible brains, the genome can only specify 2^{6\times10^{9}} of them, which is still a large number.  Hence, I believe that the conclusions of my original post still hold – the connections in the brain are either set mostly by random events or they are highly correlated (statistically related).

Evolution of overconfidence

July 31, 2011

A new paper on the evolution of overconfidence (arXiv:0909.4043v2) will appear shortly in Nature. (Hat tip to J.L. Delatre). It is well known in psychology that people generally overvalue themselves and it has always been a puzzle as to why. This paper argues that under certain plausible conditions, it may have been evolutionarily advantageous to be overconfident. One of the authors is James Fowler who has garnered recent fame for claiming with Nicholas Christakis that medically noninfectious phenomena such as obesity and divorce are socially contagious. I have always been skeptical of these social network results and it seems like there has been some recent pushback. Statistician and blogger Andrew Gelman has a summary of the critiques here. The problems with these papers are in line with those of many other clinical papers that I have posted on before (e.g. see here and here). The evolution of overconfidence paper does not rely on statistics but on a simple evolutionary model.

The model considers competition between two parties for some scarce resource. Each party possesses some heritable attribute and the one with the higher value of that attribute will win a contest and obtain the resource. The model allows for three outcomes in any interaction: 1) winning a competition and obtaining the resource with value W-C (where C is the cost of competing), 2) claiming the resource without a fight with value W, and 3) losing a competition with a value -C. The parties assess their own and their opponent’s attributes before deciding to compete. If both parties had perfect information, participating in a contest would be unnecessary. Both parties would realize who would win and the stronger of the two would claim the prize. However, because of the errors and biases in assessing attributes, resources will be contested. Overconfidence is represented as a positive bias in assessing oneself. The authors chose a model that was simple enough to explicitly evaluate the outcomes of all possible situations and show that when the reward for winning is sufficiently large compared to the cost, then overconfidence is evolutionarily stable.

Here I will present a simpler toy model of why the result is plausible. Let P be the probability that a given party will win a competition on average and let Q be the probability that they will engage in a competition. Hence, Q is a measure of overconfidence.  Using these values, we can then compute the expectation value of an interaction:

E = Q^2P (W-C) + Q(1-Q) W - Q^2(1-P) C

(i.e. the probability of a competition and winning is Q^2P, the probability of  winning and not having to fight is Q(1-Q), the probability of  losing a competition is Q^2(1-P), and it doesn’t cost anything to not compete.)  The derivative of E with respect to Q is

E' = 2QP(W-C) + (1-2Q)W - 2Q(1-P)C = W - 2Q[(1-P)W + C]

Hence, we see that E' is positive, so that the expectation value increases with increasing confidence, as long as Q < W/(2[(1-P)W + C]). The larger the reward of winning is compared to the cost of competing, the higher this threshold becomes, and if (2P-1)W > 2C then the expectation value increases with confidence all the way up to Q = 1. Of course this simple demonstration doesn’t prove that overconfidence is a stable strategy but it does affirm Woody Allen’s observation that “95% of life is just showing up.”
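As a rough numerical check of the toy model, the sketch below evaluates E on a grid of Q values and locates the confidence level that maximizes it. The parameter values (P = 0.5, W = 1 and the three costs) and the function names are illustrative assumptions and are not taken from the paper.

```python
def expected_payoff(q, p, w, c):
    """Expectation value E of one interaction in the toy model."""
    fight_and_win = q * q * p * (w - c)      # both compete and we win
    claim_unopposed = q * (1.0 - q) * w      # we compete, the other does not
    fight_and_lose = q * q * (1.0 - p) * c   # both compete and we lose
    return fight_and_win + claim_unopposed - fight_and_lose


def best_confidence(p, w, c, steps=1000):
    """Grid search over Q in [0, 1] for the value that maximizes E."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: expected_payoff(q, p, w, c))


if __name__ == "__main__":
    for cost in (0.1, 0.5, 1.0):
        q_star = best_confidence(p=0.5, w=1.0, c=cost)
        print(f"C = {cost}: E is maximized near Q = {q_star:.3f}")
```

In this sketch, lowering the cost relative to the reward moves the maximizing Q upward, which is the sense in which a large reward favours confidence.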

Productivity and ability

March 11, 2011

What makes some people more productive than others? Is it innate ability, better training, hard work? Although the meaning of productivity is subjective, there are quantifiable differences between researchers in measures of productivity such as the h-index. Here I will argue that a small difference in ability or efficiency can lead to great differences in output.

Let’s consider a simple and admittedly flawed model of productivity.  Suppose we consider productivity to be the number of tasks you can complete and let P represent the probability that you can accomplish a  task (i.e. efficiency).  A task could be anything from completing an integral, to writing a program, to sticking an electrode into a cell, or to finishing a paper.  The probability of completing N independent tasks is T=P^N.  Conversely, the number of steps that can be completed with probability T is N = \log T/\log P.  Now let P = 1-\epsilon, where \epsilon is the failure probability.  Hence, for high efficiency (i.e. low failure rate),  we can expand the logarithm for small \epsilon and obtain N \propto \epsilon^{-1}.  The number of tasks you can complete for a given probability  is inversely proportional to your failure rate.

The rate of change in productivity with respect to efficiency increases even faster with

\frac{dN}{d P}\propto \epsilon^{-2}

Hence, small differences in efficiency can lead to large differences in the number of tasks that can be completed and the gain is more dramatic if you have higher efficiency.  For example, if you go from being 90\% efficient (i.e. \epsilon = .1) to 95\% efficient (i.e. \epsilon = .05) then you will double the number of tasks you can complete. Going from 98\% to 99\% is also a doubling in productivity.  The model clearly disregards the fact that tasks are often correlated and have different probabilities for success.  I know  some people who have great trouble in revising and resubmitting papers to get published and thus they end up having low measured productivity even though they have accomplished a lot.   However, it seems to indicate that it is always worth improving your efficiency even by a small amount.
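The doubling claims can be checked directly from N = \log T/\log P. The sketch below is only an illustration: the function name and the choice of T = 0.5 as the target success probability are assumptions, and the ratios do not depend on T.

```python
import math


def tasks_completed(efficiency, target_probability=0.5):
    """Number of independent tasks N with P**N equal to the target probability T."""
    return math.log(target_probability) / math.log(efficiency)


if __name__ == "__main__":
    for low, high in ((0.90, 0.95), (0.98, 0.99)):
        gain = tasks_completed(high) / tasks_completed(low)
        print(f"{low:.2f} -> {high:.2f}: tasks multiply by {gain:.2f}")
```

Both ratios come out close to two, in line with N being inversely proportional to the failure rate.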

Some numbers for the BP leak

June 3, 2010

The Deepwater Horizon well is situated 1500 m below the surface of the Gulf of Mexico.  The hydrostatic pressure is approximately given by  the simple formula of P_a+ g\rho h where P_a = 100 \ kPa is the pressure of the atmosphere, \rho = 1 \ g/ml = 1000 \ kg/m^3   is the density of water, and g = 10 \ m/s^2 is the gravitational acceleration.  Putting the numbers together gives 1.5\times 10^7 \ kg/m s^2, which is 15000 \ kPa or about 150 times atmospheric pressure.  Hence, the oil and natural gas must be under tremendous pressure to be able to leak out of the well at all.  It’s no wonder the Top Kill operation, where mud was pumped in at high pressure, did not work.
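The pressure estimate is a one-line calculation. The sketch below uses the same rounded constants as the text, and the function name is just an illustrative choice.

```python
def hydrostatic_pressure_pa(depth_m, rho=1000.0, g=10.0, p_atm=1.0e5):
    """Pressure in pascals at a given depth: P_a + g * rho * h."""
    return p_atm + g * rho * depth_m


if __name__ == "__main__":
    pressure = hydrostatic_pressure_pa(1500.0)
    print(f"{pressure / 1e3:.0f} kPa, about {pressure / 1.0e5:.0f} atmospheres")
```

Including the atmospheric term adds only about one percent to the water column's contribution.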

Currently, it is estimated that the leak rate is somewhere between 10,000 and 100,000 barrels of oil per day. A barrel of oil is 159 litres or 0.159 cubic metres. So basically 1,600 to 16,000 cubic metres of oil is leaking each day. This amounts to a cube with sides of about 12 metres for the lower value and 25 metres for the upper one, which is about the length of a basketball court. However, assuming that the oil forms a layer on the surface of the ocean that is 0.001 mm thick, this then corresponds to a slick with an area between 1,600 and 16,000 square kilometres for each day of leaking. Given that the leak has been going on for almost two months, a film that thin would by now cover roughly 100,000 to 1,000,000 square kilometres, a sizeable fraction of the Gulf of Mexico’s roughly 1.6 million square kilometres. This implies that the slick is either much thicker than that, oil has started to wash up on shore, or a lot of the oil is still under the surface.
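The conversions from barrels to volume, cube size and slick area can be sketched as follows. The function names are illustrative, and the 0.001 mm film thickness is the same assumption used in the text.

```python
BARREL_M3 = 0.159          # one barrel of oil in cubic metres
FILM_THICKNESS_M = 1.0e-6  # assumed surface film thickness (0.001 mm)


def daily_volume_m3(barrels_per_day):
    """Volume of oil leaked per day in cubic metres."""
    return barrels_per_day * BARREL_M3


def cube_side_m(volume_m3):
    """Side of a cube holding the given volume."""
    return volume_m3 ** (1.0 / 3.0)


def slick_area_km2(volume_m3):
    """Area covered if the volume spreads into a uniform thin film."""
    return volume_m3 / FILM_THICKNESS_M / 1.0e6


if __name__ == "__main__":
    for rate in (10_000, 100_000):
        volume = daily_volume_m3(rate)
        print(f"{rate} barrels/day: {volume:.0f} m^3, "
              f"cube side {cube_side_m(volume):.1f} m, "
              f"film area {slick_area_km2(volume):.0f} km^2 per day")
```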

Cost of health care

January 21, 2010

The New York Times had a nice summary of what is known as Baumol’s cost disease for an explanation of why health care costs will always rise faster than inflation. The explanation is quite elegant in my opinion and can also explain why costs for education and arts will also keep increasing at a rapid rate. The example Baumol (and his colleague Bowen) use is that it takes the same number of people to play a Mozart string quartet as it did in the 18th century and yet musicians are paid so much more now. Hence, the cost of music has increased with no corresponding increase in productivity. Generally, wages should only increase because of a net gain in productivity. Hence, a manufacturing plant today has far fewer people than a century ago but they get paid more and produce more. However, a violinist today is exactly as efficient as she was a century ago. Baumol argued that it was competition with other sectors of the economy that allowed the wages of artists to go up. If you didn’t give musicians a living wage then there would be no musicians.

Applied to the health care industry, the implication is that medicine is just as labour intensive and no more productive than it was before, yet the salaries keep going up. I think this is not quite correct and it is the complement or corollary of cost disease, which I’ll call productivity disease, that is the culprit for health care cost increases. Health care is substantially more productive and efficient than before but this increase in productivity does not decrease cost but increases it. For example, consider the victims of a car crash. Fifty years ago, they would probably just die and there would be no health care costs. Now, they are evacuated by emergency personnel who resuscitate them on the way to hospital where they are given blood transfusions, undergo surgery, etc. If they survive, they may then require months or years of physical therapy, expensive medication and so forth. The increase in productivity leads to more health care and an increase in cost. Hence, the better the health care industry gets at keeping you alive, the more expensive it becomes.

I feel that the panic over the rapid increase in health care costs is misplaced. Why shouldn’t a civilized society be spending 90% of its GDP on health care? After all, what is more important, being healthy or having a flat screen TV? I do think that eventually, cost disease or productivity disease will saturate. Perhaps we are at the steepest part of the cost curve right now because our health care system is good at keeping people alive but does not make them well enough so that they don’t need extended and expensive care. If technology increased to the point that illness and injury could be treated instantly then costs would surely level off or even decrease. For example, a minor cut a century ago could lead to a severe infection that could require hospitalization, expensive treatment and result in death. Now, you can treat a cut at home with some antibiotic ointment and a bandage. We can certainly try to restrain some abuses and overuse of the health care system by changing the incentive structure but perhaps we should also accept that a continuous increase in health care costs is inevitable and even desirable.

Connecting the dots

January 8, 2010

As I posted previously, I am highly skeptical that any foolproof system can be developed to screen for potential threats. My previous argument was that in order to have a zero false negative rate for identifying terrorists, it would be impossible to not also have a relatively significant false positive rate. In other words, the only way to guarantee that a terrorist doesn’t board a plane is to not let anyone board. A way to visualize this graphically is with what is called a receiver operating characteristic or ROC curve, which is a plot of the true positive rate versus the false positive rate for a binary classification test as some parameter, usually a discrimination threshold, is changed. Ideally, one would like a curve that jumps to a true positive rate of 1 for zero false positive rate. The area under the ROC curve (AROC) is the usual measure for how good a discriminator is. So a perfect discriminator has AROC = 1. In my experience with biological systems, it is pretty difficult to make a test with an AROC of greater than 90%. Additionally, ROC curves are usually somewhat smooth so that they only reach true positive rate = 1 at false positive rate = 1.
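To make the ROC picture concrete, here is a minimal sketch under a strong assumption that is not in the original post: the test produces a score that is normally distributed with unit variance for both groups, with the two means separated by some amount. The function names and the separation value are purely illustrative.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def roc_point(threshold, separation):
    """(false positive rate, true positive rate) when flagging scores above threshold."""
    fpr = 1.0 - normal_cdf(threshold)               # innocents score ~ N(0, 1)
    tpr = 1.0 - normal_cdf(threshold - separation)  # threats score ~ N(separation, 1)
    return fpr, tpr


def area_under_roc(separation):
    """AROC for two unit-variance normal score distributions."""
    return normal_cdf(separation / math.sqrt(2.0))


if __name__ == "__main__":
    separation = 1.8  # illustrative gap between the two score distributions
    print(f"AROC = {area_under_roc(separation):.3f}")
    for threshold in (3.0, 2.0, 1.0, 0.0, -1.0, -2.0):
        fpr, tpr = roc_point(threshold, separation)
        print(f"threshold {threshold:+.1f}: FPR = {fpr:.4f}, TPR = {tpr:.4f}")
```

In this model, lowering the threshold raises both rates together, so the true positive rate approaches 1 only as the false positive rate also becomes large.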

Practicalities aside, is there any mathematical reason why a perfect or near perfect discriminator couldn’t be designed? This to me is the more interesting question. My guess is that deciding if a person is a terrorist is an NP hard question, which is why it is so insidious. For any NP problem, it is simple to verify the answer but hard to find one. Connecting all the dots to show that someone is a terrorist is a straightforward matter if you already know that they are a terrorist. This is also true of proving the Riemann Hypothesis or solving the 3D Ising model. The solution is obvious if you know the answer. If terrorist finding is NP hard, then for a large enough population, and I think 5 billion qualifies, no method or achievable amount of computational power is sufficient to do the job perfectly.

Was the stimulus too slow

December 6, 2009

Paul Krugman seems to think so.  As I posted five months ago, if we use the analogy of persistent activity in the brain for the economy then the total amount of stimulus would only be one variable that is important in knocking the economy out of a recession and into a new equilibrium point.  Another variable is how fast the money is spent.  Thus far, only 30% of the stimulus has been disbursed and Krugman thinks that the impact of it has already been maximized.  The rest of the money will then be dissipated without a stimulatory effect.

(more…)

Screening for terrorists

November 13, 2009

The recent tragedy at Fort Hood has people griping about missed signals that could have been used to prevent the attack. However, I will argue that it is likely to be impossible to ever have a system that can screen out all terrorists without also flagging a lot of innocent people. The calculation is a simple exercise in probability theory that is often given in first year statistics classes.

Suppose we have a system in place that gives a yes Y or no response of whether or not a person is a terrorist T. Let P(T) be the probability that a given person is a terrorist and P(T|Y) be the probability that a person is a terrorist given that the test said yes. Thus P(~T|Y)=1-P(T|Y) is the probability that one is not a terrorist even though the test said so. Using Bayes’ theorem we have that

P(~T|Y)=P(Y|~T) P(~T)/P(Y)  (*)

where P(Y)=P(Y|T)P(T) + P(Y|~T)P(~T) is the probability of getting a yes result.   Now, the probability of being a terrorist is very low.   Out of the 300 million or so people in the US a small number are probably potential terrorists.  The US military has over a million people on active service.   Hence, the probability of not being a terrorist is very high.

From (*),  we see that in order to have a low probability of flagging an innocent person we need to have  P(Y|~T)P(~T)<< P(Y|T)P(T), or P(Y|~T)<< P(Y|T) P(T)/P(~T).  Since  P(T) is very small, P(T)/P(~T)~ P(T),   so if the true positive probability P(Y|T) was near one (i.e. a test that catches all terrorists), we need the false positive probability P(Y|~T) to be much smaller than the probability of having a terrorist, which means we need a test that gives false positives at a rate of less than 1 in a million.  The problem is that the true positive and false positive probabilities will be correlated.  The more sensitive the test the more likely it is to get a false positive.  So if you set your threshold to be very low so P(Y|T) is very high (i.e. make sure you never miss a terrorist), you’ll most certainly have P(Y|~T) to also be high.  I doubt you’ll ever have a test where P(Y|T) is near one while P(Y|~T) is less than one in a million.   So basically, if we want to catch all the terrorists, we’ll also have to flag a lot of innocent people.
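Equation (*) is easy to evaluate for a few hypothetical tests. In the sketch below the base rate of one in a million and the false positive rates are illustrative numbers, not estimates, and the function name is an arbitrary choice.

```python
def prob_innocent_given_flag(p_terrorist, true_positive_rate, false_positive_rate):
    """P(~T|Y) from Bayes' theorem, i.e. equation (*)."""
    p_innocent = 1.0 - p_terrorist
    p_yes = (true_positive_rate * p_terrorist
             + false_positive_rate * p_innocent)
    return false_positive_rate * p_innocent / p_yes


if __name__ == "__main__":
    base_rate = 1.0e-6  # assumed P(T), for illustration only
    for fpr in (1.0e-2, 1.0e-4, 1.0e-6, 1.0e-8):
        p = prob_innocent_given_flag(base_rate, 1.0, fpr)
        print(f"P(Y|~T) = {fpr:.0e}: P(~T|Y) = {p:.4f}")
```

Even with a test that never misses, the flagged population is dominated by innocent people until the false positive rate drops well below the base rate.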

Energy efficiency and boiling water

September 11, 2009

I’ve noticed that my last few posts have been veering towards the metaphysical so I thought today I would talk about some kitchen science, literally. The question is what is the most efficient way to boil water.  Should one turn the heat on the stove to the maximum or is there some mid-level that should be used?  I didn’t know what the answer was so I tried to calculate it.  The answer turned out to be more subtle than I anticipated.

(more…)

Being wrong may be rational

August 14, 2009

This past Sunday, economist Paul Krugman was lamenting in a book review of Justin Fox’s book “Myth of the Rational Market” (which he liked very much) that despite this current financial crisis and previous crises, like the failure of the hedge fund Long Term Capital Management, people still believe in efficient markets as strongly as ever. The efficient market hypothesis is the basis of most of modern finance and assumes that the price of a security is always correct and that you can never beat the market.  So artificial bubbles should never occur.  Krugman wonders what it will take to ever change people’s minds.

I want to show here that there might be no amount of evidence that will ever change their minds and they can still be perfectly rational in the Bayesian sense.  The argument can also apply to all other controversial topics.  I think it is generally believed in intellectual circles that the reason there is so much disagreement on these issues is that the other side is either stupid, deluded or irrational.   I want to point out that believing in something completely wrong even in the face of overwhelming evidence may arise in perfectly rational beings.  That is not to say that faulty reasoning does not exist and can be dangerous.  It just explains why two perfectly reasonable and intelligent people can disagree so alarmingly.

(more…)

The mass of humanity

June 26, 2009

Ever since Malthus, there has been a concern about overpopulation. I thought it would be an interesting exercise to see how much space the human population actually takes up. For example, how many oil tankers would it take to carry around the volume of humanity if converted to liquid? Let’s say there are 6 billion people on the planet and the average mass per person is 100 kg (this is an overestimate). Hence, the upper bound on the mass of humanity is 10^{12} kg, or a billion metric tons. Given that we are mostly water, we can assume that this is about 10^{12} litres. Taking the cube root gives 10^4\times .1 metres or a kilometre. Thus, if we liquefied the mass of all humans, it would fit in a cube whose sides are a kilometre long. The largest oil tankers can carry about five hundred thousand metric tons, so two thousand oil tankers could cart around all of humanity. To put that into perspective, according to Wikipedia, the current fleet of oil tankers moves around 2 billion metric tons a year, so all of humanity would amount to about half a year of the world fleet’s shipments.

Now, how much area would we take up if we were to stand side by side? Let’s say 6 people can fit into a square metre of space, then we would all be able to fit into a billion square metres or 1000 square kilometres, about the size of Hong Kong (according to Wolfram Alpha), or we could all fit 4 to a square metre onto the island of Oahu in Hawaii. If we each wanted about 100 square metres of space, then we would take up about a million square kilometres or about twice the area of France. Wolfram Alpha also tells me that there is about 1.5\times 10^7 square kilometres of arable land in the world. If we assume that a square kilometre can feed 1000 people (10 people per hectare), then that puts the capacity of the earth at 15 billion people.
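These estimates can be reproduced in a few lines. The sketch below uses the unrounded 6\times 10^{11} kg rather than the 10^{12} kg upper bound, so the volume and tanker figures come out somewhat smaller than the rounded ones quoted above. All names are illustrative.

```python
POPULATION = 6.0e9
MASS_PER_PERSON_KG = 100.0       # deliberate overestimate, as in the text
TANKER_CAPACITY_TONNES = 5.0e5   # largest oil tankers
ARABLE_LAND_KM2 = 1.5e7
PEOPLE_FED_PER_KM2 = 1000.0      # 10 people per hectare


def humanity_mass_tonnes():
    """Total mass of humanity in metric tons."""
    return POPULATION * MASS_PER_PERSON_KG / 1000.0


def cube_side_m(mass_tonnes):
    """Side of a cube of water with that mass (1 tonne is about 1 m^3)."""
    return mass_tonnes ** (1.0 / 3.0)


def tankers_needed(mass_tonnes):
    """Number of the largest tankers needed to carry that mass."""
    return mass_tonnes / TANKER_CAPACITY_TONNES


def standing_area_km2(people_per_m2):
    """Area needed for everyone to stand at a given density."""
    return POPULATION / people_per_m2 / 1.0e6


def carrying_capacity():
    """People that the world's arable land could feed."""
    return ARABLE_LAND_KM2 * PEOPLE_FED_PER_KM2


if __name__ == "__main__":
    mass = humanity_mass_tonnes()
    print(f"mass: {mass:.1e} tonnes, cube side {cube_side_m(mass):.0f} m")
    print(f"tankers: {tankers_needed(mass):.0f}")
    print(f"standing 6 per m^2: {standing_area_km2(6):.0f} km^2")
    print(f"standing 4 per m^2: {standing_area_km2(4):.0f} km^2")
    print(f"carrying capacity: {carrying_capacity():.1e} people")
```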

Why so slow?

June 10, 2008

John Tierney of the New York Times shows a figure from Ray Kurzweil of a log-log plot of the time between changes in history, from the appearance of life and multicellular organisms to new technologies like televisions and computers. His graph shows power law scaling with an exponent of negative one, which I obtained by eyeballing the curve. In other words, if dT is the time between the appearance of the next great change then it scales as 1/T where T is the time. I haven’t read Kurzweil’s book so maybe I’m misinterpreting the graph. The fact that there is scaling over such a long time is interesting but I want to discuss a different point. Let’s take the latter part of the curve regarding technological innovation. Kurzweil’s argument is that the pace of change is accelerating so we’ll soon be enraptured in the Singularity (see previous post). However, the rate of appearance of new ideas seems to be only increasing linearly with T. So the number of new ideas is accumulating as T^2, which is far from exponential. Additionally, the population is increasing exponentially (at least in the last few hundred years). Hence the number of ideas per person is obeying T^2 exp(-T). I’m not sure where we are on the curve but after an initial increase, the number of ideas per person actually decreases exponentially. I was proposing in the last post that the number of good ideas was scaling with the population but according to Kurzweil I was being super optimistic. Did I make a mistake somewhere?
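The shape of the ideas-per-person curve can be sketched numerically. The units below are an assumption made only for illustration: time is measured in multiples of the population's e-folding time and all prefactors are set to one.

```python
import math


def ideas_per_person(t):
    """Cumulative ideas (growing as t^2) divided by population (growing as e^t)."""
    return t * t * math.exp(-t)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(f"t = {t}: ideas per person = {ideas_per_person(t):.4f}")
```

In these units the curve rises, peaks at t = 2 (where the derivative t(2-t)e^{-t} vanishes) and then decays, which is the initial increase followed by exponential decline described above.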

