My post, The gigabit machine, was recently reposted on the web aggregator site reddit.com. Aside from increasing traffic to my blog tenfold for a few days, the comments on reddit made me realize that I wasn’t completely clear in my post. The original post was about a naive calculation of the information content in the brain and how it dwarfed the information content of the genome. Here, I use the term information in the information-theoretic sense, which is about how many bits must be specified to define a system. So a single light switch that turns on and off has one bit of information while ten light switches have 10 bits. If we suppose that the brain has $N$ neurons, with about $k$ connections each, then there are $C = Nk$ total connections. If we make the very gross assumption that each connection can be either “on” or “off”, then we arrive at $C$ bits. This would be a lower bound on the amount of information required to specify the brain and it is already a really huge number. The genome has 3 billion bases and each base can be one of four types, or two bits, so this gives a total of 6 billion bits. Hence, the information contained in the genome is just rounding noise compared to the potential information contained in the brain. I then argued that education and training were insufficient to make up this shortfall and that most of the brain must be specified by uncontrolled events.
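To make the comparison above concrete, here is a minimal sketch of the arithmetic. The genome figures (3 billion bases, two bits per base) are the ones quoted above. The neuron and connection counts are commonly quoted order-of-magnitude values that I am assuming purely for illustration, so treat the resulting ratio as a rough guide rather than a measurement.

```python
# Genome figures as stated in the text: 3 billion bases, 2 bits per base.
GENOME_BASES = 3 * 10**9
BITS_PER_BASE = 2

# ASSUMPTION (illustrative only): rough order-of-magnitude values for the
# number of neurons and the number of connections per neuron.
NEURONS = 10**11
CONNECTIONS_PER_NEURON = 10**4


def brain_bits(neurons: int, connections_per_neuron: int) -> int:
    """One bit per connection, under the crude on/off assumption."""
    return neurons * connections_per_neuron


def genome_bits(bases: int, bits_per_base: int) -> int:
    """Bits needed to write out the genome base by base."""
    return bases * bits_per_base


brain = brain_bits(NEURONS, CONNECTIONS_PER_NEURON)
genome = genome_bits(GENOME_BASES, BITS_PER_BASE)

print(f"brain:  {brain:.1e} bits")
print(f"genome: {genome:.1e} bits")
print(f"brain/genome ratio: about {brain // genome:,}")
```

Under these assumed counts the brain needs several orders of magnitude more bits than the genome provides, which is the sense in which the genome is rounding noise. Changing the assumed counts by a factor of ten in either direction does not alter that conclusion.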
The criticism I received in the comments on reddit was that this doesn’t imply that the genome did not specify the brain. An example that was brought up was the Mandelbrot set, where highly complex patterns can arise from a very simple dynamical system. I thought this was a bad example because it takes a countably infinite amount of information to specify the Mandelbrot set, but I understood the point, which is that a dynamical system could easily generate complexity that appears to have higher information content. I even used such an argument in an earlier post to dispel the notion that the brain must be simpler than the universe. However, the key point is that the high information content is only apparent; the actual information content of a given state is no larger than that contained in the original dynamical system and initial conditions. What this would mean for the brain is that the genome alone could in principle set all the connections in the brain, but these connections would not be independent. There would be correlations or other higher-order statistical relationships between them. Another way to say this is that while in principle there are $2^{C}$ possible brains, where $C$ is the total number of on/off connections, the genome's 6 billion bits can only specify $2^{6 \times 10^{9}}$ of them, which is still a large number. Hence, I believe that the conclusions of my original post still hold: the connections in the brain are either set mostly by random events or they are highly correlated (statistically related).
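The counting argument can be illustrated with a toy model. In the sketch below, a short "genome" of a few bits is expanded by a fixed deterministic rule into a longer on/off "wiring" pattern. The sizes, the XOR rule, and the names `develop`, `SEED_BITS`, and `WIRING_BITS` are all hypothetical choices of mine for illustration, not a model of real development. The point is only that a deterministic expansion of a short seed can never reach more patterns than there are seeds, and that the bits it produces are tied to one another.

```python
from itertools import product

SEED_BITS = 4     # hypothetical "genome" size in bits
WIRING_BITS = 12  # hypothetical number of on/off connections


def develop(seed, n_out):
    """Deterministically expand a short seed into a longer wiring pattern."""
    wiring = list(seed)
    while len(wiring) < n_out:
        # Toy rule: each new connection is the XOR of two earlier ones,
        # so later bits are completely determined by earlier bits.
        wiring.append(wiring[-1] ^ wiring[-len(seed)])
    return tuple(wiring)


# Enumerate every possible seed and collect the wiring patterns it produces.
reachable = {
    develop(seed, WIRING_BITS) for seed in product((0, 1), repeat=SEED_BITS)
}
possible = 2 ** WIRING_BITS

print(len(reachable), "reachable of", possible, "possible")
# prints: 16 reachable of 4096 possible
```

Each of the 16 reachable patterns looks like 12 bits of wiring, but it carries only 4 bits of information, because the last 8 bits are fixed once the first 4 are known. The other 4080 patterns can only be reached if something outside the seed, such as random events, sets some of the bits.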