Confusion about consciousness

I have read two essays in the past month on the brain and consciousness and I think both point to examples of why consciousness per se and the “problem of consciousness” are both so confusing and hard to understand. The first article is by philosopher Galen Strawson in The Stone series of the New York Times. Strawson takes issue with the supposed conventional wisdom that consciousness is extremely mysterious and cannot be easily reconciled with materialism. He argues that the problem isn’t about consciousness, which is certainly real, but rather matter, for which we have no “true” understanding. We know what consciousness is since that is all we experience but physics can only explain how matter behaves. We have no grasp whatsoever of the essence of matter. Hence, it is not clear that consciousness is at odds with matter since we don’t understand matter.

I think Strawson’s argument is mostly sound but he misses the crucial open question of consciousness. It is true that we don’t have an understanding of the true essence of matter and we probably never will but that is not why consciousness is mysterious. The problem is that we do not know whether the rules that govern matter, be they classical mechanics, quantum mechanics, statistical mechanics, or general relativity, could give rise to a subjective conscious experience. Our understanding of the world is good enough for us to build bridges, cars, computers and launch a spacecraft 4 billion kilometers to Pluto, take photos, and send them back. We can predict the weather with great accuracy for up to a week. We can treat infectious diseases and repair the heart. We can breed super chickens and grow copious amounts of corn. However, we have no idea how these rules can explain consciousness and more importantly we do not know whether these rules are sufficient to understand consciousness or whether we need a different set of rules or reality or whatever. One of the biggest lessons of the twentieth century is that knowing the rules does not mean you can predict the outcome of the rules. Even without taking into account the computability and decidability results of Turing and Gödel, it is still not clear how to go from the microscopic dynamics of molecules to the Navier-Stokes equation for macroscopic fluid flow and how to get from Navier-Stokes to the turbulent flow of a river. Likewise, it is hard to understand how the liver works, much less the brain, starting from molecules or even cells. Thus, it is possible that consciousness is an emergent phenomenon of the rules that we already know, like wetness or a hurricane. We simply do not know and are not even close to knowing. This is the hard problem of consciousness.

The second article is by psychologist Robert Epstein in the online magazine Aeon. In this article, Epstein rails against the use of computers and information processing as a metaphor for how the brain works. He argues that this type of restricted thinking is why we can’t seem to make any progress understanding the brain or consciousness. Unfortunately, Epstein seems to completely misunderstand what computers are and what information processing means.

Firstly, a computation does not necessarily imply a symbolic processing machine like a von Neumann computer with a central processor, memory, inputs and outputs. A computation in the Turing sense is simply about finding or constructing a desired function from one countable set to another. Now, the brain certainly performs computations; any time we identify an object in an image or have a conversation, the brain is performing a computation. You can couch it in whatever language you like but it is a computation. Additionally, the whole point of a universal computer is that it can perform any computation. Computations are not tied to implementations. I can always simulate whatever (computable) system you want on a computer. Neural networks and deep learning are not symbolic computations per se but they can be implemented on a von Neumann computer. We may not know what the brain is doing but it certainly involves computation of some sort. Anything that can sense the environment and react is making a computation. Bacteria can compute. Molecules compute. However, that is not to say that everything a brain does can be encapsulated by Turing universal computation. For example, Penrose believes that the brain is not computable although as I argued in a previous post, his argument is not very convincing. It is possible that consciousness is beyond the realm of computation and thus would entail very different physics. However, we have yet to find an example of a real physical phenomenon that is not computable.

Secondly, the brain processes information by definition. Information in both the Shannon and Fisher senses is a measure of uncertainty reduction. For example, in order to meet someone for coffee you need at least two pieces of information, where and when. Before you received that information your uncertainty was huge since there were so many possible places and times the meeting could take place. After receiving the information your uncertainty was eliminated. Just knowing it will be on Thursday is already a big decrease in uncertainty and an increase in information. Much of the brain’s job, at least for cognition, is about uncertainty reduction. When you are searching for your friend in the crowded cafe, you are eliminating possibilities and reducing uncertainty. The big mistake that Epstein makes is conflating an example with the phenomenon. Your brain does not need to function like your smartphone to perform computations or information processing. Computation and information theory are two of the most important mathematical tools we have for analyzing cognition.
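
To put rough numbers on the coffee example, here is a minimal sketch in Python. The counts of candidate places, days and time slots are made up for illustration, and every possibility is assumed to be equally likely, in which case the Shannon uncertainty is just the base-2 logarithm of the number of possibilities.

```python
import math

def uncertainty_bits(n_possibilities):
    """Shannon entropy, in bits, of a uniform choice among n possibilities."""
    return math.log2(n_possibilities)

# Hypothetical numbers: 16 candidate cafes, 7 days, 8 one-hour slots per day.
n_places, n_days, n_slots = 16, 7, 8

before = uncertainty_bits(n_places * n_days * n_slots)  # nothing known yet
after_day = uncertainty_bits(n_places * n_slots)        # told "Thursday"
after_all = uncertainty_bits(1)                         # told where and when

print(f"before any message:      {before:.2f} bits")
print(f"after learning the day:  {after_day:.2f} bits")
print(f"gained from the day:     {before - after_day:.2f} bits")
print(f"after place and time:    {after_all:.2f} bits")
```

Learning the day alone removes log2(7) bits here, and learning both place and time removes all of the uncertainty, which is the sense in which receiving information and reducing uncertainty are the same thing.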

New paper in PLoS Comp Bio

Shashaank Vattikuti, Phyllis Thangaraj, Hua W. Xie, Stephen J. Gotts, Alex Martin, Carson C. Chow. Canonical Cortical Circuit Model Explains Rivalry, Intermittent Rivalry, and Rivalry Memory. PLoS Computational Biology (2016).


It has been shown that the same canonical cortical circuit model with mutual inhibition and a fatigue process can explain perceptual rivalry and other neurophysiological responses to a range of static stimuli. However, it has been proposed that this model cannot explain responses to dynamic inputs such as found in intermittent rivalry and rivalry memory, where maintenance of a percept when the stimulus is absent is required. This challenges the universality of the basic canonical cortical circuit. Here, we show that by including an overlooked realistic small nonspecific background neural activity, the same basic model can reproduce intermittent rivalry and rivalry memory without compromising static rivalry and other cortical phenomena. The background activity induces a mutual-inhibition mechanism for short-term memory, which is robust to noise and where fine-tuning of recurrent excitation or inclusion of sub-threshold currents or synaptic facilitation is unnecessary. We prove existence conditions for the mechanism and show that it can explain experimental results from the quartet apparent motion illusion, which is a prototypical intermittent rivalry stimulus.

Author Summary

When the brain is presented with an ambiguous stimulus like the Necker cube or what is known as the quartet illusion, the perception will alternate or rival between the possible interpretations. There are neurons in the brain whose activity is correlated with the perception and not the stimulus. Hence, perceptual rivalry provides a unique probe of cortical function and could possibly serve as a diagnostic tool for cognitive disorders such as autism. A mathematical model based on the known biology of the brain has been developed to account for perceptual rivalry when the stimulus is static. The basic model also accounts for other neural responses to stimuli that do not elicit rivalry. However, these models cannot explain illusions where the stimulus is intermittently switched on and off and the same perception returns after an off period because there is no built-in mechanism to hold the memory. Here, we show that the inclusion of experimentally observed low-level background neural activity is sufficient to explain rivalry for static inputs, and rivalry for intermittent inputs. We validate the model with new experiments.


This paper is the latest of a continuing series of papers outlining how a canonical cortical circuit of excitatory and inhibitory cells can explain psychophysical and electrophysiological data of perceptual and cortical dynamics under a wide range of stimuli and conditions. I’ve summarized some of the work before (e.g. see here). In this particular paper, we show how the same circuit previously shown to explain winner-take-all behavior, normalization, and oscillations at various time scales, can also possess memory in the absence of input. Previous work has shown that if you have a circuit with effective mutual inhibition between two pools representing different percepts and include some type of fatigue process such as synaptic depression or spike frequency adaptation, then the circuit exhibits various dynamics depending on the parameters and input conditions. If the inhibition strength is relatively low and the two pools receive equal inputs then the model will have a symmetric fixed point where both pools are equally active. As the inhibition strength (or input strength) increases, then there can be a bifurcation to oscillations between the two pools with a frequency that is dependent on the strengths of inhibition, recurrent excitation, input, and the time constant of the fatigue process. A further increase in inhibition leads to a bifurcation to a winner-take-all (WTA) state where one of the pools dominates the other. However, the same circuit would not be expected to possess “rivalry memory”, where the same percept returns after the stimulus is completely removed for a duration that is long compared to the average oscillation period (dominance time). The reason is that during rivalry, the dominant pool is weakened while the suppressed pool is strengthened by the fatigue process. Thus when the stimulus is removed and returned, the suppressed pool would be expected to win the competition and become dominant. This reasoning had led people, including myself, to believe that rivalry memory could not be explained by this same model.

However, one thing Shashaank observed and that I hadn’t really noticed before was that the winner-take-all state can persist for arbitrarily low input strength. We prove a little theorem in the paper showing that if the gain function (or FI curve) is concave (i.e. does not bend up), then the winner-take-all will persist for arbitrarily low input if the inhibition is strong enough. Most importantly, the input does not need to be tuned and could be provided by the natural background activity known to exist in the brain. Even zero mean noise is sufficient to maintain the WTA state. This low-activity WTA state can then serve as a memory since whatever was active during a state with strong input can remain active when the input is turned off and the neurons just receive low level background activity. It is thus a memory maintained purely by mutual inhibition. We dubbed this “topological memory” because it is like a kink in the carpet that never disappears and persists over a wide range of parameter values and input strengths. Although we only consider rivalry memory in this paper, the mechanism could also apply in other contexts such as working memory. In this paper, we also focus on a specific rivalry illusion called the quartet illusion, which makes the model slightly more complicated but we show how it naturally reduces to a two pool model. We are currently finishing a paper quantifying precisely how excitatory and inhibitory strengths affect rivalry and other cortical phenomena so watch this space. We also have submitted an abstract to neuroscience demonstrating how you can get WTA and rivalry in a balanced-state network.
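
The flavor of the mechanism can be seen in a stripped-down toy, sketched below in Python. This is not the model in the paper: it drops the fatigue process and the noise, uses two rate units with a square-root gain (concave above threshold), and all parameter values are assumptions chosen for illustration. The winning pool settles at the square root of the drive, and the loser stays silent as long as the inhibition strength exceeds that value, so the WTA state survives when the drive drops from a strong stimulus to a weak background.

```python
import math

def gain(x):
    """Toy gain function: zero below threshold, square root (concave) above."""
    return math.sqrt(x) if x > 0 else 0.0

def simulate(u1, u2, drive, beta, steps=2000, dt=0.01):
    """Euler-integrate two rate pools coupled only by mutual inhibition."""
    for _ in range(steps):
        du1 = -u1 + gain(drive - beta * u2)
        du2 = -u2 + gain(drive - beta * u1)
        u1, u2 = u1 + dt * du1, u2 + dt * du2
    return u1, u2

def remembered_state(u1, u2, beta=2.0, strong=1.0, background=0.01):
    """Strong stimulus to both pools, then only weak background drive."""
    u1, u2 = simulate(u1, u2, strong, beta)      # stimulus on: WTA forms
    return simulate(u1, u2, background, beta)    # stimulus off: background only

pool1_won = remembered_state(0.8, 0.2)  # pool 1 starts with the advantage
pool2_won = remembered_state(0.2, 0.8)  # pool 2 starts with the advantage

print("pool 1 dominant before removal:", pool1_won)
print("pool 2 dominant before removal:", pool2_won)
```

In both runs the pool that was dominant under the strong stimulus is still the only active pool under the weak background drive, at a much lower rate, which is the sense in which the state is remembered. Whether this survives fatigue and noise is exactly what the paper addresses and what this toy leaves out.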


Update: link to paper is fixed.

Commentary on the Blue Brain Project

Definitely read Christof Koch and Michael Buice’s commentary on the Blue Brain Project paper in Cell. They nicely summarize all the important points of the paper and propose a Turing Test for models. The performance of a model can be assessed by how long it would take an experimenter to figure out if the data from proposed neurophysiological experiments was coming from a model or the real thing. I think that this is a nice idea but there is one big difference between the Turing Test for artificial intelligence and one for brain simulations: everyone has an innate sense of what it means to be human but no one knows what a real brain should be doing. In that sense, it is not really a Turing Test per se but rather the replication of experiments in a more systematic way than is done now. You do an experiment on a real brain then repeat it on the model and see if they get comparable results.

Big blue brain

Appearing in this week’s edition of Cell is a paper summarizing the current status of Henry Markram’s Blue Brain Project. You can download the paper for free until Oct 22 here. The paper reports on a statistically accurate morphological and electrophysiological reconstruction of a rat somatosensory cortex. I think it is a pretty impressive piece of work. They first did a survey of cortex (14 thousand recorded and labeled neurons) to get probability distributions for various types of neurons and their connectivities. The neurons are classified according to their morphology (55 m-types), electrophysiology (11 e-types), and synaptic dynamics (6 s-types). The neurons are connected according to an algorithm outlined in a companion paper in Frontiers in Computational Neuroscience that reproduces the measured connectivity distribution. They then created a massive computer simulation of the reconstructed circuit and show that it has interesting dynamics and can reproduce some experimentally observed behaviour.

Although much of the computational neuroscience community has not really rallied behind Markram’s mission, I’m actually more sanguine about it now. Whether the next project to do the same for the human brain is worth a billion dollars, especially if this is a zero sum game, is another question. However, it is definitely a worthwhile pursuit to systematically catalogue and assess what we know now. Similarly, IBM’s Watson did not really invent any new algorithms per se, yet it clearly changed how we perceive machine learning by showing what can be done if enough resources are put into it. One particularly nice thing the project has done is to provide a complete set of calibrated models for all types of cortical neurons. I will certainly be going to their database to get the equations for spiking neurons in all of my future models. I think one criticism they will face is that their model basically produced what they put in but to me that is a feature not a bug. A true complete description of the brain would be a joint probability distribution for everything in the brain. This is impossible to compute in the near future no matter what scale you choose to coarse grain over. No one really believes that we need all this information and thus the place to start is to assume that the distribution completely factorizes into a product of independent distributions. We should at least see if this is sufficient and this work is a step in that direction.
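
A toy count makes the gap between a full joint distribution and a fully factorized one concrete. The sketch below is only an illustration under a made-up setup: n discrete variables with the same number of states each, which is an assumption and not a description of the Blue Brain model.

```python
def joint_table_size(n_variables, n_states):
    """Free parameters in a full joint distribution over discrete variables."""
    return n_states ** n_variables - 1

def factorized_size(n_variables, n_states):
    """Free parameters when the joint is a product of independent marginals."""
    return n_variables * (n_states - 1)

# Hypothetical binary variables; the counts are illustrative only.
for n in (10, 20, 40):
    print(n, joint_table_size(n, 2), factorized_size(n, 2))
```

The joint table grows exponentially with the number of variables while the factorized description grows linearly, which is why starting from the factorized assumption and testing whether it is sufficient is the practical place to begin.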

However, the one glaring omission in the current rendition of this project is an attempt to incorporate genetic and developmental information. A major constraint in how much information is needed to characterize the brain is how much is contained in the genome. How much of what determines a neuron type and its location is genetically coded, determined by external inputs, or is just random? When you see great diversity in something there are two possible answers: 1) the details matter a lot or 2) details do not matter at all. I would want to know the answer to this question first before I tried to reproduce the brain.

Guest Post: On Proving Too Much in Scientific Data Analysis

I asked Rick Gerkin to write a summary of his recent eLife paper commenting on a much hyped Science paper on how many odours we can discriminate.

On Proving Too Much in Scientific Data Analysis

by Richard C. Gerkin

First off, thank you to Carson for inviting me to write about this topic.

Last year, Science published a paper by a group at Rockefeller University claiming that humans can discriminate at least a trillion smells. This was remarkable and exciting because, as the authors noted, there are far fewer than a trillion mutually discriminable colors or pure tones, and yet olfaction has been commonly believed to be much duller than vision or audition, at least in humans. Could it in fact be much sharper than the other senses?

After the paper came out in Science, two rebuttals were published in eLife. The first was by Markus Meister, an olfaction and vision researcher and computational neuroscientist at Caltech. My colleague Jason Castro and I had a separate rebuttal. The original authors have also posted a re-rebuttal of our two papers (mostly of Meister’s paper), which has not yet been peer reviewed. Here I’ll discuss the source of the original claim, and the logical underpinnings of the counterclaims that Meister, Castro, and I have made.

How did the original authors support their claim in the Science paper? Proving this claim by brute force would have been impractical, so the authors selected a representative set of 128 odorous molecules and then tested a few hundred random 30-component mixtures of those molecules. Since many mixture stimuli can be constructed in this way but only a small fraction can be practically tested, they tried to extrapolate their experimental results to the larger space of possible mixtures. They relied on a statistical transformation of the data, followed by a theorem from the mathematics of error-correcting codes, to estimate — from the data they collected — a lower bound on the actual number of discriminable olfactory stimuli.

The two rebuttals in eLife are mostly distinct from one another but have a common thread: both effectively identify the Science paper’s analysis framework with the logical fallacy of ‘proving too much’, which can be thought of as a form of reductio ad absurdum. An argument ‘proves too much’ when it (or an argument of parallel construction) can prove things that are known to be false. For example, the 11th century theologian St. Anselm’s ontological argument [ed note: see previous post] for the existence of god states (in abbreviated form): “God is the greatest possible being. A being that exists is greater than one that doesn’t. If God does not exist, we can conceive of an even greater being, that is one that does exist. Therefore God exists”. But this proves too much because the same argument can be used to prove the existence of the greatest island, the greatest donut, etc., by making arguments of parallel construction about those hypothetical items, e.g. “The Lost Island is the greatest possible island…” as shown by Anselm’s contemporary Gaunilo of Marmoutiers. One could investigate further to identify more specific errors in logic in Anselm’s argument, but this can be tricky and time-consuming. Philosophers have spent centuries doing just this, with varying levels of success. But simply showing that the argument proves too much is sufficient to at least call the conclusion into question. This makes ‘proves too much’ a rhetorically powerful approach. In the context of a scientific rebuttal, leading with a demonstration that this fallacy has occurred piques enough reader interest to justify a dissection of more specific technical errors. Both eLife rebuttals use this approach, first showing that the analysis framework proves too much, and then exploring the source(s) of the error in greater detail.

How does one show that a particular detailed mathematical analysis ‘proves too much’ about experimental data? Let me reduce the analysis in the Science paper to the essentials, and abstract away all the other mathematical details. The most basic claim in that paper is based upon what I will call ‘the analysis framework’:

d = g(data)    (1)

z* = f(d)    (2)

z > z*    (3)

The authors did three basic things. First, they extracted a critical parameter d from their data set using a statistical procedure I’ll call g. d represents an average threshold for discriminability, corresponding to the number of components by which two mixtures must differ to be barely discriminable. Second, they fed this derived value, d, into a function, f, that produces a number of odorous mixtures z*. Finally, they argued that the number z* so obtained necessarily underestimates the ‘true’ number of discriminable smells, owing to the particular form of f. Each step and proposition can be investigated:

1) How does the quantity d behave as the data or form of g varies? That is, is g the ‘right thing’ to do to the data?
2) What implicit assumptions does f make about the sense of smell — are these assumptions reasonable?
3) Is the stated inequality — which says that any number z* derived using f will always underestimate the true value z — really valid?

What are the rebuttals about? Meister’s paper rejects equation 2 on the grounds that f is unjustified for the current problem. Castro and I are also critical of f, but focus more on equations 1 and 3, criticizing the robustness of g and demonstrating that the inequality z > z* should be reversed (the last of which I will not discuss further here). So together we called everything about the analysis framework into question. However, all parties are enthusiastic about the data itself, as well as its importance, so great care should be taken to distinguish the quality of the data from the validity of the interpretation.

In Meister’s paper, he shows that the analysis framework proves too much by using simulations of simple models, using either synthetic data or the actual data from the original paper. These simulations show that the original analysis framework can generate all sorts of values for z* which are known to be false by construction. For example, he shows that a synthetic organism constructed to have 3 odor percepts necessarily produces data which, when the analysis framework is applied, yield values of z* >> 3. Since we know by construction that the correct answer is 3, the analysis framework must be flawed. This kind of demonstration of ‘proving too much’ is also known by the more familiar term ‘positive control’: a control where a specific non-null outcome can be expected in advance if everything is working correctly. When instead of the correct outcome the analysis framework produces an incredible outcome reminiscent of the one reported in the Science paper, then that framework proves too much.

Meister then explores the reason the equations are flawed, and identifies the flaw in f. Imagine making a map of all odors, wherein similar-smelling odors are near each other on the map, and dissimilar-smelling odors are far apart. Let the distance between odors on the map be highly predictive of their perceptual similarity. How many dimensions must this map have to be accurate? We know the answer for a map of color vision: 3. Using only hue (H), saturation (S), and lightness (L) any perceptible color can be constructed, and any two nearby colors in an HSL map are perceptually similar, while any two distant colors in such a map are perceptually dissimilar. The hue and saturation subspace of that map is familiar as the ‘color wheel’, and has been understood for more than a century. In that map, hue is the angular dimension, saturation is the radial dimension, and lightness (if it were shown) would be perpendicular to the other two.

Meister argues that f must be based upon a corresponding perceptual map. Since no such reliable map exists for olfaction, Meister argues, we cannot even begin to construct an f for the smell problem; in fact, the f actually used in the Science paper assumes a map with 128 dimensions, corresponding to the dimensionality of the stimulus, not the (unknown) dimensionality of the perceptual space. By using such a high dimensional version of f, a very large value of z* is guaranteed, but unwarranted.

In my paper with Castro, we show that the original paper proves too much in a different way. We show that very similar datasets (differing only in the number of subjects, the number of experiments, or the number of molecules) or very similar analytical choices (differing only in the statistical significance criterion or discriminability thresholds used) produce vastly different estimates for z*, differing over tens of orders of magnitude from the reported value. Even trivial differences produce absurd results such as ‘all possible odors can be discriminated’ or ‘at most 1 odor can be discriminated’. The differences were trivial in the sense that equally reasonable experimental designs and analyses could and have proceeded according to these differences. But the resulting conclusions are obviously false, and therefore the analysis framework has proved too much. This kind of demonstration of ‘proving too much’ differs from that in Meister’s paper. Whereas he showed that the analysis framework produces specific values that are known to be incorrect, we showed that it can produce any value at all under equally reasonable assumptions. For many of those assumptions, we don’t know if the value it produces is correct or not; after all, there may be 10^4 or 10^8 or 10^{12} discriminable odors, we don’t know. But if all values are equally justified, the framework proves too much.

We then showed the technical source of the error, which is a very steep dependence of d on incidental features of the study design, mediated by g, which is then amplified exponentially by a steep nonlinearity in f. I’ll illustrate with a much more well-known example from gene expression studies. When identifying genes that are thought to be differentially expressed in some disease or phenotype of interest, there is always a statistical significance threshold, e.g. p<0.01, p<0.001, etc. used for selection. After correcting for multiple comparisons, some number of genes pass the threshold and are identified as candidates for involvement in the phenotype. With a liberal threshold, e.g. p<0.05, many candidates will be identified (e.g. 50). With a more moderate threshold, e.g. p<0.005, fewer candidates will be identified (e.g. 10). With a more strict threshold, e.g. p<0.001, still fewer candidates will be identified (e.g. 2). This sensitivity is well known in gene expression studies. We showed that the function g in the original paper has a similar sensitivity.

Now suppose some researcher went a step further and said, “If there are N candidate genes involved in inflammation, and each has two expression levels, then there are 2^N inflammation phenotypes”. Then the estimate for the number of inflammation phenotypes might be:

2^2 = 4 at p<0.001,

2^{10} = 1024 at p<0.005, and

2^{50} ≈ 1.1 × 10^{15} at p<0.05.

Any particular claim about the number of inflammation phenotypes from this approach would be arbitrary, incredibly sensitive to the significance threshold, and not worth considering seriously. One could obtain nearly any number of inflammation phenotypes one wanted, just by setting the significance threshold accordingly (and all of those thresholds, in different contexts, are considered reasonable in experimental science).
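
The arithmetic of this hypothetical can be checked in a few lines of Python. The candidate counts at each threshold are the illustrative ones used above, not real data, and `phenotype_count` is just a name for the researcher's 2^N rule.

```python
# Candidate-gene counts at each threshold, from the hypothetical example above.
candidates_at_threshold = {0.001: 2, 0.005: 10, 0.05: 50}

def phenotype_count(n_genes, levels=2):
    """The hypothetical researcher's rule: levels ** n_genes phenotypes."""
    return levels ** n_genes

estimates = {p: phenotype_count(n) for p, n in candidates_at_threshold.items()}

for p, z in estimates.items():
    print(f"p < {p}: {z:.3g} phenotypes")
```

Moving the threshold by a factor of 50 moves the estimate by more than fourteen orders of magnitude, so the final number mostly reflects the threshold that was chosen.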

But this is essentially what the function f does in the original paper. By analogy, g is the thresholding step, d is the number of candidate genes, and z* is the number of inflammation phenotypes. And while all of the possible values for z* in the Science paper are arbitrary, a wide range of them would have been unimpressively small, another wide range would have been comically large, and only the ‘goldilocks zone’ produced the impressive but just plausible value reported in the paper. This is something that I think can and does happen to all scientists. If your first set of analysis decisions gives you a really exciting result, you may forget to check whether other reasonable sets of decisions would give you similar results, or whether instead they would give you any and every result under the sun. This robustness check can prevent you from proving too much, which really means proving nothing at all.

2015-08-14: typos fixed