Talk at NJIT

I was at the FACM ’09 conference held at the New Jersey Institute of Technology the past two days. I gave a talk on “Effective theories for neural networks”. The slides are here. This was an unsatisfying talk on two counts. The first was that I didn’t internalize how soon this talk came after the Snowbird conference, so I didn’t have enough time to prepare properly. I thus ended up giving a talk that provided enough information to be confusing and hopefully thought-provoking, but not enough to be understood. The second problem was that there is a flaw in what I presented.

I’ll give a brief backdrop to the talk for those unfamiliar with neuroscience.  The brain is composed of interconnected neurons and as a proxy for understanding the brain, computational neuroscientists try to understand what a collection of coupled neurons will do.   The state of a neuron is characterized by the voltage across its membrane and the state of its membrane ion channels.  When a neuron is given enough input,  there can be a  massive change of voltage and flow of ions called an action potential.  One of the ions that flows into the cell is calcium, which can trigger the release of neurotransmitter to influence other neurons.  Thus, neuroscientists are highly focused on how and when action potentials or spikes occur.

We can thus model a neural network at many levels. At the bottom is what I will call a microscopic description, where we write down equations for the dynamics of the voltage and ion channels of each neuron. These neuron models are sometimes called conductance-based neurons, and the Hodgkin-Huxley neuron is the first and most famous of them. They usually consist of two to four differential equations but can easily involve many more. On the other hand, if one is more interested in just the spiking rate, there is a reduced description for that. In fact, much of the early progress in mathematically understanding neural networks used rate equations, examples being Wilson and Cowan, Grossberg, Hopfield and Amari. The question I have always had is: what is the precise connection between a microscopic description and a spike-rate or activity description? If I start with a network of conductance-based neurons, can I derive the appropriate activity-based description?
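To make the microscopic level concrete, here is a minimal sketch (my own illustration, not something from the talk) of a single Hodgkin-Huxley conductance-based neuron driven by a constant current. The parameter values are the standard textbook ones; the function name and the spike-detection convention are my own choices.

```python
# Sketch: one Hodgkin-Huxley conductance-based neuron, forward Euler integration.
# Standard textbook parameters; everything here is illustrative.
import numpy as np

def hh_spike_times(I_ext, T=500.0, dt=0.01):
    """Integrate the Hodgkin-Huxley equations and return spike times (ms)."""
    C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3        # uF/cm^2, mS/cm^2
    ENa, EK, EL = 50.0, -77.0, -54.4              # mV

    # Voltage-dependent rate functions for the gating variables m, h, n
    am = lambda V: 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
    bm = lambda V: 4.0 * np.exp(-(V + 65.0) / 18.0)
    ah = lambda V: 0.07 * np.exp(-(V + 65.0) / 20.0)
    bh = lambda V: 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))
    an = lambda V: 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
    bn = lambda V: 0.125 * np.exp(-(V + 65.0) / 80.0)

    V, m, h, n = -65.0, 0.05, 0.6, 0.32           # roughly the resting state
    spikes, above = [], False
    for step in range(int(T / dt)):
        INa = gNa * m**3 * h * (V - ENa)
        IK = gK * n**4 * (V - EK)
        IL = gL * (V - EL)
        m += dt * (am(V) * (1 - m) - bm(V) * m)
        h += dt * (ah(V) * (1 - h) - bh(V) * h)
        n += dt * (an(V) * (1 - n) - bn(V) * n)
        V += dt * (I_ext - INa - IK - IL) / C
        # Count an upward crossing of 0 mV as a spike
        if V > 0.0 and not above:
            spikes.append(step * dt)
            above = True
        elif V < 0.0:
            above = False
    return spikes

print(len(hh_spike_times(I_ext=10.0)))  # number of spikes in 500 ms
```

Sweeping the input current and counting spikes in this way is also how you would measure the F-I curve that appears in the rate reduction described next.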

People have heuristically done this several times before (e.g. Gerstner, Laing and Chow, Shriki et al.). The argument generally goes like this. If a neuron gets a constant amount of input, then it will fire at some rate. The rate or frequency as a function of input is called the F-I curve or gain function. Now, if the arrival times of spikes from other neurons are asynchronous or only very weakly correlated, and there are a lot of arriving spikes (it is said that each pyramidal neuron in the brain gets about ten thousand inputs), then we can treat the input as a slowly varying function of time that can be expressed in terms of the rates of the input neurons. We can then derive a set of self-consistent equations for the rates. This works well in situations where the network is firing asynchronously and the inputs are mostly stationary. However, there are certainly instances in the brain when this is not the case. Another problem is that learning and memory are thought to be due to some form of correlation-based updating of the synapses between neurons. However, if we only model the rates of the inputs, then we can’t possibly probe how changes in correlations will affect learning. Thus it would be nice to derive generalized activity equations that include the effects of correlations. Such a system could take various forms, but one possibility is a rate equation whose gain function depends on the correlations of the inputs, together with a companion equation for the correlations that depends on the rates.
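As a toy illustration of that reduction (a sketch under my own assumptions: a single self-coupled population, a generic sigmoid standing in for a measured F-I curve, and arbitrary parameters, not values from any of the papers), one can relax the rate equation to its self-consistent fixed point:

```python
# Sketch of the heuristic rate reduction: treat the summed asynchronous input as a
# slowly varying drive and solve tau * dr/dt = -r + F(w*r + I) self-consistently.
import numpy as np

def F(x, gain=1.0, threshold=2.0):
    """Placeholder gain function (firing rate vs. input); stands in for an F-I curve."""
    return 1.0 / (1.0 + np.exp(-gain * (x - threshold)))

def self_consistent_rate(w=4.0, I=0.5, tau=10.0, dt=0.1, T=500.0, r0=0.1):
    """Relax the rate equation to its fixed point r* = F(w*r* + I)."""
    r = r0
    for _ in range(int(T / dt)):
        r += dt / tau * (-r + F(w * r + I))
    return r

print(self_consistent_rate())
```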

To address this issue, Michael Buice and I have been pursuing what I call a density functional approach, which is represented in terms of a path integral. The path integral was first introduced by Feynman to describe quantum mechanics and quantum field theory and was then adapted by condensed matter theorists to analyze systems in statistical mechanics. We’re now adapting it to study neural networks. Michael and Jack Cowan first showed that the Wilson-Cowan rate equation could be generalized by hypothesizing it to be the reduction of a microscopic Markov model. They then showed how the probability functional for the rates could be derived. This functional contains all of the statistics of the Markov model and is consistent with the Wilson-Cowan equation. The probability density functional for the activity is written in exponential form, and the exponent is called the action. Michael, Jack and I then showed how you could derive a hierarchy of moment equations, where the first equation is the Wilson-Cowan equation, from this action.
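Schematically, and only as my shorthand for the structure rather than the exact action in the papers, the object involved looks like

```latex
P[a] \;\propto\; \int \mathcal{D}\tilde{a}\; e^{-S[\tilde{a},a]},
\qquad
S[\tilde{a},a] \;=\; \int dt \sum_i \tilde{a}_i(t)\Big[\dot{a}_i + \alpha a_i
  - f\Big(\textstyle\sum_j w_{ij} a_j\Big)\Big]
  \;+\; \text{(terms of higher order in } \tilde{a}\text{)},
```

where the tilde field is an auxiliary response variable. The piece linear in the response field reproduces the Wilson-Cowan equation for the mean activity at lowest order, while the higher-order terms generate the corrections involving fluctuations and correlations that feed the moment hierarchy.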

Michael and I also applied the density functional idea to studying finite size effects in the Kuramoto model. Eric Hildebrand and I had derived a moment hierarchy to describe fluctuations and correlations, but the equations were unwieldy to solve. The density functional approach made it much easier to do the calculations. The idea here is that you represent the dynamics of a population of microscopic neurons (oscillator phases in the Kuramoto case) in terms of the dynamics of a probability density. The density measures the number of neurons with a given phase at a given time. Now, for a finite number of neurons this density is not smooth; it has spikes located at the phases of the individual neurons. What people had done in the past to smooth out the density was to take the infinite neuron limit (e.g. Strogatz and Mirollo), which is also known as mean field theory. Our strategy instead was to smooth out the density by averaging over different populations of the network with different frequencies and initial conditions. The advantage of averaging this way is that you preserve the correlations and fluctuations. We used this method to compute the fluctuations in the order parameter due to finite size and also showed that the incoherent state is stable for a finite number of oscillators. Strogatz and Mirollo had shown that this state is marginally stable in the mean field limit, and it had been an open question whether finite size effects stabilize the incoherent state.
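To give a numerical feel for what finite size means here (this is just a toy simulation with arbitrary parameters of my choosing, not the calculation in the paper), one can simulate the Kuramoto model below the synchronization transition and watch the order parameter hover at a value of order 1/√N instead of decaying to zero:

```python
# Sketch: Kuramoto model at finite N in the incoherent regime.
# The time-averaged order parameter r = |<exp(i*theta)>| stays O(1/sqrt(N)).
import numpy as np

def kuramoto_order_parameter(N=200, K=0.5, T=200.0, dt=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, N)       # random initial phases
    omega = rng.standard_cauchy(N)             # Lorentzian frequency distribution
    r_vals = []
    for _ in range(int(T / dt)):
        z = np.mean(np.exp(1j * theta))        # complex order parameter
        # Mean-field form of the coupling: K * r * sin(psi - theta_i)
        theta += dt * (omega + K * np.abs(z) * np.sin(np.angle(z) - theta))
        r_vals.append(np.abs(z))
    return np.mean(r_vals[len(r_vals) // 2:])  # time average after a transient

for N in (100, 400, 1600):
    print(N, kuramoto_order_parameter(N=N))    # r shrinks roughly like 1/sqrt(N)
```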

Thus, starting from a microscopic description of deterministic oscillators, we can derive a probability density functional (i.e., an action) that captures all of the dynamics, including fluctuations and correlations; and starting from an action for neural activity (i.e., spike rate), we can derive a hierarchy of activity equations. The obvious thing to do is to find a way to connect the two, and my talk was about how to do this. However, there was a flaw in what I proposed. The activity is given by the probability current at threshold. This in turn is given by the time derivative of the oscillator phase times the density at threshold. Thus, we need to compute the action for the density at threshold. The simplest thing to do is to isolate it and integrate out all the dynamics below threshold. In my talk, I argued that if you assume the probability current is independent of the below-threshold dynamics, then you end up just having to solve a single partial differential equation. However, if you carry out the calculation in full, which is what I did on the train ride home, you find that this assumption throws away all the interesting dynamics and you end up with a trivial activity action. I’m still pretty sure that there is a way to derive the activity action from the density action, but it will take more work. Michael and I have some ideas to bridge this gap, and I hope to post on the correct way to do this soon.
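In symbols, the relation I was using is just the flux form of the activity (schematic, with the threshold phase written as θ_T and ρ the phase density obeying the usual continuity equation):

```latex
\nu(t) \;=\; J(\theta_T, t) \;=\; \dot{\theta}\,\rho(\theta,t)\Big|_{\theta=\theta_T},
\qquad
\partial_t \rho + \partial_\theta\big(\dot{\theta}\,\rho\big) = 0,
```

so obtaining an action for the activity alone means integrating out the density at all phases below threshold, and it is precisely the assumption that the current decouples from those sub-threshold dynamics that fails.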

References:

Buice and Chow, PRE, 031118, 2007.

Buice and Cowan, PRE, 051919, 2007.

Buice, Cowan and Chow, arXiv:0902.3925, 2009.

Gerstner, Neural Comp, 43, 2000.

Hildebrand, Buice and Chow, PRL, 054101, 2007.

Laing and Chow, Neural Comp, 1473, 2001.

Shriki, Hansel and Sompolinsky, Neural Comp, 1809, 2003.

Strogatz and Mirollo, J Stat Phys, 613, 1991.
