Talk at Jackfest

July 16, 2014

I’m currently in Banff, Alberta for a Festschrift for Jack Cowan (webpage here). Jack is one of the founders of theoretical neuroscience and has infused many important ideas into the field. The Wilson-Cowan equations that he and Hugh Wilson developed in the early seventies form a foundation for both modeling neural systems and machine learning. My talk will summarize my work on deriving “generalized Wilson-Cowan equations” that include both neural activity and correlations. The slides can be found here. References and a summary of the work can be found here. All videos of the talks can be found here.


Addendum: 17:44. Some typos in the talk were fixed.

Addendum: 18:25. I just realized I said something silly in my talk.  The Legendre transform is an involution because the transform is its own inverse: the transform of the transform returns the original function. I said something completely inane instead.

Zero Matlab 2015

July 7, 2014

So I have bitten the bullet and am committed to phasing out Matlab completely by 2015.  I have Julia installed and can sort of run code in it although I have no idea how to change directories.  cd() doesn’t seem to work for me.  I have also tried to install matplotlib on my MacBook Pro running OS 10.8 using a SciPy superpack but it does not seem to work yet.  When I try to plot something, nothing happens but I have some bouncing rocket ships in my dock.  Feel free to let me know what I’m doing wrong.


Addendum: I installed IPython using pip and I can plot out of IPython.

The need for speed

July 1, 2014

Thanks for all the comments about the attributes of Python and Julia. It seems to me that the most prudent choice is to learn Python and Julia. However, what I would really like to know is just how fast these languages really are and here is the test. What I want to do is to fit networks of coupled ODEs (and  PDEs) to data using MCMC (see here). This means I need a language that loops fast. An example in pseudo-Matlab code would be

for n = 1:N
    for i = 1:T
        y(i+1) = M\y(i)
    end
    Compare to data and set new parameters
end

where h is a parameter and M is some matrix (say 1000 dimensional), which is sometimes a Toeplitz matrix but not always. Hence, in each time step I need to invert a matrix, which can depend on time so I can’t always precompute, and do a matrix multiplication. Then in each parameter setting step I need to sum an objective function like the mean square error over all the data points. The code to do this in C or Fortran can be pretty complicated because you have to keep track of all the indices and call linear algebra libraries. I thus want something that has the simple syntax of Matlab but is as fast as C. Python seems to be too slow for our needs but maybe we haven’t optimized the code. Julia seems like the perfect fit but let me know if I am just deluded.
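For comparison, here is a minimal Python/NumPy sketch of the same loop structure. Everything specific in it is an assumption made for illustration and is not from the actual model: the tridiagonal coupling matrix, the single parameter `theta`, the step size `h`, the Gaussian noise scale `sigma`, and the helper names `step_matrix`, `simulate`, `mse`, and `metropolis`. The inner loop does the equivalent of `M\y(i)` with `np.linalg.solve`, and the outer loop is a plain random-walk Metropolis sampler that uses the mean square error as the objective.

```python
import numpy as np

def step_matrix(theta, n, h):
    # Hypothetical model: backward-Euler matrix M = I + h*theta*K for dy/dt = -theta*K*y,
    # where K is a tridiagonal (Toeplitz) coupling matrix.
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.eye(n) + h * theta * K

def simulate(theta, y0, T, h=0.1):
    """March y forward T steps; returns an array of shape (T + 1, n)."""
    ys = np.empty((T + 1, y0.size))
    ys[0] = y0
    # M is constant in this toy model, so it is built once.
    # A time-dependent M would have to be rebuilt inside the loop.
    M = step_matrix(theta, y0.size, h)
    for i in range(T):
        # Equivalent of Matlab's y(i+1) = M\y(i): solve the system, never form inv(M).
        ys[i + 1] = np.linalg.solve(M, ys[i])
    return ys

def mse(theta, y0, data, h=0.1):
    """Mean square error between the simulated trajectory and the data."""
    return np.mean((simulate(theta, y0, data.shape[0] - 1, h) - data) ** 2)

def metropolis(y0, data, n_iter=200, sigma=0.05, step=0.2, seed=0):
    """Random-walk Metropolis on theta > 0 with a flat prior."""
    rng = np.random.default_rng(seed)
    theta = 1.0
    cost = mse(theta, y0, data)
    chain = np.empty(n_iter)
    for n in range(n_iter):
        prop = theta + step * rng.normal()
        if prop > 0:  # proposals outside the prior support are rejected
            prop_cost = mse(prop, y0, data)
            # The MSE is treated as an energy with temperature 2*sigma**2.
            if np.log(rng.uniform()) < (cost - prop_cost) / (2 * sigma**2):
                theta, cost = prop, prop_cost
        chain[n] = theta
    return chain

# Synthetic data generated from the same toy model with theta = 2.
y0 = np.sin(np.linspace(0, np.pi, 20))
data = simulate(2.0, y0, 30)
chain = metropolis(y0, data, n_iter=50)
```

For a 1000-dimensional dense M, the time per step should be dominated by the linear solve inside LAPACK rather than by the interpreted loop, so the relevant speed question is probably how often M has to be rebuilt and refactored rather than how fast the language loops.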

Julia vs Python

June 29, 2014

I was about to start my trek up Python mountain until Bard Ermentrout tipped me to the Julia language and I saw this speed table from here (lower is faster):

benchmark           Fortran       Julia      Python           R      Matlab      Octave Mathematica  JavaScript          Go
version           gcc 4.8.1         0.2       2.7.3       3.0.2      R2012a       3.6.4         8.0          V8         go1
fib                    0.26        0.91       30.37      411.36     1992.00     3211.81       64.46        2.18        1.03
parse_int              5.03        1.60       13.95       59.40     1463.16     7109.85       29.54        2.43        4.79
quicksort              1.11        1.14       31.98      524.29      101.84     1132.04       35.74        3.51        1.25
mandel                 0.86        0.85       14.19      106.97       64.58      316.95        6.07        3.49        2.36
pi_sum                 0.80        1.00       16.33       15.42        1.29      237.41        1.32        0.84        1.41
rand_mat_stat          0.64        1.66       13.52       10.84        6.61       14.98        4.52        3.28        8.12
rand_mat_mul           0.96        1.01        3.41        3.98        1.10        3.41        1.16       14.60        8.51

Julia is a dynamic high level language like MATLAB and Python that is open source and developed at MIT. The syntax looks fairly simple and it is about as fast as C (Fortran looks like it still is the Ferrari of scientific computing). Matlab is fast for vector and matrix operations but deadly slow for loops. I had no idea that Mathematica was so fast. Although Julia is still relatively new and not nearly as expansive as Python, should I drop Python for Julia?

The MATLAB handcuff

June 26, 2014

The first computer language I learned was BASIC back in the stone age, which led directly to Fortran. These are procedural languages that allow the infamous GOTO statement, now shunned by the computer literati. Programming with the GOTO gives you an appreciation for why the Halting problem is undecidable.  Much of what I did in those days was to track down infinite loops. I was introduced to structured programming in university, where I learned Pascal. I didn’t really know what structured programming meant except that I no longer could use GOTO and there were data structures like records. I was forced to use APL at a summer job. I have little recollection of the language except that it was extremely terse and symbolic. It was fun to try to construct the shortest program possible to do the task. The ultimate program was the so-called “APL one liner”. APL gave me first-hand experience of the noncomputability of Kolmogorov complexity. In graduate school I went back to Fortran, which was the default language to do scientific computing at that time. I also used the computer algebra system called Macsyma, which was much better than Mathematica. I used it to do Taylor expansions and perturbation theory. I was introduced to C and C++ in my first postdoc. That was an eye-opening experience as I never really understood how a computer worked until I programmed in C. Pointer arithmetic was a revelation. I now had such control and power. C++ was the opposite of C for me. Object oriented programming takes you very far away from the workings of a computer. I basically programmed exclusively in C for a decade, just C and XPP, which was a real game changer. I had no need for anything else until I got to NIH. It was only then that I finally sat down and programmed in MATLAB. I had resisted up to that point and still feel like it is cheating but I now do almost all of my programming in MATLAB, with a smattering of R and XPP of course.
I’m also biased against MATLAB because it gave a wrong answer in a previous version. At first, I programmed in MATLAB as I would in C or Fortran but when it came down to writing the codes to estimate heritability directly from GWAS (see here), the matrix manipulating capabilities of MATLAB really became useful. I also learned that statistics is basically applied linear algebra. Now, when I code I think instinctively in matrix terms and it is very hard for me to go back to programming in C. (Although I did learn Objective C recently to write an iPhone App to predict body weight. But that was mostly point-and-click and programming by trial and error. The App does work though (download it here). I did that because I wanted to get a sense of what real programmers actually do.) My goal is to switch from MATLAB to Python and not rely on proprietary software. I encourage my fellows to use Python instead of MATLAB because it will be a cinch to learn MATLAB later if they already know Python. The really big barrier for me for all languages is to learn the ancillary stuff like what do you actually type to run programs, how does Python know where programs are, how do you read in data, how do you plot graphs, etc? In MATLAB, I just click on an icon and everything is there. I keep saying that I will uncuff myself from MATLAB one day and maybe this is the year that I actually do.

New Papers

June 16, 2014
Two new papers are now in print:
The first, on applying compressed sensing to genomics, is now published in GigaScience. The summary of the paper is here and the link is here.
The second is on steroid-regulated gene induction and can be obtained here.
Biochemistry. 2014 Mar 25;53(11):1753-67. doi: 10.1021/bi5000178. Epub 2014 Mar 11.

A kinase-independent activity of Cdk9 modulates glucocorticoid receptor-mediated gene induction.


A gene induction competition assay has recently uncovered new inhibitory activities of two transcriptional cofactors, NELF-A and NELF-B, in glucocorticoid-regulated transactivation. NELF-A and -B are also components of the NELF complex, which participates in RNA polymerase II pausing shortly after the initiation of gene transcription. We therefore asked if cofactors (Cdk9 and ELL) best known to affect paused polymerase could reverse the effects of NELF-A and -B. Unexpectedly, Cdk9 and ELL augmented, rather than prevented, the effects of NELF-A and -B. Furthermore, Cdk9 actions are not blocked either by Cdk9 inhibitors (DRB or flavopiridol) or by two Cdk9 mutants defective in kinase activity. The mode and site of action of NELF-A and -B mutants with an altered NELF domain are similarly affected by wild-type and kinase-dead Cdk9. We conclude that Cdk9 is a new modulator of GR action, that Cdk9 and ELL have novel activities in GR-regulated gene expression, that NELF-A and -B can act separately from the NELF complex, and that Cdk9 possesses activities that are independent of Cdk9 kinase activity. Finally, the competition assay has succeeded in ordering the site of action of several cofactors of GR transactivation. Extension of this methodology should be helpful in determining the site and mode of action of numerous additional cofactors and in reducing unwanted side effects.

PMID: 24559102 [PubMed - indexed for MEDLINE]
PMCID: PMC3985961 [Available on 2015/2/21]

Slides for talk on gene expression

June 10, 2014

Here are the slides for my talk today at the NIH Systems Biology Forum on gene expression.  Background for the talk can be found here.

Marc Andreessen on EconTalk

June 3, 2014

If you have any interest in technology and the internet then you should definitely listen to this EconTalk podcast with Marc Andreessen, who co-wrote Mosaic, the first widely used web browser, which led to the explosive growth of the internet. He has plenty of insightful things to say.  I remember first seeing Mosaic in 1994 as a postdoc in Boulder, Colorado. There I was, doing research that involved programming in C and C++. I was not really happy with what I was doing. I was having a hard time finding the next job. I was one of the first to play around with HTML, and it never occurred to me once that I could pack my bags, move to Silicon Valley, and try to get involved in the burgeoning tech revolution. It just makes me wonder what other obvious things I’m missing right now.

Addendum, 2014-6-5:  Actually, it may have been 1993 that I first saw Mosaic.

Integrated Information Theory

June 2, 2014

Neuroscientist Giulio Tononi has proposed that consciousness is integrated information and can be measured by a quantity called \phi, which is a measure of the amount of information that involves the entire system as a whole. I have never really found this theory to be entirely compelling. While I think that consciousness probably does require some amount of integrated information, I am skeptical that it is the only relevant measure. See here and here for some of my previous thoughts on the topic. One of the reasons that Tononi has proposed a single measure is because it is a way to sidestep what is known as “the hard problem of consciousness”. Instead of trying to explain how a collection of neurons would be endowed with a sense of self-awareness, he posits that consciousness is a property of information and the more \phi one has, the more conscious you become. So in this theory, rocks are not conscious but thermostats are minimally conscious.

Theoretical computer scientist Scott Aaronson has now weighed in on the topic (see here and here). In his inimitable style, Aaronson shows essentially that a large grid of XOR gates could have arbitrarily large \phi and hence be even more conscious than you or me.  He finds this to be highly implausible. Tononi then produced a 14 page response where he essentially doubles down on IIT and claims that indeed a planar array of XOR gates is conscious and we should not be surprised it is so. Aaronson also proposes that we try to solve the “pretty hard problem of consciousness”, which is to come up with a theory or means for deciding when something has consciousness. To me, the fact that we can’t come up with an empirical way to tell whether something is conscious is the best argument for dualism we have. It may even be plausible that the PHPC is undecidable in that solving it would entail the solution of the halting problem. I agree with philosopher David Chalmers (see here) that there are only two possible consistent theories of consciousness. The first is that it is an emergent property of the brain but it has no “causal influence” on events. In other words, consciousness is an epiphenomenon that just allows “us” to be an audience for the dynamical evolution of the universe. The second is that we live in a dualistic world of mind and matter. It is definitely worth reading the posts and the comments, where Chalmers chimes in.

Did microbes cause the Great Dying?

May 24, 2014

In one of my very first posts almost a decade ago, I wrote about the end-Permian extinction 250 million years ago, which was the greatest mass extinction thus far. In that post I covered research that had ruled out an asteroid impact and found evidence of global warming, possibly due to volcanos, as a cause. Now, a recent paper in PNAS proposes that a horizontal gene transfer event from bacteria to archaea may have been the main cause for the increase of methane and CO2. This paper is one of the best papers I have read in a long time, combining geological field work, mathematical modeling, biochemistry, metabolism, and evolutionary phylogenetic analysis to make a compelling argument for their hypothesis.

Their case hinges on several pieces of evidence. The first comes from well-dated carbon isotopic records from China.  The data show a steep plunge in the isotopic ratio (i.e. the ratio between the less abundant but heavier carbon 13 and the lighter, more abundant carbon 12) in the inorganic carbonate reservoir with a moderate increase in the organic reservoir. In the earth’s carbon cycle, the organic reservoir comes from the conversion of atmospheric CO2 into carbohydrates via photosynthesis, which prefers carbon 12 to carbon 13. Organic carbon is returned to inorganic form through oxidation by animals eating photosynthetic organisms or by the burning of stored carbon like trees or coal. A steep drop in the isotopic ratio means that there was an extra surge of carbon 12 into the inorganic reservoir. Using a mathematical model, the authors show that in order to explain the steep drop, the inorganic reservoir must have grown superexponentially (faster than exponential). This requires some runaway positive feedback loop that is difficult to explain by geological processes such as volcanic activity, but is something that life is really good at.

The increased methane would have been oxidized to CO2 by other microbes, which would have lowered the oxygen concentration. This would allow for more efficient fermentation and thus more acetate fuel for the archaea to make more methane. The authors showed in another simple mathematical model how this positive feedback loop could lead to superexponential growth. Methane and CO2 are both greenhouse gases and their increase would have caused significant global warming. Anaerobic methane oxidation could also lead to the release of poisonous hydrogen sulfide.
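To see why a positive feedback loop gives faster-than-exponential growth, here is a generic textbook illustration. It is not the authors' model; the quantity x, the rate r, and the feedback exponent \epsilon are assumptions introduced only for this sketch. When the per-capita growth rate is constant the growth is exponential, but when the per-capita rate itself increases with x the solution blows up in finite time.

```latex
% Constant per-capita rate: ordinary exponential growth
\frac{dx}{dt} = r x
\quad\Rightarrow\quad
x(t) = x_0 e^{rt}

% Per-capita rate r x^{\epsilon} grows with x (positive feedback), \epsilon > 0
\frac{dx}{dt} = r x^{1+\epsilon}
\quad\Rightarrow\quad
x(t) = x_0\left(1 - \epsilon r x_0^{\epsilon} t\right)^{-1/\epsilon}

% The solution diverges at the finite time
t^* = \frac{1}{\epsilon r x_0^{\epsilon}}
```

Any exponential is finite at every finite time, so a reservoir obeying the second equation eventually outgrows every exponential curve, which is the sense of superexponential used above.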

They then considered what microbe could have been responsible. They realized that during the late Permian, a lot of organic material was being deposited in the sediment. The organic reservoir (i.e. fossil fuels, methane hydrates, soil organic matter, peat, etc) was much larger back then than today, as if someone or something used it up at some point. One of the end products of fermentation of this matter would be acetate and that is something archaea like to eat and convert to methane. There are two types of archaea that can do this and one is much more efficient than the other at high acetate concentrations. This increased efficiency was also shown recently to have arisen by a horizontal gene transfer event from a bacterium. A phylogenetic analysis of all known archaea showed that the progenitor of the efficient methanogenic one likely arose 250 million years ago.

The final piece of evidence is that the archaea need nickel to make methane. The authors then looked at the nickel concentrations in their Chinese geological samples and found a sharp increase in nickel immediately before the steep drop in the isotopic ratio. They postulate that the source of the nickel was the massive Siberian volcano eruptions at that time (and previously proposed as the cause of the increased methane and CO2).

This scenario required the unlikely coincidence of several events –  lots of excess organic fuel, low oxygen (and sulfate), increased nickel, and a horizontal gene transfer event. If any of these were missing, the Great Dying may not have taken place. However, given that there have been only 5 mass extinctions, although we may be currently inducing the 6th, low probability events may be required for such calamitous events. This paper should also give us some pause about introducing genetically modified organisms into the environment. While most will probably be harmless, you never know when one will be the match that lights the fire.



What is the difference between math, science and philosophy?

May 16, 2014

I’ve been listening to the Philosophy Bites podcast recently. One from a few years ago consisted of answers from philosophers to the question, posed on the spot and without time for deep reflection: What is Philosophy? Some managed to give precise answers, but many struggled. I think one source of conflict they faced as they answered was that they didn’t know how to separate the question of what philosophers actually do from what they should be doing. However, I think that a clear distinction between science, math and philosophy as methodologies can be specified precisely. I also think that this is important because practitioners in each subject should be aware of what methodology they are actually using and what is appropriate for whatever problem they are working on.

Here are my definitions: Math explores the consequences of rules or assumptions, science is the empirical study of measurable things, and philosophy examines things that cannot be resolved by mathematics or empiricism. With these definitions, practitioners of any discipline may use any of math, science, or philosophy to help answer whatever question they may be addressing. Scientists need mathematics to work out the consequences of their assumptions and philosophy to help delineate phenomena. Mathematicians need science and philosophy to provide assumptions or rules to analyze. Philosophers need mathematics to sort out arguments and science to test hypotheses experimentally.

Those skeptical of philosophy may suggest that anything that cannot be addressed by math or science has no practical value. However, with these definitions, even the most hardened mathematician or scientist may be practicing philosophy without even knowing it. Atheists like Richard Dawkins should realize that part of their position is based on philosophy and not science. The only truly logical position to take with respect to God is agnosticism. It may be probable that there is not a God that intervenes directly in our lives and that probability may be high but it is not a provable fact. To be an atheist is to put some cutoff on the posterior probability for the existence of God and that cutoff is based on philosophy not science.

While most scientists and mathematicians are cognizant that moral issues may be pertinent to their work (e.g. animal experimentation), they may be less cognizant of what I believe is an equally important philosophical issue, which is the ontological question. Ontology is a philosophical term for the study of what exists. To many pragmatically minded people, this may sound like an ethereal topic (or worse adjective) that has no place in the hard sciences. However, as I pointed out in an earlier post, we can put labels on at most a countably infinite number of things out of an uncountable number of possibilities and for most purposes, our ontological list of things is finite. We thus have to choose and although some of these choices are guided by how we as human agents interact with the world, others will be arbitrary. Determining ontology will involve aspects of philosophy, science and math.

Mathematicians face the ontological problem daily when they decide on what areas to work in and what theorems to prove. The possibilities in mathematics are infinite so it is almost certain that if we were to rerun history some if not many fields would not be reinvented. While scientists may have fewer degrees of freedom to choose from they are also making choices and these choices tend to be confined by history. The ontological problem shows up anytime we try to define a phenomenon. The classification of cognitive disorders is a pure exercise in ontology. Authors of the DSM IV have attempted to be as empirical and objective as possible but there is still plenty of philosophy in their designations of psychiatric conditions. While most string theorists accept that their discipline is mostly mathematical, they should also realize that it is very philosophical. A theory of everything includes the ontology by definition.

Subjects traditionally within the realm of philosophy also have mathematical and scientific aspects. Our morals and values have certainly been shaped by evolution and biological constraints. We should completely rethink our legal philosophy based on what we now know about neuroscience (e.g. see here). The same goes for any discussion of consciousness, the mind-body problem, and free will. To me the real problem with free will isn’t whether or not it exists but rather who or what exactly is exercising that free will and this can be looked at empirically.

So next time when you sit down to solve a problem, think about whether it is one of mathematics, science or philosophy.

The blinking-dot paradox of consciousness

May 6, 2014

Suppose you could measure the activity of every neuron in the brain of an awake and behaving person, including all sensory and motor neurons. You could then represent the firing pattern of these neurons on a screen with a hundred billion pixels (or as many as needed). Each pixel would be identified with a neuron and the activity of the brain would be represented by blinking dots of light. The question then is whether or not the array of blinking dots is conscious (provided the original person was conscious). If you believe that everything about consciousness is represented by neuronal spikes, then you would be forced to answer yes. On the other hand, you must then acknowledge that a television screen simply outputting entries from a table is also conscious.

There are several layers to this possible paradox. The first is whether or not all the information required to fully decode the brain and emulate consciousness is in the spiking patterns of the neurons in the brain. It could be that you need the information contained in all the physical processes in the brain such as the movement of  ions and water molecules, conformational changes of ion channels, receptor trafficking, blood flow, glial cells, and so forth. The question is then what resolution is required. If there is some short distance cut-off so you could discretize the events then you could always construct a bigger screen with trillions of trillions of pixels and be faced with the same question. But suppose that there is no cut-off so you need an uncountable amount of information. Then consciousness would not be a computable phenomenon and there is no hope in ever understanding it. Also, at a small enough scale (Planck length) you would be forced to include quantum gravity effects as well, in which case Roger Penrose may have been on to something after all.

The second issue is whether or not there is a difference between a neural computation and reading from a table. Presumably, the spiking events in the brain are due to the extremely complex dynamics of synaptically coupled neurons in the presence of environmental inputs. Is there something intrinsically different between a numerical simulation of a brain model and reading the entries of a list? Would one exhibit consciousness while the other not? To make matters even more confusing, suppose you have a computer running a simulation of a brain. The firing of the neurons is now encoded by the states of various electronic components like transistors. Does this mean that the circuits in the computer become conscious when the simulation is running? What if the computer were simultaneously running other programs, like a web browser, or even another brain simulation?  In a computer, the execution of a program is not tied to specific electronic components.  Transistors just change states as instructions arrive so when a computer is running multiple programs, the transistors simulating the brain are not conserved.  How then do they stay coherent to form a conscious perception?  In a normal computer operation, the results are fed to an output, which is then interpreted by us.  In a simulation of the brain, there is no output, there is just the simulation. Questions like these make me question my once unwavering faith in the monistic (i.e. not dualistic) theory of the brain.

New paper on genomics

April 22, 2014

James Lee and I have a new paper out: Lee and Chow, Conditions for the validity of SNP-based heritability estimation, Human Genetics, 2014. As I summarized earlier (e.g. see here and here), heritability is a measure of the proportion of the variance of some trait (like height or cholesterol levels) due to genetic factors. The classical way to estimate heritability is to regress standardized (mean zero, standard deviation one) phenotypes of close relatives against each other. In 2010, Jian Yang, Peter Visscher and colleagues developed a way to estimate heritability directly from the data obtained in Genome Wide Association Studies (GWAS), sometimes called GREML.  Shashaank Vattikuti and I quickly adopted this method and computed the heritability of metabolic syndrome traits as well as the genetic correlations between the traits (link here). Unfortunately, our methods section has a lot of typos but the corrected Methods with the Matlab code can be found here. However, I was puzzled by the derivation of the method provided by the Yang et al. paper.  This paper is our resolution.  The technical details are below the fold.



Saving US biomedical research

April 15, 2014

Bruce Alberts, Marc Kirschner, Shirley Tilghman, and Harold Varmus have an opinion piece in PNAS (link here) summarizing their concerns for the future of US biomedical research and suggesting some fixes. Their major premise is that medical research is predicated on an ever continuing expansion and we’re headed for a crisis if we don’t change immediately. As an NIH intramural investigator, I am shielded from the intense grant writing requirements of those on the outside. However, I am well aware of the difficulties in obtaining grant support and more than cognizant of the fact that a simple way to resolve the recent 8% cut in NIH funding is to eliminate the NIH intramural program. I have also noticed that medical schools keep expanding and hiring faculty on “soft money”, which requires them to raise their own salaries through grants. Soft money faculty essentially run independent businesses who rent lab space from institutions. The problem is that the market is a monopsony, where the sole buyer is the NIH. In order to keep their businesses running, they need lots of low paid labour, in the form of grad students and postdocs, many of whom have no hope of ever becoming independent investigators. One of the proposed solutions is to increase the salary of post docs and increase the numbers of permanent staff scientist positions. The premise is that by increasing unit costs, a labour equilibrium can be achieved. There is much more in the article and anyone involved in science should read it.

Big Data backlash

April 7, 2014

I predicted that there would be an eventual pushback on Big Data and it seems that it has begun. Gary Marcus and Ernest Davis of NYU had an op-ed in the Times yesterday outlining nine issues with Big Data. I think one way to encapsulate many of the critiques is that you will never be able to do true prior-free data modeling. The number of combinations in a data set grows as the factorial of the number of elements, which grows faster than an exponential. Hence, Moore’s law can never catch up. At some point, someone will need to exercise some judgement, in which case Big Data is not really different from the ordinary data that we deal with all the time.
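The factorial-versus-exponential claim is easy to check numerically. The short sketch below (the helper name `first_n_where_factorial_wins` is made up for this illustration) finds the smallest n at which n! overtakes base^n. Whatever fixed base you give the exponential, the factorial passes it at a modest n and never looks back.

```python
from math import factorial

def first_n_where_factorial_wins(base):
    """Smallest positive integer n with n! > base**n."""
    n = 1
    while factorial(n) <= base ** n:
        n += 1
    return n

# Exact integer arithmetic, so there is no overflow or rounding to worry about.
n_base_2 = first_n_where_factorial_wins(2)    # 4, since 4! = 24 > 2**4 = 16
n_base_10 = first_n_where_factorial_wins(10)  # 25, since 25! > 10**25 but 24! < 10**24
```

Beyond that crossover each further factor in n! is larger than the base, so the gap keeps widening.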

The ultimate pathogen vector

March 31, 2014

If civilization succumbs to a deadly pandemic, we will all know what the vector was. Every physician, nurse, dentist, hygienist, and health care worker is bound to check their smartphone sometime during the day before, during, or after seeing a patient and they are not sterilizing it afterwards.  The fully hands free smartphone could be the most important invention of the 21st century.

Optimizing food delivery

March 25, 2014

This EconTalk podcast with Frito-Lay executive Brendan O’Donohoe from 2011 gives a great account of how optimized the production and marketing system for potato chips and other salty snacks has become. The industry has a lot of very smart people trying to figure out how to ensure that you maximize food consumption from how to peel potatoes to how to stack store shelves with bags of chips. This increased efficiency is our hypothesis (e.g. see here) for the obesity epidemic. However, unlike before where I attributed the increase in food production to changes in agricultural policy, I now believe it is mostly due to the vastly increased efficiency of food production. This podcast shows the extent of the optimization after the produce leaves the farm but the efficiency improvements on the farm are just as dramatic. For example, farmers now use GPS to optimally line up their crops.

Analytic continuation continued

March 9, 2014

As I promised in my previous post, here is a derivation of the analytic continuation of the Riemann zeta function to negative integer values. There are several ways of doing this but a particularly simple way is given by Graham Everest, Christian Rottger, and Tom Ward at this link. It starts with the observation that you can write

\int_1^\infty x^{-s} dx = \frac{1}{s-1}

if the real part of s>1. You can then break the integral into pieces with

\frac{1}{s-1}=\int_1^\infty x^{-s} dx =\sum_{n=1}^\infty\int_n^{n+1} x^{-s} dx

=\sum_{n=1}^\infty \int_0^1(n+x)^{-s} dx=\sum_{n=1}^\infty\int_0^1 \frac{1}{n^s}\left(1+\frac{x}{n}\right)^{-s} dx      (1)

For x\in [0,1], you can expand the integrand in a binomial expansion

\left(1+\frac{x}{n}\right)^{-s} = 1 -\frac{sx}{n}+sO\left(\frac{1}{n^2}\right)   (2)

Now substitute (2) into (1) to obtain

\frac{1}{s-1}=\zeta(s) -\frac{s}{2}\zeta(s+1) - sR(s)  (3)

or

\zeta(s) =\frac{1}{s-1}+\frac{s}{2}\zeta(s+1) +sR(s)   (3′)

where the remainder R is an analytic function when Re s > -1 because the resulting series is absolutely convergent. Since the zeta function is analytic for Re s >1, the right hand side of (3′) is a new definition of \zeta that is analytic for Re s >0 aside from a simple pole at s=1. Now multiply (3) by s-1 and take the limit as s\rightarrow 1 to obtain

\lim_{s\rightarrow 1} (s-1)\zeta(s)=1

which implies that

\lim_{s\rightarrow 0} s\zeta(s+1)=1     (4)

Taking the limit of s going to zero from the right in (3′) and using (4) gives

\zeta(0) = -1 + \frac{1}{2} = -\frac{1}{2}

Hence, the analytic continuation of the zeta function to zero is -1/2.

The analytic domain of \zeta can be pushed further into the left hand plane by extending the binomial expansion in (2) to

\left(1+\frac{x}{n}\right)^{-s} = \sum_{r=0}^{k+1} \left(\begin{array}{c} -s\\r\end{array}\right)\left(\frac{x}{n}\right)^r + (s+k)O\left(\frac{1}{n^{k+2}}\right)

 Inserting into (1) yields

\frac{1}{s-1}=\zeta(s)+\sum_{r=1}^{k+1} \left(\begin{array}{c} -s\\r\end{array}\right)\frac{1}{r+1}\zeta(r+s) + (s+k)R_{k+1}(s)

where R_{k+1}(s) is analytic for Re s>-(k+1).  Now let s\rightarrow -k^+ and extract out the last term of the sum with (4) to obtain

\frac{1}{-k-1}=\zeta(-k)+\sum_{r=1}^{k} \left(\begin{array}{c} k\\r\end{array}\right)\frac{1}{r+1}\zeta(r-k) - \frac{1}{(k+1)(k+2)}    (5)

Rearranging (5) gives

\zeta(-k)=-\sum_{r=1}^{k} \left(\begin{array}{c} k\\r\end{array}\right)\frac{1}{r+1}\zeta(r-k) -\frac{1}{k+2}     (6)

where I have used

\left( \begin{array}{c} -s\\r\end{array}\right) = (-1)^r \left(\begin{array}{c} s+r -1\\r\end{array}\right)

The righthand side of (6) is now defined for Re s > -k.  Rewrite (6) as

\zeta(-k)=-\sum_{r=1}^{k} \frac{k!}{r!(k-r)!} \frac{\zeta(r-k)(k-r+1)}{(r+1)(k-r+1)}-\frac{1}{k+2}

=-\sum_{r=1}^{k} \left(\begin{array}{c} k+2\\ k-r+1\end{array}\right) \frac{\zeta(r-k)(k-r+1)}{(k+1)(k+2)}-\frac{1}{k+2}

=-\sum_{r=1}^{k-1} \left(\begin{array}{c} k+2\\ k-r+1\end{array}\right) \frac{\zeta(r-k)(k-r+1)}{(k+1)(k+2)}-\frac{1}{k+2} - \frac{\zeta(0)}{k+1}

Collecting terms, substituting for \zeta(0) and multiplying by (k+1)(k+2)  gives

(k+1)(k+2)\zeta(-k)=-\sum_{r=1}^{k-1} \left(\begin{array}{c} k+2\\ k-r+1\end{array}\right) \zeta(r-k)(k-r+1) - \frac{k}{2}

Reindexing gives

(k+1)(k+2)\zeta(-k)=-\sum_{r'=2}^{k} \left(\begin{array}{c} k+2\\ r'\end{array}\right) \zeta(-r'+1)r'-\frac{k}{2}

Now, note that the Bernoulli numbers satisfy the condition \sum_{r=0}^{N-1} \left(\begin{array}{c} N\\ r\end{array}\right) B_r = 0 for N\ge 2.  Hence,  let \zeta(-r'+1)=-\frac{B_{r'}}{r'}

and obtain

(k+1)(k+2)\zeta(-k)=\sum_{r'=0}^{k+1} \left(\begin{array}{c} k+2\\ r'\end{array}\right) B_{r'}-B_0-(k+2)B_1-(k+2)B_{k+1}-\frac{k}{2}

which using B_0=1 and B_1=-1/2 gives the self-consistent condition

\zeta(-k)=-\frac{B_{k+1}}{k+1}

which is the analytic continuation of the zeta function for integers k\ge 1.
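The recursion (6) can be checked with exact rational arithmetic. The sketch below (function names are hypothetical) starts from \zeta(0)=-1/2, builds \zeta(-k) from (6), converts the results to Bernoulli numbers through the identification \zeta(-r'+1)=-B_{r'}/r', and then tests whether those numbers satisfy the binomial-weighted Bernoulli condition. It is only a consistency check under those assumptions, not a proof.

```python
from fractions import Fraction
from math import comb


def zeta_negative_integers(k_max):
    """Return [zeta(0), zeta(-1), ..., zeta(-k_max)] built from recursion (6)."""
    values = [Fraction(-1, 2)]  # zeta(0) = -1/2 from the s -> 0 limit
    for k in range(1, k_max + 1):
        # zeta(r - k) for r = 1..k is values[k - r], already computed.
        partial = sum(Fraction(comb(k, r), r + 1) * values[k - r]
                      for r in range(1, k + 1))
        values.append(-partial - Fraction(1, k + 2))
    return values


def implied_bernoulli(values):
    """B_0 and B_1 as given in the text; B_{k+1} = -(k+1) zeta(-k) for k >= 1."""
    b = [Fraction(1), Fraction(-1, 2)]
    for k in range(1, len(values)):
        b.append(-(k + 1) * values[k])
    return b


def bernoulli_condition(b, n):
    """Left-hand side of sum_{r=0}^{n-1} C(n, r) B_r, which should vanish for n >= 2."""
    return sum(comb(n, r) * b[r] for r in range(n))


if __name__ == "__main__":
    zetas = zeta_negative_integers(6)
    b = implied_bernoulli(zetas)
    print(zetas)
    print([bernoulli_condition(b, n) for n in range(2, len(b) + 1)])
```

Using `Fraction` avoids any rounding, so a nonzero entry in the second list would point to an algebra slip rather than floating-point noise.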

Analytic continuation

February 21, 2014

I have received some skepticism about the claim that there are other ways of assigning the sum of the natural numbers to a number other than -1/12, so I will try to be more precise. I thought it would also be useful to derive the analytic continuation of the zeta function, which I will do in a future post.  I will first give a simpler example to motivate the notion of analytic continuation. Consider the geometric series 1+s+s^2+s^3+\dots. If |s| < 1 then we know that this series is equal to

\frac{1}{1-s}                (1)

Now, while the geometric series is only convergent and thus analytic inside the unit circle, (1) is defined everywhere in the complex plane except at s=1. So even though the sum doesn’t really exist outside of the domain of convergence, we can assign a number to it based on (1). For example, if we set s=2 we can make the assignment of 1 + 2 + 4 + 8 + \dots = -1. So again, the sum of the powers of two doesn’t really equal -1, only (1) is defined at s=2. It’s just that the geometric series and (1) are the same function inside the domain of convergence. Now, it is true that the analytic continuation of a function is unique. However, although the value of -1 for s=2 is the only value for the analytic continuation of the geometric series, that doesn’t mean that the sum of the powers of 2 needs to be uniquely assigned to negative one because the sum of the powers of 2 is not an analytic function. So if you could find some other series that is a function of some parameter z that is analytic in some domain of convergence and happens to look like the sum of the powers of two for some z value, and you can analytically continue the series to that value, then you would have another assignment.
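To make the distinction concrete, here is a small Python sketch (function names are hypothetical) that compares partial sums of the geometric series with the closed form (1). Inside the unit circle the two agree; at s=2 the partial sums grow without bound while (1) still returns -1.

```python
def geometric_partial_sum(s, n_terms):
    """Partial sum 1 + s + s^2 + ... + s^(n_terms - 1)."""
    return sum(s ** n for n in range(n_terms))


def continued_value(s):
    """Closed form 1/(1 - s), defined everywhere except s = 1."""
    return 1 / (1 - s)


if __name__ == "__main__":
    # Inside the domain of convergence the partial sums approach 1/(1 - s).
    print(geometric_partial_sum(0.5, 60), continued_value(0.5))
    # Outside it, the partial sums are 2^n - 1 and keep growing,
    # while the closed form assigns the value -1.
    print([geometric_partial_sum(2, n) for n in (5, 10, 20)], continued_value(2))
```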

Now consider my example from the previous post. Consider the series

\sum_{n=1}^\infty \frac{n-1}{n^{s+1}}  (2)

This series is absolutely convergent for s>1.  Also note that if I set s=-1, I get

\sum_{n=1}^\infty (n-1) = 0 +\sum_{n'=1}^\infty n' = 1 + 2 + 3 + \dots

which is the sum of the natural numbers. Now, I can write (2) as

\sum_{n=1}^\infty\left( \frac{1}{n^s}-\frac{1}{n^{s+1}}\right)

and when the real part of s is greater than 1,  I can further write this as

\sum_{n=1}^\infty\frac{1}{n^s}-\sum_{n=1}^\infty\frac{1}{n^{s+1}}=\zeta(s)-\zeta(s+1)  (3)

All of these operations are perfectly fine as long as I’m in the domain of absolute convergence.  Now, as I will show in the next post, the analytic continuation of the zeta function to the negative integers is given by

\zeta (-k) = -\frac{B_{k+1}}{k+1}

where B_k are the Bernoulli numbers, which are given by the Taylor expansion of

\frac{x}{e^x-1} = \sum B_n \frac{x^n}{n!}   (4)

The first few Bernoulli numbers are B_0=1, B_1=-1/2, B_2 = 1/6. Thus using B_2 in the formula for \zeta(-k) gives \zeta(-1)=-1/12. A similar proof will give \zeta(0)=-1/2.  Using this in (3) at s=-1 gives \zeta(-1)-\zeta(0)=-1/12+1/2, and hence the desired result that the sum of the natural numbers is (also) 5/12.
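The arithmetic can be reproduced exactly in a few lines of Python. The sketch below (function name hypothetical) obtains the Bernoulli numbers by matching coefficients in (4): multiplying both sides by (e^x-1)/x=\sum_j x^j/(j+1)! and requiring every power of x above the zeroth to cancel. The value \zeta(0)=-1/2 is entered by hand, since the formula \zeta(-k)=-B_{k+1}/(k+1) is only used here for k\ge 1.

```python
from fractions import Fraction
from math import factorial


def bernoulli_numbers(n_max):
    """B_0..B_n_max from x/(e^x - 1) = sum B_n x^n / n! by coefficient matching."""
    b = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Coefficient of x^m: sum_{n=0}^{m} B_n / (n! (m - n + 1)!) = 0.
        acc = sum(b[n] / (factorial(n) * factorial(m - n + 1)) for n in range(m))
        b.append(-factorial(m) * acc)
    return b


b = bernoulli_numbers(4)
zeta_minus_one = -b[2] / 2       # zeta(-1) = -B_2 / 2
zeta_zero = Fraction(-1, 2)      # taken from the separate s -> 0 limit
alternative_sum = zeta_minus_one - zeta_zero  # right-hand side of (3) at s = -1

if __name__ == "__main__":
    print(b)
    print(zeta_minus_one, alternative_sum)
```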

Now this is not to say that all assignments have the same physical value. I don’t know the details of how -1/12 is used in bosonic string theory but it is likely that the zeta function is crucial to the calculation.

Nonuniqueness of -1/12

February 11, 2014

I’ve been asked in the comments to my previous post to give an example of how the sum of the natural numbers could lead to another value, so I thought it may be of general interest to more people. Consider again S=1+2+3+4\dots to be the sum of the natural numbers.  The video in the previous post gives a simple proof by combining divergent sums. In essence, the manipulation is doing renormalization by subtracting away infinities and the left over of this renormalization is -1/12. There is another video that gives the proof through analytic continuation of the Riemann zeta function

\zeta(s)=\sum_{n=1}^\infty \frac{1}{n^s}

The zeta function is only strictly convergent when the real part of s is greater than 1. However, you can use analytic continuation to extract values of the zeta function to values where the sum is divergent. What this means is that the zeta function is no longer the “same sum” per se, but a version of the sum taken to a domain where it was not originally defined but smoothly (analytically) connected to the sum. Hence, the sum of the natural numbers is given by \zeta(-1) and \zeta(0)=\sum_{n=1}^\infty 1, (infinite sum over ones). By analytic continuation, we obtain the values \zeta(-1)=-1/12 and \zeta(0)=-1/2.

Now notice that if I subtract the sum over ones from the sum over the natural numbers I still get the sum over the natural numbers, e.g.

1+2+3+4\dots - (1+1+1+1\dots)=0+1+2+3+4\dots.

Now, let me define a new function \xi(s)=\zeta(s)-\zeta(s+1) so \xi(-1) is the sum over the natural numbers and by analytic continuation \xi(-1)=-1/12+1/2=5/12 and thus the sum over the natural numbers is now 5/12. Again, if you try to do arithmetic with infinity, you can get almost anything. A fun exercise is to create some other examples.

