# How to be a scientific dilettante

I have worked in a lot of disparate fields: plasma physics, nonlinear dynamics, posture control, neuroscience, inflammation, obesity, gene transcription, and population genetics. I have had a lot of fun doing this, but I definitely do not recommend it as a career path for a young person. I may have gotten away with it, but I was really lucky, and it certainly carried a cost. First of all, by working in so many fields, I lack the deep knowledge that specialists have. Thus, I don’t always see the big picture and am often a step behind others. Secondly, while I am pretty well cited, my citations are diluted over multiple fields. Thus, while my total H index (the number of papers whose citation count exceeds their rank) is pretty decent, my H index in each given field is relatively small. I thus do not have much impact in any given area. To be taken seriously as a scientist, one must be a world expert in something. The system is highly nonlinear: being pretty good at a lot of things is much worse than being really good at one thing. There is a threshold for relevance, and if you don’t cross it, it is as if you don’t exist.

However, if you do want to work in a lot of fields, the worst thing you can do is to say, “Hey, I really find field X interesting, so I’m just going to read some books and papers on it and try to do something.” I have reviewed quite a few papers where some mathematician or physicist has read a popular science book or newspaper article on a topic and then tried to publish a paper on a problem mentioned in it. I then have to tell them to read up on four decades of previous work first, and then resubmit. The way I have managed to meander through multiple fields is that someone will either contact me directly about some specific question or mention something to me in a casual setting or at a conference. I could not possibly have made any progress at all if I didn’t have great collaborators who really knew the field and the literature. Still, people constantly ask me if I still work in neuroscience, to which I can only respond, “Just because you don’t cite me doesn’t mean I don’t publish!”

# The Drake equation and the Cambrian explosion

This summer billionaire Yuri Milner announced that he would spend upwards of 100 million dollars to search for extraterrestrial intelligent life (here is the New York Times article). This quest to see if we have company started about fifty years ago when Frank Drake pointed a radio telescope at some stars. To help estimate the number of possible civilizations, $N$, Drake wrote down his celebrated equation,

$N = R_*f_p n_e f_l f_i f_c L$

where $R_*$ is the rate of star formation, $f_p$ is the fraction of stars with planets, $n_e$ is the average number of planets per star that could support life, $f_l$ is the fraction of those planets that develop life, $f_i$ is the fraction of those that develop intelligent life, $f_c$ is the fraction of civilizations that emit detectable signals, and $L$ is the length of time such civilizations emit signals.
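Since the equation is just a product of factors, it is easy to play with numerically. Here is a minimal sketch in Python; the parameter values plugged in below are purely illustrative assumptions, not estimates from the text:

```python
# Illustrative Drake-equation calculation. The parameter values are
# placeholders chosen only to show how the product behaves.

def drake(R_star, f_p, n_e, f_l, f_i, f_c, L):
    """Expected number of detectable civilizations, N = R* fp ne fl fi fc L."""
    return R_star * f_p * n_e * f_l * f_i * f_c * L

# Example: 1 star formed per year, half with planets, 2 habitable planets
# each, and deliberately small fractions for the biological and
# technological steps (all assumed numbers).
N = drake(R_star=1.0, f_p=0.5, n_e=2.0, f_l=1e-3, f_i=1e-4, f_c=0.1, L=1e4)
print(N)  # roughly 1e-4 for these inputs
```

Note how the two small fractions dominate: changing $f_l$ or $f_i$ by a factor of ten moves $N$ by the same factor, which is why the unknown biological terms control the whole estimate.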

The past few years have demonstrated that planets in the galaxy are likely to be plentiful, and although the technology to locate earth-like planets does not yet exist, my guess is that they will also be plentiful. So does that mean that it is just a matter of time before we find ET? I’m going to go on record here and say no. My guess is that life is rare and intelligent life may be so rare that there could be only one civilization at a time in any given galaxy.

While we are now filling in the numbers for the first few factors on the right-hand side of Drake’s equation, we have absolutely no idea about the remaining factors. However, I have good reason to believe that their product is astronomically small, and that reason is statistical independence. Although Drake factored the probability of intelligent life into the probability that life forms times the probability that it goes on to develop extra-planetary communication capability, there are actually a lot of factors in between. One striking example is the probability of the formation of multicellular life. For the better part of three and a half billion years of earth’s history, we had mostly single-celled life and maybe a smattering of multicellular experiments. Then suddenly, about half a billion years ago, came the Cambrian Explosion, when the multicellular animal life from which we are descended burst onto the scene. This implies that forming multicellular life is extremely difficult, and it is easy to envision an earth where it never formed at all.

We can continue. If it weren’t for an asteroid impact, the dinosaurs may never have gone extinct and mammals may never have risen to dominance. Even more recently, there seem to have been many species of advanced primates, yet only one invented radios. Agriculture developed only ten thousand years ago, which means that modern humans took about a hundred thousand years to discover it, and it arose in only a handful of places. I think it is equally plausible that humans could have gone extinct like all of our other Australopithecus and Homo cousins. Life in the sea has existed much longer than life on land, and there is no technologically advanced sea creature, although I do think octopuses, dolphins, and whales are intelligent.

We have around 100 billion stars in the galaxy, and let’s just say that each has a habitable planet. Well, if the probability of each stage of life is one in a billion and we need, say, three stages to attain technology, then the probability of finding ET is one in $10^{16}$. I would say that this is an optimistic estimate. Probabilities get small really quickly when you multiply them together. The probability of single-celled life will be much higher, so there could be a hundred planets in our galaxy that have life, but the chance that one of those is within a hundred light years will again be very low. However, I do think it is a worthwhile exercise to look for extraterrestrial life, especially for oxygen or other biologically produced gases in the atmospheres of exoplanets. It could tell us a lot about biology on earth.
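The back-of-envelope arithmetic in this paragraph can be checked in a couple of lines; the one-in-a-billion stage probability and the three stages are the assumptions stated in the text:

```python
# Back-of-envelope version of the independence argument: even with a
# habitable planet around every star, a few independent one-in-a-billion
# steps drive the expected number of civilizations far below one.

n_stars = 1e11       # ~100 billion stars in the galaxy
p_per_stage = 1e-9   # assumed probability of each evolutionary stage
n_stages = 3         # e.g. life, multicellularity, technology

expected_civilizations = n_stars * p_per_stage ** n_stages
print(expected_civilizations)  # on the order of 1e-16
```

With only one stage instead of three, the same arithmetic gives an expectation of a hundred life-bearing planets, which is the "much higher" single-celled case mentioned above.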

2015-10-1: I corrected a factor of 10 error in some of the numbers.

# The world of Gary Taubes

Science writer Gary Taubes has a recent New York Times commentary criticizing Kevin Hall’s recent paper on the differential metabolic effects of low fat vs low carbohydrate diets. See here for my recent post on the experiment. Taubes is probably best known for his views on nutrition and as an advocate for low carb diets although he has two earlier books on the sociology of physics. The main premise running through his four books is that science is susceptible to capture by the vanity, ambition, arrogance, and plain stupidity of scientists. He is pro-science but anti-scientist.

His first book on nutrition, Good Calories, Bad Calories, was about how the medical establishment, and in particular nutritionists, have provided wrong and potentially dangerous dietary advice for decades. He takes direct aim at Ancel Keys as one of the main culprits for pushing the reduction of dietary fat to prevent heart disease. The book is a great read and clearly demonstrates Taubes’s sharp mind and gifts as a storyteller. In the course of researching the book, Taubes also learned about the biological actions of insulin, and this is what has mostly shaped his thinking about carbohydrates and obesity. He spells it out in more detail in his subsequent book, Why We Get Fat. I think that these two books are a perfect demonstration of why having a little knowledge and a high IQ can be a dangerous thing.

Most people know of insulin as the hormone that goes awry in diabetes. When we fast, our insulin levels are low and our body, except for our brain, burns fat. If we then ingest carbohydrates, our insulin levels rise, which induces our body to utilize glucose (the main fuel in carbs) in favour of fat. Exercise will also cause a switch in fuel choice from fat to glucose. What is less well known is that insulin also suppresses the release of fat from fat cells (adipocytes), which is something I have modeled (see here). This seems to have been a revelation to Taubes: clearly, if you eat lots of carbs, you will have lots of insulin, which will sequester fat in fat cells. Ergo, eating carbs makes you fat! Nutritionists were so focused on their poorly designed studies that they missed the blatantly obvious. This is just another example of how arrogant scientists get things wrong.

Taubes then proposed a simple experiment: take two groups of people, put one group on a high carb diet and the other on a low carb diet with the same caloric content, and see who loses weight. Well, Kevin Hall anticipated this request with basically the same experiment, although for a different purpose. What Kevin noticed in his model was that if you cut carbs and keep everything else the same, insulin goes down and the body responds by burning much more fat. However, if you cut fat, there is nothing in the model that tells the body that the fat is missing. Insulin doesn’t change, and thus the body just burns the same amount of carbs as before. He found this puzzling. Surely there must be a fat detector that we don’t know about, so he set out to test it. I remember him and his fellows labouring diligently for what seemed like years writing the protocol and getting the necessary approvals and resources to do the experiment. The result was exactly as the model predicted: we really don’t have a fat sensor. However, the subjects lost more fat on the low fat diet than they did on the low carb diet. This is not exactly the experiment Taubes wanted, which was to change the macronutrient composition but keep the calories the same. He hypothesized that those on the low carb diet would lose weight and those on the low fat, high carb diet would gain weight. Kevin and a consortium of top obesity researchers have since done that experiment, and the results will come out shortly.

Now is this surprising? Well, not really. While Taubes is absolutely correct that insulin suppresses fat utilization, the net outcome of insulin reduction is a quantitative, not a qualitative, question. You cannot deduce the outcome with formal logic. The reason is that insulin cannot be elevated all the time. Even a continuous grazer must sleep at some point, whereupon insulin falls. You must then consider the net effect of high and low insulin over a day or longer to assess the outcome. This can only be determined empirically, and this is what Taubes fails to see or accept. He also commits a logical fallacy: just because a scientist is stupid doesn’t mean he is wrong.
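To see why this is quantitative, consider a deliberately crude toy calculation (this is not Kevin Hall’s model; every number below is invented for illustration): daily fat oxidation depends on how many hours insulin is elevated and on the oxidation rate in each state.

```python
# Toy illustration of why the insulin argument is quantitative: what
# matters is fat oxidation integrated over a whole day, not whether
# insulin suppresses it moment to moment. All rates are made up.

def daily_fat_burned(hours_high_insulin, rate_high=2.0, rate_low=10.0):
    """Grams of fat oxidized per day given hours with insulin elevated.

    rate_high: assumed g/h fat oxidation while insulin is high (suppressed)
    rate_low:  assumed g/h fat oxidation while insulin is low (e.g. overnight)
    """
    hours_low = 24 - hours_high_insulin
    return hours_high_insulin * rate_high + hours_low * rate_low

# Even a grazer with insulin elevated 16 h/day still burns substantial
# fat during the remaining 8 low-insulin hours:
print(daily_fat_burned(16))  # 16*2 + 8*10 = 112.0 g
```

Whether such a person gains or loses fat then depends on how that daily total compares with fat intake, and on the actual values of the rates and durations, which can only be measured, not deduced.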

Taubes’s recent commentary criticizes Kevin’s experiment by saying that 1) the diet is impossible to follow and 2) it ignores appetite. The response to the first point is that the experiment was meant to test a metabolic hypothesis, not the effect of a diet. My response to his second point is to stare agape. When Taubes visited NIH a few years ago, after his Good Calories, Bad Calories book came out, I offered the hypothesis that low carb diets could suppress appetite, and this could be why they may be effective in reducing weight. However, he had no interest in this idea, and Kevin has told me that he has repeatedly shown no interest in it. (I don’t need to give details on how people have been interested in appetite for decades since it is well covered in this post.) I came to the conclusion that appetite control was the primary driver of the obesity epidemic shortly after arriving at NIH. In fact, my first BSC presentation was on this topic. The recommendation by the committee was that I should do something else and that NIH was a bad fit for me. However, I am still here, and I still believe appetite control is the key.

# Hopfield on the difference between physics and biology

Here is a short essay by theoretical physicist John Hopfield, of Hopfield net and kinetic proofreading fame, among many other things (hat tip to Steve Hsu). I think much of the hostility of biologists towards physicists and mathematicians that Hopfield talks about has dissipated over the past 40 years, especially amongst the younger set. In fact, these days a good share of Cell, Science, and Nature papers have some computational or mathematical component. However, the trend is towards brute force, big data analysis rather than the simple, elegant conceptual advances that Hopfield was famous for. In the essay, Hopfield gives several anecdotes and summarizes them with pithy words of advice. The one that everyone should really heed, and one I try to always follow, is “Do your best to make falsifiable predictions. They are the distinction between physics and ‘Just So Stories.’”

# Jobs at the Allen Institute

The Allen Institute for Brain Science is currently recruiting. The positions are listed at this link, but in particular see below:

SCIENTIST I – MODELING, ANALYSIS AND THEORY: The Modeling, Analysis and Theory team at the Allen Institute is seeking a candidate with strong mathematical and computational skills who will work closely with the team and with experimentalists, both to maximize the potential of datasets and to realize that potential via analysis and theory. The successful candidate will be expected to develop analyses for populations of neurons and to establish theoretical results on cortical computation, object recognition, and related areas in order to aid the Institute in understanding the most complex piece of matter in the universe. Contact: Michael Buice, Scientist II, michaelbu@alleninstitute.org

# Why science is hard to believe

Here is an excerpt from a well written opinion piece by Washington Post columnist Joel Achenbach:

Washington Post: We live in an age when all manner of scientific knowledge — from the safety of fluoride and vaccines to the reality of climate change — faces organized and often furious opposition. Empowered by their own sources of information and their own interpretations of research, doubters have declared war on the consensus of experts. There are so many of these controversies these days, you’d think a diabolical agency had put something in the water to make people argumentative.

Science doubt has become a pop-culture meme. In the recent movie “Interstellar,” set in a futuristic, downtrodden America where NASA has been forced into hiding, school textbooks say the Apollo moon landings were faked.

I recommend reading the whole piece.

# Open source software for math and science

Here is a list of open source software that you may find useful. Some I use almost every day, some I have not yet used, and some may be so ubiquitous that you have forgotten it is even software.

1. XPP/XPPAUT. Bard Ermentrout wrote XPP in the 1980s as a dynamical systems tool for himself. It’s now the de facto tool for the Snowbird (SIAM dynamical systems) community. I still find it to be the easiest and fastest way to simulate and visualize differential equations. It includes the equally excellent bifurcation continuation tool AUTO, originally written by Eusebius Doedel with contributions from a who’s who list of mathematicians. XPP is also available as an iPad and iPhone app.

2. Julia. I only learned about Julia this spring, and now I use it for basically anything I used to use Matlab for. Its syntax is very similar to Matlab’s, and it’s very fast. I think it is quickly gaining a large following and may be as comprehensive as Python some day.

3. Python. Python often seems more like a way of life than a software tool. I would probably be using it if it were not for Julia and the fact that Julia is faster. Python has packages for everything: SciPy and NumPy for scientific computing, Pandas for data analysis and statistics, Matplotlib for making graphs, and many more that I don’t yet know about. I must confess that I still don’t know my way around Python, but my fellows all use it.

4. R. For statistics, look no further than R, which is what academic statisticians use. It’s big in Big Data, so big that I heard Microsoft is planning to write a wrapper for it. I also heard that billionaire mathematician James Simons’s hedge fund, Renaissance Technologies, uses it. For Bayesian inference there is now Stan, which implements Hamiltonian Monte Carlo. We tried using it for one of our projects and had trouble getting it to work, but it’s improving very fast.

5. AMS-LaTeX. The great computer scientist Donald Knuth wrote the typesetting language TeX in 1978, and he changed scientific publication forever. If you have ever struggled to put equations into MS Word, you’ll appreciate what a genius Knuth is. Still, TeX was somewhat technical, so Leslie Lamport created LaTeX as a simplified interface to TeX with built-in environments for the structures people commonly use. AMS-LaTeX extends LaTeX with commands for any mathematical symbol you’ll ever need, along with very nice equation and matrix alignment tools.
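As a small illustration of the alignment tools, here is a minimal sketch using amsmath’s align environment (the equations themselves, one of them the Drake equation from an earlier post, are just placeholders):

```latex
% Minimal AMS-LaTeX example of aligned equations.
\documentclass{article}
\usepackage{amsmath}  % core AMS math package
\begin{document}
\begin{align}
  N      &= R_* f_p n_e f_l f_i f_c L \\
  \log N &= \log R_* + \log f_p + \log n_e
            + \log f_l + \log f_i + \log f_c + \log L
\end{align}
\end{document}
```

The `&` marks the alignment point in each row, so the equals signs line up down the page, which is the kind of convenience that makes Word equation editing feel painful by comparison.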

6. Maxima. Before Mathematica and Maple there was Macsyma, a symbolic mathematics system developed over many years at MIT starting in the 1960s. It was written in the programming language Lisp (another great open source tool, though I have never used it) and was licensed by MIT to a company called Symbolics that made dedicated Lisp machines running Macsyma. My thesis advisor at MIT bought one of these machines (I think it cost him something like 20 thousand dollars, which was a lot of money back then) and I used it for my thesis. I really loved Macsyma and got quite adept at it. However, as you can imagine, the Symbolics business plan didn’t pan out, and Macsyma languished after the company failed. After many trials and tribulations, though, Macsyma was reborn as the open source tool Maxima, and it’s great. I’ve been running wxMaxima, and it can do everything I ever needed Mathematica for, with the bonus that I don’t have to find and re-enter my license number every few months.

7. OpenOffice. I find it reprehensible that scientific journals force me to submit my papers in Microsoft Word. But MS Office is a monopoly, and all my collaborators use it. Data always comes to me in Excel, and talks are in PowerPoint. For my talks, I use Apple Keynote, which is not open source. However, Apple likes to completely overhaul its software, so my old talks are not even compatible with the most recent version, which I also dislike. The reason I went to Keynote is that I could embed PDFs of equations made in LaTeXiT (donationware), but the new version makes this less convenient. PDFs looked terrible in PowerPoint a decade ago; I have no idea whether this has changed. I have flirted with using OpenOffice for many years, but it was never quite 100% compatible with MS Office, so I could never fully dispense with Word. However, in my push to open source, I may just write my next talk in OpenOffice.

8. Plink. The standard GWAS analysis tool is Plink, originally written by Shaun Purcell. It’s nice, but kind of slow for some computations, and it was not being actively updated. It also couldn’t do some of the calculations we wanted. So in steps my collaborator Chris Chang, who took it upon himself to write a software tool that could do all the calculations we needed. His code was so fast and good that we started to ask him to add more and more to it. Eventually, it did almost everything that Plink and GCTA (a tool for estimating heritability) could do, and thus he asked Purcell if he could just call it Plink. It’s currently called Plink 1.9.

9. C/C++. We tend to forget that computer languages like C, Java, JavaScript, Ruby, etc. are all open source software tools.

10. Inkscape. Inkscape is a very nice drawing program, an open source Adobe Illustrator if you will.

11. GNU Project. Computer scientist Richard Stallman kind of invented the concept of free software. He started the Free Software Foundation and the GNU Project, which includes the GNU tools behind GNU/Linux and the editor Emacs, among many other things (gnuplot, despite its name, is open source but not actually a GNU package).

Probably the software tools you use most that are currently free (but may not be forever) are the browser and email. People forget how much these two ubiquitous things have completely changed our lives.  When was the last time you went to the library or wrote a letter in ink?