## New paper on steroid-regulated gene expression

February 27, 2015

Research Resource: Modulators of glucocorticoid receptor activity identified by a new high-throughput screening assay

John A. Blackford, Jr., Kyle R. Brimacombe, Edward J. Dougherty, Madhumita Pradhan, Min Shen, Zhuyin Li, Douglas S. Auld, Carson C. Chow, Christopher P. Austin, and S. Stoney Simons, Jr.

Abstract: Glucocorticoid steroids affect almost every tissue-type and thus are widely used to treat a variety of human pathologies. However, the severity of numerous side-effects limits the frequency and duration of glucocorticoid treatments. Of the numerous approaches to control off-target responses to glucocorticoids, small molecules and pharmaceuticals offer several advantages. Here we describe a new, extended high throughput screen in intact cells to identify small molecule modulators of dexamethasone-induced glucocorticoid receptor (GR) transcriptional activity. The novelty of this assay is that it monitors changes in both GR maximal activity (Amax) and EC50, or the position of the dexamethasone dose-response curve. Upon screening 1280 chemicals, ten with the greatest change in the absolute value of Amax or EC50 were selected for further examination. Qualitatively identical behaviors for 60–90% of the chemicals were observed in a completely different system, suggesting that other systems will be similarly affected by these chemicals. Additional analysis of the ten chemicals in a recently described competition assay determined their kinetically-defined mechanism and site of action. Some chemicals had similar mechanisms of action despite divergent effects on the level of GR-induced product. These combined assays offer a straightforward method of identifying numerous new pharmaceuticals that can alter GR transactivation in ways that could be clinically useful.

## Paper on new version of Plink

February 27, 2015

The paper describing the updated version of the genome analysis software tool Plink has just been published.

Second-generation PLINK: rising to the challenge of larger and richer datasets
Christopher C Chang, Carson C Chow, Laurent CAM Tellier, Shashaank Vattikuti, Shaun M Purcell, and James J Lee

GigaScience 2015, 4:7  doi:10.1186/s13742-015-0047-8

Abstract
Background
PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1’s primary data format.

Findings
To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, $O(\sqrt{n})$-time/constant-space Hardy-Weinberg equilibrium and Fisher’s exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0).

Conclusions
The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.

Keywords: GWAS; Population genetics; Whole-genome sequencing; High-density SNP genotyping; Computational statistics

This project started out with us trying to do some genomic analysis that involved computing various distance metrics on sequence space. Programming virtuoso Chris Chang stepped in and decided to write some code to speed up the computations. His program, originally called wdist, was so good and fast that we kept asking him to put in more capabilities. Eventually, he had basically replicated the suite of functions that Plink performed, so he asked Shaun Purcell, the author of Plink, whether he could just call his code Plink too, and Shaun agreed. We then ran a series of tests on various machines to check the speed-ups compared to the original Plink and gcta. If you do any GWAS analysis at all, I highly recommend you check out Plink 1.9.

## Selection of the week

February 27, 2015

English composer Ralph Vaughan Williams’s Fantasia on Greensleeves played by Ibis, who are based near here in Arlington, Virginia.

## Talk at UMBC

February 20, 2015

I’m giving a talk at University of Maryland, Baltimore County today at noon. The talk will be similar to the one I gave at SMB this past summer.  Slides can be found here.

## Selection of the week

February 20, 2015

Italian composer Luigi Boccherini’s Guitar Quintet No. 1 played by the Boccherini Ensemble.

## Why science is hard to believe

February 14, 2015

Here is an excerpt from a well written opinion piece by Washington Post columnist Joel Achenbach:

Washington Post: We live in an age when all manner of scientific knowledge — from the safety of fluoride and vaccines to the reality of climate change — faces organized and often furious opposition. Empowered by their own sources of information and their own interpretations of research, doubters have declared war on the consensus of experts. There are so many of these controversies these days, you’d think a diabolical agency had put something in the water to make people argumentative.

Science doubt has become a pop-culture meme. In the recent movie “Interstellar,” set in a futuristic, downtrodden America where NASA has been forced into hiding, school textbooks say the Apollo moon landings were faked.

I recommend reading the whole piece.

## Selection of the week

February 13, 2015

Yo-Yo Ma and Itzhak Perlman play Antonín Dvořák’s Humoresque in G-flat major with Seiji Ozawa and the Boston Symphony Orchestra.

## Plants: from roots to riches

February 11, 2015

I highly recommend this podcast series from BBC on the history and science of plants, narrated by Kathy Willis, director of science at Kew Gardens. I’ve been listening to it through podcasts of The Science Show on ABC radio.

## The tragic life of Walter Pitts

February 9, 2015

Everyone in computational neuroscience knows about the McCulloch-Pitts neuron model, which forms the foundation for neural network theory. However, I never knew anything about Warren McCulloch or Walter Pitts until I read this very interesting article in Nautilus. I had no idea that Pitts was a completely self-taught genius who impressed the likes of Bertrand Russell, Norbert Wiener and John von Neumann but was also a self-destructive alcoholic. One thing the article nicely conveys is the camaraderie and joie de vivre that intellectuals experienced in the past. Somehow this spirit seems missing now.

## Open source software for math and science

February 8, 2015

Here is a list of open source software that you may find useful.  Some, I use almost every day, some I have not yet used, and some may be so ubiquitous that you have even forgotten that it is software.

1. XPP/XPPAUT. Bard Ermentrout wrote XPP in the 1980’s as a dynamical systems tool for himself. It’s now the de facto tool for the Snowbird community.  I still find it to be the easiest and fastest way to simulate and visualize differential equations.  It includes the equally excellent bifurcation continuation software tool AUTO originally written by Eusebius Doedel with contributions from a who’s who list of mathematicians.  XPP is also available as an iPad and iPhone App.

2. Julia. I only learned about Julia this spring and now I use it for basically anything I used to use Matlab for.  Its syntax is very similar to Matlab and it’s very fast. I think it is quickly gaining a large following and may be as comprehensive as Python some day.

3. Python often seems more like a way of life than a software tool. I would probably be using Python if it were not for Julia and the fact that Julia is faster. Python has packages for everything. There is SciPy and NumPy for scientific computing, Pandas for statistics, Matplotlib for making graphs, and many more that I don’t yet know about.  I must confess that I still don’t know my way around Python but my fellows all use it.

4. R. For statistics, look no further than R, which is what academic statisticians use. It’s big in Big Data.  So big that I heard that Microsoft is planning to write a wrapper for it. I also heard that billionaire mathematician James Simons’s hedge fund Renaissance Technologies uses it.  For Bayesian inference there is now Stan, which implements Hamiltonian Monte Carlo.  We tried using it for one of our projects and had trouble getting it to work but it’s improving very fast.

5. AMS-Latex. The great computer scientist Donald Knuth wrote the typesetting language TeX in 1978 and he changed scientific publication forever. If you have ever had to struggle putting equations into MS Word, you’ll realize what a genius Knuth is. Still TeX was somewhat technical and thus LaTeX was invented as a simplified interface for TeX with built-in environments that are commonly used. AMS-Latex is a form of LaTeX that includes commands for any mathematical symbol you’ll ever need. It also has very nice equation and matrix alignment tools.

6. Maxima. Before Mathematica and Maple there was Macsyma. It was a symbolic mathematics system developed over many years at MIT starting in the 60’s. It was written in the programming language Lisp (another great open source tool but I have never used it) and was licensed by MIT to a company called Symbolics that made dedicated Lisp machines that ran Macsyma.  My thesis advisor at MIT bought one of these machines (I think it cost him something like 20 thousand dollars, which was a lot of money back then) and I used it for my thesis. I really loved Macsyma and got quite adept at it. However, as you can imagine, the Symbolics business plan really didn’t pan out and Macsyma kind of languished after the company failed. After many trials and tribulations, though, Macsyma was reborn as the open source software tool Maxima and it’s great.  I’ve been running wxMaxima and it can do everything that I ever needed Mathematica for, with the bonus that I don’t have to find and re-enter my license number every few months.

7. OpenOffice. I find it reprehensible that scientific journals force me to submit my papers in Microsoft Word. But MS Office is a monopoly and all my collaborators use it.  Data always comes to me in Excel and talks are in PowerPoint. For my talks, I use Apple Keynote, which is not open source. However, Apple likes to completely overhaul their software so my old talks are not even compatible with the most recent version. I also dislike the current version. The reason I went to Keynote is because I could embed PDFs of equations made in LaTeXiT (donation ware). However, the new version makes this less convenient. PDFs looked terrible in PowerPoint a decade ago. I have no idea if this has changed or not.  I have flirted with using OpenOffice for many years but it was never quite 100% compatible with MS Office so I could never fully dispense with Word.  However, in my push to open source, I may just write my next talk in OpenOffice.

8. Plink. The standard GWAS analysis tool is Plink, originally written by Shaun Purcell.  It’s nice but kind of slow for some computations and was not being actively updated.  It also couldn’t do some of the calculations we wanted.  So in stepped my collaborator Chris Chang, who took it upon himself to write a software tool that could do all the calculations we needed. His code was so fast and good that we started to ask him to add more and more to it. Eventually, it did almost everything that Plink and gcta (a tool for estimating heritability) could do and thus he asked Purcell if he could just call it Plink. It’s currently called Plink 1.9.

9. C/C++. We tend to forget that computer languages like C, Java, Javascript, Ruby, etc. are all open source software tools.

10. Inkscape is a very nice drawing program, an open source Adobe Illustrator if you will.

11. GNU Project. Computer scientist Richard Stallman kind of invented the concept of open software. He started the Free Software Foundation and the GNU Project, which includes GNU/Linux, the editor Emacs, gnuplot among many other things.

Probably the software tools you use most that are currently free (but may not be forever) are the browser and email. People forget how much these two ubiquitous things have completely changed our lives.  When was the last time you went to the library or wrote a letter in ink?

## Selection of the week

February 6, 2015

Yundi Li playing Frederic Chopin’s famous Fantaisie-Impromptu Op 66.

## Selection of the week

January 30, 2015

Here is “Siegfried’s Death and Funeral March” from Richard Wagner’s opera Götterdämmerung of the Ring Cycle played by the London Philharmonic conducted by Klaus Tennstedt.  This piece was used to great effect by director John Boorman in the movie Excalibur.

## Selection of the week

January 23, 2015

The twentieth century’s greatest pianist Vladimir Horowitz (arguments?) plays Domenico Scarlatti’s Keyboard Sonata in B minor, K. 87.  Baroque composers Scarlatti, George Frideric Handel, and JS Bach were all born in 1685.

## The demise of the American cappuccino

January 19, 2015

When I was a post doc at BU in the nineties, I used to go to a cafe on Commonwealth Ave just down the street from my office on Cummington Street. I don’t remember the name of the place but I do remember getting a cappuccino that looked something like this: Now, I usually get something that looks like this: Instead of a light delicate layer of milk with a touch of foam floating on rich espresso, I get a lump of dry foam sitting on super acidic burnt quasi-espresso. How did this unfortunate circumstance occur? I’m not sure but I think it was because of Starbucks. Scaling up massively means you get what the average customer wants, or Starbucks thinks they want. This then sets a standard and other cafes have to follow suit because of consumer expectations. Also, making a real cappuccino takes training and a lot of practice and there is no way Starbucks could train enough baristas. Now, I’m not an anti-Starbucks person by any means. I think it is nice that there is always a fairly nice space with free wifi on every corner but I do miss getting a real cappuccino. I believe there is a real business opportunity out there for cafes to start offering better espresso drinks.

## Selection of the week

January 16, 2015

Here is a piece by turn of the last century British composer Samuel Coleridge-Taylor, named after the poet who wrote The Rime of the Ancient Mariner. Coleridge-Taylor met with some racism because he was of mixed African descent but had achieved some renown before dying at the young age of 37.

## Sebastian Seung and the Connectome

January 11, 2015

The New York Times Magazine has a nice profile on theoretical neuroscientist Sebastian Seung this week. I’ve known Sebastian since we were graduate students in Boston in the 1980’s. We were both physicists then and both ended up in biology, though through completely different paths. The article focuses on his quest to map all the connections in the brain, which he terms the connectome. Near the end of the article, neuroscientist Eve Marder of Brandeis comments on the endeavor with the pithy remark that “If we want to understand the brain, the connectome is absolutely necessary and completely insufficient.”  The article then ends with:

Seung agrees but has never seen that as an argument for abandoning the enterprise. Science progresses when its practitioners find answers — this is the way of glory — but also when they make something that future generations rely on, even if they take it for granted. That, for Seung, would be more than good enough. “Necessary,” he said, “is still a pretty strong word, right?”

Personally, I am not sure if the connectome is necessary or sufficient although I do believe it is a worthy task. However, my hesitation is not because of what was proposed in the article, which is that we exist in a fluid world and the connectome is static. Rather, like Sebastian, I do believe that memories are stored in the connectome and I do believe that “your” connectome does capture much of the essence of “you”. Many years ago, the CPU on my computer died. Our IT person swapped out the CPU and when I turned my computer back on, it was like nothing had happened. This made me realize that everything about the computer that was important to me was stored on the hard drive. The CPU didn’t matter even though everything the computer did relied on the CPU. I think the connectome is like the hard drive and trying to figure out how the brain works from it is like trying to reverse engineer the CPU from the hard drive. You can certainly get clues from it, such as that information is stored in binary form, but I’m not sure if it is necessary or sufficient to figure out how a computer works by recreating an entire hard drive. Likewise, someday we may use the connectome to recover lost memories or treat some diseases but we may not need it to understand how a brain works.

## Implicit bias

January 7, 2015

The most dangerous form of bias is when you are unaware of it. Most people are not overtly racist but many have implicit biases that can affect their decisions.  In this week’s New York Times, Claudia Dreifus has a conversation with Stanford psychologist Jennifer Eberhardt, who has been studying implicit biases in people experimentally.  Among her many eye opening studies, she has found that convicted criminals whose faces people deem more “black” are more likely to be executed than those that are not. Chris Mooney has a longer article on the same topic in Mother Jones.  I highly recommend reading both articles.

## Journal Club

January 7, 2015

Here is the paper I’ll be covering in the Laboratory of Biological Modeling, NIDDK, Journal Club tomorrow:

Morphological and population genomic evidence that human faces have evolved to signal individual identity

Abstract: Facial recognition plays a key role in human interactions, and there has been great interest in understanding the evolution of human abilities for individual recognition and tracking social relationships. Individual recognition requires sufficient cognitive abilities and phenotypic diversity within a population for discrimination to be possible. Despite the importance of facial recognition in humans, the evolution of facial identity has received little attention. Here we demonstrate that faces evolved to signal individual identity under negative frequency-dependent selection. Faces show elevated phenotypic variation and lower between-trait correlations compared with other traits. Regions surrounding face-associated single nucleotide polymorphisms show elevated diversity consistent with frequency-dependent selection. Genetic variation maintained by identity signalling tends to be shared across populations and, for some loci, predates the origin of Homo sapiens. Studies of human social evolution tend to emphasize cognitive adaptations, but we show that social evolution has shaped patterns of human phenotypic and genetic diversity as well.

## Selection of the week

January 2, 2015

Generally, any European music written before the age of Vivaldi and the Baroque Era is called Early Music.  It is often performed with instruments that are no longer in use in traditional symphony orchestras such as the viola da gamba, the lute, and the recorder.  Here is a nice piece performed by the Opus 5 Early Music Ensemble.

## The liquidity trap

December 28, 2014

The monetary base (i.e. amount of cash and demand deposits) has risen dramatically since the financial crisis and ensuing recession.

Immediately following the plunge in the economy in 2008, credit markets seized and no one could secure loans. The immediate response of the US Federal Reserve was to lower the interest rate it gives to large banks. Between January and December of 2008, the Fed discount rate dropped from around 4% to zero but the economy kept on tanking. The next move was to use unconventional monetary policy. The Fed implemented several programs of quantitative easing where they bought bonds of all sorts. When they do so, they create money out of thin air and trade it for bonds. This increases the money supply and is how the Fed “prints money.”

In the quantity theory of money, increasing the money supply should do nothing more than increase prices and people have been screaming about looming inflation for the past five years. However, inflation has remained remarkably low. The famous bond trader Bill Gross of Pimco essentially lost his job by betting on inflation and losing a lot of money. Keynesian theory predicts that increasing the money supply can cause a short-term surge in production because it takes time for prices to adjust (sticky prices) but not when interest rates are zero (at the zero lower bound). This is called a liquidity trap and there will be neither economic stimulus nor inflation. The reason is spelled out in the IS-LM model, invented by John Hicks to quantify Keynes’s theory. The Khan Academy actually has a nice set of tutorials too. The idea is quite simple once you penetrate the economics jargon.

The IS-LM model looks at the relationship between interest rate r and the general price level/economic productivity (Y). It’s a very high level macroeconomic model of the entire economy. Even Hicks himself considered it to be just a toy model but it can give some very useful insights. Much of the second half of the twentieth century has been devoted to providing a microeconomic basis of macroeconomics in terms of interacting agents (microfoundations) to either support Keynesian models like IS-LM (New Keynesian models) or refute them (Real Business Cycle models). In many ways this tension between effective high level models and more detailed microscopic models mirrors that in biology (although it is much less contentious in biology). My take is that which model is useful depends on what question you are asking. When it comes to macroeconomics, simple effective models make sense to me.

The IS-LM model is analogous to the supply-demand model of microeconomics where the price and supply level of a product are set by the competing interests of consumers and producers. Supply increases with increasing price while demand decreases and the equilibrium is given by the intersection of these two curves. Instead of supply and demand curves, in the IS-LM model we have an Investment-Savings curve and a Liquidity-Preference-Money-supply curve. The IS curve specifies Y as a decreasing function of interest rate. The rationale is that when interest rates are low, there will be more borrowing, spending, and investment and hence more goods and services will be made and sold, which increases Y.  In the LM curve, the interest rate is an increasing function of Y because as economic activity increases there will be a greater demand for money and this will allow banks to charge more for money (i.e. raise interest rates). The model shows how government or central bank intervention can increase Y. Increased government spending will shift the IS curve to the right and thus increase Y and the interest rate. It is also argued that as Y increases, employment will also increase. Here is the figure from Wikipedia:

Likewise, increasing the money supply amounts to shifting the LM curve to the right and this also increases Y and lowers interest rates. Increasing the money supply thus increases price levels as expected.

A liquidity trap occurs if instead of the above picture, the GDP is so low that we have a situation that looks like this (from Wikipedia):

Interest rates cannot go lower than zero because otherwise people will simply just hold money instead of putting it in banks. In this case, government spending can increase GDP but increasing the money supply will do nothing. The LM curve is horizontal at the intersection with the IS curve, so sliding it rightward will do nothing to Y. This explains why the monetary base can increase fivefold and not lead to inflation or economic improvement. However, there is a way to achieve negative interest rates and that is to spur inflation. Thus, in the Keynesian framework, the only way to get out of a liquidity trap is to increase government spending or induce inflation.

The IS-LM model is criticized for many things, one being that it doesn’t take dynamics into account. In economics, dynamics are termed inter-temporal effects, which is what New Keynesian models incorporate (e.g. this paper by Paul Krugman on the liquidity trap). I think that economics would be much easier to understand if it were framed in terms of ODEs and dynamical systems language. The IS-LM model could then be written as

$\frac{dr}{dt} = [Y - F]_+ - r$

$\frac{dY}{dt} = c - r - d Y$

From here, we see that the IS-LM curves are just nullclines and obviously monetary expansion will do nothing when $Y-F <0$, which is the condition for the liquidity trap. The course of economics may have been very different if only Poincaré had dabbled in it a century ago.
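To make the nullcline picture concrete, here is a minimal Python sketch of the two equations above. It is only an illustration under my own assumptions: I read monetary expansion as raising $F$ (sliding the $r$-nullcline to the right) and fiscal expansion as raising $c$, and the forward-Euler step size, initial conditions, and parameter values are arbitrary choices, not anything calibrated. Setting both derivatives to zero gives $r^* = 0$, $Y^* = c/d$ when $c/d < F$ (the trap), and $Y^* = (c+F)/(1+d)$, $r^* = Y^* - F$ otherwise.

```python
def fixed_point(c, d, F):
    """Closed-form equilibrium (r*, Y*) of
    dr/dt = [Y - F]_+ - r,  dY/dt = c - r - d*Y  (assumes d > 0)."""
    if c / d < F:
        # Liquidity trap: [Y - F]_+ vanishes at equilibrium, so r* = 0
        # and Y* = c/d does not depend on F at all.
        return 0.0, c / d
    # Normal regime: r* = Y* - F on the r-nullcline,
    # substituted into the Y-nullcline Y = (c - r)/d.
    Y = (c + F) / (1.0 + d)
    return Y - F, Y


def simulate(c, d, F, r0=0.5, Y0=0.5, dt=0.01, steps=20000):
    """Forward-Euler integration from (r0, Y0); returns the final (r, Y).
    The step size and horizon are illustrative assumptions."""
    r, Y = r0, Y0
    for _ in range(steps):
        dr = max(Y - F, 0.0) - r   # rectified term [Y - F]_+
        dY = c - r - d * Y
        r, Y = r + dt * dr, Y + dt * dY
    return r, Y
```

With $c = d = 1$ the trap condition $c/d < F$ holds for any $F > 1$, so `fixed_point(1, 1, 2)` and `fixed_point(1, 1, 3)` return the same $Y^*$, whereas raising $c$ at fixed $F$ still moves $Y^* = c/d$. Outside the trap, for example `fixed_point(1, 1, 0.5)` versus `fixed_point(1, 1, 0.8)`, raising $F$ increases $Y^*$ and lowers $r^*$, and `simulate` with the same arguments should settle near the closed-form values.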

2014-12-29: Fixed some typos