Paper on compressed sensing and genomics

October 14, 2013

New paper on the arXiv. The next step after the completion of the Human Genome Project was the search for genes associated with diseases such as autism or diabetes. However, after spending hundreds of millions of dollars, we find that there are very few common variants of genes with large effects. This doesn’t mean that there aren’t genes with large effects. The growth hormone gene definitely has a large effect on height. It just means that variations of genes that are common among people have small effects on the phenotype. Given the results of Fisher, Wright, Haldane and colleagues, this was probably expected as the most likely scenario, and recent results measuring narrow-sense heritability directly from genetic markers (e.g. see this) confirm this view.

Current GWAS microarrays consider about a million or two markers and this is increasing rapidly. Narrow-sense heritability refers to the additive or linear genetic variance, which means the phenotype is given by the linear model y = Z\beta + \eta, where y is the phenotype vector, Z is the genotype matrix, \beta is the vector of genetic effects we want to recover, and \eta contains all the nonadditive components including environmental effects. This is a classic linear regression problem. The problem comes when the number of coefficients in \beta far exceeds the number of people in your sample, which is the case for genomics. Compressed sensing is a field of high dimensional statistics that addresses this specific problem. People such as David Donoho, Emmanuel Candes and Terence Tao have proven under fairly general conditions that if the nonzero coefficients are sparse, i.e. few in number compared to the number of samples, then the effects can be completely recovered using L1 penalized optimization algorithms such as the lasso or approximate message passing. In this paper, we show that these ideas can be applied to genomics.
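To make the idea concrete, here is a minimal sketch of L1 penalized recovery on toy data. It is not the code from the paper: the Gaussian matrix standing in for the genotype matrix, the problem sizes, the penalty lam and the helper names (soft_threshold, ista_lasso, simulate) are all assumptions chosen for illustration, and the solver is plain iterative soft thresholding rather than the algorithm we used. The noise is tiny, so this corresponds roughly to the heritability near one regime.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t (the proximal map of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_lasso(Z, y, lam, n_iter=2000):
    """Minimize 0.5*||y - Z b||^2 + lam*||b||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(Z, 2) ** 2   # 1 / largest eigenvalue of Z^T Z
    b = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        grad = Z.T @ (Z @ b - y)             # gradient of the squared-error term
        b = soft_threshold(b - step * grad, step * lam)
    return b

def simulate(n=100, p=200, s=3, noise_sd=0.01, seed=0):
    """Toy data: n people, p markers, only s markers with nonzero effect."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, p)) / np.sqrt(n)   # assumed stand-in for a standardized genotype matrix
    support = rng.choice(p, size=s, replace=False)
    beta = np.zeros(p)
    beta[support] = rng.choice([-1.0, 1.0], size=s)
    y = Z @ beta + noise_sd * rng.standard_normal(n)
    return Z, y, beta, support

Z, y, beta, support = simulate()
beta_hat = ista_lasso(Z, y, lam=0.05)
top = np.argsort(np.abs(beta_hat))[-len(support):]   # markers with the largest estimated effects
print(sorted(int(i) for i in support), sorted(int(i) for i in top))
```

Here there are twice as many markers as people, so ordinary least squares is underdetermined, yet the few nonzero effects should still stand out in the penalized estimate.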

Here is Steve Hsu’s summary of the paper

Application of compressed sensing to genome wide association studies and genomic selection

Shashaank Vattikuti, James J. Lee, Stephen D. H. Hsu, Carson C. Chow
(Submitted on 8 Oct 2013)

We show that the signal-processing paradigm known as compressed sensing (CS) is applicable to genome-wide association studies (GWAS) and genomic selection (GS). The aim of GWAS is to isolate trait-associated loci, whereas GS attempts to predict the phenotypic values of new individuals on the basis of training data. CS addresses a problem common to both endeavors, namely that the number of genotyped markers often greatly exceeds the sample size. We show using CS methods and theory that all loci of nonzero effect can be identified (selected) using an efficient algorithm, provided that they are sufficiently few in number (sparse) relative to sample size. For heritability h^2 = 1, there is a sharp phase transition to complete selection as the sample size is increased. For heritability values less than one, complete selection can still occur although the transition is smoothed. The transition boundary is only weakly dependent on the total number of genotyped markers. The crossing of a transition boundary provides an objective means to determine when true effects are being recovered. For h^2 = 0.5, we find that a sample size that is thirty times the number of nonzero loci is sufficient for good recovery.

Comments: Main paper (27 pages, 4 figures) and Supplement (5 figures) combined
Subjects: Genomics (q-bio.GN); Applications (stat.AP)
Cite as: arXiv:1310.2264 [q-bio.GN]
(or arXiv:1310.2264v1 [q-bio.GN] for this version)

Happiness and divisive inhibition

October 9, 2013

The Wait But Why blog has an amusing post on why Generation Y yuppies (GYPSYS) are unhappy, which I found through the blog of Michigan economist Miles Kimball. In short, it is because their expectations exceed reality and they are entitled. What caught my eye was that they defined happiness as “Reality-Expectations”. The key point is that this is a subtractive expression. My college friend Peter Lee, now Professor and Director of the University of Manchester X-Ray imaging facility, used to define happiness as “desires fulfilled beyond expectations”. I always interpreted this as a divisive quantity, meaning “Reality/Expectations”.

Now, the definition does have implications if we actually try to use it as a model for how happiness would change with some quantity like money. For example, consider the model where reality and expectations are both proportional to money. Then happiness = a*money - b*money. As long as b is less than a, money always buys happiness, but if a is less than b then more money brings more unhappiness. However, if we consider the divisive model of happiness then happiness = (a*money)/(b*money) = a/b and happiness doesn’t depend on money at all.
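The two toy models are easy to compare numerically. This is only a sketch of the arithmetic above; the function names and the values a = 2, b = 1 are made up for illustration.

```python
def happiness_subtractive(money, a, b):
    """Reality minus expectations, with both proportional to money."""
    return a * money - b * money

def happiness_divisive(money, a, b):
    """Reality divided by expectations, with both proportional to money."""
    return (a * money) / (b * money)

for money in (1.0, 10.0, 100.0):
    print(money,
          happiness_subtractive(money, a=2.0, b=1.0),
          happiness_divisive(money, a=2.0, b=1.0))
```

The subtractive column grows with money while the divisive column stays fixed at a/b.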

However, the main reason I bring this up is because it is analogous to the two possible ways to model inhibition (or adaptation) in neuroscience. The neurons in the brain generally interact with each other through two types of synapses – excitatory and inhibitory. Excitatory synapses generally depolarize a neuron and bring its potential closer to threshold whereas inhibitory synapses hyperpolarize the neuron and move it farther from threshold (although there are ways this can be violated). For neurons receiving stationary asynchronous inputs, we can consider the firing rate to be some function of the excitatory E and inhibitory I inputs. In subtractive inhibition, the firing rate would have the abstract form f(E-I) whereas for divisive inhibition it would have the form f(E)/(I+C), where f is some thresholded gain function (i.e. zero below threshold, positive above threshold) and C is a constant to prevent the firing rate from reaching infinity. There are some critical differences between subtractive and divisive inhibition. Divisive inhibition works by reducing the gain of the neuron, i.e. it makes the slope of the gain function shallower, while subtractive inhibition makes the threshold effectively higher. These properties have great computational significance, which I will get into in a future post.
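A small sketch makes the difference visible. I assume a threshold-linear gain function with threshold theta = 1 and C = 1; these choices, and the function names, are illustrative only and any thresholded gain function would do.

```python
def gain(x, theta=1.0):
    """Threshold-linear gain function: zero below theta, linear above it."""
    return max(0.0, x - theta)

def rate_subtractive(E, I, theta=1.0):
    """Subtractive inhibition: f(E - I)."""
    return gain(E - I, theta)

def rate_divisive(E, I, C=1.0, theta=1.0):
    """Divisive inhibition: f(E) / (I + C)."""
    return gain(E, theta) / (I + C)

excitation = (1.0, 2.0, 3.0, 4.0, 5.0)
for I in (0.0, 1.0, 3.0):
    sub = [rate_subtractive(E, I) for E in excitation]
    div = [rate_divisive(E, I) for E in excitation]
    print(I, sub, div)
```

As I increases, the subtractive rates keep unit slope but start rising at a larger E, whereas the divisive rates always start rising at E = theta but with slope 1/(I + C).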

The cost of the shutdown and sequester

October 7, 2013

People may be wondering how the US government shutdown is affecting the NIH. I can’t speak for the rest of the institutes but I was instructed to not come to work and to not use my NIH email account or NIH resources. Two new fellows, who were supposed to begin on Oct 1, now have to wait and they will not be compensated for the missed time even if Congress does decide to give back pay to the furloughed employees. I really was hoping for them to start in August or September but that was pushed back because of the Sequester (have people forgotten about that?), which cut our budgets severely. In fact, because of the Sequester, I wasn’t able to hire one fellow because the salary requirements for their seniority exceeded my budget. We were just starting to get some really interesting psychophysics results on ambiguous stimuli but that had to be put on hold because we couldn’t immediately replace fellow Phyllis Thangaraj, who was running the experiments and left this summer to start her MD/PhD degree at Columbia. Now it will be delayed even further. I have several papers in the revision process that have also been delayed by the shutdown. All travel has been cancelled and I heard that people at conferences were ordered to return immediately, including those who were on planes on Oct 1. My quadrennial external review this week has now been postponed. All the flights for the committee and ad hoc members have to be cancelled and we now have to find another date that 20 or more people can agree on. All NIH seminars and the yearly NIH research festival have been cancelled. I was supposed to review an external NIH research proposal this week and that has been postponed indefinitely along with all other submitted proposals awaiting review. Academic labs, students and postdocs depending on their NIH grants this fiscal year will be without funding until the government is reopened. Personally, I will probably come out of this reasonably intact. However, I do worry how this will affect young people, who are the future.

Heritability and additive genetic variance

October 4, 2013

Most people have an intuitive notion of heritability being the genetic component of why close relatives tend to resemble each other more than strangers. More technically, heritability is the fraction of the variance of a trait within a population that is due to genetic factors. This is the pedagogical post on heritability that I promised in a previous post on estimating heritability from genome wide association studies (GWAS).

One of the most important facts about uncertainty, and something that everyone should know but often doesn’t, is that when you add two imprecise quantities together, while the average of the sum is the sum of the averages of the individual quantities, the total error (i.e. standard deviation) is not the sum of the standard deviations but the square root of the sum of the squares of the standard deviations (i.e. of the variances). In other words, when you add two uncorrelated noisy variables, the variance of the sum is the sum of the variances. Hence, the error grows as the square root of the number of quantities you add and not linearly as had been assumed for centuries. There is a great article in the American Scientist from 2007 called The Most Dangerous Equation giving a history of some calamities that resulted from not knowing about how variances sum. The variance of a trait can thus be expressed as the sum of the genetic variance and environmental variance, where environment just means everything that is not correlated to genetics. The heritability is the ratio of the genetic variance to the trait variance.
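A quick simulation shows both points. The standard deviations of 3 and 4 for the genetic and environmental contributions are arbitrary numbers picked so the arithmetic is easy to check; nothing here is estimated from real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

genetic = rng.normal(0.0, 3.0, size=n)        # standard deviation 3, variance 9
environment = rng.normal(0.0, 4.0, size=n)    # standard deviation 4, variance 16
trait = genetic + environment                 # sum of two uncorrelated contributions

sd_trait = trait.std()
heritability = genetic.var() / trait.var()

print(f"sd of sum: {sd_trait:.2f} (naive 3 + 4 = 7, but sqrt(9 + 16) = 5)")
print(f"heritability: {heritability:.2f} (theory: 9 / 25 = 0.36)")
```

The simulated standard deviation of the trait should land near 5 rather than 7, and the ratio of genetic to trait variance near 0.36.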


Government shutdown

October 1, 2013

As of today, I am officially furloughed without pay since the NIH is officially closed and nonessential employees like myself are barred from working without pay by the Antideficiency Act of 1884. However, given that blogging is not considered an official duty, I can continue to post to Scientific Clearing House. Those who are not up on American politics may be wondering why the US government has shut down. The reason is that the US fiscal year begins on Oct 1 and according to the US Constitution, only Congress can appropriate funds for the functioning of government and they did not pass a budget for the new fiscal year by midnight of September 30. Actually, Congress has not passed a budget on time in recent years but has passed Continuing Resolutions (CRs) to keep the government going. So why have they not passed a budget or a CR this year? Well, currently the US government is divided with the Democratic party controlling the Senate and Presidency and the Republican party controlling the House of Representatives. All three entities must agree for a law to pass. Three years ago, the Democrats controlled the Congress, which includes both the House and Senate, and passed the Affordable Care Act, also known as Obamacare, which the President signed into law. The Republicans took control of the House in 2011 and have been trying to repeal the ACA ever since but have been stopped by the Senate. This year they decided to try a new tactic, which was to pass a budget that withholds funding for the ACA. The Senate did not agree, passed a budget with the ACA and sent it back to the House, which then took out funding for the ACA again with some modifications and sent it back. This went back and forth without converging to an agreement and thus we are closed today.

David Wright, 1963-2013

September 26, 2013

I guess this is just a week for sadness. My high school friend, David Wright, died this morning at the age of 50. He collapsed at his desk, probably of a heart attack. Dave was a natural athlete and storyteller. He was always there for you if you needed someone. He will be missed.

Epic comeback

September 25, 2013

Well, I spoke too soon in my earlier post on the America’s Cup. Oracle Team USA has since won 7 races in a row and now it is 8-8 in the best of 17 match (although they have already had 18 races). The final race to determine the winner is today. Check out the action here. In the past, America’s Cup races had usually been best of 3 or best of 5 matches. In this new format, the races are much shorter, taking less than an hour rather than several, and they try to get in two a day if the weather permits. In the beginning New Zealand had the faster boat. They had already been racing for over a month in the challenger series and were just better than Oracle. However, the long format and some weather delays have given Oracle a chance to get up to speed and now they are definitely the faster boat. Yesterday, they flew by New Zealand on the upwind leg. The only chance New Zealand has to win today is if Oracle makes a mistake.

Richard Azuma, 1930 – 2013

September 20, 2013

I was saddened to learn that Richard “Dick” Azuma, who was a professor in the University of Toronto Physics department from 1961 to 1994 and emeritus after that, passed away yesterday. He was a nuclear physicist par excellence and chair of the department when I was there as an undergraduate in the early ’80s. I was in the Engineering Science (physics option) program, which was an enriched engineering program at UofT. I took a class in nuclear physics with Professor Azuma during my third year. He brought great energy and intuition to the topic. He was one of the few professors I would talk to outside of class and one day I asked if he had any open summer jobs. He went out of his way to secure a position for me at the nuclear physics laboratory TRIUMF in Vancouver in 1984. That was the best summer of my life. The lab was full of students from all over Canada and I remain good friends with many of them today. I worked on a meson scattering experiment and although I wasn’t of much use to the experiment I did get to see first hand what happens in a lab. I wrote a 4th year thesis on some of the results from that experiment. I last saw Dick in 2010 when I went to Toronto to give a physics colloquium. He was still very energetic and as engaged in physics as ever. We will all miss him greatly.

America’s Cup 2013

September 19, 2013

Today may be the last race for the America’s Cup yacht series between the US and New Zealand.   Here are the highlights from the last race.

It is a best of 17 series and New Zealand has 8 wins so today may be the last chance to watch these hundred million dollar multihull yachts fly around San Francisco harbour at close to 50 miles per hour. All the races are posted on YouTube.

Coase and the Nature of the Firm

September 12, 2013

Economist and Nobel Laureate Ronald Coase died earlier this month just three months short of his 103rd birthday. Coase is mostly famous for two papers: “The Nature of the Firm” (1937) and “The Problem of Social Cost” (1960). He came up with many of the ideas for the first paper when he was just 21. Coase asked the simple question of why companies exist. According to Adam Smith it should actually be more cost-effective for a person to contract out work rather than hire people. Coase’s answer was that there are always transaction costs or frictions that make a firm more cost-effective. In other words, the market (i.e. price mechanism) is not always the most efficient way to organize production. The size of a firm is determined by the point when the extra (marginal) cost of organizing an extra employee balances the transaction costs of obtaining her services on the free market. Hence, the great irony of modern capitalism is that its main pillar, the firm, is a paragon of central planning. Firms in essence are totalitarian regimes where the citizens are free to leave.

Conservative and libertarian leaning individuals generally prize  private companies and free markets over governments. They argue that many of the functions of government, such as schools and healthcare, would be more efficient if privatized. The question then is why are private firms more efficient than government? When we hand over functions formerly performed by a democratically elected government, we are in essence making society less democratic. One could argue that firms are more efficient because they are subject to competition. That is why we want to break up monopolies. However, that should be true of government too. If we don’t like the government we have then we can always elect another one. We can even change the constitution to our liking. In principle, no one is under more competition than our elected officials. It is the job of the citizenry to ensure that they are doing their job.

TB, streptomycin, and who gets credit

September 4, 2013

The Science Show did a feature story recently about the discovery of streptomycin, the first antibiotic to treat tuberculosis, which had killed 2 billion people in the 18th and 19th centuries. Streptomycin was discovered by graduate student Albert Schatz in 1943, who worked in the lab of Professor Selman Waksman at Rutgers. Waksman was the sole winner of the 1952 Nobel Prize for this work. The story is narrated by the author of the book Experiment Eleven, who paints Waksman as the villain and Schatz as the victim. Evidently, Waksman convinced Schatz to sign away his patent rights to Rutgers but secretly negotiated a deal to obtain 20% of the royalties. When Schatz discovered this, he sued Waksman and obtained a settlement. However, this turned the scientific community against him and forced him out of microbiology into science education. To me, this is just more evidence that prizes and patents are incentives for malfeasance.

New paper on population genetics

August 28, 2013

James Lee and I just published a paper entitled “The causal meaning of Fisher’s average effect” in the journal Genetics Research. The paper can be obtained here. This paper is the brainchild of James and I just helped him out with some of the proofs.  James’s take on the paper can be read here. The paper resolves a puzzle about the incommensurability of Ronald Fisher’s two definitions of the average effect noted by population geneticist D.S. Falconer three decades ago.

Fisher was well known for both brilliance and obscurity and people have long puzzled over the meaning of some of his work.  The concept of the average effect is extremely important for population genetics but it is not very well understood. The field of population genetics was invented in the early twentieth century by luminaries such as Fisher, Sewall Wright, and JBS Haldane to reconcile Darwin’s theory of evolution with Mendelian genetics. This is a very rich field that has been somewhat forgotten. People in mathematical,  systems, computational, and quantitative biology really should be fully acquainted with the field.

For those who are unacquainted with genetics, here is a quick primer to understand the paper. Certain traits, like eye colour or the ability to roll your tongue, are affected by your genes. Prior to the discovery of the structure of DNA, it was not clear what genes were, except that they were the primary discrete unit of genetic inheritance. These days the term usually refers to some region on the genome. Mendel’s great insight was that genes come in pairs, which we now know to correspond to the two copies of each of the 23 chromosomes we have. A variant of a particular gene is called an allele. Traits can depend on genes (or more accurately genetic loci) linearly or nonlinearly. Consider a quantitative trait that depends on a single genetic locus that has two alleles, which we will call a and A. This means that a person will have one of three possible genotypes: 1) homozygous in A (i.e. have two A alleles), 2) heterozygous (have one of each), or 3) homozygous in a (i.e. have no A alleles). If the locus is linear then if you plot the measure of the trait (e.g. height) against the number of A alleles, you will get a straight line. For example, suppose allele A contributes a tenth of a centimetre to height. Then people with one A allele will be on average one tenth of a centimetre taller than those with no A alleles and those with two A alleles will be two tenths taller. The familiar notion of dominance is a nonlinear effect. So for example, the ability to roll your tongue is controlled by a single gene. There is a dominant rolling allele and a recessive nonrolling allele. If you have at least one rolling allele, you can roll your tongue.
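The two kinds of locus in this primer can be written as two tiny functions of the allele count. The function names are hypothetical and the tenth of a centimetre per allele is just the number from the example.

```python
def mean_height_offset_cm(n_A, effect_per_allele=0.1):
    """Additive (linear) locus: each copy of allele A adds the same amount."""
    return effect_per_allele * n_A

def can_roll_tongue(n_rolling):
    """Dominant locus: one rolling allele is enough, a second adds nothing."""
    return n_rolling >= 1

for n in (0, 1, 2):
    print(n, mean_height_offset_cm(n), can_roll_tongue(n))
```

Plotted against the allele count, the first function is a straight line while the second jumps at one allele and then stays flat, which is what makes dominance nonlinear.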

The average effect of a gene substitution is the average change in a trait if one allele is substituted for another. A crucial part of population genetics is that you always need to consider averages. This is because genes are rarely completely deterministic. They can be influenced by the environment or other genes. Thus, in order to define the effect of the gene, you need to average over these other influences. This then leads to a somewhat ambiguous definition of average effect and Fisher actually came up with two. The first, and as James would argue the primary definition, is a causal one in that we want to measure the average effect of a gene if you experimentally substituted one allele for another prior to development and influence by the environment. A second correlation definition would simply be to plot the trait against the number of alleles as in the example above. The slope would then be the average effect. This second definition looks at the correlation between the gene and the trait but as the old saying goes “correlation does not imply causation”. For example, a genetic locus may not have any effect on the trait but happen to be strongly correlated with a true causal locus (in the population you happen to be examining). Distinguishing genes that are merely associated with a trait from ones that are actually causal remains an open problem in genome wide association studies.
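The gap between the regression definition and the causal one shows up even in a toy simulation. Everything below is assumed for illustration: allele frequency 0.5, an effect of 0.1 per allele at the causal locus, and a second marker with no effect whose genotype matches the causal one in 80% of people. It is not an analysis from the paper.

```python
import numpy as np

def regression_slope(genotype, trait):
    """Slope of the least-squares line of trait against allele count."""
    return np.cov(genotype, trait)[0, 1] / np.var(genotype, ddof=1)

rng = np.random.default_rng(1)
n = 100_000

causal = rng.binomial(2, 0.5, size=n)                  # copies of allele A at the causal locus
linked = rng.random(n) < 0.8                           # assumed fraction with a matching marker genotype
marker = np.where(linked, causal, rng.binomial(2, 0.5, size=n))

trait = 0.1 * causal + rng.normal(0.0, 0.5, size=n)    # only the causal locus affects the trait

slope_causal = regression_slope(causal, trait)
slope_marker = regression_slope(marker, trait)
print(slope_causal, slope_marker)
```

The slope at the causal locus comes out close to the true effect, but the marker also shows a clearly nonzero slope even though substituting its alleles would change nothing.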

Our paper goes over some of the history and philosophy of the tension between these two definitions. We wrote the paper because these two definitions do not always agree and we show under what conditions they do agree. The main reason they don’t agree is that averages will depend on the background over which you average. For a biallelic gene, there are 2 alleles but 3 genotypes. The distribution of genotypes in a population is therefore governed by two parameters. It’s not enough to specify the frequency of one allele. You also need to know the correlation between alleles. The regression definition matches the causal definition if a particular function representing this correlation is held fixed while the experimental allele substitutions under the causal definition are carried out. We also considered the multi-allele and multi-locus case in as much generality as we could.

The problem with democracy

August 16, 2013

Winston Churchill once said that “Democracy is the worst form of government, except for all those other forms that have been tried from time to time.” The current effectiveness of the US government does make one wonder if that is even true. The principle behind democracy is essentially utilitarian – a majority or at least a plurality decides on the course of the state. However, implicit in this principle is the assumption that the utility function for individuals matches their participation function.

For example, consider environmental regulation. The utility function for the amount of allowable emissions of some harmful pollutant like mercury for most people will be downward sloping – most people would increase their utility the less the pollutant is emitted. However, for a small minority of polluters it will be upward sloping with a much steeper slope. Let’s say that the sum of the utility gained for the bulk of the population for strong regulation is greater than that gained by the few polluters for weak regulation. If the democratic voice one has in affecting policy is proportional to the summed utility then the smaller gain for the many will outweigh the larger gain to the few. Unfortunately, this is not usually the case. More often, the translation of utility to legislation and regulation is not proportional but passes through a very nonlinear participation function with a sharp threshold. The bulk of the population is below the threshold so they provide little or no voice on the issue. The minority utility is above the threshold and provides a very loud voice which dominates the result. Our laws are thus systematically biased to protecting the interests of special interest groups.
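Here is the argument as a toy calculation. The population sizes, utility gains and threshold are invented numbers, and the hard threshold is only the simplest possible participation function.

```python
def voice(utility, threshold=0.0):
    """Participation function: utility at or below the threshold produces no voice."""
    return utility if utility > threshold else 0.0

def policy(n_public, gain_public, n_polluters, gain_polluters, threshold=0.0):
    """Return which policy wins when the voices on each side are summed."""
    strong = n_public * voice(gain_public, threshold)
    weak = n_polluters * voice(gain_polluters, threshold)
    return "strong regulation" if strong > weak else "weak regulation"

# Hypothetical numbers: 1000 people gain 1 unit each, 5 polluters gain 100 each.
print(policy(1000, 1.0, 5, 100.0))                  # voice proportional to utility
print(policy(1000, 1.0, 5, 100.0, threshold=10.0))  # sharp participation threshold
```

With proportional voice the summed utility of the many (1000) beats that of the few (500), but once a threshold silences everyone whose individual stake is small, the few win.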

The way out of this trap is to either align everyone’s utility functions or to linearize the participation functions. We could try to use regulation to dampen the effectiveness of minority participation functions or use public information campaigns to change utility functions or increase the participation functions of the silent majority. Variations of these methods have been tried with varying degrees of success. Then there is always the old standby of installing a benevolent dictator who respects the views of the majority. That one really doesn’t have a good track record though.

Beware the vampire squid

August 10, 2013

Before you take that job programming at an investment bank or hedge fund you may want to read Felix Salmon’s post and Michael Lewis’s article on the case of Sergey Aleynikov. He was a top programmer at Goldman Sachs, who was then prosecuted and convicted of stealing proprietary computer code. The conviction was eventually overturned but he has now been charged again for the same crime under a different law. According to Lewis, the code was mostly modified open source stuff that Aleynikov emailed to himself for future reference of what he had done and had little value outside of Goldman. Salmon thinks that Goldman aggressively pursued this case because in order for the directors of the programming division to justify their bonuses, they need to make it look like the code, which they don’t understand, is important. If Goldman Sachs had a public relations problem before, the Lewis article will really put it over the top. This case certainly makes me think that we should change the criminal code and leave cases of intellectual theft by employees to the civil courts and not force the taxpayer to pick up the tab. Also, what is the point of putting a harmless nonviolent programmer in jail for 8 years? We could at least have him serve his sentence doing something useful like writing code to improve city traffic flow. Finally, the open software foundation may have a case against Goldman and other firms who use open source code and then violate the open source license agreement. I’m sure it wouldn’t be too hard to find a backer with deep pockets to pursue the case.

New paper on childhood growth and obesity

August 1, 2013

Kevin D Hall, Nancy F Butte, Boyd A Swinburn, Carson C Chow. Dynamics of childhood growth and obesity: development and validation of a quantitative mathematical model. Lancet Diabetes and Endocrinology 2013.

You can read the press release here.

In order to curb childhood obesity, we need a good measure of how much food kids should eat. Although people like Claire Wang have proposed quantitative models in the past that are plausible, Kevin Hall and I have insisted that this is a hard problem because we don’t fully understand childhood growth. Unlike adults, who are more or less in steady state, growing children are a moving target. After a few fits and starts we finally came up with a satisfactory model that modifies our two compartment adult body composition model to incorporate growth. That previous model partitioned excess energy intake into fat and lean compartments according to the Forbes rule, which basically says that the ratio of added fat to lean is proportional to how much fat you have so the more fat you have the more excess Calories go to fat. The odd consequence of that model is that the steady state body weight is not unique but falls on a one dimensional curve. Thus there is a whole continuum of possible body weights for a fixed diet and lifestyle. I actually don’t believe this and have a modification to fix it but that is a future story.

What puzzled me about childhood growth was how do we know how much more to eat as we grow? After some thought, I realized that what we could do is to eat enough to maintain the fraction of body fat at some level, using leptin as a signal perhaps, and then tap off the energy stored in fat when we needed to grow. So just like we know how much gasoline (petrol) to add by simply filling the tank when it’s empty, we simply eat to keep our fat reserves at some level. In terms of the model, this is a symmetry breaking term that transfers energy from the fat compartment to the lean compartment. In my original model, I made this term a constant and had food intake increase to maintain the fat to lean ratio and showed using singular perturbation theory that this would yield growth that was qualitatively similar to the real thing. This then sat languishing until Kevin had the brilliant idea to make the growth term time dependent and fit it to actual data that Nancy Butte and Boyd Swinburn had taken. We could then fit the model to normal weight and obese kids to quantify how much more obese kids eat, which is more than previously believed. Another nice thing is that when the child stops growing the model is automatically the adult model!

The myth of the single explanation

July 30, 2013

I think one of the things that tends to lead us astray when we try to understand complex phenomena like evolution, disease, or the economy, is that we have this idea that they must have a single explanation. For example, recently two papers have been published in high profile journals trying to explain mammal monogamy. Although monogamy is quite common in birds it only occurs in 5% of mammals. Here is Carl Zimmer’s summary. The study in Science, which surveyed 2545 mammal species, argued that monogamy arises when females are solitary and sparse. Males must then commit to one since dates are so hard to find. The study in PNAS examined 230 primate species, for which monogamy occurs at the higher rate of 27%, and used Bayesian inference to argue that monogamy arises to prevent male infanticide. It’s better to help out at home rather than go around killing other males’ babies. Although both of these arguments are plausible, there need not be a single universal explanation. Each species could have its own set of circumstances that led to monogamy involving these two explanations and others. However, while we should not be biased towards a single explanation, we also shouldn’t throw up our hands like Hayek and argue that no complex phenomenon can be understood. Some phenomena will have simpler explanations than others but since the Kolmogorov complexity is uncomputable there is no algorithm that can tell you which is which. We will just have to struggle with each problem as it comes.

Talk at GRC

July 24, 2013

I’m currently in Mt. Snow, Vermont to give a talk at the Gordon Research Conference on Computer Aided Drug Design. Yes, I know nothing about drug design. I am here because the organizer, Anthony Nicholls, asked me to give a pedagogical talk on Bayesian Inference. My slides are here. I only arrived yesterday but the few talks I’ve seen have been quite interesting. One interesting aspect of this conference is that many of the participants are from industry. The evening sessions are meant to be of more general interest. Last night there were two talks about how to make science more reproducible. As I’ve posted before, many published results are simply wrong. The very enterprising Elizabeth Iorns has started something called the Reproducibility Initiative. I am not completely clear about how it works but it is part of another entity she started called Science Exchange, which helps to facilitate collaborations with a fee-for-service model. The Reproducibility Initiative piggybacks on Science Exchange by providing a service (for a fee) to validate any particular result. Papers that pass validation get a stamp of approval. It is expected that pharma would be interested in using this service so they can inexpensively check if possible drug targets actually hold up. Many drugs fail at phase three of clinical trials because they’ve been shown to be ineffective and this may be due to the target being wrong to start with.

On a final note, I flew to Albany and drove here. Unlike in the past when I would have printed out a map, I simply assumed that I could use Google Maps on my smartphone to get here. However, Google Maps doesn’t really know where Mt. Snow is. It tried to take me up a dirt road to the back of the ski resort. Also, just after I turned up the road, the phone signal disappeared so I was blind and had no paper backup. I doubted that this was the right way to go so I turned back to the main highway in hopes of finding a signal or a gas station to ask for directions. A few miles down Route 9, I finally did get a signal and also found a sign that showed me the way. Google Maps still tried to take me the wrong way. I should have followed what I always tell my daughter – Always have a backup plan.

New paper in Nature Reviews Genetics

July 22, 2013

A Coulon, CC Chow, RH Singer, DR Larson. Eukaryotic transcriptional dynamics: from single molecules to cell populations. Nat Rev Genet (2013).

Abstract | Transcriptional regulation is achieved through combinatorial interactions between regulatory elements in the human genome and a vast range of factors that modulate the recruitment and activity of RNA polymerase. Experimental approaches for studying transcription in vivo now extend from single-molecule techniques to genome-wide measurements. Parallel to these developments is the need for testable quantitative and predictive models for understanding gene regulation. These conceptual models must also provide insight into the dynamics of transcription and the variability that is observed at the single-cell level. In this Review, we discuss recent results on transcriptional regulation and also the models those results engender. We show how a non-equilibrium description informs our view of transcription by explicitly considering time- and energy-dependence at the molecular level.

New paper on measuring gastric acid output

July 16, 2013

This paper started many years ago when Steve Wank, of the Digestive Diseases Branch of NIDDK, had the idea of using a new wireless pH-detecting capsule, the SmartPill, that you swallow to determine how much acid your stomach is producing. There really was no noninvasive way to monitor how well medications would work for certain reflux diseases. To design a protocol for the experiment, he wanted a model of gastric acid output based on the dynamics of pH when a buffer is added. I came up with a simple mass-action model of acid buffering and made some graphs for him. We then tested the model out in a beaker. He thought the model worked better than I thought it did, but it was somewhat useful to him in designing the experiment.
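To give a flavour of what a mass-action buffering model looks like, here is a minimal sketch in Python. It is not the model from the paper. It assumes a single weak-acid buffer that starts entirely in its conjugate-base form, a fixed, well-mixed volume, a constant secretion rate, and it ignores water autoionization. The function names and the parameter values in the usage note are hypothetical choices made for illustration only.

```python
import math


def ph_after_acid(acid_added, buffer_total, pka):
    """pH of a single weak-acid buffer after adding strong acid.

    acid_added   -- cumulative strong acid added (mol/L), must be > 0
    buffer_total -- total buffer concentration (mol/L), initially all conjugate base
    pka          -- pKa of the buffer (assumed value, not taken from the paper)
    """
    if acid_added <= 0:
        raise ValueError("acid_added must be positive")
    ka = 10.0 ** (-pka)
    # Charge balance plus mass action for the buffer gives a quadratic in h = [H+]:
    #   h^2 + (ka + buffer_total - acid_added) * h - acid_added * ka = 0
    b = ka + buffer_total - acid_added
    disc = math.sqrt(b * b + 4.0 * acid_added * ka)
    # Pick the algebraically equivalent form of the positive root that
    # avoids subtracting two nearly equal numbers.
    h = 2.0 * acid_added * ka / (b + disc) if b >= 0 else 0.5 * (disc - b)
    return -math.log10(h)


def acid_needed(target_ph, buffer_total, pka):
    """Strong acid (mol/L) required to bring the buffer down to target_ph."""
    ka = 10.0 ** (-pka)
    h = 10.0 ** (-target_ph)
    # Free protons plus protons taken up by the buffer.
    return h + buffer_total * h / (ka + h)


def time_to_ph(target_ph, secretion_rate, volume, buffer_total, pka):
    """Minutes to reach target_ph at a constant secretion rate (mol/min) into volume (L)."""
    return acid_needed(target_ph, buffer_total, pka) * volume / secretion_rate


def estimate_secretion_rate(target_ph, elapsed_min, volume, buffer_total, pka):
    """Invert time_to_ph: infer the secretion rate (mol/min) from the observed acidification time."""
    return acid_needed(target_ph, buffer_total, pka) * volume / elapsed_min
```

For example, `time_to_ph(3.0, 5e-4, 0.25, 0.05, 4.5)` gives the time for a 0.25 L meal with 0.05 mol/L of buffer to reach pH 3 at a secretion rate of 5e-4 mol/min, and `estimate_secretion_rate` runs the calculation in the other direction. In this toy version the acidification time scales with the buffering capacity divided by the secretion rate, which is the sense in which a measured pH curve can be turned into an estimate of acid output.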

Weinstein et al.  A new method for determining gastric acid output using a wireless pH-sensing capsule.  Aliment Pharmacol Ther 37: 1198 (2013)


BACKGROUND: Gastro-oesophageal reflux disease (GERD) and gastric acid hypersecretion respond well to suppression of gastric acid secretion. However, clinical management and research in diseases of acid secretion have been hindered by the lack of a non-invasive, accurate and reproducible tool to measure gastric acid output (GAO). Thus, symptoms or, in refractory cases, invasive testing may guide acid suppression therapy.

AIM: To present and validate a novel, non-invasive method of GAO analysis in healthy subjects using a wireless pH sensor, SmartPill (SP) (SmartPill Corporation, Buffalo, NY, USA).

METHODS: Twenty healthy subjects underwent conventional GAO studies with a nasogastric tube. Variables impacting liquid meal-stimulated GAO analysis were assessed by modelling and in vitro verification. Buffering capacity of Ensure Plus was empirically determined. SP GAO was calculated using the rate of acidification of the Ensure Plus meal. Gastric emptying scintigraphy and GAO studies with radiolabelled Ensure Plus and SP assessed emptying time, acidification rate and mixing. Twelve subjects had a second SP GAO study to assess reproducibility.

RESULTS: Meal-stimulated SP GAO analysis was dependent on acid secretion rate and meal-buffering capacity, but not on gastric emptying time. On repeated studies, SP GAO strongly correlated with conventional basal acid output (BAO) (r = 0.51, P = 0.02), maximal acid output (MAO) (r = 0.72, P = 0.0004) and peak acid output (PAO) (r = 0.60, P = 0.006). The SP sampled the stomach well during meal acidification.

CONCLUSIONS: SP GAO analysis is a non-invasive, accurate and reproducible method for the quantitative measurement of GAO in healthy subjects. SP GAO analysis could facilitate research and clinical management of GERD and other disorders of gastric acid secretion.

Houghton opines on the unfairness of prizes

July 12, 2013

I recently wrote about Michael Houghton declining the prestigious Gairdner prize because it left out two critical contributors to the discovery of the Hepatitis C virus. Houghton has now written an opinion piece in Nature Medicine arguing that prizes should relax the restriction to three awardees, an arbitrary number I’ve never understood. After all, one could argue that Freeman Dyson had a reasonable claim on the Nobel Prize awarded to Feynman, Schwinger, and Tomonaga for QED. I’ve quoted the entire piece below.

Nature Medicine: Earlier this year, I was greatly honored with the offer of a 2013 Canada Gairdner International Award for my contributions to the discovery of the hepatitis C virus (HCV). I was selected along with Harvey Alter, chief of clinical studies in the Department of Transfusion Medicine at the US National Institutes of Health’s Clinical Center in Bethesda, Maryland, and Daniel Bradley, a consultant at the US Centers for Disease Control and Prevention in Atlanta, both of whom had a vital role in the research that eventually led to the identification and characterization of the virus.

My colleagues accepted their awards. However, I declined my C$100,000 ($98,000) prize because it excluded two other key contributors who worked with me closely to successfully isolate the viral genome for the first time. I felt that given their crucial inputs, it would be wrong of me to keep accepting major prizes just ‘on their behalf’, a situation that has developed because major award foundations and committees around the world insist that prizes be limited to no more than three recipients per topic.

HCV was identified in 1989 in my laboratory at the Chiron Corporation, a California biotechnology firm since purchased by the Swiss drug company Novartis. The discovery was the result of seven years of research in which I worked closely, both intellectually and experimentally, with Qui-Lim Choo, a member of my own laboratory, and George Kuo, who had his own laboratory next door to mine at Chiron. We finally identified the virus using a technically risky DNA-expression screening technique through which we isolated a single small nucleic acid clone from among many millions of such clones from different recombinant libraries. This was achieved without the aid of the still-evolving PCR technology to amplify the miniscule amounts of viral nucleic acid present in blood. We ultimately proved that this clone derived from a positive-stranded viral RNA genome intimately associated with hepatitis, but one not linked to either the hepatitis A or B viruses [1,2]. The finding represented the first time any virus had been identified without either prior visualization of the virus itself, characterization of its antigens or viral propagation in cell culture.

The high-titer infectious chimpanzee plasma used for our molecular analyses at Chiron was provided in 1985 by Bradley, an expert in chimpanzee transmission of HCV and in the virus’s basic properties and cellular responses, with whom I had an active collaboration since 1982. The proposed aim of the collaboration was for my laboratory to apply contemporary molecular cloning methodologies to a problem that had proven intractable since the mid-1970s, when Alter and his colleagues first demonstrated the existence of non-A, non-B hepatitis (NANBH), as it was then known. Alter’s team went on to define the high incidence and medical importance of NANBH, including the virus’s propensity to cause liver fibrosis, cirrhosis and cancer. They also identified high-titer infectious human plasma in 1980 and were instrumental in promoting the adoption of surrogate tests for NANBH by blood banks to reduce the incidence of post-transfusion infection.

With regrets to the Gairdner Foundation—a generous and altruistic organization—I felt compelled to decline the International Gairdner Award without the addition of Kuo and Choo to the trio of scientists offered the award. In 1992, all five of us received the Karl Landsteiner Memorial Award from the American Association of Blood Banks. But subsequent accolades given in honor of HCV’s discovery have omitted key members of the group: only Bradley and I received the 1993 Robert Koch Prize, and only Alter and I won the 2000 Albert Lasker Award for Clinical Medical Research—in both cases, despite my repeated requests that the other scientists involved in the discovery be recognized. With the exclusion once more of Kuo and Choo from this year’s Gairdner Award, I decided that I should not continue to accept major awards without them. In doing so, I became the first person since the Gairdner’s inception in 1959 to turn down the prize.

I hope that my decision helps bring attention to a fundamental problem with many scientific prizes today. Although some awards, such as the Landsteiner, are inclusionary and emphasize outstanding team accomplishments, the majority of the world’s prestigious scientific awards—including the Gairdner, Lasker and Shaw prizes, which all seem to be modeled on the Nobel Prize and indeed are sometimes known as the ‘baby Nobels’—are usually restricted to at most three individuals per discovery. Unsurprisingly, this limitation often leads to controversy, when one or more worthy recipients are omitted from the winners list.

Perhaps what may help this situation is for awards committees to solicit, and then be responsive to, input from potential recipients themselves prior to making their final decisions. Some of the recipients are best placed to know the full and often intricate history of the discovery and collaborative efforts, and such input should help committees better understand the size of the contributing team from which they can then choose recipients according to each award’s particular policy.

With this information in hand, award organizers should be willing to award more than three researchers. As knowledge and technology grows exponentially around the world and with an increasing need for multidisciplinary collaborations to address complex questions and problems, there is a case to be made for award committees adjusting to this changing paradigm. Moreover, it is inherently unfair to exclude individuals who played a key part in the discovery. Why should they and their families suffer such great disappointment after contributing such crucial input? Some award restructuring could also be inspirational to young scientists, encouraging them to be highly interactive and collaborative in the knowledge that when a novel, long-shot idea or approach actually translates to scientific success, all key parties will be acknowledged appropriately.

In this vein, I am happy to note that the inaugural Queen Elizabeth Prize for Engineering, a new £1 million ($1.6 million) prize from the UK government, was awarded at a formal ceremony last month to five individuals who helped create the internet and the World Wide Web, even though the original guidelines stipulated a maximum of three recipients. If the Queen of England—the very emblem of tradition—can cast protocol aside, clearly other institutions can too. I hope more awards committees will follow Her Majesty’s lead.

