New paper on gene repression

CC Chow, KK Finn, GB Storchan, X Lu, X Sheng, SS Simons Jr., Kinetically-Defined Component Actions in Gene Repression. PLoS Comp Bio. 11:e1004122, (2015)


Gene repression by transcription factors, and glucocorticoid receptors (GR) in particular, is a critical, but poorly understood, physiological response. Among the many unresolved questions is the difference between GR regulated induction and repression, and whether transcription cofactor action is the same in both. Because activity classifications based on changes in gene product level are mechanistically uninformative, we present a theory for gene repression in which the mechanisms of factor action are defined kinetically and are consistent for both gene repression and induction. The theory is generally applicable and amenable to predictions if the dose-response curve for gene repression is non-cooperative with a unit Hill coefficient, which is observed for GR-regulated repression of AP1LUC reporter induction by phorbol myristate acetate. The theory predicts the mechanism of GR and cofactors, and where they act with respect to each other, based on how each cofactor alters the plots of various kinetic parameters vs. cofactor. We show that the kinetically-defined mechanism of action of each of four factors (reporter gene, p160 coactivator TIF2, and two pharmaceuticals [NU6027 and phenanthroline]) is the same in GR-regulated repression and induction. What differs is the position of GR action. This insight should simplify clinical efforts to differentially modulate factor actions in gene induction vs. gene repression.

Author Summary

While the initial steps in steroid-regulated gene induction and repression are known to be identical, the same cannot be said of cofactors that modulate steroid-regulated gene activity. We describe the conditions under which a theoretical model for gene repression reveals the kinetically-defined mechanism and relative position of cofactor action. This theory has been validated by experimental results with glucocorticoid receptors. The mode and position of action of four factors is qualitatively identical in gene repression to that previously found in gene induction. What changes is the position of GR action. Therefore, we predict that the same kinetically-defined mechanism usually will be utilized by cofactors in both induction and repression pathways. This insight and simplification should facilitate clinical efforts to maximize desired outcomes in gene induction or repression.

I am so happy that this paper is finally published.  It was a two-year ordeal from the time I had the idea of what to do until it finally came out. This is the second leg of the three-legged stool for a theory of steroid-regulated gene expression. The first was developing the theory for gene induction (e.g. see here) that started over ten years ago when Stoney and I first talked about trying to understand his data and really took off when Karen Ong turned her summer internship into a two-year baccalaureate fellowship. She’s now finishing up the PhD part of her MD-PhD at the Courant Institute at NYU.

In the first leg, we showed that if the dose-response curve for steroid-regulated gene induction (i.e. gene product as a function of ligand concentration) has the form a x/(c+x) (which has been variously called noncooperative, a Michaelis-Menten function, a Hill function with Hill coefficient equal to 1, hyperbolic, or a first-order Hill dose-response curve, to name a few), then the dose-response can be written down in closed form. The theory considers gene induction to be a sequence of complex-forming reactions Y_{i-1} + X_{i} \leftrightarrow Y_i for i = 1, 2, ..., n, and the dose-response is given by [Y_n] as a function of [Y_0], which in general is a very high order polynomial and is not Michaelis-Menten. However, when some biophysically plausible conditions on the parameters are met, the polynomial can be represented by the group of lower triangular matrices and can be solved exactly. We can then use the formulae to make predictions for the mechanisms of various transcription factors.
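To see where the a x/(c+x) form comes from, here is a minimal one-step illustration. It is a sketch under simplifying assumptions, not the general derivation in the paper: take n = 1, assume the reaction is at equilibrium with association constant q, treat [Y_0] as the free concentration, and assume the total amount of X_1 is conserved. The symbols q and X_1^T are introduced here only for illustration.

```latex
\begin{align}
Y_0 + X_1 &\leftrightarrow Y_1, \qquad [Y_1] = q\,[Y_0]\,[X_1], \\
X_1^T &= [X_1] + [Y_1] \;\Rightarrow\; [X_1] = \frac{X_1^T}{1 + q\,[Y_0]}, \\
[Y_1] &= \frac{q\,X_1^T\,[Y_0]}{1 + q\,[Y_0]} = \frac{a\,x}{c + x},
\qquad x = [Y_0],\quad a = X_1^T,\quad c = 1/q .
\end{align}
```

With more steps the algebra no longer collapses this simply in general, which is why extra conditions on the parameters are needed to keep the noncooperative form.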

However, steroids also repress genes and, interestingly enough, the repression curve is also noncooperative and is given by the linear fractional function a + bx/(1 + c x). The question then was how this works. I was puzzled for a while about how to solve this, but then thought that if we believe that the transcription machinery after initiation is mostly conserved, then the induction theory we had previously derived should still be in place. What is different is that in repression, instead of steroids initiating the cascade, some other agonist initiates it and the steroid represses it. In our induction theory, we included the effects of activators and inhibitors from enzyme kinetics, which we called accelerators and decelerators to avoid confusion with previously used terms. Because of the group property of the reactions, basically any function you are interested in has linear-fractional form. I thus postulated that steroids, after binding to a nuclear steroid receptor, act like a decelerator. I then had to work out all the possible cases for where the decelerator could act, and the large number of them made the calculations rather tedious. As a result, I made lots of mistakes initially and the theory just wouldn’t fit the data. I finally had a breakthrough in the fall of 2013 when I was in Taiwan for a workshop and everything started to come together. It then took another six months to work out the details and write the paper, which was followed by several rounds of back-and-forth with the referees, a major rewrite and a final acceptance a few months ago. In the process of working on this paper, I discovered a lot of properties of the induction system that I hadn’t realized. I still didn’t believe it was finished until I saw it posted on the PLoS Comp Bio website this week.
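The closure of linear-fractional functions under composition can be checked numerically. The sketch below uses the standard fact that the map x -> (p x + q)/(r x + s) can be stored as a 2x2 matrix and that composing two such maps corresponds to multiplying their matrices. The function names and coefficients are hypothetical, and this is only an illustration of the general property, not the calculation in the paper.

```python
from fractions import Fraction


def lf_apply(m, x):
    """Evaluate the linear-fractional map x -> (p*x + q) / (r*x + s)."""
    (p, q), (r, s) = m
    return (p * x + q) / (r * x + s)


def lf_compose(f, g):
    """Return the matrix of the composed map f(g(x)), i.e. the product f @ g."""
    (a, b), (c, d) = f
    (e, h), (i, j) = g
    return ((a * e + b * i, a * h + b * j),
            (c * e + d * i, c * h + d * j))


# Hypothetical coefficients, chosen only for illustration.
step1 = ((Fraction(2), Fraction(0)), (Fraction(1), Fraction(3)))  # 2x / (x + 3)
step2 = ((Fraction(5), Fraction(1)), (Fraction(1), Fraction(4)))  # (5x + 1) / (x + 4)

combined = lf_compose(step2, step1)

x = Fraction(7)
direct = lf_apply(step2, lf_apply(step1, x))  # apply the two steps in sequence
via_matrix = lf_apply(combined, x)            # apply the single combined map
print(direct, via_matrix)
```

Exact rational arithmetic (`Fraction`) is used so the two results can be compared for equality without floating-point tolerance.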

I’m currently putting the finishing touches on revisions for the third leg of the stool. We have even reunited the band and convinced Karen to take some time away from her thesis to help finish it. This paper is about how partial agonists or antagonists like tamoxifen work, which could have implications for drug development and avoiding side effects. Steroids are not the only ligands that can activate a steroid-regulated gene. The steroid cream that you use for rashes consists of a highly potent steroid agonist. There are also molecules that block or impede the action of steroids by binding to steroid receptors, and these are called partial agonists, antagonists or antisteroids. However, steroid receptors are widely expressed, which is why these drugs can have severe side effects when you take them. Hence, it would be nice to be able to control where they act and by how much. This third-leg paper is the theory behind how to do this.

New paper on path integrals

Carson C. Chow and Michael A. Buice. Path Integral Methods for Stochastic Differential Equations. The Journal of Mathematical Neuroscience,  5:8 2015.

Abstract: Stochastic differential equations (SDEs) have multiple applications in mathematical neuroscience and are notoriously difficult. Here, we give a self-contained pedagogical review of perturbative field theoretic and path integral methods to calculate moments of the probability density function of SDEs. The methods can be extended to high dimensional systems such as networks of coupled neurons and even deterministic systems with quenched disorder.

This paper is a modified version of our arXiv paper of the same title.  We added an example of the stochastically forced FitzHugh-Nagumo equation and fixed the typos.

New paper on steroid-regulated gene expression

Recent paper in Molecular Endocrinology 28(7):1194-206. doi: 10.1210/me.2014-1069:

Research Resource: Modulators of glucocorticoid receptor activity identified by a new high-throughput screening assay

John A. Blackford, Jr., Kyle R. Brimacombe, Edward J. Dougherty, Madhumita Pradhan, Min Shen, Zhuyin Li, Douglas S. Auld, Carson C. Chow, Christopher P. Austin, and S. Stoney Simons, Jr.

Abstract: Glucocorticoid steroids affect almost every tissue-type and thus are widely used to treat a variety of human pathologies. However, the severity of numerous side-effects limits the frequency and duration of glucocorticoid treatments. Of the numerous approaches to control off-target responses to glucocorticoids, small molecules and pharmaceuticals offer several advantages. Here we describe a new, extended high throughput screen in intact cells to identify small molecule modulators of dexamethasone-induced glucocorticoid receptor (GR) transcriptional activity. The novelty of this assay is that it monitors changes in both GR maximal activity (Amax) and EC50, or the position of the dexamethasone dose-response curve. Upon screening 1280 chemicals, ten with the greatest change in the absolute value of Amax or EC50 were selected for further examination. Qualitatively identical behaviors for 60–90% of the chemicals were observed in a completely different system, suggesting that other systems will be similarly affected by these chemicals. Additional analysis of the ten chemicals in a recently described competition assay determined their kinetically-defined mechanism and site of action. Some chemicals had similar mechanisms of action despite divergent effects on the level of GR-induced product. These combined assays offer a straightforward method of identifying numerous new pharmaceuticals that can alter GR transactivation in ways that could be clinically useful.
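For readers unfamiliar with Amax and EC50, the sketch below shows one simple way to read both off a dose-response curve. It assumes a first-order Hill curve A(x) = Amax x/(EC50 + x) with zero basal activity and uses a double-reciprocal (Lineweaver-Burk style) linear fit. The function name and the synthetic data are hypothetical, and this is not the analysis pipeline used in the screen.

```python
def fit_first_order_hill(doses, responses):
    """Estimate (amax, ec50) for A(x) = amax * x / (ec50 + x).

    Uses the double-reciprocal linearization
        1/A = 1/amax + (ec50/amax) * (1/x)
    followed by ordinary least squares on the points (1/x, 1/A).
    """
    u = [1.0 / x for x in doses]
    v = [1.0 / a for a in responses]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_u
    amax = 1.0 / intercept   # intercept of the line is 1/amax
    ec50 = slope * amax      # slope of the line is ec50/amax
    return amax, ec50


# Synthetic, noise-free data generated from assumed parameter values.
true_amax, true_ec50 = 100.0, 5.0
doses = [0.5, 1.0, 2.0, 5.0, 10.0, 50.0]
responses = [true_amax * x / (true_ec50 + x) for x in doses]

amax, ec50 = fit_first_order_hill(doses, responses)
print(amax, ec50)
```

The double-reciprocal fit is easy to follow but amplifies noise at low doses, so a nonlinear least-squares fit would usually be preferable for real measurements.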

Paper on new version of Plink

The paper describing the updated version of the genome analysis software tool Plink has just been published.

Second-generation PLINK: rising to the challenge of larger and richer datasets
Christopher C Chang, Carson C Chow, Laurent CAM Tellier, Shashaank Vattikuti, Shaun M Purcell, and James J Lee

GigaScience 2015, 4:7  doi:10.1186/s13742-015-0047-8

PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1’s primary data format.

To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, O(√n)-time/constant-space Hardy-Weinberg equilibrium and Fisher’s exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0).
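The general idea behind bit-level parallelism can be sketched in a few lines: pack many binary values into one machine word, then compare them all at once with a single XOR followed by a population count. The example below is only an illustration of that idea on hypothetical 0/1 vectors; it is not PLINK's actual C/C++ code or its genotype data layout.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into a single integer, bit i holding bits[i]."""
    word = 0
    for i, b in enumerate(bits):
        if b:
            word |= 1 << i
    return word


def hamming_packed(a, b):
    """Count differing positions with one XOR and one population count."""
    return bin(a ^ b).count("1")


def hamming_naive(x, y):
    """Reference implementation: compare the vectors element by element."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi)


# Hypothetical binary marker vectors for two samples.
x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 1, 0, 1, 0, 1, 1, 0]

packed_distance = hamming_packed(pack_bits(x), pack_bits(y))
naive_distance = hamming_naive(x, y)
print(packed_distance, naive_distance)
```

In a compiled language the XOR and popcount each map to a single instruction on a 64-bit word, so 64 comparisons cost roughly what one comparison costs in the element-by-element loop.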

The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.

Keywords: GWAS; Population genetics; Whole-genome sequencing; High-density SNP genotyping; Computational statistics


This project started out with us trying to do some genomic analysis that involved computing various distance metrics on sequence space. Programming virtuoso Chris Chang stepped in and decided to write some code to speed up the computations. His program, originally called wdist, was so good and fast that we kept asking him to put in more capabilities. Eventually, he had basically replicated the suite of functions that Plink performed, so he contacted Shaun Purcell, the author of Plink, to ask if he could just call his code Plink too, and Shaun agreed. We then ran a series of tests on various machines to check the speed-ups compared to the original Plink and gcta. If you do any GWAS analysis at all, I highly recommend you check out Plink 1.9.

New paper in eLife

Kinetic competition during the transcription cycle results in stochastic RNA processing

Matthew L Ferguson, Valeria de Turris, Murali Palangat, Carson C Chow, Daniel R Larson


Synthesis of mRNA in eukaryotes involves the coordinated action of many enzymatic processes, including initiation, elongation, splicing, and cleavage. Kinetic competition between these processes has been proposed to determine RNA fate, yet such coupling has never been observed in vivo on single transcripts. In this study, we use dual-color single-molecule RNA imaging in living human cells to construct a complete kinetic profile of transcription and splicing of the β-globin gene. We find that kinetic competition results in multiple competing pathways for pre-mRNA splicing. Splicing of the terminal intron occurs stochastically both before and after transcript release, indicating there is not a strict quality control checkpoint. The majority of pre-mRNAs are spliced after release, while diffusing away from the site of transcription. A single missense point mutation (S34F) in the essential splicing factor U2AF1 which occurs in human cancers perturbs this kinetic balance and defers splicing to occur entirely post-release.


New Papers

Two new papers are now in print:
The first, on applying compressed sensing to genomics, is now published in GigaScience. The summary of the paper is here and the link is here.
The second is on steroid-regulated gene induction and can be obtained here.
Biochemistry. 2014 Mar 25;53(11):1753-67. doi: 10.1021/bi5000178. Epub 2014 Mar 11.

A kinase-independent activity of Cdk9 modulates glucocorticoid receptor-mediated gene induction.


A gene induction competition assay has recently uncovered new inhibitory activities of two transcriptional cofactors, NELF-A and NELF-B, in glucocorticoid-regulated transactivation. NELF-A and -B are also components of the NELF complex, which participates in RNA polymerase II pausing shortly after the initiation of gene transcription. We therefore asked if cofactors (Cdk9 and ELL) best known to affect paused polymerase could reverse the effects of NELF-A and -B. Unexpectedly, Cdk9 and ELL augmented, rather than prevented, the effects of NELF-A and -B. Furthermore, Cdk9 actions are not blocked either by Cdk9 inhibitors (DRB or flavopiridol) or by two Cdk9 mutants defective in kinase activity. The mode and site of action of NELF-A and -B mutants with an altered NELF domain are similarly affected by wild-type and kinase-dead Cdk9. We conclude that Cdk9 is a new modulator of GR action, that Cdk9 and ELL have novel activities in GR-regulated gene expression, that NELF-A and -B can act separately from the NELF complex, and that Cdk9 possesses activities that are independent of Cdk9 kinase activity. Finally, the competition assay has succeeded in ordering the site of action of several cofactors of GR transactivation. Extension of this methodology should be helpful in determining the site and mode of action of numerous additional cofactors and in reducing unwanted side effects.

PMID: 24559102 [PubMed – indexed for MEDLINE]
PMCID: PMC3985961 [Available on 2015/2/21]

New paper on genomics

James Lee and I have a new paper out: Lee and Chow, Conditions for the validity of SNP-based heritability estimation, Human Genetics, 2014. As I summarized earlier (e.g. see here and here), heritability is a measure of the proportion of the variance of some trait (like height or cholesterol levels) due to genetic factors. The classical way to estimate heritability is to regress standardized (mean zero, standard deviation one) phenotypes of close relatives against each other. In 2010, Jian Yang, Peter Visscher and colleagues developed a way to estimate heritability directly from the data obtained in Genome Wide Association Studies (GWAS), sometimes called GREML.  Shashaank Vattikuti and I quickly adopted this method and computed the heritability of metabolic syndrome traits as well as the genetic correlations between the traits (link here). Unfortunately, our methods section has a lot of typos but the corrected Methods with the Matlab code can be found here. However, I was puzzled by the derivation of the method provided by the Yang et al. paper.  This paper is our resolution.  The technical details are below the fold.
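The classical relatives-based estimate can be sketched as follows. The code standardizes the phenotypes of paired relatives, regresses one on the other, and divides the slope by the coefficient of relatedness (0.5 for parent-offspring pairs). This assumes a purely additive model with no shared environment; the function names and the phenotype values are hypothetical, and this is not the GREML method discussed in the paper.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean zero and standard deviation one."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def regression_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def heritability_from_relatives(pheno_a, pheno_b, relatedness):
    """Estimate heritability as (slope between relatives) / relatedness.

    Assumes additive genetic effects only and no shared environment.
    """
    slope = regression_slope(standardize(pheno_a), standardize(pheno_b))
    return slope / relatedness


# Hypothetical phenotype values for five parent-offspring pairs.
parents = [1.0, 2.0, 3.0, 4.0, 5.0]
offspring = [3.0, 2.0, 4.0, 1.0, 5.0]

h2 = heritability_from_relatives(parents, offspring, relatedness=0.5)
print(h2)
```

For standardized phenotypes the regression slope equals the correlation between relatives, so the estimate is just that correlation scaled by the expected genetic sharing.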

