New paper on heritability from GWAS

Heritability and genetic correlations explained by common SNPs in metabolic syndrome traits

PLoS Genet 8(3): e1002637. doi:10.1371/journal.pgen.1002637

Shashaank Vattikuti, Juen Guo, and Carson C. Chow

Abstract: We used a bivariate (multivariate) linear mixed-effects model to estimate the narrow-sense heritability (h2) and heritability explained by the common SNPs (hg2) for several metabolic syndrome (MetS) traits and the genetic correlation between pairs of traits for the Atherosclerosis Risk in Communities (ARIC) genome-wide association study (GWAS) population. MetS traits included body-mass index (BMI), waist-to-hip ratio (WHR), systolic blood pressure (SBP), fasting glucose (GLU), fasting insulin (INS), fasting triglycerides (TG), and fasting high-density lipoprotein (HDL). We found the percentage of h2 accounted for by common SNPs to be 58% of h2 for height, 41% for BMI, 46% for WHR, 30% for GLU, 39% for INS, 34% for TG, 25% for HDL, and 80% for SBP. We confirmed prior reports for height and BMI using the ARIC population and independently in the Framingham Heart Study (FHS) population. We demonstrated that the multivariate model supported large genetic correlations between BMI and WHR and between TG and HDL. We also showed that the genetic correlations between the MetS traits are directly proportional to the phenotypic correlations.

Author Summary: The narrow-sense heritability of a trait such as body-mass index is a measure of the variability of the trait between people that is accounted for by their additive genetic differences. Knowledge of these genetic differences provides insight into biological mechanisms and hence treatments for diseases. Genome-wide association studies (GWAS) survey a large set of genetic markers common to the population. They have identified several single markers that are associated with traits and diseases. However, these markers do not seem to account for all of the known narrow-sense heritability. Here we used a recently developed model to quantify the genetic information contained in GWAS for single traits and shared between traits. We specifically investigated metabolic syndrome traits that are associated with type 2 diabetes and heart disease, and we found that for the majority of these traits much of the previously unaccounted for heritability is contained within common markers surveyed in GWAS. We also computed the genetic correlation between traits, which is a measure of the genetic components shared by traits. We found that the genetic correlation between these traits could be predicted from their phenotypic correlation.

I am very happy that this paper is finally out.  It has been a three-year-long ordeal.  I’ll write about the story and background for this paper later.

Is a mandate a tax?

The health care law is currently under debate in the US Supreme Court.  At question is whether it is constitutional to impose an individual mandate to buy health insurance nationally.  I think the mandate is a perfect example of how the phrasing of logically equivalent statements can make all the difference.  In other words, “A is equal to B” and “A is not not equal to B” are logically equivalent but can have profoundly different consequences in law and politics.  In the current law, everyone is obliged to have health insurance.  If they choose not to have insurance they are forced to pay a penalty.  The fact that one is compelled to engage in a commercial activity seems to be offensive to a majority of the US population.  However, Congress has the right to impose a tax.  Hence, the mandate law could have been rephrased as “Everyone is assessed a health care tax and if you have health insurance, then you get a tax credit”.  This is logically equivalent to the mandate law but should have no constitutional problem since we currently pay taxes for Social Security and Medicare.  I have never quite understood why the government doesn’t use this argument to justify the law, so if anyone could explain this to me I would greatly appreciate it.  It probably has more to do with the political anathema of taxation than anything else.

One last thing – there is a rationale for why a mandate is necessary, which I don’t think everyone fully appreciates.  The health care law aims to prevent people from being denied health care coverage because of pre-existing conditions.  If you make this a law then you have a free rider problem where people can simply sign up for insurance when they get sick.  This would then make insurance prohibitively expensive for everyone else.  The only way to prevent the free rider problem is to make everyone have insurance even when they are healthy and thus the mandate.  So, if you don’t want your insurance company to deny you a liver transplant because you forgot to report that you had a broken collar-bone when you were six, then you are stuck with the mandate.  The other options are a single payer system like Canada or a fully nationalized health system like the UK but they were political nonstarters.

Calories revisited

Kevin Hall gave a talk today on some of his recent results that force me to revise what I wrote yesterday.  I won’t divulge his exact results since they are not published yet but he has experimental evidence that a calorie is not a calorie.  What I said in my previous post was that if you are in steady state then it doesn’t matter what diet you are eating.  However, that must be qualified in light of the recent data.  To be in steady state, you must be in both energy and macronutrient balance.  This means that you need to burn exactly all the food you eat, both in caloric content and composition.  What your body burns depends on the food you eat as well as your current body composition.  So let’s say you decide to cut the carbs from your diet but eat the same amount of calories every day.  Your body must now burn all the extra fat you are eating to stay in steady state.  If it can adjust immediately then you will stay in steady state.  However, if it cannot burn all the extra fat you are eating then that excess fat will be stored and the body will burn extra glycogen or protein instead.  Your body composition will then change until you are back in steady state.  If your body overadjusts and you burn too much fat, then you will lose fat and gain lean tissue.  In hindsight, the assumption that you can immediately adjust to whatever fuel you are taking in seems pretty far-fetched and according to Kevin’s data, it is.

Is a calorie a calorie?

I read a lot of stuff in the popular press about obesity and weight gain and some of it can make me cringe.  However, here is a piece by Mark Bittman, the New York Times food columnist, that I can agree with.  One of the vexing things about this field is that a little bit of knowledge is a very dangerous thing.  When I talk to people about my research, I get various responses.  For those who have never thought about the problem, when I tell them that the obesity epidemic is caused by too much food, they will simply say “You needed to do five years of research to figure that out?”  However, for those who have thought about it some, I can get more skepticism:  “What about insulin?”  “What about lack of physical activity?”  “What about perturbed microflora?”  “What about fibre?”  “What about high fructose corn syrup?”  The answer is that all those things may matter but they’re second order.  The data show that the increase in food supply more than explains the increase in weight holding everything else constant.  That means if you include these other effects, it would have been even worse!

Now to the question “Is a calorie a calorie?”  This causes a lot of confusion because some people believe that low carb diets will lead to weight loss independent of the reduction in calories.  This might be true but the reasoning is often incorrect.  The wrong answer is that carbs cause insulin to be secreted and insulin suppresses the release of fat (i.e. lipolysis).  Your body stores fat in fat cells or adipocytes.  When insulin is low, adipocytes release fat into the blood stream for you to use as fuel.  When insulin is high, lipolysis is suppressed so the body can burn the carbs or more specifically glucose.  The faulty reasoning then goes that if you eat lots of carbs, you’ll suppress lipolysis so that your fat stays trapped in your adipocytes.  Only if you eat low carbs will the fat be free to escape.  On the surface this sounds fairly reasonable.  Where it goes wrong is that if you are in steady state then at the end of the day, it doesn’t matter what happened to your insulin levels because your body is the same as when it started.

Let me be more specific.  Let’s consider a person in steady state, but eating two different diets.  In the first, she eats carbs all day.  Insulin will be high, lipolysis will be suppressed and she will burn all the carbs she ate.  At the end of the day if she ate 2000 Calories of carbs, she’ll have burned all 2000 of it (ignoring for now the small inefficiencies that won’t change the argument).  Now, let’s say she eats very low carbs for a day.  In this case, insulin stays low all day so she burns fat all day.   However, you can’t ignore the fact that she also ate 2000 Calories of food.  If it was all fat, then what she did was to burn 2000 Calories of fat that was some combination of the fat she ate and the fat that was released from the fat cells.  The leftover fat was then repackaged into the fat cells and at the end of the day she stays the same.  (If she ate no carbs at all, then she would have depleted her small stores of glycogen and then her brain would have started using ketones as fuel but that is another story).

This doesn’t mean that low carb diets won’t work.  However, if low carbs do cause you to lose weight, then the only reason they work is that you either a) burn more energy or b) eat less.  It has been argued that the drop in insulin after you eat carbs makes you hungry and can cause you to eat too much.  This might be true but there is no solid evidence for it as far as I know.  It is also plausible that eating low carbs could cause you to use more energy.  For one, if you ate lots of protein, it could take the body more energy to process it although there is no clear evidence yet.  If you eat lots of fat, there will be excess futile cycling of fat between the blood stream and fat cells and this could cause excess energy to be used but again there is no evidence yet.  Kevin Hall is actually doing some precise experiments to address these questions and will publish some results soon.

Finally, let’s get back to “Is a calorie a calorie?”.   The answer, as Bittman writes, is that it is probably not since the body does process foods differently but we don’t yet know how much it will matter.  The question of how much energy we burn on different diets could be answered shortly with ongoing experiments.  However, the question of how different diets affect appetite is a question we won’t have an answer to in the near future.


The Mandelbrot set

The Mandelbrot set is often held up as an example of how amazing complexity can be generated from a simple dynamical system.  In comments to my previous posts on the information content or Kolmogorov complexity of the brain, it was brought up as an example of how the brain could be very complex yet still be fully specified by the genome.  While I agree with this premise, the Mandelbrot set is not the best example to show this.  Now, the Mandelbrot set is a beautiful example of how you can generate incredibly complex fractal landscapes using a simple algorithm.  However, it takes an uncountably infinite amount of information to specify it.

Let’s be more precise.  Consider the iterative map z \rightarrow z^2 +C.  Pick any complex number for C and iterate the map starting at z=0.  The ensuing iterates or orbit will either go to infinity or stay bounded.  The Mandelbrot set is the set of all values of C for which the orbit stays bounded.  In essence, it consists of all complex numbers C such that the sequence C, C^2 +C, (C^2+C)^2 +C, \dots stays bounded.  You can immediately rule out some numbers.  You know that zero will always stay bounded and you also know that any number with magnitude greater than 2 will go to infinity.  In fact, to compute the Mandelbrot set, you just have to see if the magnitude of any iterate exceeds 2 because after that you know it is gone.  The question then is what happens in between and it turns out that the boundary of the Mandelbrot set is this beautiful fractal shape that looks like sea horses within sea horses and so forth.
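The escape test described above can be sketched in a few lines of Python.  This is only a minimal illustration, not anyone's reference implementation: the function names, the iteration cap `max_iter`, and the plotting window are my own assumptions.  Note that the test can only ever prove that a point is outside the set; "no escape seen yet" is not a proof of membership, which is exactly why pictures of the set are approximations.

```python
def stays_bounded(c: complex, max_iter: int = 200) -> bool:
    """Iterate z -> z**2 + c from z = 0 and report whether the orbit
    stayed within magnitude 2 for max_iter steps (max_iter is an
    arbitrary cutoff chosen for illustration)."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return False  # escaped: c is definitely not in the set
    return True  # no escape observed; c is only *probably* in the set


def ascii_mandelbrot(width: int = 60, height: int = 24) -> str:
    """Crude text rendering over the window [-2.1, 0.9] x [-1.2, 1.2]
    (the window is a hypothetical choice that happens to frame the set)."""
    rows = []
    for j in range(height):
        im = 1.2 - 2.4 * j / (height - 1)
        row = ""
        for i in range(width):
            re = -2.1 + 3.0 * i / (width - 1)
            row += "#" if stays_bounded(complex(re, im)) else " "
        rows.append(row)
    return "\n".join(rows)
```

Calling `print(ascii_mandelbrot())` draws a rough silhouette of the set.  Raising `max_iter` sharpens the boundary but never settles it.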

The question then is how much information do you need to construct the Mandelbrot set.  The answer, as proved by Blum, Cucker, Shub, and Smale (see their book Complexity and Real Computation), is that the Mandelbrot set is undecidable.  There is no algorithm to obtain the boundaries of the Mandelbrot set.  In other words, you would need an uncountable amount of information to specify it.  The beautiful pictures we see, as shown above, are only approximations to the set.

However, I am not hostile to the idea that simple things can generate complexity.  One could say that my career is based on this idea.  It is what chaos theory is all about.  I use the argument all the time.  I’m just saying that the Mandelbrot set is not a great example.  Perhaps a better example is the logistic map on real numbers x \rightarrow rx(1-x), with x between zero and one.  If r is between zero and one, then all orbits will eventually go to zero but as you increase r, the nature of the orbits will change and eventually you’ll reach a period-doubling cascade to chaos.  If you choose an r that is slightly bigger than 3.57 then you’ll get chaos.  This implies that small changes to the initial conditions will give you completely different results and also that if you just plot the iterates coming out of the map, they will seem to have no apparent pattern.  If you were to naively estimate the complexity or information content of the orbit, you could be led to believe that it has high information content even though the Kolmogorov complexity is actually quite small and is given by the logistic map and the initial condition.  However, this may also not be the greatest example because there are ways to deduce that the orbit came from a low dimensional chaotic system rather than a high dimensional system.
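To see these regimes concretely, here is a small Python sketch of the logistic map in its standard form x \rightarrow rx(1-x).  The function name, the particular values of r, and the size of the perturbation are illustrative assumptions of mine; the point is only that a three-line program produces orbits that decay, cycle, or look random depending on r.

```python
def logistic_orbit(r: float, x0: float, n: int) -> list:
    """Return [x0, x1, ..., xn] for the map x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# r between zero and one: the orbit decays to zero.
decaying = logistic_orbit(0.5, 0.3, 100)

# r = 3.2: after a transient the orbit alternates between two values.
period_two = logistic_orbit(3.2, 0.3, 1000)

# r = 3.9 (an assumed value in the chaotic regime): two orbits that start
# 1e-10 apart end up far from each other.
a = logistic_orbit(3.9, 0.2, 200)
b = logistic_orbit(3.9, 0.2 + 1e-10, 200)
max_gap = max(abs(p - q) for p, q in zip(a, b))
```

The whole "program" that generates the chaotic orbit is the map, the value of r, and the initial condition, which is why its Kolmogorov complexity is small even though the output looks patternless.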

Information content of the brain revisited

My post, The gigabit machine, was reposted on the web aggregator site reddit recently.  Aside from increasing traffic to my blog tenfold for a few days, the comments on reddit made me realize that I wasn’t completely clear in my post.  The original post was about a naive calculation of the information content in the brain and how it dwarfed the information content of the genome.  Here, I use the term information in the information theoretical sense, which is about how many bits must be specified to define a system.  So a single light switch that turns on and off has one bit of information while ten light switches have 10 bits.  If we suppose that the brain has about 10^{11} neurons, with about 10^4 connections each, then there are 10^{15} total connections.  If we make the very gross assumption that each connection can be either “on” or “off”, then we arrive at 10^{15} bits.  This would be a lower bound on the amount of information required to specify the brain and it is already a really huge number.  The genome has 3 billion bases and each base can be one of four types or two bits, so this gives a total of 6 billion bits.  Hence, the information contained in the genome is just rounding noise compared to the potential information contained in the brain.  I then argued that education and training were insufficient to make up this shortfall and that most of the brain must be specified by uncontrolled events.
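The arithmetic above is simple enough to check in Python.  This is just the back-of-the-envelope count from the paragraph, with the same gross assumptions (one bit per connection, two bits per base); the function names are hypothetical.

```python
def brain_bits(neurons: int = 10**11, connections_per_neuron: int = 10**4) -> int:
    """One bit per connection: each connection is either 'on' or 'off'."""
    return neurons * connections_per_neuron


def genome_bits(bases: int = 3 * 10**9, bits_per_base: int = 2) -> int:
    """Four possible bases means log2(4) = 2 bits per base."""
    return bases * bits_per_base


# How many genomes' worth of bits it takes to fill the naive brain count.
ratio = brain_bits() / genome_bits()
```

With these numbers the naive brain count exceeds the genome by a factor of more than a hundred thousand.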

The criticism I received in the comments on reddit was that this doesn’t imply that the genome did not specify the brain. An example that was brought up was the Mandelbrot set where highly complex patterns can arise from a very simple dynamical system.  I thought this was a bad example because it takes a countably infinite amount of information to specify the Mandelbrot set but I understood the point which is that a dynamical system could easily generate complexity that appears to have higher information content.  I even used such an argument to dispel the notion that the brain must be simpler than the universe in this post.  However, the key point is that the high information content is only apparent; the actual information content of a given state is no larger than that contained in the original dynamical system and initial conditions.   What this would mean for the brain is that the genome alone could in principle set all the connections in the brain but these connections are not independent.  There would be correlations or other high order statistical relationships between them.  Another way to say this is that while in principle there are 2^{10^{15}} possible brains, the genome can only specify 2^{6\times10^{9}} of them, which is still a large number.  Hence, I believe that the conclusions of my original post still hold – the connections in the brain are either set mostly by random events or they are highly correlated (statistically related).

Income, wealth, and being rich

An article in Bloomberg recently told of how some Wall Street bankers were struggling to make ends meet because of the drop in bonuses. There was the obvious backlash and schadenfreude.  Although I agree that if you’re making upwards of 350 thousand per year you shouldn’t get much sympathy, most people are still confused about the difference between wealth and income.  You are rich if you have lots of wealth.  Having a high income will help you attain wealth but if you spend more than your income then you will be struggling.  To be truly rich, you should be able to sustain your current lifestyle on just your wealth (e.g. savings and investments) and not require any additional income.  The easiest way to become “rich” is to reduce your expenses.

This can be made concrete with the simple model of wealth change:

\frac{dW}{dt}= I +r W - E

where W is wealth, I is annual income, r is the annual return on investments, and E is annual expenditure.  In the absence of income (I = 0), you can maintain your lifestyle and wealth if rW\ge E.  You are rich if the annual return on your wealth exceeds your annual expenditures.  If your annual expenditure is 100 thousand dollars, and your annual rate of return is 5%, then you’ll need 2 million dollars to be rich.  However, this is a conservative estimate because you’ll still have $2 million when you die.
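Here is a rough numerical check of the break-even condition rW = E, stepping the wealth equation forward with a simple Euler scheme.  The function name, the monthly step size, and the 30-year horizon are assumptions I picked for illustration.

```python
def simulate_wealth(w0: float, income: float, r: float, expenses: float,
                    years: float, steps_per_year: int = 12) -> float:
    """Euler integration of dW/dt = I + r*W - E; returns W after `years`."""
    dt = 1.0 / steps_per_year
    w = float(w0)
    for _ in range(int(years * steps_per_year)):
        w += dt * (income + r * w - expenses)
    return w


E, r = 100_000.0, 0.05
break_even = E / r  # 2 million dollars

steady = simulate_wealth(break_even, 0.0, r, E, years=30)     # stays put
shrinking = simulate_wealth(1_500_000.0, 0.0, r, E, years=5)  # drains away
growing = simulate_wealth(2_500_000.0, 0.0, r, E, years=5)    # keeps growing
```

Starting exactly at E/r, wealth stays flat; starting below it, the shortfall compounds and wealth declines faster each year.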

Suppose, you just want to maintain your lifestyle while you’re alive and have nothing left when you die. Then, we need to solve the differential equation, yielding

W(t) = \frac{(E-I)}{r}(1-e^{rt}) + W(0)e^{rt}

which relates your current wealth W(0) to the wealth W(t) you will have in t years.  If we suppose no income and zero wealth at time t, then we get

W(0) = E(1-e^{-rt})/r,

for the necessary current wealth, which gives the intuitive result that you need t times your annual expenditure if the rate of return is close to zero.  The above formula also works if we include the rate of inflation (and taxes) if we let r be the real annual rate of return that accounts for inflation and taxes.  So putting in some numbers, if we suppose that you want to have zero wealth in 50 years’ time and you have a real rate of return of 3%, then you’ll need about 26 times your annual expenditure to be rich.  If you get 5% real return then you’ll need 18 times the expenditure and if you can get the historical 7% from the stock market then you’ll only need 14 times the expenditure.  If you can get your expenditure down to $50K per year then you’ll only need $700,000 to be self-sufficient.  However, if you wanted to be really rich such that you could live basically anywhere, travel wherever you want, own a summer home, send your kids to private school, then let’s say you’ll need to spend about a million dollars a year.  If you can get 7% real returns, then you’ll need $14 million to be rich.
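The multiples quoted above follow directly from W(0) = E(1-e^{-rt})/r.  A small Python sketch, with function names of my own choosing, reproduces them and checks that the closed-form W(t) really does hit zero at the end of the horizon.

```python
import math


def wealth_multiple(r: float, t: float) -> float:
    """W(0)/E = (1 - exp(-r*t)) / r, which tends to t as r -> 0."""
    if r == 0.0:
        return t
    return -math.expm1(-r * t) / r  # expm1(x) = exp(x) - 1, accurate for small x


def wealth_at(t: float, w0: float, expenses: float, income: float, r: float) -> float:
    """Closed-form solution W(t) = (E - I)/r * (1 - e^{rt}) + W(0) e^{rt}."""
    growth = math.exp(r * t)
    return (expenses - income) / r * (1.0 - growth) + w0 * growth


# Multiples of annual expenditure needed to last 50 years with no income.
multiples = {r: wealth_multiple(r, 50.0) for r in (0.03, 0.05, 0.07)}

# Starting with the required amount, wealth runs out at year 50.
E = 50_000.0
leftover = wealth_at(50.0, E * wealth_multiple(0.07, 50.0), E, 0.0, 0.07)
```

The multiples round to 26, 18, and 14 for 3%, 5%, and 7% real returns, and `leftover` is zero up to floating-point error.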

Mar 8: Corrected some errors