When it comes to measuring cellular age, telomere length is often the first method that comes to mind. However, an increasing body of work suggests that an epigenetic clock, specifically one based on DNA methylation, may tell time even more accurately. Previous studies have shown how age-related changes in DNA methylation vary with tissue type. In a Genome Biology study published today, Steve Horvath from the University of California, Los Angeles, USA reveals his novel ‘age calculator’, which accurately predicts DNA methylation age across multiple tissue types. In evaluating this age predictor, Horvath takes advantage of large volumes of publicly available DNA methylation data sets. Here he explains how the study came about and discusses the potential impact it could have on diagnosing and characterizing disease, specifically in the context of cancer, where Horvath noticed a particularly intriguing result.


What was the motivation behind this study, and how did your previous research lead up to it?

In order to study what causes aging and what can be done about it, we need a way to measure age. In other words, there is a need for accurate aging clocks. I had previously spent a couple of years trying to develop an aging signature based on transcriptomic data, but these studies went nowhere.

A few years ago, a colleague handed me a DNA methylation data set from a twin study of homosexuality. While we did not find an epigenetic signature of homosexuality, I quickly realized that age had a profound effect on DNA methylation levels in these data. The effect was so strong that it was straightforward to construct an age predictor based on DNA methylation levels from saliva (results that were published in PLoS One, 2011;6(6):e14821).

Last year, Roel Ophoff and I published an article in Genome Biology that showed that many age-related changes in brain tissue can also be found in blood tissue. These results, which echo those from many other groups, provide a first glimpse of the epigenetic clock. But there is a big difference between having a set of age-related CpGs and having an accurate age predictor that works across most tissues and cell types.

It took me over four years to construct a predictor of age that would work for most tissues and cell types, and that would work for data sets measured on two different versions of the Illumina platform. Assembling the large data set was as time-consuming as finding suitable statistical methods for combining different data sets and for calibrating age.


Your study builds on previous findings that DNA methylation in human cells changes with age, to show this holds true across a vast number of datasets from healthy tissue. How surprised were you by the strength of this correlation?

As a biostatistician, I am completely surprised by the strength of the correlation. If I had not done the study myself, I would probably not believe it. DNA methylation age has an almost perfect correlation of 0.96 with chronological age. This means that the epigenetic clock is almost twice as accurate as other clocks that are not based on DNA methylation levels. More precisely, when accuracy is measured by correlation, it is about 80 percent more accurate than the clock based on telomere length and 70 percent more accurate than the clock based on cyclin-dependent kinase inhibitor 2A (also known as p16Ink4a).
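To make "accuracy measured by correlation" concrete, here is a minimal sketch of how a Pearson correlation between predicted and chronological age could be computed. The function name `pearson_r` and the age values are hypothetical and invented purely for illustration; they are not data from the study, and this is not the code from the paper's supplement.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ages in years, made up for this example only.
chronological_age = [5, 20, 35, 50, 65, 80]
predicted_age = [8, 18, 37, 47, 69, 77]

r = pearson_r(chronological_age, predicted_age)
print(f"correlation = {r:.3f}")
```

A value close to 1 means the predicted ages rise almost in lockstep with the true ages, which is the sense in which a clock with a correlation of 0.96 is described as accurate.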


Using your findings in healthy tissue, you develop an ‘age calculator’ tool for people to predict the age of a subject based on DNA methylation data. What plans do you have to develop this tool further and how do you envisage it being used?

One of the supplements contains R software code, but an even more user-friendly web-based tool is also available online.

Currently, the software can be used to predict age (that is, to measure DNA methylation age) and to predict gender. Future versions will also allow the user to predict the source of the DNA, for example tissue type. Although the current version already implements a tissue predictor, it still needs to be evaluated and optimized.


Your study shows that for some diseases, especially some cancers, DNA methylation age is accelerated so that it is older than the subject’s chronological age. Do you have any thoughts on why this might be?

While nobody wants to be old, I don’t think that accelerated age is necessarily a bad thing. In particular, age acceleration may protect us from cancer. I find it striking that cancers with high age acceleration tend to have fewer somatic mutations. These and related findings suggest to me that DNA methylation age measures the work done by an epigenetic maintenance system as mentioned in the article.
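For readers who want the arithmetic, the sketch below assumes that age acceleration is simply DNA methylation age minus chronological age, so that a positive value means the tissue looks older than the subject. That definition, the function name `age_acceleration`, and the sample values are assumptions made for illustration, not output from the age calculator.

```python
def age_acceleration(dnam_age, chronological_age):
    """Assumed definition: DNA methylation age minus chronological age.

    Positive values mean the tissue looks epigenetically older than
    the subject; negative values mean it looks younger.
    """
    return dnam_age - chronological_age


# Hypothetical samples: (label, DNAm age, chronological age), in years.
samples = [("sample_a", 62.0, 50.0), ("sample_b", 41.5, 45.0)]

for label, dnam, chrono in samples:
    acc = age_acceleration(dnam, chrono)
    status = "accelerated" if acc > 0 else "not accelerated"
    print(f"{label}: {acc:+.1f} years ({status})")
```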


Do you think age acceleration is merely symptomatic of disease or do you think it might represent a therapeutic target?

I think this is clearly the most pressing question. The epigenetic clock is the new elephant in the room of aging research. It will probably stimulate a lively scientific discussion and carefully designed follow-up studies. It would be wonderful news for anybody who wants to live a long and healthy life if DNA methylation age turns out to be closely related to a process that causes aging. If so, it would become a valuable surrogate marker for evaluating rejuvenating interventions.


We understand that you are continuing the study of DNA methylation age and disease. What directions are you taking this research in?

Many of the most exciting questions will require teams of researchers. We should try to develop a similar epigenetic clock for mice. We need to determine whether age acceleration in easily accessible human tissues (for example blood, saliva, skin) can be used to diagnose or prognosticate age-related diseases. Large cohort studies such as the Women’s Health Initiative or the Framingham Heart Study will be useful to address these questions.

DNA methylation age is also a very promising biomarker for characterizing different cancer subtypes. My preliminary data already suggest that age acceleration of cancer tissue could inform treatment modalities. For example, patients whose breast cancer tissue looks very old are likely to have an estrogen receptor-positive or progesterone receptor-positive cancer.

Of course, we need to evaluate the epigenetic clock in many other organs and cell types. I am particularly interested in evaluating it in skeletal and cardiac muscle tissue. It will be interesting to figure out why heart tissue appears to be younger while female breast tissue appears to be older than other tissues.


As a biostatistician, you rely on the availability of data generated by others for your research, in a mutually beneficial arrangement. How much difference has the move toward open data made to the work of biostatisticians such as yourself?

The open access paradigm shift has really made all the difference in my research and that of many of my colleagues. It has also greatly improved the quality of scientific papers. Several data sets, generated by different labs, are usually needed to validate a particular finding or to illustrate that a novel statistical method is useful. I am tremendously grateful to the many researchers who went to the trouble of posting their data in open access repositories such as Gene Expression Omnibus (GEO) and ArrayExpress. From my own experience of depositing genomics data, I can assure you that the staff at these repositories are very helpful. While journal editors and reviewers play an important role in enforcing open access, I have found that most authors are very generous and forthcoming. Many of them went to considerable trouble responding to my email requests and questions. I also want to briefly mention the crucial contribution of The Cancer Genome Atlas, which is a freely accessible data repository.


Questions from Naomi Attar (@naomiattar), Senior Editor for Genome Biology.


More about the researcher(s)

  • Steve Horvath, professor of human genetics and biostatistics, University of California, Los Angeles, USA.

    Steve Horvath is professor of human genetics and biostatistics at the University of California, Los Angeles, USA. He obtained his PhD in mathematics from the University of North Carolina, USA, and subsequently gained a Doctorate of Science in Biostatistics from the Harvard School of Public Health, USA. His current research interests lie at the intersection…

