Robust validation of methylation levels association at CPT1A locus with lipid plasma levels.

There is increasing enthusiasm regarding the use of bio-banked whole blood DNA as a model to discover methylation marks associated with biological phenotypes and generate novel mechanistic hypotheses (1–3). DNA methylation has a critical role in cell functions and is cell-type specific. Such cell specificity makes DNA methylation particularly challenging for epidemiological epigenetic investigations because disease-relevant cell types might not be accessible due to practical issues such as availability, ethics, and the cost of more complex specimen collection.
 
Recent work suggests that agnostic methylation-wide association scans (MWAS) in peripheral blood can reflect phenotype-associated methylation marks in other tissues and cell types, with effects detected in established effector cells much stronger than effects detected in blood (4). These observations suggest that marks detected in blood are associated with functions in effector cells. The Illumina HumanMethylation450 (HM450K) array is a robust assay to measure DNA methylation across the genome (4–7). For any high-throughput technology, and in particular for a novel assay such as the HM450K, rigorous quality control procedures are warranted, and the robustness of findings must be validated through independent replication to avoid reporting spurious associations.
 
In the current issue of the Journal, Frazier-Wood et al. (8) reported the novel findings of significant negative correlations between methylation levels at two CpG sites in the CPT1A locus and plasma levels of VLDL and LDL. Methylation levels were assessed with the HM450K array in DNA from CD4+ T-cells isolated from peripheral blood. Because no independent study samples were available for replication, the authors adopted an internal validation method, splitting the whole sample into “discovery and replication subsamples”. This strategy provides arguments in favor of the discovered associations but does not provide evidence of robustness against spurious findings due to sampling or confounding biases or any other undetected biases present in the study sample. A robust and thorough validation strategy implies the use of independent study samples and variation in the study designs (9, 10). The validation phase is of particular importance in MWAS, as this technique is particularly subject to confounding (3). Thus, we tested the two CPT1A CpG sites that Frazier-Wood et al. found associated with lipid-related traits in two independent study samples whose designs differ considerably from each other and from that of the Frazier-Wood study.
 
The studies differed in sampling scheme, DNA methylation specimen, and array preprocessing approaches. The notable differences in design and sample characteristics between the three studies are shown in Table 1. Most notable is the method of lipid measurement: nuclear magnetic resonance spectroscopy in Frazier-Wood et al. and spectrophotometry in our studies. In addition to sampling variation, and of particular interest for MWAS studies, Frazier-Wood et al. assessed DNA methylation in isolated CD4+ T-cells, while we assessed methylation in peripheral whole blood, which includes CD4+ T-cells (<30%) and several other leukocyte subtypes. Finally, different normalization procedures were used: we applied the SWAN methodology (4, 11) to globally normalize β values from the Infinium I and II probes, while separate normalization by probe type was applied by Frazier-Wood et al.
 
 
 
TABLE 1. 
 
Main design and sample characteristics of the three MWAS studies on lipids 
 
 
 
Despite the nontrivial differences between these studies, we observed strong statistical evidence for a negative association between the two CPT1A CpG sites (cg00574958 and cg17058475) identified by Frazier-Wood et al. and plasma levels of both LDL and triglycerides (TG) in two independent studies, the MARTHA (4, 12) and F5L-pedigree studies (13). In our two samples totaling 526 individuals, increased DNA methylation levels at the CPT1A CpG sites were associated with decreases in both LDL and TG (Table 2). A 1% increase in cg00574958 DNA methylation levels was associated with a 0.057 ± 0.011 decrease in log TG levels (P = 5.71 × 10⁻⁸). Corresponding values for a 1% increase in cg17058475 levels were 0.030 ± 0.008 (P = 9.83 × 10⁻⁵).
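
For readers who prefer a multiplicative scale, these coefficients can be back-transformed. The worked arithmetic below is only a sketch and assumes that "log" denotes the natural logarithm, which the text does not state; with a base-10 logarithm the percentages would be different.

```latex
\begin{align*}
\text{cg00574958:}\quad \Delta \log(\mathrm{TG}) = -0.057
  &\;\Rightarrow\; \frac{\mathrm{TG}_{\text{after}}}{\mathrm{TG}_{\text{before}}}
   = e^{-0.057} \approx 0.945
  \quad (\text{about } 5.5\% \text{ lower TG}) \\
\text{cg17058475:}\quad \Delta \log(\mathrm{TG}) = -0.030
  &\;\Rightarrow\; \frac{\mathrm{TG}_{\text{after}}}{\mathrm{TG}_{\text{before}}}
   = e^{-0.030} \approx 0.970
  \quad (\text{about } 3.0\% \text{ lower TG})
\end{align*}
```

Under that assumption, each ratio is the relative change in plasma TG associated with a 1% increase in methylation at the corresponding CpG site.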
 
 
 
TABLE 2. 
 
Association of cg00574958 and cg17058475 CPT1A CpG variability with plasma TG and LDL levels in the MARTHA and F5L-pedigrees 
 
 
 
Of note, cg00574958 and cg17058475 were highly correlated (Spearman ρ = 0.67 in both studies, P < 10⁻¹⁶); adjusting for cg00574958 in the model abolished the effect observed for cg17058475 on log TG levels. Finally, after adjustment for key covariates (age, sex, BMI, cell type composition, batch, and chip effects), cg00574958 explained ∼4% of the variance in log TG plasma levels, both in MARTHA and F5L-pedigrees. A negative association was also observed between plasma LDL levels and cg17058475 (P = 1.7 × 10⁻²) but not cg00574958 (P = 0.11). No association was observed with HDL-cholesterol levels (P = 0.96 for cg00574958 and P = 0.75 for cg17058475), nor with total cholesterol levels (P = 0.16 for cg00574958 and P = 0.53 for cg17058475).
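
The covariate-adjusted association model used here (log TG regressed on a CpG site plus covariates) can be illustrated with a small ordinary least squares routine. This is a minimal sketch and not the authors' analysis code: the function name `ols_adjusted` is hypothetical, it uses only the Python standard library, and it does not implement the mixed linear model applied to the pedigree data.

```python
import math


def ols_adjusted(y, columns):
    """Ordinary least squares through the normal equations (X'X) b = X'y.

    y       -- outcome values, e.g. log(TG) per individual
    columns -- list of predictor columns (CpG methylation first, then covariates)
    Returns (coefficients, standard_errors); index 0 is the intercept.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]  # add intercept
    p = len(X[0])
    if n <= p:
        raise ValueError("need more observations than parameters")

    # Augmented matrix [X'X | X'y | I]: one Gauss-Jordan pass then yields
    # both the coefficients and the inverse of X'X.
    A = []
    for j in range(p):
        row = [sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
        row.append(sum(X[i][j] * y[i] for i in range(n)))
        row += [1.0 if k == j else 0.0 for k in range(p)]
        A.append(row)

    for j in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(j, p), key=lambda r: abs(A[r][j]))
        A[j], A[pivot] = A[pivot], A[j]
        d = A[j][j]
        A[j] = [v / d for v in A[j]]
        for r in range(p):
            if r != j:
                f = A[r][j]
                A[r] = [vr - f * vj for vr, vj in zip(A[r], A[j])]

    beta = [A[j][p] for j in range(p)]
    inv_xtx = [A[j][p + 1:] for j in range(p)]

    # Residual variance and coefficient standard errors.
    rss = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
              for i in range(n))
    sigma2 = rss / (n - p)
    se = [math.sqrt(sigma2 * inv_xtx[j][j]) for j in range(p)]
    return beta, se
```

With log(TG) as `y` and the methylation percentage as the first column, `beta[1]` is the change in log TG per 1% methylation, adjusted for the remaining columns. Passing both CpG sites as columns gives mutually adjusted coefficients, which is the kind of model in which the cg17058475 effect was reported to vanish.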
 
The CPT1A protein is essential for fatty acid oxidation (a multistep process that metabolizes fats and converts them into energy) and is expressed in the liver and glandular tissues (14). This pivotal role in fatty acid metabolism makes CPT1A DNA methylation marks relevant to many metabolic disorders (from lipids to glucose homeostasis). The lipid-related DNA methylation probes in this study (cg00574958 and cg17058475) are designated as falling in a single “CpG shore”, and are flanked by two CpG islands. Human ENCODE HM450K studies performed on over 40 cell lines suggest these two probes show more variable methylation levels than the two CpG islands that flank them. The uncoupled methylation levels at these probes versus the flanking islands suggest that the observed variation is more likely to be regulatory. This region also shows evidence of open chromatin through DNase I hypersensitivity assays (15) and gene regulatory potential through chromatin immunoprecipitation sequencing of the epigenetic modification H3K27ac (16). More work is needed to understand the functional impact of DNA methylation on CPT1A gene regulation. 
 
Three important conclusions emerge from this validation study. First, despite limitations in the Frazier-Wood et al. replication approach, the published results are robust to variation in sample, study design, normalization procedures, and even blood DNA specimen type. Second, inter-individual variation in lipid-related traits appears to be under the influence of DNA methylation regulation at the CPT1A locus. This epidemiological evidence now requires technical validation and functional work to confirm that these methylation marks are causes rather than consequences of variation in lipid levels. Given that DNA methylation marks are potentially reversible, evidence for their role in the regulation of such a key enzyme is of great interest, as it could lead to new therapeutic approaches (e.g., drug and/or diet supplementation) to modulate CPT1A expression. Finally, and of major importance for MWAS studies, peripheral whole blood DNA methylation marks were detected in an enzyme gene expressed in the liver and glandular tissues, suggesting that such marks could serve as surrogates for methylation in more closely related effector cells, such as hepatocytes. This last point adds to the recent paper by Dick et al. (4), which also supports the value of peripheral whole blood DNA methylation marks as biomarkers of methylation in other tissues.

TABLE 2 notes. Association was tested using a linear regression model (mixed linear model in F5L-pedigrees) where log(TG) (LDL, resp.) was the outcome and the CpG site the predictor variable. Analyses were adjusted for age, sex, cell type, batch, and chip effects. Reported coefficients (standard error) represent the increase in outcome value associated with a 1% increase in CpG site variability. In MARTHA, TG and LDL phenotypes were measured in 327 and 180 individuals, respectively. In the F5L-pedigrees study, lipid phenotypes were measured in 199 individuals.
a Results of the MARTHA and F5L-pedigrees studies were combined into a random effect meta-analysis based on the inverse-variance weighting method.
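
The Table 2 footnote states that the MARTHA and F5L-pedigrees results were combined in a random effect meta-analysis based on inverse-variance weighting. The sketch below shows one common way to do this, the DerSimonian-Laird estimator. The choice of estimator is an assumption, since the text does not name one, and the function name `random_effects_meta` and any inputs passed to it are hypothetical.

```python
import math


def random_effects_meta(betas, ses):
    """Pool per-study coefficients with DerSimonian-Laird random effects.

    betas -- per-study regression coefficients
    ses   -- their standard errors
    Returns (pooled_beta, pooled_se, tau2, two_sided_p).
    """
    k = len(betas)
    if k != len(ses) or k < 2:
        raise ValueError("need matching betas and ses for at least two studies")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sw

    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    sw_star = sum(w_star)
    pooled = sum(wi * b for wi, b in zip(w_star, betas)) / sw_star
    pooled_se = math.sqrt(1.0 / sw_star)

    # Two-sided p-value from the normal approximation.
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, tau2, p
```

When the between-study variance estimate is zero, the result coincides with the fixed-effect inverse-variance estimate; with only two studies the heterogeneity estimate is imprecise, so the pooled standard error should be read with caution.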
We thank Dr. Michael D. Wilson for his judicious comments on the manuscript and for the many fruitful discussions about epigenetic regulation.