Methylation at CPT1A locus is associated with lipoprotein subfraction profiles.

Lipoprotein subfractions help discriminate cardiometabolic disease risk. Genetic loci validated as associating with lipoprotein measures do not account for a large proportion of the individual variation in lipoprotein measures. We hypothesized that DNA methylation levels across the genome contribute to interindividual variation in lipoprotein measures. Using data from participants of the Genetics of Lipid Lowering Drugs and Diet Network (n = 663 for discovery and n = 331 for replication stages, respectively), we conducted the first systematic screen of the genome to determine associations between methylation status at ∼470,000 cytosine-guanine dinucleotide (CpG) sites in CD4+ T cells and 14 lipoprotein subfraction measures. We modeled associations between methylation at each CpG site and each lipoprotein measure separately using linear mixed models, adjusted for age, sex, study site, cell purity, and family structure. We identified two CpGs, both in the carnitine palmitoyltransferase-1A (CPT1A) gene, which reached significant levels of association with VLDL and LDL subfraction parameters in both discovery and replication phases (P < 1.1 × 10⁻⁷ in the discovery phase, P < 0.004 in the replication phase, and P < 1.1 × 10⁻¹² in the full sample). CPT1A is regulated by PPARα, a target of drugs used to reduce CVD. Our associations between methylation in CPT1A and lipoprotein measures highlight the epigenetic role of this gene in metabolic dysfunction.

lipoprotein particle diameters multiplied by the relative mass percentage, based on the amplitude of the methyl NMR signal and given in nanometers. The ranges of diameters for small, medium, and large particle classification within each fraction of VLDL, LDL, and HDL are given in Table 1. Unlike older methods for determining lipoprotein parameters such as ultracentrifugation, NMR does not require the physical separation of the lipoproteins. Instead, NMR uses subclass distinction in NMR spectral properties. This allows a more accurate quantification of VLDL, LDL, and HDL than spectrum representation through shape-fitting algorithms and gives information about subfraction distributions that closely agrees with that given by gradient-gel electrophoresis (14). Data were prepared by removing values more than three SDs from the mean, to exclude the possibility that outliers biased parameter estimates.
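As an illustration of the weighted average diameter described above, the following minimal sketch computes a mass-weighted mean diameter for one lipoprotein fraction. The function name and the example values are hypothetical and are not taken from the study; the sketch only assumes that each subclass contributes its average diameter weighted by its relative mass percentage.

```python
def weighted_mean_diameter(subclasses):
    """Return the mass-weighted average diameter (nm) for one lipoprotein fraction.

    `subclasses` is a list of (average_diameter_nm, relative_mass_percent) pairs.
    """
    total_percent = sum(pct for _, pct in subclasses)
    if total_percent <= 0:
        raise ValueError("relative mass percentages must sum to a positive value")
    # Divide by the total so the weights sum to 1 even if percentages are rounded.
    return sum(diameter * pct for diameter, pct in subclasses) / total_percent


# Hypothetical LDL subclasses: (average diameter in nm, relative mass %).
ldl = [(19.0, 50.0), (21.0, 30.0), (22.5, 20.0)]
ldl_diameter = weighted_mean_diameter(ldl)
```

A fraction dominated by small particles yields a smaller weighted diameter, which is why a drop in small LDL particles alone can raise the average LDL diameter.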
Epigenetic phenotyping. We isolated CD4+ T cells from frozen buffy coat samples isolated from peripheral blood using positive selection by antigen-specific magnetic beads (Invitrogen, Carlsbad, CA). We lysed cells captured on the beads and extracted DNA using DNeasy kits (Qiagen, Venlo, the Netherlands) (15). We used the Infinium Human Methylation 450 array (Illumina, San Diego, CA) to quantify genome-wide DNA methylation, as described in further detail in a manuscript from our group (15). Briefly, prior to the standard manufacturer protocol steps of amplification, hybridization, and imaging, we treated 500 ng of each DNA sample with sodium bisulfite (Zymo Research, Irvine, CA). We used Illumina GenomeStudio software to estimate β scores, defined as the proportion of total signal from the methylation-specific probe or color channel, and detection P values, defined as the probability that the total intensity for a given probe falls within the background signal intensity. During the quality control stage, we removed any β scores with an associated detection P value >0.01 and samples with more than 1.5% missing data points across ∼470,000 autosomal CpGs. Additionally, we excluded any CpG probes where >10% of samples failed to yield adequate intensity (15).
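The quality control rules above can be sketched as three filtering steps. This is only an illustration under stated assumptions: the function name and data layout (nested dictionaries keyed by sample and CpG) are hypothetical, and the order in which sample-level and probe-level exclusions are applied is assumed here, as the text does not specify it.

```python
def qc_filter(betas, detection_p, det_p_max=0.01,
              sample_missing_max=0.015, probe_fail_max=0.10):
    """Apply detection-P masking, then sample and probe exclusions.

    betas, detection_p: dict sample_id -> dict cpg_id -> float.
    Returns the retained samples and CpGs; masked values are None.
    """
    # Step 1: set beta scores with detection P > det_p_max to missing.
    masked = {}
    for sample, row in betas.items():
        masked[sample] = {
            cpg: (beta if detection_p[sample][cpg] <= det_p_max else None)
            for cpg, beta in row.items()
        }

    # Step 2: drop samples with too many missing CpGs.
    kept = {}
    for sample, row in masked.items():
        missing_fraction = sum(v is None for v in row.values()) / len(row)
        if missing_fraction <= sample_missing_max:
            kept[sample] = row
    if not kept:
        return {}

    # Step 3 (assumed to run on retained samples): drop CpGs that fail
    # in more than probe_fail_max of samples.
    cpgs = list(next(iter(kept.values())).keys())
    good_cpgs = [
        cpg for cpg in cpgs
        if sum(kept[s][cpg] is None for s in kept) / len(kept) <= probe_fail_max
    ]
    return {s: {cpg: kept[s][cpg] for cpg in good_cpgs} for s in kept}


# Toy data: sample s3 has one poorly detected probe.
betas = {"s1": {"cg_a": 0.10, "cg_b": 0.80},
         "s2": {"cg_a": 0.12, "cg_b": 0.75},
         "s3": {"cg_a": 0.09, "cg_b": 0.40}}
pvals = {"s1": {"cg_a": 0.001, "cg_b": 0.002},
         "s2": {"cg_a": 0.001, "cg_b": 0.003},
         "s3": {"cg_a": 0.001, "cg_b": 0.20}}
filtered = qc_filter(betas, pvals)
relaxed = qc_filter(betas, pvals, sample_missing_max=1.0)
```

With only two CpGs in the toy data, a single masked value already exceeds the 1.5% sample threshold, so `filtered` drops s3; relaxing that threshold instead removes the failing probe.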
We normalized the filtered β scores using the ComBat package for R software (16). We performed the normalization on random subsets of 10,000 CpGs per run, with each array of 12 samples used as a "batch." We separately normalized probes from the Infinium I and II chemistries and subsequently adjusted the β scores for Infinium II probes using the equation derived from fitting a second-order polynomial to the observed methylation values across all pairs of probes located <50 bp apart (within-chemistry correlations >0.99), where one probe was Infinium I and one was Infinium II (15). Finally, we eliminated any CpGs where the probe sequence mapped either to a location that did
genome-wide levels of significance, but these explain only a small percentage of individual variation in lipoprotein subfractions (10, 11). As yet, there have been no studies examining associations between lipoprotein subfractions and methylation patterns across the genome. We aimed to conduct the first epigenome-wide methylation study of 14 lipoprotein subfraction measures, to examine whether methylation fractions at cytosine-guanine dinucleotide (CpG) sites across the genome are associated with lipoprotein subfractions. Further, we aimed to combine high-resolution data on both common genetic variants and the methylation status of CpG sites under the hypothesis that any observed DNA methylation associations may be in part mediated by genotype.

Study population
The original study population consisted of 1,328 men and women from 148 families consisting of a mix of familial relationships including the following: parent-offspring (N = 614), sibling (N = 667), grandparent-grandchild (N = 89), avuncular (N = 617), half-sibling (N = 22), grand avuncular (N = 69), half-avuncular (N = 23), first cousin (N = 268), half-grand avuncular (N = 12), first cousin once removed (N = 81), half-first cousin (N = 11), half-first cousin once removed (N = 4), and second cousin (N = 1). All participants were of European descent recruited in Minneapolis, Minnesota, and Salt Lake City, Utah. The primary aim of the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study was to characterize the role of genetic and dietary factors on an individual's response to both a high-fat meal challenge and fenofibrate intervention. The details of GOLDN have been published elsewhere (12). Briefly, the study protocol consisted of an initial screening visit (visit 0), during which participants were asked to discontinue the use of lipid-lowering drugs and over-the-counter medication that could affect lipid levels. Approximately 4 to 8 weeks later, baseline blood chemistries were measured (visit 1). A day later (during visit 2), participants' fasting (8 h fast) blood samples were collected. The final sample consisted only of those willing to undergo the high-fat meal protocol (N = 1,036 individuals) and who had useable NMR and genotype data after exclusions (see below; N = 817). The protocol was approved by the institutional review boards at the University of Minnesota, the University of Utah, Tufts University/New England Medical Center, and the University of Alabama at Birmingham. Written informed consent was obtained from all participants.

Measures
Biochemical measurements. All plasma samples used for this analysis were collected after a 12 h fast. All samples were analyzed for lipid profiles once all collections were made for each participant in the study. Measurements of VLDL, LDL, and HDL diameter, as well as concentrations of each subfraction (small, medium, and large) were determined by NMR spectroscopy (13). NMR detects the signal emitted by lipoprotein methyl-group protons when in the field of a magnet charged at 400 MHz. The NMR signal is deconvoluted to obtain estimates of particle numbers and concentrations for each of several lipoprotein fractions. The weighted average particle diameter for each lipoprotein fraction (VLDL, LDL, and HDL) is calculated as the sum of the average
Adapted from Jeyarajah et al. (35).
"replication" subsamples. Using the "surveyselect" command in SAS v9.3, we randomly allocated two-thirds of the GOLDN study participants (n = 663) to the discovery data set and the remaining 331 participants to the replication data set ( 18 ). There were no differences in demographics or measures between the replication and discovery subsamples ( Table 2 ). Post hoc analysis revealed that there was no signifi cant difference in methylation between the discovery and the replication cohorts at locus cg17058475 [ t ( 992 ) = 0.30; P = 0.77], nor at locus cg00574958 [ t ( 992 ) = Ϫ 0.41; P = 0.68]. In the discovery subsample, we modeled associations between methylation scores (outcome/dependent variable) at each CpG site and 14 lipoprotein parameters (independent variable) using linear mixed models, adjusted for age, sex, study site, and the fi rst four principal components generated to estimate T-cell purity as fi xed effects, and pedigree as a random effect using the lmekin function of the kinship package in R ( 19 ). Methylation at each CpG site was normally distributed, and the use of methylation score as the outcome obviated the need to transform the lipoprotein parameters, some of which have nonnormal distributions ( 11 ). Methylation differs with age and associates with BMI; the use of methylation as the outcome allows one to control for these associations. This allowed us to keep the data in raw (unchanged) format without violating the assumptions of our statistical tests. This also allowed for the methylation to be corrected for the covariates such as age and BMI, given the known associations between methylation and these phenotypes. In models where methylation was the outcome, as opposed to the predictor, coeffi cients of genomic control were closer to = 1.0. This indicates that the use of methylation as a predictor resulted in less artifi cial infl ation of P values. 
The model with methylation as the outcome and lipoprotein trait as a predictor is therefore statistically sound and justified theoretically, but it precludes the ability to interpret betas. We therefore present betas for informational purposes but do not comment on effect sizes. Given that lipoprotein subfraction measures are correlated (6), and that we conducted replication analyses within our sample, we followed the precedent set by genome-wide association study analyses of lipoprotein parameters and implemented a stringent Bonferroni correction based on the number of CpG sites, not
not match the annotation file or to more than one locus. We identified such markers by realigning all probes (with unconverted Cs) to the human reference genome (15).
Genotyping. A total of 906,600 SNPs were genotyped using the Affymetrix Genome-Wide Human 6.0 array. Genotypes were defined using the Birdseed calling algorithm (17). SNPs that were monomorphic (55,530) or had a call rate <96% (82,462) were removed from the analysis. Additionally, SNPs were excluded from the analysis based on the number of families with Mendelian errors as follows: for minor allele frequency (MAF) ≥ 20%, removed if errors were present in more than three families (1,486 SNPs); for 20% > MAF ≥ 10%, removed if errors were present in more than two families (1,338 SNPs); for 10% > MAF ≥ 5%, removed if errors were present in more than one family (1,767 SNPs); and for MAF < 5%, removed if any errors were present (9,592 SNPs). In families with remaining errors, SNPs that exhibited Mendelian error were set to missing (31,595 SNPs). Furthermore, 16 participants with call rates <96% were also removed from any subsequent analyses. Subsequently, 748 SNPs failing the Hardy-Weinberg equilibrium (HWE) test at P < 10⁻⁶ were excluded from association analyses. After conducting the previously discussed quality control measures and then excluding SNPs with MAF < 1%, those that deviate from HWE proportions (P < 10⁻⁶), those SNPs missing strand information, and those with discrepancies with the mlinfo file, imputation was performed using MACH software (Version 1.0.16). After imputation, a hybrid data set was created that included 2,543,887 SNPs, of which 584,029 were initially genotyped in the GOLDN population. Missing individual-level genotyped data were kept as missing in the final genotype data set.
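The per-SNP exclusion rules above amount to a small decision function, sketched below. The function and argument names are hypothetical, and the sketch covers only the per-SNP filters stated in the text (monomorphic SNPs, call rate, MAF-dependent Mendelian error tolerance, and HWE); it does not model the per-family genotype masking or the pre-imputation filters.

```python
def max_families_with_errors(maf):
    """Largest number of families with Mendelian errors tolerated at a given MAF."""
    if maf >= 0.20:
        return 3
    if maf >= 0.10:
        return 2
    if maf >= 0.05:
        return 1
    return 0  # MAF < 5%: any Mendelian error removes the SNP


def passes_snp_qc(maf, call_rate, families_with_errors, hwe_p):
    """Return True if a SNP survives the stated per-SNP quality control filters."""
    if maf == 0.0:            # monomorphic
        return False
    if call_rate < 0.96:      # call rate below 96%
        return False
    if families_with_errors > max_families_with_errors(maf):
        return False
    if hwe_p < 1e-6:          # fails Hardy-Weinberg equilibrium test
        return False
    return True


# Hypothetical SNPs illustrating each tier of the Mendelian error rule.
common_ok = passes_snp_qc(maf=0.25, call_rate=0.99, families_with_errors=3, hwe_p=0.5)
common_bad = passes_snp_qc(maf=0.25, call_rate=0.99, families_with_errors=4, hwe_p=0.5)
rare_bad = passes_snp_qc(maf=0.03, call_rate=0.99, families_with_errors=1, hwe_p=0.5)
```

The tolerance shrinks with MAF because Mendelian inconsistencies at rare variants are more likely to reflect genotyping error than at common variants.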

Analysis
Epigenetic analysis. This is the first association study of epigenome-wide methylation and lipoprotein subfractions. Due to the unavailability of a replication sample, we conducted replication within GOLDN by dividing our participants into "discovery" and
For all replicated hits, we additionally examined these CpG-phenotype associations in the full GOLDN cohort. To evaluate deviations from the expected test statistic distributions, we constructed quantile-quantile plots (supplementary Fig. I). Finally, we constructed Manhattan plots to visualize the results (supplementary Fig. II).
For the CpG sites that reached epigenome-wide significance in the discovery subsample, we fit identical statistical models to the replication subsample (n = 331). Fourteen methylation-phenotype associations met this threshold (supplementary Table I), and we again set a stringent correction; the threshold for significance in the replication stage was set at a Bonferroni corrected α of 0.05 (P < 0.004).
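The two significance thresholds follow directly from dividing α = 0.05 by the number of tests at each stage. The short sketch below reproduces the arithmetic; it assumes a round figure of 470,000 CpG sites (the text gives only an approximate count), so the discovery threshold matches the reported 1.1 × 10⁻⁷ only after rounding.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


ALPHA = 0.05
N_CPG_SITES = 470_000        # approximate count; the exact number is not given here
N_REPLICATION_TESTS = 14     # associations carried forward from discovery

discovery_threshold = bonferroni_threshold(ALPHA, N_CPG_SITES)
replication_threshold = bonferroni_threshold(ALPHA, N_REPLICATION_TESTS)

print(f"discovery:   P < {discovery_threshold:.2e}")
print(f"replication: P < {replication_threshold:.4f}")
```

Note that the discovery correction counts CpG sites only, not CpG-by-phenotype pairs, in keeping with the rationale that the lipoprotein measures are correlated.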
We repeated the analyses in two sets of models. All models controlled for the covariates mentioned previously; the first set additionally controlled for smoking and alcohol, and the second set for fasting insulin. The results did not change, and the β coefficients from regressions were not uniformly attenuated, so these results are only presented as supplementary material (supplementary Table II).
Genetic analysis. Subsequent to the epigenome-wide methylation analysis, we used genome-wide DNA sequence variation data to test for methylation quantitative trait loci (meQTL) for the CpG sites that showed a successfully replicated signal. For these analyses, we ran mixed effects linear models with methylation score as the outcome and ∼2,543,401 genome-wide SNPs as the predictors in separate models. Our statistical model adjusted for age, sex, and T-cell purity principal components as fixed effects and family as a random effect. In this analysis, we used the combined sample of discovery and replication subsets with complete covariate information (n = 991). We conducted two sets of analyses. First, we looked for cis-acting SNPs, only including SNPs that were located in or near (±20 kb) the CPT1A gene. To further reduce the multiple testing burden, we also trimmed the list of SNPs based on linkage disequilibrium (r² > 0.3), yielding a total of 10 SNPs. We implemented the Bonferroni correction to adjust for multiple tests, with the adjusted statistical significance level of 0.05/10 = 0.005. Second, we examined the genome for trans-acting SNPs, including all SNPs available in our data, setting the significance at a Bonferroni-corrected 0.05/2,543,401 = 2.0 × 10⁻⁸. For this latter approach, we had the power to detect a SNP association that explained at least 4.5% of the variance in methylation at either of our two loci.

RESULTS
Sample characteristics for the discovery and replication subsamples are summarized in Table 2. The two subsamples did not differ with respect to demographic or lipoprotein characteristics (all P > 0.05).
Methylation at six CpGs was associated with lipoprotein parameters in the discovery subsample at genome-wide levels of significance (P < 1.1 × 10⁻⁷; supplementary Table I). Methylation at two of these CpGs, both in the CPT1A gene, showed replicated associations with the same phenotypes in the replication subsample (all P < 5.5 × 10⁻⁵; Table 3; supplementary Fig. III). Specifically, greater methylation of CPT1A was associated with decreased numbers of large, medium, and total VLDL particles, as well as a decrease in small and overall LDL particles, and thus an increase
For sites that reached genome-wide levels of significance (P < 1.1 × 10⁻⁷) in the discovery cohort and replicated at a Bonferroni corrected α of 0.05 (P < 0.004) in the replication cohort.
of CPT1A are associated with a favorable lipid profile, but these results were not mediated by genetic variation in PPARα. It is not clear whether increases in methylation cause increases or decreases in expression across the genome. Increased expression of CPT1A should be associated with an increase in the transport of triglycerides and a more favorable lipid profile. Genetic variation in CPT1A was not associated with lipoprotein diameters in our previous analysis across two populations, nor with any lipoprotein measures in a meta-analysis of genome-wide association studies across three independent populations (10, 11). Variation in CPT1A has been associated with lipid traits (29) [although other studies have failed to replicate this finding (30)] and with glycemic control (31). Thus, we suspected that the association of genetic variation at this locus with cardiometabolic traits may be mediated by altered methylation; however, our meQTL analysis did not support this.
Our analyses revealed a biologically plausible association of methylation of CPT1A with VLDL and LDL parameters but should be viewed in the light of several limitations. First, independent replication is a key outstanding issue. Second, we analyzed genomic methylation from CD4+ T cells, which represents global methylation status. A number of studies on DNA methylation have confirmed that use of peripheral white cells in studies in IR is valuable and serves as a proof of concept, but these studies have not included lipoprotein subtypes (32–34). Future research should examine the possibility of tissue-specific methylation, especially in the liver, which may produce stronger results. Our analyses were confined to a generally healthy population of European descent, unselected for hyperlipidemia (11). In addition, betas were small, and for each SD increase in methylation, much of the variance in lipoprotein measure was not accounted for. We encourage future studies that explain this additional variance.
Nonetheless, we present an association between methylation status at loci in CPT1A and the total number and the concentration of small LDL particles, LDL diameter, and the total number and concentration of large and medium VLDL particles. Patterns of these particles are associated with IR and atherosclerosis; given previous research suggesting that the expression of CPT1A is involved in insulin sensitivity, if replicated, these results may identify the methylation of CPT1A as a potential target to reduce CVD risk.
in LDL diameter. The epigenome-wide analysis for these phenotypes did not suggest either artificial inflation or deflation of P values (λ = 1.0; supplementary Table III). Betas were in the range of 0.0078 to 1.4 × 10⁻⁴ over the whole GOLDN population (Table 3), with methylation at the loci having a mean (SD) of 0.10 (0.03) at locus cg00574958 and 0.14 (0.05) at locus cg17058475.
Because no associations were identified between common sequence variants in CPT1A and the methylation status of the cg00574958 or cg17058475 markers for cis-acting SNPs at P < 0.005, or for trans-acting SNPs at P < 2.0 × 10⁻⁸, the meQTL results are not presented here.

DISCUSSION
Lipoprotein profiles are sensitive indicators of cardiometabolic risk, with specific patterns of subfractions associated with IR and atherosclerosis, although other nonlipid-based measures confer additional information on cardiometabolic risk. Numerous genetic loci have been associated with lipoprotein parameters (10, 11), but no previous studies have reported associations between lipoprotein subfraction measures and global gene methylation. Here, we present the first study to systematically screen the genome for areas where methylation fraction is associated with a lipoprotein subfraction parameter. We identified an association between methylation at two CpGs in the CPT1A gene and VLDL and LDL phenotypes. These associations were not attenuated when models additionally controlled for BMI and fasting insulin. Although BMI and fasting insulin correlate with lipoprotein profiles, our analyses suggest that this correlation does not contribute to the association between lipoproteins and CPT1A methylation. As methylation in CPT1A was not associated with genetic variation in the CPT1A gene, we conclude that local genetic variation does not account for the methylation associations reported.
Increased methylation at two CpGs located in the 5′ untranslated regulatory region of CPT1A (cg00574958 and cg17058475) was associated with a decrease in large and medium VLDL particles, and thus a decrease in the overall number of VLDL particles. The association of methylation status at one of these CpGs (cg00574958) with a decrease in the small subfraction of LDL only (and not the medium or large subfraction) led to an association with an increase in average LDL diameter. Smaller, denser LDLs are considered atherogenic, and increases in small, dense LDLs are the lipoprotein change most consistently associated with IR and atherosclerosis (4–7, 20–22). The role of VLDL in cardiometabolic disease has been the subject of less research, but this finding highlights the role of CPT1A in determining an individual's lipoprotein subfraction profile.
CPT1A is expressed in the liver and attaches carnitine to long-chain fatty acids so they can be shuttled into mitochondria for β-oxidation (23–26). Activation of PPARα via therapeutic agents such as fibrates, which lower CVD risk, increases CPT1A expression, as do long-chain fatty acids (27, 28). Our results suggest that increases in methylation