Associations of genetic variants for adult lipid levels with lipid levels in children. The Generation R Study[S]

Lipid concentrations are heritable traits. Recently, the number of known genetic loci associated with lipid levels in adults increased from 95 to 157. The effects of these 157 loci have not been tested in children. Considering that lipid levels track from childhood to adulthood, we examined whether these variants already affect lipid concentrations in a large group of 2,645 children with a median age of 6.0 years (95% range 5.7–7.3 years) from the population-based Generation R Study. Twenty-eight SNPs associated with TGs, 39 SNPs associated with total cholesterol (TC), 28 SNPs associated with LDL cholesterol (LDL-C), and 56 SNPs associated with HDL cholesterol (HDL-C) were analyzed individually and combined into genetic risk scores (GRSs). All risk scores were associated with their specific outcomes. The differences in mean absolute lipid and lipoprotein values between the 10% of children with the highest lipid or lipoprotein GRS versus the 10% with the lowest score were 0.28, 0.25, 0.32, and 0.30 mmol/l for TGs, TC, LDL-C, and HDL-C, respectively. In conclusion, we show for the first time that GRSs based on 157 SNPs associated with adult lipid concentrations are associated with lipid levels in children. The genetic background of these phenotypes at least partly overlaps between children and adults.

org/mpg/snap/ldsearch.php). The other five SNPs were excluded, because no perfect proxy was available. Thus, for the statistical analysis we had information on 28, 39, 28, and 56 SNPs associated with TGs, TC, LDL-C, and HDL-C, respectively (supplemental Tables S1a-d).

Lipid measurements
We measured TG, TC, LDL-C, and HDL-C levels once for each participant from 30 min fasting venous blood samples collected during the 6 year follow-up with enzymatic methods using the Cobas 8000 analyzer (Roche, Almere, The Netherlands). Quality control demonstrated intra- and inter-assay coefficients of variation ranging from 0.77 to 1.39% and from 0.87 to 2.40%, respectively.

Statistical analysis
All lipid concentrations were normally distributed, except TGs, which were natural logarithm transformed to achieve normality. First, we studied the associations of the individual SNPs with their respective outcomes using multiple linear regression models. Second, as our main analysis, we combined the SNPs into weighted risk scores for each trait. We summed the number of trait-increasing alleles (denoted as risk alleles throughout this report) for all SNPs in the score using the dosage data from the Generation R Study GWAS dataset (9), with each SNP being weighted by its previously reported effect size in adults as follows:

$$\text{weighted GRS} = \sum_{i=1}^{n} \beta_i \times \text{SNP}_i$$

where n is the number of SNPs, β_i is the effect size reported in the adult GWAS (weight) for SNP i, and SNP_i represents the number of trait-increasing alleles (dosage) from the Generation R GWAS data for that SNP. So for each SNP, the dosage of the trait-increasing allele from the Generation R GWAS data was multiplied by the effect size from the adult GWAS and these values were then summed across all SNPs in the GRS. Next, we rescaled the weighted risk scores to range from zero to the maximum number of trait-increasing alleles, which equals two times the number of SNPs in the score, using the following formula: Rescaled GRS = (weighted GRS × maximum number of trait-increasing alleles)/(2 × sum of weights). For the figures, the risk scores were rounded to the nearest integer for clarity of presentation. Third, we created unweighted risk scores by summing the number of trait-increasing alleles for all SNPs in the score using the dosage data. We studied the associations of weighted and unweighted GRSs with all outcomes using multiple linear regression models.
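The weighted, rescaled, and unweighted scores described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code (the analyses were run in SPSS); the function names, the toy dosages, and the toy effect sizes are hypothetical, and the sketch assumes every effect size is positive because each SNP is coded toward its trait-increasing allele.

```python
import numpy as np


def weighted_grs(dosages, betas):
    """Per-child sum over SNPs of (trait-increasing allele dosage x adult effect size)."""
    dosages = np.asarray(dosages, dtype=float)  # shape (children, SNPs), values in [0, 2]
    betas = np.asarray(betas, dtype=float)      # shape (SNPs,), adult GWAS weights
    return dosages @ betas


def rescale_grs(weighted, betas):
    """Rescale so the score runs from 0 to the maximum allele count (2 x number of SNPs)."""
    betas = np.asarray(betas, dtype=float)
    max_alleles = 2 * betas.size
    return np.asarray(weighted, dtype=float) * max_alleles / (2 * betas.sum())


def unweighted_grs(dosages):
    """Plain count of trait-increasing alleles per child."""
    return np.asarray(dosages, dtype=float).sum(axis=1)


# Hypothetical toy data: three children, three SNPs.
dosages = np.array([
    [2.0, 2.0, 2.0],  # carries every trait-increasing allele
    [0.0, 0.0, 0.0],  # carries none
    [1.0, 0.5, 2.0],  # imputed dosages may be fractional
])
betas = np.array([0.10, 0.20, 0.30])  # illustrative weights, not values from the paper

w = weighted_grs(dosages, betas)
r = rescale_grs(w, betas)
u = unweighted_grs(dosages)
```

With these toy inputs, a child carrying all trait-increasing alleles gets a rescaled score equal to the maximum allele count (six), which is the property the rescaling formula is meant to guarantee.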
We calculated the percentage of variance in the outcome explained by the GRS or by individual SNPs by subtracting the unadjusted R² of a regression model including only the covariates from the unadjusted R² of a regression model including the GRS or the individual SNP and the covariates. All models were adjusted for sex, age, and ethnicity. For the ethnicity adjustment, we used the first four principal components derived from the Generation R Study GWAS data. The main analysis was performed in the full cohort. Additionally, we performed a sensitivity analysis repeating all analyses in the European ancestry population only and adjusting for the first four European-ancestry-specific principal components. The principal component analysis was performed on the samples that passed the quality control procedures for the GWAS. These samples were merged with the Northwestern European (CEU), Sub-Saharan West African (YRI), and Asian (CHB and JPT) panels from HapMap Phase II release 22 build 36 (19,20). Only a common set of 36,845 independent (LD-pruned) autosomal SNPs was used. Pairwise identity-by-state relations were calculated

A better understanding of the pathophysiological mechanisms underlying lipid levels in children may form the starting point for future preventative and therapeutic strategies. Therefore, our aim was to study the associations of GRSs based on all 157 SNPs currently known to be associated with serum lipids in adults with the same outcomes in children participating in the Generation R Study, a population-based prospective cohort study from fetal life until young adulthood.
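The explained-variance calculation described in the statistical analysis (the R² of a model with covariates plus the GRS, minus the R² of a covariates-only model) can be sketched as follows. This is an illustrative Python version on simulated data, not the SPSS procedure the authors used; the helper names and all simulated values are hypothetical.

```python
import numpy as np


def r_squared(y, predictors):
    """Unadjusted R^2 of an ordinary least squares fit with an intercept."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    ss_res = residuals @ residuals
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def variance_explained(y, grs, covariates):
    """R^2(covariates + GRS) minus R^2(covariates only)."""
    r2_covariates = r_squared(y, covariates)
    r2_full = r_squared(y, np.column_stack([covariates, grs]))
    return r2_full - r2_covariates


# Simulated stand-ins for the covariates (e.g. sex, age, principal components) and a GRS.
rng = np.random.default_rng(0)
n_children = 500
covariates = rng.normal(size=(n_children, 3))
grs = rng.normal(size=n_children)
lipid = 0.5 * covariates[:, 0] + 0.3 * grs + rng.normal(size=n_children)

delta_r2 = variance_explained(lipid, grs, covariates)
```

Because the two models are nested, the difference cannot be negative apart from floating-point error, so it can be read directly as the share of outcome variance attributable to the score beyond the covariates.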

Design and study population
The study was performed in the Generation R Study, which is a population-based prospective cohort study from fetal life until young adulthood in Rotterdam, The Netherlands. In total, 9,778 pregnant women with a due date between April 2002 and January 2006 were included in the study, the vast majority of whom were prenatally enrolled (14). The study was approved by the local Medical Ethical Committee. Written informed consent was obtained for all participating children. Large amounts of prenatal and postnatal data were collected from physical examinations, fetal ultrasounds, biological samples, interviews, and questionnaires, and there is ongoing follow-up. The Generation R Study is a multiethnic cohort. The largest subgroup is of Dutch/European origin (57%), with other ethnic groups including Turkish (7.4%), Surinamese (7.3%), and Moroccan (6.4%) (15). Our analyses were restricted to singleton live born children who had genome-wide association data and at least one of the lipid measurements available. From a total of 9,901 children born to 9,778 mothers, 9,506 were singleton live births. GWAS data were available for 5,732 of these children, of whom 2,645 had at least one of the blood lipid measurements (Fig. 1).

Genetic data
In the Generation R Study, cord blood samples were collected. In a small minority of the study population, it was not possible to obtain cord blood. In these cases, blood was collected during the follow-up visit at the age of six years (16). DNA was extracted from these blood samples and genome-wide association scans were performed with the Illumina 610 Quad and 660 W platforms. SNPs were excluded if they had a minor allele frequency ≤0.001, a call rate <0.98, or if they were out of Hardy-Weinberg Equilibrium (P < 1E-6) (15). The genotyped dataset was then imputed to the 2.5 million SNPs of the HapMap II (release 22) cosmopolitan panel with MACH (version 1.0.15) software (14,15,17,18).
From a recent large GWAS meta-analysis, we identified 157 SNPs known to be associated with lipid levels in adults: 28 SNPs for TGs, 39 for TC, 30 for LDL-C, and 60 for HDL-C (9). Genotypes for the SNPs of interest for our study were extracted from the existing Generation R imputed GWAS dataset. SNPs with an imputation quality (R-squared) of less than 0.3 were excluded. This was the case for one SNP, rs1801689, associated with LDL-C in adults. The quality of the imputation for the other SNPs ranged from 0.62 to 1.0, with a mean of 0.96. Eight SNPs were not available in our dataset: rs11613352 (for TGs); rs9411489 and rs7640978 (both for LDL-C); and rs4759375, rs838880, rs2652834, rs1047891, and rs1936800 (all for HDL-C). Of these, rs11613352, rs7640978, and rs1936800 were replaced by perfect proxies: rs11614506, rs9824286, and rs1936801, respectively. R² was 1 for all three pairs based on the European ancestry linkage disequilibrium data in HapMap release 22 (https://www.broadinstitute.

(http://genenetwork.nl/bloodeqtlbrowser/) to explore their associations with expression of genes within a distance of 250 kb in 5,311 blood samples from seven cohorts (21).

Table 1 presents the characteristics of the study population for the full group and for the children of European ancestry separately at a median age of 6.0 years (95% range 5.7-7.3 years).

Associations of SNPs for TG with childhood lipid levels
Two individual SNPs were associated with TGs in children: rs964184 (P = 1.04E-10) and rs12678919 (P = 1.30E-5), with explained variances of 1.6 and 0.7% of TG levels, respectively (supplemental Table S1a). The weighted GRS, ranging from 15 to 42 with a mean of 28, was associated with childhood TGs and HDL-C. This GRS explained

for each pair of individuals (representing the average proportion of alleles shared by those individuals) in PLINK and principal axes of variation (or genomic components, equivalent to principal components) were derived from this identity-by-state matrix by multi-dimensional scaling. Participants were defined as being of non-Northwestern European ancestry when deviating more than four SDs from the CEU panel mean value in any of the first four genomic components. Further details of these analyses are described in Medina-Gomez et al. (15). We applied Bonferroni correction for four phenotypes to adjust for multiple testing, giving a significance cutoff level of 0.0125 for the GRSs, and for the number of SNPs per phenotype for the analyses of individual SNPs. All analyses were performed using the Statistical Package for the Social Sciences (SPSS) version 21.0 for Windows (SPSS IBM, Chicago, IL).
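The Bonferroni cutoffs mentioned above follow from simple division. The short Python sketch below reproduces them under the assumption of a family-wise alpha of 0.05, which is not stated explicitly in the text but is implied by the reported GRS cutoff of 0.0125 for four phenotypes; the variable names are illustrative, and the SNP counts are the per-trait numbers analyzed in this study.

```python
# Assumed family-wise significance level (implied by 0.0125 x 4 phenotypes).
ALPHA = 0.05
N_PHENOTYPES = 4

# Number of SNPs analyzed per trait, as reported for this study.
SNPS_PER_TRAIT = {"TG": 28, "TC": 39, "LDL-C": 28, "HDL-C": 56}

# Cutoff for the GRS analyses: alpha divided by the number of phenotypes.
grs_cutoff = ALPHA / N_PHENOTYPES

# Cutoffs for the single-SNP analyses: alpha divided by the SNP count of each trait.
snp_cutoffs = {trait: ALPHA / n_snps for trait, n_snps in SNPS_PER_TRAIT.items()}

for trait, cutoff in snp_cutoffs.items():
    print(f"{trait}: per-SNP cutoff {cutoff:.2e}")
```

Under this reading, the per-SNP cutoff is strictest for HDL-C, which has the most SNPs in its score.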

Expression quantitative trait loci analysis
To explore functionality, we examined whether SNPs that were individually associated with childhood lipid levels affect mRNA expression [expression quantitative trait loci (eQTLs)]. The SNPs were introduced into an online blood eQTL browser

Difference, β coefficient from the linear regression models, which represents the change in the outcome variable for each additional average outcome-increasing allele in the GRS.

2.8% of the variance in TG levels and 0.4% of the variance in HDL-C levels. For each additional average risk allele, childhood ln-TG increased by 0.020 mmol/l (SE 0.002, P = 6.63E-18) and HDL-C decreased by 0.006 mmol/l (SE 0.002, P = 4.75E-4). The difference in mean absolute TG levels between the 10% of children with the highest GRS and the 10% with the lowest GRS was 0.28 mmol/l. The TG GRS was not associated with childhood TC and LDL-C levels (Table 2, Fig. 2A, supplemental Fig. S1a-c). The unweighted GRS was also associated with TGs and HDL-C, with similar effect estimates as the weighted GRS (supplemental Table S2).
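The comparison between the 10% of children with the highest GRS and the 10% with the lowest GRS can be sketched as below. This is a hedged illustration only: the text does not specify how the decile boundaries or ties were handled, so the percentile-based cut used here is an assumption, and the function name and toy data are hypothetical.

```python
import numpy as np


def top_bottom_decile_difference(grs, lipid):
    """Mean lipid level in the top 10% of the GRS minus the mean in the bottom 10%."""
    grs = np.asarray(grs, dtype=float)
    lipid = np.asarray(lipid, dtype=float)
    # Assumed cut points: the 10th and 90th percentiles of the score.
    low_cut, high_cut = np.percentile(grs, [10, 90])
    return lipid[grs >= high_cut].mean() - lipid[grs <= low_cut].mean()


# Toy data: 100 children whose lipid level rises linearly with the score.
grs = np.arange(100, dtype=float)
lipid = 1.0 + 0.01 * grs

diff = top_bottom_decile_difference(grs, lipid)
```

In this toy example each extreme group contains ten children, and the difference reflects only the built-in linear trend.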

Associations of SNPs for TC with childhood lipid levels
None of the individual SNPs were associated with TC levels in children (supplemental Table S1b). The weighted GRS for TC, combining 39 adult SNPs, had a mean of 42 (range 28-54). The GRS was significantly associated with TC and LDL-C levels in children, with explained variances of 1.5 and 1.3%, respectively. For each additional average risk allele, childhood TC levels increased by 0.021 mmol/l (SE 0.003, P = 2.86E-10) and LDL-C levels increased by 0.018 mmol/l (SE 0.003, P = 1.80E-9). The difference in mean TC levels between the 10% of children with the highest GRS and the 10% with the lowest GRS was 0.25 mmol/l. There was no association of the TC GRS with TGs or with HDL-C (Table 2, Fig. 2B, supplemental Fig. S2a-c). Results for the unweighted GRS for TC were very similar to those for the weighted GRS (supplemental Table S2).

Associations of SNPs for LDL-C with childhood lipid levels
Two SNPs were individually associated with LDL-C levels in children, rs4420638 (P = 3.47E-10) and rs629301 (P = 4.63E-4), with explained variances of 1.5 and 0.5%, respectively (supplemental Table S1c). The weighted GRS, based on 28 LDL-C SNPs, ranged from 15 to 39 with a mean of 28. The GRS was significantly associated with childhood TC, LDL-C, and HDL-C levels, but not with TG levels.

Table 2, Fig. 2D, supplemental Fig. S4a-c). The difference in mean HDL-C levels between the 10% of children with the highest GRS and the 10% with the lowest GRS was 0.30 mmol/l. The unweighted HDL-C GRS was associated with all outcomes except LDL-C (supplemental Table S2).

Associations in children of European ancestry
We performed sensitivity analyses in the children of European ancestry only. The results were similar to those in the full group. The LDL-C weighted GRS, based on

mmol/l (SE 0.002, P = 0.002). The difference in mean LDL-C levels between the 10% of children with the highest GRS and the 10% with the lowest GRS was 0.32 mmol/l. The GRS explained 1.4, 2.5, and 0.4% of the variance in TC, LDL-C, and HDL-C in children, respectively (Table 2, Fig. 2C, supplemental Fig. S3a-c). The results for the unweighted LDL-C risk score were similar to those for the weighted risk score, except that there was no association with HDL-C levels (supplemental Table S2).

Associations of SNPs for HDL-C with childhood lipid levels
Four individual SNPs were associated with childhood HDL-C, rs3764261 (P = 1.07E-39), rs1532085 (P = 1.80E-5), rs4765127 (P = 3.21E-4), and rs1883025 (P = 6.00E-6), with explained variances of 6.2, 0.7, 0.5, and 0.8%, respectively. The weighted GRS, combining 56 adult HDL-C SNPs and ranging from 46 to 84 with a mean of 64, was associated with

N = 2,629). The x-axis represents the categories of the weighted GRS. On the left y-axis the number of the children per risk score category is shown in the histogram. The right y-axis and the dots represent the mean ln TG level per category. The regression line represents the P value from the analysis of the continuous risk score as presented in Table 2.
B: Association of the weighted GRS based on 39 SNPs for TC in adults with TC levels in children (N = 2,637). The x-axis represents the categories of the weighted GRS. On the left y-axis the number of the children per risk score category is shown in the histogram. The right y-axis and the dots represent the mean TC level per category. The regression line represents the P value from the analysis of the continuous risk score as presented in Table 2.
C: Association of the weighted GRS based on 28 SNPs for LDL-C levels in adults with LDL-C levels in children (N = 2,639). The x-axis represents the categories of the weighted GRS. On the left y-axis the number of the children per risk score category is shown in the histogram. The right y-axis and the dots represent the mean LDL-C level per category. The regression line represents the P value from the analysis of the continuous risk score as presented in Table 2.
D: Association of the weighted GRS based on 56 SNPs for HDL-C levels in adults with HDL-C levels in children (N = 2,640). The x-axis represents the categories of the weighted GRS. On the left y-axis the number of the children per risk score category is shown in the histogram. The right y-axis and the dots represent the mean HDL-C level per category. The regression line represents the P value from the analysis of the continuous risk score as presented in Table 2.
11.8 to 21.9% in females (12). To better compare our results with those of the previous work in children, we reran our analyses using GRSs including only the 95 SNPs used in the previous work. This did not substantially change our results (data not shown). The differences in explained variances between the Finnish study and our current analysis are remarkable and may be partially explained by differences in study populations. The previous study included participants of Finnish descent only, whereas our study population was likely more heterogeneous, even when restricting our analyses to children of European ancestry only. It may be that the SNPs included in the risk score have larger effects among northern Europeans. In addition, a difference in fasting status of the blood samples may play a role. In our study, lipid concentrations were measured from 30 min fasting blood samples, while in the previous study, blood samples were taken after 12 h fasting (12,25). Mean values of blood lipids in children with a mean age of 11 years (range 3-17 years) have been shown to be significantly different between study participants who had fasted for 30 min and those who had fasted for 12 h, but the differences were small (26). Thus, even taking the fasting status into account, we would not expect such large differences. We weighted the SNPs in the GRS based on the effect estimates from the GWAS in adults. Effect sizes of individual SNPs may vary across the life course and adult weights may not appropriately reflect the true effect sizes in children (27). There is a small GWAS available in children, but for proper estimation of the appropriate weights in children, there is a need for large meta-analyses of GWASs in children on lipid phenotypes (11). Part of the low percentage of variance explained by the risk scores in our population may be attributable to suboptimal weighting of the SNPs.
We found eight adult SNPs to be individually associated with lipid concentrations in children. Some of these seem to be the main drivers of the associations of the GRSs with the outcomes. Most notably, the total explained variance of 9.7% in HDL-C levels in the full group seems to be driven by rs3764261, which alone has a high explained variance of 5.9% for HDL-C. These eight individually associated SNPs are located in or close to different genes. Two SNPs were individually associated with TG. The rs964184 lies near zinc finger (ZNF)259 (APOA1 reported as the closest gene in the original report) (9). Reduced ZNF259 dosage causes neuron degeneration in mice (28). The other SNP, rs12678919, is located close to LPL, which encodes the enzyme LPL. This enzyme plays a role in lipid metabolism in mice and humans (29,30). Moreover, serum LPL concentrations are associated with the risk of CAD (31). Two SNPs were individually associated with LDL-C, rs4420638 and rs629301. The rs4420638 lies close to APOC1 (APOE reported as the closest gene in the original report) (9) and rs629301 is located in cadherin EGF LAG seven-pass G-type receptor 2 (CELSR2) (SORT1 reported as the closest gene in the original report) (9). Increased APOC1 expression has a hypertriglyceridemic effect in mice (32). The CELSR2/PSRC1/SORT1 region has been previously implicated in lipid-related phenotypes (33). Four of the SNPs associated

28 SNPs, explained 3.2 and 4.5% of the variance in child TC and LDL-C levels, respectively, which is about twice as high as in the full group (supplemental Tables S1a-d, S3, S4).

eQTL analysis
A lookup of eQTLs in a database containing information from peripheral blood samples of 5,311 individuals revealed a cis-eQTL [false discovery rate, P < 0.05] for rs12678919 for the LPL gene. Rs3764261 had one cis-eQTL for cholesteryl ester transfer protein (CETP) and rs4765127 had one cis-eQTL for coiled-coil domain containing 92 (CCDC92) (supplemental Table S5).

DISCUSSION
Our study shows that GRSs based on SNPs associated with TG, TC, LDL-C, and HDL-C levels in adults are associated with these and other lipid outcomes in school-age children.

Interpretation of the main findings
Previous studies have shown that lipid levels have a strong heritable component and that these levels track from childhood into adulthood (8,13). A recent GWAS meta-analysis identified 157 SNPs associated with lipid levels in adults, adding 62 SNPs to the 95 already identified in a previous study (9,10). Combining these SNPs into GRSs to test their effects on lipid levels has been described in adults, generally showing associations with lipid levels, but not many studies have been performed in children (12,22-24). One previous longitudinal study with multiple lipid measurements from childhood to adulthood showed that trait-specific GRSs, based on the 95 adult SNPs known at that time, were associated with all lipid outcomes in a study of 2,443 participants in a range of age groups between 3 and 45 years (12). In our study the SNPs included in the GRSs were based on a larger GWAS, yielding a total of 157 hits (9).
Our main findings were that the GRSs showed strong associations with their particular traits and also significant associations with most of the other traits. The weighted HDL-C GRS was associated with all lipid traits in our study. The other GRSs were associated with one or two other lipid phenotypes beyond their primary associated trait. Thus, our findings are largely in line with the literature.
The weighted GRSs explained up to 2.8% of the variance in TGs, up to 1.5% in TC, up to 2.5% in LDL-C, and up to 9.7% in HDL-C in the full group. In the sensitivity analysis in children of European ancestry only, the explained variances were similar, with the notable difference that the LDL-C GRS explained twice as much of the variance in TC levels and over 1.5 times as much of the variance in LDL-C levels. This may be because our GRSs are based on SNPs originally identified in adults of European ancestry. The previous work in children, with GRSs based on 95 SNPs, reports explained variances in the youngest group aged 3 to 6 years ranging from 14.8 to 26.7% in males and from

The Generation R Study is conducted by the Erasmus Medical Center in close collaboration with the School of Law and Faculty of Social Sciences of the Erasmus University Rotterdam, the Municipal Health Service Rotterdam area, Rotterdam, the Rotterdam Homecare Foundation, Rotterdam, and the Stichting Trombosedienst and Artsenlaboratorium Rijnmond (STAR-MDC), Rotterdam. The authors gratefully acknowledge the contribution of the children and parents, general practitioners, hospitals, midwives, and pharmacies in Rotterdam. The generation and management of GWAS genotype data for the Generation R Study were done at the Genetic Laboratory of the Department of Internal Medicine, Erasmus Medical Center, The Netherlands. The authors would like to thank Karol Estrada, Dr. Tobias A. Knoch, Anis Abuseiris, Luc V. de Zeeuw, and Rob de Graaf for their help in creating GRIMP, BigGRID, MediGRID, and Services@MediGRID/D-Grid (funded by the German Bundesministerium fuer Forschung und Technology Grants 01 AK 803 A-H and 01 IG 07015 G) for access to their grid computing resources.
The authors also thank Mila Jhamai, Manoushka Ganesh, Pascal Arp, Marijn Verkerk, Lizbeth Herrera, and Marjolein Peters for their help in creation, management, and quality control of the GWAS database, and Karol Estrada for his support in the creation and analysis of imputed data.
with HDL-C in adults were also associated individually with HDL-C levels in children. The rs3764261 is located close to CETP. Activity of this protein has previously been reported to play a role in HDL-C levels (34). CETP has also been suggested to be related to atherosclerosis (35). Genetic variants located in the CETP region have also been associated with myocardial infarction (36). The rs4765127 is in ZNF664, of which the exact function is unknown. This region has also been reported to be associated with adiponectin levels (37). The rs1532085 is close to hepatic lipase (LIPC), which also plays a role in lipoprotein metabolism (38). ABCA1, in which rs1883025 is located, encodes a protein with a key role in the cellular lipid removal pathway and in cholesterol efflux (39,40).
Functional analysis showed cis-eQTLs for three of the individually associated SNPs. The rs12678919 was associated with eQTLs in LPL, described above. The rs3764261 was associated with an eQTL in CETP, described above, and rs4765127 with an eQTL in CCDC92, which has also been previously associated with plasma lipoprotein fractions (41).

Methodological considerations
Although GWAS data and lipid measurements were not available for the whole cohort, the main strength of this study was the sample size of the Generation R cohort. Nevertheless, we still had limited power to find associations, especially for individual SNPs and in the analyses limited to individuals of European origin, and negative findings from this study should be interpreted carefully (12).
Of the total sample included in the study, 58.8% had genetic data available. Children with GWAS data had higher birth weights (means 3,447 and 3,295 g for those with and those without GWAS data, respectively, P = 2.28E-27) and gestational ages (means 39.9 and 39.3 weeks, for those with and those without GWAS data, respectively, P = 1.94E-61) than those without. In addition, children with GWAS data had significantly higher LDL-C levels (P = 0.012) than those without. There were no differences in TG (P = 0.77), TC (P = 0.18), and HDL-C (P = 0.46) levels between the groups. Of all children with genetic data, 49.9% had lipid measurements available. Individuals with lipid measurements had significantly higher birth weights (means 3,413 and 3,361 g for those with and those without lipid measurements, respectively, P = 0.03) and gestational ages (means 39.8 and 39.6 weeks for those with and those without lipid measurements, respectively, P = 5E-05) than those without. Although there were some significant differences, the absolute differences were small and it seems unlikely that these differences could have a strong effect on our results.

CONCLUSIONS
GRSs based on SNPs associated with adult lipid measurements are associated with the same outcomes in children. The genetic background of these phenotypes at least partly overlaps between children and adults.