The Metabolic Syndrome in Men study: a resource for studies of metabolic and cardiovascular diseases

The Metabolic Syndrome in Men (METSIM) study is a population-based study including 10,197 Finnish men examined in 2005–2010. The aim of the study is to investigate nongenetic and genetic factors associated with the risk of T2D and CVD, and with cardiovascular risk factors. The protocol includes a detailed phenotyping of the participants, an oral glucose tolerance test, fasting laboratory measurements including proton NMR measurements, mass spectrometry metabolomics, adipose tissue biopsies from 1,400 participants, and a stool sample. In our ongoing follow-up study, we have, to date, reexamined 6,496 participants. Extensive genotyping and exome sequencing have been performed for essentially all METSIM participants, and >2,000 METSIM participants have been whole-genome sequenced. We have identified several nongenetic markers associated with the development of diabetes and cardiovascular events, and participated in several genetic association studies to identify gene variants associated with diabetes, hyperglycemia, and cardiovascular risk factors. The generation of a phenotype and genotype resource in the METSIM study allows us to proceed toward a “systems genetics” approach, which includes statistical methods to quantitate and integrate intermediate phenotypes, such as transcript, protein, or metabolite levels, to provide a global view of the molecular architecture of complex traits.

the bottleneck and, consequently, genetic association studies have reported similar associations of common variants with clinical and laboratory traits in Finns as in other European populations, as presented in this review. However, a strong influence of the bottleneck, particularly the proportional enrichment of rare deleterious variants, is also influenced by the unique history of the Finnish population and will not necessarily apply to all populations influenced by a bottleneck (2).
There are benefits to carrying out population-based epidemiological studies among Finns. The participation rate in these studies has been high (60-80%) during recent decades. Additionally, access to electronic nationwide health records (hospital discharge, mortality, cancer, and drug reimbursement registries) in Finland makes it possible to follow up essentially all participants of any population-based study with respect to morbidity, mortality, and drug treatment of different diseases. Given this ideal setting to investigate nongenetic and genetic factors influencing the risk of different chronic diseases in Finns, we initiated the Metabolic Syndrome in Men (METSIM) study in 2005.

Cross-sectional study
The METSIM study includes 10,197 men, aged from 45 to 73 years at entry, randomly selected from the population register of the town of Kuopio, Eastern Finland, and examined in 2005-2010 (3). The aim of the study is to investigate nongenetic and genetic factors associated with T2D and CVD, and with cardiovascular risk factors in both cross-sectional and longitudinal settings. The METSIM study protocol includes, e.g., collection of data on drug treatment, CVD risk factors [smoking, exercise, diet, history of chronic diseases (including CAD, stroke, cardiac failure, medication, history of diabetes or early onset CAD in the family)], a questionnaire on the FINDRISC score (4), and measurement of height, weight, waist, hip, blood pressure, and fat percentage. Figure 1 presents the METSIM cross-sectional and follow-up study designs, including deep phenotyping and extensive DNA- and RNA-based genetic studies. Table 1 lists the diseases and traits investigated in the METSIM study, including nongenetic markers and genetics of the corresponding diseases/traits. Laboratory studies include the following measurements obtained after a minimum of 10 h fasting: lipids, lipoproteins, apolipoproteins, adiponectin, bilirubin, alanine aminotransferase (ALT), bile acids, high sensitivity C-reactive protein, interleukin 1 receptor antagonist (IL-1RA), interleukin 1, HbA1c, mass spectrometry metabolomics (Metabolon, Durham, NC), and proton NMR measurements (lipids and lipoproteins, amino acids, fatty acids of different lengths, and other low molecular weight metabolites). Additionally, an oral glucose tolerance test (OGTT) is performed to evaluate glucose tolerance (samples for glucose, insulin, proinsulin, and free fatty acids measured at 0, 30, and 120 minutes) (5). We also obtained subcutaneous adipose tissue samples from 1,410 participants.
We assessed insulin sensitivity and insulin secretion using markers that we validated in a separate sample of 287 nondiabetic Finnish individuals from the region of Kuopio. We validated our insulin secretion markers against measures from the intravenous glucose tolerance test, and insulin sensitivity markers against the gold standard method to evaluate insulin sensitivity, the euglycemic-hyperinsulinemic clamp (6).
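As an illustration of how OGTT samples taken at 0, 30, and 120 minutes can be turned into markers of insulin secretion and insulin sensitivity, the following minimal sketch computes two widely used indices: the insulinogenic index and the Matsuda index. The text does not state which indices were validated, so the choice of these two, the function names, and the use of a simple mean over the sampled time points are assumptions made for illustration only.

```python
from math import sqrt


def insulinogenic_index(ins0, ins30, glu0, glu30):
    """Early-phase insulin secretion: insulin rise per glucose rise (0-30 min).

    Unit-agnostic; the result is in (insulin unit) / (glucose unit).
    """
    delta_glu = glu30 - glu0
    if delta_glu <= 0:
        raise ValueError("glucose did not rise between 0 and 30 min; index undefined")
    return (ins30 - ins0) / delta_glu


def matsuda_isi(glucose_mg_dl, insulin_uU_ml):
    """Matsuda insulin sensitivity index.

    Inputs are sequences ordered by time with the fasting value first.
    Glucose is expected in mg/dL and insulin in microU/mL.
    Assumption: OGTT means are plain averages of the sampled time points.
    """
    if len(glucose_mg_dl) != len(insulin_uU_ml) or not glucose_mg_dl:
        raise ValueError("glucose and insulin series must be non-empty and equal in length")
    g0, i0 = glucose_mg_dl[0], insulin_uU_ml[0]
    g_mean = sum(glucose_mg_dl) / len(glucose_mg_dl)
    i_mean = sum(insulin_uU_ml) / len(insulin_uU_ml)
    return 10000.0 / sqrt(g0 * i0 * g_mean * i_mean)


# Hypothetical values for one participant (0, 30, 120 min).
glucose = [95.0, 150.0, 110.0]
insulin = [8.0, 60.0, 35.0]
secretion = insulinogenic_index(insulin[0], insulin[1], glucose[0], glucose[1])
sensitivity = matsuda_isi(glucose, insulin)
```

A higher insulinogenic index indicates a stronger early insulin response, and a higher Matsuda index indicates greater insulin sensitivity.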

Follow-up study
The protocol of the 5 year follow-up study is identical to that of the cross-sectional study. To date, we have reexamined 6,496 participants (participation rate 64%) and identified 693 new cases of T2D (a total of 2,106 incident and prevalent cases). The follow-up study will be finished in 2017. Diagnoses of myocardial infarction, stroke, and peripheral vascular disease (amputations) are verified from medical records against internationally accepted criteria for these diseases (7,8). We also collect data from medical records on coronary angiograms, balloon angioplasty, by-pass coronary artery surgery, heart failure, and diabetic microvascular (retinopathy, nephropathy) and macrovascular complications. Participants signed a consent that allows us to use several Finnish registries: hospital discharge registry (includes all diagnoses of hospital admission), drug reimbursement and prescription registry (includes information on drug therapy for major chronic diseases, including diabetes, CAD, hypercholesterolemia, etc.), cancer registry, and mortality registry. Therefore, the coverage of the follow-up data of the participants is nearly 100% with respect to diagnoses of different diseases.
Fig. 1. Phenotyping included several laboratory measurements in fasting, an OGTT, and proton NMR measurements from all participants. DNA-based genotyping includes OmniExpress for common and exome chip for low-frequency and rare variants, exome and genome sequencing, DNA methylation analysis in adipose tissue, and gut microbiome sequencing. Adipose tissue biopsies have been taken from 1,410 participants, RNA sequencing performed for 795 participants and RNA expression determined for 770 participants. The protocol of the follow-up study is identical to the cross-sectional study, and so far 6,496 individuals have participated in the follow-up. Additionally, all participants have registry follow-up allowing information on morbidity, mortality, and drug treatment to be obtained.

Ethics approval
The Ethics Committee of the University of Eastern Finland and Kuopio University Hospital approved the METSIM study, and this study was conducted in accordance with the Declaration of Helsinki. All study participants gave written informed consent.

Limitations of the METSIM study
The METSIM cohort includes only Finnish men. Therefore, the results reported in the METSIM participants need to be replicated in women and in other ethnic groups. The rationale to include only men was that in the age group from 45 to 73 years, the prevalence and incidence of T2D and CAD are higher in Finnish men than in Finnish women.

T2D and hyperglycemia
A major interest in our studies based on the METSIM cohort has been to investigate the predictors for the development of hyperglycemia and the conversion to T2D. T2D is often preceded by a long period of prediabetes, characterized by insulin resistance and elevation of fasting (impaired fasting glucose) or 2 h glucose (impaired glucose tolerance) in an OGTT (11). Conversion to diabetes happens when the pancreas is no longer able to increase insulin secretion in a manner sufficient to compensate for insulin resistance in peripheral insulin-sensitive tissues. Characteristic findings of individuals with prediabetes and T2D are impaired insulin secretion and insulin resistance. Insulin resistance also plays a major role in the development of CVD (11).
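The glycemic categories described above can be sketched as a simple classification on fasting and 2 h OGTT glucose. The text does not give numeric cut points; the thresholds below follow the American Diabetes Association criteria (in mmol/l) and, together with the function name and the category labels, are assumptions made for illustration.

```python
def classify_glycemia(fasting_mmol_l, two_hour_mmol_l):
    """Classify glucose tolerance from fasting and 2 h OGTT plasma glucose.

    Assumed ADA cut points (mmol/l):
      diabetes: fasting >= 7.0 or 2 h >= 11.1
      IFG:      fasting 5.6-6.9
      IGT:      2 h 7.8-11.0
    """
    if fasting_mmol_l >= 7.0 or two_hour_mmol_l >= 11.1:
        return "diabetes"
    ifg = fasting_mmol_l >= 5.6
    igt = two_hour_mmol_l >= 7.8
    if ifg and igt:
        return "IFG+IGT"
    if ifg:
        return "IFG"
    if igt:
        return "IGT"
    return "normal"


# Hypothetical participant with isolated impaired glucose tolerance.
category = classify_glycemia(5.2, 9.1)
```

In a longitudinal setting, conversion to T2D would correspond to a participant moving from any of the first four categories at baseline to "diabetes" at follow-up.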
We have identified several biomarkers as predictors for T2D and hyperglycemia (Table 2). We demonstrated that branched-chain amino acids (leucine, isoleucine) and aromatic amino acids (phenylalanine, tyrosine) were associated with increased risk of T2D and hyperglycemia, whereas glutamine was associated with decreased risk of T2D and hyperglycemia (12), confirming previous findings (13). Adjustment for insulin sensitivity abolished the significance of these associations, demonstrating for the first time in a longitudinal setting that insulin resistance is a major mechanism by which amino acids increase the risk of T2D (12). We also demonstrated for the first time that ketone bodies predict the conversion to T2D by impairing insulin secretion (14). Noncholesterol sterols (desmosterol, avenasterol) (15), glycerol, free fatty acids, and monounsaturated and saturated fatty acids predicted incident T2D (16). Palmitoleic acid increased and linoleic acid measured from plasma (17) or erythrocyte membranes (18) decreased the risk of T2D. High proinsulin level (19) and apolipoprotein/lipoprotein ratios (20) also predicted the conversion to T2D.
Summary point. Identification of biomarkers predicting the conversion to T2D is of great importance, especially for the prevention of diabetes. We, and others, have identified several nongenetic biomarkers (amino acids, ketone bodies, noncholesterol sterols, fatty acids) associated with the risk of T2D that improve the prediction for the conversion to diabetes above and beyond the classical risk factors for this disease (obesity, age, family history of diabetes, etc.). These findings need to be replicated in other ethnic groups and in women.

Nonalcoholic steatohepatitis
Nonalcoholic fatty liver disease is rapidly becoming the most common cause of liver disease, characterized by hepatic lipid accumulation contributing to insulin resistance, T2D, and hyperlipidemia. It can lead to nonalcoholic steatohepatitis (NASH) and liver cirrhosis, and ultimately to liver failure (21). Pathophysiological mechanisms leading to NASH have remained unclear, and noninvasive diagnosis of NASH is challenging. Therefore, finding simple laboratory markers for NASH is important. We hypothesized that increased IL-1RA serum levels in subjects with a high ALT level reflect increased inflammation in the liver. Indeed, we found that IL-1RA levels were associated with liver inflammation and ALT, independent of obesity, alcohol consumption, and insulin resistance in the METSIM study. Moreover, we found that liver expression of IL1RN, a gene encoding IL-1RA, correlated with liver steatosis and inflammation. Thus, increased IL-1RA levels in individuals with NASH could be directly linked to liver disease. Our findings suggest that IL-1RA is an important noninvasive inflammatory marker for NASH (22).
Dysregulation of the cholesterol synthesis pathway and accumulation of cholesterol in the liver play an important role in the pathogenesis of NASH. We measured serum and liver levels of three cholesterol precursor sterols (cholestenol, desmosterol, lathosterol) as serum surrogate markers of cholesterol synthesis in 110 obese individuals with detailed liver histology. We found that desmosterol in serum and liver associated with NASH. These results suggest that serum desmosterol is a marker of disturbed cholesterol metabolism in the liver and a potential biomarker for NASH (23).
Summary point. NASH can be diagnosed accurately only by liver biopsy. Therefore, several laboratory measurements have been proposed as surrogate biomarkers, but no consensus about the best biomarker for NASH has been reached. IL-1RA and desmosterol as potential biomarkers for NASH need further validation studies.

T2D
GWASs. The first replicated gene variant associated with T2D was the Pro12Ala amino acid substitution in the PPARG gene (24,25). Interestingly, we demonstrated, using a Pro12Ala knock-in model, that, on chow diet, the Ala12Ala mice were leaner, had increased insulin sensitivity, and longer lifespans. High-fat feeding eliminated the beneficial effects of the Pro12Ala variant on adiposity and insulin sensitivity, demonstrating strong gene-environment interactions for this variant (26). Since 2007, GWASs and sequencing studies have provided new insights into the genetic basis of T2D, and currently about 90 loci for T2D and over 80 loci for glycemic traits have been published (27). The METSIM study has been an important European cohort in the identification and replication of several loci for T2D and for studies aiming to define causal mechanisms at the T2D susceptibility loci (28-34). However, these studies in aggregate explain only about 10-15% of the heritability of T2D.
Protective gene variants. The METSIM study has been a part of three studies reporting protective gene variants for T2D. A study that combined sequence and genotype data on 150,000 individuals across five ancestry groups showed that two protein-truncating variants (p.Arg138* and p.Lys34Serfs*50) of SLC30A8, which encodes the zinc transporter ZnT8, are associated with T2D protection, suggesting that ZnT8 inhibition could be a therapeutic strategy in T2D prevention (35). Similarly, by whole-genome sequencing of 2,630 Icelanders, a low-frequency (1.5%) variant in intron 1 of CCND2 was found that reduced the risk of T2D by half (36). This finding was replicated in 29,956 individuals, including the participants of the METSIM cohort (37). A recent exome sequencing study reported that a low-frequency missense variant in the gene encoding glucagon-like peptide-1 receptor (GLP1R), the target of GLP1R agonists, was associated with decreased fasting glucose levels and the risk of T2D, consistent with GLP1R agonist therapies (38).
Low frequency variants. The most recent study evaluated how much of the heritability of T2D is explained by low-frequency variants (0.005 < minor allele frequency < 0.05). This study included whole-genome sequencing data from 2,657 European individuals with and without diabetes, and exome sequencing data from 12,940 individuals from five ancestry groups, including the METSIM cohort. This study did not support the notion that low-frequency variants explain a substantial portion of T2D heritability (39).
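The frequency window used to define low-frequency variants can be made concrete with a short sketch. The bounds 0.005 and 0.05 come from the text; the "rare" and "common" labels for values outside the window, the handling of the exact boundary values, and the function names are assumptions made for illustration.

```python
def minor_allele_frequency(alt_allele_count, n_individuals):
    """Minor allele frequency from the alternate-allele count in diploid individuals."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    freq = alt_allele_count / (2 * n_individuals)
    if not 0.0 <= freq <= 1.0:
        raise ValueError("allele count is inconsistent with the sample size")
    # The minor allele is whichever of the two alleles is less frequent.
    return min(freq, 1.0 - freq)


def frequency_class(maf):
    """Bin a variant by minor allele frequency (MAF).

    Low-frequency: 0.005 < MAF < 0.05 (from the text).
    Assumption: MAF <= 0.005 is labeled rare, MAF >= 0.05 common.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf >= 0.05:
        return "common"
    if maf > 0.005:
        return "low-frequency"
    return "rare"


# Hypothetical variant: 30 alternate alleles observed in 1,000 individuals.
maf = minor_allele_frequency(30, 1000)
label = frequency_class(maf)
```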
Summary point. GWASs and exome and genome sequencing have identified close to 200 gene variants associated with T2D and hyperglycemia. Larger sample sizes of population studies will increase the number of gene variants even further, but genetic variants do not explain the epidemic of T2D observed today worldwide. Environmental and lifestyle factors and their interaction with genetic variants are the key to understanding the etiology and pathophysiology of T2D. Lifestyle factors also modify gene expression and methylation of DNA, which are likely to play an important role in the risk of T2D and other chronic diseases.

Insulin secretion and insulin resistance
Common variants. Most of the common variants known to be associated with the risk of T2D affect insulin processing and insulin secretion and only a few affect insulin sensitivity (49-51). The METSIM cohort was important for this collaborative work given the large size of our cohort, as well as validated markers for insulin secretion and insulin sensitivity (52). Three studies, including the METSIM cohort, have searched for variants regulating insulin resistance (53-55). A common variant, rs11603334, was associated with T2D and proinsulin levels in the METSIM study. The T2D-risk allele increased transcriptional activity and was associated with increased ARAP1 expression in a small sample of human pancreatic islets, suggesting that increased ARAP1 expression may contribute to T2D susceptibility (56), although the T2D-risk allele at this locus is also associated with decreased STARD10 expression in larger studies of human pancreatic islets (57).
Low frequency and rare variants. We were the first to report exome array results for insulin processing and secretion (9). We identified low-frequency coding variants associated with fasting proinsulin concentrations at the SGSM2 and MADD GWAS loci, and three new genes with low-frequency variants associated with fasting proinsulin or insulinogenic index (TBC1D30, KANK1, PAM) in 8,229 nondiabetic participants of the METSIM study. Of these genes, SGSM2, MADD, TBC1D30, and KANK1 regulate or function in G-protein signaling (Fig. 2). Our study demonstrated that, by exome array, it is possible to identify low-frequency variants that contribute to complex traits.
Summary point. Genetic studies have given important information about the etiology and pathophysiology of T2D. It is now clear that the most important pathophysiological mechanism leading to T2D is impaired insulin secretion, largely attributable to multiple gene variants regulating insulin secretion. By contrast, only a few gene variants regulating insulin sensitivity have been found, suggesting that insulin resistance is more likely to be an acquired trait (obesity, lack of exercise, unhealthy diet) than an inherited trait.

FINDING NEW FUNCTIONS FOR COMMON GENE VARIANTS ASSOCIATED WITH DIFFERENT CLINICAL AND LABORATORY MEASUREMENTS
In recent years, genetic studies have become much larger and technologies developed to identify low frequency and rare gene variants across the whole exome and genome have made it possible to identify causal variants for different diseases and traits. Understanding the biology behind genetic association is, however, the key for the understanding of the pathophysiology of different diseases. T2D is a good example of the problem of imprecise phenotypes because the pathophysiology of this disease involves several tissues, including pancreas, liver, skeletal muscle, fat, and brain. Phenotyping that goes beyond what is typically recorded in medical records or routine laboratory measurement is needed to dissect out mechanisms behind diseases of interest (58). Such "deep phenotyping" is possible in small clinical and family studies. The authors of a recent study performed detailed physiological (euglycemic clamp, insulin secretion) and in vitro characterization (insulin signaling in adipose tissue and skeletal muscle) in a limited number of carriers of the PTEN mutation (59). They were able to show that mutation carriers had decreased risk of T2D owing to enhanced insulin sensitivity (59).
Deep phenotyping is more challenging in large population-based studies for several reasons. First, deep phenotyping is expensive and time consuming. Second, many large population-based cohorts were collected several years ago, when deep phenotyping in studies aiming to identify genes for complex diseases was not deemed practical. Third, deep phenotyping is successful only if study participants are willing to undertake time-consuming protocols (e.g., biopsies from tissues of interest, metabolic studies lasting for hours).
Deep phenotyping offers many advantages for genetic studies, and it could improve, for example, the classification of a disease of interest into subtypes. In the majority of cases, T2D occurs especially in obese individuals who are typically insulin resistant, but this disease can also occur in normal weight individuals whose primary defect in glucose metabolism is impaired insulin secretion. Large sample sizes are needed to obtain statistically significant association of gene variants with T2D, a heterogeneous disease defined by fasting and 2 h glucose concentrations. Deep phenotyping at the tissue, metabolite, and pathway levels related to the pathophysiology of T2D is likely to improve the understanding of the function of gene variants associated with T2D, and has the potential to reduce, to some extent, the sample sizes required to detect association. In the METSIM study, we have applied deep phenotyping, including proton NMR spectroscopy, to measure several metabolites for association studies with T2D risk gene variants. We found that the glucose-increasing alleles of rs780094 in glucokinase regulatory protein gene (GCKR) and rs174550 in fatty acid desaturase 1 (FADS1) were significantly associated with lipoprotein particles (5). The glucose-increasing allele of rs780094 of GCKR was additionally associated with decreased levels of the amino acids alanine and isoleucine, and elevated levels of glutamine (12) and β-hydroxybutyrate (14). Thus, GCKR regulates not only glucose metabolism, but also lipoprotein and ketone body metabolism, which demonstrates that deep phenotyping offers a tool to identify new functions for the genes of interest.

Summary point
Phenotyping has been underdeveloped compared with genotyping in genetic studies. Recent studies applying proton NMR and mass spectrometry-based metabolomics, including the METSIM study, have shown the potential of deep phenotyping to identify new functions and pathways for the gene variants of interest. By contrast, proteomics, another essential tool for the understanding of the functions of cellular systems in human diseases, has not yet been widely applied because of many methodological challenges. New technological developments are urgently needed to advance the proteomics approach so that it may also be applied in genetic studies.

Obesity and height
Obesity is heritable (60) and an important risk factor for T2D and CVD. The METSIM study has been a replication cohort for several GWASs aiming to identify common variants for height (61,62) and BMI (63-66). The most recent meta-analysis included 339,224 individuals and identified 97 BMI-associated loci accounting for 2.7% of BMI variation. Genome-wide estimates suggested that common variants account for >20% of BMI variation (66).

Adiposity
The METSIM study has participated as a replication cohort for meta-analyses for adiposity and fat distribution (67-71). Interestingly, we found that the body-fat-decreasing allele near IRS1 was paradoxically associated with insulin resistance, dyslipidemia, and decreased adiponectin levels. We demonstrated, using the data from the METSIM study, that the variant near IRS1 is likely to have its primary effect on body fat percentage and that the association with decreased insulin sensitivity is partly mediated by changes in body fat percentage (69).

Summary point
Obesity and, especially, central obesity are important risk factors for T2D, CAD, hypertension, and certain cancers. Recent studies, including observations from the METSIM cohort, have shown that the gene variants associated with BMI are primarily expressed in the central nervous system, whereas gene variants associated with adiposity are primarily expressed in fat tissue (63-66). These findings from genetic studies have greatly contributed to better understanding of the etiology of obesity and adiposity.

Measurements
In the METSIM study, we have measured lipids (total cholesterol and triglycerides), lipoproteins (HDL, LDL, VLDL), and apolipoproteins (apoA1 and apoB). Additionally, we determined lipoprotein subclasses by proton NMR in native serum samples that provide quantitative molecular data on 14 lipoprotein subclasses and their lipid concentrations and composition (72).

GWASs
The METSIM study has been included in many GWAS meta-analyses identifying gene variants regulating LDL cholesterol, HDL cholesterol, and triglycerides. The first such study including the METSIM cohort reported 11 new variants for these traits, suggesting that the cumulative effect of several common variants contributes to polygenic dyslipidemia (73). Metabochip analysis, including transethnic high-density genotyping, demonstrated the presence of allelic heterogeneity and the identification of population-specific variants (74), and the Global Lipids Genetics Consortium study reported the association of lipid and lipoprotein loci with metabolic and cardiovascular traits, including coronary heart disease and blood pressure (75). The most recent of these studies identified associations of lipid and lipoprotein levels with 62 new loci, bringing the total number of lipid-associated loci to 157 (75). All but one of the loci included protein-coding genes. Of the 62 new loci, 38 included genes whose role in the regulation of blood lipid levels has support from previous literature or databases, but for 24 loci such evidence has not been documented.

Functional studies
It has been challenging to convert GWAS data into mechanistic insights. The METSIM study data has been helpful in generating functional information for some GWAS variants, e.g., a variant rs7575840 in the APOB gene region associated with LDL cholesterol level. Adipose tissue RNA expression analysis indicated that rs7575840 alters the expression of APOB and a regional noncoding RNA (BU630349) (76). Another study examined the functional regulatory effects of 25 noncoding variants at the GALNT2 locus associated with HDL cholesterol in the METSIM cohort. This study demonstrated that at least two noncoding variants play key roles in influencing GALNT2 expression (77). We identified a gene locus for high serum triglycerides in Mexicans on chromosome 18q11.2 harboring a regulatory gene variant (rs17259126), and demonstrated that it disrupts normal hepatocyte nuclear factor 4 binding and decreases the expression of the TMEM241 gene in 795 adipose RNAs from the METSIM cohort. Our findings suggest that decreased transcript levels of TMEM241 contribute to increased triglyceride levels in Mexicans (78).

Summary point
Lipid-associated loci have been shown to associate strongly with CAD, T2D, BMI, and blood pressure (75). Therefore, genetic studies have greatly contributed not only to the understanding of the pathways that modify lipid and lipoprotein levels in humans, but also to the understanding of the links between lipids and lipoproteins and other chronic diseases. These studies may also facilitate the design of new therapies for cardiovascular and metabolic diseases. More mechanistic studies are needed, especially for those 24 new gene variants that have recently been associated with lipids and lipoproteins, having no support from previous literature or databases (75).

Blood pressure
Elevated blood pressure is a heritable trait and an important risk factor for CVD. The METSIM study participated in the first GWAS meta-analysis for elevated blood pressure including a total of 34,433 subjects of European ancestry in which eight common variants were found to be associated with systolic or diastolic blood pressure (79). A second GWAS meta-analysis including 200,000 individuals of European descent identified sixteen new loci (80). A genetic risk score based on 29 genome-wide significant variants was associated with hypertension, stroke, and CAD (80). Only three of the loci associated with blood pressure in Europeans were replicated in African Americans (81). A recent study including 192,000 individuals identified 30 new blood pressure or hypertension-associated genetic regions. This study also identified three rare missense variants in RBM47, COL21A1, and RRAS with larger effects than common variants (82).
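A genetic risk score of the kind mentioned above is commonly computed as a weighted sum of risk-allele dosages. The sketch below shows that calculation; the weighting by per-allele effect sizes is the usual convention rather than a detail stated in the text, and the function name, dosages, and weights are hypothetical.

```python
def genetic_risk_score(dosages, weights):
    """Weighted genetic risk score for one individual.

    dosages: number of risk alleles carried at each variant (0, 1, or 2;
             fractional values are allowed for imputed genotypes).
    weights: per-allele effect size of each variant on the trait
             (assumed to come from an independent GWAS).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    if any(d < 0 or d > 2 for d in dosages):
        raise ValueError("each dosage must lie between 0 and 2")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical individual genotyped at three blood pressure variants,
# with hypothetical effect sizes in mmHg per risk allele.
score = genetic_risk_score([0, 1, 2], [0.5, 1.0, 0.25])
```

The resulting score can then be tested for association with an outcome such as hypertension, stroke, or CAD in the same way as any other quantitative exposure.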

Metabolic syndrome
We conducted a GWAS on the metabolic syndrome and its component traits in four Finnish cohorts consisting of 2,637 cases and 7,927 controls, both free of diabetes. A previously known lipid locus, the APOA1/C3/A4/A5 gene cluster region, was associated with the metabolic syndrome in all four study samples. However, we did not find evidence for a common genetic basis for clustering of metabolic syndrome traits (83). We have also identified a variant in SIRT3 that is suggestive of a genetic association with the metabolic syndrome (84).

Adiponectin
Circulating levels of adiponectin, a hormone produced predominantly by adipocytes, are highly heritable. The METSIM study has contributed to a GWAS meta-analysis of 39,883 individuals of European ancestry where eight novel loci were identified for adiponectin level. The genetic risk score, including several common risk variants for adiponectin level, showed an association with T2D, increased triglycerides, obesity markers, fasting insulin, and insulin resistance (85). However, a Mendelian randomization study did not support the conclusion that the association of adiponectin levels with insulin resistance and T2D is causal (86).
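To illustrate the logic of a Mendelian randomization analysis, the following sketch computes a fixed-effect inverse-variance weighted (IVW) estimate from summary statistics. This is one standard estimator and is not necessarily the one used in the cited study; the function name and all numbers are hypothetical.

```python
from math import sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    beta_exposure: per-allele effects of each variant on the exposure
                   (e.g., adiponectin level).
    beta_outcome:  per-allele effects of the same variants on the outcome
                   (e.g., log odds of T2D).
    se_outcome:    standard errors of the outcome effects.
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        if se <= 0:
            raise ValueError("standard errors must be positive")
        weight = 1.0 / se ** 2
        numerator += weight * bx * by
        denominator += weight * bx ** 2
    if denominator == 0:
        raise ValueError("no variant has a nonzero effect on the exposure")
    return numerator / denominator, sqrt(1.0 / denominator)


# Hypothetical summary statistics for two variants.
estimate, std_error = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
```

An estimate whose confidence interval includes zero, despite a clear observational association, is the pattern that argues against a causal effect of the exposure on the outcome.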

Summary point
Several new gene variants have recently been associated with blood pressure. More importantly, these studies have proved the causal link between elevated blood pressure and stroke and CAD, given the fact that the genetic risk score for blood pressure was associated directly with CVD endpoints. Our study has not found evidence for a common genetic basis of the metabolic syndrome (83), and similarly there is no causal link between adiponectin and insulin resistance and T2D.

GENETICS OF CAD
CAD is a major disease causing morbidity and mortality worldwide. Therefore, it is important to understand the genetic basis of CAD. The METSIM study participated in a GWAS meta-analysis of 63,746 CAD cases and 130,681 controls that identified new loci for CAD. Fifteen loci reached genome-wide significance and all 46 putative susceptibility loci explained approximately 10.6% of CAD heritability. The four most significant pathways mapping to these networks were linked to lipid metabolism and inflammation, underscoring the causal role of these pathways in the genetics of CAD (87). Importantly, this study showed that both LDL cholesterol and triglyceride levels, but not HDL cholesterol levels, were causally related to the risk of CAD (87).
The METSIM cohort was involved in another study investigating specifically the possibility that increased triglyceride levels are causally associated with CAD. Using 185 common variants mapped for plasma lipids, the study demonstrated that the strength of a variant's effect on triglyceride levels correlated with the magnitude of its effect on CAD risk, even after taking into account its effects on LDL cholesterol and/or HDL cholesterol levels. This study suggests that triglyceride-rich lipoproteins can causally influence the risk for CAD (88).

Summary point
Accumulating evidence from genetic studies demonstrates that the levels of LDL cholesterol and triglycerides are causally associated with the risk of CAD, whereas this causality is missing for HDL cholesterol level. These results have had important implications for the treatment of dyslipidemia.

Transcriptomics and RNA sequencing
The GWAS approach has been highly successful in the identification of common genetic variation contributing to normal and pathological traits. However, the mechanistic steps between genetic variation and different traits often remain poorly understood (89). Only a limited number of large-scale human population studies are available where extensive genotyping, phenotyping, and transcriptomics have been performed to obtain information on the functionality of the gene variants.
The first large study including samples from peripheral blood (N = 1,002) and subcutaneous fat (N = 673) was published from the Icelandic population. The study showed that more than 50% of all gene expression traits in adipose tissue strongly correlated with clinical traits related to obesity, compared with only about 10% in blood (90). Heritability was a highly significant contributor to variation in gene expression, and there was an approximately 50% overlap of genetic signals between the two tissues (90). Given the fact that adipose tissue is very important for metabolic diseases and CVDs, we collected subcutaneous abdominal adipose tissue biopsies from 1,410 randomly selected MET-SIM participants who did not have chronic diseases or chronic medication. Expression profiling using the Affymetrix U219 microarray expression quantitative trait locus (eQTL) mapping was performed for 770 METSIM individuals with both genotype and expression data available. We provided evidence for >100 loci for which eQTLs are coincident with GWAS loci, suggesting that these genes may be involved in human metabolic traits (91).
We also compared the results with those in 100 diverse commercially available inbred strains of mice, called the Hybrid Mouse Diversity Panel (HMDP), and found consistent associations between the traits and the expression of 25 genes in humans and mice (91). The HMDP is a collection of approximately 100 well-characterized inbred strains of mice that can be used to analyze the genetic and environmental factors underlying complex traits (92,93). The HMDP has several benefits: it makes it possible to control environmental factors, guarantees access for global molecular phenotyping, and makes it possible to integrate separate studies. All published data of the HMDP are available (92).
In another study, we sequenced 600 adipose tissue RNA samples from METSIM participants (Illumina TruSeq RNA Prep kit and the Illumina HiSeq 2000 platform) and implemented a new method to identify genes whose expression is significantly associated with complex traits in individuals without directly measured expression levels. This method integrates gene expression measurements with summary association statistics from large-scale GWASs to identify genes whose cis-regulated expression is associated with complex traits. We identified 69 new genes significantly associated with obesity-related traits (BMI, lipids, and height). Many of these genes were also associated with relevant phenotypes in the HMDP (10).
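The idea of combining expression-derived weights with GWAS summary statistics can be sketched as follows. The statistic shown, z = w'z_GWAS / sqrt(w' LD w), is one commonly used formulation in the transcriptome-wide association literature; whether it matches the cited method in every detail is an assumption here. The function name `twas_z`, the weights, the z-scores, and the LD matrix are all hypothetical.

```python
import math

def twas_z(weights, gwas_z, ld):
    """Summary-based gene-trait z-score.

    weights : per-variant weights predicting cis-regulated expression
              (assumed to have been learned in a reference panel with
              both genotype and expression data)
    gwas_z  : per-variant GWAS z-scores for the trait
    ld      : variant-by-variant correlation (LD) matrix as nested lists
    """
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the weighted sum under the null, given LD between variants.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return numerator / math.sqrt(variance)

# Two hypothetical variants in linkage equilibrium (identity LD matrix).
z_independent = twas_z([0.5, 0.5], [2.0, 4.0], [[1.0, 0.0], [0.0, 1.0]])

# Two hypothetical variants in perfect LD carrying the same GWAS signal.
z_perfect_ld = twas_z([0.5, 0.5], [2.0, 2.0], [[1.0, 1.0], [1.0, 1.0]])
```

The second call shows why the LD term matters: two perfectly correlated variants carry a single signal, so the gene-level z-score equals the per-variant z-score instead of being inflated.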

MicroRNA
MicroRNAs (miRNAs) are small noncoding RNAs that regulate gene expression and determine how genetic variants affect different phenotypes. Information about genetic factors contributing to miRNA expression is limited; therefore, we examined variation of miRNA expression in adipose tissue in 200 METSIM participants. We reliably quantified 356 miRNA species expressed in human adipose tissue using genome-wide expression arrays and next-generation sequencing. Genetic variation of miRNA expression was substantially less than that of mRNAs. Twenty-four miRNAs were significantly associated with the traits of the metabolic syndrome (94).

Summary point
Adipose tissue gene expression studies have been important to reveal eQTLs that contribute to the understanding of the function of the genes. However, adipose tissue gene expression studies also have limitations. Gene expression in a single tissue may not be relevant for the mechanisms of some of the cardiometabolic traits, and the colocalization of an eQTL with a disease locus may be coincidental. Additionally, adipose tissue includes several cell types affecting gene expression profile, and adipose tissue gene expression is also influenced by gender, diet, and other environmental factors (91).

Genome sequencing
Extensive genotyping and exome sequencing have been performed for essentially all participants of the METSIM study. Genome sequencing has been performed for >2,000 METSIM study participants with and without CVD events, and continues in 2017-2018 (Fig. 3).

Prediction models
The METSIM follow-up study and registry follow-up allow continuous monitoring of the incidence of several diseases, including T2D and CAD, which makes it possible to develop genetic and nongenetic prediction models for T2D, cardiovascular complications, and cardiovascular risk factors. However, observational studies do not prove causality of a given risk factor. Therefore, we will apply Mendelian randomization methods, using genetic variants to estimate the causal contribution of a given risk marker to the risk of a given disease (95). This information is crucial for the planning of prevention of complex diseases, including T2D and CVD.
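The Mendelian randomization idea described above can be illustrated with summary statistics. The sketch below computes a per-variant Wald ratio and a fixed-effect inverse-variance-weighted (IVW) estimate, two standard estimators from the Mendelian randomization literature; it is not necessarily the method the authors will apply. It assumes that the variants are independent and valid instruments, and all numbers and function names are hypothetical.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate: outcome effect divided by exposure effect."""
    estimate = beta_outcome / beta_exposure
    # First-order standard error that ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    return estimate, se

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted estimate across variants.

    Equivalent to a weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_outcome**2.
    """
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / se ** 2
        numerator += weight * bx * by
        denominator += weight * bx * bx
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Made-up summary statistics for three hypothetical variants.
bx = [0.10, 0.20, 0.15]    # variant effects on the risk marker
by = [0.05, 0.10, 0.075]   # variant effects on the disease (log odds)
se = [0.01, 0.02, 0.015]   # standard errors of the disease effects

single_estimate, single_se = wald_ratio(bx[0], by[0], se[0])
causal_estimate, causal_se = ivw_estimate(bx, by, se)
```

In this toy example every variant's disease effect is exactly half of its effect on the risk marker, so both estimators return a causal effect of about 0.5 log odds per unit of the risk marker. Combining variants shrinks the standard error relative to any single variant.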

Omics
Recent major advances in omics technologies (metabolomics, proteomics, etc.) have enabled high-throughput analysis of a variety of molecular processes. We will continue our deep phenotyping by applying these technologies, as well as epigenomics and microbiota studies, aiming to associate these phenotypes with common, low-frequency, and rare gene variants. Integrative analyses across multiple omics platforms are important as we investigate the etiologies of complex diseases.

Metabolome and proteome
Proton NMR and mass spectrometry allow high-throughput analysis of multiple metabolites. In the METSIM study, we have proton NMR data for >10,000 participants and mass spectrometry-based metabolomics results currently for 2,292 participants. These data make it possible to correlate metabolite concentrations with clinical traits and adipose tissue gene expression data, and to identify gene variants that regulate metabolite concentrations. Proteomics allows studies of the entire set of proteins, the proteome, by mass spectrometry and other methods. Recent advances in mass spectrometry and computational analysis of the results have made this technology an attractive method to characterize the proteome (96). As shown in Fig. 4, we have RNA sequencing (N = 795) and adipose tissue methylation data (N = 758), especially from those METSIM study participants who also have mass spectrometry-based metabolomics data (N = 2,292). Additionally, we have so far collected stool samples from 532 METSIM participants for gut microbiota sequencing, and about 50% of them also have metabolomics data.

Microbiome
Diet plays an important role in obesity, metabolic syndrome, and other metabolic diseases, especially in T2D. Recently, the gut microbiota has become a focus of interest because it is at the intersection of diet and metabolic health. Both animal models and human studies have accumulated data to show that the gut microbiota mediates the effects of the diet on the host metabolic status (97). In 2017-2018, we aim to collect additional stool samples from about 500 METSIM participants for microbiota studies.

Epigenetics
Methylation of DNA cytosine bases plays an important role in the regulation of gene expression. DNA methylation is variable among human populations, in part heritable, and controlled by genes both in cis and in trans (98). To examine the methylation pattern in the METSIM cohort, we have constructed reduced representation bisulfite sequencing libraries from adipose tissue biopsies. The sequences obtained from bisulfite sequencing libraries are enriched in genes and CpG islands, and cover approximately two million CpGs out of the 30 million CpGs in the human genome. The Illumina HiSeq platform is used for sequencing of the libraries. We aim to sequence about 800 adipose tissue DNA samples in 2017-2018.

CONCLUSIONS
Fig. 3. A roadmap for the genome and phenome analyses in the METSIM study. The whole-genome sequencing and extensive phenotyping are ongoing. Phenotyping includes metabolomics, proteomics, epigenomics, adipose tissue transcriptomics, and microbiome analysis. The interaction between the genome and phenome will be extensively investigated as well as the interaction between the genome and lifestyle/environmental factors and aging. Prediction models will be developed using the Mendelian randomization approach by using genetic variants to estimate the causal contribution of a given risk marker to the risk of a given disease (especially T2D and CAD).
The METSIM study provides a resource for nongenetic and genetic studies of metabolic diseases and CVDs. It is a randomly selected large population-based sample of men (N = 10,197) from Eastern Finland having unique genetic background and extensive genetic data (OmniExpress chip, exome chip, and exome sequence in nearly all, genome sequence in many). We have performed deep phenotyping of this cohort and are currently expanding the phenotype further to include data on microbiota, RNA sequencing, adipose tissue methylation, and mass spectrometry for metabolites and proteins. Our findings, so far, suggest that we are able to identify new nongenetic biomarkers for T2D, insulin secretion, and insulin sensitivity. Additionally, we have been able to find new low-frequency and rare variants in the METSIM study alone and in collaboration with others, and we have reported their association with proinsulin levels and insulin secretion. The METSIM study data have been useful in several meta-analyses to identify new common variants for T2D, hyperglycemia, anthropometric traits, and cardiovascular risk factors.
The METSIM study is a part of the Accelerating Medicines Partnership Type 2 Diabetes (AMP T2D) project supported by the Foundation for the National Institutes of Health (99). This project is designed to create a knowledge portal by building a database of DNA sequence, functional genomic and epigenomic, and clinical data from T2D studies with cardiac and renal complications. A broad range of genetic and phenotype data from the METSIM study will soon be available from the database of genotypes and phenotypes (dbGaP). The generation of a phenotype and genotype resource in the METSIM study allows us and other investigators to proceed toward a "systems genetics" approach (100), which includes statistical methods to quantitate and integrate intermediate phenotypes, such as transcript, protein, or metabolite levels, to provide a global view of the molecular architecture of complex traits.