Genetic causes of high and low serum HDL-cholesterol

Abbreviations: ABC, ATP-binding cassette; AI-HDL, HDL containing apoAI; A-I/A-II HDL, HDL containing apoAI and apoAII; ANGPTL4, angiopoietin-related protein 4; apo, apolipoprotein; CAD, coronary artery disease; CETP, cholesteryl ester transfer protein; CNV, copy number variation; EL, endothelial lipase; eQTL, expression QTL; FADS, fatty acid desaturase; FHA, familial hypoalphalipoproteinemia; FLD, familial lecithin-cholesterol acyltransferase deficiency; GWA, genome-wide association; HDL-C, HDL cholesterol; HL, hepatic lipase; HNF4A, hepatocyte nuclear factor-4α; LCAT, lecithin-cholesterol acyltransferase; LD, linkage disequilibrium; LDLR, LDL receptor; LIPC, hepatic lipase gene; LIPG, endothelial lipase gene; LXR, liver X receptor; MAF, minor allele frequency; MODY, maturity onset diabetes of the young; PLTP, phospholipid transfer protein; PON1, paraoxonase 1; RCT, reverse cholesterol transport; SNP, single nucleotide polymorphism; SR-BI, scavenger receptor class B type I; T2DM, type 2 diabetes mellitus; TG, triglyceride; TTC39B, tetratricopeptide repeat domain 39B; VNN1, Vanin1; WWOX, WW domain-containing oxidoreductase.

secreted into plasma, where it is primarily associated with HDL particles (8). LCAT esterifies free cholesterol into cholesteryl esters that are hydrophobic and therefore sequestered into the core of the particles. The disc shape is converted into a spherical HDL particle that is predominant in human plasma (α-2 and α-3 HDL) (Fig. 1) (8, 9). The esterification also establishes a concentration gradient that ensures continuous supply of free cholesterol and prevents reuptake of cholesterol by the cells (10). LCAT is thus critical for normal HDL metabolism and RCT, as is also evidenced by the fact that LCAT deficiency in humans (17) and mice (18) causes reduced levels of HDL-C.
Spherical HDL are classified into two populations on the basis of the main apolipoproteins, apoAI and apoAII: those containing only apoAI (A-I HDL, mostly found in HDL2 particles that are larger and less dense) and those that contain both apoAI and apoAII (A-I/A-II HDL, mostly HDL3 particles) (Fig. 1) (8, 9), which represent about two-thirds of the particles (14). Although the liver and intestine are critical for the initial lipidation of apoAI via ABCA1 (Fig. 1), spherical HDL also acquires additional lipids from other lipoproteins, tissues, and mechanisms such as passive diffusion, or receptor-mediated pathways, namely ABCG1, ABCG4, and scavenger receptor class B type I (SR-BI) (10).

REMODELING OF HDL
The HDL particles are continuously being remodeled (i.e., altered in size, shape, surface charge, or composition) by factors such as LCAT, CETP, PLTP, and lipases (Fig. 1) (19). CETP is a member of the lipid transfer/lipopolysaccharide binding protein family that transfers cholesteryl esters from HDL to apoB-containing lipopro-
(8). Most of the HDL particles contain a hydrophobic core of cholesteryl esters and a small amount of TGs surrounded by a monolayer of phospholipids, free (unesterified) cholesterol, and apolipoproteins (apo) (8, 10). The main apolipoprotein components on HDL are apoAI and apoAII. Other minor apolipoproteins are apoAIV, apoCI/CII/CIII, apoD, apoE, apoJ, apoL, and apoM (8, 11-13). HDL also contains antioxidants and enzymes involved in plasma lipid metabolism such as paraoxonase 1 (PON1), lecithin-cholesterol acyltransferase (LCAT), cholesteryl ester transfer protein (CETP), and phospholipid transfer protein (PLTP) (14). The physiology of HDL is complex and involves several pathways (Fig. 1). Basic research on HDL metabolism has led to the identification of candidate genes that have subsequently been found to harbor germline variants (Table 1). Overall, monogenic conditions of extremely high or low levels of HDL-C have greatly helped delineate the key players and processes involved in HDL metabolism (15).
ApoAI, the most abundant HDL apolipoprotein (70% of HDL protein), is required for normal HDL biosynthesis, as is evidenced by gene deletion of ApoAI/APOAI, which results in extremely low levels of HDL-C in both mice and humans (14). ApoAI is secreted to plasma by liver and intestine in a lipid-free/lipid-poor form (apoAI/pre-β-1 HDL) (Fig. 1) (8). Nascent HDL rapidly acquires phospholipids and free cholesterol through the efflux of cholesterol from cell membranes to apoAI by the ATP-binding cassette (ABC) A1, resulting in discoidal HDL [disc shape (α-4 HDL)] (8). This critical role of ABCA1 (i.e., the efflux of cholesterol from cells) was only established after the discovery that absence of ABCA1 causes Tangier disease, which is characterized by extremely low levels of HDL-C (16). Discoidal HDLs are excellent substrates for LCAT, an enzyme that is expressed in liver and
Fig. 1. Schematic overview of GWA loci identified for HDL metabolism thus far. A question mark indicates that the function of the gene in lipid metabolism is currently unknown; a dashed arrow, that the function is uncertain; and a red ellipse, apolipoproteins. The genes that were not implicated in GWA studies are designated in blue. Only genes that were genome-wide significant (P < 5 × 10⁻⁸) in at least 5,000 subjects based on the NHGRI GWA catalog data are included.
[Table 1 residue: Premature CAD has been reported in some patients but is generally uncommon (17, 21, 41, 42, 62, 63); Fish-eye disease; (34, 37, 78-81).]
strongest correlation with decreased serum HDL-C levels, whereas weight reduction generally increases HDL-C levels (27, 28). For every 1 kg of weight loss, serum HDL-C increases by 0.35 mg/dl (27, 28). Studies suggest an increase in LCAT and LPL activity, as well as in RCT capacity, as plausible mechanisms of action (27). Of the other determinants, moderate alcohol consumption is also a strong predictor of HDL-C.
Alcohol consumption of 30-40 g/day (1-3 drinks/day) or more has been shown to increase HDL-C levels by 12-15% independent of the effect of gender, age, and body mass index (26, 27). Possible mechanisms for this effect on HDL-C are an increase in ABCA1, apoAI, and PON1 levels and a decrease in CETP levels (27). This influence on HDL-C may contribute to the cardioprotective effects of alcohol. However, it should be noted that these secondary causes often themselves have genetic components. Even those that appear 'environmental' such as alcohol consumption and smoking have genetic contributors as well.
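The two quantitative statements above (0.35 mg/dl of HDL-C per kilogram of weight lost, and a 12-15% rise with alcohol intake) can be turned into a small worked calculation. The sketch below is illustrative arithmetic only: the function names are hypothetical, and it assumes that the reported population averages scale linearly and apply to an individual baseline, which the cited studies do not guarantee.

```python
# Illustrative arithmetic only; names are hypothetical and the linear
# scaling of population averages to one person is an assumption.

HDL_GAIN_PER_KG = 0.35  # mg/dl of HDL-C per kg of weight lost (27, 28)
ALCOHOL_RELATIVE_GAIN = (0.12, 0.15)  # 12-15% increase in HDL-C (26, 27)


def hdl_after_weight_loss(baseline_mg_dl: float, kg_lost: float) -> float:
    """Expected HDL-C after losing kg_lost kilograms, assuming linearity."""
    if kg_lost < 0:
        raise ValueError("kg_lost must be non-negative")
    return baseline_mg_dl + HDL_GAIN_PER_KG * kg_lost


def hdl_range_with_alcohol(baseline_mg_dl: float) -> tuple[float, float]:
    """Lower and upper expected HDL-C for the reported 12-15% increase."""
    low, high = ALCOHOL_RELATIVE_GAIN
    return baseline_mg_dl * (1 + low), baseline_mg_dl * (1 + high)


if __name__ == "__main__":
    # Hypothetical person with a baseline HDL-C of 40 mg/dl.
    print(hdl_after_weight_loss(40.0, 10.0))
    print(hdl_range_with_alcohol(40.0))
```

For a hypothetical baseline of 40 mg/dl, a 10 kg weight loss corresponds to an expected gain of 3.5 mg/dl, which is smaller than the 4.8-6.0 mg/dl implied by the alcohol figures.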
Over the past 30 years, numerous researchers have attempted to reveal the genetic basis of low and high HDL-C by using families, cases/controls, and unascertained general population samples. However, the genetic inheritance is complex, and as with many other complex traits, determinants of HDL-C levels can be either monogenic, purely environmental, or in most cases resulting from many genes (i.e., polygenic), environmental factors, and their interactions (i.e., multifactorial).
The most common genetic disorder of HDL-C is familial hypoalphalipoproteinemia (FHA), defined as HDL-C levels below the 10th percentile for age and gender (HDL-C levels between 20 and 40 mg/dl) and a family history of low HDL-C levels in at least one first-degree relative (20, 29). FHA is a common finding in patients with premature CAD (2, 30). The metabolic etiology in many cases appears to be accelerated catabolism of HDL and its apolipoproteins (21), and some subjects, but not all, are characterized by small, lipid-poor HDL particles and defective lipid efflux (31). FHA was previously considered to be a dominant disorder due to mutations in the ABCA1 gene in some families and of unknown genes in other families (32, 33). However, the genetic causes of FHA remain largely unknown, and in fact segregation analyses suggest that most inheritance patterns for low HDL-C are polygenic even in the case of FHA (34).
Several monogenic disorders of extremely low HDL-C levels (often below 10 mg/dl; HDL deficiency) have also been described (15, 20, 35). Although these monogenic causes are rare, and altogether they may explain only a small portion (~1%) of low HDL-C cases in the general population (36), they have demonstrated that extremely low HDL-C levels influence multiple organs, and thus the clinical significance of HDL deficiency extends beyond cardiovascular risk (Table 1).
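The two-part FHA definition given above (an HDL-C level below the age- and gender-specific 10th percentile, plus at least one affected first-degree relative) can be written as a simple check. This is a minimal sketch, not a clinical tool: the class and field names are hypothetical, and the percentile cutoff is assumed to be supplied by the caller from an appropriate reference population.

```python
from dataclasses import dataclass


@dataclass
class HdlAssessment:
    """Hypothetical container for the inputs named in the FHA definition."""
    hdl_c_mg_dl: float
    # Assumed to come from an age- and gender-matched reference table.
    tenth_percentile_mg_dl: float
    # Number of first-degree relatives with low HDL-C levels.
    affected_first_degree_relatives: int


def meets_fha_definition(a: HdlAssessment) -> bool:
    """True when both criteria of the definition in the text are met."""
    low_hdl = a.hdl_c_mg_dl < a.tenth_percentile_mg_dl
    family_history = a.affected_first_degree_relatives >= 1
    return low_hdl and family_history


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the check.
    subject = HdlAssessment(28.0, 35.0, 1)
    print(meets_fha_definition(subject))
```

Keeping the two criteria as separate booleans makes it explicit that a low HDL-C value alone does not satisfy the definition.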
Familial segregation of high HDL-C levels, familial hyperalphalipoproteinemia, is defined as HDL-C levels greater than the 90th percentile for age and gender, no secondary causes of high HDL-C (e.g., drugs, alcoholism, and cirrhosis), and a family history of high HDL-C (21).
teins, VLDL, and LDL in exchange for TGs (8). The importance of CETP for determining plasma HDL-C levels comes from the discovery of individuals deficient in CETP who present extremely high levels of HDL-C and exceptionally large HDL particles (20, 21). Once cholesteryl esters are transferred to VLDL and LDL, they are available for uptake by the liver via the LDL receptor (LDLR) and LDLR-related protein (8). However, cholesteryl esters can return to the liver for excretion in the bile (i.e., RCT) through two additional pathways: through selective uptake by the liver or steroidogenic tissues via SR-BI (22), and to a lesser degree through an interaction with LDLR if HDL contains apoE (Fig. 1) (23). PLTP is a member of the same protein family as CETP. It transfers phospholipids between HDL and VLDL, as well as between the different HDL particles (10). When VLDL levels are high, the cholesteryl ester transfer (by CETP) from HDL exceeds the transfer of TGs from VLDL to HDL, and hence HDL becomes smaller, depleted of core lipids, and TG enriched (8). The TG-enriched HDL are excellent substrates for hepatic lipase (HL), a member of the TG lipase gene family, which further reduces the HDL size and enhances the dissociation of lipid-free/lipid-poor apoAI from HDL (9). Accordingly, patients with HL deficiency present a marked enrichment of TGs in HDL particles and increased levels of HDL-C (15, 20). The dissociation of lipid-free/lipid-poor apoAI from HDL by the actions of CETP, PLTP, and HL is an important aspect in HDL remodeling.
Lipid-free/lipid-poor apoA-I can accept cholesterol and phospholipids through the efflux of ABCA1 and thus maintain circulating HDL levels and reduce the rate at which apoA-I is cleared from the circulation (19). Endothelial lipase (EL) is also a member of the TG lipase gene family (24). However, it has a high phospholipase and very low TG lipase activity (14). Although it can remodel HDL to a smaller particle, it does not lead to a dissociation of lipid-free/lipid-poor apoAI (8). Furthermore, the LPL activity is also associated with HDL-C levels, because phospholipids and apolipoproteins that are shed from the catabolism of TG-rich lipoproteins (i.e., chylomicrons and VLDL) by LPL are acquired by HDL particles (Fig. 1) (14). The relevance of LPL activity on HDL metabolism was confirmed when patients with LPL deficiency were observed to present low HDL-C levels among other lipid abnormalities (20).

Genetic and lifestyle factors contribute to variation in HDL-C concentrations
Based on family and twin studies, plasma levels of HDL-C appear to have a strong inherited basis, with heritability estimates of 40-60% (21, 25). Secondary causes such as gender, age, obesity, smoking, alcohol, diet, physical activity, drugs (e.g., steroids, niacin, statins, and fibrates) or other metabolic disorders (e.g., insulin resistance and liver disease) have also been shown to influence HDL-C levels in numerous epidemiological studies (26, 27). Among these, obesity, measured as body mass index, has the
clinical symptoms (41, 42). Approximately 25 patients with apoAI deficiency owing to nonsense mutations, a chromosomal aberration or deletion, have been reported thus far (41, 43-47). Structural mutations in the APOA1 gene such as missense mutations are a frequent cause of FHA (15-35 mg/dl) (48, 49). Although the impact of these mutations is variable (21, 43), they appear to cause the greatest elevation in the risk of CAD when compared with other gene defects (44, 48). The apoAI Milano variant is a notable exception, as it is associated with a reduced vascular risk despite a 67% decrease in HDL-C levels (20, 21). Moreover, a common polymorphism in the promoter of APOA1 and overproduction of apoA1 were shown to be associated with elevated HDL-C levels (50-52).
ABC transporter. Tangier disease (MIM 205400), also known as familial alphalipoprotein deficiency, is an autosomal recessive disorder due to loss-of-function mutations in the ABCA1 gene (32, 53). It was mapped to the 9q31 locus using a genome-wide linkage approach in 1998 (Fig. 2) (54). Patients are characterized by a profound decrease in HDL-C (<5 mg/dl) and apoAI levels (~4 mg/dl) (32), as well as somewhat elevated TG levels (>200 mg/dl) and decreased LDL-C levels (about 50% of normal) (Table 1). The clinical presentation of Tangier disease varies considerably. However, because cholesterol accumulates in many tissues throughout the body (tonsils, liver, spleen, Schwann cells), the major clinical findings are enlarged orange tonsils, hepatomegaly, splenomegaly, and occasionally mild corneal opacification (Table 1) (41, 42). Furthermore, about 50% of the cases present peripheral neuropathy, a life quality-limiting symptom (41, 55), and several present hematological findings, mainly thrombocytopenia (42). Fibroblasts from patients show a defective cholesterol and phospholipid efflux to apolipoproteins (21, 56). Thus,
There has been no consensus as to whether subjects with markedly high serum HDL-C levels are resistant to atherosclerosis (37). Some reports of familial hyperalphalipoproteinemia suggest longevity and cardiovascular protection, while others suggest an increased risk of CAD (21, 34). However, because hyperalphalipoproteinemia may be caused by various genetic factors (34), the atherogenicity may depend upon the specific defect, as well as other modifiers. For instance, studies have shown that cholesterol efflux is impaired in some cases (38), while normal or elevated in others (39, 40), further demonstrating that the level of plasma HDL-C does not necessarily reflect the functional properties of HDL particles.
A few single gene defects associated with high levels of HDL-C have been identified in the CETP gene and the hepatic lipase gene (LIPC), as well as described in one report for the APOC3 gene (37), but most genetic causes of high HDL-C levels remain unknown.

Monogenic disorders of high and low HDL-C levels
Monogenic mutations that cause high or low HDL-C levels are summarized in Table 1.

Single gene defects associated with low HDL-C levels
ApoAI. A complete loss of apoAI (apoAI deficiency [MIM 604091]) results in a profound decrease in HDL-C (<5 mg/dl) and an increased risk of premature CAD (Table 1) (15, 20, 41). ApoAI deficiency patients can be distinguished from other causes of HDL deficiency by the complete absence of plasma apoAI (0 mg/dl, undetectable) and normal levels of LDL-C and TGs (Table 1). Patients may also exhibit xanthomas or mild to moderate corneal opacification (Table 1) (42). Heterozygous carriers have plasma HDL cholesterol and ApoAI levels that are about 50% of normal and usually do not present specific
Allelic spectra of variants influencing HDL-C levels. The frequency (x axis) and effect size (y axis) of disease-causing allele are shown. Many genes with common variants of weak effect (yellow circle) and/or rare variants of large effect (blue circle) have been identified by gene mapping using methods such as the optimal method shown in brackets. However, a full spectrum of alleles is expected for common complex traits, such as HDL-C levels. An emerging hypothesis suggests that a significant proportion of the heritability will be attributable to rare and low-frequency variants with modest to intermediate effect (dark-gray circle), which have been mostly missed by gene mapping approaches to date. In each circle, examples of genes with the specified variant are shown.
lipoproteins may determine the atherogenic risk in LCAT deficiency (64).
LPL. LPL deficiency, known as Type I hyperlipoproteinemia or familial chylomicronemia (MIM 238600), is an extremely rare autosomal recessive disorder characterized by severe hypertriglyceridemia due to chylomicronemia and VLDL accumulation; by very low levels of LDL-C and HDL-C (<20 mg/dl) (Table 1); as well as by hepatosplenomegaly, xanthomas, and recurrent episodes of abdominal pain or acute pancreatitis (20, 21). Pancreatitis is the major cause of mortality in these patients. Heterozygotes have variable lipid values, ranging from normolipidemia to elevated plasma TG and decreased HDL-C levels (65). Many mutations in LPL have been identified, as reviewed previously (65, 66). Plasma LPL activity may regulate HDL cholesterol levels in at least three ways (see "Remodeling of HDL" for more details). First, during hydrolysis of TG-rich lipoproteins, phospholipids and apolipoproteins are shed and acquired by HDL particles (20). Second, the exchange of cholesterol for TG from HDL by CETP is modulated by the amount of VLDL. Thus, by decreasing plasma TGs, LPL limits the CETP-mediated HDL-C reduction (8). Third, by altering the lipid composition of HDL, the catabolic rate of HDL apoA-I is altered, and for instance TG-enriched HDL results in the production of smaller lipid-poor apoA-I that are more rapidly cleared from the circulation (9). Accordingly, LPL activity is usually positively correlated with HDL levels (67), and the lack of LPL activity in LPL-deficient individuals could thus impair the generation of lipid-free/lipid-poor precursors and their maturation. More indirectly, the pronounced hypertriglyceridemia observed in these patients results in an enhanced exchange of cholesteryl esters from HDL to VLDL, which thereby also contributes to low HDL-C levels.
Furthermore, it may also be positively associated with cholesterol efflux activity, as suggested by one paper detecting a significant decrease in cholesterol efflux in an LPL-deficient patient (68). However, there does not appear to be an increased risk of CAD in patients with LPL deficiency, although some controversy remains regarding the relation to CAD (65). This discrepancy may be explained by both antiatherogenic and proatherogenic effects suggested for LPL (66).

Single gene defects associated with high HDL-C levels
CETP. A complete loss of CETP activity due to mutations in the CETP gene results in markedly elevated HDL-C levels (usually >120 mg/dl) (Table 1) and in moderately elevated HDL-C levels in heterozygotes (70-100 mg/dl) (69). Most cases of CETP deficiency (MIM 607322) have been described in Japan, where it explains almost one-half of all hyperalphalipoproteinemia cases (70). In contrast, screening for CETP mutations in 95 Caucasian hyperalphalipoproteinemia cases led to the identification of only one heterozygous individual (71). A splice-site mutation in intron 14 (complete deficiency) and a missense mutation (D442G) in exon 15 (partial deficiency) are the most frequent mutations in Japan (heterozygote frequencies
apoAI is not appropriately lipidated and it is rapidly cleared, resulting in the markedly reduced levels of apoAI and very small amounts of HDL in the plasma with only pre-β HDL particles present (14). Heterozygous carriers have moderately reduced HDL-C levels, 50% of normal cholesterol efflux, and a decrease in large HDL particles (21). Various mutations in ABCA1 have been reported (>100) [reviewed in (57, 58)], and some of them are also associated with FHA (15-35 mg/dl) in the heterozygous state (32, 33, 57). Tangier patients and even obligate heterozygotes are at an increased risk of premature CAD (3-6 and 1.5-fold higher, respectively) (59, 60). However, some elderly patients without CAD have been described (61). The low plasma levels of LDL-C in Tangier patients have been postulated as a possible explanation (21, 32), and indeed studies of heterozygotes, who tend to have relatively normal LDL-C levels, showed a significant inverse correlation between ABCA1 activity in fibroblasts and the prevalence and severity of CAD (59).
Furthermore, animal studies show that ABCA1 in macrophages is a cardioprotective factor, although hepatic ABCA1 may override some of these protective effects by increasing plasma levels of atherogenic apoB-containing lipoproteins (59).
LCAT. Mutations in the LCAT gene in the homozygous state cause either a complete deficiency, known as familial LCAT deficiency (FLD) (MIM 245900), or a partial deficiency, which is known as fish-eye disease (MIM 136120) (17, 62). Subsequently it was learned that there is β-LCAT activity that acts on apoB-containing lipoproteins, and α-LCAT activity that acts on HDL (63). Patients with fish-eye disease have deficiency of only the latter, whereas in FLD, both activities are affected (63). Many different mutations have been reported, as reviewed previously (17, 63). Both conditions are characterized by reduced HDL-C (<10 mg/dl) and apoA1 levels (<50 mg/dl), elevated TGs, and decreased LDL-C levels (Table 1), as well as by early onset corneal opacifications that are more striking than those reported in Tangier disease or in ApoAI deficiency (17, 21, 42). Heterozygous carriers of LCAT mutations are clinically normal, and they frequently (but not always) present with low HDL-C levels (42).
In LCAT disorders, free cholesterol is greatly increased in the plasma and peripheral tissues, because it cannot be converted to cholesteryl esters, which leads to the inability to form mature HDL particles (i.e., large spherical) and thus to rapid clearance of apoAI with only discoidal HDL particles present in plasma (pre-β HDL and α-4 HDL) (21, 41). FLD patients also present highly abnormal apoB-containing lipoproteins, hemolytic anemia associated with an increased content of cholesterol in red cells, proteinuria, and progressive renal disease (42). Renal failure is the major cause of morbidity and mortality in this disorder, and the accumulation of lipids, particularly in glomeruli, has been demonstrated in biopsies of patients (17). Premature CAD is not a common feature of LCAT deficiencies (21), although several patients, mostly with FLD, have been described with premature CAD (15). Insight from mouse models suggests that the plasma levels of apoB-containing
However, only a small number of the findings are considered true positives, because they were replicated in additional studies. The difficulties in replicating the results can be attributed to various factors, such as genetic and phenotypic heterogeneity, variable expression of the phenotype, and differences in study designs, methods of ascertainment, and analytical strategies (84, 85). However, because we now know that the typical effect sizes in complex traits are small, increasing risk by a factor of 1.1-1.3, inadequate sample sizes with limited power to detect variants with small effects are probably one of the main reasons for the small number of candidate genes with replicated evidence for HDL-C levels.
Linkage analysis tests for co-segregation of a chromosomal region and a trait of interest using polymorphic markers in families. However, it can only provide an initial localization of susceptibility genes and requires subsequent extensive fine-mapping studies. This approach has been successful in localizing the causative genes in rare monogenic disorders, such as Tangier disease (86), but it has been less beneficial for common (i.e., complex) polygenic conditions. A large number of genome-wide linkage screens for continuous and dichotomous HDL-C and related traits (such as apoAI and apoAII levels) have been performed in FHA as well as nonascertained family samples from different populations [reviewed by (87-90)]. However, these studies have been difficult to interpret, because there is little overlap within these findings (87). Chromosomal regions implicated by more than one study (88) are more plausible, of which the chromosome 16q locus represents the most consistently replicated region (91-94). Nevertheless, linkage studies suggest that many different loci contribute to HDL-C levels and that the effect of most genes is too small or context dependent (i.e., genetic heterogeneity) to be detected by independent and heterogeneous study samples. Furthermore, because typically the subsequent fine-mapping studies have not been able to identify the underlying causal mutation that would alone explain the linkage signal (as opposed to monogenic conditions), these studies also suggest that many private DNA variants may contribute to complex phenotypes. The application of linkage analysis for identifying private (i.e., rare) variants for the polygenic HDL-C trait is discussed in "Family studies in investigation of private variants."
Candidate gene studies using association or resequencing are hypothesis-based studies.
Genes are selected for study based on either their location in a region of linkage ('regional-candidate genes') or on other evidence of a possible role in the etiology of the disease. Sequencing candidate genes in cases and controls to search for variants enriched or depleted between the two groups is the most comprehensive analysis for identifying causal alleles. However, until recently, these types of studies were expensive and laborious. Association analyses using common single nucleotide polymorphisms (SNPs) are cheaper and simpler than resequencing. In association analysis, the frequencies of alleles or genotypes are compared between the cases and controls or within family-based cases and
of 2% and 7%, respectively) (72), but several other mutations have been reported as well, as reviewed previously (37, 73). In CETP deficiency, HDL-C levels are elevated as HDL particles are enriched in cholesteryl ester (elevation of large HDL2 particles), and the turnover of apoAI and apoAII is significantly reduced (17). Levels of LDL-C and apoB are normal or slightly decreased (approximately 40%) (21). The relationship between CETP defects and CAD remains elusive (37, 73). Some studies suggest that CETP-deficient patients have a reduced CAD risk (39, 74, 75), while others suggest that despite the elevation in HDL-C levels, these particles (as well as LDL) are dysfunctional (e.g., have less capacity than normal HDL2 for cholesterol efflux) and may not be cardioprotective (38, 76). All in all it appears that CETP may be essential for remodeling large HDL particles into smaller ones, which are more antiatherogenic (37). de Grooth et al. (77) discuss the complex relation between CETP mutations and the risk of CAD in detail.
HL. Few patients from only five families have been identified with HL-deficient phenotypes thus far, and the disorder appears to be inherited as an autosomal recessive trait (MIM 151670) (78, 79). Although the lipid profile of the patients suggests Type III hyperlipoproteinemia (i.e., severe hypertriglyceridemia, and elevated cholesterol and intermediate-density lipoprotein levels), HL deficiency patients can be distinguished by the absence of postheparin HL activity (79). Patients also present a 10-fold increase in HDL-TG, modestly elevated HDL-C and apoAI levels, larger HDL particles (elevation of HDL2 particles), and abnormal catabolism of remnant lipoproteins (Table 1) (79, 80). The phenotype in heterozygotes is variable, and they do not appear to have a discrete lipoprotein abnormality (79). Several HL-deficient patients had premature CAD, probably due to the elevated levels of atherogenic lipoproteins (34, 79). However, there is a controversy regarding whether HL is pro- or antiatherogenic, comprehensively reviewed by Jansen et al. (81). It seems that either very low (i.e., HL deficiency) or very high levels of HL activity (e.g., due to use of anabolic steroids) are associated with atherosclerosis (20, 81, 82). Furthermore, the current evidence suggests that defects in HL may lead to an increased susceptibility to atherosclerosis, although another cause of dyslipidemia is typically required for the development of CAD (79, 83).

Genetic studies for polygenic HDL-C trait
Strategies to identify genetic factors for complex polygenic traits generally fall into two categories: candidate-gene analyses using either association or resequencing approaches, and genome-wide studies, which include genome-wide linkage and genome-wide association (GWA) studies.
Linkage and candidate-gene studies for analyzing polygenic HDL-C trait. Over the past years, genome-wide linkage and candidate gene studies have been somewhat successful at identifying susceptibility genes for HDL-C levels.
(-629C>A), further strengthening the significant effect of variation at the CETP locus on HDL-C levels (99). Two variants in the promoter region of LIPC that are in strong LD (r² > 0.9) have been consistently associated with HDL-C levels: rs1800588 (C-514T or C-480T) and rs2070895 (G-250A) (98). A meta-analysis of 24,000 subjects has shown that one copy of the minor allele of rs1800588 increases HDL-C concentration by 1.5 mg/dl and two copies by 3.5 mg/dl (100). Six SNPs in LPL have shown significant associations with HDL-C concentrations, four of which are in strong LD (r² > 0.8) (97). A meta-analysis, available for three of these variants including 4,000 to 15,000 subjects, demonstrated that HDL-C levels are increased (1.5 mg/dl) in carriers of the minor allele of rs328 (S474X) and decreased in carriers of the minor allele of rs268 (N291S) and rs1801177 (D9N) (4.5 and 3 mg/dl, respectively) (101).
The evidence for association with LIPG is not as strong as for the other lipases (LPL and LIPC) (15). Although a few variants were significantly associated, the sample sizes in these studies were small (<1,000 individuals) (102-104). Common variants in LCAT have shown inconsistent results in association studies (105, 106). However, LCAT has not been thoroughly investigated yet. On the other hand, common variants in ABCA1 have been investigated in many candidate-gene studies. None of the variants showed a strong effect, although many of these studies were rather large (5,000-20,000) (98). Similarly, the evidence for association with common variants in APOA1 is not very strong, but APOA1 has not been thoroughly investigated in large candidate-gene studies (97).
The gene APOC3 is located 2.5 kb apart from APOA1 (within the APOA1/C3/A4/A5 gene cluster on chromosome 11). It is present on both HDL and TG-rich particles and is known to inhibit LPL activity (107). Although the APOC3 locus has been linked to HDL-C levels (108, 109), association data have been inconsistent and weak (20). APOA5, also located within the APOA1/C3/A4/A5 gene cluster, is mostly present on TG-rich particles but also on HDL, and it appears to activate LPL function (110). The association of APOA5 is the strongest among the apolipoproteins, with most investigated alleles decreasing HDL-C levels (97). APOE is also mainly present on TG-rich particles and on some HDL subspecies as well (15). An association with HDL-C levels has been observed in some but not in all studies, with most studies showing an increase in HDL-C levels for the E2 allele and a decrease for the E4 allele relative to the common E3 allele (98).
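The r² values quoted above for the LIPC and LPL variants are the standard pairwise measure of LD between two biallelic loci. As a minimal sketch, assuming that haplotype and allele frequencies are already known (the function name and the example frequencies are hypothetical and are not taken from the cited studies), r² can be computed from the disequilibrium coefficient D:

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise LD r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    for name, p in (("p_a", p_a), ("p_b", p_b)):
        if not 0.0 < p < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    # D measures how far the haplotype frequency departs from independence.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


if __name__ == "__main__":
    # Hypothetical frequencies: the two minor alleles always travel together.
    print(ld_r_squared(p_ab=0.2, p_a=0.2, p_b=0.2))
    # Hypothetical frequencies: the two loci are statistically independent.
    print(ld_r_squared(p_ab=0.04, p_a=0.2, p_b=0.2))
```

An r² close to 1 means that genotyping one variant captures nearly all the information in the other, which is why two promoter variants in strong LD are expected to show similar associations with HDL-C.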
PON1 is an antioxidative enzyme present on the HDL particle (10). Two nonsynonymous variants (i.e., amino acid substitutions) have been the main targets of investigation, resulting in some association signals (111, 112). SR-BI has been the focus of numerous investigators, because it was shown to influence HDL-C concentrations and the susceptibility to CAD in animal studies (14). Despite this functional support, the association evidence in human studies has been rather weak (15, 20). Likewise, SNPs in ABCG5/G8 have also been investigated in several association studies (113, 114), as these transporters mediate the
controls to avoid the potential problem of population stratification (84). Association studies have a much greater power than linkage-based studies to detect the effects of common variants that have a minor to modest effect on the disease or trait (95). An allelic association can result from the actual disease variant or, more probably, from linkage disequilibrium (LD) between the tested variant and the disease variant. Because LD mapping (i.e., association) is based on historical recombination events, only the DNA sequence near the mutation remains in LD (approximately thousands of bases) (84). Hence, association analysis provides a much better resolution for mapping than linkage, which spans tens of millions of bases. However, it requires a large number of markers and a priori knowledge of the underlying LD structure, which before the completion of the International HapMap Project (96) was difficult, if not impossible, to achieve.
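The comparison of allele frequencies between cases and controls that underlies an association test can be sketched as a chi-square test on a 2 × 2 table of allele counts. This is only one simple form of the test, written with the standard library; the function name and the counts are hypothetical, and it does not address the population stratification or multiple-testing issues raised in the text.

```python
import math


def allelic_chi_square(case_minor: int, case_major: int,
                       control_minor: int, control_major: int):
    """Chi-square test (1 degree of freedom) on a 2 x 2 allele-count table.

    Returns (chi-square statistic, p-value). Counts are alleles, not
    people, so each genotyped individual contributes two alleles.
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column needs at least one allele")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts: minor allele enriched among cases.
    chi2, p = allelic_chi_square(60, 40, 40, 60)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

In a genome-wide setting the resulting p-value would be compared with a threshold such as the P < 5 × 10⁻⁸ used for the loci in Fig. 1, rather than with the conventional 0.05.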
Association studies. Numerous candidate-gene association studies for HDL-C levels have been performed over the years ( 20,97 ). Selecting genetic variants in or near the genes is an important consideration in this type of study. In early studies, typically few variants, often in coding regions, were genotyped, and thus the effects of many polymorphisms in the gene and regulatory regions were missed. The completion of the HapMap Project in 2005 provided a comprehensive map of the common [minor allele frequency (MAF) > 5%] human variation and LD structure in different populations ( 96 ), which allows a careful selection of regional nonredundant SNPs (i.e., not in LD) that capture most of the common variation (tagSNPs). Furthermore, the genome-wide scale of the HapMap Project has made it possible to extend association studies to encompass flanking regulatory regions (i.e., promoters, enhancers, and conserved elements) as well as entire regions of linkage (i.e., fine-mapping studies).
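The selection of nonredundant tagSNPs described above can be illustrated with a minimal sketch. It assumes that pairwise r² values between SNPs have already been computed from a reference panel, and it uses a greedy strategy with an r² cutoff of 0.8; the cutoff, the function names, and the SNP labels are illustrative assumptions and are not taken from the studies reviewed here.

```python
from typing import Dict, List, Tuple

def r2_lookup(r2: Dict[Tuple[str, str], float], a: str, b: str) -> float:
    """Return pairwise r^2; a SNP is in perfect LD with itself, missing pairs count as 0."""
    if a == b:
        return 1.0
    return r2.get((a, b), r2.get((b, a), 0.0))

def select_tag_snps(snps: List[str],
                    r2: Dict[Tuple[str, str], float],
                    threshold: float = 0.8) -> List[str]:
    """Greedy tagSNP selection: repeatedly pick the untagged SNP that
    captures (r^2 >= threshold) the largest number of still-untagged SNPs."""
    untagged = set(snps)
    tags: List[str] = []
    while untagged:
        candidates = [s for s in snps if s in untagged]  # keep input order for ties
        best = max(
            candidates,
            key=lambda s: sum(1 for t in untagged if r2_lookup(r2, s, t) >= threshold),
        )
        tags.append(best)
        # Everything in strong LD with the chosen tag is now considered captured.
        untagged -= {t for t in untagged if r2_lookup(r2, best, t) >= threshold}
    return tags

# Hypothetical SNP labels and r^2 values, for illustration only.
example_snps = ["rsA", "rsB", "rsC", "rsD"]
example_r2 = {("rsA", "rsB"): 0.95, ("rsB", "rsC"): 0.85, ("rsC", "rsD"): 0.20}
print(select_tag_snps(example_snps, example_r2))
```

In this toy example, rsB captures rsA and rsC, so only rsB and rsD would need to be genotyped. Dedicated tools choose tags with more refined criteria; the sketch only conveys the principle.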
Numerous genes suggested to play a role in the HDL metabolism (i.e., apolipoproteins, enzymes, and lipid transfer proteins; cellular receptors and transporters; and transcription factors) have been tested in association studies. The results of many studies have been confusing and inconsistent, probably due to the few variants and samples (<1,000 subjects) that were investigated. Given the large number of studies, the results of which have remained inconclusive, in this review we will only summarize the main conclusions of common genetic polymorphisms in the CETP, LIPC, LPL, endothelial lipase gene (LIPG), LCAT, ABCA1, APOA1, APOC3, APOA5, APOE, SR-BI, and PON1 genes that have been implicated in more than one study [reviewed by ( 15,20,97,98 )].
Among the genes with known function in HDL metabolism, common variants in CETP, LPL and LIPC have shown the most pronounced association with HDL-C levels ( 15 ). Genetic variants in CETP have been tested for association in numerous studies, with several of these variants showing a strong genetic effect (change of 2-3 mg/dl per each copy of the minor allele) ( 73 ). A large meta-analysis of 40,000-70,000 subjects is available for three of these variants, rs708272 (Taq1B), rs5882 (I405V), and rs1800775 power to detect allelic associations with low HDL-C levels. However, the presence of two independently associated variants located in the same intron of the WWOX gene warrants further investigation of the role of WWOX in HDL metabolism.
RESEQUENCING STUDIES. Resequencing provides a comprehensive analysis for candidate genes and regions as both common and rare variants can be identified. However, until recently, these types of studies (using the traditional Sanger-sequencing method) were limited to coding regions and a small number of samples and genes, because they were expensive and laborious.
Up until now, candidate genes selected for resequencing studies were mostly genes known to cause monogenic conditions. Variants within these genes are probably individually rare (<1%). However, in aggregate, they may contribute to variation in HDL-C levels in the general population ( 124 ). Because variants with large phenotypic effects are more likely to be found at one extreme of the trait distribution, cases and controls for resequencing studies were typically selected on the basis of HDL-C levels < the 5-10th or > the 90-95th population-specific percentile for age and sex, respectively. An excess of mutations in individuals from one extreme versus the other extreme is considered strong evidence of an association.
Cohen et al. ( 124 ) have sequenced the coding regions (i.e., exons) of the APOA1, LCAT, and ABCA1 genes in individuals from the upper (n = 128) and lower (n = 128) 5% of the distribution of HDL-C levels from the Dallas Heart Study, a population-based study. Sixteen percent of the low HDL-C group had a nonsynonymous variant in one of these genes (mainly in ABCA1) ( Fig. 2 ) that was not present in the high HDL-C group. In contrast, only 2% of the individuals in the high HDL-C group had a nonsynonymous variant not seen in the low HDL-C group ( P -value < 0.0001). In a similar analysis of a Canadian sample, nonsynonymous variants in these three genes were also more common in the low HDL-C group (n = 155) versus the high group (n = 108) (14% vs. 3%). Six of the variants found only in the low HDL-C groups (both in the Dallas Heart Study and Canadian sample) were previously identified in subjects with monogenic conditions. Furthermore, most ABCA1 variants from the low HDL-C group significantly decreased cholesterol efflux versus none of the ABCA1 variants from the high HDL-C group that were tested ( 124 ). This influential study established that multiple rare variants with large phenotypic effects contribute to HDL-C variation in the general population.
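The comparison of carrier frequencies between the two extremes of the HDL-C distribution can be sketched as a one-sided Fisher's exact test. The carrier counts below (20 of 128 and 3 of 128) are hypothetical roundings of the reported 16% and 2%, chosen only for illustration; they are not the published counts, and the test shown is not necessarily the one used in ( 124 ).

```python
from math import comb

def fisher_one_sided(carriers_a: int, n_a: int, carriers_b: int, n_b: int) -> float:
    """One-sided Fisher's exact test: probability of observing at least
    `carriers_a` carriers in group A if carriers were distributed at random
    between the groups (hypergeometric upper tail)."""
    total = n_a + n_b
    k = carriers_a + carriers_b          # total number of carriers
    denom = comb(total, n_a)             # ways to choose group A from everyone
    upper = min(k, n_a)
    numer = sum(
        comb(k, x) * comb(total - k, n_a - x)
        for x in range(carriers_a, upper + 1)
    )
    return numer / denom

# Hypothetical counts: 20/128 carriers in the low HDL-C group, 3/128 in the high group.
p_value = fisher_one_sided(20, 128, 3, 128)
print(f"one-sided P = {p_value:.2e}")
```

With counts of this magnitude, the excess of carriers in the low HDL-C group would be very unlikely under the null hypothesis, which is the logic behind treating such an excess as evidence of association.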
More recently, these three candidate-genes were sequenced in Caucasian subjects with HDL-C levels below the 10th percentile for age and sex. Nonsynonymous variants were found in 25% of the cases and functional variants in 12% of the cases ( 31 ). Differences in selection criteria and populations may explain the small difference in occurrence of coding variants between the studies. The ABCA1 gene was also sequenced in individuals with the highest 1% (n = 190) and the lowest 1% (n = 190) HDL-C levels from the Copenhagen City Heart Study, a population-based sample ( 125 ). Ten percent of the individuals in efflux of cholesterol and plant sterols from enterocytes back into the intestinal lumen ( Fig. 1 ), thus limiting their accumulation in the body and promoting RCT ( 59 ). However, the results of the association studies remain controversial. Furthermore, several candidate-gene studies have investigated the genetic effect of the apolipoprotein genes APOA2, APOA4, and APOB ( 97,115,116 ). The small-scale association results for these genes have been mostly negative.
Tens to hundreds of candidate-genes may reside in a typical region of linkage (~10 Mb). Due to the high cost of genotyping, often only those genes residing either directly under the linkage peak and/or with a possible role in lipid metabolism were tested for association in subsequent fine-mapping studies ( 94,117 ). For the most part, the underlying gene(s) contributing to the linkage signal has not been determined in subsequent association studies. However, only a few loci have met genome-wide significance criteria for linkage or confirmation in replication studies.
We fine-mapped the linked region on chromosome 16q22-24 using a tag-SNP association analysis approach ( 118 ). This region is the most consistently replicated region for HDL-C ( 91-94 ). Using HapMap data, we selected nonredundant tag-SNPs to explore the contribution of common variation within the 12.5 Mb-linked region to HDL-C levels. We identified a SNP (rs2548861) in the WW domain-containing oxidoreductase (WWOX) gene that remained significant after correcting for multiple testing and that explains much (67%) of the linkage signal on 16q (LOD = 3.9) ( 118 ), and we demonstrated a population effect of this variant on HDL-C levels in large population-based studies (n = 6,728).
The WWOX gene has not been previously associated with HDL-C metabolism and could thus suggest a new metabolic pathway. It encodes a 46 kDa protein that contains two WW domains and a short-chain oxidoreductase domain ( 119 ). The short-chain oxidoreductase domain suggests a role in steroid metabolism, and in fact Wwox-deficient mice display impaired steroidogenesis ( 120 ). Furthermore, Wwox-deficient mice have impaired serum lipid levels compared with age- and sex-matched control littermates and die by 4 weeks of age ( 119,121 ). Wwox was also shown to function as a tumor suppressor in mice heterozygous for the deletion ( 121 ). However, the underlying mechanism(s) by which WWOX influences HDL-C levels is currently unknown, and further studies are warranted to elucidate these molecular mechanisms.
Although the rs2548861 SNP or SNPs in LD did not reach genome-wide significance in GWA studies for HDL-C ( 122, 123 ), a SNP, rs2667590, located in the same intron as the associated SNP, rs2548861, had a P -value of 2.3 × 10⁻⁵ and ranked 344th among the 2,559,602 SNPs tested for HDL-C in this GWA study ( 122 ). It was not further investigated, because typically only a limited number of association signals are followed up in GWA studies (as discussed in "GWA studies"). This SNP rs2667590 had a MAF of 3%, whereas our study included only SNPs with a MAF > 10% to ensure that our samples contained sufficient fied and further validated in 8,726 subjects of a large population-based study ( P -value = 4.0 × 10⁻⁷ ) ( Fig. 2 ).
Regional candidate genes (i.e., genes residing in a linkage region) have also been investigated by resequencing. For example, the Zinc Finger Protein 202, a transcriptional repressor that binds elements found predominantly in genes involved in HDL metabolism, was selected for resequencing, because it is located in a susceptibility locus for FHA on chromosome 11q23 ( 134 ). The promoter and coding regions were screened in individuals with the highest 1% (n = 95) and lowest 1% (n = 95) HDL-C levels from the Copenhagen City Heart Study, but none of the identified variants differed in frequency between the low and high HDL-C groups, suggesting that coding variants in Zinc Finger Protein 202 do not contribute to HDL-C levels in the general population.
GWA studies. Recently, the field of human genetics has seen the completion of several tasks that have profoundly affected our approach to map human diseases and traits. These include the completion of the Human Genome Project ( 135 ), the deposition of millions of SNPs into public databases (dbSNP), rapid advances in high-throughput genotyping technologies ( 136 ), and the completion of the International HapMap Project providing haplotype (i.e., LD) maps of the human genome in diverse populations ( 96 ). These advances have made it possible to extend candidate-gene association studies to hypothesis-free GWA studies, in which a dense set of variants across the genome is genotyped to survey the contribution of common (MAF ≥ 5%) genetic variation to disease or quantitative trait ( 84,95 ). In a short time, this approach has provided novel insights into the allelic architecture and genetic basis of many complex traits and thus revolutionized gene hunting ( Fig. 2 ) ( 137 ). Based on the NHGRI GWA Catalog data, 422 publications have reported about 300 genome-wide significant loci ( P -value < 5 × 10⁻⁸ ) for a wide range of common diseases and traits since the early GWA studies in 2006 ( 138 ).
In GWA studies, large-scale genotyping platforms (arrays or chips) are used to assay hundreds of thousands of SNPs simultaneously. The density of SNPs in the early genotyping arrays (~100,000) was insufficient to capture a large proportion of common genetic variation. However, higher density platforms, comprising 300,000-1,000,000 markers, have rapidly evolved since the completion of the second phase of the HapMap Project, which provided a haplotype map of over 3.1 million SNPs ( 139 ). Genotyping platforms of 500,000-1,000,000 SNPs have been estimated to capture 67-89% of common SNP variation in European and Asian ancestry and 46-66% in African ancestry ( 139 ). Hence, as of yet, even the most current GWA platform provides an incomplete coverage. Furthermore, these estimates were based on the data available in HapMap, and considerably more common variants are expected to exist in the human genome (>10 million) ( 140 ). Therefore, there are numerous SNPs not captured (i.e., tagged) by the current HapMap data and thus by current the low HDL-C group were heterozygous for a mutation in ABCA1, and four individuals were carriers of a mutation previously identified in Tangier patients.
To date, no genetic deficiency has been reported for PLTP although in vivo and in vitro studies have demonstrated its key role in HDL-C metabolism ( 14 ). This may be partially explained by its pattern of expression, suggesting functions in organs such as brain, lung, and the gonads ( 126 ). The gene was screened for sequence variants in low HDL-C populations in two independent studies, in 124 individuals with HDL-C levels below the 10th percentile for age and sex ( 31 ) and in 276 subjects below the ~20th percentile for age and sex, as well as in 364 matched controls ( 127 ). Mutations in PLTP were uncommon in these low HDL-C populations. However, the latter study ( 127 ) identified an intronic SNP (rs2294213) more prevalent in the controls (4%) than in the low HDL-C subjects (2%) and associated with higher HDL-C levels ( P -value < 0.001). Given these findings, the authors also sequenced PLTP in 107 subjects with high HDL-C levels (>80th percentile for age and sex) in a subsequent study ( 128 ). In agreement with their previous findings, the MAF of rs2294213 was markedly increased in the high HDL-C group (7%) ( 128 ), suggesting that SNPs in PLTP may contribute to the entire range of variation in plasma HDL-C. This finding is the first evidence for a direct link between variation in PLTP and HDL-C levels in humans.
Overexpression and gene deletion of EL (LIPG) have been shown to influence HDL-C levels in biochemical and animal studies ( 14,129 ), but until recently, the evidence from human genetic studies was weak. Recently, the coding region of the LIPG gene was sequenced in 213 individuals with high ( ≥ 95th percentile) and 372 individuals with low ( ≤ 25th percentile) HDL-C levels from European descent cross-sectional cohorts ( 130 ). This study identified a significant excess of nonsynonymous variants unique to the high HDL-C cases ( P -value = 0.02), suggesting that inhibition of EL may be an effective way to increase HDL-C levels ( 130 ). Importantly, using an in vitro lipase activity assay, the authors were able to demonstrate that the high HDL-C variants significantly decreased the EL activity. This type of functional information becomes essential in resequencing studies, when allele frequencies are too low to detect statistical associations ( Fig. 2 ).
The angiopoietin-related protein 4 (ANGPTL4) gene is an excellent candidate for influencing TG and HDL-C levels, because it appears to inhibit LPL activity in animal models ( 131,132 ). The ANGPTL4 gene was recently sequenced in 3,551 participants of the Dallas Heart Study ( 133 ). A low frequency (~2%) nonsynonymous variant (E40K), associated with higher HDL-C levels, was identi-genes, including GRIN3A and CLPTM1, were also implicated by GWA studies to contribute to HDL-metabolism, but these are mainly based on study samples ascertained for metabolic conditions [i.e., type 2 diabetes mellitus (T2DM) and hypertension] ( 122,151 ). However, these results were not confirmed in larger meta-analyses GWA of mostly population-based samples ( 123,148 ).
GWA findings have provided confirmation that variation within or near several genes previously known to be involved in HDL metabolism does indeed contribute to HDL-C levels ( Table 2 , Fig. 1 ), suggesting that the newly identified loci should also contain genes encoding proteins with roles in lipoprotein metabolism ( Fig. 1 ). Although the function of the previously known lipid genes has been well defined (described in detail in section 1) and several of them were known to carry rare Mendelian mutations (ABCA1, APOA1, CETP, LCAT, LIPC, and LPL) ( Table 1 ) ( 20 ), for some genes like LIPG, PLTP, APOB, or LCAT, the previous genetic evidence was weak, not significant, or had remained conflicting. Similarly, GWA findings have now provided compelling evidence for association with common variants in ABCA1 and APOA1 ( 123 ). Inadequate statistical power and insufficient coverage may be the major reasons why prior candidate-gene studies did not identify these variants found in the GWA studies. Furthermore, candidate-gene studies typically survey SNPs within ± 5 kb, whereas in the GWA studies, strong association signals were observed up to 70 kb downstream of LIPG and LPL and 10 kb upstream of CETP ( 150 ) in regions that were previously considered as 'intergenic.' However, one of the most important observations of these findings is that a single locus can harbor both common variants with weak or moderate effects and rare variants with large effects ( Fig. 2 ). This observation suggests that the novel loci are strong candidates for Mendelian dyslipidemias and that a full spectrum of risk alleles should be expected at susceptibility loci.
ANGPTL4 is a secreted protein that inhibits LPL activity by converting the enzyme from catalytically active dimers to inactive monomers ( 131 ). Genetic studies in humans revealed an additional role for ANGPTL4 in HDL metabolism that was not apparent from mouse studies. The overexpression of ANGPTL4 in mice causes severe hypertriglyceridemia, whereas mice lacking ANGPTL4 have an increased LPL activity and low plasma TG levels ( 132,162 ). Sequencing studies in large population-based cohorts identified a low-frequency (~2%) nonsynonymous variant (E40K) that is significantly associated with both low TG and high HDL-C levels ( 133 ). The HDL-C associated GWA variant, rs2967605 ( Table 2 ), was only nominally significant with TGs ( P = 0.001) ( 123 ). However, because ANGPTL4 modulates TG levels by inhibiting the LPL activity, the effect of common (modest) variants in ANGPTL4 may be missed by measuring fasting TG genotyping platforms, especially in regions difficult to genotype.
It should also be noted that less common SNPs with MAF of 1-5% have not yet been thoroughly investigated in GWA studies mainly due to insufficient sample sizes.
Because such a large number of hypotheses are tested in a GWA study, the sample size required to obtain a statistically significant result after correcting for multiple testing ( P -value < 5 × 10⁻⁸ ) ( 141 ) is rather large and thus expensive. Therefore, a multi-stage study design that optimizes the power while minimizing the overall amount of genotypes required has been utilized in GWA studies ( 142 ). In the multi-stage design, a subset of the study sample is genotyped in the first stage, and SNPs that surpass a significance threshold are tested in a second, independent study sample to reduce the number of false positive results ( 142 ). However, because large samples are needed to detect variants with a small effect and/or low frequency, a multi-stage design has limited power to detect such variants given the reduced stage I sample-size. Alternatively, power can be increased by pooling information from multiple GWA studies in a meta-analysis ( 143 ). Although different GWA studies often utilized different SNP arrays, collaborators have overcome this problem by using SNP imputation ( 122,123,144 ). In imputation, additional genotype and LD information from a reference sample (e.g., HapMap data) is used to statistically infer unknown marker genotypes ( 145,146 ). Imputation methods and their accuracy have improved quickly, and it is now common to report association data at 3 million SNPs in GWA studies, of which ≤ 1 million SNPs have been directly genotyped and the remaining imputed ( 122,123,144 ).
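Two of the ideas above lend themselves to a short numerical sketch: the genome-wide significance threshold and the pooling of studies in a meta-analysis. The sketch assumes the conventional rationale that the 5 × 10⁻⁸ threshold corresponds to a Bonferroni correction of α = 0.05 for roughly one million independent tests, and it uses a fixed-effect, inverse-variance-weighted meta-analysis as one common pooling scheme. The per-study effect sizes and standard errors are hypothetical, and the functions are illustrative rather than the methods of any cited study.

```python
from math import erfc, sqrt
from typing import List, Tuple

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def fixed_effect_meta(betas: List[float], ses: List[float]) -> Tuple[float, float, float]:
    """Inverse-variance-weighted fixed-effect meta-analysis.
    Returns the pooled effect, its standard error, and a two-sided normal P-value."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided tail of the standard normal
    return pooled_beta, pooled_se, p_value

# Assumed: alpha = 0.05 and about one million independent common-variant tests.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical per-study effects (mg/dl per allele) and standard errors.
beta, se, p = fixed_effect_meta([1.0, 1.0], [0.5, 0.5])
print(threshold, beta, se, p, p < threshold)
```

In the toy input, two studies with identical estimates give a pooled standard error smaller than either study alone, which illustrates why pooling increases power; the pooled P-value here is still far from genome-wide significance.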
GWA STUDIES FOR HDL-C. To date, 10 GWA studies on HDL-C levels are available ( 122,123,147-154 ). The early stage of GWA studies lacked adequate power to detect variants (or alleles in LD) contributing to HDL-C levels due to insufficient sample sizes and low SNP coverage. Subsequently, well-powered studies that investigated lipid phenotypes by pooling samples ( ≥ 5,000 individuals) to perform a meta-analysis have identified many (~40) genome-wide significant loci for serum lipid levels ( 122,123,148,149 ). These meta-analysis GWA studies identified 16 loci associated with HDL-C levels ( P -value < 5 × 10⁻⁸ ) ( Table 2 ). The magnitude of their genetic effect is, however, modest, and altogether, these common variants (or variants in LD with them) are estimated to explain a small proportion (~10%) of the variance in HDL-C levels ( 123 ). Variations at about one-half of these loci were previously known to alter plasma lipid levels ( Table 2 ) ( 155 ). Seven of the loci are without direct prior evidence for a role in lipid metabolism, hence immediately suggesting new biological hypotheses ( Table 2 ). Some other novel ) with a specialized HDL phenotype. levels (as in the GWA studies). These variants would probably have more profound effects on postprandial TG levels that are directly hydrolyzed by LPL ( 155 ). Furthermore, because ANGPTL4 is also a serum hormone known to be involved in the regulation of glucose homeostasis, lipid metabolism, and insulin sensitivity ( 163 ), it could influence HDL-C levels through other mechanisms not dependent on LPL. The FADS gene cluster (FADS1/FADS2/FADS3) showed genome-wide significant association with TG as well as with HDL-C levels ( Table 2 ) ( 123 ). FADS are known to be involved in the PUFA biosynthetic pathway ( 156 ). PUFA is a class of fatty acids with multiple desaturations, such as linoleic acids and α-linolenic acids.
Depending on the position of their first double bond, PUFA are classified as n-6 or n-3 (omega-6 pathway or omega-3 pathway, respectively). Both pathways share and compete for the same FADS (FADS1 and FADS2) for their biosynthesis. The PUFA composition, in particular omega-3 PUFA, has been shown to be associated with the metabolic syndrome, CAD, and T2DM, as well as psychiatric and immune-related disorders ( 164 ). SNPs at this locus have been previously associated with the fatty acid composition and concentrations ( 164,165 ). Interestingly, dietary omega-3 PUFA, a substrate of FADS1, is known to lower plasma TG levels and raise HDL-C levels ( 166 ). There are reports suggesting that the mechanism behind these effects of omega-3 is a reduced hepatic synthesis of VLDL and TGs ( 167,168 ). However, the association with FADS in the GWA studies may suggest reduced fatty acid concentrations as a more proximal effect of omega-3 ( 169 ).
A low-frequency (3%) nonsynonymous variant (rs1800961 [T130I]) in HNF4A surpassed the genome-wide significance level in a recent meta-analysis GWA study with 30,714 individuals studied in the combined stage I and II ( Table 2 , Fig. 2 ) ( 123 ). HNF4A encodes a transcription factor, and several of its target genes are known to be involved in the glucose and lipid metabolism, including many, if not all, apolipoproteins (apoA1, A2, A4, B, C2, C3, E), microsomal TG transfer protein, cholesterol 7α-hydroxylase, SR-BI, and PPAR ( 157 ). Mice with targeted mutation of Hnf4A have a dramatic decrease in LDL-C and HDL-C levels, and their HDL particles are small and lipid-poor ( 157 ). Mutations in humans are known to cause maturity onset diabetes of the young (MODY) type I, an early onset autosomal dominant form of T2DM ( 170 ). MODY type I patients also have significantly lower apoAI, AII, and HDL-C levels compared with controls ( 171 ). Furthermore, the most common type of MODY, MODY3, is caused by mutations in HNF1A, a direct target of HNF4A ( 172 ).
A previous study in Japanese patients with T2DM has shown that HDL-C levels are lower in subjects with the HNF4A T130I mutation (rs1800961) compared with noncarriers ( 173 ). This study has also demonstrated that the transcriptional activation of target genes, such as HNF1A, was reduced in the T130I mutation construct compared with wild-type construct using a luciferase reporter gene assay in hepatocytes. Interestingly, a variant upstream of HNF1A was associated with LDL-C, but not with HDL-C ( P -value > 0.05 ), in the same meta-analysis GWA study ( 123 ). The association of HNF4A and HNF1A with distinct lipid phenotypes may reveal exclusive targets for these HNF transcription factors and should also lead to a more effective prevention of cardiovascular risk in MODY patients.
SNPs in the first intron of GALNT2 were associated with both HDL-C and TG levels already in an early meta-analysis GWA study (n = 8,816) ( Table 2 ) ( 122 ). This is a fine example of a gene that would not have been identified in a hypothesis-driven approach, as it does not exhibit any known direct connection to lipid metabolism. GALNT2 encodes N-acetylgalactosaminyltransferase 2, which is involved in the first step of O-linked glycosylation of proteins ( 161 ). O-linked glycosylations are known to regulate protein function; hence, it is hypothesized that GALNT2 affects HDL-C and TGs indirectly through the glycosylation of proteins involved in the lipid metabolism ( 174 ). For instance, LCAT, apoCIII, VLDL, and LDLR are all O-glycosylated with N-acetylgalactosamine residues ( 175-177 ). Recently, Edmondson et al. ( 178 ) have created a liver-specific GALNT2 transgenic mouse, as well as a GALNT2 knockdown mouse (~90%) using siRNA. The hepatic overexpression of GALNT2 decreased HDL-C by approximately 20%, and the GALNT2 knockdown resulted in a dose-dependent increase of 24-37%. These results support GALNT2 as the causal gene at the associated locus. However, the exact mechanisms by which GALNT2 After the GWA studies, the biggest challenge will now be to unambiguously identify the underlying susceptibility genes and risk variants at each locus and, most importantly, to convert these novel associations into mechanistic insights.
POLYGENIC RISK SCORES FOR HDL-C LEVELS. The GWA studies have provided empirical evidence for a polygenic inheritance of dyslipidemia with many genes, each with a small effect, contributing to quantitative lipid levels. Dyslipidemia could therefore be thought of as the extreme of a continuous distribution of the genetic risk, with the combination of several lipid-related variations influencing an individual's risk. Hence, a set of variants that are associated with the trait, in aggregate, could improve the risk prediction and liability models of dyslipidemia. Such sets of aggregates are termed 'genetic risk scores,' 'allelic dosage,' or 'polygenetic risk scores' ( 106,123,183 ). Kathiresan et al. ( 123 ) have constructed a genetic risk score using 14 genome-wide significant SNPs for HDL-C. The genetic risk scores were strongly correlated with plasma HDL-C levels. The HDL-C levels in the lowest and the highest deciles of the genotype score were 58 mg/dl and 46 mg/dl, respectively. Hence, the difference in plasma HDL-C levels between the most 'deleterious' and the most 'favorable' genetic risk score was 12 mg/dl. It should be noted, however, that the genetic risk score was calculated using only the most significant allele for each locus, although there were multiple independent alleles at several of these loci that could contribute to the risk.
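A genetic risk score of the kind described above can be sketched as a sum of risk-allele counts over the selected SNPs, optionally weighted by per-allele effect sizes. The sketch below is a generic illustration under that assumption; the SNP labels, weights, and function name are hypothetical, and it is not claimed to reproduce the exact scoring used by Kathiresan et al. ( 123 ).

```python
from typing import Dict, Optional

def genetic_risk_score(risk_allele_counts: Dict[str, int],
                       weights: Optional[Dict[str, float]] = None) -> float:
    """Sum risk-allele counts (0, 1, or 2 per SNP) across SNPs.
    If `weights` is given, each count is multiplied by that SNP's weight
    (e.g., an estimated per-allele effect on HDL-C)."""
    score = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1, or 2")
        weight = 1.0 if weights is None else weights[snp]
        score += weight * count
    return score

# Hypothetical individual genotyped at three hypothetical SNPs.
individual = {"snp1": 2, "snp2": 0, "snp3": 1}
unweighted = genetic_risk_score(individual)
weighted = genetic_risk_score(individual, {"snp1": 0.5, "snp2": 1.0, "snp3": 1.5})
print(unweighted, weighted)
```

Individuals can then be ranked by score and grouped, for example into deciles, to compare mean HDL-C levels between the lowest and highest score groups.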

Future GWA studies
Although it was hardly possible to predict the genetic risk prior to the GWA studies (as only very few genetic risk factors and their effect sizes were known), the GWA findings can explain a small proportion of the genetic variance (~10%) ( 123 ), and thus are not yet practical for risk predictions ( 184 ). However, because most of the identified GWA variants have an effect close to the detectable limit, there could still be numerous common risk factors with effect sizes too small to surpass the strict significance thresholds of GWA studies but that may collectively account for a substantial proportion of the variance. In agreement with this inference, in a recent GWA study of neurological disorders ( 185 ), genetic risk scores were summarized across SNPs using liberal significance thresholds ( P -values < 0.1-0.5). The variances explained increased in a stepwise fashion with decreasing significance thresholds, suggesting a genetic architecture that includes many more common variants with very small effects. Furthermore, the authors of this study also demonstrated using simulations that increasing the discovery sample size will substantially refine the genetic risk scores and increase the variance explained ( 185 ). Hence, even larger sample sizes (>20,000 individuals) influences HDL metabolism remain to be clarified in future studies.
A tag SNP upstream of the genes MVK and MMAB was implicated in two meta-analysis GWA studies ( Table 2 ) ( 122,123 ). The MVK and MMAB genes are both regulated by the sterol-responsive element-binding protein 2, a transcription factor that controls the cholesterol homeostasis, through a shared common promoter ( 158 ). MVK encodes mevalonate kinase that catalyzes an early step in the biosynthesis of isoprenoids that leads to cholesterol production ( 158 ). Homozygous mutations in MVK are known to cause the rare metabolic disorder, mevalonic aciduria (MIM 610377), or a milder syndrome, hyperimmunoglobulinemia D (MIM 260920) ( 179,180 ). In agreement with the GWA findings, patients with either disorder present with persistently low levels of HDL-C (~29 mg/dl). The role of methylmalonic aciduria, MMAB, in lipid metabolism is less understood. MMAB encodes an enzyme that is involved in the formation of adenosylcobalamin, a compound derived from vitamin B-12 that is necessary for cholesterol degradation ( 159 ). The exact function of MVK and MMAB in HDL metabolism needs to be clarified in future studies.
Two SNPs at 11p11.2 were associated with HDL-C levels, rs7395662 in a meta-analysis GWA of population-based samples ( 148 ), and rs7120118 in a GWA study of the genetically isolated Finnish population ( Table 2 ) ( 149 ). The genes flanking rs7395662, MADD and FOLH1, have not been implicated in lipid metabolism. However, the SNP rs7120118 is within the LXRA gene. LXRs are excellent candidates for HDL-C levels, because these nuclear receptors play central roles in the transcriptional control of several lipoprotein remodeling enzymes, such as LPL, CETP, and PLTP ( 160 ). Furthermore, previous studies have shown that LXRs regulate genes in the RCT pathway, including ABCA1, ABCG5/G8, and ABCG1 ( 181 ). However, although LXRA is an excellent candidate, the association with rs7120118 was obtained in 4,763 individuals without further replication, and there is no LD between rs7120118 and rs7395662 ( r² < 0.1). Hence, further studies in this region are necessary to determine with certainty which gene(s) is responsible for the association at this locus.
The function of TTC39B is not yet understood. The tetratricopeptide repeat motif is an ancient protein-protein interaction domain found in a number of functionally different proteins, with the majority of them participating in the cell cycle control, transcription, protein transport, and protein folding ( 182 ). Based on differential expression of the TTC39B transcript by the rs471364 genotype, TTC39B was suggested as the functional gene for this locus ( 123 ). The allele associated with higher HDL-C levels was also associated with lower TTC39B transcript levels in liver samples ( 123 ). Mapping genetic variations that underlie individual differences in quantitative levels of gene expression [expression QTLs (eQTLs)] is an important approach for understanding complex traits.
tions. For example, in stage 2 of their GWA study, Kooner et al. ( 152 ) investigated SNPs implicated in stage 1 and potentially functional SNPs near prespecified genes in a sample that included 2,528 Mexicans. A nonsynonymous SNP in ABCA1 (rs9282541, R230C), which is exclusive to Amerindian and Amerindian-derived populations such as Mexicans ( 188 ), displayed genome-wide significance with a large allelic effect (4.3%) on HDL-C levels. Furthermore, because only a small fraction of the common variants are directly or indirectly (imputed) genotyped in the current GWA studies, the lipid-associated SNPs might not represent the actual functional variants but rather be in LD with the causal variants. As LD relationships vary between populations ( 96 ), studies in diverse populations can also assist in fine-mapping the actual regional susceptibility variant(s). Additionally, it is necessary to establish whether the confirmed loci have a consistent effect across ethnic groups ( 189 ).
Recent studies have examined whether the GWA risk variants are also involved in dyslipidemia in diverse populations, such as African-Americans, Mexicans, and Japanese ( 190-192 ). Most of the examined variants did not replicate for HDL-C levels in these studies, suggesting that a comprehensive investigation will be required to verify or exclude the relevance of the Caucasian loci in other populations. Furthermore, even if a variant has an effect in all ancestry groups, it might be more prevalent in one population and thus more easily detected in that particular population ( 192 ). Consequently, the contribution of each locus to disease susceptibility may vary, and hence the implications of SNPs for public health can differ between populations.
Another reason for extending analyses to samples from diverse populations is that there is a clear difference in the distribution of HDL-C levels, and thus in the prevalence of low HDL-C, across ethnic groups ( 193 ). Asian and Hispanic populations have the lowest concentrations of HDL-C, whereas Africans have the highest ( 193,194 ). These findings remain consistent even after adjusting for cardiovascular risk factors and lifestyle. These differences are attributed to genetic and specific lifestyle factors as well as their interactions ( 193,194 ). For example, compared with Caucasians, Africans have a higher allele frequency of the rs2070895 and rs1800588 variants in LIPC, which lead to a lower HL activity, and a lower percentage of central obesity, both of which are consistent with higher HDL-C and lower TGs among Africans ( 194 ).
Taken together, there are excellent reasons for investigating HDL-C in diverse populations, because it provides an opportunity to reveal population-specific genetic and environmental determinants of HDL-C as well as to assist in fine-mapping of risk variants at the confirmed loci. Given the lessons learned from the existing GWA studies, large samples from diverse ethnic groups will be needed. Currently, comprehensive genotyping resources are limited to populations of European, Chinese, Japanese, and Yoruba (African) descent (the HapMap populations) ( 139 ). However, the data that are soon to emerge from the 1000 Genomes Project ( 186 ) on broad geographic regions will enable more

would be necessary to explain some of the remaining heritability. These data suggest that we are still in the linear phase of genetic discoveries for lipid traits (i.e., an increase in sample size increases the number of identified loci), and thus potentially hundreds to thousands of very small effects could be identified utilizing excessively large samples (Mega-GWA).
Mega-GWA studies may also provide adequate power to detect statistical associations with lower frequency alleles. Although low frequency variants (1-5%) are speculated to explain a substantial proportion of the missing heritability, they are poorly represented on existing genotyping platforms, because they are difficult to discover using the sample sizes of current reference panels (such as the HapMap sample). However, the 1000 Genomes Project will soon complete the genome-wide resequencing of more than 1,000 individuals from world-wide populations (using next-generation sequencing technologies discussed below) ( 186 ) and thus dramatically improve the catalog of both common and low frequency variants. The specified goal of this project is to identify >95% of the variants with allele frequencies >1%, as well as to identify >95% of the variants with allele frequencies >0.1-0.5% in exons (http://www.1000genomes.org). Hence, the data from the 1000 Genomes Project will be used to produce even more comprehensive GWA arrays and are expected to facilitate the investigation of low frequency alleles. Furthermore, imputation effectiveness and accuracy can also be improved by using these extensive data measured in a large reference panel.
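Why lower frequency alleles demand larger samples can be made concrete with a rough power calculation. The sketch below is illustrative only: it assumes an additive model, Hardy-Weinberg proportions, a standardized trait, and the conventional genome-wide threshold of 5 × 10⁻⁸, and it uses a normal approximation to the 1-df association test. The function names and the effect size in the usage note are hypothetical.

```python
from statistics import NormalDist


def variance_explained(maf, beta_sd):
    """Fraction of trait variance explained by one additive SNP.

    maf is the minor allele frequency; beta_sd is the per-allele effect
    in units of the trait's standard deviation (assumed, not estimated).
    """
    return 2.0 * maf * (1.0 - maf) * beta_sd ** 2


def gwa_power(n, maf, beta_sd, alpha=5e-8):
    """Approximate power of a two-sided single-SNP test in n subjects."""
    q2 = variance_explained(maf, beta_sd)
    ncp = n * q2 / (1.0 - q2)            # non-centrality parameter
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)  # critical value at threshold alpha
    root = ncp ** 0.5
    return z.cdf(root - crit) + z.cdf(-root - crit)
```

For a hypothetical variant with a MAF of 2% and an effect of 0.1 SD per allele, `gwa_power(10_000, 0.02, 0.1)` is close to zero, whereas `gwa_power(100_000, 0.02, 0.1)` is substantial, which is the intuition behind pooling cohorts into mega-GWA samples.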
GWA studies in dyslipidemic cases. So far, GWA studies for lipids have examined the concentrations of HDL-C in study samples that were not ascertained for dyslipidemia. The early GWA studies were mainly based on participants ascertained for T2DM or other metabolic conditions ( 122,144,151 ), and the latest studies were predominantly meta-analyses of population-based cohorts ( 123,148 ). Using population-based cohorts has the advantage that the association results should not be obscured by disease process or medication ( 187 ). However, genetic variation might play a more pronounced role in the relevant disease population compared with the general population. Accordingly, it is likely that certain determinants of low HDL-C were missed by the GWA studies in normolipidemics and that some additional specific variants and genes will be identified in GWA studies of HDL-C cases and controls.
GWA studies in diverse populations. Furthermore, to date, nearly all GWA studies have been performed in samples of European ancestry. Non-Caucasian cohorts have been included in only one of the early (i.e., smaller-scale) GWA studies for lipids ( 152 ). It is likely that additional variants and loci will be identified in studies based on populations with different ethnicities and demographic histories, because allele frequencies and patterns of LD vary across populations ( 96 ). Hence, each population offers an opportunity to reveal novel susceptibility alleles as well as population-specific variants and environmental interac-

Applications of next-generation sequencing technologies for HDL-C genetics
Large-scale sequencing will now allow investigation of low frequency (1% < MAF < 5%) and rare (MAF < 1%) variants with high-to-moderate effects. The seminal work by Cohen et al. ( 124 ) established the contribution of these types of variants to HDL-C levels in the general population. However, these rare variants have previously been mostly missed, because they are not frequent enough to be captured by current GWA studies and their effects are not large enough to be detected by linkage studies. Over the past 4 years, rapid advances in sequencing technologies have replaced Sanger sequencing applications with newer methods that are collectively referred to as next-generation sequencing ( 204 ). Next-generation sequencing technologies can process millions of sequence reads in parallel, and they have thus reduced the cost and increased the throughput of genomic sequencing by more than 3 orders of magnitude ( 204 ). Next-generation sequencing is expected to revolutionize genetic research, allowing for systematic resequencing of entire genomes, all genes, or targeted regions in large study samples to identify the entire spectrum of variation (including structural variants) ( 205 ). Routine sequencing of complete genomes will require even cheaper technologies. However, first projects such as the 1000 Genomes Project and targeted resequencing of confirmed GWA regions are already ongoing ( 186,206 ).
The current limitations of next-generation sequencing technologies are that they produce short read lengths of 50-100 bp, and thus a high sequencing depth, typically 20- to 40-fold, is necessary to achieve complete coverage. Furthermore, the error rates of raw sequence data are higher than with Sanger sequencing. Although the overall error rate can be reduced by a high sequencing depth, the greater depth is more expensive ( 204 ).
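The relation between read length, number of reads, and sequencing depth can be sketched as follows. This is an idealized calculation, assuming reads fall uniformly at random on the genome (a Poisson model of coverage); real experiments deviate from it, and the genome size used in the usage note is a round approximation. All function names are hypothetical.

```python
import math


def mean_depth(n_reads, read_length_bp, genome_size_bp):
    """Average number of reads covering each base."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth, read_length_bp, genome_size_bp):
    """Number of reads required to reach a target average depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


def prob_site_covered(depth, min_reads=1):
    """Poisson probability that a site is covered by at least min_reads reads."""
    below = sum(
        math.exp(-depth) * depth ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - below
```

Under these assumptions, reaching 30-fold depth with 100-bp reads on a genome of roughly 3 × 10⁹ bp takes about 9 × 10⁸ reads (`reads_needed(30, 100, 3_000_000_000)`), which gives a sense of why greater depth translates directly into greater cost.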
Targeted resequencing of confirmed GWA regions. Deep resequencing of the associated genomic regions (i.e., fine-mapping) may pinpoint the causal genes by identifying coding mutations (e.g., nonsense mutations) ( 206 ). Furthermore, as susceptibility genes may harbor both common variants with weak effects and rare variants with large effects, such as the loci for ABCA1, APOA1, CETP, LCAT, LIPC, and LPL ( Fig. 2 ), sequencing the confirmed GWA regions will reveal the full spectrum of potential susceptibility variants and determine the entire genetic contribution (i.e., total variance explained) of the known loci.
Whole-genome resequencing studies. Previous resequencing studies of candidate genes suggest that individuals at the extremes of HDL-C levels are more likely to carry loss-of-function alleles ( 124,125 ). Thus, sequencing cases with high and low HDL-C levels will be important, especially in those regions with no prior association evidence. Because much larger study samples are required to determine associations with rare and low frequency variants than those needed for their discovery, and at present whole-genome

effective GWA studies in diverse populations by optimization of SNP-genotyping platforms and increasing the accuracy of imputation.

Structural variation and its potential role in HDL-C genetics
Structural variations in which large genomic sequences (1-100 kb) are deleted, duplicated, inserted, or inverted were previously considered rare and pathogenic. However, recently due to advances in detection methods, their presence has been established in the general population ( 195,196 ). Because these types of variations have been mostly missed by gene mapping approaches to date, it is probable that structural variations may account for some of the unexplained heritability of complex traits. Copy number variations (CNV), deletions and duplications, have gained most attention due to their generality and potential dosage effect and because they are easier to assay. Recent estimates report that about 40% of the known genes are overlapped by CNVs ( 197 ).
Similar to SNPs, it is likely that both common CNVs (>5%) with small effects and rare CNVs with large effects will play a role in HDL metabolism. For instance, cases of HDL deficiency due to rare alterations (deletions and inversions) in the APOA1/C3/A4/A5 gene cluster have been reported previously ( 198,199 ). More recent studies using improved detection methods have shown that about two-thirds of patients with heterozygous familial hypercholesterolemia with no mutations in LDLR have structural variations in the LDLR gene ( 200 ). These findings suggest that rare CNVs could also underlie the HDL deficiency cases with no known causative mutations. Likewise, a common CNV in the lipoprotein(a), kringle IV, has been shown to account for ~50% of the variation in plasma lipoprotein(a) levels ( 197 ). However, as of yet, CNVs (rare and common) have not been comprehensively explored in genetic studies for lipids, mainly because of the challenges in yielding accurate genotypes ('CNV calling') required for association analysis (as opposed to the lower accuracy needed for CNV discovery) ( 201 ). Furthermore, although genotyping platforms of SNPs and CNVs (hybrid arrays) have been developed ( 202 ), their coverage is unclear, as the catalog of CNVs remains incomplete. Nevertheless, their use should provide insights into the contribution of CNVs to serum HDL-C levels. A recent study for premature CAD failed to identify associations with common CNVs or to detect a greater burden of rare CNVs in CAD cases when compared with controls using a hybrid array ( 203 ). However, the rapid increase in the number of sequenced genomes (1000 Genomes Project) and the progress in CNV detection methods are expected to generate high-resolution maps of CNVs and thus to improve the current genotyping products. Furthermore, as CNVs mutate at a lower rate than other structural variations, many CNVs are in LD with nearby SNPs and can thus be tagged by them ( 201 ).
Hence, a careful selection of tag SNPs may circumvent the limitations of genotyping CNVs.

fected relatives are more likely to share the interacting loci (gene-gene interactions are discussed below). Family data also allow investigation of the mode of inheritance, haplotypes, parent-of-origin effects, and nonclassical inheritance patterns (such as CNVs).
Candidate-gene resequencing studies have demonstrated that rare mutations in known HDL metabolism genes are collectively frequent among low HDL-C cases (~12-16%) ( 31,124,125 ). Thus, to reduce genetic heterogeneity, one of the main challenges in linkage analysis for complex traits, resequencing should be utilized to identify known predisposing variants in probands of family samples. Exclusion of families with known mutations from the analysis (or accounting for them in the model) could thus enhance the use of linkage for the polygenic HDL-C trait.

Gene by gene and gene by lifestyle interactions
The current estimates of variance explained are based on the simplified assumption that variants interact in an additive manner, although gene by gene and gene by lifestyle interactions are likely to have significant roles in lipid metabolism. Given our limited power and understanding of interaction models, they have not been thoroughly explored. Yet genetic effects may in fact be larger and more easily detected in the context of other exposures and/or combinations of variants. These types of interactions have previously been shown to produce clinically relevant effects on serum HDL-C levels ( 215 ); for instance, a significant gene by gene interaction between common ABCG5/G8 (rs3806471) and ABCA1 (rs4149272) variants on HDL-C concentrations has been demonstrated ( 216 ). The authors of this study suggested that the upregulation of the ABCG5/G8 genes may depend on the ABCA1 expression as a mechanism for this interaction effect. Furthermore, the observed interaction may help explain the inconsistent association between ABCG5/G8 and HDL-C ( 114 ). A significant gene by gene interaction between common APOE (E2/E3/E4) and CETP (rs708272) polymorphisms on plasma HDL-C levels has also been observed ( 217 ). Similarly, this (disordinal) interaction effect may account for the controversial association results with APOE polymorphism and HDL-C levels ( 97 ). Yet, thus far, the impact of gene-gene interaction studies has been minimal, because the effects have been small and difficult to replicate. The difficulties in replicating the results are most probably due to insufficient sample sizes. Larger study samples are required to test for gene-gene interactions, because the number of observations (subjects) becomes too small relative to the number of predictors (multi-locus genotype combinations) ( 218 ). Numerous experiments have shown that complex biological systems are more sensitive to combined perturbations, reviewed by Lehár et al. ( 219 ).
For instance, many 'synthetic lethal' interactions are observed even when both single deletions have no effect on the system ( 220 ). Thus, gene-gene interactions are likely to have a significant role in lipid metabolism, and they should be comprehensively investigated in extensive enough (>20,000) study samples.
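The sample-size argument for interaction studies can be illustrated with simple counting. The sketch below assumes biallelic loci in Hardy-Weinberg proportions that are independent of one another; the function names and the allele frequencies in the usage note are hypothetical.

```python
def genotype_combinations(n_loci):
    """Number of multi-locus genotype classes for biallelic SNPs (3 per locus)."""
    return 3 ** n_loci


def expected_rarest_cell(n_subjects, mafs):
    """Expected number of subjects homozygous for the minor allele at every locus.

    Assumes Hardy-Weinberg proportions and independence between loci.
    """
    probability = 1.0
    for maf in mafs:
        probability *= maf ** 2
    return n_subjects * probability
```

Two SNPs already define 9 genotype classes and three SNPs define 27. For two hypothetical SNPs that each have a MAF of 20%, even a sample of 20,000 subjects is expected to contain only about 32 double minor-allele homozygotes, so estimates for the rarest genotype combinations rest on very few observations.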
resequencing of large numbers of samples is very expensive, the initial studies are focusing on coding regions (exome sequencing) ( 207 ). Highly parallel methods to capture thousands of exons for resequencing have recently been developed ( 208,209 ).
However, resequencing of all exons and the adjacent promoter region is still incomplete. The pilot phase of the Encyclopedia of DNA Elements project revealed that coding loci are more transcriptionally complex than previously thought and that the simple view of defined exons does not appear to be accurate ( 210 ). These data also suggest that many susceptibility variants might affect the complex regulatory processes. For example, 10-15% of the top hits in recent GWA studies have shown strong cis associations with transcript levels of a nearby gene (cis-eQTL are discussed below) ( 123,211 ). These findings suggest that sequencing efforts should be extended beyond the annotated regions for a more complete range of potential susceptibility variants.
In the case of rare variants, each variant may be too rare to provide statistical evidence, and variants within a gene must be aggregated into classes of similar functional consequences (collapsing methods) ( 212 ). Variants in the same gene can cause different phenotypic effects (e.g., APOA1); thus, the main challenge currently is to define the functional annotation of variants, especially with variants in noncoding, promoter, or conserved elements. In contrast to GWA studies (i.e., common variants) that rely almost exclusively on statistical support, emerging evidence from other aspects of biology will become crucial for a proper analysis of rare variants. Furthermore, technical misclassifications can also substantially decrease the power of these collapsing methods, and thus an extremely high sequencing accuracy is required. When family members of the carriers are available, they may assist in confirming and classifying which variants are potentially functional or neutral (discussed in the next section). Importantly, sequencing cases with a family history of a disease should enhance the probability of identifying causal variants. The large data of the 1000 Genomes Project ( 186 ) could also facilitate distinguishing between true and incidental (i.e., by chance) variants, as well as help discriminate between true and ancestry-specific variants (i.e., population substructure).
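One simple form of collapsing can be sketched as follows: each person is scored as a carrier if they hold at least one rare allele in the gene, and carrier counts are then compared between the two phenotype groups. This is a minimal illustration with a one-sided Fisher exact test, not the specific method of the cited work; the function names and data layout are hypothetical.

```python
from math import comb


def collapse_carriers(genotypes_by_person):
    """Count people carrying at least one rare allele in a gene.

    genotypes_by_person is a list with one entry per person; each entry
    is a list of rare-allele counts (0, 1, or 2) across the gene's variants.
    """
    return sum(
        1 for person in genotypes_by_person
        if any(count > 0 for count in person)
    )


def fisher_one_sided(carriers_cases, n_cases, carriers_controls, n_controls):
    """P-value for observing at least carriers_cases carriers among cases.

    Uses the hypergeometric null in which carrier status is unrelated
    to case or control status.
    """
    total = n_cases + n_controls
    carriers = carriers_cases + carriers_controls
    upper = min(carriers, n_cases)
    tail = sum(
        comb(carriers, k) * comb(total - carriers, n_cases - k)
        for k in range(carriers_cases, upper + 1)
    )
    return tail / comb(total, n_cases)
```

Because a single sequencing error can turn a non-carrier into a carrier, misclassification directly changes the counts entering this test, which is one way to see why the power of collapsing methods depends on high sequencing accuracy.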

Family studies in investigation of private variants
Based on exome sequencing data, approximately 1,600-2,000 rare nonsynonymous variants are expected in an individual's genome ( 213 ). Furthermore, rare variants are more likely to affect protein function and cause distinct phenotypic effects than common variants ( 213 ). This suggests that private (i.e., rare) variants may play a larger role in polygenic low and high HDL-C levels than previously considered. Family studies can facilitate not only the detection of private variants but also their investigation ( 214 ), because the parental origin and cosegregation of the variant among the affected family members can be determined. Cosegregation analyses can also be useful in identifying epistatic loci and genetic modifiers, because af-

reflect aggregate properties and cannot capture the full range of HDL effects. Hence, it is likely that some genetic effects would only be detected, or may be more profound, with certain HDL subpopulations. Importantly, stronger signals with specialized phenotypes, such as HDL subpopulations, could suggest mechanistic hypotheses. Methodologies for quantifying HDL subpopulations (HDL2 and HDL3) and particle size (small, intermediate, and large) are only now becoming more accurate and available for larger study samples (such as NMR-based measurements). Two recent studies have examined specialized lipid phenotypes in thousands of subjects ( 123,229 ). One of these studies investigated the significant GWA loci with apoAI, HDL2, HDL3, and HDL size among other specialized phenotypes ( 123 ). In several cases, the authors identified stronger signals with a specialized HDL phenotype than with HDL-C levels. For example, variants in LCAT, LIPC, PLTP, and GALNT2 were more strongly associated with the concentration of large HDL. The association of SNPs in MVK-MMAB and ANGPTL4 was strongest with HDL3 particles, and that of a SNP in ABCA1 was strongest with apoAI levels.
The second study performed a GWA in 17,296 Caucasian women with 22 plasma lipoprotein measures, including apoAI and HDL particle size ( 229 ). The known loci, ABCA1, APOA1/C3/A4/A5, APOE/C1/C2/C4, CETP, FADS1/2/3, HNF4A, LIPC, LIPG, LPL, and PLTP, reached genome-wide significance with different HDL phenotypes ( Table 2 ). For example, the HNF4A variant was strongly associated with apoAI levels, suggesting that this variant may influence HDL-C levels through transcriptional regulation of APOA1. Importantly, six loci that were not implicated with measurements of HDL-C in the current or previous GWA studies were identified with various HDL particle sizes. These novel loci are within or near the genes APOA2, PCCB at 3q22.3, KLF14 at 7q32.2, WIPI1 at 17q24.2, CCDC92 at 12q24.31, and an intergenic region between ASCL and PAH ( Table 2 ). Furthermore, the effect size and proportion of variance explained by several of these genome-wide significant variants (~20 SNPs) were larger with specialized phenotypes than with HDL-C levels, suggesting that the effect of loci may be higher than currently estimated due to imperfect measures of HDL particle concentrations.
Furthermore, it may be important to assess the efficiency of RCT rather than merely determining the level of HDL, as the level does not necessarily directly reflect the antiatherogenic properties of HDL ( 230,231 ). Kiss et al. ( 31 ) demonstrated that a low cholesterol-efflux phenotype is common (33%) and heritable in low HDL-C cases (<10th percentile for age and sex) even in the absence of ABCA1 mutations. Hence, future studies assessing cholesterol efflux may reveal novel genes influencing the RCT pathway and thereby the susceptibility to CAD. Additionally, although the majority of lipid metabolism takes place in the postprandial state, our knowledge about genetic effects in this state is scant. At present, postprandial lipid levels are not generally available, because a calibrated meal is required (as well as certain diet and physical restrictions). Genetic effects may be more profound after fat intake, but

HDL-C levels are also modifiable through behavioral factors including diet, physical activity, alcohol intake, smoking, gender, and body size ( 87 ). Genetic variants may also contribute to inter-individual variability in HDL-C response to these factors ( 87 ). Numerous studies have examined this hypothesis by testing for interactions between known variants and lifestyle factors on HDL-C concentrations. For instance, an interaction between common SNPs in LIPG (rs6507931) and LIPC (rs2070895) with physical activity on HDL-C levels has been observed ( 221,222 ). The association with HDL-C levels of another common variant in LIPC (rs1800588), which is in almost perfect LD with rs2070895, was shown to depend on the amount and type of fat consumption in both Caucasians ( 223 ) and African-Americans ( 224 ), where this variant is much more prevalent (MAF of 20% vs. 50%, respectively) ( 194 ).
Similarly, the newly identified GWA genes, MVK and MMAB, have been recently shown to modulate plasma HDL-C levels depending on carbohydrate consumption ( 225 ), which may help elucidate their role in HDL metabolism. A gene-environment interaction was also observed between the TaqIB polymorphism in CETP (rs708272) and alcohol consumption ( 226 ).
There is a marked sexual dimorphism in HDL levels ( 26 ). Therefore, the possibility of an interaction between gender and genetic factors has been explored in several studies ( 227,228 ). Recently, a GWA study of 18,245 women from a population-based prospective cohort identified a novel locus for HDL-C at 2q24.3 near the genes COBLL1 and GRB14 ( 154 ). In a separate analysis of men (n = 501) and women (n = 168), a larger effect among women (7.3 vs. 1.7 mg/dl) and a significant genotype by gender interaction ( P -value = 0.002) were observed ( 154 ). This interaction may help explain why this locus was not found by other GWA studies that investigated both men and women ( 123,148 ).
Although these types of findings could be easily adopted to motivate individuals at increased genetic risk toward a healthier lifestyle, thus far they have not been valuable for clinical practice. Gene by lifestyle interaction studies bear the same complications as gene-gene interaction studies. Furthermore, environmental exposures, unlike genotypes, are difficult, if not impossible, to measure. For instance, all of the studies mentioned here used self-reported questionnaires to assess diet and/or physical activity. Moreover, the cross-sectional design of these studies cannot address the mechanism by which the environment and genotype interact, as this could only be determined by a controlled intervention ( 223 ). Nevertheless, these studies suggest that certain genetic effects may only be observed in the context of environmental exposures, and that much larger samples, comprehensive studies, and superior surrogates of environmental exposures, such as plasma and/or epigenetic markers, will be necessary for the identification of gene by lifestyle interactions to lead to practical implications.

Specialized HDL-C phenotypes
Because HDL is an extremely heterogeneous mixture of particles ( 8,10 ), conventional HDL-C measurements

gene expression is probably an important mechanism influencing complex traits.
In a recent GWA study, cis-eQTL analyses of HDL-C associated variants and liver transcript levels identified several potentially functional variants ( 123,235 ). For example, the expression of PLTP was associated with a nearby associated variant (rs7679). Similarly, a promoter variant in LIPC (rs10468017) was associated with LIPC gene expression. An implicated variant at the FADS1/2/3 gene cluster (rs102275) showed associations with transcript levels of both FADS1 and FADS3, but not FADS2, and a variant in the MMAB-MVK region (rs2058804) was associated with MMAB, but not with MVK, gene expression levels. Importantly, cis-eQTL analysis made it possible to determine the most likely functional gene, TTC39B, at the 9p22 locus, although the function of this gene is poorly understood.
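The cis-eQTL tests described here can be pictured as a regression of transcript level on genotype dosage (0, 1, or 2 copies of an allele). The following is a minimal sketch using ordinary least squares, without the covariates, normalization, or multiple-testing correction that a real analysis would need; the function name and the toy values in the usage note are hypothetical.

```python
def cis_eqtl_fit(dosages, expression):
    """Least-squares fit of expression on allele dosage.

    dosages: allele counts (0, 1, or 2), one per individual.
    expression: transcript levels for the same individuals.
    Returns (slope, intercept, r_squared).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype and expression must both vary")
    slope = sxy / sxx                    # change in expression per allele
    intercept = mean_y - slope * mean_x
    r_squared = sxy * sxy / (sxx * syy)  # share of expression variance explained
    return slope, intercept, r_squared
```

A negative slope, as in `cis_eqtl_fit([0, 1, 2, 0, 1, 2], [5.0, 4.0, 3.0, 5.0, 4.0, 3.0])`, would correspond to an allele associated with lower transcript levels, the pattern reported for the HDL-C raising allele and TTC39B expression in liver.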
Gene expression analyses can also reveal novel HDL susceptibility genes. Göring et al. ( 236 ) identified 67 cis-regulated transcripts in lymphocytes that correlated with plasma HDL-C levels. Vanin1 (VNN1), the most promising gene, produces cysteamine, a potent antioxidant that prevents lipid peroxidation (oxidative degradation). Importantly, promoter variants associated with transcript levels of VNN1 were also associated with plasma HDL-C levels.
Although the role of VNN1 in HDL biology remains to be established, this finding demonstrates that gene expression methods are an important addition to disease gene mapping.

Systems biology approaches for the discovery of HDL genes.
Systems biology approaches employ biological analyses that look beyond individual genes or proteins by examining multiple elements of a system, or the system as a whole ( 234 ). Because genetic and environmental factors influence

only a small number of studies have measured postprandial lipids and typically in a very small sample set (~100 individuals) ( 232 ).

Evidence from other aspects of biology in the study of HDL genetics
Emerging evidence from other aspects of biology, such as gene expression analyses and in vitro and in vivo experiments, is crucial for establishing the actual causal genes regulating HDL-C levels and the functional consequences of the implicated variants. Furthermore, simplified models, such as main effects, may fail to capture the complex inheritance of HDL regulation ( 233 ). Notably, HDL-C levels are strongly associated with a cluster of inherited factors that contribute to CAD (i.e., component traits of the metabolic syndrome), which should also be taken into account to improve existing models. Hence, alternative models and strategies that incorporate refined phenotypes and multiple lines of evidence, such as systems-based approaches, will help better address the complexity of HDL metabolism ( 234 ).
Gene expression analyses. Gene expression levels are directly modified by polymorphisms in regulatory elements and can therefore be mapped as a quantitative trait locus (eQTL) ( 211 ). eQTL analysis can be utilized to characterize the underlying gene(s) and the function of implicated variants, because it can provide a link between the disease-associated marker and the expression of a specific gene. In practical terms, eQTL analysis evaluates the effects of DNA variants on a measurement of gene expression in a relevant cell or tissue from multiple individuals. Thus far, 10-15% of the top hits in recent GWA studies have shown strong cis associations with transcript levels of a nearby gene (cis-eQTL) ( 211 ), suggesting that variation in

Fig. 3. State-of-the-art and future strategies for identifying gene variations that contribute to HDL-C levels, making the most of existing and future studies.

structural, rare, and low-frequency variants. Furthermore, an unbiased search using whole-genome sequencing of unexplained cases of extremely low and high HDL-C will undoubtedly reveal new factors that regulate HDL-C levels and hopefully provide novel and safe pharmaceutical targets. Ultimately, with the full collection of causal variants, it may be possible to assess and predict the relation to CAD. However, the main challenge associated with these new approaches lies in the interpretation of the variants. Through rigorous replication, large samples, and systematic efforts to obtain multiple lines of evidence, a more comprehensive understanding of the genetics of HDL metabolism is sure to emerge.

phenotypes indirectly by perturbing biological pathways (networks), analyzing individual components of a system may not be sufficient. Furthermore, although alterations in individual genes may be subtle, coordinated changes across multiple members of a network could be detected more easily, and thus they can highlight patterns associated with disease ( 237 ).
Systems biology (also known as network analysis) uses technologies that can simultaneously interrogate many elements of the system ( 234 ). For example, genotyping, gene expression, and chromatin remodeling (epigenetic) arrays, as well as high-throughput mass spectrometry, in combination with computational and statistical tools, can be utilized to reconstruct networks underlying complex traits, such as plasma HDL-C levels ( 238 ). A key concept in systems biology is integrating complementary data sets. A particularly powerful analysis for complex traits involves the integration of common DNA variation, global expression, and clinical phenotypes ( 238 ). Currently, this type of analysis is mainly applied to model systems, such as mice, but recent analyses of liver and adipose biopsies clearly demonstrate that these network approaches are applicable to complex traits in humans ( 235,239,240 ). Furthermore, by integrating cross-species network data of humans and mice, Schadt et al. ( 235 ) were able to validate novel GWA susceptibility genes for LDL-C levels: PSRC1, CELSR2, and SORT1. Importantly, these data also elucidate the molecular networks in which these genes operate and why changes in the networks may lead to changes in lipids and disease. Such findings can then be used to formulate hypotheses, tested through manipulation of gene expression levels using cell-based RNAi or overexpression technologies, and ultimately validated in human clinical studies. Although further integration of several additional aspects of lipoprotein biology, such as protein levels (proteome), metabolites (metabolome), chromatin changes (epigenetics), and environmental interactions, remains a challenge, network analyses are likely to lead to important novel insights in the next few years.

CONCLUSIONS
Our understanding of the allelic architecture and genetic basis of low and high levels of serum HDL-C has rapidly progressed due to recent advances in high-throughput technologies ( Fig. 2 ). The modest genetic effects that are typically identified agree with a polygenic inheritance and suggest that future studies will require considerable efforts and integrative, alternative approaches to reveal additional genetic determinants ( Fig. 3 ). Large GWA studies (meta- and mega-GWA) will probably remain an efficient approach for investigating the remaining heritability. However, future studies should focus on diverse populations, dyslipidemic and family study samples, as well as refined, specialized phenotypes. Genome-wide searches should also integrate environmental modifiers, such as diet, smoking, obesity, and alcohol, into the analyses. Importantly, the signals from the GWA studies could suggest genomic regions for other types of variants, such as