Identification of CAD candidate genes in GWAS loci and their expression in vascular cells.

Recent genome-wide association studies (GWAS) have identified 35 loci that significantly associate with coronary artery disease (CAD) susceptibility. The majority of the genes represented in these loci have not previously been studied in the context of atherosclerosis. To characterize the roles of these candidate genes in the vessel wall, we determined their expression levels in endothelial, smooth muscle, and macrophage cells isolated from healthy, prelesioned, and lesioned mouse aortas. We also performed expression quantitative trait locus (eQTL) mapping of these genes in human endothelial cells under control and proatherogenic conditions. Of the 57 genes studied, 31 were differentially expressed in one or more cell types in the disease state in mice, and the expression levels of 8 were significantly associated with the CAD SNPs in human cells, 7 of which were also differentially expressed in mice. By integrating human and mouse results, we predict that PPAP2B, GALNT4, MAPKAPK5, TCTN1, SRR, SNF8, and ICAM1 play a causal role in the susceptibility to atherosclerosis through a role in the vasculature. Additionally, we highlight the genetic complexity of a subset of CAD loci through the differential expression of multiple candidate genes per locus and the involvement of genes that lie outside linkage disequilibrium blocks.


Microscopy/Oil Red O staining
Dissected aortas were frozen in OCT compound. Sections of the aortic arch were fixed onto Superfrost slides and stained with Oil Red O dye. Microscopy photos were taken at 20× magnification.

Mouse endothelial cell isolation
The cell isolation protocol was adapted from a previously published method (11) (supplementary Fig. I). After a mouse was euthanized, the chest cavity was opened; the lungs, trachea, and esophagus were removed; and the aortic arch was excised under a dissecting microscope. Care was taken to maintain consistency in dissection of the arch region. No perfusion of the vasculature was performed. After rinsing in cold PBS, the vessel was placed on a glass slide, the surrounding connective tissue was removed, and the aorta was opened en face. To visualize the endothelial layer, the opened aorta was stained with 30 μl hematoxylin for 3 min. The stain was rinsed off with cold PBS. The collagenase liberase blendzyme 2 (Roche) was diluted 1:100 with PBS, and 25 μl was added to the top of the aorta and incubated at 37°C for 8 min. The slide with the collagenase-treated aorta was then placed under a dissecting microscope, and the ECs were gently pried off using a 26-gauge needle. This process continued until all ECs were removed, as determined by the lack of hematoxylin-dyed nuclei on the surface of the sample. The liquid containing the ECs was then pipetted with a thin pipet tip into RNA extraction buffer.

Ldlr−/− mice at 16 weeks of age on either chow or Western diet were injected intraperitoneally with 4% thioglycollate (Brewer Thioglycollate Medium, BD#211716). On the fourth day after injection, macrophages were collected through lavage of the peritoneal cavity using PBS buffer. Collected cells were treated with ACK lysis buffer to remove red blood cells and were then cultured overnight in DMEM medium containing 20% FBS. The next morning the cell culture dishes were rinsed with PBS, and RNA from adherent cells was collected using buffer RLT from the Qiagen RNeasy kit.

Human cell culture and treatment
Human aortic endothelial cells (HAEC) were isolated from aortic explants of 147 heart transplant donors in the UCLA transplant program and grown to confluence in 100 mm dishes as described previously (12). All protocols involving humans were approved by the UCLA Institutional Review Board. Cell purity was 95% as indicated by positive staining for platelet endothelial cell adhesion molecule 1 as well as factor 8 and by acetylated LDL uptake assay. At 100% confluence, cells were treated in duplicate with either M199 media (Mediatech, Manassas, VA) containing 1% FBS (HyClone; Thermo Scientific, Logan, UT) or additionally with 40 μg/ml oxPAPC for 4 h.

Total RNA extraction
Total RNA from mouse aortic endothelial cells (MAEC) was extracted using the Ambion RNAqueous®-Micro Kit following the manufacturer's instructions. The buffer and collagenase solution containing scraped ECs was pipetted directly into 100 μl of Lysis buffer in a 200 μl PCR tube, vortexed, and then incubated at 42°C for 30 min prior to isolation. Total RNA from mouse SMCs was extracted from aortas with the intimal layer removed using the procedure described above. The tissue was homogenized in Qiazol for 30 s, and RNA was isolated with the Qiagen RNeasy kit. The complete removal of cells present in the intimal layer was confirmed by the lack of cells stained with hematoxylin. Total RNA
During atherogenesis, low density lipoproteins (LDL) accumulate in the intimal region of the vessel wall and over time become oxidized and otherwise modified to generate proinflammatory species capable of activating ECs and inducing the surface expression of monocyte adhesion molecules (10). Certain oxidized phospholipid species in the modified LDL, such as oxidized 1-palmitoyl-2-arachidonyl-sn-glycero-3-phosphocholine (oxPAPC), are likely to contribute to the inflammatory response (10). This leads to the recruitment of blood monocytes and lymphocytes to the vessel wall. The monocytes differentiate to macrophages and then take up the modified lipoproteins to give rise to the cholesterol-loaded "foam cells." In response to inflammation, vascular smooth muscle cells (SMC) migrate from the media into the intima, forming a "fibrous cap." The intima of vessels with advanced aortic lesions is a heterogeneous collection consisting predominantly of ECs, macrophages, and SMCs. We therefore hypothesized that some of the CAD candidate genes would be involved with the disease process in these cell types present in the vascular wall.
In this study, we report our results from a two-pronged approach to understanding the contribution of candidate CAD GWAS genes to disease pathogenesis. First, we utilized a mouse model to examine the expression of these genes with disease progression in the three major cell types found in atherosclerotic lesions. Each candidate gene is characterized by its transcription in vascular cells in different stages of atherosclerosis. Second, we used an expression quantitative trait locus (eQTL) mapping approach to understand how the CAD risk variants correlate with gene expression in ECs from healthy aortas of 147 human donors under control and proatherogenic conditions. Together, these data provide firm directions for predicting the causal candidate genes in CAD risk loci and elucidating their roles in the disease pathology.

Animals
Unless otherwise indicated, C57BL/6J (BL6) male mice fed a chow diet were used for experiments. BL6 mice were obtained from the Jackson Laboratory or bred in our colony; BL6 Apoe tmUnc mice (Apoe−/−) were maintained in a colony in the University of California Los Angeles (UCLA) vivarium. Mice at both 4 and 24 weeks of age were used for the study; BL6 wild-type (wt) mice were used in samples representing healthy vascular cells, and BL6 Apoe−/− mice were used for prelesioned vascular cells. The aortas from BL6 Apoe−/− mice fed a chow diet at 4 weeks of age were considered prelesioned, and BL6 Apoe−/− mice at 24 weeks of age exhibited clear atherosclerosis. For macrophage studies, Ldlr−/− mice on a BALB/cJ background were either maintained on a chow diet for isolation of macrophages or placed on a Western diet (Open Source D12079B) at 8 weeks of age for 16 weeks for isolation of lipid-loaded macrophages, commonly referred to as foam cells. Cells were collected from four to six mice per condition. Mice were euthanized using isoflurane in accordance with UCLA Animal Resource Committee policies. All protocols involving mice were approved by UCLA Institutional Review Board.
SNPs were excluded because they failed the Hardy-Weinberg equilibrium test ($P < 10^{-4}$). Principal component analysis with 11 HapMap3 populations indicated the presence of multiple ethnicities among the HAEC donors (supplementary Fig. II); therefore, a mixed-model approach to account for the population structure as implemented in the EMMAX program was used to perform the association analysis (16).
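The Hardy-Weinberg exclusion can be illustrated with a minimal Python sketch. The text gives only the threshold ($P < 10^{-4}$), not the test variant, so the 1-degree-of-freedom chi-square test used here is an assumption, and the function names `hwe_chi2_pvalue` and `passes_hwe` are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used; the source does not say
    which HWE test variant was actually applied.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotype calls")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe(n_aa, n_ab, n_bb, threshold=1e-4):
    """Keep a SNP unless its HWE P-value falls below the threshold."""
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= threshold
```

For example, genotype counts of 49, 42, and 9 match the Hardy-Weinberg expectation for an allele frequency of 0.7 and pass, whereas 60, 0, and 40 (no heterozygotes) fail.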
We applied the linear mixed model $y = \mu + x\beta + u + e$, where $\mu$ = mean, $x$ = SNP, $\beta$ = SNP effect, and $u$ = random effects due to genetic relatedness, with $\mathrm{Var}(u) = \sigma_g^2 K$ and $\mathrm{Var}(e) = \sigma_e^2$, where $K$ = IBS (identity-by-state) matrix across all genotypes in the panel. We computed a restricted maximum likelihood estimate for $\sigma_g^2$ and $\sigma_e^2$, and we performed association based on the estimated variance component with an F test of whether $\beta$ differs from 0. Expression values of the 18,630 probes in control and oxPAPC-treated conditions were considered as quantitative traits, and the association analysis of SNPs was carried out using the 574,391 informative SNPs that passed the filtering criteria. In the donor population, 108 males and 39 females were present; therefore, sex was considered as a covariate in the mixed model for association analysis. For cis (local) associations, we considered the SNPs within 1 Mb on either side of the probe locations. We considered the remaining SNPs for trans (distal) associations. For cis and trans associations, 35,186,928 and 10,665,717,400 P-values, respectively, were calculated. FDRs were calculated using the Benjamini and Hochberg (17) approach for the cis and trans association P-values in the control and oxPAPC datasets separately. P-value cut-offs of $1.16 \times 10^{-4}$ and $1.12 \times 10^{-4}$, which corresponded to 5% FDR, were used for the control and oxPAPC datasets, respectively, for cis associations. P-value cut-offs of $1.90 \times 10^{-8}$ and $1.87 \times 10^{-8}$, which corresponded to 5% FDR, were used for the control and oxPAPC datasets, respectively, for trans associations. The results of the association analysis are available for user-friendly querying at systems.genetics.ucla.edu. This resource can be used to determine how EC gene expression is regulated by additional CAD-associated SNPs that may be discovered as a result of future GWAS and meta-analyses, as well as by SNPs associated with diseases whose pathology may involve the vascular endothelium.
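The cis/trans split and the derivation of a P-value cut-off at a given FDR can be sketched as follows. This is an illustrative Python sketch, not the pipeline used in the study: the function names, the coordinate arguments, and the toy P-values in the usage note are assumptions; only the 1 Mb window and the Benjamini-Hochberg step-up rule come from the text.

```python
def is_cis(snp_chrom, snp_pos, probe_chrom, probe_start, probe_end,
           window=1_000_000):
    """True if the SNP lies within `window` bp on either side of the probe."""
    if snp_chrom != probe_chrom:
        return False
    return probe_start - window <= snp_pos <= probe_end + window


def benjamini_hochberg_cutoff(pvalues, fdr=0.05):
    """Largest P-value that is still significant at the requested FDR.

    Step-up rule: sort the m P-values, find the largest rank k with
    p_(k) <= (k / m) * fdr, and return p_(k). Returns None when no
    test passes.
    """
    m = len(pvalues)
    cutoff = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * fdr:
            cutoff = p
    return cutoff
```

With the ten toy P-values 0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, and 0.216, the cut-off at 5% FDR is 0.008. Applying the same rule separately to the cis and trans P-values of each dataset is what gives each combination its own cut-off.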

A majority of GWAS candidate genes are expressed in at least one of the cell types present in mouse atherosclerotic lesions
We determined the transcriptional profiles of ECs, SMCs, and macrophage/foam cells representing the three predominant cell types present in atherosclerotic lesions from healthy and diseased arteries, as well as lesions that represent a mixture of the three cell types (Fig. 1). While BL6 wt mice do not develop atherosclerosis on a chow diet at 4 or 24 weeks, BL6 Apoe−/− mice fed the same diet are hyperlipidemic and their arteries are "prelesioned" at 4 weeks and have atherosclerotic lesions at 24 weeks (supplementary Fig. I). Therefore, in the context of CAD, we considered vascular cells taken from BL6 wt mice to be "healthy" and those taken from BL6 Apoe−/− mice to be "diseased." As BL6 Apoe−/− mice have not yet accumulated lipid deposits or macrophages at 4 weeks of age, the intimal cells taken from these mice are prelesioned endothelial cells, compared with healthy endothelial cells taken from
from macrophage and HAECs was isolated using the Qiagen RNeasy Kit. RNA quantity and quality were checked by the Nanodrop and the Agilent 2100 Bioanalyzer, respectively. Only RNA samples with RNA integrity numbers (RIN) of 7.0 or higher were used for subsequent processing. For mouse studies, RNA expression profiles from four to six mice per condition, and for human studies, RNA expression profiles from four samples per donor (two control-treated and two oxPAPC-treated) meeting quality control conditions, were analyzed.

mRNA amplification
Since the total amount of RNA obtained from MAECs was around 100 picograms, insufficient for global expression analyses, we amplified the RNA using the NuGEN WT-Ovation One-Direct RNA amplification system. The procedure followed the manufacturer's protocol; preamplification work was conducted in a sterile hood treated with RNase ZAP and DNA-OFF and using laboratory equipment exclusive to preamplification to eliminate contamination.

cDNA synthesis and RT-qPCR
As the output of the NuGEN WT-Ovation One-Direct RNA amplification system is cDNA, no separate cDNA synthesis was required for amplified RNA for MAECs. Macrophage and SMC cDNA was synthesized with the Applied Biosystems High-Capacity cDNA Reverse Transcription Kit. Roche and KAPA Biosystems SYBR green reagents were used for RT-qPCR, and reactions were processed on 384 well plates on the Roche LightCycler 480. Of the 55 genes we determined to be represented by published GWAS loci (8), mouse gene homologs were identified for 52 genes and primers were designed for the mouse genes. Sequences for RT-qPCR primers are presented in supplementary Table I. Mouse gene 9530008L14Rik is the homolog of human gene C6orf105, and mouse gene Zfp259 is the homolog of human gene ZNF259. All other mouse genes share the names of their human homologs. Genes with Cp values less than 35 were considered to be expressed. Final expression values were calculated by normalizing the raw values for each transcript by the appropriate housekeeping gene (B2M for MAECs, lesions, and SMCs; 36B4 for macrophages), averaging the technical replicates, and then averaging these values for the four to six mice per condition.
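The normalization and averaging steps described above can be sketched in Python. The text does not give the normalization formula, so the relative-quantity form $2^{Cp_{ref} - Cp_{target}}$ (which assumes 100% amplification efficiency) is an assumption, as are the function names and the input layout.

```python
from statistics import mean

CP_EXPRESSED_MAX = 35.0  # from the text: Cp values less than 35 count as expressed


def is_expressed(cp):
    """Expression call based on the crossing-point (Cp) threshold."""
    return cp < CP_EXPRESSED_MAX


def relative_expression(cp_target, cp_reference):
    """Target abundance relative to the housekeeping gene.

    Assumption: 2 ** (Cp_reference - Cp_target), i.e. perfect doubling
    per cycle; the source does not state the formula used.
    """
    return 2.0 ** (cp_reference - cp_target)


def condition_mean_expression(replicates_by_mouse):
    """Average technical replicates per mouse, then average across mice.

    `replicates_by_mouse` maps a mouse ID to a list of
    (cp_target, cp_reference) pairs, one pair per technical replicate.
    """
    per_mouse = [
        mean(relative_expression(t, r) for t, r in reps)
        for reps in replicates_by_mouse.values()
    ]
    return mean(per_mouse)
```

Averaging within each mouse first keeps a mouse with more technical replicates from carrying extra weight in the condition mean.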

Gene expression profiling and analysis
Gene expression profiles of the human cells were determined using the Affymetrix HT HG-U133A microarray, which contains 18,630 probes. Intensity values were normalized with the robust multiarray average normalization method implemented in the affy package in Bioconductor (13). Expression data are available in Gene Expression Omnibus accession GSE30169. Differential expression of mouse genes was assessed using the Mann-Whitney U test and corrected for multiple comparisons using the qvalue package in R (14). Multivariate analysis across cell types was assessed by Hotelling's $T^2$ test. Genes with a false discovery rate (FDR) less than 10% were considered differentially expressed.
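For groups of four to six mice, a Mann-Whitney U test can be computed exactly by enumerating all group assignments. The Python sketch below illustrates the test itself. The source does not say whether an exact or asymptotic P-value was used, the multiple-testing step with the qvalue package is not reproduced, and the function names are hypothetical.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: count of pairs with x_i > y_j, ties counted as 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )


def exact_two_sided_p(x, y):
    """Exact two-sided P-value by enumerating every relabeling of the pool.

    Feasible for small groups only: 6 vs 6 samples gives 924 relabelings.
    """
    pooled = list(x) + list(y)
    n_x, n = len(x), len(pooled)
    center = n_x * len(y) / 2.0  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - center)
    hits = total = 0
    for idx in combinations(range(n), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mann_whitney_u(group_x, group_y) - center) >= observed - 1e-12:
            hits += 1
    return hits / total
```

With four mice per group, complete separation of the two groups gives the smallest attainable two-sided P-value, 2/70 (about 0.029), which shows how group sizes of four to six limit the resolution of per-gene tests.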

Genotyping and association analysis
HAEC genomic DNA was isolated using the Qiagen DNeasy kit. SNP genotyping was performed using the Affymetrix SNP 6.0 microarray platform as described previously (15). Microarray images were processed using the Affymetrix Genotyping Console 4.1 to make the SNP calls. SNPs were filtered out based on the following criteria: 296,063 SNPs with minor allele frequency less than 10%; 18,030 SNPs with less than 95% genotyping rate among the donors; 13,213 SNPs were set to missing because they were detected as heterozygous in haploid genotypes; and 3,890

Differential expression of GWAS candidate genes in disease conditions in mice
To determine whether the candidate genes were differentially expressed in diseased cells, we calculated the change in gene expression in the three cell types during lesion development using the healthy condition as the baseline (Fig. 1). Thirty-one of the 46 GWAS candidate genes that are expressed in the vessel wall showed differential expression between the diseased and healthy states in at least one of the comparisons (Fig. 3, supplementary Table II). A majority of these genes are upregulated in diseased cells, with only 9 of them showing downregulation compared with healthy cells. None of the 31 candidate genes are differentially expressed in prelesioned medial SMCs, although 4 candidate genes showed differential expression in SMCs taken from atherosclerotic lesions in mice at 24 weeks of age. Twelve candidate genes showed differential expression in only one cell type: Adamts7, Celsr2, and Sort1 in foam cells; Cyp17a1, Col4a1, Gip, Snf8, Srr, Tcf21, Tctn1, and 9530008L14Rik in prelesioned MAECs; and Pdgfd in the lesioned intima exclusively.
GWAS candidate genes with differential expression from healthy to disease state in more than one cell type
Galnt4 exhibited statistically significant differential expression in all four cell preparations, while seven genes showed dysregulation in the diseased condition in three of the cell types tested (Fig. 3, supplementary Table II). Col4a2, Mia3, Pcsk9, and Ppap2b were upregulated in diseased MAECs, foam cells, and atherosclerotic lesions. Cxcl12, Icam1, and Psrc1 were upregulated in diseased MAECs and medial cells, but they were downregulated in their wt counterparts.
The intimal samples taken from diseased mice at 24 weeks of age represent the heterogeneous cell composition present in advanced atherosclerotic lesions. Medial SMCs were collected from both healthy and diseased aortas at both prelesioned and lesioned stages. This set of atherosclerotic lesion cell types was completed by assaying the transcriptional profile of macrophages compared with lipid-loaded foam cells resembling those found in lesions. We used Ldlr−/− mice fed a chow diet or Western diet for these experiments, as this background is required for peritoneal macrophages to form foam cells (18).
The 35 published GWAS loci for CAD represent 55 candidate genes reported in the original studies (3). We utilized RT-qPCR to assess the presence of the mouse homologs for 51 of these transcripts among the three sets of mouse cell types as well as among advanced lesions. Although 46 of the GWAS candidate genes were expressed in at least one of the cell preparations, there were 5 transcripts whose expression was below the detection threshold in all samples. We did not detect the expression of Abcg8, Abo, Apoa4, Apoa5, or Kcne, suggesting that the involvement of these genes in the pathogenesis of atherosclerosis is not in the vascular wall (Fig. 2). We did not identify any transcripts that were undetectable under the healthy condition of a cell type and detectable in the diseased condition, or vice versa. Thirty-six of the candidate genes were expressed in all three cell types. Pdgfd, Phactr1, and Slc22a3 were only expressed in MAECs, SMCs, and lesions, while Cyp17a1, Gip, Il5, and Morf4l1 were only expressed in MAECs and advanced lesions.
to determine whether these 7 SNPs had significant associations with nearby genes in cells other than HAECs, we queried the Phenotype-Genotype Integrator (ncbi.nlm.nih.gov/gap/PheGenI), the University of Chicago eQTL database (eqtl.uchicago.edu), and the Genevar database (19) by looking up the associations of the 7 SNPs with nearby gene expression from various eQTL studies. In 4 loci, cis eQTLs were observed in monocytes obtained from 1,490 humans (20), and in adipose and skin tissues as well as lymphoblastoid cell lines obtained from 856 humans (21) (Table 1). We observed that in four of the seven loci, the significantly associated gene was outside of the linkage disequilibrium (LD) block.
For example, in the 7q32.2 locus, SNP rs11556924 encodes a missense mutation in the ZC3HC1 gene; however, we found that the SNP was associated with the expression level of the nearby KLHDC10 gene, which also had a significant cis association in monocytes with the same SNP (20). The risk allele (C) was associated with higher expression of KLHDC10 (Fig. 4). In the 12q24.12 locus, SNP rs3184504 encodes a missense mutation in the SH2B3 gene; however, in our dataset, its proxy SNP rs653178 was associated with the transcript levels of two nearby genes, MAPKAPK5 and TCTN1. The risk allele (T) was associated with lower expression of MAPKAPK5 but higher expression of TCTN1 (supplementary Figs. IV and V). The association between SNP rs653178 and MAPKAPK5 was significant in control cells but not in oxPAPC-treated cells, suggesting a differential role for this gene in various disease stages. In the 12q21.33 locus, SNP rs7136259, an intergenic SNP located between ATP2B1 and MRPLP2P1, was associated with the expression level of the nearby GALNT4 gene, and this association was replicated in monocytes (20). The risk allele (T) was associated with lower expression of GALNT4 (supplementary Fig. VI). In the 19p13.2 locus, the risk allele (G) of the SNP rs1122608, located in an intron of the SMARCA4 gene, was associated with lower levels of the nearby ICAM1 transcript (supplementary Fig. VII).
In three of the seven loci, the significantly associated gene was within the same LD block as the variant, in some
Phactr1, Smg6, Ube2z, and Zfp259. Hierarchical clustering of the genes based on the level of dysregulation identified two distinct clusters: one in which CAD genes are upregulated in diseased ECs and lesion cells, and another in which they are downregulated in macrophage/foam cells (Fig. 3). To identify differential expression from healthy to diseased mice across all four cell types as a whole, we applied multivariate statistical testing for each gene and found statistical significance across tissues in six genes (supplementary Table V). These differentially expressed genes were Col4a2, Cxcl12, Icam1, Mia3, Ppap2b, and Pik3cg, all of which were also determined to be differentially expressed in separate cell types.

Genetic regulation of human endothelial genes in GWAS loci
To understand how the genetic variants associated with CAD perturb gene expression and thereby contribute to disease susceptibility, we studied the genetic regulation of transcript abundance in HAECs. We cultured the cells under control conditions or conditions in which HAECs were stimulated with oxPAPC for 4 h to mimic the activation of the endothelium with mmLDL. We observed that approximately 1,500 genes had a significant cis association with nearby SNPs in control or oxPAPC-treated cells, 1,239 of which overlap. Trans associations in control and oxPAPC-treated cells were observed in 185 and 164 genes, respectively, and 22 of these overlapped between the two datasets (supplementary Fig. III).
We then specifically focused on how the SNPs most significantly associated with CAD in multiple GWAS loci (8) affect nearby gene expression. Of these SNPs, 17 were genotyped in our study. We determined proxy SNPs for 11 of them, but we did not have information for 7 of the peak SNPs (supplementary Table III). In 7 of the 28 CAD loci studied, we observed significant association between the peak CAD SNP or its proxy SNP and the nearby transcript levels, suggesting that these endothelial genes play a role in disease susceptibility (Table 1). It is plausible that the CAD SNPs are functional in other tissues as well as in endothelial cells; therefore,
Table 1 tissue abbreviations (first entry truncated): … (21); l, lymphoblastoid cell line (21); m, monocytes (20); s, skin tissue (21).
(C) of the SNP rs216172 was associated with higher expression of SRR (supplementary Fig. IX). In the 1p32.2 locus, SNP rs17114036 is located in an intron of the PPAP2B gene. Its proxy SNP rs6588635 ($r^2 = 0.831$ in the 1000 Genomes European population) was associated with the PPAP2B transcript levels (supplementary Fig. X). The risk allele (A) was associated with lower expression of PPAP2B.

Differential expression of predicted causal human GWAS genes in murine vascular cells in vivo
To determine how the genes associated with human GWAS SNPs change in different stages of the disease, we evaluated their expression in ECs, lesions, SMCs, and macrophages in our mouse models. Among the 8 genes
cases allowing for differentiation between multiple candidate genes reported per locus. For example, in the 17q21.32 locus, the risk allele (T) of SNP rs46522, located in an intron of the UBE2Z gene, was associated with a higher expression level of the SNF8 gene but not the other three genes in the same locus (Fig. 5, supplementary Fig. VIII). In monocytes (20), however, the same SNP was associated with the expression levels of two other genes in the locus, ATP5G1 and UBE2Z, suggesting a different mechanism for disease susceptibility in endothelial cells and monocytes. In the 17p13.3 locus, the CAD SNP rs216172 is located in an intron of the SMG6 gene; however, we observed that the proxy SNP rs7217226 for rs216172 ($r^2 = 1$ in the 1000 Genomes European population) was associated with the expression level of the nearby SRR gene. The risk allele

DISCUSSION
Epidemiological and family studies predict the genetic component of CAD risk to be 40% to 60% (8). While GWAS have identified 35 genomic loci associated with a 7% to 37% increase in CAD risk, the majority of the identified loci are not associated with known risk factors and as such represent novel pathways involved in disease etiology. CAD is a complex disease that involves a variety of cell types, tissues, and pathways. To study the role of these genes in the vessel wall specifically, we assembled a comprehensive dataset using an in vivo mouse model of atherosclerosis and in vitro aortic endothelial gene expression from approximately 150 human donors. Discrete samples consisting of three major cell types that contribute to atherosclerosis (ECs, SMCs, and macrophages), along with the heterogeneous mixture of cells that constitute atherosclerotic lesions in the intima, were assessed for changes in gene expression that correlate with the progression of disease. This dataset allowed for the thorough assessment of GWAS candidate gene expression in healthy and diseased
with cis eQTLs, seven of them (all but Klhdc10) showed differential expression in one of the cell types in at least one of the disease conditions (Table 2). Srr was upregulated in the prelesioned MAECs, and this regulation was consistent with the association of higher expression of this gene with the risk allele of the CAD SNP in HAECs. Snf8 was upregulated in the SMCs from atherosclerotic mice, and its high expression was associated with the risk allele in HAECs. Three cis eQTL genes in HAECs, Galnt4, Icam1, and Ppap2b, were dysregulated in different stages of atherosclerosis in the three cell types we studied in mice, suggesting a multicellular role for these genes. Mapkapk5 and Tctn1 dysregulation was in the opposite direction of genetic regulation by the risk SNP, suggesting that these two genes play different roles in disease susceptibility and progression.
Klhdc10 expression did not change with disease despite a significant association with the CAD risk SNP. Taken together, these results suggest that the majority of the cis eQTL genes identified in HAECs play a role in cells present in the vasculature during disease progression.
and provide information suggesting an additional role for this potential therapeutic target. Utilizing both in vivo mouse and in vitro human gene expression analysis, we can draw some preliminary conclusions for GWAS loci that represent more than one gene, as summarized in Fig. 6 and supplementary Table IV. The involvement of some loci appears to be relatively straightforward, such as the 21q22 locus. The genes SLC5A3, MRPS6, and KCNE2 are reported as candidates for this region, and our data indicate that only one gene, MRPS6, is differentially expressed in only one cell type, SMCs, suggesting that this gene and cell type are responsible. Other loci, such as 17q21.32, seem to be more complex. For example, four candidate genes are reported for this locus: UBE2Z, GIP, ATP5G1, and SNF8 (3). In mice, we find that Atp5g is only differentially expressed in foam cells, Ube2z and Gip are differentially expressed exclusively in the intimal layer, and Snf8 is upregulated specifically in the medial SMCs. On the other hand, the CAD risk SNP was associated with SNF8 expression in HAECs but with ATP5G and UBE2Z in monocytes (20), adipose, and skin tissues, as well as in lymphoblastoid cell lines (21). Since all four of the genes in this locus are expressed in different cell types and at different stages of disease progression, these results suggest that this locus plays a complex role in contributing to CAD. The locus 11q23.3 also represents multiple genes: ZNF259 and APOA5-A4-C3-A1.
Although apolipoproteins have been well established as contributors to heart disease through lipid metabolism, we found that another gene at this locus, ZNF259, was upregulated in the prelesioned endothelium and intimal lesions, indicating the need for follow-up studies to identify the function of this gene. Another complex locus is 10q24.32, which harbors the genes CYP17A1, CNNM2, and NT5C2 and has been associated with CAD risk and blood pressure in humans (27). Our results show that only one gene, CYP17A1, is differentially expressed in the prelesioned endothelium, suggesting
vascular walls. Our data address two important challenges: the first is the identification of causal genes at GWAS loci, and the second is their expression patterns in disease-relevant cell types through which they contribute to disease.
Our mouse GWAS gene expression dataset provides new information for nearly all CAD candidate genes. There are many candidate genes for which no additional information linking them to the disease is available, such as the differentially expressed genes shown in Fig. 6: 9530008L14Rik, Adamts7, Cyp17a1, Kiaa1462, Mia3, Morf4l1, Phactr1, Ppap2b, Sh2b3, Smg6, Srr, Tcf21, Ube2z, and Zfp259. Of the differentially expressed genes that have prior association with atherosclerosis, Celsr2, Pcsk9, Pik3cg, Psrc1, and Srr, the cell type differentially expressing these genes had not been previously observed. It is probable that the six genes with undetectable transcripts in the vascular cell types are not directly active and do not affect atherosclerosis through action in the vasculature. The seven genes that are differentially expressed in both the prelesioned endothelium and the lesioned intima but with no differential expression in the smooth muscle or macrophage cell types (Fig. 3) may be induced in the endothelium early during atherogenesis and continue their dysregulation throughout the development of the atheroma. The seven genes that show differential expression in three cell types may be candidates for more widespread involvement in CAD. We can also use this dataset to gain further insight into the functions of well-studied candidate genes, such as PCSK9. The gene product of PCSK9 promotes LDLR degradation and has a direct proatherogenic effect (22,23), and it has emerged as a new pharmacological target for hypercholesterolemia with various PCSK9 inhibitors now being evaluated in clinical trials (24,25). Although this protein has been reported as present in human atherosclerotic plaques (26), no previous studies describe expression in preatherosclerotic endothelial cells. Our studies suggest a role for PCSK9 in the arterial wall
In the 1p32.2 locus, we predict PPAP2B to be the causal gene.
PPAP2B encodes the lipid phosphate phosphohydrolase 3 (LPP3) enzyme, which belongs to the LPP family of cell-surface-associated lipid phosphatases that hydrolyze and inactivate lysophosphatidic acid (LPA) and the related lysophospholipid sphingosine-1-phosphate. PPAP2B also functions as a cell-associated integrin ligand (32,33). We observed that lower expression of PPAP2B was associated with the risk allele in HAECs, whereas its expression was upregulated in lesions and MAECs. This was consistent with the different functions that have been attributed to PPAP2B in ECs and SMCs. In ECs, PPAP2B localizes to adherens junctions and regulates β-catenin signaling, thereby affecting cell migration (34). In SMCs, it regulates lysophospholipid signaling responses and affects SMC phenotypic plasticity (35). Our results indicate that PPAP2B may be involved in susceptibility to atherosclerosis via its dual function in vascular wall cells.
In the 12q24.12 locus, the significant association between the CAD SNP and nearby gene expression was outside the LD region. Based on our eQTL and in vivo vascular expression data, we predict that MAPKAPK5 and TCTN1 are causal genes for CAD risk in the 12q24.12 locus. These genes have not been previously studied in the context of atherosclerosis. MAPKAPK5 is a protein kinase that functions downstream of p38 in the MAPK pathway, a mediator of the inflammatory response (36). Studies have shown that it is active in endothelial cells, where it mediates EC migration and angiogenesis (37). Little is known regarding a role for TCTN1 in cardiovascular disease. It has a role in hedgehog signaling during development (38), and as it plays a role in ciliogenesis, mutations in this gene cause Joubert syndrome (39), a ciliopathy that affects the brain. There is evidence for ciliated endothelial cells in atherogenic, disturbed-flow regions of the vessel wall (40), and it is possible that this gene functions in atherosclerosis by affecting ciliogenesis.
In the 7q32.2 region, based on HAEC and monocyte eQTL results, we predict KLHDC10 to be the causal gene. This gene is not well characterized, but it has recently been shown to regulate oxidative stress-induced apoptosis in multiple cell types ( 41 ). Because the expression levels of this gene did not change in the different vascular cell types with disease progression in vivo, it is possible that the SNP regulates the expression of this gene without consequences for CAD risk.
In the 17p13.3 and the 17q21.32 loci, we suggest that the human genes SRR and SNF8 are candidates for further study in vascular cells, based on corresponding expression data between the two different experimental approaches reported in this study. SRR codes for serine racemase, which synthesizes D-serine from L-serine. SRR variants have been associated with schizophrenia ( 42 ), and the gene is thought to be expressed primarily in the brain ( 43 ). SNF8 codes for a subunit of the endosomal sorting complex required for transport II (ESCRT-II) ( 44 ) and has been shown to mediate sorting of ubiquitinated proteins for lysosomal degradation ( 45 ). These genes have not previously been shown to play a role in the etiology of atherosclerosis; however, our results suggest endothelial involvement of these genes in atherosclerosis.
An additional facet of the complexity of GWAS loci is that the candidate genes they represent are not necessarily defined by LD blocks. Our eQTL results based on HAEC gene expression demonstrate that genes outside of the LD block of a given locus may also be candidates for involvement in disease susceptibility. Four of the seven CAD SNPs have significant associations with nearby genes that are not in the same LD block as the SNP, perhaps signifying that these SNPs tag regulatory variants rather than protein-coding regions. This is consistent with recent evidence from the 1000 Genomes Project that showed only 19% of GWAS signals resulted in nonsynonymous changes in protein-coding genes, suggesting that the majority of the GWAS signals are associated with the regulation of gene expression ( 28 ).
The integration of these gene expression data from two independent experimental systems allowed us to confidently predict that a subset of GWAS candidate genes contribute to CAD through a role in the artery wall. Some of the candidate genes we have identified have already been shown to play a role in disease progression. For example, GALNT4 expression is associated with a SNP that is located between ATP2B1 and MRPLP2P1 in the 12q21.33 locus in both HAECs and monocytes. The GALNT4 gene encodes the N-acetyl galactosaminyl transferase 4 enzyme and is thought to play an important role in endothelial-platelet interactions by O-glycosylating the threonine residues of the P-selectin glycoprotein ligand (PSGL-1) ( 29 ). Coding polymorphisms in this gene have been associated with incidence of myocardial infarction in an Irish population ( 30 ), and PSGL-1 interacts with P-selectin expressed on the surface of ECs and regulates tethering and rolling of monocytes on these cells. The CAD risk allele was associated with lower expression of GALNT4 in HAECs, and the transcript was downregulated in macrophages but increased in prelesioned MAECs, lesioned intima, and medial smooth muscle cells. Taken together, our results and reports from previous studies suggest that GALNT4 confers susceptibility to CAD via structural variants that affect its protein structure and regulatory variants in the 12q21.33 locus that affect its expression, thereby mediating platelet-endothelial interactions.
Another relatively well-studied locus is 19p13.2. Although we predict the causal gene with respect to vascular activity in this region to be ICAM1, this locus also contains LDLR, a gene with a well-established role in cholesterol homeostasis and atherosclerosis. One possibility is that the CAD SNP is in LD with other SNPs that encode missense mutations in the LDLR protein, thereby affecting its function. On the other hand, ICAM1 has also been shown to play a critical role in the initiation of atherosclerosis ( 31 ), and we have observed upregulation in MAECs, SMCs from lesioned aortas, and macrophage foam cells ( Table 2 ). In the context of these findings, it is probable that the CAD-associated variants in this locus have a multitude of regulatory functions that affect the structure of LDLR and [...] that the CYP17A1 gene plays a role in increasing CAD susceptibility.
As additional GWAS are completed and new loci identified, it is important to recognize that simply identifying a genetic variant will not reveal the extent of involvement of these loci in the disease process. In this study, we provide comprehensive expression data for 57 candidate genes in different cell types and disease conditions of the vasculature. We believe this resource will be valuable for researchers planning focused functional follow-up studies on these loci. Furthermore, as we have shown here, in many cases the significantly associated loci play complex roles; therefore, in identifying genetic regions for follow-up studies, it will be useful to consider areas that may not be within the defined LD block and potential regulatory regions in addition to the candidate genes reported. Further, it is possible that the genomic regions marked by these loci contribute to disease through more than one pathway and in multiple cell types. Detailed biochemical studies are required to understand the function of candidate genes in vascular cells in relation to CAD susceptibility.
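The integration logic used throughout this discussion (a gene is prioritized when it has a significant eQTL in human cells and is also differentially expressed in mouse vascular cells) can be illustrated with a short sketch. This is not the analysis pipeline used in the study; the function name `prioritize` and the variable names are hypothetical. The gene lists are taken from the text, with two assumptions: KLHDC10 is treated as the eQTL gene lacking in vivo support, based on the 7q32.2 paragraph, and only the differentially expressed mouse genes named in the text are listed, not the full set.

```python
# Illustrative sketch only; not the authors' pipeline.

# Genes with a significant eQTL for a CAD SNP in human cells (from the text).
# Assumption: KLHDC10 is the one eQTL gene without in vivo support.
EQTL_GENES = {
    "PPAP2B", "GALNT4", "MAPKAPK5", "TCTN1",
    "SRR", "SNF8", "ICAM1", "KLHDC10",
}

# Genes differentially expressed in mouse vascular cells with disease.
# Assumption: only the genes named in the text are listed here.
MOUSE_DE_GENES = {
    "PPAP2B", "GALNT4", "MAPKAPK5", "TCTN1",
    "SRR", "SNF8", "ICAM1",
}


def prioritize(eqtl_genes, mouse_de_genes):
    """Split eQTL genes by whether mouse expression data support them."""
    supported = sorted(eqtl_genes & mouse_de_genes)   # both lines of evidence
    eqtl_only = sorted(eqtl_genes - mouse_de_genes)   # eQTL evidence alone
    return supported, eqtl_only


supported, eqtl_only = prioritize(EQTL_GENES, MOUSE_DE_GENES)
print("Supported by both systems:", ", ".join(supported))
print("eQTL only:", ", ".join(eqtl_only))
```

A set intersection is the simplest way to express this criterion; a fuller treatment would also need to track the cell type and direction of each expression change, which this sketch deliberately omits.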