Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice.

To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of the parental strains and of 282 F2 mice and used these data to find cis-regulated expression QTL (eQTL). We then narrowed the list of candidate genes under significant QTL using a "toolbox" of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc27a5 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for the Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits.

Supplementary key words: quantitative trait loci • gene expression • genetics
One of the major predictors of the development of coronary artery disease (CAD) is lipid levels, which are determined by a complex interaction of genetic and environmental factors. High levels of low density lipoprotein (LDL) cholesterol and triglycerides (TG) are associated with a higher incidence of heart disease (1). Recently, considerable interest has been given to the results of genome-wide association studies of triglycerides and lipids in humans (2-5). However, these studies often do not control for environment and explain only about 10% of the overall lipid variation, indicating that additional genes involved in lipid metabolism are yet to be discovered (5).
Studies in inbred mice successfully accommodate both genetic and environmental issues. Because the environment in our mouse rooms is tightly controlled, phenotypic variation in triglyceride levels among inbred strains can be attributed primarily to genetic variation. This makes quantitative trait loci (QTL) analysis in inbred mouse strains a powerful approach for identifying loci and genes regulating lipid levels. To date, our laboratory and others have identified more than 30 mouse triglyceride QTL (6, 7). We continue to improve the ways in which we convert these QTL into the identification of QTL genes (QTG). For example, since 2003 we have used a list of QTG criteria published by the Complex Trait Consortium (CTC) to identify causal QTL genes (8). These criteria are based on the premise that a QTG must carry a polymorphism between the parental strains of the mouse cross that affects either the structure/function of the gene (a nonsynonymous coding polymorphism) or the expression of the gene. Still, most of the CTC criteria involve in vitro and in vivo experiments and are not practical strategies when more than 100 genes are located under the QTL, a common characteristic of most QTL.
To improve and accelerate the process of gene identification, our laboratory and others developed a set of bioinformatic tools to help narrow QTL in the mouse and
All experiments were approved by the Jackson Laboratory Animal Care and Use Committee.

Genotyping and phenotyping
Lipid measurements. Mice were fasted for 4 h prior to retro-orbital bleeding at 8 weeks of age. Blood samples were collected with EDTA, and plasma was isolated by centrifugation within 2 h of the bleed. The plasma was frozen at -20°C for a week until measured. Triglycerides were measured using a Synchron CX Delta System (Beckman Coulter, Fullerton, CA).
Genotyping. DNA was isolated and genotyped as described previously (20). Briefly, a total of 259 markers were genotyped: 258 with the Illumina platform comprising 760 single nucleotide polymorphisms (SNP) and one additional marker (rs33585432). Physical marker positions were determined in build 37 from the Mouse Genome Informatics (MGI) database, and sex-specific and averaged genetic positions were estimated from the newly calculated mouse genetic map (21).
Liver collection. Livers from F2 mice were collected at 13 weeks of age as described previously (20). The mice were housed individually for 3 days prior to tissue collection and fasted for 4 h on the day of collection. Liver samples were preserved in RNAlater (Ambion, Applied Biosystems, Foster City, CA) and stored at -80°C prior to the gene expression study.
Sequencing. We obtained MRL and SM genomic DNA from the Mouse DNA Resource at the Jackson Laboratory. We resequenced seven genomic loci that contained a polymorphism at a probe binding site and for which the Center for Genome Dynamics (CGD) database did not have genotypes available in MRL and SM (supplementary Table I). We directly sequenced the PCR products using Big Dye Terminator Cycle Sequencing Chemistry and the ABI 3700 Sequence Detection System (Applied Biosystems). Results were analyzed using Sequencher software (version 4.2).

Bioinformatic approach
Candidate genes. The list of genes located under each QTL was downloaded from Ensembl in build 37. A gene was considered a candidate if it i) was located within the 95% confidence interval (CI) of the QTL; ii) was located within a region that differed in haplotype between the parental strains; iii) segregated a damaging nonsynonymous coding polymorphism between MRL and SM; or iv) was differentially expressed between the two QTL strains (MRL and SM), cis-regulated, and correlated with the phenotype in the F2 mice. If a QTL was found specifically in males or females, the expression evidence (expression differences, cis-eQTL, correlation, and causality modeling if the QTL was significant) was also investigated in that specific sex stratum. Candidate genes identified based on expression were subject to expression causality modeling if the QTL was significant. For the candidate genes showing causality based on expression differences, we searched the Ensembl database for polymorphisms located in probe binding sites. We determined the genotype of the SNP for MRL and SM using the CGD database at the Jackson Laboratory or by direct resequencing.
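The candidate-gene criteria above can be read as a filter over per-gene evidence. The sketch below is illustrative only: the `GeneEvidence` record, its field names, and the example genes are hypothetical, and the way the criteria are combined (positional criteria i and ii required, plus either criterion iii or criterion iv) is an assumption about how the list is meant to be applied, not a description of the pipeline actually used.

```python
from dataclasses import dataclass


@dataclass
class GeneEvidence:
    """Hypothetical per-gene summary of the evidence types listed in the text."""
    symbol: str
    in_confidence_interval: bool    # criterion i
    haplotype_differs: bool         # criterion ii
    damaging_coding_snp: bool       # criterion iii
    differentially_expressed: bool  # criterion iv, part 1
    cis_regulated: bool             # criterion iv, part 2
    correlated_with_trait: bool     # criterion iv, part 3


def is_candidate(gene: GeneEvidence) -> bool:
    """Assumed combination: (i and ii) and (iii or iv)."""
    expression_support = (
        gene.differentially_expressed
        and gene.cis_regulated
        and gene.correlated_with_trait
    )
    positional_support = gene.in_confidence_interval and gene.haplotype_differs
    return positional_support and (gene.damaging_coding_snp or expression_support)


# Made-up genes, for illustration only.
genes = [
    GeneEvidence("GeneA", True, True, False, True, True, True),
    GeneEvidence("GeneB", True, True, False, True, False, True),
    GeneEvidence("GeneC", True, False, True, False, False, False),
]
candidates = [g.symbol for g in genes if is_candidate(g)]
```

Under this reading, a gene outside a haplotype-divergent region is excluded even with strong expression evidence; a study that wants to rescue such genes would need to relax the positional requirement explicitly.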
Haplotype analysis. We used a haplotype approach to identify genes that differ between MRL and SM. For this purpose, we used the mouse diversity array database and its web tool available on the CGD website to identify the regions where MRL differs from SM using an interval of 1 bp. We then identified the genes within these regions. Because of the high density of the SNPs and
identify the most likely candidate gene(s) by adding support to each gene located within the QTL (9-12). As the availability and breadth of bioinformatic resources expand, so does the usefulness of these tools. The recent development of the mouse diversity genotyping array and imputation methods gives us access to the genomes of a large number of mouse strains that were previously unavailable, allowing in-depth haplotype analysis and the identification of amino acid changes among these strains (13). In addition, a recent gene expression survey in 12 mouse inbred strains gives us the ability to identify differences in gene expression among these strains (14). However, identifying a difference in expression between the parental strains of a mouse cross is not sufficient; the causative polymorphisms must be located under the QTL, meaning that the expression differences must be cis-regulated. Expression QTL (eQTL) analysis, the method of choice for identifying cis-regulated genes based on the gene expression profile of F2 mice, presents an ideal framework for estimating causality (15). The high cost of eQTL studies has limited expression microarray profiling in F2 crosses. Thus far, eQTL studies have been performed in only a few F2 crosses between inbred mouse strains [C57BL/6J×DBA/2J (B6×D2) (16), C57BL/6J×C3H/HeJ (B6×C3H) (17), D2×AKR/J (18), and C57BL/6J×A/J (B6×AJ) (S. W. Tsaih, personal communication)] and in HG9 congenic mice (19). These powerful studies have helped identify new genes involved in complex traits.
Here, we present an additional cross: MRL/MpJ×SM/J (MRL×SM).
In a QTL analysis between inbred mouse strains MRL and SM for triglyceride levels, we confirmed one QTL and identified three new QTL. We then applied our bioinformatic tools to narrow the QTL to just a few genes. We investigated databases for genotype information and for any potential functional amino acid change between the parental strains (MRL and SM). We also performed gene expression profile analysis from liver samples in the F2 mice. We used this information to identify cis-regulated genes that also correlate with triglycerides. Conditioning the gene expression on the phenotypes identified those QTL genes potentially causal to triglyceride variation. Our analysis confirms the value of our bioinformatic tools as an efficient model for identification of genes that regulate complex traits.

Mice
MRL/MpJ (MRL) and SM/J (SM) mice were obtained from the Jackson Laboratory (Bar Harbor, ME). F1 mice were produced by crossing MRL females with SM males, while reciprocal F1 (RF1) mice were obtained by crossing SM females with MRL males. F1 mice were intercrossed by brother-sister mating to produce 282 F2 mice (135 females and 147 males). Parental strains, F1, RF1, and F2 mice were bred and housed in a climate-controlled, pathogen-free facility at the Jackson Laboratory with a 12:12 h light-dark cycle. F1, RF1, and F2 males and females were weaned at 21 days and fed chow diet (LabDiet® 5K52, PMI Nutritional International, Brentwood, MO).
All suggestive and significant QTL were assessed in a combined multilocus model, and the proportion of variation in the lipid trait explained by these QTL was determined through regression analysis. The genotypes and phenotypes are publicly available at the CGD website.
Expression analysis. To evaluate each transcript for cis-regulation, we performed eQTL analysis in the 282 F2 mice with sex as an additive covariate, and in females and males separately. We used the Haley-Knott method with a 2 cM interval. Thresholds for significant (P < 0.05) and suggestive (P < 0.63) LOD scores were based on 10,000 permutations of the observed data. We defined a cis-QTL as a transcript for which the peak of the QTL was located within 20 cM of the genetic location of the gene with at least a suggestive LOD score. Pearson correlation coefficients and their significance were calculated between the expression level of each transcript and the phenotype in the entire cross (after adjusting for sex) and in males and females only. All analyses were performed at the transcript level but reported at the gene level using the transcript with the strongest cis-QTL. For the candidate genes showing causality based on expression differences, we also verified that the cis-QTL was not due to the presence of a polymorphism between MRL and SM at the probe binding site. We first searched Ensembl for polymorphisms and confirmed the alleles of MRL and SM in the CGD imputed database or by resequencing. If a segregating polymorphism was present, we performed cis-QTL analysis at the probe level and verified that the probe carrying the polymorphism was not solely responsible for the overall cis-QTL of the gene.
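The cis-QTL definition used here (eQTL peak within 20 cM of the gene, with at least a suggestive LOD score) reduces to a three-part check. The following is a minimal sketch under stated assumptions: the function name, argument names, and the example positions and threshold are hypothetical, and the requirement that peak and gene lie on the same chromosome is made explicit even though the text leaves it implicit.

```python
def is_cis_eqtl(peak_chr, peak_cm, peak_lod,
                gene_chr, gene_cm,
                suggestive_lod, window_cm=20.0):
    """Return True when an eQTL peak meets the cis definition in the text.

    window_cm defaults to the 20 cM window stated in the methods;
    suggestive_lod must come from the permutation analysis of that transcript.
    """
    same_chromosome = peak_chr == gene_chr
    close_enough = abs(peak_cm - gene_cm) <= window_cm
    strong_enough = peak_lod >= suggestive_lod
    return same_chromosome and close_enough and strong_enough


# Hypothetical example: a peak 3 cM from its gene with LOD 13.5,
# judged against an assumed suggestive threshold of 2.2.
cis = is_cis_eqtl("7", 9.2, 13.5, "7", 12.2, suggestive_lod=2.2)
trans = is_cis_eqtl("15", 38.8, 13.5, "7", 12.2, suggestive_lod=2.2)
```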
Causal analysis. Conditional genome scans can be used as a graphical modeling strategy to estimate causal relationships between traits (31, 32). We applied this strategy to the eQTL data to identify candidate genes that are causal (upstream) to the clinical trait (Y = triglyceride). In this approach, QTL mapping is first performed for the clinical trait (model 1: $Y = \beta_0 + \beta_1 Q + \beta_2\,\mathrm{Sex} + \varepsilon$). This is compared with QTL mapping using a gene expression trait (X) as a covariate for the clinical trait (model 2: $Y = \beta_0 + \beta_1 Q + \beta_2\,\mathrm{Sex} + \beta_3 X + \varepsilon$). A decrease in LOD score after conditioning on the gene expression trait (model 2) can be interpreted as the gene expression trait acting as a mediator of the QTL effect. The same strategy is then applied to the gene expression trait. QTL mapping for the candidate gene expression (model 3: $X = \beta_0 + \beta_1 Q + \beta_2\,\mathrm{Sex} + \varepsilon$) is compared with QTL mapping for a model that includes the clinical trait as a covariate (model 4: $X = \beta_0 + \beta_1 Q + \beta_2\,\mathrm{Sex} + \beta_3 Y + \varepsilon$). Genome scans from these models were compared to determine the extent to which the gene expression trait reacts to the clinical trait. In our application, a candidate gene in the QTL region was considered causal if two criteria were met: first, conditioning on the candidate gene
the non-uniformity of the SNP distribution within the genome, we also included any gene located within 10 Kb of these loci.
Nonsynonymous coding polymorphisms. We used the high-density imputed SNP database available on the CGD website to identify any nonsynonymous coding polymorphism between MRL and SM. We evaluated its potential functionality using the Sorts Intolerant From Tolerant (SIFT) tool (22). If the polymorphism was characterized as "damaging" or led to a stop codon, we concluded that the amino acid change is functional.
Microarray analysis for liver gene expression. Microarray data were processed as previously described (20). Briefly, RNA was hybridized to the Mouse Gene 1.0 ST microarray (1M) (Affymetrix, Santa Clara, CA). The data were processed using the R language/environment version 2.7.2. Quality control and quantile normalization (23) were performed with the affy V1.20.0 and preprocessCore V1.6 packages from Bioconductor. The transcript analysis was performed with a custom CDF file (24) for Ensembl transcripts (ENST package V.11, 37,264 probesets) from the BrainArray (University of Michigan) website. The 2,858 redundant probesets in the CDF file were removed, producing a dataset with 34,406 probesets for subsequent analyses. Microarray analysis was performed in males and females of the parental strains MRL and SM (N = 3 for each category) and in the F2 mice (N = 282). Difference in expression between the parental strains was assessed in the parental analysis and in the F2 mice. Cis-QTL and correlation were estimated in the F2 mice as described below. Microarrays have been deposited in the Gene Expression Omnibus (GEO accession: GSE25322).
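Quantile normalization, mentioned above, forces every array to share the same intensity distribution. The sketch below shows the basic idea in plain Python and is not the preprocessCore implementation: the function name and toy values are hypothetical, and tied values within an array are ordered by position rather than averaged, which is a simplification.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equal-length intensity vectors.

    Each vector is one microarray. The k-th smallest value of every array
    is replaced by the mean of the k-th smallest values across all arrays.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: rank-wise mean across arrays.
    reference = [
        sum(a[k] for a in sorted_arrays) / len(arrays)
        for k in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Indices of the array ordered from smallest to largest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy example with two arrays and three probes.
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
toy_normalized = quantile_normalize(toy)
```

After normalization both toy arrays contain the same set of values (1.5, 3.5, 5.5), assigned according to each probe's within-array rank.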

Statistical analysis
Data analysis. Parental strains, F1, RF1, and F2 mice were compared by ANOVA for females and males separately (JMP 7.0; SAS Institute, Cary, NC). The data were transformed using van der Waerden normal scores (25).
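The van der Waerden transformation replaces each observation with the standard normal quantile of its rank divided by n + 1. The sketch below illustrates that definition; the function name and the example values are hypothetical, and giving tied observations their average rank is an assumption, since the text does not say how ties were handled.

```python
from statistics import NormalDist


def van_der_waerden_scores(values):
    """Map values to normal scores: inverse normal CDF of rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the block of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the block
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Three made-up triglyceride values; the middle one maps to a score of 0.
scores = van_der_waerden_scores([120.0, 85.0, 200.0])
```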
QTL analysis. Linkage analysis was performed using R/qtl (v1.09-43) (26). We performed a three-step analysis. First, triglyceride level was analyzed for main-effect QTL using sex as an additive covariate to account for the difference in lipid level between males and females (model 1). Second, sex was added as an interactive covariate (model 2); the difference between the additive and interactive models provided a test for QTL by sex interaction (i.e., one genotype affects the trait in males but not in females) (27, 28). We also performed QTL analysis in males and females separately. We used the sex-specific positions to run the QTL analysis, but we translated these positions to the averaged genetic positions to ease the comparison with the combined sex analysis. Third, epistatic effects were investigated using the pairscan function of R/qtl, which tests for interacting QTL. For the chromosomes that showed a potential secondary QTL, we compared the best model with one QTL to the best model with two QTL. If the logarithm of the odds (LOD) score difference between these models was greater than 2, we concluded that two QTL were present. Thresholds for significant (P < 0.05) and suggestive (P < 0.63) LOD scores were based on 1,000 permutations of the observed data for the autosomes and 17,940 permutations for the X chromosome (29). The Bayesian method was used to determine the 95% CI (30). Briefly, the interval is obtained by assuming that 10^LOD is the true likelihood function and that, a priori, the QTL is equally likely to be anywhere on the chromosome. The posterior density can be derived from these two assumptions. The Bayes credible interval is defined as the interval for which the posterior exceeds a given probability, in this case 0.95.
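The Bayes credible interval described above can be sketched directly from its two assumptions: the likelihood is proportional to 10^LOD and the prior is uniform along the chromosome. The code below is an illustration only and may differ in detail from the R/qtl implementation; the function name and the example LOD curve are hypothetical, and it assumes the LOD scores sit on an evenly spaced grid so that every position carries equal prior weight.

```python
def bayes_credible_interval(positions, lods, prob=0.95):
    """Approximate Bayes credible interval from a LOD curve.

    positions: evenly spaced map positions (e.g., in cM) on one chromosome
    lods:      LOD score at each position
    """
    peak = max(lods)
    # Subtracting the peak avoids overflow; the constant cancels on normalizing.
    weights = [10 ** (lod - peak) for lod in lods]
    total = sum(weights)
    posterior = [w / total for w in weights]
    # Add positions from highest to lowest posterior until the mass reaches prob.
    ranked = sorted(range(len(lods)), key=lambda i: posterior[i], reverse=True)
    mass = 0.0
    kept = []
    for i in ranked:
        kept.append(i)
        mass += posterior[i]
        if mass >= prob:
            break
    return positions[min(kept)], positions[max(kept)]


# Made-up LOD curve on a 2 cM grid.
grid = [0, 2, 4, 6, 8, 10]
curve = [0.5, 2.0, 3.8, 4.1, 3.9, 1.0]
interval = bayes_credible_interval(grid, curve)
```

In this toy curve the three positions around the peak together hold more than 95% of the posterior mass, so the interval spans 4 to 8 cM.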
Homozygous SM mice had higher triglyceride levels compared with homozygous MRL mice in a dominant and additive manner on Chr 7 and 15, respectively (Fig. 2). On Chr 17, homozygous MRL mice had higher triglycerides compared with homozygous SM mice (Fig. 2). On Chr 2, both homozygous MRL and SM mice had higher triglycerides compared with heterozygous mice (Fig. 2). We did not identify any significant QTL by sex interaction (Fig. 1B). Nevertheless, we performed the QTL analysis in each sex separately (Fig. 1C, D and Table 2). We confirmed the QTL on Chr 7 and 17 in males and the QTL on Chr 15 in females. The QTL on Chr 2 was not replicated in the male- or female-only analysis, most likely due to the loss of statistical power from having fewer mice. Additionally, a new QTL was identified on Chr X in males, but the LOD score was low and may indicate a false-positive result. The Chr 7 QTL reached significance in the male-only QTL analysis, while the Chr 15 QTL was observed in the female-only QTL analysis, indicating that the allelic effect at these loci is stronger in one sex than the other. Overall, we were able to explain only 14.2% of the genetic variation in triglycerides, 18.2% in males and 13% in females (Table 3).

Gene expression analysis in the parental and F2 mice
We performed microarray analysis in 3 males and 3 females of each parental strain (MRL and SM) as well as in 282 F2 mice. In the parental strains, 3,423 genes were
transcript reduced the LOD score below the suggestive level (LOD < 2.2) for the clinical trait (model 2), and second, conditioning on the trait did not reduce the LOD score below the suggestive level for the gene transcript (model 4). This is a stringent criterion that will reveal the most causal candidates. Conditional linkage for the clinical traits was only applied to the QTL with a significant LOD score (P < 0.05).
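The two-part causality criterion (the triglyceride LOD score falls below the suggestive level after conditioning on expression, while the transcript LOD score stays at or above it after conditioning on triglycerides) can be written as a small decision rule. This is a sketch with hypothetical function and argument names; it also assumes the same suggestive threshold of 2.2 applies to both scans, whereas the transcript threshold would in practice come from its own permutation analysis.

```python
def expression_is_causal(trait_lod_given_expression,
                         expression_lod_given_trait,
                         suggestive_lod=2.2):
    """Apply the two conditional-scan criteria described in the text.

    trait_lod_given_expression: triglyceride QTL LOD after adding the
        transcript as a covariate (model 2)
    expression_lod_given_trait: eQTL LOD after adding triglycerides as a
        covariate (model 4)
    """
    expression_mediates_qtl = trait_lod_given_expression < suggestive_lod
    expression_not_reactive = expression_lod_given_trait >= suggestive_lod
    return expression_mediates_qtl and expression_not_reactive


# Conditional LOD scores reported in this study for Slc27a5 in males:
# triglyceride LOD 4.1 -> 1.1, eQTL LOD 13.5 -> 11.2.
slc27a5_causal = expression_is_causal(1.1, 11.2)
```

Both conditions are needed: a transcript whose own QTL vanishes after conditioning on the trait would be read as reacting to triglycerides rather than driving them.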

Lipid characteristics of the parental strains, F1, RF1, and F2 mice
Means and standard errors of triglyceride levels are summarized in Table 1. No statistically significant difference in triglyceride levels was observed among the parental strains, F1, RF1, and F2 mice.

Identification of genomic loci underlying triglyceride levels in the F2 mice
Genome-wide scans are represented in Fig. 1 and summarized in Table 2, with the 95% CI, LOD scores for the relevant model, closest marker, high allele strain at the locus, and mode of inheritance. We first added sex as an additive covariate and identified four main-effect QTL: one significant QTL on Chr15@38.8cM (Tgq35) and three suggestive QTL on Chr2@103.8cM, Chr7@10.1cM (Tgq34), and Chr17@33.2cM (Tgq1) (Fig. 1A and Table 2).
The difference between both models (B) with a delta LOD > 2 indicates a QTL by sex interaction (dotted line). Analysis was also performed in males (C) and females (D) separately. For each model, data were permuted 1,000 times to determine the genome-wide level of significance. The threshold of significance for Chr X was determined with 17,940, 16,784, and 19,085 permutations in the combined sex, male-only, and female-only analyses, respectively. In the combined sex analysis (A and B), the suggestive and significant thresholds for the autosomes were 2.2 and 3.7, respectively, in the additive model, and 3.2 and 5.0, respectively, in the interactive model. For the X chromosome, the suggestive and significant thresholds were 2.1 and 3.6, respectively, for the additive and the interactive models. In males and females alone (C and D), the suggestive and significant thresholds were 2.2 and 3.6, respectively. For the X chromosome, the suggestive and significant thresholds were 1.4 and 2.7, respectively, in males, and 2.5 and 2.8, respectively, in females. The dashed line represents the threshold of P = 0.05, and the dash-dotted line represents the threshold for suggestive QTL (P = 0.63).
female-specifi c analysis, 27% of the genes), or a sex-specifi c cis -QTL or correlation (if the gene was identifi ed in males or females but not in the combined sex analysis, 39% of the genes). Genes that were located under a signifi cant QTL and showed molecular evidence of a QTG based on expression were subject to conditional linkage analysis.  Table 2 ), indicating that the allele effect is stronger in males than in females. We therefore examined this QTL in males only. Through haplotype analysis, we reduced the number of candidate genes from 660 to 395 ( Fig. 3A ). Among the candidate genes, we identifi ed 3 genes with an amino acid change between MRL and SM characterized as damaging (Q386R in Psg29, pregnancy-specifi c glycoprotein 29; I27F in Igfl 3 , IGF-like family member 3; and E468G in Rasgrp4 , RAS guanyl releasing protein 4) ( Table 4 ). In addition, differentially expressed at the signifi cant level ( P < 0.05) in males or females. In the 282 F2 mice, 2,840 genes were cis -regulated at the signifi cant level ( P < 0.05), and 1,411 additional genes were cis -regulated at the suggestive level ( P < 0.63), while 3,399 and 3,150 genes were cis -regulated in males only and females only, respectively (2,095 and 1,907 at the signifi cant level, respectively) (supplementary Table II). Overall, the expression of 631 genes was correlated with triglycerides in males and females together, 671 in males only, and 622 in females only. The genes showing the strongest correlation are indicated in supplementary  Table III. Molecular evidence for QTL genes based on expression differences between the parental strains includes a gene that is differentially expressed between MRL and SM and whose expression is cis -regulated and correlated with triglyceride levels in the F2 mice. We identifi ed 433 potential candidate genes in males and females together, 347 in males alone, and 201 in females alone. 
About 34% of the genes identifi ed in the combined sex analysis were also identifi ed in the male-or female-specifi c analysis. The genes that did not overlap refl ected either a lower number of mice in the sex-specifi c analysis (if the gene was identifi ed in the combined sex analysis but not in the male-or c LOD scores were calculated with sex as an additive covariate in the Males + Females analysis. Bold indicates signifi cant QTL. d QTL analysis was run in males and females separately. The results are indicated in the supplementary data. If the peak of the QTL in the combined sex analysis was also observed in the sex-specifi c analysis, the sex in which the QTL was found is indicated. Bold indicates the high triglyceride allele. e Main effect QTL. may be responsible for the QTL, because the bioinformatic evidence from our cross fi t the in vivo evidence from literature, we concluded that Slc25a7 is the triglyceride QTL gene for the Chr 7 triglyceride QTL.

Candidate genes for the QTL on mouse Chr 15 ( Tgq35 )
Peroxisome proliferator activated receptor alpha ( Ppara ), located at 85.5 Mb right at the QTL peak, is known to be involved in triglyceride metabolism ( 34 ) and is the most likely QTL gene. However, we failed to fi nd any evidence supporting Ppara as a candidate gene in either the resequencing or expression studies. Ppara did not have any noncoding polymorphism between strains MRL and SM. Similarly, the expression studies did not fi t the characteristics of the QTL. This QTL was found in females but not in males. Although the expression differed between the parental strains in males and females (+2.14-fold change MRL versus SM, P < 0.001 in males, and +1.46-fold change MRL versus SM, P = 0.026 in females), the expression in F2 mice showed that Ppara was cis -regulated only in males and not in females (supplementary Table V). The lack of cis -regulation in females indicates that the expression difference found between the parental strains cannot account for the QTL. In addition, the expression of Ppara was not correlated with triglycerides ( P > 0.05) (supplementary Table V).
After failing to fi nd any support for Ppara as a QTL gene, we investigated the QTL for new candidate genes by applying our bioinformatic approach. This QTL was observed in the combined sex (LOD = 4.1 at 82.8 Mb) and the female-only analyses (LOD = 4.1 at 78 Mb) at the signifi cant level ( Table 2 ). Through haplotype analysis, we reduced the number of candidate genes from 388 to 145 within the 18.3 Mb locus ( Fig. 3B ). Among these genes, we identifi ed one candidate gene based on an amino acid change between MRL and SM characterized as damaging: Recql4 (recQ protein-like 4), L527M. We also identifi ed 5 genes whose expression was strongly correlated with triglyceride levels in the F2 populations where the QTL was identifi ed for males and females combined and females only (supplementary Table III): Polr3 h , polymerase (RNA) III (DNA directed) polypeptide H; Tspo , translocator protein; Ttl1l2 , tubulin tyrosine ligase-like family, member 12; and Cyp2d22, we identifi ed 17 genes for which the expression was cisregulated in the F2 male (N = 146), differentially expressed between MRL and SM males, and correlated with triglycerides in males ( Table 4 ). We applied conditional expression analysis in males and identifi ed 3 genes likely to be causal: Slc27a5 [solute carrier family 27 (fatty acid transporter, member 5)], Sae1 (SUMO1 activating enzyme subunit 1), and Cadm4 (cell adhesion molecule 4). The expression of all 3 genes was signifi cantly different between MRL and SM, cis -regulated, and correlated with triglyceride levels. We verifi ed that the cis -regulation of the gene expression was not due to the presence of a polymorphism at the probe binding sites in all 3 genes. No polymorphism was reported at any probe binding sites of Cadm4 in Ensembl. For Sae1 and Slc27a5 , we identifi ed several polymorphisms at the probe binding sites, and we performed QTL analysis at the probe level (supplementary Table IV). 
For Slc27a5 , we identifi ed 34 out 35 probes cisregulated; 31 of them did not carry a polymorphism segregating between MRL and SM (supplementary Table IV). For Sae1 , we identifi ed 19 out of 23 probes cis -regulated, none of which carried a segregating polymorphism. This indicated that the cis -regulation of Slc27a5 and Sae1 was not due to a polymorphism at the probe binding site. Among the 6 fi nal candidate genes ( Psg29 , Igfl 3 , Rasgrp4 , Cadm4 , Sae1 , and Slc27a5 ), only 1 ( Slc27a5 ) was known to affect triglyceride level, and we compared our bioinformatic results to the published data ( 33 ). Expression of Slc27a5 was higher in F2 males carrying the SM allele compared with MRL allele ( Ϫ 2.34-fold change, P < 0.001), cisregulated (eQTL on Chr7@9.2, LOD = 13.5), and positively correlated with triglycerides ( r = +0.34, P < 0.001) ( Table 4  and supplementary Table III). In addition, at the QTL, homozygous SM mice had higher triglycerides compared with homozygous MRL mice. These results fi t with the published knockout mouse model for Slc27a5 that exhibits lower triglyceride levels ( 33 ). Conditioning triglycerides for the expression of Slc27a5 in males lowered the LOD score from 4.1 to 1.1, and conditioning the Slc27a5 eQTL for triglycerides did not lower the LOD score below the suggestive threshold (from 13.5 to 11.2) ( Table 5 ). Although we cannot exclude that one of the other candidate genes a Regression analysis was performed in the entire F2 population (N = 282) using sex as an additive covariate or in males and females separately.
b Sex-specifi c positions were used in the regression analysis for males and females only as provided on the CGD website.
c All QTL and interactive QTL were fi tted into a model for triglycerides separately. QTL that did not pass the 0.01 threshold were removed one at a time.
were cis-regulated, and no segregating SNPs within any probe binding site were identifi ed (supplementary Table  IV). For Polr3 h, Ttll12 and Tspo, 19 out of 27 probes (70%), 18 out of 23 probes (56%), and 14 out of 25 probes (78%), respectively, were cis -regulated, and none of them carried a polymorphism that differed between MRL and SM (supplementary Table IV). Additional studies, such as congenic mice, will help determine which of these genes is the triglyceride QTL gene. None of these genes is known to affect triglyceride metabolism.

Additional candidate genes on Chr 17
The Chr 17 QTL was identifi ed in the combined sex analysis (LOD = 3.2 at 64.6 Mb) and in males only (LOD = 2.3 at 74.1 Mb), both at the suggestive level ( Table 2 and Fig. 3C ). This QTL has previously been observed in an intercross between MRL and SJL. We did not have any information on which allele was the high or low triglyceride allele at this locus, but MRL is the common strain between both crosses. Therefore, we hypothesized that the same gene must be responsible for the QTL in both crosses. We reduced the number of candidate genes from 273 to 70 by haplotype analysis using the following criteria: MRL ≠ (SM = SJL). We identifi ed four candidate genes at this locus. One is based on a segregating nonsynonymous polymorphism ( 4930564C03Rik , L150V), and three are based on expression differences between the parental strains, cisregulation and correlation with triglycerides in the F2 mice: Ubr2 (ubiquitin protein ligase E3 component n-recognin 2), Treml4 (triggering receptor expressed on myeloid cells-like 4) and 2310039H08Rik ( Fig. 3C and Table 4 ). We did not apply the conditional linkage approach because the QTL ( Tgq1 ) was not signifi cant. None of these genes is known to affect triglyceride metabolism.

DISCUSSION
In this study, we performed QTL mapping for triglycerides using an intercross between inbred mouse strains MRL and SM. We identifi ed four QTL on Chrs 2, 7, 15, and 17. We then applied our mouse bioinformatic "toolbox" to identify candidate genes located under the signifi cant QTL. Our bioinformatic toolbox is based on recommendations by the Complex Trait Consortium (CTC) ( 8 ). This powerful approach includes haplotype analysis and a search for the presence of nonsynonymous coding polymorphisms or differential expression between the parental strains ( 9,10 ). To expand our "classic" bioinformatic toolbox, we also performed expression profi ling in the F2 mice and applied conditional genome scan analysis. In F2 mice, eQTL data allowed us to determine i ) cis -regulated genes, ii ) signifi cant correlations between gene expression and triglyceride level, and iii ) genes for which the expression is likely to be causal to the QTL. In our study, our strength was to have all the tools available to combine and improve our discovery of the causal QTL gene.
[...] Cyp2d26 or cytochrome P450, family 2, subfamily d, polypeptides 22 and 26. The differential expression of these five genes between MRL and SM was found to be causal in the combined sex or in the female-only analyses (Tables 4 and 5). Two of the genes (Polr3h and Tspo) were not originally identified by haplotype analysis, but because of their strong correlation with triglyceride levels in the combined sex and female-only analyses, we suspect that a low SNP density in the database was responsible for their exclusion from the haplotype analysis. On the basis of the strong expression evidence, we added these two genes to our list of candidate genes. We also determined that the expression differences between MRL and SM, as well as the cis-regulation and correlation for all genes, were not due to polymorphisms within any probe binding sites. Cyp2d22 did not have any reported polymorphisms within any probe binding site in Ensembl. For Cyp2d26, all 22 probes [...]
Each panel shows the relevant bioinformatic tools used to reduce the confidence interval of the QTL (upper bar) with haplotype analysis (middle bar). Each gene within the locus is examined for additional molecular evidence (lower bar): whether it shows an amino acid change that is potentially functional using SIFT (on the top of the lower bar), or whether it shows differential expression in males or females between MRL and SM, is cis-regulated, and significantly correlates with the phenotype in the F2 mice (on the bottom of the lower bar). For the haplotype analysis, we looked for regions that were different between MRL and SM. The number of genes is indicated for each step. The best candidate genes are indicated for each QTL with an asterisk at their respective locations and are also presented in Table 4 with their evidence.
Candidate genes for each QTL with their molecular evidence. Bold indicates the genes for which expression was found to be potentially causal by conditional QTL analysis (Table 5).
b Potentially functional polymorphism between MRL and SM, characterized as "damaging" using SIFT.
The correlation was calculated over the entire population (males and females), females only, or males only after adjusting the rank Z transformed phenotype for sex.
f Bioinformatic evidence is indicated as 1) "Damaging SNP" if a damaging nonsynonymous coding SNP was found; 2) "Expression only" if the expression of the gene was significantly different between MRL and SM, cis-regulated, and correlated with TG levels; or 3) "Conditional expression" if the expression of the gene was found to be likely causal as indicated in Table 5.
g Genes identified based on significant cis eQTL and significant correlation (supplementary Table III) but not based on haplotype.
a Conditional genome scans were performed in the F2 population for triglycerides on Chrs 7 and 15. Sex was added as a covariate in the model. Causality modeling was performed specifically in males for Chr 7 and females for Chr 15 without any sex adjustment. Only the genes showing evidence of causality are reported.
b Conditional genome scan on triglycerides using the candidate gene expression as a covariate.
c Conditional genome scan on the gene expression using the phenotypic trait as a covariate. To be considered causal, the LOD score of the candidate gene had to be reduced below the suggestive level (LOD < 2.2), and the expression QTL LOD score adjusted for the trait must not have been reduced below the suggestive level.
d Cyp2d22, Polr3h, and Tspo were identified only in the female-only analysis, while Cyp2d26 and Ttll12 were identified in both the female-only and male-and-female analyses.
The mouse bioinformatic toolbox, as described previously (9, 10), offers advantages that are not readily available for other animal models. The search for candidate genes, which consists of comparing the two parental strains, is straightforward and requires limited laboratory experiments. Databases and bioinformatic tools are publicly available to perform haplotype analysis and explore the presence of a nonsynonymous coding polymorphism and differential gene expression between the parental strains.
However, we recognize that our bioinformatic approach itself has limitations. First, while our search for genes carrying different haplotypes has been previously successful in identifying complex trait genes (35, 36), it may also miss a candidate gene due to the scarcity of SNPs genotyped at the locus, which we suspect happened for Tspo and Polr3h on Chr 15. Second, the large expression differences between the two parental strains or the strong cis-QTL could be due to the presence of a polymorphism at the probe binding site. In our study, we confirmed that this was not the case for our eight candidate genes, but we cannot exclude the possibility, especially if the cross involved a wild-derived strain (37).
Conditional causality modeling methods on their own are very powerful in narrowing the confidence interval of a QTL and identifying candidate genes for which an expression difference is responsible for the QTL (31, 32, 38). These studies require performing expression profiling in all F2 mice, thus increasing the cost of the microarray experiments. Therefore, only a few studies have been performed and published (16–18), and they often focused solely on differential gene expression as the cause of a QTL. However, an amino acid difference between the parental strains can also cause the QTL by influencing the structure or function of the protein (35, 39). In our study, we strengthened our approach by including this possibility: we screened the CGD SNP database for nonsynonymous coding polymorphisms segregating between MRL and SM and used SIFT to further characterize these changes as damaging. However, rare variants (present only in specific strains not commonly used, such as MRL) are likely to have been overlooked by using the CGD SNP database. New SNP databases based on next-generation sequencing of the entire mouse genome are being developed by the Sanger Institute, but currently they are available for only a few strains (not MRL and SM). This valuable resource will give access to the complete information of the mouse genome and help complete the picture of the underlying molecular evidence of a QTL between two inbred strains by identifying i) polymorphisms within probe binding sites that could lead to a false difference of expression between the parental strains, and ii) polymorphisms within the coding sequence that could lead to a difference in structure or function of the protein.
Finally, the conditional modeling approach usually results in a short list of candidate genes. Unless additional in vivo work is performed, no gene can be determined as the QTL gene. In our study, we reduced each significant QTL to only a few genes. From the list of three candidate genes on Chr 7 found to be potentially causal based on expression (Slc27a5, Sae1, and Cadm4), we determined that Slc27a5 was the QTL gene in males based on a knockout mouse model for Slc27a5 that had previously shown a lower triglyceride level compared with controls (33). This was consistent with our expression data, in which lower expression of Slc27a5 was found in the low triglyceride strain. The other QTL were reduced to only a few genes, such as the six genes on Chr 15, with some likely candidates based on function. Additional work in vivo, however, will be necessary to determine which gene is the QTL gene. On Chr 15, we identified two genes from the P450 cytochrome gene family, Cyp2d22 and Cyp2d26, for which the expression is likely to be causal to the QTL. While these two genes have not been shown to be involved in triglyceride metabolism, another member from the same gene family has been: the knockout for Cyp19a1 shows increased triglyceride levels (40). We also identified Ttll12 as a potential candidate gene for the Chr 15 QTL. This gene has recently been shown to be involved in tubulin posttranslational modification and chromosomal ploidy and may contribute to the development of tumors in prostate cancer (41). None of the identified genes has a known role in triglyceride metabolism, and additional molecular and in vivo studies, such as congenic mice, must be used to determine which gene is the QTL gene.
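The decision rule used with the conditional genome scans can be written as a small sketch. It reads the stated criterion as follows: the triglyceride LOD conditioned on the gene's expression must fall below the suggestive level (LOD < 2.2), while the expression QTL LOD conditioned on triglycerides must stay at or above it. Using the same 2.2 threshold for both scans is an assumption, and the class name, field names, gene labels, and LOD values are hypothetical.

```python
from dataclasses import dataclass

SUGGESTIVE_LOD = 2.2  # suggestive threshold quoted for the conditional scans


@dataclass
class ConditionalScan:
    gene: str
    trait_lod_given_expression: float  # trait scan with expression as covariate
    eqtl_lod_given_trait: float        # expression scan with trait as covariate


def is_likely_causal(scan, threshold=SUGGESTIVE_LOD):
    """Expression explains the trait QTL, but the trait does not explain the eQTL."""
    return (scan.trait_lod_given_expression < threshold
            and scan.eqtl_lod_given_trait >= threshold)


# Hypothetical LOD scores, for illustration only (not real data).
scans = [
    ConditionalScan("GeneX", 1.1, 5.0),  # trait QTL disappears, eQTL persists
    ConditionalScan("GeneY", 3.0, 5.0),  # trait QTL persists
    ConditionalScan("GeneZ", 1.1, 1.0),  # both drop: direction is ambiguous
]

print([s.gene for s in scans if is_likely_causal(s)])
```

Requiring both conditions is what separates a gene whose expression plausibly mediates the QTL from one whose expression merely responds to the trait.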
To conclude, we identified new genomic loci regulating triglycerides in mice. Most of the QTL identified in this study are new. The development of advanced bioinformatic tools, expression QTL analysis, methods for causal inference, and large SNP databases will help to identify the causal genes for these QTL. Their discovery will provide potential new targets for drug development and lead to improved treatment for coronary artery disease.