Localization of genes for V+LDL plasma cholesterol levels on two diets in the opossum Monodelphis domestica.

Plasma cholesterol levels among individuals vary considerably in response to diet. However, the genes that influence this response are largely unknown. Non-HDL (V+LDL) cholesterol levels vary dramatically among gray, short-tailed opossums fed an atherogenic diet, and we previously reported that two quantitative trait loci (QTLs) influenced V+LDL cholesterol on two diets. We used hypothesis-free, genome-wide linkage analyses on data from 325 pedigreed opossums and located one QTL for V+LDL cholesterol on the basal diet on opossum chromosome 1q [logarithm of the odds (LOD) = 3.11, genomic P = 0.019] and another QTL for V+LDL on the atherogenic diet (i.e., high levels of cholesterol and fat) on chromosome 8 (LOD = 9.88, genomic P = 5 × 10⁻⁹). We then employed a novel strategy involving combined analyses of genomic resources, expression analysis, sequencing, and genotyping to identify candidate genes for the chromosome 8 QTL. A polymorphism in ABCB4 was strongly associated (P = 9 × 10⁻¹⁴) with the plasma V+LDL cholesterol concentrations on the high-cholesterol, high-fat diet. The results of this study indicate that genetic variation in ABCB4, or closely linked genes, is responsible for the dramatic differences among opossums in their V+LDL cholesterol response to an atherogenic diet.

candidate genes in our study population in an effort to identify the QTL influencing V+LDL cholesterol response to the HCHF diet.

Animals
The opossums used in this study were bred and raised at the Southwest Foundation for Biomedical Research (SFBR), San Antonio, Texas, which is accredited by the Association for Assessment and Accreditation of Laboratory Animal Care International. The origins of the laboratory population of M. domestica and the husbandry conditions have been described (17). All experimental protocols were approved by the SFBR Institutional Animal Care and Use Committee.
Plasma and DNA samples were collected from 331 animals (164 males and 167 females) in seven pedigrees (sizes = 106, 89, 84, 36, 10, 3, 3) comprising 28 partially inbred "parent" animals (14 males and 14 females) and their first- and second-generation offspring (see supplementary Fig. I). The 28 parents for these pedigrees were members of the offspring generation from the population studied in an earlier report on the inheritance of lipoprotein cholesterol (13), and their average inbreeding coefficient was 0.50. The first-generation offspring of the 28 parent animals were then mated (one brother with multiple sisters) to produce full-sibship litters within large half-sibships (the second generation). The power of genetic analyses to detect and locate QTLs depends in part on the numbers of different pairs of relatives. For example, in these pedigrees, we have 202 parent-offspring, 308 full-sibling, 168 half-sibling, 158 grandparent-grandchild, and numerous more-distant relationships.

Experimental protocol
All animals were maintained from weaning age on the basal diet, a commercial pelleted fox food diet, which is the standard for this species and which has moderate levels of fat and cholesterol [10.0% and 0.16% of dry weight, respectively (14); see supplementary Table I]. At approximately 5 months of age, animals were fasted overnight and bled via heart puncture while under methoxyflurane-induced anesthesia. Next, they were fed an HCHF diet enriched in fat (from lard) and cholesterol [18.8% and 0.71% of dry weight, respectively (14); see supplementary Table I]. After 8 weeks on the HCHF diet, the animals again were fasted overnight and bled.

Measurement of HDL and V+LDL cholesterol concentrations
From each of the basal and HCHF diet blood samples, EDTA plasma was prepared by low-speed centrifugation and cooled on ice, and plasma cholesterol concentrations were measured enzymatically using a clinical chemistry analyzer and commercial reagents (reagents from Boehringer-Mannheim Diagnostics, Indianapolis, IN; calibrators from Polestar Labs, Escondido, CA). When the measurement exceeded that of the highest calibrator (9.25 mmol/l), the sample was diluted with saline to bring it into range. HDL cholesterol concentrations were measured in the supernatant after precipitation of apoB-containing lipoproteins with heparin-Mg²⁺ (18); V+LDL cholesterol was calculated as the difference between total and HDL cholesterol levels.
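The calculation described above can be sketched as follows. This is a minimal illustration, not the laboratory's actual procedure: the function names are hypothetical, and scaling a diluted reading by its dilution factor is an assumption based on common practice rather than something stated in the text.

```python
CALIBRATOR_MAX_MMOL_L = 9.25  # highest calibrator, as stated in the text


def corrected_concentration(measured, dilution_factor=1.0):
    """Scale a reading taken on a diluted sample back to the undiluted value.

    Assumption: a sample diluted k-fold with saline is multiplied by k.
    """
    if dilution_factor < 1.0:
        raise ValueError("dilution_factor must be >= 1")
    return measured * dilution_factor


def v_ldl_cholesterol(total, hdl):
    """V+LDL cholesterol as the difference between total and HDL cholesterol."""
    if hdl > total:
        raise ValueError("HDL cholesterol cannot exceed total cholesterol")
    return total - hdl
```

For instance, a total cholesterol of 12.94 mmol/l with an HDL cholesterol of 1.83 mmol/l (illustrative inputs) gives a V+LDL cholesterol of 11.11 mmol/l.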

Polymorphisms and genotyping
DNA was prepared from liver collected from the 303 first- and second-generation offspring using the Wizard SV 96 Genomic DNA Purification System (Promega). All 303 animals were geno-by-diet effects on lipid levels have been performed (8), results have been inconsistent, most likely due to relatively small sample sizes or difficulties in measuring dietary components in human populations.
For many of these gaps in our knowledge base, a valid animal model can provide valuable insights. While an animal model may not precisely replicate human metabolism in every detail, effects of specific genes or pathways can be isolated under controlled and consistent conditions, a feature that is required to adequately characterize genotype-by-diet interactions. The gray, short-tailed opossum, Monodelphis domestica, is potentially an excellent candidate for this purpose: it is small, reaches sexual maturity in 5-6 months, breeds year-round, has large litters (mean size, eight), is maintained in colonies in at least seven countries, and is used in a variety of research areas, including comparative immunogenetics, neurobiology, carcinogenesis, endocrinology, fetal development, meiotic recombination, and more (9). In particular, lipoprotein metabolism in opossums is similar to human metabolism in many respects. For example, antibodies directed against human apolipoprotein B (apoB) and apoE cross-react with the equivalent opossum proteins (10). Furthermore, metabolic and expression studies indicate that aspects of cholesterol absorption, synthesis, and excretion are similar to those in humans (11,12). We also have observed heritability of key aspects of lipoprotein phenotypes that are similar to humans and, in particular, have observed a striking interindividual difference in non-HDL (V+LDL) cholesterol response to a high-cholesterol, high-fat (HCHF) diet (13). The V+LDL response largely involves increases of cholesterol rather than triglyceride levels [~30-fold vs. ~1.6-fold increases, respectively (14)], and the increase in V+LDL cholesterol is elicited by high dietary cholesterol alone but not high fat alone (although the V+LDL response to dietary cholesterol is amplified when dietary fat is also present).
In contrast, HDL cholesterol increases in response to dietary fat, and these responses are similar to those seen in other species, including primates (14). We previously reported evidence that plasma levels of HDL cholesterol and V+LDL cholesterol on the basal and HCHF diets are under the control of quantitative trait loci (QTLs) that account for a large proportion of the variation in HDL and V+LDL cholesterol in our population of animals (13). We hypothesized that the metabolic basis for this interaction with diet replicates a similar mechanism in humans and that it may underlie an important component of cardiovascular risk in humans associated with these highly atherogenic lipoproteins.
Recently, the opossum genome was sequenced (15) and we developed a genetic linkage map for this species composed largely of microsatellite markers (16). In the present study, we used these resources to map two of the previously detected QTLs onto the M. domestica genome and conduct comparative synteny analyses to derive a group of positional candidate genes, followed by a literature survey to identify high-priority candidates. We then analyzed the expression of these genes in liver (because of its central role in lipid metabolism), sequenced them to detect variants, and genotyped variants in and near the we tested whether the full model (containing parameters for two SNPs) differed from a nested model in which parameters for one or both SNPs were removed. If the nested model does not significantly differ from the full model, then the nested model is accepted as the better (or "best") model because it describes the data as well as the full model with fewer parameters. If the nested model differs significantly from the full model, then the full model is better. We also performed multipoint linkage scans while incorporating the candidate gene marker as a covariate in the model. If no significant evidence for a QTL remained after inclusion of the marker as a covariate, i.e., LOD score <3.0, we interpreted this result to indicate that a variant (or haplotype) near the marker might be causal. Plots and histograms were created in R (R Foundation for Statistical Computing, Vienna, Austria).

Identification of potential candidate genes in QTL regions
We used three unbiased, complementary approaches to identify a list of possible candidate genes in QTL regions of interest, i.e., genomic regions near QTLs that might influence V+LDL cholesterol on the basal and HCHF diets. First, we obtained the current annotated list of genes in the regions of interest from the University of California Santa Cruz (UCSC) Opossum Genome Browser Gateway (http://genome.ucsc.edu/cgi-bin/hgGateway). The opossum genome is extensively annotated; however, it contains many predicted genes of unknown identity ("novel genes") that are likely to have orthologs or paralogs in corresponding regions of the human genome that were not identified by the automated gene prediction algorithms (25). Second, to construct a more comprehensive catalog of genes likely to be present in the QTL regions of interest, we employed a comparative synteny strategy to identify comparable (homologous) regions in the much more thoroughly annotated human genome assembly. Specifically, we utilized the comparative synteny features of the Ensembl Opossum Genome browser (http://www.ensembl.org/Monodelphis_domestica/Info/Index) and UCSC Opossum Genome Browser Gateway to first determine the regions of the human genome that are homologous to the stretch of opossum genome contained within each QTL candidate region. We then used the Biomart feature of Ensembl (http://www.ensembl.org/biomart/martview/bc3dd7b7a5f66472ea30409b20483b62) to determine the annotated gene content and chromosome assembly coordinates for each gene within those homologous regions of the human genome corresponding to the opossum QTL regions. This output was compared with the most recent human genome annotation at the UCSC browser (GRCh37).
Wherever the UCSC data set contained a gene prediction not present in Ensembl Biomart output, this gene and its assembly coordinates were merged into the Biomart-based list to produce a comprehensive list of all genes and their genomic locations in the human homologous regions corresponding to the opossum QTL regions. Third, we used two public databases to identify a set of genes that had been tested for association with LDL cholesterol levels in at least one human population. We searched the National Human Genome Research Institute Catalog of Published Genome-Wide Association Studies (http://www.genome.gov/26525384) using the search term "LDL cholesterol" and listed all genes that were significantly associated (at a genome level of significance) with LDL cholesterol. We also searched the HuGE Navigator (http://www.hugenavigator.net/), a database of results from tests of associations between candidate genes and various traits in humans, using the "Phenopedia" option and the search term "Lipid Metabolism Disorders"; this database contains both positive and negative results. From these two databases, we obtained a list of 224 genes potentially associated with LDL metabolism in humans. To obtain the unbiased list of potential typed for 74 anonymous microsatellite markers and a microsatellite within the TDT (terminal deoxynucleotidyltransferase) gene. The map positions of these 75 markers are known from the opossum BBBX linkage map that comprises more than 330 autosomal and X-chromosome-linked loci (16; unpublished observations). Supplemental Table II lists references for the 42 previously published markers and gives descriptions and genotyping methods for the 33 marker systems described here for the first time. Genotyping was then performed on ABI 3100 or ABI 3730XL genetic analysis platforms (19,20). Quality control procedures were similar to those previously described (21).
Although the total number of base pairs in the opossum genome is similar to that of humans (15), the overall meiotic map length is comparatively small, ~867 cM (21). Therefore, initial genotyping was conducted for 66 marker loci across the eight autosomes: mean intermarker distance on each chromosome ranged from 12 to 17 cM. To further clarify the locations of the putative QTLs, we genotyped an additional nine markers in the QTL regions of interest; the locations of all 75 markers are given in supplementary Table II.

Statistical genetic analyses
All plasma cholesterol distributions were nonnormal. To reduce nonnormality prior to genetic analysis (22), V+LDL cholesterol on the basal diet was transformed by square-root, whereas V+LDL cholesterol on the HCHF diet and HDL cholesterol on both diets were transformed by natural logarithms. In addition, extreme outliers (±4 SD) were removed.
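The preprocessing described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the text does not say whether the ±4 SD rule was applied before or after transformation, so this sketch assumes it is applied to whatever list of values is passed in, using the sample mean and sample standard deviation.

```python
import math
import statistics


def transform(values, kind):
    """Apply the square-root or natural-log transform named in the text."""
    if kind == "sqrt":
        return [math.sqrt(v) for v in values]
    if kind == "ln":
        return [math.log(v) for v in values]
    raise ValueError(f"unknown transform: {kind}")


def drop_extreme_outliers(values, n_sd=4.0):
    """Keep values within n_sd sample standard deviations of the mean.

    Assumption: outliers are judged against the mean and SD of the
    values supplied (for example, the already transformed trait).
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]
```

Under this sketch, basal-diet V+LDL values would go through `transform(values, "sqrt")`, and HCHF-diet V+LDL and both HDL traits through `transform(values, "ln")`, before outlier screening.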
Heritabilities of plasma cholesterol levels were estimated using maximum likelihood variance components methods, in which phenotypic variation was modeled as follows:

$$ y_i = \mu + \sum_j \beta_j X_{ij} + g_i + e_i $$

where $y_i$ is the plasma cholesterol level for the $i$th individual, $\mu$ is the mean cholesterol level, $X_{ij}$ is the $j$th covariate for the $i$th individual, $\beta_j$ is the corresponding regression coefficient, $g_i$ is the additive genetic effect, and $e_i$ is the residual error effect, which includes unmeasured environmental and nonadditive genetic components. Residual heritability, $h^2_r$, is the proportion of total trait variance due to the additive genetic component after adjusting for environmental covariates. The likelihood ratio test was used to assess the significance of model parameters by comparing the full model (all covariates and additive genetic effects) with a nested model lacking a specific parameter (23). Because V+LDL on the HCHF diet was bimodal, the effects of sex, age, and weight were assessed within each modal group.
Two-point and multipoint linkage scans were performed using the variance components method, which extends the above model by including the effect of a putative QTL, $\sigma^2_{QTL}$, as a component of variance (23). Maximum likelihood methods were used to estimate $\sigma^2_{QTL}$ based on the expected covariance between relatives due to their identity by descent (IBD) at a given marker (two-point analyses) or at an arbitrary chromosomal location (multipoint analyses) in tight linkage with the presumed QTL. Multipoint IBD probabilities were estimated at 1 cM (Haldane) intervals from map positions 89-200 cM on chromosome 1 (Mdo 1) and 0-110 cM on Mdo 8 using a Markov Chain Monte Carlo algorithm, as implemented in Loki (24). The likelihood ratio test was used to compare the linkage model to the polygenic (i.e., no linkage, $\sigma^2_{QTL}$ = 0) model, and results were reported as the log10 of the likelihood ratio [logarithm of the odds (LOD) score]. Genetic analyses were performed using SOLAR software (23).
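As a sketch of how a LOD score relates to the two fitted models, the snippet below converts natural-log likelihoods into a LOD score and into a nominal (pointwise) P value. The function names and inputs are hypothetical. The P value uses a commonly applied convention for testing a single variance component on its boundary (a 50:50 mixture of a point mass at zero and a chi-square with 1 df); that convention is an assumption here, and it does not reproduce the genome-wide adjusted ("genomic") P values reported in this study.

```python
import math


def lod_score(loglik_linkage, loglik_polygenic):
    """log10 likelihood ratio computed from natural-log likelihoods."""
    return (loglik_linkage - loglik_polygenic) / math.log(10)


def pointwise_p_value(lod):
    """Nominal P value for one variance component tested at its boundary.

    Assumption: the test statistic 2*ln(10)*LOD follows a 50:50 mixture
    of a point mass at zero and a chi-square with 1 df.
    """
    if lod <= 0:
        return 1.0
    chi2 = 2.0 * math.log(10) * lod
    # Survival function of chi-square(1 df) is erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))
```

Under this convention, a LOD score of 3.0 corresponds to a pointwise P value of about 0.0001.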
After identifying single-nucleotide polymorphisms (SNPs) in high-priority candidate genes (see following section), pedigree-based association analyses were performed to test whether SNP genotypes were associated with V+LDL cholesterol. Specifically, threshold cycle, or CT, values) were averaged. For each assay in each animal, we obtained ΔCT (= CT gene − CT GAPDH) and tested for mean differences in ΔCT between the high and low responders on each diet using t-tests (26). With this sample size, we had 80% power at P < 0.05 to detect 2-5 unit differences in mean ΔCT, that is, 4- to 32-fold expression differences, given the standard deviation for the assays ranging from 1-3 units.
The cDNA sequences of INSIG1, ABCB1, and ABCB4 from five high and five low responders (see supplementary Table III) were inspected for SNP discovery. Overlapping primer sets for cDNA sequencing were designed for each gene (see supplementary Table V) and ordered from Bioneer, Inc. Twenty nanograms of cDNA was used as template for PCR with conditions optimized for each primer set. To minimize PCR products resulting from nonspecific priming, all PCR products were separated by 1% agarose gel electrophoresis and gel purified using the QIAquick Gel Extraction Kit (Qiagen). Subsequent PCR was conducted to amplify the gel-purified PCR products, and the resulting second PCR products were again purified using the QIAquick PCR Purification Kit (Qiagen). The forward primer of each primer set was then used to perform DNA sequencing on an ABI 3730 or ABI 3130 DNA analyzer platform using the manufacturer's specifications. All sequence data were viewed, aligned, and assembled using Sequencher 4.9 software (Gene Codes Corp.). Regions of ABCB4 and INSIG1 cDNAs bearing putative SNPs were resequenced using the reverse primer to verify the allelic variants.
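The ΔCT calculation and its translation into fold differences can be illustrated with a short sketch. The function names are hypothetical, and the conversion assumes that product doubles with every PCR cycle (100% amplification efficiency), which is the assumption implied by equating 2-5 ΔCT units with 4- to 32-fold differences.

```python
def delta_ct(ct_gene_replicates, ct_reference_replicates):
    """Mean CT of the target gene minus mean CT of the reference gene."""
    mean_gene = sum(ct_gene_replicates) / len(ct_gene_replicates)
    mean_ref = sum(ct_reference_replicates) / len(ct_reference_replicates)
    return mean_gene - mean_ref


def fold_difference(delta_ct_a, delta_ct_b):
    """Expression of group a relative to group b.

    Assumption: perfect doubling per cycle, so each unit of delta-CT
    corresponds to a 2-fold difference; a lower delta-CT means higher
    expression.
    """
    return 2.0 ** (delta_ct_b - delta_ct_a)
```

With these definitions, a 2-unit difference in mean ΔCT corresponds to a 4-fold expression difference and a 5-unit difference to a 32-fold difference.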

Genotyping polymorphisms in ABCB4 and INSIG1
All offspring were genotyped for polymorphisms in ABCB4 and near INSIG1. For ABCB4, the Ile235Leu SNP was genotyped using restriction site-generating PCR (RG-PCR) (27). Specifically, we generated an artificial Hpy188I restriction site in the A-bearing amplimer by creating a base mismatch near the 3′ end of the forward primer flanking the SNP site. Primers used for ABCB4 RG-PCR were F: 5′-TCCCTAAATATTTGGGTTTTATTGTTATTTCAG-3′ and R: 5′-CACCGTCTTAATGGCAGACA-3′, where F is the mismatch primer. This primer design renders the A-bearing amplimer, but not the T-bearing amplimer, susceptible to Hpy188I digestion, resulting in an artificial restriction fragment-length polymorphism (RFLP). The RFLP was visualized by electrophoresis of restriction digests on 4% NuSieve 3:1 agarose gels (FMC BioProducts) stained with ethidium bromide. We were unable to design a reliable genotyping strategy for the INSIG1 Arg24Gly SNP; thus, we developed a microsatellite polymorphism (8M706) that was ~8 kb upstream from INSIG1 as a marker for the INSIG1 region. Methods for 8M706 follow those described for the other microsatellites used in this study (see supplementary Table II).

Means and heritability of cholesterol levels on two diets
Plasma cholesterol levels were available for 325 animals (163 males and 162 females) with mean age at baseline equal to 161 days (range = 110-225 days) and mean weight (±SD) equal to 68 ± 18 g on the basal diet and 77 ± 20 g after 8 weeks on the HCHF diet. On the basal diet, mean (±SD) HDL and V+LDL cholesterol levels were 1.29 ± 0.23 mmol/l and 0.5 ± 0.19 mmol/l, respectively; whereas after 8 weeks on the HCHF diet, they increased to 1.83 ± 0.56 mmol/l and 11.11 ± 12.7 mmol/l, respectively (Table 1). Plasma cholesterol levels were transformed prior to genetic analyses, and these transformations revealed the striking bimodality for natural logarithm-transformed positional candidate genes for the V+LDL QTL on each diet, we merged this list of 224 genes with the comprehensive list of opossum homologs of human genes in the QTL regions of interest. Finally, to identify a subset of high-priority candidate genes from this unbiased list of positional candidate genes, we further assessed the published literature for the specific effects or associations of these genes with LDL-related traits in humans and animal models.

Expression and sequencing of high-priority candidate genes
From the 303 offspring, we identified 12 (6 male and 6 female) opossums with low and 12 (6 male and 6 female) with high V+LDL cholesterol on the HCHF diet, for which we also had stored liver samples. Low and high responders had V+LDL cholesterol <0.3 or >2.2 mmol/l (ln-transformed), respectively (see Fig. 1). These animals had liver samples collected immediately after the HCHF diet bleed (6 high and 6 low) or had liver samples collected after subsequently being fed the basal diet for 8-12 weeks (6 high and 6 low). Within each responder-by-diet group, animals were chosen to represent at least three different pedigrees (see supplementary Table III). Total RNA was extracted from the liver tissue of the 24 animals using the TRI-Reagent protocol (Molecular Research Center, Inc.). RNA extracts were DNase treated with the DNA-free Kit (Ambion) and used to synthesize double-stranded cDNA utilizing the SMARTer cDNA Synthesis Kit (Ambion). cDNA concentrations were determined using a Qubit fluorometer (Invitrogen).
Expression assays were conducted for opossum homologs of the high-priority candidate genes (ABCB1, ABCB4, INSIG1) and a ubiquitously expressed housekeeping gene, GAPDH, as a control for standardization. The Ensembl-predicted annotations for these genes were used to design forward and reverse PCR primers using PRIMER3.0 software (see supplementary Table IV). Twenty-five nanograms of cDNA were used to conduct real-time PCR (RT-PCR) assays, and the PCR products were detected on a high-throughput gene quantification platform, LightCycler® 480 (Roche). Each reaction was performed in duplicate to confirm results, and the duplicate measures of relative abundance (i.e., we obtained very strong evidence for a QTL that accounted for ~65% of the additive genetic variation in V+LDL cholesterol on the HCHF diet. This QTL was located between 8M253 and 8M705, maximum LOD score = 9.88 at position 85 cM (Fig. 2B). The 2-LOD support interval comprises the QTL region of interest on each chromosome (28). Our region of interest for V+LDL cholesterol on the basal diet on Mdo 1 ranged from 1M637 to 1M372 (the q-end of the chromosome); whereas for V+LDL cholesterol on the HCHF diet on Mdo 8, the QTL region of interest was from 8M031 to 8M431 (Fig. 2).

Identification of candidate genes
To identify potential candidate genes that might influence V+LDL cholesterol on the basal and HCHF diets, we interrogated the opossum genome database and also identified homologous chromosomal regions in humans (Homo sapiens, Hsa) that corresponded to our QTL regions of interest on opossum (Mdo) chromosomes 1 and 8. The QTL region of interest on Mdo 1 exhibited conserved synteny with regions of Hsa 2 and 19 that contained a combined total of 17 known genes, whereas the Mdo 8 region of interest was homologous to regions of Hsa 3, 7, and 10 and contained 215 known genes (see supplementary Table VII). Of the 17 known genes in the QTL region of interest on Mdo 1, only two genes, Alstrom syndrome 1 (ALMS1) and lipin 1 (LPIN1), were also on the list of genes that had been tested for association with lipoprotein-related traits in humans. We did not choose to follow up the Mdo 1 QTL in this report (see DISCUSSION).
For the QTL region of interest on Mdo 8, we identified eight genes (or their paralogs) that were also on the unbiased list of potential candidates for "LDL cholesterol" or "Lipid Metabolism Disorders" (Table 2). These genes included: nitric oxide synthase 3 (eNOS, NOS3), insulin-induced gene 1 (INSIG1), cubilin (CUBN), integrin α8 (ITGA8), ATP binding cassette, subfamily B, members 1 and 4 (ABCB1 and ABCB4), calpain 7 (CAPN7), and UDP-N-acetylgalactosaminyltransferase-like 2 (GALNTL2). We next surveyed the literature to determine which specific traits or functions had been associated with these genes in humans or mice (Table 2). Based on their involvement in various aspects of LDL metabolism, INSIG1, ABCB4, and ABCB1 were deemed high-priority candidate genes for the QTL on Mdo 8 that influences V+LDL cholesterol response to diet. V+LDL levels (in ln mmol/l) on the HCHF diet (Fig. 1), as has been previously reported (13).
We next estimated the heritability and the effects of age, sex, and weight on the four plasma cholesterol traits. Compared with males, females had higher HDL on the basal diet and the HCHF diet (P = 0.0009 and 10⁻⁹, respectively) and lower V+LDL cholesterol on the basal diet (P = 0.001); however, there was no difference between sexes for V+LDL on the HCHF diet (see supplementary Table VI). V+LDL cholesterol on the basal diet decreased with increasing age and age² (P = 0.003 for both), but age did not influence any other trait. Weight was not significant for any trait. The significant covariates accounted for 0-12% of the total variation for each trait (Table 1). Residual heritabilities ($h^2_r$) ranged from 0.13 to 0.67 (Table 1) and were highly significant for HDL and V+LDL cholesterol on the basal diet and V+LDL on the HCHF diet. On the other hand, residual heritability was low for HDL cholesterol on the HCHF diet ($h^2_r$ = 0.13).

Expression studies of candidate genes for the QTL on Mdo 8
Using RT-PCR analysis, we compared expression of INSIG1, ABCB1, and ABCB4 in liver samples from both high- and low-V+LDL-responder opossums on both the basal and HCHF diets. Specifically, samples from 6 high- and 6 low-responder animals on the HCHF diet as well as 6 high- and 6 low-responder animals on the basal diet were examined. There was no significant difference between high- and low-V+LDL responders in mean expression for any of the three candidate genes on either diet, although expression of ABCB1 showed a borderline significant difference on the HCHF diet (P = 0.052, Table 3).

Sequence analysis of INSIG1, ABCB1, and ABCB4
We next looked for exon variants in the candidate genes. Two SNPs were identified in the INSIG1 sequence (see supplementary Table VIII), including a nonsynonymous C/G SNP for amino acid 24 in the mRNA sequence predicted to result in an Arg → Gly substitution. Of nine SNPs identified in the ABCB4 sequence (see supplementary Table VIII), two were predicted to be nonsynonymous: an A/G SNP predicting an Arg → Gly substitution at amino acid 29, and an A/T SNP predicting an Ile → Leu substitution at amino acid 235. SNP detection proved impractical for ABCB1 due to the apparent coamplification of its sequence with that of an annotated pseudogene, resulting in a very high false SNP rate. Overall protein sequence similarities (human to mouse vs. human to opossum) based on BLAST alignment algorithms were 82% versus 85% for supplementary Table IX). However, only Ile235 in ABCB4 was conserved across all tabulated vertebrate species (from human to chicken). Among the 10 sequenced opossums, the genotypes for the nonsynonymous SNPs in ABCB4 are not strongly correlated with high- versus low-responder group, although the genotypes are correlated with each other (Table 4). Because ABCB1 and ABCB4 are virtually contiguous in the genome (Table 2), the ABCB4 SNP marks the region containing both genes. Similarly, the Arg24Gly SNP in INSIG1 is not strongly correlated with responder group, but is correlated with the microsatellite, 8M706, which was used to mark the INSIG1 region (Table 4).

Association and linkage analyses of variants for INSIG1 and ABCB4 regions
All animals with DNA were genotyped for the INSIG1 marker (8M706) and the ABCB4 Ile235Leu SNP. As expected from their genome location (Table 2), the ABCB4 SNP and INSIG1 marker showed ~12% recombination (LOD score for linkage = 33.1). We next assessed whether variation at either or both of these polymorphisms was significantly associated with V+LDL cholesterol levels on the genotypes alone best described the variation in V+LDL cholesterol on the HCHF diet and that this model is significantly better than a model with no SNPs (P = 9 × 10⁻¹⁴). Although the ABCB4 SNP is significantly associated with V+LDL cholesterol on the HCHF diet and accounts for 18% of the total variation in V+LDL cholesterol, the SNP genotype means do not correspond well with the bimodality seen in Fig. 1. Median V+LDL cholesterols for the AA, AT, and TT genotypes were −0.17, 1.58, and 2.87 (ln mmol/l), respectively, and there was considerable variation in V+LDL cholesterol within each genotype (Fig. 3), indicating that this variant is not likely to be the sole causal SNP for V+LDL cholesterol variation on the HCHF diet, although it may be in linkage disequilibrium with a causal variant or variants.
We performed two-point QTL analyses between V+LDL cholesterol and these variants; the two-point LOD scores were 7.94 and 1.56 for ABCB4 and INSIG1, respectively. We also reran the multipoint linkage analyses to include ABCB4 and INSIG1 marker genotypes as covariates in the linkage analysis model to determine whether significant evidence for linkage remained after including the effects of each polymorphism. Using data on the same subset of 289 animals described above, the maximum LOD score was 8.68 at position 85 cM (Fig. 2B). Inclusion of the ABCB4 SNP in the multipoint linkage model obliterated the QTL signal on Mdo 8; i.e., the maximum LOD score at 85 cM dropped from 8.68 to 0.14. In contrast, inclusion of INSIG1 in the linkage model did not completely remove evidence for linkage; at position 85, the maximum LOD score was 4.5 (P = 10⁻⁵), which remains highly significant evidence for a QTL at this position (Fig. 2B). Furthermore, HCHF diet and whether inclusion of these SNP genotypes as covariates in the linkage analysis models would reduce the QTL signal on Mdo 8. For appropriate comparison of results, all association and linkage analyses described below were performed using data on the subset of 289 opossums that had valid V+LDL cholesterol measures on the HCHF diet and valid genotypes at both the ABCB4 and INSIG1 markers. Using pedigree-based association methods, the model containing neither SNP was highly significantly different from the model containing both variants (P = 9 × 10⁻¹⁴, Table 5), indicating that ABCB4 or INSIG1 or both variants were significantly associated with V+LDL cholesterol levels. The model containing ABCB4 alone was not significantly different (P = 0.70) from the full model; thus, ABCB4 alone described the variation in V+LDL cholesterol as well as a model with both variants, but with fewer parameters.
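Because the genotype medians are reported on the natural-log scale, it can help to see them on the original scale. The snippet below back-transforms the three medians quoted in the text; the variable names are illustrative only.

```python
import math

# Median V+LDL cholesterol by ABCB4 genotype, in ln(mmol/l), from the text.
medians_ln = {"AA": -0.17, "AT": 1.58, "TT": 2.87}

# Exponentiating a median on the log scale gives the median in mmol/l,
# because the exponential function preserves the ordering of values.
medians_mmol_l = {genotype: math.exp(value) for genotype, value in medians_ln.items()}

for genotype, value in medians_mmol_l.items():
    print(f"{genotype}: {value:.2f} mmol/l")
```

The back-transformed medians are roughly 0.84, 4.85, and 17.6 mmol/l for AA, AT, and TT, respectively, so the TT median is about 20 times the AA median.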
In contrast, the model containing INSIG1 alone was significantly different from the full model (P = 7 × 10⁻⁹); that is, removing ABCB4 from the full model significantly worsened the fit. Together, these comparisons indicate that if ABCB4 is in the model, INSIG1 does not provide any additional information.
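The nested-model logic used in these comparisons can be sketched as a likelihood ratio test. This is a generic illustration, not the SOLAR implementation: the function names and the log-likelihood values in the usage note are hypothetical, and the closed-form chi-square tail probabilities below cover only 1 or 2 removed parameters.

```python
import math


def chi2_survival(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("closed form provided only for df = 1 or 2")


def compare_nested(loglik_full, loglik_nested, df, alpha=0.05):
    """Likelihood ratio test of a nested model against the full model.

    df is the number of parameters removed from the full model.
    Returns (nested_model_accepted, p_value).
    """
    statistic = max(2.0 * (loglik_full - loglik_nested), 0.0)
    p_value = chi2_survival(statistic, df)
    # The nested model is preferred when it is not significantly worse.
    return p_value >= alpha, p_value
```

For example, with hypothetical natural-log likelihoods of -500.00 (full) and -500.07 (nested, one parameter removed), the test gives P of about 0.71 and the simpler model is retained; with -500.00 versus -520.00 the nested model is rejected.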
These results indicate that the model containing ABCB4 generation offspring of animals analyzed in the previous study strongly support this conjecture. Locating QTLs for specific traits, either by linkage or association analyses, is the first step toward the goal of identifying candidate genes and, ultimately, causative variants. A potential limitation of the opossum model is that the genome sequence is not as well annotated, nor is the panel of genetic variants as extensive as those of human and some other model species. However, we used a novel approach that incorporated information from a variety of public genomic databases to identify the corresponding conserved syntenic regions in humans and then used the human genome annotation to create a list of known genes in the region. Of the annotated genes that were also orthologs or paralogs of genes on our list of 224 candidate genes (based on two public databases of association studies in humans), we derived a subset of candidate genes (Table 2) residing within the QTL intervals. We then did translated protein BLAST (tBLASTn) analyses of these genes to confirm that the orthologs or paralogs of these genes are indeed present in the opossum chromosomal regions of interest. The power in this multistage analytical approach is that it is unbiased and solely informed by publicly available information. After identifying a set of positional candidates, we surveyed the literature to select high-priority candidate genes for follow-up studies.
We identified two potential candidate genes, ALMS1 and LPIN1, located in the QTL region of interest on Mdo 1; however, they are relatively distant from the QTL (see supplementary Table VII), and a survey of the literature is not strongly supportive of either as candidates for this QTL. Although one of the characteristics of individuals with Alstrom syndrome (caused by mutations in ALMS1) is dyslipidemia, and mice without functional LPIN1 exhibit hypertriglyceridemia and fatty livers, studies of variants in these two genes in other human populations revealed no strong association with plasma lipid levels (43,44). These observations suggest that another, potentially novel, gene may influence V+LDL cholesterol on the basal diet. In fact, for future studies, we could expand our list of candidate genes to include all proteins, enzymes, and receptors that are known to be involved in lipoprotein metabolism using, for example, the STRING database of proteins and their although the Mdo 8 QTL signal was almost abolished, the ABCB4 SNP only accounted for approximately half of the additive genetic variation due to the QTL. These association and linkage results indicate that the region marked by the ABCB4 SNP, but not by the INSIG1 variant, accounts for much but not all of the variation in V+LDL cholesterol due to the Mdo 8 QTL. Thus, additional variants in the ABCB4 region remain to be detected.

DISCUSSION
In the current study, we performed unbiased, "hypothesis-free" genome-wide linkage analyses of plasma cholesterol levels in opossums fed two different diets. We obtained strong evidence (maximum multipoint LOD = 3.1, genomic P = 0.019, Fig. 2A ) that a QTL on Mdo 1 infl uenced V+LDL cholesterol levels on the basal diet and that a QTL on Mdo 8 infl uenced V+LDL cholesterol levels on the HCHF diet (maximum multipoint LOD = 9.9, genomic P = 5 × 10 Ϫ 9 , Fig. 2B ). These two QTLs accounted for ‫ف‬ 70% and ‫ف‬ 65% of the residual additive genetic variance of V+LDL cholesterol levels on the basal and HCHF diets, respectively, in these pedigrees. The linkage results are completely consistent with our previous analyses, in which we reported that a major gene accounted for ‫ف‬ 50% of the genetic variation in V+LDL cholesterol on a basal diet, and a separate major gene accounted for ‫ف‬ 80% of the genetic variation in V+LDL cholesterol on the HCHF diet ( 13 ). Because we do not have DNA samples from the animals used in our previous studies, we could not perform a combined segregation and linkage analysis to more conclusively demonstrate that the previously detected major genes are, in fact, the Mdo 1 and Mdo 8 QTLs. However, our current results using data from the fi rst-and second-  expression in response to the HCHF diet, that is, reduced expression in both responder groups, but lowest in the high-V+LDL responder group, was also observed for genes involved in cholesterol biosynthesis, e.g., HMG-CoA reductase, among opossums fed a high-cholesterol diet ( 11 ). Thus, these differences in expression may result as a general response to maintain cholesterol homeostasis. However, this interpretation is limited by our sample size. 
To detect SNPs that might affect structural characteristics of the INSIG1, ABCB1, and ABCB4 proteins, and, consequently, differential V+LDL cholesterol response to the HCHF diet, we sequenced all exons of the corresponding mRNAs (isolated from liver) corresponding to our candidate genes from fi ve high-and fi ve low-V+LDL responders. We detected one and two nonsynonymous SNPs in INSIG1 and ABCB4 , respectively, but the genotypes at these SNPs were not highly correlated with the high-versus low-V+LDL cholesterol response groups ( Table 4 ). Thus, these structural variants do not, by themselves, account for the differences in V+LDL cholesterol on the HCHF diet. Unfortunately, due to presumed coamplifi cation of sequence from an ABCB1 pseudogene, we were unable to identify SNPs in this candidate gene.
On the basis of the above expression and sequencing results, we were unable to identify variants in any of our candidate genes that would differentiate between high-and low-V+LDL cholesterol response groups. We next performed association and linkage analyses of polymorphisms in (or near) these candidate genes in an effort to narrow the candidate gene region. Results of association analyses revealed that the ABCB4 SNP, but not the INSIG1 marker, was signifi cantly associated with V+LDL cholesterol levels on the HCHF diet ( P = 9 × 10 Ϫ 14 ). Furthermore, inclusion of the ABCB4 SNP genotypes in the multipoint linkage model removed the QTL signal, whereas inclusion of the INSIG1 marker genotypes did not ( Fig. 2B ). However, variation marked by the ABCB4 Ile235Leu SNP genotype is associated with only ‫ف‬ 35% of the additive genetic variation in V+LDL cholesterol ( Fig. 3 ), whereas the Mdo 8 QTL accounted for ‫ف‬ 65% of the additive genetic variance. Thus, substantial genetic variation at this or other nearby genes remains to be identifi ed, and some of it could well be due to interactions among multiple genetic variants in this region, as well as to epistatic effects of genes on other chromosomes. In support of the latter concept, our previous study ( 13 ) indicated that the QTL infl uencing V+LDL cholesterol on the basal diet also infl uenced V+LDL cholesterol on the HCHF diet; in the current report, the basal diet QTL is located on Mdo 1.
The linkage and association results strongly indicate that the chromosomal region near ABCB4 contains variants that have large effects on V+LDL cholesterol on the HCHF diet. Mutations in the phospholipid transporter gene ABCB4 are associated with low levels of biliary phospholipids and a growing list of liver diseases in humans and mice ( 36,46 ). In mice, ABCG5 and ABCG8 appear to require intact ABCB4 (a.k.a. MDR2 ) for secretion of cholesterol into bile and, overall, "to infl uence the absorption, secretion, and plasma levels of neutral sterols" ( 47 ). Although interactions (http://string.embl.de/) and/or the Ingenuity Pathway Analysis program (http://www.ingenuity.com/).
In contrast to the QTL on Mdo 1, we identifi ed three high-priority candidate genes ( INSIG1 , ABCB1 , and ABCB4 ) for the QTL on Mdo 8 that infl uences V+LDL cholesterol on the HCHF diet. INSIG1 codes for a protein involved in the regulation of the synthesis of cholesterol, fatty acids, triglycerides, and phospholipids ( 32,33 ). ABCB1 and ABCB4 are large, membrane-spanning P-glycoprotein hepatobiliary drug transporters that transport amphipathic molecules and phospholipids, respectively ( 45 ). ABCB1 variants have been associated with the LDL cholesterol response to drugs ( 39 ); whereas humans and mice with mutations or disruptions in ABCB4 exhibit an impaired ability to secrete phospholipid into bile and develop liver disease ( 36,37 ). We further investigated these candidate genes using expression studies, sequencing, and association analyses.
First, we assessed whether mean differences in V+LDL cholesterol levels between high versus low responders could be due to differences in mRNA expression of the candidate genes. Although our expression study was limited because we only used liver samples, a strong result could have been informative. There were no signifi cant differences in hepatic mRNA expression of the candidate genes between low-versus high-V+LDL cholesterol responders on the basal diet, or for mean expression of INSIG1 or ABCB4 on the HCHF diet, but the difference in ABCB1 expression was borderline (nominal P = 0.052). Mean ABCB1 expression in the high-versus low-V+LDL cholesterol response groups decreased by ‫ف‬ 3 ⌬ C T units (or ‫ف‬ 8-fold), whereas mean V+LDL cholesterol serum levels increased by ‫ف‬ 25-fold. Furthermore, expression of all three genes appears to decline on the HCHF diet. This same pattern of ) with ABCB4 SNP genotypes. The solid line represents the median, the boxes encompass the interquartile range (IQR) (the middle 50% of data), and the whiskers mark the1.5 × IQR limits of the upper and lower quartiles. Individual circles are points that fall outside the 1.5 × IQR boundaries . many patients with impairment of ABCB4 have cholesterol levels in the normal range ( 37 ), a recent study has reported signifi cant associations in humans of plasma cholesterol levels with two tagging SNPs in ABCB4 ( 38 ). These observations are consistent with our hypothesis that genetic variation at or near the ABCB4 locus is at least partly responsible for the observed differences in V+LDL cholesterol on the HCHF diet. 
This hypothesis is further supported by previous comparisons of opossums with high- versus low-V+LDL cholesterol: high-responder animals had significantly higher hepatic cholesterol, substantially lower biliary phospholipid and cholesterol levels, and larger livers (12,48), which are similar to phenotypic differences associated with different ABCB4 alleles observed in humans and mice. Although ABCB4 is a plausible candidate gene to influence at least part of the observed V+LDL response to diet, we are approaching the limits of information obtainable from additional statistical genetic analyses of the current population because both our sample size and the opossum meiotic map are relatively small. For example, we are not able to determine whether the ABCB4 SNP could be marking variants in ABCB1 or other nearby genes, including micro-RNAs. Thus, additional biochemical and molecular genetic studies will be needed to identify the specific variants and mechanisms responsible for the differential V+LDL cholesterol response to an HCHF diet.
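The linkage evidence discussed above is reported as LOD scores, some with pointwise P values. As a rough guide to how the two scales relate, the following is a minimal sketch, not the analysis software used in this study. It assumes the common variance-component convention that 2 ln(10) × LOD is distributed as a 50:50 mixture of a point mass at zero and a 1-df chi-square. The function names are hypothetical, and the results are pointwise approximations that need not reproduce the reported values; the genome-wide ("genomic") P values quoted in the text involve an additional correction that is not modeled here.

```python
import math

LN10 = math.log(10.0)


def lod_to_chi_square(lod: float) -> float:
    """Convert a LOD score (log10 likelihood ratio) to a likelihood-ratio chi-square."""
    return 2.0 * LN10 * lod


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate pointwise P value for a variance-component linkage LOD.

    Assumption (not taken from the paper): the statistic follows a 50:50
    mixture of a point mass at zero and a 1-df chi-square, so the 1-df
    tail probability is halved.
    """
    if lod <= 0.0:
        return 1.0
    chi_sq = lod_to_chi_square(lod)
    # Survival function of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi_sq / 2.0))


# LOD scores at 85 cM on Mdo 8 quoted in the text.
for label, lod in [
    ("no SNP covariate", 8.68),
    ("INSIG1 covariate", 4.5),
    ("ABCB4 covariate", 0.14),
]:
    print(
        f"{label}: LOD {lod:.2f} -> chi-square {lod_to_chi_square(lod):.2f}, "
        f"pointwise P ~ {lod_to_pointwise_p(lod):.1e}"
    )
```

Under this convention a LOD of 3 corresponds to a pointwise P of about 10⁻⁴, which is why a residual LOD of 4.5 still counts as strong evidence whereas a LOD of 0.14 does not.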
In conclusion, we observed dramatic interindividual variation in V+LDL cholesterol response to dietary environment in the opossum, and detected evidence of diet-by-genotype interaction, in that the predominant QTLs for V+LDL cholesterol levels on different diets reside on different chromosomes. We adopted a novel, unbiased, "hypothesis-free" approach, combining linkage analyses, comparative synteny/homology methods, and published information on genes associated with lipoprotein physiology in human and mouse models to identify high-priority candidate genes for the V+LDL cholesterol response (INSIG1, ABCB1, and ABCB4). We then performed gene expression and sequencing analyses, followed by association and linkage analyses, in an effort to include or exclude these candidate gene regions. Our results indicate that the chromosomal region encompassing ABCB1/ABCB4 is likely to harbor variants that affect V+LDL cholesterol response to an HCHF diet. These findings illustrate how combining disparate methodologies can be used to identify candidate genes for quantitative traits, such as V+LDL cholesterol response to diet, in nontraditional animal models such as the opossum.
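The expression results summarized in this discussion are given in ΔCT units, with a change of about 3 ΔCT units for ABCB1 described as roughly 8-fold. A minimal sketch of that conversion follows, assuming perfect doubling of product per PCR cycle (an amplification base of 2); the function name and the `efficiency` parameter are illustrative and not taken from the study.

```python
def fold_change_from_delta_ct(delta_ct: float, efficiency: float = 2.0) -> float:
    """Fold difference in template abundance implied by a threshold-cycle difference.

    Assumption: each PCR cycle multiplies the product by `efficiency`
    (2.0 means perfect doubling), so a difference of n cycles corresponds
    to an efficiency**n fold difference in starting template.
    """
    return efficiency ** delta_ct


# A difference of about 3 cycles corresponds to about an 8-fold difference.
print(fold_change_from_delta_ct(3.0))
```

Because a higher threshold cycle means less starting template, a 3-cycle increase corresponds to an approximately 8-fold lower expression level, matching the direction described in the text.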