Plasma lipid profiling in a large population-based cohort.

We have performed plasma lipid profiling using liquid chromatography electrospray ionization tandem mass spectrometry on a population cohort of more than 1,000 individuals. From 10 μl of plasma we were able to acquire comparative measures of 312 lipids across 23 lipid classes and subclasses including sphingolipids, phospholipids, glycerolipids, and cholesterol esters (CEs) in 20 min. Using linear and logistic regression, we identified statistically significant associations of lipid classes, subclasses, and individual lipid species with anthropometric and physiological measures. In addition to the expected associations of CEs and triacylglycerol with age, sex, and body mass index (BMI), ceramide was significantly higher in males and was independently associated with age and BMI. Associations were also observed for sphingomyelin with age but this lipid subclass was lower in males. Lysophospholipids were associated with age and higher in males, but showed a strong negative association with BMI. Many of these lipids have previously been associated with chronic diseases including cardiovascular disease and may mediate the interactions of age, sex, and obesity with disease risk.


Lipid analysis
Lipid analysis was performed by LC ESI-MS/MS using an Agilent 1200 liquid chromatography system and Applied Biosystems API 4000 Q/TRAP mass spectrometer with a turbo-ionspray source (350°C) and Analyst 1.5 and Multiquant data systems. Liquid chromatography was performed on a Zorbax C18, 1.8 μm, 50 × 2.1 mm column (Agilent Technologies). Solvents A and B consisted of tetrahydrofuran:methanol:water in the ratios 30:20:50 and 75:20:5, respectively, both containing 10 mM ammonium formate. Columns were heated to 50°C and the auto-sampler regulated to 25°C. Diacylglycerol (DG) and triacylglycerol (TG) species (1 μl injection) were separated using an isocratic flow (100 μl/min) of 85% solvent B over 6 min. All other lipid species (5 μl injection) were separated under gradient conditions (300 μl/min): 0% solvent B to 100% solvent B over 8.0 min, 2.5 min at 100% solvent B, a return to 0% solvent B over 0.5 min, then 10.5 min at 0% solvent B prior to the next injection. Representative chromatograms are shown in Fig. 1.

Method development and identification of plasma lipid species
Internal standards were available for most lipid classes and subclasses investigated. Using direct infusion experiments of these standards, declustering potential, collision energy, and exit potential were optimized to give maximum response. Using these values, precursor ion scans and neutral loss scans were performed on a lipid extract of pooled plasma obtained from healthy volunteers to identify the major lipid species of the following classes (Table 2). Species that were chromatographically separated and gave a signal within the linear range of response (see Linearity of response below and Table 3) were subsequently incorporated into multiple reaction monitoring (MRM) experiments for comparative analysis. In the case of DG and TG species, the acyl chains were identified by chromatographically aligning the different neutral loss scans of each precursor mass. In cases where more than one combination of acyl chains was possible for one precursor mass, a unique fragment was chosen to distinguish between isomers that were not chromatographically separated. MRM experiments, established for each lipid species, were combined into two scheduled MRM experiments whereby data from each MRM was only collected during its retention time window (± 30 sec) (see Table 2 and supplementary Table I).

Acquisition of comparative lipidomic data
Comparative lipid abundances were calculated by relating the peak area of each species to the peak area of the corresponding internal standard. Peak integration was performed using AB Sciex MultiQuant software v1.2. Total measured lipids of each class were calculated by summing the abundance of individual lipid species. In a number of cases, described below, correction factors were applied.
DG and TG. Fragmentation of the ammoniated adducts of DGs and TGs leads to the loss of ammonia and a fatty acid.
Large population-based studies with hundreds or thousands of samples, such as ours, necessitate high throughput analytical methodology. A plethora of analytical strategies have been developed for performing lipidomic profiling (9). Here, we have combined a single phase extraction method with a targeted lipidomic approach using liquid chromatography electrospray ionization tandem mass spectrometry (LC ESI-MS/MS) to compare over 300 individual plasma lipids in a large population-based cohort from the San Antonio Family Heart Study (SAFHS) (n = 1,076) (10). The large sample size provides us with statistical power to examine novel associations between circulating molecular lipid species and common anthropometric, physiological, and lifestyle measures (age, sex, obesity, and smoking) at a population level.

Cohort
The SAFHS investigated the genetics and risk factors of cardiovascular disease (CVD) in Mexican Americans by profiling 1,431 individuals in 42 extended families at baseline (10). All procedures were approved by the institutional review board, and all subjects gave informed consent. Plasma cholesterol, HDL cholesterol, triglycerides, glucose, and insulin were measured (Table 1). Plasma samples were collected and stored at −75°C. Extensive genomic and gene expression profiling has been performed and genome wide association studies (GWAS) have identified many loci relating to type 2 diabetes, CVD, and other complex diseases (11-15).

Extraction procedure
Plasma samples from the SAFHS for which we had complete data (n = 1,076) were randomized prior to lipid extraction. Samples were thawed and 1 μl of the anti-oxidant butylhydroxytoluene (BHT) (100 mM in ethanol) per 1,000 μl of plasma was added. To each plasma sample (10 μl) a mixture of internal standards in chloroform:methanol (1:1, 15 μl) was added. The internal standards comprised lipids which are either stable isotope labeled or nonphysiological, and so present in plasma at extremely low concentrations (Table 2). Lipids were extracted in a single phase chloroform:methanol (2:1) procedure as described previously (16).
variable. We applied linear regression to identify linear associations of each individual lipid species and each lipid class or subclass to age and body mass index (BMI) adjusting for appropriate covariates in each analysis. Logistic regression was used for analyzing and describing the relationship between lipids and dichotomous dependent variables in the SAFHS study. It relates the log odds of the probability of an event to a linear combination of the
However, DG can also lose water, which must also be considered to avoid erroneous assignment of the fatty acids in each DG species. For species containing different fatty acids, multiple product ions corresponding to the loss of each of the fatty acids will be formed and the signal divided between these competing pathways. While we would ideally monitor each of these losses with a separate MRM transition, the number of MRM transitions that would be required was too great to be compatible with the chromatographic timescale on which we were working. As a result, a single MRM transition was used to monitor each DG and TG. In this context it is important to recognize that for species which contain more than one of the same fatty acid, the loss of that fatty acid will result in an enhanced signal, as it is the end product from two competing pathways.
Consequently, where we used an MRM transition that corresponded to the loss of a fatty acid that was present more than once, we divided by the number of times that fatty acid was present. While we recognize that the response factor for different species of TG varied substantially, the lack of suitable standards precluded the determination of suitable response factors for each TG species.
CE. Response factors were determined with seven commercially available species and used to create a formula to extrapolate for all CE chain lengths and double bonds. Saturated species were characterized by the following relationship: y = 0.74x − 10.56, where y is the response factor relative to the CE 18:0 d6 internal standard and x is the carbon chain length. For monounsaturated species, the response factor was multiplied by 1.62 and for polyunsaturated species by 4.40.
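The correction for repeated fatty acids can be illustrated with a minimal Python sketch. The function name, the acyl chain lists, and the signal values are hypothetical and chosen only for illustration; the sketch assumes that the acyl composition of each species is already known.

```python
def corrected_signal(signal, acyl_chains, monitored_loss):
    """Divide an MRM signal by the number of copies of the monitored fatty acid.

    acyl_chains: list of fatty acids in the species, e.g. ["16:0", "18:1", "18:1"].
    monitored_loss: the fatty acid whose neutral loss defines the MRM transition.
    """
    copies = acyl_chains.count(monitored_loss)
    if copies == 0:
        raise ValueError("monitored fatty acid is not part of this species")
    return signal / copies


# Hypothetical signals: the first TG carries 18:1 twice, the second carries 18:2 once.
tg_repeated = corrected_signal(9000.0, ["16:0", "18:1", "18:1"], "18:1")
tg_single = corrected_signal(9000.0, ["16:0", "18:1", "18:2"], "18:2")
```

In this example the signal of the species carrying 18:1 twice is halved, while the species carrying the monitored fatty acid once is left unchanged.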
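As an illustration, the CE response factor relationship could be coded as follows. This is a sketch of the stated relationship only; the function name is hypothetical, and because the text does not give the range of chain lengths covered by the seven standards, no range check is attempted.

```python
def ce_response_factor(chain_length, double_bonds):
    """Response factor of a CE species relative to the CE 18:0 d6 internal standard.

    Saturated species follow y = 0.74x - 10.56 (x = carbon chain length);
    the result is multiplied by 1.62 for monounsaturated species and by
    4.40 for polyunsaturated species, as stated in the text.
    """
    if double_bonds < 0:
        raise ValueError("double_bonds must be non-negative")
    base = 0.74 * chain_length - 10.56
    if double_bonds == 0:
        return base
    if double_bonds == 1:
        return base * 1.62
    return base * 4.40


rf_saturated = ce_response_factor(18, 0)
rf_mono = ce_response_factor(18, 1)
rf_poly = ce_response_factor(18, 2)
```

The linear fit becomes non-positive for short chains, so any practical use would need to be restricted to the chain lengths for which the fit was derived.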

Statistical analysis
Linear regression was used for analyzing and describing the linear relationship between lipids and selected characteristics in the SAFHS study. The β-coefficient describes the slope of the regression line and reflects the amount of variance of the dependent variable that is explained by variation of the independent
PIS, precursor ion scan; NL, neutral loss scan; DP, declustering potential; EP, entrance potential; CollE, collision energy; CXP, collision cell exit potential.
a Amount of internal standard per sample.  Table I).
Further, as PC and SM undergo the same fragmentation to produce the phosphocholine head group with m/z 184, they can potentially be convoluted if not separated chromatographically. The chromatography provided clear separation of the PC isotopologues from the isobaric SM species (Fig. 3A), but the SM isotopologues did coelute with the PC species (Fig. 3B, C). In many instances this did not make a significant difference to the PC species as the SM isotopologue was of relatively low abundance. In cases where the signal from the SM isotopologue represented more than 5% of the PC species, the PC species was excluded from the analysis (species excluded on this basis were PC 32:3, PC 32:2, PC 32:1, PC 35:1, and PC 36:1).

Recovery
Recovery of lipids was calculated by extracting plasma spiked with internal standards and comparing peak areas to those of plasma extracts that were spiked with the same standards after extraction. Samples were reconstituted according to the protocol and analyzed as described (Methods). The recoveries of all standards were >86% (mean 89.0%, median 89.6%, Table 3). We have made the assumption that recovery of the internal standard is representative of the recovery for associated lipid species in that class or subclass.

Linearity of response
The linearity and range of the lipid measurements were assessed using serial dilutions of lipid standards in a plasma
predictor variables. We applied logistic regression to identify associations of each individual lipid species and each lipid class or subclass to gender and smoking status adjusting for appropriate covariates in each analysis. The P values for linear and logistic regression were adjusted for multiple comparisons using the Benjamini-Hochberg approach (17).
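For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal pure-Python sketch of the adjustment is given here. It is a generic implementation of the standard step-up procedure, not the code used in this study, and the P values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values from four association tests.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
```

With these invented values only the first test remains below 0.05 after adjustment, although three of the four raw P values were below that threshold.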
Pearson correlation analysis was used to describe the linear relationship between each lipid species and all other lipid species.
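The pairwise correlation analysis can be sketched in plain Python as follows. The function names and the lipid profiles are hypothetical; the sketch assumes complete data for every sample and does not handle species with zero variance.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(profiles):
    """Correlate every lipid species with every other species.

    profiles: dict mapping species name -> list of values, one per sample.
    Returns a dict keyed by (species_a, species_b).
    """
    names = list(profiles)
    return {(a, b): pearson_r(profiles[a], profiles[b]) for a in names for b in names}


# Hypothetical relative abundances in four samples.
profiles = {
    "TG 52:2": [1.0, 2.0, 3.0, 4.0],
    "DG 34:1": [2.0, 4.0, 6.0, 8.0],
    "PC(O) 34:1": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(profiles)
```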

Identification of major lipid species in human plasma
The lipids in each class and subclass examined were identified using pooled plasma samples from healthy volunteers by neutral loss and precursor ion scans (Table 2) and confirmed with product ion scans in positive and negative mode. Although the reverse phase chromatography resulted in the coelution of different lipid classes (Fig. 1), it did provide clear separation of isobaric species within each class such that isobaric species of PC(O), PC(P), and PC were well separated (Fig. 2A, supplementary Table I).
Similarly, lipid species of the same class which differ only by a double bond are chromatographically separated. This is particularly important as the [M+2+H]+ ions of the more highly unsaturated species have the same nominal mass as the monoisotopic [M+H]+ ions of the less unsaturated species. As a result, the more highly unsaturated species give rise to a signal for the MRM transition used to monitor the less unsaturated species. This chromatographic resolution
matrix (Table 3). Under the assay conditions used, we observed detector saturation for the most abundant lipids; such saturation could lead to nonlinear responses. To circumvent this problem for those abundant lipids, in quadrupole 1 (Q1) we selected for the isotopologues which have a higher m/z (1 or 2 Da, designated as M+1 or M+2, respectively) but lower abundance than the monoisotopic parent ion. We incremented the corresponding Q3 m/z by the same amount as the Q1 m/z. These species were then normalized to the Q1 and Q3 ion pair of the internal standard corresponding to the equivalent isotopologues.
and biochemical measurements are detailed in Table 1.
Of the 23 lipid classes and subclasses analyzed, 15 showed a significant association with sex (P < 0.05, Benjamini and Hochberg corrected). In addition to the expected elevated levels of CE (6.9%) and TG (12.8%) in males relative to females, we observed significantly higher levels of Cer (6.9%), GM (9.1%), and LPC (14.1%) and significantly lower levels of SM (−7.1%), THC (−5.7%), and PS (−16.6%) after adjusting for age, BMI, systolic blood pressure (SBP), 2 h post load glucose, and smoking status (Table 4). Of the 312 individual species measured, 223 were significantly associated with sex, the most significant being SM 32:2, P = 8 × 10^−36 (supplementary Table II). Sixteen lipid classes and subclasses were significantly associated with age (adjusted for sex, BMI, SBP, 2 h post load glucose, and smoking status) and 17 classes and subclasses were significantly associated with BMI (adjusted for sex, age, SBP, 2 h post load glucose, and smoking status).

DISCUSSION
We have presented an LC ESI-MS/MS lipid profiling technique for analyzing more than 300 molecular lipid species across 23 lipid classes and subclasses in 20 min. The methodology is also applicable to tissue or cell extracts with as little as 25 μg of cellular protein (18-22).
The Folch extraction method (23) has been used extensively to extract lipids from biological matrices. It involves the partitioning of the lipids in a biphasic solution of chloroform:methanol (2:1) and water. We identified two major drawbacks in this method: a) it does not lend itself to high sample throughput; and b) polar lipids are differentially partitioned between the aqueous and organic phases. A modification of this method was developed utilizing a single phase extraction and small plasma volume (10 μl). Reconstitution in water saturated BuOH and 10 mM ammonium formate in methanol provided high recoveries and a suitable injection solvent for subsequent chromatography. Using the method described, our laboratory can routinely extract 400 samples per day. While robotics has obvious advantages (24,25), we have demonstrated that our method, even with manual handling and extraction of samples, is robust and applicable to cohorts of 1,000 or more samples.
Chromatographic separation prior to ESI-MS/MS allows for analysis of lipid mixtures and, importantly, detection of low abundance lipids (26,27). However, HPLC
Using a combination of transitions corresponding to the monoisotopic ion and the Q1 and Q3 values increased by one m/z unit, we observed linear responses up to 500 μM in plasma with R^2 values between 0.984 and 0.999. Using the Q1 and Q3 values increased by two m/z units, this was extended to 1,000 μM for CEs (R^2 = 0.998) and TGs (R^2 = 0.997) to cover the full range of concentrations expected in plasma. As the difference between number of carbons in the internal standard and lipid of interest increases, so the relative intensities of the M+1 ions will also differ, potentially leading to greater difference in response factors and thereby compromising quantitative accuracy. While this is not an issue in comparative lipidomics, this differential should be adjusted for if accurate quantification is the goal.
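The use of M+1 and M+2 isotopologue transitions can be sketched as follows. The m/z values and peak areas are hypothetical, and the function names are introduced here for illustration only; the sketch simply shifts Q1 and Q3 by the same amount and normalizes to the equivalent isotopologue transition of the internal standard.

```python
def isotopologue_transition(q1, q3, shift):
    """Return the (Q1, Q3) pair for the monoisotopic (0), M+1 (1), or M+2 (2) transition."""
    if shift not in (0, 1, 2):
        raise ValueError("shift must be 0, 1, or 2")
    # Q3 is incremented by the same amount as Q1.
    return (q1 + shift, q3 + shift)


def normalized_response(analyte_areas, istd_areas, shift):
    """Ratio of analyte to internal standard peak area for the same isotopologue shift."""
    return analyte_areas[shift] / istd_areas[shift]


# Hypothetical transition and peak areas, keyed by isotopologue shift.
m2_transition = isotopologue_transition(700.0, 300.0, 2)
analyte_areas = {0: 5.0e7, 1: 2.4e7, 2: 6.0e6}
istd_areas = {0: 1.0e7, 1: 4.0e6, 2: 1.0e6}
ratio_m2 = normalized_response(analyte_areas, istd_areas, 2)
```

Monitoring the less abundant isotopologue keeps an abundant species within the linear range of the detector, provided the internal standard is measured on the equivalent transition.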

Assay performance
We used 63 evenly spaced quality control (QC) plasma samples within the analysis of the 1,076 SAFHS samples. The analysis was performed over four LC-MS runs, each of between 2 and 3 days duration. Two measures were used to assess the performance of the lipid measurements: i) the average intra-run %CV (coefficient of variation), where we calculate the average of the %CV for each of the four runs (each run contained between 13 and 17 QC samples); and ii) the %CV across the entire analysis (63 QC samples), see Table 3. The median of the average intra-run %CV for the 23 lipid classes and subclasses (sum of the individual species) was 7.9% while the median %CV across the entire cohort was 9.6%. We also calculated %CVs of the 312 individual lipid species in the QC plasma extracts. The median average intra-run %CV was 10.6% with 90% of lipid species less than 27.6%, while the median %CV across the entire cohort was 13.8% with 90% of lipid species less than 24.5%. Lipid species with %CV greater than 30% were typically of low abundance and/or had poor chromatography. Supplementary Fig. I shows histograms of the %CVs for the lipid species.
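The two precision measures can be sketched in Python as follows. The QC values are invented, the function names are hypothetical, and the use of the sample standard deviation is an assumption, as the text does not state which estimator was used.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def average_intra_run_cv(runs):
    """Measure i: mean of the %CVs calculated separately within each run."""
    return statistics.mean(percent_cv(run) for run in runs)


def overall_cv(runs):
    """Measure ii: %CV of all QC samples pooled across runs."""
    pooled = [value for run in runs for value in run]
    return percent_cv(pooled)


# Hypothetical QC measurements of one lipid in two runs with a shift between runs.
qc_runs = [[9.0, 10.0, 11.0], [19.0, 20.0, 21.0]]
intra = average_intra_run_cv(qc_runs)
overall = overall_cv(qc_runs)
```

Because the pooled measure also captures run-to-run shifts, it is expected to be at least as large as the intra-run measure in most datasets, which matches the pattern reported above.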

Correlation between lipid species
Pearson's correlation analysis identified positive correlations between lipid species within each lipid class/subclass; in particular, PE, PS, DG, TG, and the sphingolipid classes were highly correlated (Fig. 4, supplementary Figs. II-IV). Between classes, PE displayed a strong correlation with PG, PI, DGs, and TGs as well as some species of PC and CEs. CEs, DG, and TG showed positive correlations for the majority of species. While the sphingolipid species were strongly positively correlated within classes, they displayed marginal positive or negative correlation across classes (Fig. 4, supplementary Fig. II). We also observed negative correlations between the PC(O) and plasmalogen with a number of lipid classes including PE, DG, and TG. PC species showed relatively weak correlation within the class and weak or negative correlations with the PC(O) and plasmalogen species (Fig. 4, supplementary Fig. III).

Associations of lipids with anthropometric and physiological measures
The SAFHS population cohort consisted of 1,076 individuals (39.1% male) from 15-91 years of age. Anthropometric
used here, this has allowed us to monitor 261 lipids from a single injection on the gradient method and 66 lipids on our isocratic analysis.
Full structural elucidation of lipid species by mass spectrometry is rarely achieved and practically impossible in a single MS experiment. For example, negative mode fragmentation of PCs provides information on the chain length and number of double bonds present in each acyl chain, but does not provide empirical evidence on either the position of those acyl chains on the glycerol backbone or the position of the double bonds within the acyl chain. While mass spectrometry-based experiments have been developed to obtain further structural information on both these points (28), in the current context it is impractical to attempt to fully elucidate the composition of isomeric lipids. Due to the necessity for high throughput in the context of population profiling, only limited structural elucidation is achieved, as reflected in the nomenclature. For example, PCs are reported with the sum of the number of carbons in the two esterified fatty acids and the sum number of double bonds, e.g., PC 36:3.
conditions that allow the separation of a wide range of lipids in large sample cohorts and are compatible with downstream ESI-MS/MS have not been widely reported. We have used a gradient method that provides good separation of most lipid classes and subclasses and importantly resolves most isomeric and isobaric species in 14 min. However, we observed high background signals for some DG and TG species on the gradient method, resulting from solvent impurities, and so developed a short isocratic LC method that is able to minimize this background.
To ensure accurate peak fitting and integration when using LC, it is necessary to maintain a sampling rate (in this experiment, 1 s) that provides a sufficient number of points across a chromatographic peak. Because MRM is a serial technique, as the number of MRM transitions increases, the time spent on each MRM transition is reduced with a deleterious effect on sensitivity. In order to circumvent this limitation, the MRMs may be scheduled so that an MRM pair is only monitored around its expected retention time, thus reducing the number of transitions monitored at any one point in time. Coupled with the robust RP-HPLC
account for variation of signal response between lipids and their corresponding internal standard. However, prior to regression analysis, we standardize each lipid measurement to the interquartile range. The calculated odds ratios then reflect the change in likelihood of belonging to the outcome group as the relative level of the lipid increases from the 25th to the 75th percentile of the population. Similarly, the β-coefficients reflect the change in outcome measure as the lipid level increases from the 25th to the 75th percentile. This approach provides for the rigorous statistical analysis of the biological interactions of lipids without the need for accurate quantification and is particularly suited for the analysis of large population cohorts.
Despite the limitation in accuracy, when we compare values from our study (expressed as μM) with published quantitative measures of plasma lipids [LIPID MAPS Consortium (2)], we observe a close alignment, with most lipid classes giving comparable values (less than 50% difference). For example, CEs, DGs, and TGs were −16, −8, and −47%, respectively, in our study relative to the LIPID MAPS study, while PC and SM were −41 and 17%, respectively. In contrast, PI was 3-fold higher and PE was 7-fold lower in our measurements. The major factor leading to these large differences is most likely the response factors of individual species relative to the internal standards used.
Our lipidomic analysis is sufficiently rapid to have allowed us to analyze a large population cohort of over 1,000 samples. This provided statistical power to identify significant associations between circulating plasma lipids that reflect the
The gold standard for quantification in mass spectrometry is to employ a stable isotope internal standard for each analyte of interest. For lipidomic studies analyzing large numbers of lipid species, this approach is impractical due to the high cost and limited availability of suitable standards. While it has been widely recognized that lipid class is the primary determinant of ionization efficiency (29,30), carbon chain length, number of double bonds, and changes in solvent composition (due to different elution times) will also influence ionization efficiency and signal response. While these factors will adversely influence quantitative measurements, they have minimal effect on analytical precision. For quantitative lipidomics by LC-MS, multiple internal standards can be used, often combined with algorithms to interpolate response factors for those species not covered by the internal standards. Alternatively, calibration curves can be used to determine relative response factors of species for which no internal standard is available (2,31,32). While such approaches can provide accurate quantitative data, the time and cost required may preclude such quantitative approaches in some situations.
For large population studies such as described here, where the objective is to identify associations and relationships between lipids and anthropometric and physiological measures, an alternate approach is to perform comparative lipidomics. In this analysis the emphasis is on precision of the assays to provide maximum power to identify significant associations and correlations within the resulting dataset. Thus, we have not, with the exception of DGs, TGs, and CEs, utilized correction factors to
palmitate by serine palmitoyl transferase. While Cer level is not associated with BMI, we did observe a strong and specific association between Cer 18:0 and BMI (P = 1.29 × 10^−12), suggesting yet another regulatory mechanism involving ceramide synthase 1, which has specificity for C18 acyl chains and is expressed primarily in skeletal muscle (37). In addition to this finding, we observed significant associations of all species of dhCer with BMI. dhCer is the metabolic precursor to Cer, indicating an upregulation of the de novo biosynthetic pathway without increasing Cer per se, but rather downstream metabolites. The glycolipid classes (MHC, DHC, and THC) are all significantly negatively associated with BMI, but in contrast, SM, which has been reported to be proatherogenic (38), shows a positive association and so appears to be the end-product of this upregulation.
Interestingly, SM showed a strong positive association with age but in contrast to Cer was significantly lower in males than females. In a study of 1,102 recruited participants with coronary artery disease (CAD), Schlitt et al. (39) found no association of SM with age or BMI and no difference in SM levels between male and female participants. However, a positive association of plasma SM with CAD was reported. Based on the finding that SM was strongly associated with triglyceride levels, they hypothesized that, rather than an inflammatory marker, SM is related to atherogenic lipoprotein aggregation, a process involved in the progression of atherosclerosis. The absence of an association with age in the study by Schlitt et al. (39) may relate to the older and narrower range of ages in their study compared with SAFHS (average 61.1 years, standard deviation 10.1
overlapping, intersecting, or competing metabolic pathways involved in the synthesis and catabolism of these lipid species and the lipoproteins they comprise. In addition to this, we were able to identify significant independent associations of individual lipid species and lipid classes and subclasses with key anthropometric, physiological, and lifestyle factors (age, sex, BMI, and smoking).
Given the clear difference in risk of CVD between men and women and the well-established association with plasma lipids, this study provides the opportunity to examine whether differences in circulating lipids may contribute to this effect. Our results show that the traditional risk factors of cholesterol and triglycerides are higher in men than women. Furthermore, we, and others, have previously reported that plasma Cer is positively associated with CVD (33-35) and further with unstable CVD (16). Here, we observed that men have significantly higher plasma levels of Cer than women (+6.9%, P = 2.21 × 10^−6, Table 4). Analysis of the Cer subspecies shows that the increase is primarily driven by the long chain species of Cer, 22:0, 24:0, and 24:1, which are highly correlated (supplementary Table II, supplementary Fig. II). This implies that the difference may be related to differential expression of ceramide synthase 2, which preferentially uses long chain (C20-C26) fatty acyl CoA species to produce Cer (36), and is the most abundant species expressed in the liver (37). We also observed that Cer is strongly associated with age (Table 5); however, in this instance, all Cer species contribute to the association (supplementary Table III), suggesting regulation of Cer synthesis at the rate-limiting step of the de novo biosynthetic pathway, the condensation of the serine and
years vs. average 38.3 years, standard deviation 16.1 years). Elevated levels of LPC and LPC(O), the precursor to platelet activating factor, in men relative to women (Table 4, supplementary Table II) suggest an increase in phospholipase activity (specifically lipoprotein-associated phospholipase A2 (Lp-PLA2), also known as platelet activating factor acetylhydrolase). This enzyme is highly expressed in atherosclerotic plaque and associated with pathogenesis of atherosclerosis (40,41).
This is further supported by several reports that Lp-PLA2 activity is higher in men than women and is linked to a higher risk of CAD, reviewed by Gregson et al. (42). We also observed a strong negative association of LPC with BMI. Because BMI is not reported to be associated with Lp-PLA2 activity, we must look for an alternate regulation of LPC levels. Lecithin:cholesterol acyltransferase (LCAT) activity is a major source of LPC in circulation and studies have shown LCAT activity is decreased in obesity (43,44), potentially leading to a decrease in circulating LPC. Our data showed relatively few lipid species associated with smoking status (Table 4, supplementary Table II). At a class and subclass level, only PC(P) and PE(P) were significantly decreased in individuals who smoked (Table 4). Plasmalogens are particularly susceptible to oxidation (45-47) and the negative association may be explained by an increase of oxidative stress associated with smoking (48,49). Analysis of a subset of the KORA (Cooperative Health Research in the Region of Augsburg) study (5) quantified 198 metabolites, including 44 sphingolipids and 89 glycerophospholipids in 238 participants, and examined differences between smokers and nonsmokers. Twenty-three lipids, including three plasmalogens, were identified as nicotine-dependent biomarkers. This study suggested smoking was associated with plasmalogen deficiency disorders specifically related to the reduced activity of the enzyme alkylglycerone phosphate synthase.
Our study highlights the power of performing lipidomic analysis at a population level, where we have identified significant associations of many lipid classes, subclasses, and individual species with common anthropometric and lifestyle traits. Large cohort studies of this kind provide the statistical power to identify independent associations between lipid species and these traits. Our study also suggests that lipid metabolism could play a role in the pathogenesis of various diseases, and our lipidomic methodology provides a means to further investigate these hypotheses.
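The interquartile range standardization can be illustrated with a short Python sketch. The data are invented and the function names are hypothetical; dividing by the interquartile range without centering, and the use of the "inclusive" quantile method, are assumptions, since the text does not specify these details.

```python
import math
import statistics


def iqr_standardize(values):
    """Scale measurements so that the 25th to 75th percentile span equals one unit."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero")
    return [v / iqr for v in values]


def odds_ratio_per_iqr(beta):
    """Odds ratio for a one-unit (one interquartile range) increase in the scaled lipid."""
    return math.exp(beta)


# Hypothetical relative abundances of one lipid species in five individuals.
lipid = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = iqr_standardize(lipid)
q1_scaled, _, q3_scaled = statistics.quantiles(scaled, n=4, method="inclusive")
```

After scaling, a regression coefficient per unit of the lipid variable corresponds to the change in outcome between the 25th and 75th percentile, so coefficients become comparable across lipids measured on different relative scales.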
The authors are grateful to the participants of the SAFHS.