A robust all-atom model for LCAT generated by homology modeling.

LCAT is activated by apoA-I to form cholesteryl ester. We combined two structures, phospholipase A2 (PLA2) that hydrolyzes the ester bond at the sn-2 position of oxidized (short) acyl chains of phospholipid, and bacteriophage tubulin PhuZ, as C- and N-terminal templates, respectively, to create a novel homology model for human LCAT. The juxtaposition of multiple structural motifs matching experimental data is compelling evidence for the general correctness of many features of the model: i) The N-terminal 10 residues of the model, required for LCAT activity, extend the hydrophobic binding trough for the sn-2 chain 15–20 Å relative to PLA2. ii) The topography of the trough places the ester bond of the sn-2 chain less than 5 Å from the hydroxyl of the catalytic nucleophile, S181. iii) A β-hairpin resembling a lipase lid separates S181 from solvent. iv) S181 interacts with three functionally critical residues: E149, that regulates sn-2 chain specificity, and K128 and R147, whose mutations cause LCAT deficiency. Because the model provides a novel explanation for the complicated thermodynamic problem of the transfer of hydrophobic substrates from HDL to the catalytic triad of LCAT, it is an important step toward understanding the antiatherogenic role of HDL in reverse cholesterol transport.


Multiple sequence alignment
Sequence similarity between human LCAT and protein sequences from other species was sampled with the PSI-BLAST algorithm, using the human LCAT sequence as the seed. Non-redundant protein sequence databases were searched with an expected threshold ( 20 ) of 10 and a word size of 3; scoring used the BLOcks SUbstitution Matrix 62 (BLOSUM62) protein substitution matrix ( 21 ) with gap opening and extension costs of 12 and 1, respectively. The search returned more than 40 sequences from fish, amphibian, bird, and mammal species.
The sequences similar to LCAT were aligned using a combination of UCSF Chimera and Clustal 2 ( 22,23 ). For Clustal 2, we used the graphical user interface ClustalX ( 24-26 ) version 2.1 to produce Fig. 4. Sequences were aligned by performing a complete alignment in the multiple alignment mode.
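The search settings above can be collected in one place. The following is a minimal sketch that only assembles an argument list for the NCBI BLAST+ `psiblast` program; it assumes a local command-line installation, whereas the search described here was run on the BLAST server, so the file name and database name are hypothetical placeholders.

```python
def psiblast_args(query_fasta, database="nr"):
    """Build (but do not run) a psiblast argument list with the stated settings.

    query_fasta and database are hypothetical placeholders; the parameter
    values are the ones quoted in the text.
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-evalue", "10",        # expected threshold
        "-word_size", "3",
        "-matrix", "BLOSUM62",
        "-gapopen", "12",       # gap opening cost
        "-gapextend", "1",      # gap extension cost
    ]
```

The list can be passed to `subprocess.run` if a local BLAST+ installation and database are available.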
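Once an alignment is available, complete conservation of a position can be checked column by column. The sketch below is illustrative only: the toy sequences are hypothetical variants, not real LCAT orthologs, and a gap is assumed to break conservation.

```python
def fully_conserved_columns(aligned):
    """Return 0-based column indices where all sequences share one residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        # A column counts only if it holds a single residue type and no gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Hypothetical toy alignment for illustration.
toy_alignment = ["FWLLNVLFPP", "FWLLNILFPP", "FWVLN-LFPP"]
```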

NMA
NMA is a direct way to analyze the vibrational motion of proteins based on a structural analysis ( 27-30 ). In NMA calculations, the potential function is approximated by a sum of quadratic terms in the displacements around a minimum energy conformation.
We utilized the Normal Mode Analysis, Deformation, and Refinement (NOMAD-REF) web server for all-atom NMA ( 31 ). NMA was performed on the coordinates of the LCAT homology model, consisting of 3,331 atoms, as the reference structure, with the following elastic network model parameters for the first 36 modes. All interactions were weighted by a distance-weight parameter of 5 Å, and a cut-off of 10 Å was chosen for the nonbonding interactions. We obtained the modes in PDB format and inspected them using visual molecular dynamics (VMD) ( 32 ) for the specific motions described in Results.
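To make the elastic network idea concrete, here is a minimal sketch of an all-atom elastic network Hessian and its lowest modes. It is not the NOMAD-REF implementation: the Gaussian form of the distance weighting, exp(-(d/d0)^2), and the choice to skip the six rigid-body modes when counting the "first 36 modes" are assumptions made for illustration.

```python
import numpy as np


def enm_hessian(coords, cutoff=10.0, d0=5.0):
    """3N x 3N Hessian of harmonic springs between atoms within `cutoff` (Å)."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            r = coords[j] - coords[i]
            d = np.linalg.norm(r)
            if d > cutoff:
                continue
            k = np.exp(-(d / d0) ** 2)         # assumed distance weighting
            block = k * np.outer(r, r) / d**2  # spring acts along the i-j axis
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hess[si, si] += block
            hess[sj, sj] += block
            hess[si, sj] -= block
            hess[sj, si] -= block
    return hess


def lowest_modes(coords, n_modes=36, **kwargs):
    """Eigenvalues/eigenvectors of the lowest non-rigid-body modes."""
    vals, vecs = np.linalg.eigh(enm_hessian(coords, **kwargs))
    return vals[6:6 + n_modes], vecs[:, 6:6 + n_modes]
```

For a connected, non-planar structure the first six eigenvalues are numerically zero (three translations and three rotations), which is why they are dropped before the modes are returned.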

MD
All-atom minimization and MD simulations were used to explore the dynamic motion and measure the stability of the models we developed. The MD was performed using NAMD 2.9 ( 33 ) as described ( 34,35 ).
To obtain m5 from m5*, 10,000 steps of conjugate-gradient energy minimization were performed to remove possible close contacts between atoms after specifying harmonic constraints of 1 kcal/mol on all backbone atoms except for residues within five of the disulfide-linked cysteines. In addition, secondary-structure restraints (as determined by the initial structure from MODELLER) were specified using the VMD ( 32 ) plug-in "SSRestraints" to ameliorate disruption during minimization. Hydrogen bonds were not restrained. These restraints were then used by NAMD's "extrabonds" command.
To obtain m5p, a POPC molecule was manually docked into the hydrophobic trough of m5, then 10,000 steps of conjugate-gradient energy minimization were performed after freezing all protein atoms in place.
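The backbone restraint selection can be sketched as a simple residue filter. This sketch assumes that "within five" means within five residues in sequence of each disulfide-linked cysteine (C50, C74, C313, and C356, the pairs named elsewhere in this paper); if a spatial distance was intended, the selection would differ.

```python
# Disulfide-linked cysteines (pairs C50-C74 and C313-C356).
DISULFIDE_CYS = (50, 74, 313, 356)


def restrained_residues(n_residues=416, cysteines=DISULFIDE_CYS, window=5):
    """Residue numbers whose backbone atoms would carry harmonic restraints.

    Assumption: residues within `window` positions in sequence of any listed
    cysteine are left unrestrained so the disulfide bonds can form.
    """
    free = {r for c in cysteines for r in range(c - window, c + window + 1)}
    return [r for r in range(1, n_residues + 1) if r not in free]
```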
To obtain m5d, residues 150-153 were removed from m5p, then E149 and E154 were joined to form a continuous protein backbone. The structure was then minimized for 10,000 steps while freezing residues 103-112, 122, 149, 181, 208, 288, and 342 to prevent disruption of the lipid. Next, while applying harmonic constraints of 1 kcal/mol on all residues >10 Å from the N, O, and P atoms of the lipid, as well as on the terminal methyls of the lipid, the system was subjected to 10,000 steps of conjugate-gradient energy minimization, heated to 310 K, and then simulated for 5.5 ns, giving m5d. In addition, this final structure was also solvated, minimized for 10,000 steps, heated to 310 K, and simulated for
of a portion of the structure of the 416 residue long LCAT chain, residues 73-210, was published 15 years ago by Peelman et al. ( 3,4,10 ). More recently, Sensi et al. ( 11 ) published a homology model for full length LCAT that is composed of an amalgam of homology and ab initio fragments developed by a step-by-step selection of best fit models. A key criterion used for best fit was spatial clustering of the three residues of the catalytic triad proposed by Peelman et al. ( 10 ). Although S181 is known to be the nucleophile in LCAT ( 12 ), the other residues of the proposed triad, D345 and H377, have not been established.
Here we explore the creation of a robust homology model for LCAT using the multi-alignment routine of the program MODELLER and two recently determined crystal structures homologous to portions of LCAT as templates. Derivation of this model required many fewer assumptions than employed by either Peelman et al. ( 10 ) or Sensi et al. ( 11 ). A single dual alignment generates multiple models essentially all possessing structural attributes matching LCAT features previously determined by biochemical and biophysical methods ( 3,4 ). We then examined the best models produced through molecular dynamics (MD) simulations and normal-mode analysis (NMA) and explored the molecular mechanisms whereby LCAT associates with HDL to create a tunnel for transfer of lipids from HDL to the LCAT active site.

Homology modeling calculations
A homology search of the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) was carried out on the Basic Local Alignment Search Tool (BLAST) server by selecting the protein BLAST program and using the position-specific iterated BLAST (PSI-BLAST) algorithm. We found the best template for the C-terminal domain of human LCAT (residues 51-416) to be the X-ray crystal structure of human lipoprotein-associated phospholipase A2 (Lp-PLA2) (RCSB PDB identification (ID): 3D59) ( 13 ). The best template for the N-terminal domain of human LCAT (residues 1-50) was the X-ray crystal structure of the tubulin-like protein PhuZ from bacteriophage 201phi2-1 (RCSB PDB ID: 3RB8) ( 14 ).
To combine the best templates for the N-terminal and C-terminal domains of LCAT, the model of human LCAT was built using the homology modeling software MODELLER ( 15-18 ) version 9.11. The structural model of the human LCAT target sequence was based on the alignment against these two templates, which was performed using a MODELLER Python script, with the molecular graphics program UCSF Chimera ( 19 ) version 1.8 as a graphical user interface. Finally, we used the discrete optimized protein energy (DOPE) potential to select the best 10 models (defined as those with minimal DOPE values) out of the 100 structures generated by MODELLER using the "loopmodel" function.
In a parallel experiment, the disulfide bonds were specified by the "self.patch" function within the "special_patches" routine in a modification to the "loopmodel" function, and then another 100 structures were generated by MODELLER and the best 10 models were selected as above.
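Selecting the 10 best of 100 models by DOPE score amounts to a sort on the score, lowest first. The sketch below assumes the scores have already been collected into a dictionary; the model names and values used to exercise it are hypothetical and it does not call MODELLER itself.

```python
def best_models(dope_scores, n_best=10):
    """Return the names of the n_best models with the lowest DOPE scores.

    dope_scores maps a model name to its DOPE value; lower (more negative)
    values are treated as better, as in the selection described above.
    """
    ranked = sorted(dope_scores.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:n_best]]
```

With this ranking, the model designated m5* would simply be the fifth entry of the returned list.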
is similar to LCAT in that it tightly associates with HDL to hydrolyze the ester bond at the sn-2 position of oxidized (short) acyl chains of phospholipid. Further, LCAT can also transesterify and hydrolyze platelet-activating factor-2 or oxidized PC, each of which has shorter chains than POPC in the sn-2 position. These similarities make Lp-PLA2 a credible template for LCAT.
The crystal structure of Lp-PLA2 has a catalytic triad of S273, H351, and D296 ( Fig. 1F ), a short (~7 Å) aromatic-rich binding pocket for the short sn-2 oxidized acyl chain ( Fig. 1D ), and an acyl trough for binding the longer sn-1 chain ( Fig. 1E ). Again, in spite of an even lower similarity/identity than tubulin PhuZ (20/7%), the alignment was good with only eight gaps, six of one to three residues, and two of four and five residues ( Fig. 2 ). Importantly, there is only a single gap of one residue in the alignment of the core sequence ( 37 ) of LCAT (residues 106-260, green box in Fig. 2 ).
The chief inaccuracies in homology modeling derive from errors in sequence alignment and improper template selection ( 38 ). In spite of weak identity, because the core alignment has only one gap and core regions tend to have conserved structural folds and are more amenable to homology modeling than peripheral loop regions ( 39 ), we used the multi-alignment routine in the homology program, MODELLER, to combine the Lp-PLA2 structure
another 5 ns without harmonic constraints. Finally, m5d was also simulated for another 1 ns without solvent and without any harmonic constraints.
For the simulation of the four-residue peptide consisting of residues 179-182, the initial unsolvated structure was minimized for 1,000 steps, heated to 310 K, and then simulated for 1 ns.

Creation of the homology model
When we performed a homology search against human LCAT ( Fig. 1A ), the top scoring homology match of residues 1-50 of human LCAT was to residues 2-51 of tubulin PhuZ, bacteriophage 201phi2-1 [ Fig. 1B ; RCSB PDB ID: 3R4V ( 14 )], a segment that forms a β-sheet/α-helix motif ( Fig. 1C ). In spite of a similarity/identity of only 30/14%, the alignment was good with only two gaps of two residues each ( Fig. 2 ).
We found a best match of residues 51-416 of LCAT to residues 54-427 of Lp-PLA2 [ Fig. 1D-F ; RCSB PDB ID: 3D59 ( 13 )], a member of the α/β-hydrolase fold family characterized by a β-sheet core of eight or so strands connected by α-helixes to form an α/β/α sandwich ( 36 ). Lp-PLA2, also known as platelet-activating factor acetylhydrolase,
Color code: Aromatic residues (magenta); hydrophobic residues (gold); basic residues (blue); acidic residues (red); binding site for sn-2 chain (white double-headed arrow); S273 of catalytic triad (green). E: (D) rotated 90° showing proposed binding of human plasma platelet-activating factor based upon a crystal structure of the organophosphate compound paraoxon bound to PLA2. F: Ribbons image of (D) showing active site triad.
the resulting structure (designated m5) is shown in Fig. 3B. The cysteine residues were sufficiently close in m5 that disulfide bond formation created no significant structural distortion following energy minimization; the relative motions of m5 during energy minimization after disulfide bond formation are illustrated in supplementary Fig. 1. All four cysteines are completely conserved in all LCAT sequences, except for C356, which is exceptional only because it falls in a deleted region of one mammalian species ( Fig. 4, red arrowheads).
We also created disulfide bonds using the alternate approach of specifying them within MODELLER via a routine instead of within NAMD minimization. We created 100 disulfide-containing models and selected the 10 with the lowest energy for further examination. Alignment of the top 10 models is shown in supplementary Fig. 2a. Alignment of the models using the disulfide-forming routine was good, but not quite as good as models created with no disulfide bonds; in particular, the N-terminal domain was more mobile. We will compare the principal features of model m5, all top 10 disulfide-free models, and all top 10 models created using the disulfide-forming routine as we move through the Results.
and the PhuZ structure as templates ( Fig. 1A ) to create homology models for human LCAT. After initial models looked promising, we proceeded to create a total of 100 models with MODELLER and chose the 10 lowest energy models for further study. All 10 of the lowest energy models are quite similar; an alignment of the 10 structures is shown in Fig. 3A. Their alignments differ predominantly in the C-terminal domain, where eight prolines occur in the last fifteen residues of LCAT. The terminal six residues of this region are unmatched in the template ( Fig. 2 ).
We examined each of these 10 models in detail using molecular graphics and, although each was similar, the fifth best scoring model (designated m5*) was chosen for further modeling. It was immediately apparent that the model m5* possessed a number of structural attributes that matched known LCAT features. We proceeded to the formation of disulfide bonds.

Disulfide bonds
Disulfide bonds were specified in m5* between the correct cysteine disulfide pairs, C50-C74 and C313-C356 ( 3,4 ); the cross-linked model was subjected to harmonic constraints and secondary-structure restraints, energy minimized, and
The core domain, residues 107-260, is boxed in green. The three key domains, N terminus, lid, and triad, are boxed in bold black. Key residues R147, E149-E155, R158, G179-S181, and D204-G205 are bolded. Identity between template and target is denoted by black, similarity by gray.
15-20 Å in m5. Examination of the alignment of the ten lowest energy states for the disulfide-free models finds that in five models the hydrophobic trough is completely open, in three models the trough is partially closed by contact of the loops marked by yellow arrowheads ( Fig. 3A ), and in two models the trough closure is more extensive. In the top ten models created using the disulfide bond-forming routine (supplementary Fig. 2a), the hydrophobic trough is completely open in four models (supplementary Fig. 3), but is in varying states of closure in the remaining six models, primarily due to contact of the loops marked by yellow arrowheads in supplementary Fig. 2a.
Remarkably, POPC, a good substrate for LCAT, can be docked into the L-shaped hydrophobic trough of m5 in an orientation similar to that of platelet-activating factor or oxidized PC in the Lp-PLA2 crystal structure ( Fig. 1E ). For docking, a POPC molecule was manually inserted into the trough in m5 with the terminal methyl of the sn-2 chain in van der Waals contact with the completely conserved W2 of the N-terminal aromatic cluster and the sn-1 chain and the polar head group bent at approximately 90° to the sn-2 chain to fit snugly into the shorter end of the L-shaped hydrophobic trough. The structure was then energy minimized with parts of the protein subjected to harmonic constraints but the POPC allowed to move freely, and the results (designated m5p) are shown in Fig. 3C, D.
Secondary and tertiary structure. Model m5 has a helical content of 25%, identical to the experimentally determined value of 24-25% ( 40 ) for human LCAT. Further, the model has features similar to other members of the α/β-hydrolase fold family, with 12 helical segments, 9 at least four residues in length, that surround a core of eight β-strands ( Fig. 3B ). Values for the 10 lowest energy models were similar. The models created without disulfide bonds have an average helical content of 27 ± 1.2% with an average of 15 helical segments surrounding a core of 13 β-strands. The models created with the disulfide-forming routine have an average helical content of 28 ± 1.5% with an average of 15 helical segments surrounding a core of 11 β-strands.
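Helical content and segment counts of this kind can be derived from a per-residue secondary-structure string. The sketch below assumes DSSP-style one-letter codes (H, G, I for helix), which is an assumption for illustration; the values reported here came from a different secondary-structure program, and the example string is hypothetical.

```python
from itertools import groupby


def helix_stats(ss, helix_codes="HGI", min_len=4):
    """Return (percent helix, helical segments, segments of >= min_len residues)."""
    is_helix = [code in helix_codes for code in ss]
    # Lengths of consecutive runs of helical residues.
    runs = [len(list(group)) for flag, group in groupby(is_helix) if flag]
    percent = 100.0 * sum(runs) / len(ss)
    return percent, len(runs), sum(1 for n in runs if n >= min_len)
```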

Hydrophobic PC binding trough
The five N-terminal residues, FWLLN, known to be required for LCAT activation by apoA-I ( 4 ), form in m5 one end of an L-shaped trough lined by numerous aromatic and hydrophobic residues ( Fig. 3B-D ). The first 10 residues of LCAT, F W LL NV L FPP , are highly conserved and six (underlined) are completely conserved in all 45 LCAT orthologs that have been sequenced, from mammals to fish (see box in sequence alignment, Fig. 4 ). These 10 residues, shown as a transparent surface representation in
showing: active site triad as spacefilling, S181 (green), H180 (pale blue), and D204 (pink); disulfide bonds, C50-C74 and C313-C356 (gold spacefilling); lipase lid (cyan stick); PhuZ homologous region (residues 1-50) (magenta); residues 1-10 (magenta surface); known glycosylation sites (dotted pale green spacefilling). Helical domains, H1 and H2, located at the end of the sn-1 trough, red arrows (see Fig. 8 ). C: Spacefilling image of model m5 with same view as (B) containing a POPC molecule inserted into the hydrophobic crevice and energy minimized (see Materials and Methods). Color code: aromatic residues (magenta); hydrophobic residues (L, V, M, I, A, P) (gold); basic residues (blue); acidic residues (red); conserved residues F1 and W2 (magenta arrowheads); other LCAT residues (gray); POPC (yellow spacefilling). D: Same image as (C) with docked POPC in stick format to show location of the lid (dotted spacefilling cyan). Helical domains, H1 and H2, located at the end of the sn-1 trough, red arrows (see Fig. 8 ).
There are several detailed features of the complete POPC/LCAT model, m5p, that are worthy of note: i ) The sn-2 chain binding trough is capped by a pair of π-stacked Trp/Phe residues in both the model and the Lp-PLA2 template, W2/F1 in the LCAT model ( Fig. 3C ) and W298/
iii ) The terminal methyl of the sn-2 chain remains in van der Waals contact with the completely conserved residue W2 ( Fig. 3C ). iv ) The sn-2 chain has kinked at the double
charge-relay network to polarize and activate the nucleophile, which attacks the substrate, forming a covalent intermediate that in the hydrolase is hydrolyzed to regenerate the free enzyme. One key feature of the charge-relay network is that the acid, most often Asp, forms a hydrogen bond with the base, most often His, to drive formation of a salt bridge during activation of the nucleophile, most often Ser ( 41 ).

Catalytic triad
A catalytic triad refers to three amino acid residues that function together at the active site of certain hydrolases, such as proteases and lipases ( 41 ). LCAT is unusual because it is the sole example of an enzyme that combines lipase and transferase activities. A common method for generating a nucleophilic residue for covalent hydrolysis is an acid-base-nucleophile triad. The residues form a
Fig. 5. Several noteworthy features of the lid, the "catalytic triad," and their interactions. A: Relaxed-eyed stereo image of the alignment of the lipase lid (residues 150-153) and the catalytic triad (S181, H180, D204) from the full alignment of the 10 lowest energy models created without disulfide bonds shown in Fig. 2A to illustrate persistent formation of a hydrogen bond between the acid, D204, and the base, H180, of the catalytic triad. The lid is in ribbons and the catalytic triad is in stick. The persistent hydrogen bond is indicated by a blue-pink dotted line. B: Overlay of the catalytic triad of our LCAT homology model 5 (green) with the known catalytic triad of Lp-PLA2 (yellow). C: Relaxed-eyed stereo image in stick representation of the tetra-peptide sequence, G179-H180-S181-L182, in which the hydroxyl of S181 and the nitrogen adjacent to the Cγ atom of the protonated H180 imidazole ring were brought within hydrogen bond distance by rotation of [φ, ψ]; the peptide was energy minimized and then subjected to 1 ns MD simulation at 310 K in vacuo. D: Relaxed-eyed stereo image of (C) in spacefilling representation.
When the 10 lowest energy MODELLER structures of the models formed without disulfide bonds were aligned by their α carbons, the side chains of S181, H180, and D204 were remarkably well aligned ( Fig. 5A ); none of the nonhydrogen atoms of the key residue S181, known experimentally to be the nucleophile in the catalytic triad of LCAT ( 12 ), deviated from their position relative to the rest of the alignment by more than 1 Å. The same is true for the top 10 models created with the disulfide bond-forming routine (supplementary Fig. 2b).
The linear order S181 ↔ H180 ↔ D204 is identical to that in Lp-PLA2, except reversed in direction relative to the hydrophobic trough ( Fig. 5B ). It is especially noteworthy that, as required by the catalytic triad charge-relay network concept, D204 forms a hydrogen bond with H180 ( Fig. 5B ), a geometry that is maintained in both sets of 10 lowest energy models ( Fig. 5A ; supplementary Fig. 2b). S181, experimentally determined to be the nucleophile in the catalytic triad of LCAT ( 42 ), is at the elbow of the L-shaped hydrophobic trough in m5 at the opposite end of the long arm from the conserved N-terminal cluster, F1-P10 ( Fig. 3B-D ), in a comparable position to the key catalytic residue, S273, of Lp-PLA2 (compare Fig. 1F and Fig. 3B ). H180 and S181 are completely conserved in all 45 LCAT ortholog sequences, while D204 is conserved in all 31 mammalian species ( Fig. 4, red arrowheads).
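A deviation check of this kind can be sketched as a least-squares superposition on the α carbons followed by a distance measurement on the atoms of interest. The sketch below uses the standard Kabsch algorithm with NumPy; it is an illustration under the assumption that matching coordinate arrays are available for both models, not a description of the alignment tool actually used.

```python
import numpy as np


def kabsch(mobile, reference):
    """Rotation and centroids that best superpose `mobile` onto `reference`."""
    mob_cen, ref_cen = mobile.mean(axis=0), reference.mean(axis=0)
    u, _, vt = np.linalg.svd((mobile - mob_cen).T @ (reference - ref_cen))
    d = np.sign(np.linalg.det(u @ vt))       # guard against reflections
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return rot, mob_cen, ref_cen


def max_deviation(ca_mobile, ca_ref, atoms_mobile, atoms_ref):
    """Fit on the alpha carbons, then return the largest atom-atom distance (Å)."""
    rot, mob_cen, ref_cen = kabsch(ca_mobile, ca_ref)
    moved = (atoms_mobile - mob_cen) @ rot + ref_cen
    return float(np.linalg.norm(moved - atoms_ref, axis=1).max())
```

Applied to each model against a reference, a returned value of at most 1 Å for the S181 nonhydrogen atoms would correspond to the criterion quoted above.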
The fact that S181 and H180 are adjacent in the linear sequence of m5 is a unique feature of the proposed catalytic triad in our model that requires discussion. In other catalytic triads determined from crystal structures ( 41 ), the nucleophile and base are in separate segments of the protein. The juxtaposition of the nucleophile and the base in our model is not a problem in principle because: i ) many catalytic triads with numerous combinations of nucleophile and base have been created by convergent evolution ( 41 ); and ii ) LCAT is unique in that it possesses both lipase and transferase activities ( 4 ).
Not only is LCAT only distantly related to lipases ( 11 ), but the enzyme itself acts as the donor in the acyl transferase step ( 4 ). The other enzymes that catalyze formation of CE, acyl-CoA (CoA):cholesterol acyltransferases, ACAT1, and ACAT2, and indeed most acyl transferases, require an activated donor ( 43 ). Because there is no evidence for more than one active site in LCAT, to serve as the donor, the acyl ester enzyme intermediate created by the catalytic triad must have a high chemical potential. Thus the evolutionary origin of the active site of LCAT is likely to be distinct from the evolutionary origin of the active site of pure lipases. Indeed, Das, Davis, and Rudel ( 44 ) have suggested that ACAT1 may involve a Ser/His/Asp catalytic triad of S456, H460, and D400, placing Ser and His near one another in the primary sequence of ACAT1. However, the possibility exists that another histidine residue acts as the base. Given its interaction with the lid, H186 is a possible candidate.
For adjacent Ser-His residues to function as a base-nucleophile charge relay system, one of the nitrogens of the His imidazole ring must be able to hydrogen-bond to the sn-2 ester bond ( Fig. 3B-D ). The location and conformation of this β-turn is quite invariant in both sets of 10 lowest energy models ( Fig. 5A ; supplementary Fig. 2b).
To mimic opening of the lid, we removed the lid residues 150-153 from m5p, created a peptide bond between residues E149 and E154, minimized with harmonic constraints and secondary structure restraints, and performed a 5.5 ns MD simulation at 310 K to create model m5d. In this simulation, we harmonically constrained the terminal methyls of the two POPC acyl chains and all residues greater than 10 Å from the N and P atoms of the POPC molecule. The results are shown in Fig. 6A. Noteworthy features of m5d include: i ) significant disordering of the POPC acyl chains induced in part by reorientation of several aromatic residues allowing greater insertion into the hydrophobic trough (white arrowheads, Fig. 6A ); and ii ) deeper insertion and reorientation of the POPC headgroup to allow the sn-2 carbonyl moiety to come within ~4 Å of the hydroxyl of the nucleophilic residue of the catalytic triad, S181 ( Fig. 6B ).

The nucleophile S181 is restricted by direct or indirect interactions to four functionally critical residues
It is noteworthy that S181 in model m5 has direct interactions with one residue, E149, that determines the activity of LCAT with PC substrates containing different sn-2 fatty acyl compositions ( 46 ) and a second residue whose mutation, K218N, is known to cause LCAT deficiency ( 47 ). Further, S181 has indirect interactions with a third
hydroxyl of the Ser. To determine whether such a conformation was possible in our model, we created a peptide from the tetrapeptide sequence, G179-H180-S181-L182, rotated the [φ, ψ] of H180 and S181 to move the hydroxyl moiety of S181 within hydrogen bond distance of the H180 δ and ε imidazole nitrogens, and energy minimized the two peptides. Only the association of the S181 hydroxyl with the Nδ of the H180 imidazole ring maintained hydrogen bond distance. When we created a hydrogen bond by placing a charge on the His and subjected that peptide to MD simulations at 310 K in vacuo for 3 ns to mimic the low dielectric environment of a catalytic triad, the hydrogen bond between the hydroxyl of S181 and the Nδ of the H180 imidazole ring remained intact, creating the structure shown in Fig. 5C
Fig. 4).

β-Turn lipase lid
Many lipases have a lid that separates the active site from solvent and maintains a low dielectric environment. For example, PLA2 from cobra venom, which hydrolyzes the sn-2 chain of PC, contains a lipase lid (31-GGSGTP-36) structured as a β-turn ( 45 ). In model m5, a short β-turn ( Fig. 3B ) formed by four residues (150-PGQQ-153) covers the key catalytic residue, S181, in the hydrophobic trough (cyan, Fig. 3B-D ), thus separating the active site from solvent approximately halfway between S181 and the docked
NH to the backbone carbonyl of S181 ( Fig. 7A, B ); this hydrogen bond is maintained in both sets of lowest energy models, the top 10 created with the disulfide bond-forming routine (supplementary Fig. 2b) and the top 10 created without disulfide bonds (supplementary Fig. 5).
It is intuitively tempting to speculate that the breaking of the interactions of E149, K128 and Q152 to S181, and E154 and E155 to H186 and R99, respectively ( Fig. 7B ), would trigger the opening of the lid away from the nucleophile, S181, exposing the nucleophile to the substrate and freeing it to form a hydrogen bond with the base, H180 or perhaps H186.

Glycosylation sites
LCAT is a glycoprotein. Attached carbohydrate residues are expected to be on the protein surface exposed to the solvent. The known N-glycosylated residues, N20, N84, N272, and N384, and the O-glycosylated residues, T407 and S409 ( 3 ), are all located on loops on the surface of
residue whose mutation, R147W, also causes LCAT deficiency ( 47 ) ( Fig. 7 ).
E149 in human LCAT, the residue immediately preceding the lid in m5, is critical in determining the activity of LCAT with PC substrates containing different sn-2 fatty acyl compositions ( 46 ). It is probably no coincidence that E149 forms a hydrogen bond to the hydroxyl moiety of S181 ( Fig. 7A ) and forms salt bridges to two basic residues on the opposite side of the lid that are nearby in the primary sequence, R147 and R158 ( Fig. 7B ). The salt bridge fluctuates between R147 and R158 in different low energy models (data not shown). It is equally significant that a mutation of one of the two basic residues salt bridged to E149, R147W, is associated with LCAT deficiency ( 47 ). Finally, K218, also associated with LCAT deficiency via the mutation K218N, forms a hydrogen bond to the backbone carbonyl of H180 ( Fig. 7A, B ), restricting the motion of both the nucleophile S181 and the base H180.
One additional residue of note, Q152 in the lid, is directly associated with S181 via a hydrogen bond of its backbone
lid residue E155, might move into position to act as the base.
The occurrence precisely at the elbow of the L-shaped hydrophobic trough of S181, E149, K218, and a β-turn structure with a sequence similar to previously described lipase-like lids ( 45 ), moieties that interact with each other, is also unlikely to be coincidental. Further, the dimensions of the L-shaped hydrophobic trough, which fit the size and shape of a docked PC molecule especially snugly after energy minimization and a short MD simulation (m5p and m5d), suggest that the general features of the proposed PC binding site are plausible and likely correct. Finally, the spatial closeness allowing creation of the correct disulfide bonds without significant structural distortion and the presence of all potentially glycosylated residues on the solvent-exposed surface of the LCAT structure directly opposite the lipid-binding trough are also unlikely to be coincidental.
Despite relatively low sequence conservation and identity between human LCAT and the template structures, the structural attributes of our model m5 that match known LCAT features ( 3,4 ) support the general correctness of the human LCAT homology model presented here. Although the general consensus is that sequences falling below 20% sequence identity can have very different structures ( 49 ), detectable levels of sequence similarity usually imply significant structural similarity ( 17 ). The most distinctive features of our model, the interactions between S181, E149, and the lid, and between H180-D204, E149-R147, E149-R158, E154-R99, and E155-H186, are located in the core region of the alignment ( Fig. 2 ). The likely reason that we were able to derive a plausible model for LCAT under conditions of low similarity between target and template is that the core of our alignment has only a single residue gap; core regions tend to have conserved structural folds and are more amenable to homology modeling than peripheral loop regions ( 39 ).
While the α-helical content of the model is 25%, in close agreement with the experimentally determined value of 24-25% ( 40 ), we are somewhat less confident about the β-sheet content of 14-17% calculated by the SS program used by RasMol. A previous report using circular dichroism spectroscopy suggested a β-sheet content of 40% ( 40 ) for human LCAT. While circular dichroism spectroscopy is not considered a particularly accurate way to calculate β-sheet content ( 50 ), the β-sheet content calculated for our LCAT model may be somewhat low.
Examination of the ribbons representation of our model in Fig. 3B shows a number of mostly parallel extended chain segments, many meeting the structural criteria of β-strands, which thread back and forth through the center of the structure between the surface α-helixes. The parallel extended chains not recognized as β-sheet by the PyMOL program used for the illustration or by the SS program used by RasMol for percent calculation may be, in several instances, β-sheets in the actual LCAT structure. Because the spatial distribution of α carbons is similar for extended chains and individual strands in β-sheets, the
the model on the side of the LCAT model opposite the hydrophobic trough (dotted pale green spacefilling representation in Fig. 3B ).

Goodness of fi t of model m5 to known LCAT attributes
The human LCAT homology model m5 presented here contains multiple structural attributes that match known LCAT features previously determined by biochemical and biophysical studies ( 3,4 ). These include: i ) a putative catalytic triad ( 13 ) in which an aspartic acid is hydrogen bonded to a histidine; ii ) a ␤ -turn lid ( 45 ) covering and hydrogen bonding to the backbone carbonyl of the known nucleophile of the catalytic triad, S181; iii ) a unique location for E149, a residue that determines the specifi city of LCAT toward the sn-2 chain of the PC substrate ( 48 ), allowing its backbone carbonyl to hydrogen bond to the hydroxyl of S181 and salt bridge to a pair of arginine residues, one associated with LCAT defi ciency, on the opposite side of the lid from S181; iv ) a direct interaction of K218, a residue associated with LCAT defi ciency, with the nucleophile, S181; v ) an L-shaped hydrophobic trough containing the nucleophile (S181), the lid, E149, and K218 (associated with LCAT defi ciency) at its elbow; vi ) hydrophobic trough dimensions that snugly accommodate the sn-2 chain of a PC molecule in its longer end and the sn-1 chain and the choline head group (or UC) in its shorter end; vii ) disulfi de bonds ( 3,4 ) formed without structural distortion; viii ) location of the six known glycosylation sites ( 3 ) on surface loops; ix ) a structure belonging to the ␣ / ␤hydrolase fold family; and x ) an ␣ -helical content identical to experimentally determined values ( 40 ).
The close juxtaposition and proper ordering of a correct catalytic triad of residues, S181 ↔ H180 ↔ D204, seems unlikely to be coincidental. S181 has been shown to be the primary catalytic residue in human LCAT ( 42 ), so the presence of H180 immediately adjacent to it in the primary sequence is significant; no other His is within 10 Å of S181. Even more unlikely to have occurred randomly is the hydrogen bond formation between H180 and D204 ( Fig. 5 ), a close juxtaposition seen in both sets of 10 lowest energy LCAT homology models ( Fig. 5A ; supplementary Fig. 2b) with Cα distances varying between 5 and 6 Å; no other Asp is within 10 Å of H180.
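The uniqueness argument above (no other His within 10 Å of S181, no other Asp within 10 Å of H180) amounts to a distance-cutoff query on the model coordinates. The following is a minimal sketch of such a query, assuming NumPy; the function name `residues_within` and all coordinates are hypothetical placeholders chosen for illustration and are not taken from model m5.

```python
import numpy as np

def residues_within(coords, center, residue_letter, cutoff=10.0):
    """Return (label, distance) pairs for residues of one type near `center`.

    coords: dict mapping a residue label such as "H180" to an (x, y, z)
            reference-atom position in Å (toy values here, not model m5).
    center: label of the residue the search is centred on.
    residue_letter: one-letter code of the residue type to look for.
    """
    origin = np.asarray(coords[center], dtype=float)
    hits = []
    for label, xyz in coords.items():
        if label == center or not label.startswith(residue_letter):
            continue
        distance = float(np.linalg.norm(np.asarray(xyz, dtype=float) - origin))
        if distance <= cutoff:
            hits.append((label, distance))
    return sorted(hits, key=lambda pair: pair[1])

# Hypothetical coordinates, for illustration only.
toy_coords = {
    "S181": (0.0, 0.0, 0.0),
    "H180": (3.8, 0.0, 0.0),    # sequence neighbour, close in space
    "H186": (12.0, 5.0, 0.0),   # 13 Å away, outside the cutoff
    "H122": (20.0, 0.0, 0.0),
    "D204": (6.0, 4.0, 0.0),
}

his_near_ser = residues_within(toy_coords, "S181", "H", cutoff=10.0)
```

With real model coordinates, the same query run on every low-energy model would show whether the candidate His and Asp are consistently the only ones inside the cutoff.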
However, the juxtaposition of a nucleophile next to a base in the primary sequence is unprecedented for a catalytic triad. We have shown that it is possible for H180 and S181 to rearrange their [φ, ψ] angles to allow hydrogen bonding between the hydroxyl moiety of S181 and one of the nitrogen moieties of the imidazole ring of H180. The resulting conformation might be a bit strained, but it is known that there is usually unfavorable geometry between the serine and histidine of a catalytic triad, a strain that reduces the energy of the hydrogen bond to facilitate the relaying of the proton to the substrate ( 41 ). In an alternative scenario, after the lid opens, H186, salt bridged to the the hydrophobic trough is significantly more conserved than the sn-1/UC end (supplementary Fig. 7b).

Mechanism of interaction of LCAT with lipid-associated apoA-I
MD simulations of dHDL containing both POPC and UC consistently result in a gap forming between the pairwise helix 5 repeats ( 55,56 ) of the antiparallel molecular double belt structure ( 57 ) of apoA-I in dHDL particles. Examination of an ensemble of 16 MD simulations ( 55,58,59 ) showed that, in 14 of the 16, the methyl ends of POPC acyl chains (POPC-Me) from the bilayer center are inserted into the gap and exposed to solvent (one example is shown in Fig. 8A ). In the other two examples of the ensemble, the hydroxyl moiety of UC is inserted into the gap and exposed to solvent (one example is shown in Fig. 8C ).
We proposed several years ago that, after attachment of LCAT to dHDL, the gap between pairwise helix 5 repeats in apoA-I creates an amphipathic "presentation" tunnel for migration of hydrophobic acyl chains and amphipathic global packing of residues will also be similar between the homology model and actual human LCAT.
LCAT is known to esterify cholesterol with POPC sn-2 chains of varying lengths. To accommodate itself to a given sn-2 chain, the length of the trough in model m5 can be altered by movement of residues F1 and W2 ( Fig. 3C ). Measurements of the 10 lowest energy LCAT homology models (formed without disulfide bonds) give an average distance between W2 and S181 of 26 Å and a range of 22–31 Å (not shown); the extended acyl chain of oleic acid (OA) ( Fig. 6D ) is approximately 24 Å in length.
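The comparison above reduces to summarizing a set of per-model W2 to S181 distances and checking whether the extended chain length falls inside the observed range. A minimal sketch follows, assuming NumPy; the helper `summarize_trough_lengths` and the three input distances are hypothetical toy values, not the measurements from the 10 models.

```python
import numpy as np

def summarize_trough_lengths(distances, chain_length):
    """Summarize per-model trough lengths (Å) against one chain length (Å)."""
    values = np.asarray(distances, dtype=float)
    return {
        "mean": float(values.mean()),
        "min": float(values.min()),
        "max": float(values.max()),
        # True if at least one sampled trough length can hold the chain
        "chain_fits_range": bool(values.min() <= chain_length <= values.max()),
    }

# Toy distances for illustration only (not the measured values).
toy_distances = [22.0, 26.0, 30.0]
summary = summarize_trough_lengths(toy_distances, chain_length=24.0)
```

Replacing the toy list with the measured distances from each low-energy model would reproduce the reported average and range directly.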

Stability of the hydrophobic trough of the homology model of LCAT
The exposure to solvent of the hydrophobic trough of our model m5 poses a thermodynamic conundrum. The hydrophobic trough for the sn-1 chain of oxidized PC in the published structure for a truncated form of Lp-PLA2 is presented as solvent exposed ( 13 ). In model m5d ( Fig. 6 ), although both the sn-1 and sn-2 chains of POPC are deeply buried in the long L-shaped acyl trough, the hydrophobic trough is also solvent exposed. The importance of the hydrophobic effect in biological systems ( 51 ) suggests that additional conformational changes or macromolecular interactions may be required before either Lp-PLA2 or LCAT achieves its enzymatic function.
Possible changes or interactions to stabilize the hydrophobic trough in m5 during LCAT function include: i ) dimerization (supplementary Fig. 6a); ii ) scissor-like articulation about the catalytic triad along the long axis of the hydrophobic trough resulting in juxtaposition of the sn-1 and sn-2 acyl chains (supplementary Fig. 6b); iii ) zipper-like association of the opposite edges of the long axis of the hydrophobic trough (supplementary Fig. 6c); and iv ) association of the complete hydrophobic trough with apoA-I on HDL particles (supplementary Fig. 6d). To evaluate possible mechanisms for trough closure in the monomeric state, we used NMA.
NMA is a computational method that can reveal potential changes in the conformation of large proteins ( 52 ). The biologically relevant modes for proteins are low-frequency modes that represent the most likely large-scale motions of the protein ( 53 ). When we applied all atom NMA to model m5 ( 54 ), the motions of the lowest frequency modes, 7 and 8, were located predominantly in the L-shaped hydrophobic trough.
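In a normal mode calculation on a free molecule, the first six modes are rigid-body translations and rotations with zero frequency, which is why modes 7 and 8 are the lowest-frequency internal motions. The sketch below illustrates this with a coarse-grained elastic network (anisotropic network model) rather than the all-atom NMA applied to model m5; the cutoff, the spring constant, and the idealized helical test coordinates are assumptions made for illustration only.

```python
import numpy as np

def anm_modes(coords, cutoff=15.0, gamma=1.0):
    """Eigen-decompose the Hessian of a simple elastic network model.

    coords: (N, 3) array of node positions in Å (e.g. one node per residue).
    cutoff: nodes closer than this distance are joined by a spring (assumed).
    gamma:  uniform spring constant (assumed).
    Returns eigenvalues in ascending order and eigenvectors as columns.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = float(d @ d)
            if r2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(d, d) / r2  # off-diagonal 3x3 block
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return np.linalg.eigh(hessian)

# Idealized 20-residue helix used only as test geometry (not LCAT).
t = np.arange(20)
angle = np.radians(100.0) * t
helix = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * t])

eigvals, eigvecs = anm_modes(helix)
n_rigid = int(np.sum(np.abs(eigvals) < 1e-6))  # zero-frequency rigid-body modes
mode7 = eigvecs[:, 6].reshape(-1, 3)           # first internal mode, per-node displacement
```

Because the eigenvalues are returned in ascending order, index 6 (zero-based) corresponds to "mode 7" in the numbering used in the text. Per-node displacement magnitudes of that mode indicate which part of a structure moves most.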
In mode 7 (see supplementary Fig. 7a; supplementary Video 7a), the diagonal quadrants of the hydrophobic trough move toward and away and orthogonal to one another to create a sliding motion of each edge along the long axis of the trough. This motion could be involved in transferring PC and UC molecules from the apoA-I presentation tunnel ( 55,56 ) to the LCAT active site. In mode 8 (see supplementary Fig. 7b; supplementary Video 7b), the horizontal quadrants of the hydrophobic trough move toward and away from one another to open and close the trough, a motion that could help shield the acyl tails from solvent. It is worthy of note that, as might be expected, the sn-2 end of
( 55 ) showing the pairwise H5 helices of apoA-I forming the acyl chain/free cholesterol "presentation" tunnel (green). The methyl end of the sn-2 chain of one POPC inserts with 80-90% fidelity during simulation into the presentation tunnel, likely due to an increased mobility relative to the saturated sn-1 chain. B: LCAT model containing docked POPC indicating the sn-2 and sn-1 binding troughs and the helical pair, H1 and H2, located at the end of the sn-1 trough. C: A second simulated dHDL particle ( 59 ) showing the -OH of an UC inserted, in 5-10% of the MD simulations, into the 5/5 presentation tunnel. D: LCAT model containing docked OA and UC in the sn-2 and sn-1 binding end, respectively, and the helical pair, H1 and H2, located at the end of the sn-1 trough. Gray and yellow arrows show proposed transfer of POPC and UC, respectively, from dHDL. All molecular model figures were created with the PyMOL molecular graphics system (version 1.3, Schrödinger).
UC to the active site of LCAT ( Fig. 8 ). At some point during the interaction, the association of LCAT with HDL might shift to a full trough orientation (supplementary Fig. 6d).
In this model, apoA-I becomes part of the LCAT molecule when substrate is in the tunnel and the tunnel connects to the active site of LCAT. Studies by Parks and Gebre ( 7 ) suggest that the poor LCAT reactivity for long-chain sn-2 acyl groups is multifactorial, involving turnover of substrate molecules at the active site of the enzyme and activation of the enzyme by its cofactor apoA-I. Our model is compatible with these observations because it suggests that turnover at the active site should be indistinguishable from activation, i.e., both would depend upon apoA-I structure.

CONCLUSIONS AND EXPERIMENTAL PREDICTIONS
The juxtaposition of multiple complicated structural motifs in a single model of LCAT, each motif matching extensive experimental biological and structural information ( 3,4 ), is compelling evidence for the general correctness of many features of the model. We consider it unlikely, however, that the hydrophobic lipid-binding trough is open to the solvent and we suggest as a working hypothesis that LCAT is dimeric in solution and monomeric when performing its enzymatic functions while associated with HDL. Because the model provides a detailed molecular solution to the complicated thermodynamic problem of how hydrophobic acyl chains and amphipathic UC can be transferred from HDL to the catalytic triad of LCAT, it has important implications for understanding the antiatherogenic role of HDL in reverse cholesterol transport.
In lieu of a full crystal structure for human LCAT, our model predicts the detailed location and identity of residues in four major functional motifs: i ) the catalytic triad ( 13 ); ii ) the β-turn lipase lid ( 45 ) covering the key catalytic triad residue, S181; iii ) the L-shaped hydrophobic trough for uptake and binding of a PC molecule and its product after hydrolysis, an sn-2 fatty acyl chain and the UC substrate in the last step of esterification; and iv ) activation of the lipase-lid by a trigger mechanism that involves E149, K218 and K147.
Powerful tests of the model would be to assay the effects on biological activity of LCAT of rationally designed site-directed mutants of key residues of these four functional motifs: i ) the catalytic triad residues H180 and D204; ii ) lid residues G151 and Q152; iii ) aromatic residues in the L-shaped hydrophobic trough, e.g., W291, Y292, and Y157; and iv ) residues in the putative lid trigger mechanism H186, E149, and K147. An additional test would be to perform chemical cross-linking of specific residues in solution phase human LCAT and fitting of observed crosslinks to contact map/cross-link matrix plot analyses ( 63 ).
The authors would like to thank University of Alabama at Birmingham Information Technology and Department of
UC from the bilayer to the active site of LCAT ( 55,56 ). It is especially noteworthy that the gap between pairwise helix 5 domains has also been observed by X-ray crystallography ( 60 ). We considered the novel and consistent conformation of the residues forming the pairwise helix 5 gap after MD simulations to be suggestive of a role of the gap in activation of LCAT by lipid-associated apoA-I. In essentially every simulation of dHDL, either one or two terminal methyl groups or UC are inserted into the tunnel gap ( 55 ). For a movie of the insertion of two acyl methyl groups in the "presentation" tunnel during MD simulation see supplementary Video 8.
It has been suggested that the sn-2 chains of PC and UC have separate binding sites within LCAT ( 61 ). We note that the cavity on the short side of the L-shaped trough (opposite the sn-2 acyl chain) that initially binds the sn-1 chain in model m5 ( Fig. 6C ) is also of the correct size and shape for the binding of UC ( Fig. 6D ). Once the sn-2 chain of PC passes into the LCAT hydrophobic trough and is converted to an acyl ester enzyme intermediate, UC could displace the bound lysolecithin molecule as illustrated in Fig. 6D . It seems likely that the displaced lysolecithin would partition back into the lipid bilayer of the LCAT-associated dHDL before being taken up by albumin.
Although LCAT may initially bind to parts of apoA-I in addition to the pairwise helix 5 domain, e.g., arginine residues in tandem helical repeat 6 ( 62 ), the presentation tunnel hypothesis implies that during CE formation the pairwise helix 5 repeats interact directly with LCAT. While a scissor-like articulation about the catalytic triad along the long axis of the hydrophobic trough resulting in juxtaposition of the sn-1 and sn-2 acyl chains might be an intermediate in apoA-I/LCAT interactions (supplementary Fig. 6b), it seems unlikely this conformation alone would allow PC transfer from dHDL. Further, although it may be the state of LCAT in solution, it seems improbable that a dimer of LCAT (supplementary Fig. 6a) is directly involved in PC transfer from dHDL. This leaves two more likely conformations for lipid transfer: a zipper-like association of the opposite edges of the long axis of the hydrophobic trough (supplementary Fig. 6c) and/or the association of the complete hydrophobic trough with apoA-I on HDL particles (supplementary Fig. 6d).
The gap between helix 5 repeats of apoA-I in HDL particles is lined on both sides by four charged residues, E125, K133, E136, and K140 ( 55,56 ). A potential candidate for LCAT docking to the pairwise helix 5 gap is the methyl end of the sn-1 chain binding trough. Figure 8B shows that the methyl end of the sn-1 chain is held in place by antiparallel helical domains H1 (residues 113-125) and H2 (residues 345-354) of the LCAT model (also see Fig. 3B, D ). Both H1 and H2 contain acidic and basic residues (D133, K116, H122, and D345, D346, R351, E354, respectively) that could salt bridge to the pairwise helix 5 gap in lipid-associated apoA-I. In this way, the H1-H2 end of the hydrophobic trough of LCAT would bind to the presentation tunnel of phospholipid-rich HDL particles to open a passageway from the bilayer interior to the active sites of LCAT for movement of amphipathic PC acyl chains and