Genome profiling of sterol synthesis shows convergent evolution in parasites and guides chemotherapeutic attack.

Sterols are an essential class of lipids in eukaryotes, where they serve as structural components of membranes and play important roles as signaling molecules. Sterols are also of high pharmacological significance: cholesterol-lowering drugs are blockbusters in human health, and inhibitors of ergosterol biosynthesis are widely used as antifungals. Inhibitors of ergosterol synthesis are also being developed for Chagas's disease, caused by Trypanosoma cruzi. Here we develop an in silico pipeline to globally evaluate sterol metabolism and perform comparative genomics. We generate a library of hidden Markov model-based profiles for 42 sterol biosynthetic enzymes, which allows expressing the genomic makeup of a given species as a numerical vector. Hierarchical clustering of these vectors functionally groups eukaryote proteomes and reveals convergent evolution, in particular metabolic reduction in obligate endoparasites. We experimentally explore sterol metabolism by testing a set of sterol biosynthesis inhibitors against trypanosomatids, Plasmodium falciparum, Giardia, and mammalian cells, and by quantifying the expression levels of sterol biosynthetic genes during the different life stages of T. cruzi and Trypanosoma brucei. The phenotypic data correlate with genomic makeup for simvastatin, which showed activity against trypanosomatids. Other findings, such as the activity of terbinafine against Giardia, are not in agreement with the genotypic profile.

because the evolution of eukaryotes is thought to be interlinked with that of the sterols (21).
Here we present an in silico pipeline for comparative genomics of eukaryotes based on prediction of sterol biosynthetic enzymes. Our aim is to shed light on the evolutionary relationships of these enzymes and to identify new antiparasitic drug target candidates. Focusing on trypanosomatids, we further compare the stage-specificity of expression of the sterol biosynthetic enzymes, and we probe the potential of selected SBIs against a panel of parasites.

Sequences
Proteome files were downloaded from UniProt (22), EuPathDB (23), and Integr8 (24). The predicted proteomes were tested for completeness against the 100 most conserved proteins of the Core Eukaryotic Genes Mapping Approach database (25), which we had determined based on HMMer 3.0 profile (26, 27) searches of eukaryote reference proteomes (C. elegans,
inhibitors (SBIs) are widely deployed as chemotherapeutics. As indicated in Fig. 1, human HMG-CoA reductase serves as the target of cholesterol-lowering statins such as simvastatin (9). Human farnesyl diphosphate synthase is the target of bisphosphonates (e.g., tiludronate), used against osteoporosis (10). Squalene epoxidase and sterol 24-methyltransferase are antifungal targets, inhibited by allylamines (e.g., terbinafine) and azasterols, respectively (11-14). A particularly promising target is sterol 14-demethylase (CYP51). Azole inhibitors of CYP51 (e.g., ketoconazole) are widely used for fungal infections and, because trypanosomatid parasites also make ergosterol (15-17), lend themselves to a piggyback approach toward the urgently required new drugs for Chagas's disease. The latest antifungal approved by the U.S. Food and Drug Administration, posaconazole, was shown to be highly active against Trypanosoma cruzi in culture and in vivo (18-20), and it is currently undergoing phase 2 clinical trials for the treatment of Chagas's disease. Second, sterol biosynthetic enzymes are of phylogenetic importance. Sterols were proposed to hold a key to understanding eukaryote phylogeny

Bioinformatics
Multiple alignments were performed with ClustalW 2.0.10 (29). All profile constructions and searches were carried out with HMMer 3.0 (26). The heat map was produced with the R library gplots (30). Hierarchical clustering was performed with the R library Pvclust, using Canberra distance and the McQuitty algorithm (31). Pvclust assesses uncertainty in hierarchical cluster
Redundancy-reduced sets of reference sequences for the enzymes of interest (Table 1) were downloaded from UniRef90 (28).
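The clustering settings named in this section (Canberra distance, McQuitty linkage) can be illustrated with a minimal sketch in plain Python. It is not the Pvclust implementation used in the study: the function names and the toy score vectors are hypothetical, and the convention of skipping coordinates where both values are zero is an assumption that may differ from the R defaults. McQuitty linkage is taken here as WPGMA, in which the distance from a merged cluster to any other cluster is the unweighted mean of the two former distances.

```python
from itertools import combinations


def canberra(u, v):
    """Canberra distance; coordinates where both values are zero are skipped."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def mcquitty(profiles):
    """Agglomerative clustering with McQuitty (WPGMA) linkage.

    profiles maps a species name to its score vector.
    Returns the merges as (members_a, members_b, height) tuples.
    """
    vectors = list(profiles.values())
    clusters = {i: (name,) for i, name in enumerate(profiles)}
    dist = {
        frozenset((i, j)): canberra(vectors[i], vectors[j])
        for i, j in combinations(clusters, 2)
    }
    merges = []
    next_id = len(vectors)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        i, j = sorted(pair)
        height = dist.pop(pair)
        merges.append((clusters[i], clusters[j], height))
        # WPGMA update: new distance is the plain mean of the two old ones.
        for k in clusters:
            if k in (i, j):
                continue
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = 0.5 * (d_ik + d_jk)
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1
    return merges


# Hypothetical three-enzyme score vectors (illustration only, not study data).
toy = {
    "plantA": [300.0, 0.0, 250.0],
    "plantB": [280.0, 0.0, 260.0],
    "parasite1": [0.0, 0.0, 20.0],
    "parasite2": [0.0, 0.0, 25.0],
}
merges = mcquitty(toy)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

With these toy vectors, the two "plant" profiles merge first, the two "parasite" profiles second, and the two groups last.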

Chemicals
Simvastatin (S6196-25MG), tiludronate disodium salt hydrate (T4580-10MG), terbinafine hydrochloride (T8826-100MG), and ketoconazole (K1003-100MG) were purchased from Sigma-Aldrich. Fenpropimorph was kindly provided by M. Witschel from BASF. The test compounds were dissolved in dimethyl sulfoxide at 10 mg/ml and stored at -20°C. Resazurin sodium salt (Alamar Blue) was
Table 1 for enzyme name EC classifiers.
The known differences among the analyzed eukaryotes were obvious. The absence of the MEV pathway (enzyme nos. 2 to 7 of Table 1) and the presence of the MEP/DOXP pathway (enzyme nos. 8 to 14) were apparent in most apicomplexan species, with the exception of Cryptosporidium parvum, which lacked both pathways (Fig. 2, top). This is in agreement with C. parvum having lost the plastid genome in the course of evolution (38). It was interesting to note that E. histolytica also lacked both pathways, a finding that, to our knowledge, had not been reported before (39). Entamoeba was reported to be capable of limited cholesterol synthesis (40). Our data, however, support earlier reports suggesting an absolute requirement for cholesterol by Entamoeba (41). The vascular plants possessed both pathways, but the green algae C. reinhardtii and Ostreococcus tauri only had the MEP/DOXP arm, a fact that had been elegantly demonstrated for C. reinhardtii and other green algae by feeding experiments with radiolabeled glucose (42).
As expected, the sterol 24-C-methyltransferases SMT1 and SMT2 (enzyme nos. 37 and 41), characteristic of ergosterol synthesis, were absent in all vertebrates. The genome-wide screen also confirmed the presence of a set of sterol synthetic enzymes, including 24-C-methyltransferases, in trypanosomatids. Surprisingly, though, the trypanosomatids lacked sterol O-acyltransferase as well as cholesteryl ester hydrolase (enzyme nos. 34 and 33, respectively), and yet T. brucei had been shown to be capable of sterol esterification (43), suggesting that trypanosomatids might use unusual enzymes for ester synthesis and hydrolysis.
Only 2 of the 42 sterol metabolic enzymes were present in all the analyzed proteomes: geranyl/farnesyl diphosphate synthase (enzyme nos. 17 and 20, respectively). Thus, our proteome-wide profiling approach indicates that the synthesis of farnesylpyrophosphate is essential to all eukaryotes and that farnesylpyrophosphate is a metabolite of central importance to organisms, whether they use the MEV or the MEP/DOXP pathway. A possible explanation is that farnesylation and geranylation of proteins are essential to eukaryotes. This hypothesis is in agreement with the fact that farnesyltransferases (EC 2.5.1.29, 2.5.1.58, and 2.5.1.60) occur in all the analyzed eukaryotes, as determined with a profile search analogous to those for sterol biosynthetic enzymes (data not shown).

Hierarchical clustering reveals convergent evolution
To detect less obvious and possibly new relationships regarding sterol metabolism of eukaryotes, we subjected the sterol metabolic profiles (Fig. 2) to hierarchical clustering. Every analyzed species was represented by a 42-tuple vector consisting of the best scores of the respective proteome to each profile. Hierarchical clustering of these vectors produced the "sterol biosynthesis tree" shown in Fig. 3, which basically subdivided the eukaryotes into species that make their own sterols (sterol prototrophs, left side) and species that do not (sterol auxotrophs, right side). The tree locally mirrored the phylogeny of the analyzed eukaryotes. The green plants
analysis by implementing multiscale bootstrap resampling (n = 10,000) to estimate "approximately unbiased" (au) errors, where P = (100 - au)/100. For principal component (PC) analysis we used basic R functions and ggplot2. The analyses were automated with Unix shell scripts and with Perl scripts. The phylogenetic trees were constructed from amino acid sequences with Mega 5, using the neighbor-joining algorithm and Jones-Taylor-Thornton substitution model. The number of bootstrap replications was 1,000.

Gene expression
Tag counts of the sterol biosynthesis enzymes were extracted from three published T. brucei short-read libraries (long slender bloodstream forms, short stumpy bloodstream forms, and procyclic tsetse fly midgut forms) and from four T. cruzi short-read libraries (intracellular amastigotes, trypomastigotes, epimastigotes, and metacyclics). The libraries had been produced by Illumina sequencing using the spliced-leader trapping (SLT) protocol (32). Numbers of reads were normalized by using the DESeq (33) Bioconductor package. The gene expression data were modeled with prcomp as a numerical matrix of M genes times N libraries. Eigenvalues and orthogonal eigenvectors were computed based on the square-symmetrical correlation matrix.
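The read-count normalization can be illustrated with the median-of-ratios size-factor estimate generally associated with DESeq. The following is a minimal sketch in plain Python under stated assumptions, not the Bioconductor code used in the study: the function names and the count table are hypothetical, and genes with a zero count in any library are simply excluded from the size-factor estimate.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts is a list of rows (genes), each holding one count per library.
    Genes with a zero count in any library are ignored (assumption).
    """
    n_libs = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across libraries.
    log_geomeans = [sum(math.log(c) for c in row) / n_libs for row in usable]
    factors = []
    for j in range(n_libs):
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(usable, log_geomeans)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide each count by the size factor of its library."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical tag counts: 3 genes x 2 libraries, library 2 sequenced twice as deep.
raw = [[10, 20], [100, 200], [5, 10]]
factors = size_factors(raw)
normalized = normalize(raw)
print(factors)
print(normalized)
```

In this toy table the second library has exactly twice the depth of the first, so after normalization each gene has the same value in both libraries.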

In vitro drug sensitivity
In vitro drug sensitivity assays were performed as described (34-37). The tests were done over 72 h of incubation, except for the T. cruzi assay, which lasted 96 h. For L6 cells, L. donovani, T. brucei, and G. intestinalis, the redox-sensitive dye resazurin (Alamar Blue) served as an indicator of cell viability. For P. falciparum, incorporation of [3H]hypoxanthine was used. For T. cruzi, β-galactosidase activity was quantified with the substrate CPRG/Nonidet. IC50 values were estimated by linear interpolation based on the semilogarithmic dose-response curves.
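Linear interpolation on a semilogarithmic dose-response curve can be sketched as follows. This is an illustrative reading of the procedure, not the authors' script: the function name and the example values are hypothetical, and the response is assumed to be expressed as percent of the untreated control, with concentrations in ascending order.

```python
import math


def ic50_linear_interpolation(concentrations, responses):
    """Estimate the IC50 from a dose-response series.

    concentrations: ascending drug concentrations (all > 0).
    responses: signal as percent of untreated control at each concentration.
    Interpolates linearly in log10(concentration) between the two
    neighboring points that bracket 50%. Returns None if 50% is never crossed.
    """
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= 50.0 >= r_hi and r_lo != r_hi:
            frac = (r_lo - 50.0) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical threefold dilution series (concentration unit is arbitrary).
conc = [0.1, 1.0, 10.0, 100.0]
resp = [98.0, 75.0, 25.0, 3.0]
ic50 = ic50_linear_interpolation(conc, resp)
print(ic50)
```

Because 50% lies exactly midway between the responses at 1 and 10, the estimate falls at the midpoint on the log scale, that is, at the square root of 10.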

Sterol metabolic profiling of eukaryote genomes
Aiming for a broad overview on sterol metabolism, we assembled a list of 42 relevant enzymes ranging from terpenoid backbone synthesis over squalene synthase to the formation of the different sterols and vitamin D derivatives as outlined in Fig. 1. For each enzyme, all the amino acid sequences that had been annotated in the manually curated section of UniProt with the corresponding EC number were retrieved (Table 1). Each of these sequence sets was redundancy reduced to 90%. Then a ClustalW multiple alignment was performed and converted to a position-dependent scoring matrix with hmmbuild of HMMer 3.0. The resultant 42 profiles were concatenated to a hidden Markov model (HMM) library for terpenoid backbone and sterol biosynthetic enzymes. This library was used for an in silico screen of eukaryote proteomes. For each proteome and each enzyme, we retrieved the profile-alignment score of the protein that had returned the lowest expectancy (E) value. This approach allowed organizing the data in an unbiased, quantitative way, by plotting the obtained high scores as a two-dimensional heat map where the horizontal cross-sections represent the "sterol biosynthetic profile" of a given organism (Fig. 2).
species located on the sterol prototrophic arm of the tree (Fig. 3).
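The best-hit selection described in this section, keeping for each enzyme profile the score of the protein with the lowest E value, can be sketched in a few lines of Python. The hit tuples, identifiers, and function name below are hypothetical and do not reproduce HMMer output parsing; assigning a score of 0.0 to enzymes without any hit is an assumption made for the illustration.

```python
def best_score_profile(hits, enzyme_order):
    """Build the per-species score vector from profile search hits.

    hits: iterable of (enzyme, protein_id, score, evalue) tuples.
    enzyme_order: list of enzyme names fixing the vector layout.
    For each enzyme the score of the hit with the lowest E value is kept;
    enzymes without any hit receive 0.0 (assumption).
    """
    best = {}
    for enzyme, _protein_id, score, evalue in hits:
        if enzyme not in best or evalue < best[enzyme][0]:
            best[enzyme] = (evalue, score)
    return [best[e][1] if e in best else 0.0 for e in enzyme_order]


# Hypothetical hits for one proteome against three profiles.
enzymes = ["HMG-CoA reductase", "squalene synthase", "sterol 14-demethylase"]
hits = [
    ("HMG-CoA reductase", "protA", 410.2, 1e-120),
    ("HMG-CoA reductase", "protB", 35.0, 1e-5),
    ("sterol 14-demethylase", "protC", 512.7, 1e-150),
    ("sterol 14-demethylase", "protD", 120.3, 1e-30),
]
profile = best_score_profile(hits, enzymes)
print(profile)
```

Stacking one such vector per species gives the matrix that is drawn as a heat map and passed on to clustering.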

What is the origin of trypanosomatid ergosterol synthesis?
The sterol biosynthesis tree (Fig. 3) is based on functional predictions and cannot serve for phylogenetic models. To investigate the evolutionary origin of the sterol biosynthetic genes in trypanosomatids, we constructed phylogenetic trees for two key enzymes of ergosterol synthesis: lanosterol 14α-demethylase (enzyme no. 24 in Table 1), the target of azoles, and sterol 24-C-methyltransferase (enzyme no. 37), the dedicative enzyme for ergosterol synthesis. Both trees had a similar topology (except that sterol 24-C-methyltransferase does not occur in animals) with distinct branches for the major groups of eukaryotes and a highly significant separate branch for the included trypanosomatid sequences (Fig. 4). It was impossible to root the trees because there are no suitable outgroups for the two enzymes such as orthologs from bacteria. Acquisition of foreign genes by trypanosomes from plants has been suggested for glycosomal enzymes of T. brucei (44, 45). The phylogenetic trees of lanosterol 14α-demethylase and sterol 24-C-methyltransferase clearly do not support such a scenario for the sterol biosynthetic enzymes. Based on this bioinformatic analysis, one would exclude horizontal transfer as the evolutionary origin of trypanosomatid ergosterol synthesis.
cosegregated, as did the trypanosomatids, the protostomes, and the deuterostomes where the sea urchin Strongylocentrotus purpuratus clearly clustered with the chordates. The fungi formed a cluster except for E. cuniculi, and the apicomplexans formed a cluster except for C. parvum. These two outliers segregated with G. lamblia and E. histolytica in a phylogenetically diverse branch of amitochondriates (Fig. 3). The analyzed ciliates clustered with the insects; however, this association was not significant, and clearly, a larger number of ciliate genomes would be desirable to better resolve their position in the sterol biosynthesis landscape.
We interpret cases where clustering based on sterol biosynthesis enzymes does not coincide with eukaryote phylogeny as indicative of convergent evolution. Thus, the amoebozoon Entamoeba and the fungus Encephalitozoon, both obligate endoparasites, clustered together on the sterol-auxotrophic branch of the tree, whereas the free-living amoebozoa D. discoideum and P. pallidum segregated on the sterol-prototrophic branch together with the free-living, or facultative parasitic, fungi (Fig. 3). A likely explanation of this clustering is that the eukaryote progenitor synthesized sterols de novo, whereas the obligate endoparasites independently lost the corresponding genes in adaptation to a parasitic lifestyle. The trypanosomatids are a notable exception: of all the included obligate endoparasites, they were the only
Table 1 for enzyme name and Fig. 1 for its position in the pathway.
embryonic fibroblast and also against lovastatin, with the same result: IC50 values clearly below 1 µM. Apparently, cholesterol synthesis was essential under our test conditions. Tiludronate was inactive against all the tested cells. The most potent compound against T. cruzi was ketoconazole, followed by simvastatin, which also showed a moderate activity against T. brucei (IC50 = 4.6 µM) and L. donovani (IC50 = 4.7 µM). The activity against T. cruzi was not conclusive due to the toxicity of simvastatin to L6 host cells (hence the parentheses in Table 2). Terbinafine, ketoconazole, and fenpropimorph were moderately active

Susceptibility of parasites to SBIs
Given the antichagasic potential of azoles, we tested a panel of further known inhibitors of sterol biosynthesis, as indicated in Fig. 1, against parasites (P. falciparum, L. donovani, T. cruzi, T. brucei, and G. lamblia) and mammalian cells (rat L6 myoblasts). IC50 values were determined in vitro (Table 2). As expected, the mammalian cells were rather tolerant to most of the tested drugs, except for simvastatin, which had an IC50 of 0.70 µM against the L6 cells. This result was surprising because simvastatin is widely used as a cholesterol-lowering agent. We also tested mouse
correlated with PC-2 (Fig. 6). Taken together, PC analysis of the steady-state expression levels of sterol biosynthetic genes singled out the insect stages of T. brucei as well as T. cruzi (Fig. 6), demonstrating parallel metabolic adaptations in African and South American trypanosomes.

CONCLUSION
Starting from the assumption that sterol synthesis is intrinsically linked to the evolution of eukaryotes, we performed comparative genomics based on the profiling of sterol biosynthetic enzymes (Fig. 1). The aim was to investigate convergent evolution in eukaryotes and possibly link this to chemotherapeutic strategies against parasites. We refrained from functionally annotating the analyzed
against P. falciparum with IC50 below 10 µM. Terbinafine was the only tested compound that showed a considerable effect on Giardia (IC50 = 0.62 µM). However, terbinafine inhibits squalene monooxygenase (enzyme no. 22), an enzyme that is absent in Giardia (Fig. 2). Comparing the in silico data of Fig. 2 with the in vitro data of Table 2, we would expect a negative correlation between the score of the profile search for a given enzyme and species and the IC50 of a known inhibitor of the same enzyme. As shown in Fig. 5, this was the case with simvastatin (Spearman rank order correlation coefficient r_S = -0.89, P < 0.05). For other compound-target pairs, there was either no correlation (ketoconazole and sterol 14-demethylase, r_S = 0.06) or a positive, albeit nonsignificant, correlation (tiludronate and farnesyl diphosphate synthase, r_S = 0.31; fenpropimorph and lathosterol oxidase, r_S = 0.64). For terbinafine and its presumed target squalene epoxidase (enzyme no. 22), there was even a significant positive correlation between profile score and IC50 (r_S = 0.89, P < 0.05; Fig. 5).
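The Spearman rank order correlation used to compare profile scores with IC50 values can be sketched in plain Python as the Pearson correlation of the ranks, with tied values receiving their average rank. The function names and the numbers below are hypothetical illustrations, not the data of Fig. 5 or Table 2.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; ties share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical profile scores and IC50 values for five species.
scores = [10.0, 50.0, 200.0, 400.0, 800.0]
ic50_values = [90.0, 40.0, 30.0, 5.0, 1.0]
r_s = spearman(scores, ic50_values)
print(r_s)
```

In this toy series the IC50 falls strictly as the profile score rises, which is the negative relationship one would expect when a drug acts through the profiled enzyme.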

Stage-specific regulation of sterol biosynthetic enzymes in trypanosomes
A possible reason for the lack of correlation between genomic makeup and drug susceptibility is that a particular target enzyme may be differently expressed across the life-cycle stages of a parasite. We investigated the expression of sterol biosynthetic enzymes at the mRNA level in the trypanosomatids T. brucei and T. cruzi using previously published SLT data. SLT takes advantage of the conserved miniexon that is spliced in trans to all trypanosomal mRNA (46). We analyzed data from different stages of T. brucei (slender bloodstream form, stumpy bloodstream form, and procyclic tsetse-midgut form) and T. cruzi (intracellular amastigote form, trypomastigote form, and epimastigote triatomine-midgut form) (32, 47). For both species, we found marked differences between the life-cycle stages regarding the steady-state expression levels of sterol biosynthetic genes. Generally, we detected higher mRNA levels of genes involved in sterol biosynthesis in the insect stages than in the mammalian stages of the organisms. This is in good agreement with the availability of sterols for the parasite in a mammalian host. PC analysis was performed for genes with orthologs in both species (n = 31). Plotting the first two PCs revealed the proliferating insect stages of both T. brucei and T. cruzi to more closely align with PC-2 than with PC-1, and to positively correlate with either PC (Fig. 6). In contrast, the proliferating mammalian stages, as well as all the nonproliferating stages, more closely aligned with PC-1 and oppositely
which belong to the amoebozoa but have been proposed to have algal ancestry based on their capability to synthesize cycloartenol (6). An unrooted phylogenetic tree of sterol 24-C-methyltransferase (Fig. 4) did not shed light on the ancestry of the enzyme (it was not possible to root the tree due to the lack of an ortholog from prokaryotes to use as an outgroup).
The main conclusion from the phylogenetic analysis is that there is no evidence for horizontal transfer as the origin of sterol biosynthetic genes in trypanosomatids. Apart from their enigmatic history, a major question brought forward by the sterol biosynthetic enzymes of trypanosomatids is to what extent they are exploitable for chemotherapy. Azoles are in development as antichagasic agents because they exhibit selective activity against T. cruzi. Their target is sterol 14-demethylase (enzyme no. 24), and in a phylogenetic tree of the enzyme, the trypanosomatid orthologs form a clearly distinct branch (Fig. 4). We tested other known inhibitors of sterol biosynthesis against trypanosomatids and other parasites (Table 2). In general, the in vitro activity of the inhibitors did not correlate with the presence of their presumed target enzyme (Fig. 5). Although the activity of ketoconazole against T. cruzi was unmatched, simvastatin also showed activity against trypanosomatids. However, the results were not conclusive because, surprisingly, simvastatin was toxic to mammalian cell lines even though it is widely used as a cholesterol-lowering drug. Furthermore, simvastatin was successfully used for the treatment of mouse (49) and dog (50) models of Chagas's disease.
Comparing the stage-specificity of expression of the sterol biosynthetic enzymes in T. brucei and T. cruzi revealed interesting parallels. The sterol metabolic enzymes were differentially regulated across the different life stages, and the expression patterns were similar for both species. The
proteomes because our in silico pipeline did not furnish proof of function. It provided quantitative scores for all enzymes and proteomes (Fig. 2), which lent itself for clustering. The resulting tree's principal subdivision was between the sterol-prototrophs and the sterol-auxotrophs (Fig. 3). The latter included all of the analyzed protostomes, the former the deuterostomes (i.e., the vertebrates plus the sea urchin S. purpuratus).
The interesting branches of the sterol metabolic tree were those where the grouping deviated from evolutionary descent. This was observed for unicellular obligate endoparasites, which appeared to have independently lost genes for sterol biosynthetic enzymes, presumably in adaptation to a parasitic lifestyle. Thus, the microsporidian Encephalitozoon did not group with the free-living and facultative parasitic fungi but with parasitic protozoa such as Giardia . The same was observed for Entamoeba , which did not group with the free-living amoebozoa but with the apicomplexan Cryptosporidium . In fact, both Entamoeba and Cryptosporidium exhibited extreme cases of metabolic reduction, lacking the sterol biosynthetic enzymes as well as either pathway, MEV or non-MEV, for isoprenoid synthesis. The only enzymes present in all the analyzed proteomes were the farnesyl/geranyl diphosphate synthases and the protein farnesyl transferase complex, indicating that protein prenylation is indispensable to all eukaryotes.
The only obligate endoparasites that possessed sterol biosynthetic genes were the trypanosomatids, Trypanosoma spp. and Leishmania spp. (Fig. 2). Ergosterol has long been known to occur in trypanosomatids (48), and all the analyzed species scored positive for the key enzyme in ergosterol synthesis, sterol 24-C-methyltransferase (enzyme no. 37 in Figs. 1 and 2 and Table 1). In the sterol metabolic tree of Fig. 3, all the trypanosomatids grouped together, sister to the slime molds Dictyostelium and Polysphondylium,
bloodstream forms correlated positively with each other and negatively with the insect forms (Fig. 6). The expression levels of the sterol biosynthetic genes were generally higher in the insect stages than in the mammalian stages of both T. cruzi and T. brucei. Sterol 24-C-methyltransferase (enzyme no. 37) might be a good drug target because it is highly expressed and thus probably essential in all the life-cycle stages. Finally, the trypanosomatids lacked the bona fide genes for sterol esterification (Fig. 2, enzyme nos. 33 and 34), and yet they had been shown to build sterol esters (43). Thus, we hypothesize that trypanosomatids possess atypical sterol ester synthase and esterase, which represent another possible point for chemotherapeutic intervention. In summary, we conclude that sterol metabolism offers further potential drug targets for selective inhibition of trypanosomes.