Clostridium scindens: a human gut microbe with a high potential to convert glucocorticoids into androgens

Clostridium scindens American Type Culture Collection 35704 is capable of converting primary bile acids to toxic secondary bile acids, as well as converting glucocorticoids to androgens by side-chain cleavage. The molecular structure of the side-chain cleavage product of cortisol produced by C. scindens was determined to be 11β-hydroxyandrost-4-ene-3,17-dione (11β-OHA) by high-resolution mass spectrometry, 1H and 13C NMR spectroscopy, and X-ray crystallography. Using RNA-Seq technology, we identified a cortisol-inducible (∼1,000-fold) operon (desABCD) encoding at least one enzyme involved in anaerobic side-chain cleavage. The desC gene was cloned and overexpressed, and the purified gene product was found to be a 20α-hydroxysteroid dehydrogenase (HSDH). This operon also encodes a putative “transketolase” (desAB) hypothesized to have steroid-17,20-desmolase/oxidase activity, and a possible corticosteroid transporter (desD). RNA-Seq data suggest that the two-carbon side chain of glucocorticoids may feed into the pentose-phosphate pathway and be used as a carbon source. The 20α-HSDH is hypothesized to function as a metabolic “rheostat” controlling rates of side-chain cleavage. Phylogenetic analysis suggests this operon is rare in nature and the desC gene evolved from a gene encoding threonine dehydrogenase. The physiological effect of 11β-OHA on the host or other gut microbes is currently unknown.

methylene chloride, dried under nitrogen, and separated on silica B gel TLC plates with solvent A [5:25:0.2 isooctane:ethyl acetate:glacial acetic acid (v/v/v)] (6). Products were scraped from TLC plates and extracted from silica with ethyl acetate, dried under nitrogen, and resuspended in methanol. Products were further purified by reverse-phase HPLC on a Beckman ODS C18 10 × 250 mm semi-prep column. Samples were separated at 2.5 ml/min at 25°C and monitored at 240 nm. Peaks were collected and dried under a nitrogen gas atmosphere.

High-resolution mass spectrometry
High-resolution mass spectrometry with atmospheric pressure chemical ionization (HR-APCI-MS) was carried out in the negative ion mode on a JEOL AccuTOF JMS-T100LC liquid chromatograph-mass spectrometer (JEOL, Tokyo, Japan) equipped with an APCI source and coupled to an Agilent 1200 series binary pump (Agilent, Santa Clara, CA). HR-APCI-MS of the sample was carried out in the flow injection mode, using methanol as the mobile phase at a flow rate of 1 ml/min. The ionization conditions were as follows: needle voltage, −2 kV; ion guide peak voltage, 2.5 kV; ion source temperature, 80°C; desolvation temperature, 500°C; orifice 1, 2, and 3 voltages, −85, −5, and −15 V, respectively; mass range, m/z 50-1,000; nebulizing gas, N2.
1H and 13C NMR spectra were obtained on a JNM-ECA 800 instrument operated at 800 and 200 MHz, respectively, with CDCl3 containing 0.1% tetramethylsilane (TMS) as the solvent. Chemical shifts were expressed in δ (ppm) relative to TMS, and the following abbreviations are used: s, singlet; d, doublet; br, broad. The 13C distortionless enhancement by polarization transfer (135°, 90°, and 45°) spectrum was measured to differentiate between CH3, CH2, CH, and C based on their proton environments. In order to further confirm the 1H and 13C signal assignments, 1H detected heteronuclear multiple quantum coherence (1H-13C coupling) and 1H detected heteronuclear multiple bond correlation (long-range 1H-13C coupling) experiments were also performed.
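The negative-ion measurement described above is interpreted by comparing the observed m/z of the deprotonated molecule with a calculated mass. A minimal sketch of that comparison follows. It assumes the elemental composition C19H26O3 for 11β-OHA (so C19H25O3 for [M-H]−), sums standard monoisotopic atomic masses, and applies no electron-mass correction; the function names are illustrative and are not part of the published workflow.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def monoisotopic_mass(formula):
    """Sum monoisotopic atomic masses for a dict such as {"C": 19, "H": 25}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(observed, calculated):
    """Relative mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


# Assumption: 11β-OHA is C19H26O3, so the deprotonated molecule is C19H25O3.
deprotonated = {"C": 19, "H": 25, "O": 3}
calc = monoisotopic_mass(deprotonated)

observed = 301.18038  # m/z reported for [M-H]- in this study
print(f"calculated m/z: {calc:.5f}")
print(f"mass error: {ppm_error(observed, calc):.3f} ppm")
```

Under these assumptions the calculated value rounds to the mass reported in the Results, and the difference from the observed m/z is well below 1 ppm.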

X-ray crystal structure determination
Colorless crystals of the steroid-17,20-desmolase reaction product suitable for X-ray crystal structure determination were grown by recrystallization of the compound from ether-n-hexane, which gave colorless needles with a melting point of 197-199°C. A crystal having dimensions of 0.25 × 0.10 × 0.05 mm³ was selected. The X-ray intensity measurements were carried out on a Rigaku Micro7HFM-VariMax Saturn 724R CCD system with a confocal X-ray mirror, using graphite-monochromated Mo Kα radiation (λ = 0.7107 Å) at 220 K. The structure was solved using direct methods and refined by a full-matrix least-squares procedure based on F² using the CrystalStructure crystallographic software package. The crystal structures were refined with anisotropic temperature factors for all nonhydrogen atoms. The positions of hydrogen atoms were generated theoretically and were refined using the riding model. Crystallographic details are summarized in supplementary Table II.

Bacterial strains, growth conditions, and sterol induction
C. scindens ATCC 35704 was purchased from ATCC. C. scindens ATCC 35704 was stored as a 30% glycerol stock at −80°C prior to this study, and working stocks were cultivated in chopped meat broth. After overnight growth in Brain Heart Infusion broth (BHI), a 5% inoculum was transferred to 100 ml BHI containing 50 µM cortisol. Cells used for RNA isolation were cultivated to an OD600 of 0.4 (mid-log phase), and quenched in a 1:2 v/v solution of RNALater (Ambion, Grand Island, NY) reagent and kept

(SDase); however, the genes encoding these enzymes have yet to be identified (7,8).
In mammalian cells, side-chain cleavage of glucocorticoids proceeds by an oxygen-dependent P450 monooxygenase reaction (9). However, gut microbial SDase occurs under anaerobic conditions. We showed previously that SDase was optimal under anaerobic conditions (optimum circa −130 mV) (6). Therefore, we expect the mammalian and gut microbial steroid-17,20-desmolase gene(s) to have arisen by convergent evolution. Human 20α-HSDH is encoded by AKR1C1, a member of the aldo/keto reductase family (10). However, AKR1C1 and microbial 20α-HSDH show different substrate specificities, again suggesting possible differences in gene ontology (7,8). The genome of C. scindens ATCC 35704 has recently been sequenced as part of the Human Microbiome Project reference gut microbial genome initiative (Accession: PRJNA18175). Both 20α-HSDH and steroid-17,20-desmolase activities in C. scindens ATCC 35704 are cortisol inducible (7,8). We decided to adopt a genome-wide transcriptomics (RNA-Seq) approach for gene discovery because we can control induction of the genes for cortisol metabolism, the genome sequence can be used as a scaffold to organize transcripts, and the mechanism of SDase, and thus the gene ontology, is unknown and not obvious from genome sequence alone.
In the present report, we thoroughly characterize the end product of bacterial steroid-17,20-desmolase and verify the identity of this steroid metabolite as 11β-hydroxyandrost-4-ene-3,17-dione (11β-OHA) through mass spectrometry (MS), 1H and 13C nuclear magnetic resonance (NMR) spectroscopy, and X-ray crystallography. Next, we identify a cortisol-inducible operon (desABCD) by quantifying and comparing mRNA levels from cortisol-induced versus uninduced cells of C. scindens ATCC 35704 using RNA-Seq. We demonstrate that this operon encodes a 20α-HSDH (desC) with substrate specificity similar to that reported for the native enzyme. Bioinformatic analysis suggests a hypothesis for the mechanism of microbial steroid-17,20-desmolase (desAB). The steroid-17,20-desmolase reaction is hypothesized to proceed by transketolation, with the two-carbon side chain feeding into the pentose-phosphate pathway. Our data also show that SDase activity is rare among bile acid 7α-dehydroxylating isolates from the human gut, and that only a fraction of C. scindens isolates have this activity. Finally, we perform a thorough phylogenetic analysis of the 20α-HSDH gene product (DesC), which appears to be rare in the gut microbiota of most organisms. The gene appears to have evolved from the gene encoding threonine dehydrogenase, and the operon appears to be conserved in organization, though not in function, among a small number of anaerobic microbes from diverse environments.

Purification and identification of steroid-17,20-desmolase product
Cortisol reaction products were extracted from reaction buffer by addition of 1/10 vol 1 M HCl followed by two extractions with 2 vol of

1.6% denaturing agarose gel. RNA samples with distinct 23S rRNA and 16S rRNA bands were further purified. In addition, residual genomic DNA contamination was determined by RT-PCR using the RT-for-PCR kit (Clontech) and 200 ng purified RNA.
Dynabeads M-280 Streptavidin (Invitrogen, Grand Island, NY) was made RNase-free according to the manufacturer's recommendations. Biotinylated oligonucleotides were bound to beads (5 mg) through streptavidin by resuspending the beads in 500 µl diethylpyrocarbonate-treated 0.5× SSC (0.075 M NaCl; 0.0075 M trisodium citrate dihydrate; 10 mM Tris, pH 7.0; 1 mM EDTA) containing 360 pmol of each biotinylated oligonucleotide (denatured by heating to 90°C for 5 min and cooling on ice for 3 min). Beads were then captured on a Promega magnetic stand (Promega, Madison, WI), the supernatant containing unbound probes was aspirated, and residual unbound probes were removed by washing twice with one volume of 0.5× SSC and twice with 6× SSC (0.9 M NaCl; 0.09 M trisodium citrate dihydrate; 10 mM Tris, pH 7.0; 1 mM EDTA). Efficiency of binding was determined by comparing the A260 of the oligo mix after bead incubation and washing with that of the oligo mix without bead incubation. Capture hybridization was performed on a Biorad C1000 thermocycler (Biorad, Hercules, CA). First, 1 µg total RNA suspended in 35 µl 6× SSC in an RNase-free 0.2 ml PCR tube was heated to 70°C for 5 min to denature the RNA, followed by 0°C for 3 min, at which point the thermocycler was paused to allow addition of 150 µg oligo-bound beads. The thermocycler was resumed at 68°C for 30 min, allowing capture hybridization between rRNA molecules and oligo-bound beads. Beads were then captured on a magnetic stand, the supernatant was combined with two additional (1 vol) washes with 6× SSC, and the enriched mRNA was centrifuged and magnetically captured to remove residual beads before being precipitated. RNA bound to beads was quantified by A260 after removal by three stringent washes with 0.5× SSC containing 10% formamide at 70°C. Enriched mRNA was quantified on a NanoDrop 2000 (Thermo Scientific, Waltham, MA) and stored at −80°C until sequencing.

Whole transcriptome sequencing
mRNA-enriched samples were processed using the Roche cDNA synthesis system following the manufacturer's protocol. Resulting cDNA libraries were sequenced in a Roche 454 GS FLX Titanium system (Roche, Indianapolis, IN) using the rapid library protocol according to the manufacturer's instructions. The sequencing yield was ∼300,000 reads each for the libraries derived from cortisol-induced cells and from uninduced control cells.
Remaining ribosomal and transfer RNA reads were removed from the sequence dataset by BLAST searches (12) of these reads against a database of known Clostridium rRNA and tRNA sequences, using a minimum 90% identity cutoff. Non-rRNA reads were then mapped to the C. scindens ATCC 35704 genome using an aligner based on the Burrows-Wheeler transform (13), and the resulting alignments were analyzed with Cufflinks (14) to detect differential expression.
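The identity-based read filter described above can be sketched as follows. This is an illustration only: it assumes the BLAST hits are available in the standard 12-column tabular layout (query ID in column 1, percent identity in column 3), which the text does not specify, and the read names in the example are hypothetical.

```python
def rrna_read_ids(blast_tabular_lines, min_identity=90.0):
    """Return IDs of reads with at least one hit at or above min_identity.

    Assumes tab-separated BLAST output where column 1 is the query (read) ID
    and column 3 is the percent identity of the hit.
    """
    flagged = set()
    for line in blast_tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        read_id, percent_identity = fields[0], float(fields[2])
        if percent_identity >= min_identity:
            flagged.add(read_id)
    return flagged


def filter_reads(read_ids, blast_tabular_lines, min_identity=90.0):
    """Keep only reads that were not flagged as rRNA/tRNA."""
    flagged = rrna_read_ids(blast_tabular_lines, min_identity)
    return [read_id for read_id in read_ids if read_id not in flagged]


# Hypothetical example: read_1 matches 16S rRNA at 98.5% identity,
# read_2 only has a weak 72% hit, read_3 has no hit at all.
hits = [
    "read_1\t16S_rRNA\t98.5\t400\t6\t0\t1\t400\t10\t409\t0.0\t700",
    "read_2\t23S_rRNA\t72.0\t120\t30\t2\t1\t120\t55\t174\t1e-10\t60",
]
non_rrna = filter_reads(["read_1", "read_2", "read_3"], hits)
print(non_rrna)
```

Only the reads returned by `filter_reads` would go on to genome mapping in this sketch; a real pipeline would also need to decide how to treat short or partial alignments, which the text does not address.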

Cloning and overexpression of desC gene in E. coli
Genomic DNA was isolated from C. scindens ATCC 35704 by enzymatic treatment, followed by bead beating as described previously (15). A streptavidin tag was engineered into the reverse primer, and the desC gene was PCR amplified, restriction digested, and ligated into the expression vector pSport1.

HPLC assay for steroid-17,20-desmolase activity
An HPLC assay was used to verify steroid-17,20-desmolase activity in C. scindens ATCC 35704 before proceeding with RNA isolation. This assay was also used to determine whether other fecal isolates with bile acid 7α-dehydroxylating activity possess steroid-17,20-desmolase activity. Strains exhibiting bile acid 7α-dehydroxylating activity were isolated from human volunteer stool as described previously (11). Cortisol and 11β-hydroxyandrostenedione each absorb at 240 nm owing to their 3-oxo-Δ4 structures. Once late log phase was reached, steroids were extracted from the medium with dichloromethane, dried, resuspended in methanol, and separated on an Agilent Eclipse C18 column (5 µm, 4.6 × 250 mm) integrated on an Agilent 1200 series HPLC. The column temperature was maintained at 40°C, the flow rate was 1 ml/min, and the solvent was 50% methanol in HPLC grade water (Sigma-Aldrich). Concentration was determined by comparing peak area with a standard curve of peak area versus known concentrations of cortisol and 11β-hydroxyandrostenedione.

Purification of mRNA by magnetic bead-capture hybridization
Cells were quenched with RNALater solution (Ambion) and stored at −80°C until processing. Total RNA was recovered from cells following disruption by bead beating in the presence of acid phenol. RNA was treated several times with DNase to remove contaminating genomic DNA, and was further purified using the MEGAclear Kit (Ambion). mRNA was enriched by magnetic-bead capture hybridization using custom biotinylated TEG-spaced oligonucleotides designed along the length of the 16S rRNA and 23S rRNA molecules from C. scindens ATCC 35704 (supplementary Table I).
Cells were stored at −80°C in RNAlater solution (Ambion). Cells were then collected by centrifugation, washed with lysis buffer (200 mM NaCl, 20 mM EDTA, diethylpyrocarbonate-treated water), and collected again by centrifugation. Cells were resuspended in 500 µl lysis buffer and transferred to 2 ml screw-cap bead-beating tubes (Sarstedt, Germany) to which 200 µl zirconium beads, 210 µl 20% SDS solution (Ambion), and 1 ml 5:1 acid phenol were added. Cells were then disrupted on a Mini-BeadBeater (Biospec Products, Inc., Bartlesville, OK) at maximum speed twice for 1 min, with tubes kept on ice between treatments. The aqueous and phenol phases were then separated by centrifugation, and the aqueous phase was washed once with 1 ml 5:1 acid phenol. Nucleic acids in the aqueous phase were then precipitated at −80°C for 20 min by addition of 1/10 vol 5 M ammonium acetate (Ambion), 1 µl Glycoblue (Ambion), and 1 vol ice-cold isopropanol, followed by centrifugation at 13,600 g for 20 min. RNA (>200 bp) was then purified using the MEGAclear kit (Ambion) according to the manufacturer's instructions. Contaminating genomic DNA was removed by treatment with TURBO DNase (Ambion) according to the manufacturer's instructions. MEGAclear and DNase treatment steps were repeated once. At this point, RNA purity was checked spectrophotometrically by the A260:A280 ratio, the range for pure RNA falling between 1.7 and 2.1, and integrity was checked by separating RNA on a

clostridia, we used the automated phylogenomic analysis from PATRIC, which shows C. scindens as distantly related to most other Clostridium species, being deeply embedded in the Lachnospiraceae family and with several Ruminococcus species in close phylogenetic proximity. The Clostridium genes identified as closely related to C. scindens' desC belong to species only distantly related to this cluster of C. scindens, Ruminococcus, and the Lachnospiraceae.
However, no member of the latter two taxa presents a gene closely related to C. scindens' desC. On the other hand, desC clustered with a group that, besides the clostridia (Clostridium leptum DSM 753 ZP_02081557, Clostridium ljungdahlii DSM 13528 YP_003780744 and YP_003780741, Clostridium sp. MSTE9 EJF39056, and Clostridium carboxidivorans P7 ZP_05390048), also contained a sequence from the Spirochaetes Treponema primitia ZAS-1 (ZP_09717055) and the actinobacteria Modestobacter marinus (YP_006366820); sequences in this group were very similar to each other, with BLAST E-values of around 1E-50 or better in comparison to C. scindens' desC. The sister group for this clade is composed of two Thermotoga species (Thermotoga maritima MSB8 NP_228110 and Thermotoga petrophila RKU-1 YP_001244210) and a group of Metazoa (15 insects, and one each of Cnidaria, Cephalochordata, and Tunicata); in this case, similarity to desC was much lower, with E-values between 1E-35 and 1E-21. The whole assemblage described above forms a clade with the very high bootstrap support value of 97, and is not closely related to other Firmicutes bacteria present in the tree.

Identification of the side-chain cleavage product of cortisol by C. scindens ATCC 35704
The steroid product formed by whole cells and cell extracts of C. scindens ATCC 35704 was extracted and purified by a combination of TLC and HPLC. The androgen exhibited an Rf of 0.78 on TLC plates using solvent A. The androgen had a retention time of 43.9 min on a semi-preparative C18 reversed-phase HPLC column, as compared with cortisol, which eluted at 35.2 min. Reaction product was recrystallized from ether-n-hexane to give colorless needles (mp, 197-199°C; lit. 197-199°C) (21). The structural determination was performed by a combination of techniques using HR-MS, 1H and 13C NMR, and X-ray diffraction analysis. HR-MS of the purified compound showed the deprotonated molecule [M-H]− at m/z 301.18038,

a Biorad C1000 thermocycler. PCR products and pSport1 were then cut with the appropriate restriction enzymes (New England Biolabs, Ipswich, MA) and ligated with T4 DNA ligase (New England Biolabs). Plasmids were transformed into E. coli DH5α by heat shock at 42°C, and cells were cultivated for 1 h in Super Optimal Broth medium (Invitrogen) before plating on LB ampicillin (100 µg/ml). Both strands of the insert were sequenced using T7 promoter and terminator primers. The protein was overexpressed in E. coli BL21(DE3)RIL and purified by Strep-Tactin affinity chromatography. Colonies were grown in LB ampicillin and induced for overexpression by addition of isopropyl β-D-1-thiogalactopyranoside to 1 µM final. Cultures were screened for expression by Western blot hybridization with the Strep Tag II antibody (IBA, Kansas City, MO). Cell extracts were prepared by treating cells with 5 µg/ml lysozyme in buffer A (20 mM sodium phosphate buffer, pH 7.0; 0.1 M NaCl; 20% glycerol; 10 mM 2-mercaptoethanol) on ice for 1 h followed by two passes through a French pressure cell at 1,500 psi. Cell extract was then centrifuged for 30 min at 16,000 g. The supernatant was then applied to a Strep-Tactin affinity column equilibrated with buffer A.
The column was washed with buffer A until protein was not detected in the eluent. Recombinant proteins were eluted in buffer A containing 2.5 mM desthiobiotin.

20α-HSDH enzyme assay
20α-HSDH activity was measured by continuous spectrophotometric assay at 340 nm performed in buffer A containing 50 µM cortisol (Steraloids, Inc., Newport, RI) and 150 µM NADH. A molar extinction coefficient of 6,220 M⁻¹ cm⁻¹ was used to determine product concentration. Reaction rates were measured at varying substrate concentrations below and above the respective Km values. The initial linear region of the reaction progress curves was used to determine the reaction rates. Km and Vmax values were determined from Lineweaver-Burk plots of the data.
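The two calculations in this assay, converting an absorbance change at 340 nm into an NADH concentration change and estimating Km and Vmax from a double-reciprocal plot, can be sketched as below. This is a minimal illustration rather than the analysis code used in the study: the 1 cm path length and the function names are assumptions, and the substrate series in the example is synthetic.

```python
NADH_EXTINCTION = 6220.0  # M^-1 cm^-1 at 340 nm, as stated in the text


def rate_from_absorbance(delta_a340_per_min, path_cm=1.0):
    """Convert an A340 change per minute to µM NADH per minute (Beer-Lambert).

    path_cm=1.0 is an assumed cuvette path length.
    """
    molar_per_min = abs(delta_a340_per_min) / (NADH_EXTINCTION * path_cm)
    return molar_per_min * 1e6


def lineweaver_burk(substrate_conc, rates):
    """Least-squares fit of 1/v against 1/[S]; returns (Km, Vmax).

    The fitted line is 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax.
    """
    xs = [1.0 / s for s in substrate_conc]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free Michaelis-Menten data (not measurements from the study).
true_km, true_vmax = 1.46, 30.0
substrate = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]  # µM, spanning below and above Km
velocities = [true_vmax * s / (true_km + s) for s in substrate]
km, vmax = lineweaver_burk(substrate, velocities)
print(f"Km = {km:.2f} µM, Vmax = {vmax:.1f} nmol/min/mg")
```

The double-reciprocal transform weights low-substrate points heavily, so with noisy data a direct nonlinear fit of the Michaelis-Menten equation may give different estimates.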

Phylogenetic analyses of desC
Putative orthologs of desC were identified by BLAST (9) search of the C. scindens ATCC 35704 protein sequence against the National Center for Biotechnology Information Non-redundant protein database, accepting no more than 10,000 matches and employing a maximum E-value cutoff of 1e-20. The resulting sequence set was then filtered to contain at most one strain per species and to remove duplicate sequences and those that were too long or too short, defined respectively as being longer than 1.5 times or shorter than 0.5 times the average sequence length (356 amino acids) of the original sequences. To save computational time, the resulting set of 2,045 protein sequences was submitted to a two-step phylogenetic analysis. For both steps, sequences were aligned using MUSCLE version 3.8.31 (16) and submitted to maximum likelihood analysis using RAxML version 7.2.8 (17); the substitution model employed was WAG (18) using empirical residue frequencies and the γ distribution model of rate heterogeneity (model PROTGAMMAWAGF). The first step consisted of one detailed tree search, in order to get a first overview of phylogeny and identify the region where C. scindens' desC and its nearest neighbors would be located. The 341 sequences from the resulting subtree containing C. scindens' desC and its nearest neighbors, plus a few outgroups from another smaller subtree, were then submitted to the thorough maximum likelihood tree searches. We performed bootstrap analysis with 100 pseudoreplicates, and bipartition frequencies were then drawn on the best tree out of 20 independent tree searches. Trees were drawn and formatted using TreeGraph2 (19) and Dendroscope (20), and further cosmetic adjustments were done in the Inkscape vector image editor. As a reference species tree of the

Fig. 1. HR-APCI-MS of steroid-17,20-desmolase reaction product.
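The sequence-set filtering described in the phylogenetic methods (removing duplicates and sequences longer than 1.5 times or shorter than 0.5 times the average length) can be sketched as follows. This is an illustration under stated assumptions: the mean is taken over the unfiltered input set, duplicates are defined as identical sequence strings, and the one-strain-per-species step is omitted because it needs taxonomy metadata. The function name and example sequences are hypothetical.

```python
def filter_sequences(seqs, low=0.5, high=1.5):
    """Drop duplicate sequences and length outliers.

    seqs maps a sequence name to its amino acid string. The length window is
    low * mean to high * mean, with the mean computed on the input set.
    Returns the kept sequences and the mean length used.
    """
    mean_len = sum(len(s) for s in seqs.values()) / len(seqs)
    seen = set()
    kept = {}
    for name, sequence in seqs.items():
        if sequence in seen:
            continue  # identical sequence already kept under another name
        if not (low * mean_len <= len(sequence) <= high * mean_len):
            continue  # too short or too long relative to the average
        seen.add(sequence)
        kept[name] = sequence
    return kept, mean_len


# Hypothetical toy input: mean length is 144, so the window is 72 to 216.
toy = {
    "a": "M" * 100,
    "b": "M" * 100,  # duplicate of "a"
    "c": "M" * 10,   # too short
    "d": "M" * 400,  # too long
    "e": "A" * 110,
}
kept, mean_len = filter_sequences(toy)
print(sorted(kept), mean_len)
```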
which closely matches the calculated mass of 301.18037 (Fig. 1). The molecular structure of the steroid-17,20-desmolase product as 11β-OHA was determined unambiguously by 1H and 13C NMR and X-ray diffraction analysis of crystals of the steroid product (Fig. 2; supplementary Tables II-IV). We did not detect any accumulation of 11β-hydroxytestosterone or 11β-hydroxyepitestosterone from either whole cells or cell extracts of this bacterium. In cell extracts, the reaction proceeded under strictly anaerobic conditions and was not stimulated by NAD+/NADP+, suggesting a novel steroid-17,20-desmolase/oxidase catalyzing this biotransformation. For more structural details, see supplementary text.

Table II for crystallographic details and supplementary Table IV for

Identification of cortisol-regulated genes in C. scindens ATCC 35704 using RNA-Seq technology
In previous studies we reported that steroid-17,20-desmolase activity was characteristic of C. scindens ATCC 35704, the type strain (6). However, we detected this activity in only three out of 31 strains of 7α-dehydroxylating human gut bacteria, many of which were identified as C. scindens by 16S rDNA gene sequencing (supplementary Table V). We chose C. scindens ATCC 35704 for identification of cortisol-inducible genes due to the presence of steroid-17,20-desmolase and 20α-HSDH activity and a genome sequence on which to scaffold RNA-Seq data. In order to use RNA-Seq as a method for identifying cortisol-inducible genes in C. scindens ATCC 35704, it was necessary to develop a method to enrich the mRNA, as it represents only about 1-5% of the total cellular RNA of bacteria. In this regard, we designed oligonucleotides along the length of the 16S rRNA and 23S rRNA molecules based on genomic sequences from C. scindens ATCC 35704 and C. scindens VPI

Table I) (22). Oligonucleotides were synthesized with biotinylated 3′ TEG spacers allowing for magnetic capture-hybridization of rRNA on Dynabeads linked to streptavidin (23). Before mRNA purification, we measured steroid-17,20-desmolase activity to verify cortisol induction (Fig. 3A). Quantification of the mRNA fraction

Fig. 3. Measurement of cortisol induction, mRNA purification, and effect of cortisol induction on transcriptome of C. scindens ATCC 35704. A: Prior to isolation of mRNA for RNA-Seq, whole cells were measured for steroid-17,20-desmolase activity by measurement of 11β-OHA formation (n = 3). B: One microgram of total RNA was subjected to bead-capture hybridization (see Materials and Methods for details). We observed ∼80% removal of total RNA (Elution) by streptavidin-bound magnetic beads coated with biotinylated oligonucleotides designed to bind 16S rRNA and 23S rRNA molecules. C: RNA gel demonstrating reduction in intensity of rRNA bands following bead-capture hybridization. Control (Ctrl) sample (10 µg total RNA) was subjected to bead-capture hybridization with beads lacking biotinylated capture oligos. Enrichment (Enrich) fraction containing enriched mRNA was sequenced. D: Results of RNA-Seq experiments comparing transcriptome data from control cells grown in BHI without cortisol and cortisol-induced cells. Upregulated genes are represented in blue, downregulated in red. Gene annotations from BLAST were used to group genes by general function (see supplementary Dataset I).
At least 10 genes encoding enzymes involved in the biosynthesis of thiamine pyrophosphate (TPP) were induced from 3- to 500-fold (supplementary Dataset S1). All known transketolases require TPP as a cofactor. The two-carbon side chain of cortisol and the two-carbon group removed from a ketose donor during transketolation are identical, suggesting the possibility that steroid-17,20-desmolase proceeds by a TPP-dependent mechanism (Fig. 4). Several genes encoding enzymes in the pentose-phosphate pathway were also upregulated by cortisol induction, including xylulose-5-phosphate isomerase (400-fold), ribulose-5-phosphate isomerase (276-fold), and ribose-5-phosphate isomerase (10-fold) (supplementary Dataset S1). This is expected if the two-carbon side chain of cortisol is feeding into the pentose-phosphate pathway. Fifteen genes for vitamin B12 biosynthesis were upregulated by cortisol (supplementary Dataset S1). The physiological function of B12 in C. scindens cortisol metabolism is currently unknown.

The desC gene encodes an adrenocorticosteroid 20α-HSDH
It has been previously reported that C. scindens ATCC 35704 encodes a steroid-inducible 40 kDa 20α-HSDH (7). Indeed, this enzyme has been purified and characterized, though the gene encoding this enzyme has yet to be identified (7,8). An N-terminal sequence was previously reported for the 20α-HSDH; however, characterization of this and a second gene with matching N-terminal sequences showed that these genes encode enzymes with GAPDH activity (supplementary text). Bioinformatic analysis of the desC (EDS07887.1) gene indicated that it encodes an NAD(P)-dependent oxidoreductase, Mr 38.5 kDa, induced 1,000-fold by cortisol, making this gene product a good candidate for the 20α-HSDH. The desC gene was cloned into pSport1 as an N-terminal streptavidin-tagged (ST) recombinant protein and purified by Strep-Tactin affinity chromatography to apparent electrophoretic homogeneity with an Mr of 40 kDa (Fig. 5).
We used several different glucocorticoid substrates to determine the specificity of this enzyme. We observed NADH-dependent reductive activity with cortisol (11β,17α,21-trihydroxypregn-4-ene-3,20-dione), but not with 20α-cortisol (11β,17α,20α,21-tetrahydroxypregn-4-en-3-one), and 20β-cortisol (11β,17α,20β,21-tetrahydroxypregn-4-en-3-one) was not a substrate for this enzyme. When we tested each of these steroid substrates in the oxidative direction (NAD+), only 20α-cortisol was a substrate. These data suggest that the desC gene encodes an NAD+-dependent 20α-HSDH. The recombinant desC-ST showed an apparent Km of 5.35 µM and Vmax of 126 nmol/min/mg protein for cortisone,

(enrichment fraction) as well as the RNA eluted from magnetic beads coupled to our custom biotinylated rRNA capture oligos following capture (elution fraction) showed that our method removed ∼80% of the total RNA. Diminution of rRNA band intensity was observed relative to control total RNA (Ctrl), which was processed by the same procedure as the mRNA enrichment sample (Enrich) except that capture oligos were omitted (Fig. 3B, C). Sequencing of the cDNA libraries resulted in around 300,000 reads for each library (cortisol-induced and uninduced control). Filtering of the ribosomal and transfer RNA sequences from these data showed that our rRNA depletion method removed about 90% of the rRNA from the libraries. However, considering that such sequences usually comprise ∼95-99% of the total extracted RNA, sequenced libraries still ended up with a 9:1 ratio of rRNA to non-rRNA sequence reads, with a final number of non-rRNA reads of 28,858 for the uninduced library and 27,701 for the cortisol-induced library. A combination of MEGAClear spin column and a pyrosequencing process allowed the removal of small RNAs (<100 bp). Only about 300 reads matched known tRNA sequences.
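The depletion arithmetic above can be checked with a short calculation. The sketch below is illustrative only: it treats "removed about 90% of the rRNA" as a fractional removal applied to a starting rRNA share of 95% or 99%, and it uses 300,000 as the approximate library size; the function name is hypothetical.

```python
def rrna_share_after_depletion(initial_rrna_share, removal_efficiency):
    """Fraction of the remaining RNA that is still rRNA after depletion.

    initial_rrna_share: rRNA as a fraction of total RNA before depletion.
    removal_efficiency: fraction of the rRNA that the method removes.
    """
    rrna_left = initial_rrna_share * (1.0 - removal_efficiency)
    non_rrna = 1.0 - initial_rrna_share  # assumed to pass through unchanged
    return rrna_left / (rrna_left + non_rrna)


for share in (0.95, 0.99):
    remaining = rrna_share_after_depletion(share, 0.90)
    print(f"start {share:.0%} rRNA -> {remaining:.1%} rRNA after 90% removal")

# Reported non-rRNA read counts against an approximate 300,000 reads per library.
approx_reads = 300_000
for library, non_rrna_reads in (("uninduced", 28_858), ("cortisol-induced", 27_701)):
    print(f"{library}: {non_rrna_reads / approx_reads:.1%} non-rRNA reads")
```

With a 99% starting share, 90% removal leaves roughly nine parts rRNA to one part non-rRNA, which is consistent with the reported 9:1 ratio and with non-rRNA reads making up a little under 10% of each library.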
We have previously reported the characterization of the baiBCDAFGHI operon in C. scindens, which encodes genes involved in bile acid-inducible 7α/β-dehydroxylation (1). If our method for isolating mRNA were valid, we would expect to see these genes significantly upregulated in cholic acid-treated cells. Experiments with 50 µM cholic acid induction were performed as described for cortisol (see Materials and Methods). Indeed, the bai operon was upregulated 452-fold as compared with control cells (supplementary Fig. I).

Identification of a cortisol-inducible operon in C. scindens ATCC 35704
The genome (4.336 kbp) of C. scindens is estimated to encode 3,995 genes, of which a total of 186 genes were upregulated (2- to 1,000-fold) while 97 genes were downregulated by the addition of cortisol to growing cultures of C. scindens ATCC 35704. We grouped the data according to number of genes by function (Fig. 3D; supplementary Dataset S1). The most interesting genomic region upregulated in response to cortisol is an operon consisting of four open reading frames (upregulated ∼1,000-fold), two genes of which were annotated as the N-terminal subunit (desA) (EDS07885.1) and C-terminal subunit (desB) (EDS07886) of transketolase, the third gene encoding a putative zinc-dependent dehydrogenase (desC) (EDS07887), and the fourth a hypothetical sodium-dependent symporter (desD) (EDS07888) (Fig. 4). We also identified a gene apart from this operon encoding a putative ABC-type multidrug transport protein (EDS07925.1) that was upregulated 500-fold by cortisol and that may represent an androgen exporter (supplementary Dataset S1). BLAST searches of desAB showed only 34% amino acid identity with other transketolases in the database. In contrast, three other genes encoding transketolase in the genome shared 85-97% amino acid identity with other annotated transketolases. Importantly, a BLAST search of the genome of C. scindens

better understand the distribution of this gene in nature, and in particular the gut. Analysis of strains of C. scindens and other species of bile acid 7α-dehydroxylating bacteria suggests the desC gene is rare even in strains of C. scindens (supplementary Table V). The first step in the phylogenetic analysis of DesC, involving 2,045 sequences, identified a subtree of 341 sequences that contained C. scindens' DesC and its nearest neighbors (Fig. 6A).
Thorough analysis of these 341 sequences revealed the phylogenetic placement of DesC among sequences from a few other clostridia, one Actinobacteria (Modestobacter), one Spirochaetes (Treponema), a group of Metazoa (mostly insects), and two Thermotoga species, with the high bootstrap support value of 97 (Fig. 6B). All other Firmicutes sequences present in this subtree are distant from C. scindens' DesC, including another C. scindens sequence (ZP_02432273, also from strain ATCC 35704). This sequence presents a much stronger match (E-value of 0) to the threonine dehydrogenase-like domain cd08234 than DesC (E-value 7e-63), and it is also placed in a group containing many other Firmicutes bacteria. Finally, we examined genes surrounding desC-like genes in the cluster of microbial genes close to C. scindens' desC and found a high degree of conservation of the transketolase subunits (corresponding to desAB in C. scindens) and the Na+/melibiose transporter (desD), hypothesized to encode a cortisol transport protein (Fig. 6C). Each of the genes in this cluster is associated with organisms that live in anaerobic environments, including the gut of animal species (24)(25)(26)(27).

DISCUSSION
C. scindens and a small number of species belonging to the genus Clostridium are responsible for significant alterations in the human bile acid pool composition through
and an apparent Km of 1.46 µM and Vmax of 30 nmol/min/mg for cortisol in the presence of NADH. The apparent Km and Vmax for 20α-hydroxycortisone were determined to be 4.54 µM and 24 nmol/min/mg protein, respectively. Steroids lacking a 17α-hydroxy group, such as 11β,21-dihydroxypregn-4-ene-3,20-dione, or those lacking a 21-hydroxy group, such as 11β,17α-dihydroxypregn-4-ene-3,20-dione, were not substrates. Finally, we did not detect 17α-HSDH activity against the end product of the steroid-17,20-desmolase reaction, 11β-OHA. These data are consistent with characteristics determined for the purified native 20α-HSDH isolated from this bacterium, which recognizes adrenocorticosteroids with 17α,21-dihydroxy groups (7, 8).
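
As a rough illustration of what these apparent constants imply, the sketch below assumes simple Michaelis-Menten behavior (an assumption; the text reports only apparent Km and Vmax values) and compares the two substrates. The function and variable names are illustrative, and substrate concentrations are taken to be in the same units as Km.

```python
def michaelis_menten_rate(substrate_conc, km, vmax):
    """Initial rate v = Vmax * [S] / (Km + [S]); [S] and Km share units."""
    if substrate_conc < 0 or km <= 0:
        raise ValueError("substrate_conc must be >= 0 and km must be > 0")
    return vmax * substrate_conc / (km + substrate_conc)


def specificity_ratio(km, vmax):
    """Vmax/Km, a crude measure of efficiency at low substrate levels."""
    return vmax / km


# Apparent constants quoted in the text (Vmax in nmol/min/mg protein).
# Km values are entered as printed; only their ratio matters here.
APPARENT_CONSTANTS = {
    "cortisol": {"km": 1.46, "vmax": 30.0},
    "20alpha-hydroxycortisone": {"km": 4.54, "vmax": 24.0},
}

if __name__ == "__main__":
    for name, c in APPARENT_CONSTANTS.items():
        # By definition, the rate at [S] = Km is half of Vmax.
        half_rate = michaelis_menten_rate(c["km"], c["km"], c["vmax"])
        ratio = specificity_ratio(c["km"], c["vmax"])
        print(f"{name}: rate at [S] = Km is {half_rate:.1f}, Vmax/Km = {ratio:.2f}")
```

Under this assumption, the Vmax/Km ratio is roughly fourfold higher for cortisol than for 20α-hydroxycortisone, which is one way to summarize the substrate preference described above.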

Phylogenetic analysis of desC
Annotation of the desC gene product suggests this gene evolved from threonine dehydrogenase (EC 1.1.1.103). We performed an extensive phylogenetic analysis to try to
Fig. 4. RNA-Seq data suggests a hypothesis for the mechanism of steroid-17,20-desmolase. We detected a cluster of four genes upregulated 1,000-fold by cortisol. This operon contains two genes encoding the N- and C-terminal subunits of a putative transketolase. The C. scindens ATCC 35704 genome contains additional transketolase genes that are not upregulated by cortisol and that display higher sequence identity with other transketolases found in the nr BLASTP database (99-91% vs. 45-41% in first five hits; E-values 0.0 vs. 7e-71 to 4e-60). A comparison of the structure of the side chain removed by steroid-17,20-desmolase with transketolation reactions in the pentose-phosphate pathway suggests that steroid-17,20-desmolase may proceed by TPP-dependent transketolation. HD, haloacid dehalogenase.
bile acid 7α/β dehydroxylation (1). Bile acids serve as natural ligands, with varying affinities, for nuclear receptors (farnesoid X receptor, pregnane X receptor, vitamin D receptor) and the G protein-coupled receptors TGR5 and S1P2 (28-32). Secondary bile acids, products of gut microbial metabolism, are the most potent activators of TGR5 (33, 34). Through activation of these receptors, bile acids regulate their own synthesis, conjugation, transport, and detoxification (reviewed by 30, 31), and are important regulators of lipid, glucose, and energy homeostasis (35, 36). Furthermore, bile acids play an important role in maintaining intestinal barrier function as antimicrobial agents in the small bowel (37, 38) and inducers of antimicrobial peptides (39). Perturbations in the biliary bile acid pool composition can be indicative of hepatogastrointestinal diseases such as fat malabsorption (40), gallstones (3), gastrointestinal cancers (41), and possibly type II diabetes (42). Recent studies also illustrate the important role gut microbes play in the levels and profiles of bile acids in various tissues of the body, including liver, kidney, plasma, serum, and heart (43). The physiological significance of these alterations in bile acid profiles in host organs such as heart and kidney is unknown. However, decades of research strongly suggest that secondary bile acids are
microbiota form multiple metabolites from 11β-OHAD produced by C. scindens. The gut microbiota are capable of reducing the 17-oxo group of 11β-OHA to either the α- or β-configuration, generating 11β-hydroxy-epitestosterone or 11β-hydroxy-testosterone, respectively (53, 54).
In addition, some members of the gut microbiota can reduce the Δ4 bond in ring A to either the 5α- or 5β-configuration, resulting in trans or cis A/B rings, and can reduce the 3-oxo group to either 3α- or 3β-hydroxy groups (Fig. 7) (2). Thus steroid biotransformation products from one microbe yield further metabolites through the metabolic activity of other members of the human microbiome (Fig. 7). Microbial biotransformation products of host bile acids and steroids may play physiological roles in the host-microbiome axis.
In mammalian tissues, side-chain cleavage of steroids requires molecular oxygen-dependent monooxygenases (9). Side-chain cleavage of cortisol under anaerobic conditions by the gut microbiome has been known since 1959 (48, 55, 56); however, a mechanism has not been suggested until now. We hypothesize that the desAB genes encode the steroid-17,20-desmolase/oxidase and that this reaction proceeds by transketolation (Fig. 4 and Fig. 8). Previous work reported that TPP addition stimulates side-chain cleavage in cell extracts of C. scindens ATCC 35704 (7). Indeed, we observed a marked induction by cortisol of a number of genes involved in the biosynthesis of TPP (Fig. 8).
We have identified and characterized a cluster of coexpressed genes induced by cortisol, one of which encodes a 20α-HSDH. Reduction of the 20-oxo group to either the α- or β-configuration blocks side-chain cleavage of C21 steroids by C. scindens (23). If the two-carbon fragment of
associated with several diseases of the gastrointestinal system, including cancers of the colon, esophagus, and biliary duct (reviewed by 41, 44), as well as cholesterol gallstone disease in select patients (4).
In the current study, we have unambiguously demonstrated the presence of a 17-oxo group, identifying the end product of steroid-17,20-desmolase as 11β-OHA (Fig. 2). 11β-OHA is also known to be a primary adrenal steroid in the host, produced at a rate of 1.5 mg/day, slightly less than the rate of androstenedione formation (2.3-3.3 mg/day) (45). The function(s) of 11β-OHA in the human host is largely unknown (46, 47). However, it is known that 11β-OHA derived from metabolism of cortisol by gut microbes is reabsorbed into the bloodstream and excreted in urine (48).
There have been very few studies that have quantified glucocorticoids and their metabolites in human stool. Tracer studies of [14C]cortisol in humans suggest that only small quantities enter human bile, and that 90% of the radioactivity was found in urine and feces by 72 h (49, 50). Animals vary in the amounts of glucocorticoids secreted into the gut, and levels in stool are a marker for stress in animals under captivity (51). It is not presently known what the effect of 11β-OHAD is on the host, or for that matter other microbes. It has been reported that some intestinal microbes possess signal-transduction systems which respond to host hormones; a kind of inter-kingdom signaling (52). To date, these observations have been made in well-studied human pathogens. However, whether the biotransformation products of host steroids activate signaling pathways in members of the normal gut microbiota is unknown. We do know that other members of the resident
cortisol is shuttled into the pentose-phosphate pathway, as our data suggests, it is possible that 20α-HSDH acts as a metabolic rheostat, regulating carbon flow (Fig. 8). A similar mechanism may be involved in the bile acid 7α-dehydroxylation pathway found in some intestinal clostridia, including C. scindens (1). Oxidation of the 7α-hydroxy group of cholic and chenodeoxycholic acid by 7α-HSDH inhibits bile acid 7α-dehydroxylation. These steroid HSDHs may act as "switches", regulating entrance into steroid biotransformation pathways based on the NAD+/NADH ratio in the cell. The same phenomenon occurs in mammalian tissues expressing, for instance, AKR1C1, a human 20α-HSDH, whose function is to regulate the levels of progesterone over short time scales (10).
gene (yellow) product supports our hypothesis that the transketolase encoded by the desAB genes (red, orange) encode steroid-17,20-desmolase, which proceeds by transketolation. Transketolases are TPP-dependent, and we observe an upregulation of genes involved in TPP synthesis. The desD gene (green) is predicted to encode a cortisol transport protein, and an ABC-type transporter was identified (500-fold induction; blue) which could serve to pump 11β-hydroxyandrostenedione out of the cell. We also observed several pentose phosphate pathway genes upregulated, suggesting that the two-carbon side chain might enter the pentose phosphate pathway. We hypothesize that the 20α-HSDH encoded by the desC gene regulates flux of the two-carbon fragment into the pentose phosphate pathway, as 20α-hydroxycortisol is not a substrate for steroid-17,20-desmolase.
The human superorganism contains 20α-HSDH genes of both mammalian and microbial origin, which do not share common ancestry. Our phylogenetic analyses of desC show this gene as a very rare occurrence in clostridia, present only in five Clostridium species other than C. scindens ATCC 35704 (all of them distantly related to C. scindens). This suggests that desC is a novel ortholog derived from duplication (of an ancestral gene encoding a threonine dehydrogenase-like protein) followed by functional divergence and either gene loss in most clostridia or transfer from one species to a few select other species. There is also conservation of genes in the desABCD operon from C. scindens ATCC 35704 in organisms whose "desC" genes are most closely related, suggesting this operon may have passed in toto. Alternatively, this gene, and operon, could have been transferred to these organisms from a currently unknown source. Interestingly, multiple strains, which by 16S rDNA gene analysis were identified as C. scindens, were found to lack steroid-17,20-desmolase activity. This includes C. scindens VPI 12708 (formerly Eubacterium sp. strain VPI 12708), from which most of our knowledge of the bile acid 7α-dehydroxylation pathway has been elucidated (1). Interestingly, C. scindens VPI 12708, but not C. scindens ATCC 35704, expresses a 17α-HSDH, suggesting some niche partitioning of host glucocorticoid metabolism within a species of bile acid 7α-dehydroxylating bacteria (54). Reduction of oxo-groups by bacteria may in part serve to regenerate NAD+ necessary for fermentation. Host steroids and bile acids can therefore serve as electron sinks.
We now possess tools in the scientific community to uncover the relationship between our resident microbiota and the host. The "omics" revolution allows us to identify and quantify thousands of metabolites (metabolomics), and the genes encoding enzymes (genomics), the level at which genes are transcribed (transcriptomics), or the enzymes themselves (proteomics). However, this data is only as good as our ability to assign function or potential function to the genes. Steroids are important host signaling molecules. It has become clear that microbial metabolites of host steroids, such as deoxycholic acid, can have significant physiological and pathophysiological effects. Therefore, identification of the genes encoding enzymes responsible for gut microbial biotransformation of host steroids is important for future omics studies of host-gut microbe interactions in health and disease. The present data suggests that RNA-Seq analysis of inducible genes can aid in the discovery of genes in cultivated organisms known to carry out steroid-biotransforming reactions. It may also be possible to use this approach for gene discovery in mixed populations of microbes by comparing transcriptomes in the presence and absence of a particular gene inducer. Genes of interest can be cloned and characterized even from "unculturables" through a combination of transcriptomics and metagenomics.
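
The comparison of transcriptomes in the presence and absence of an inducer can be sketched as a simple fold-change screen. The example below is a minimal illustration, not the pipeline used in this study: the gene names and expression values are invented, and the pseudocount and 2-fold cutoff are assumptions (the cutoff echoes the 2- to 1,000-fold range reported in the Results). A real analysis would also need replicates and a statistical test.

```python
import math


def log2_fold_change(induced, uninduced, pseudocount=1.0):
    """log2 ratio of induced to uninduced expression.

    The pseudocount (an assumed value) avoids division by zero for
    genes with no reads in one condition.
    """
    return math.log2((induced + pseudocount) / (uninduced + pseudocount))


def induced_genes(expression, min_fold=2.0, pseudocount=1.0):
    """Return genes at or above min_fold induction, strongest first.

    expression maps gene name -> (uninduced, induced) normalized values.
    """
    threshold = math.log2(min_fold)
    hits = {}
    for gene, (uninduced, induced) in expression.items():
        lfc = log2_fold_change(induced, uninduced, pseudocount)
        if lfc >= threshold:
            hits[gene] = lfc
    return dict(sorted(hits.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    # Hypothetical normalized expression values: (without, with inducer).
    example = {
        "gene_a": (1.0, 2047.0),
        "gene_b": (50.0, 55.0),
        "gene_c": (10.0, 120.0),
    }
    for gene, lfc in induced_genes(example).items():
        print(f"{gene}: log2 fold change = {lfc:.2f}")
```

Ranking by log2 fold change keeps strongly induced candidates, such as an operon induced on the order of 1,000-fold, at the top of the list for follow-up cloning and characterization.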