Lipophorin receptor of Bombyx mori: cDNA cloning, genomic structure, alternative splicing, and isolation of a new isoform.

The cDNA and genomic structure of a putative lipophorin receptor from the silkworm, Bombyx mori (BmLpR), indicated the presence of four isoforms, designated LpR1, LpR2, LpR3, and LpR4. The deduced amino acid sequence of each isoform showed five functional domains that are homologous to vertebrate very low density lipoprotein receptor (VLDLR). All four isoforms seem to have originated from a single gene by alternative splicing and were differentially expressed in a tissue- and stage-specific manner. BmLpR1 harbored an additional 27 amino acids in the O-linked sugar domain, resulting in an extra exon. The silkworm BmLpR gene consisted of 16 exons separated by 15 introns spanning >122 kb and was at least three times larger than the human VLDLR gene. Surprisingly, one of the isoforms, LpR4, was expressed specifically in the brain and central nervous system. Additionally, it had a unique cytoplasmic tail, leading to the proposition that it represents a new candidate LpR for possible brain-related function(s). This is the first report on the genomic characterization of an arthropod lipoprotein receptor gene and the identification of a brain-specific receptor variant from a core member of the low density lipoprotein receptor family in invertebrates.

Supplementary key words: silkworm, lipoprotein receptor, isoforms
Cell surface receptors belonging to the low density lipoprotein receptor (LDLR) family regulate diverse biological functions, including the binding and uptake of plasma lipoproteins (1). In mammals, the LDLR family comprises seven core members and several distantly related genes (2). The structural characteristics that distinguish the core members from others are well described (2). In insects, lipophorin (Lp) is the major hemolymph lipoprotein delivering lipids between tissues; it also serves as an important yolk protein precursor in many insects (3). Lp functions as a reusable lipid shuttle that operates without internalization of the protein portion of the Lp. The transfer of lipids is facilitated by a lipid transfer particle (4)(5)(6). However, it has been shown that high density lipophorin (HDLp) binds to lipophorin receptor (LpR) on the cell surface and is sequestered via receptor-mediated endocytosis (7). Transfection experiments showed that HDLp is recycled upon LpR-mediated endocytosis in mammalian (CHO) cells (8) but not in insect (S2) cells. This indicates that the recycling mechanism is cell type-specific (9), adding complexity to the receptor-mediated endocytosis of Lp. Support for insect LpR comes from only a few species: biochemical characterization in Manduca sexta (10) and molecular characterization of LpRs from Locusta migratoria (7), Aedes aegypti (11,12), and Galleria mellonella (13). Most of these insect LpRs are highly homologous to very low density lipoprotein receptor (VLDLR), belonging to the LDLR family. The common structural elements of the family include a ligand binding domain (LBD), an epidermal growth factor (EGF) precursor domain, an O-linked sugar domain, a transmembrane domain, and a cytoplasmic domain. Drosophila melanogaster yolk protein receptor (14) and A. aegypti (15) and Solenopsis invicta (16) vitellogenin receptors are other members of the family that are characterized at the cDNA level. However, these proteins are significantly larger than VLDLR and harbor additional repeats in the ligand binding and EGF domains (1).
In contrast to the situation for vertebrate receptors, no published report is available on the genomic structure of invertebrate lipoprotein receptors except for a low density lipoprotein receptor-related protein (LRP) from Caenorhabditis elegans (17). Here, we present the cDNA cloning and expression of four putative Bombyx mori LpR isoforms (LpR1, LpR2, LpR3, and LpR4) as well as their genomic data. Our findings include a novel member of the LpR family, LpR4, whose expression was restricted to silkworm brain and the central nervous system. This isoform was found to possess a unique cytoplasmic tail that is possibly involved in functions other than the receptor-mediated endocytosis of Lp.

MATERIALS AND METHODS
B. mori strain p50T was grown on an artificial diet under standard conditions.

Cloning of silkworm LpR cDNA
Total RNA was extracted from the vitellogenic ovaries using Trizol (Invitrogen) and a standard protocol. First-strand cDNA was synthesized according to the SMART rapid amplification of cDNA ends (RACE) cDNA amplification kit (Clontech) protocol and was used as a template in touchdown PCR amplifications using two degenerate primers based on the conserved regions of vertebrate VLDL receptors. The sense primer was from a conserved region of an EGF homology domain (MYWTDW: ATGTA(C/T)TGGAC(I)GA(C/T)TGG); the antisense primer was from the conserved region containing the internalization signal (MNFDNPVY: TA(I)AC(I)GG(A/G)TT(A/G)TC(A/G)AA(A/)TTCAT) (I = deoxyinosine). Amplification conditions were 94°C for 1 min, followed by 30 cycles of 94°C for 45 s, 55°C for 30 s, which was decreased by 0.5°C per cycle, and 72°C for 1 min, and finally 72°C for 5 min. The amplified PCR product of ~550 bp was cloned into pGEM-T vector (Promega) and sequenced using an ABI 3700 DNA Analyzer (Applied Biosystems). The full-length cDNA was then amplified by the Clontech RACE protocol with the following gene-specific primers: 5' RACE, AGAGCAGTGACCGTTGACAGCAGCGCAGTGATTCACACC; 5' RACE Nested, GAGCGTGTGTGTAGATGTGATTGCTTCCACGTCTTTACCG; 3' RACE, AGCGGTCGCGATGCCGGAGTGGTCGCCGGTATCGT; 3' RACE Nested, TGGCTGCAGTGATCGCAGGTGTTATGTACAGACACTACG. The amplified products were cloned and sequenced as described above. Direct sequencing was also done for confirmation. Several independent clones were used to eliminate possible PCR errors.
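The relationship between the two peptide motifs (MYWTDW and MNFDNPVY) and their degenerate primers can be illustrated by back-translation. The following is a minimal sketch, not the procedure used in this study: the function names are hypothetical, and it writes degenerate positions with IUPAC ambiguity codes (Y = C/T, R = A/G, N = any base), whereas the primers described above carry deoxyinosine at the fourfold-degenerate positions.

```python
# Degenerate (IUPAC) codons for the amino acids that occur in the two motifs.
# Only these residues are covered; this is an illustrative table, not a full one.
DEGENERATE_CODON = {
    "M": "ATG", "Y": "TAY", "W": "TGG", "T": "ACN", "D": "GAY",
    "N": "AAY", "F": "TTY", "P": "CCN", "V": "GTN",
}

# Complements for the bases and ambiguity codes used above.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "Y": "R", "R": "Y", "N": "N"}


def back_translate(peptide):
    """Return the degenerate coding sequence (5'->3') for a peptide."""
    return "".join(DEGENERATE_CODON[residue] for residue in peptide)


def reverse_complement(sequence):
    """Return the reverse complement of a degenerate DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


# Sense primer: read directly from the coding strand.
sense_primer = back_translate("MYWTDW")
# Antisense primer: reverse complement of the coding strand.
antisense_primer = reverse_complement(back_translate("MNFDNPVY"))

print(sense_primer)
print(antisense_primer)
```

With N read as inosine and Y as (C/T), the sense sequence produced for MYWTDW corresponds position by position to the sense primer given above.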
Real-time PCR for LpR4 was carried out using GCTCTTACGCTATGGAGTGCACG and TGGATCGCTGAGATGTTCGCG as sense and antisense primers, respectively. For control experiments, the following 18S RNA primers were used in all PCR cycles: sense, TTGACGGAAGGGCACCACCAG; antisense, GCACCACCACCCACGGAATCG.
Real-time PCR was performed as reported (18) with an Applied Biosystems 7700 Sequence Detection System using the relative standard curve method with SYBR Green (Applied Biosystems) for detection. cDNA samples from brain and central nervous system tissues were amplified with LpR4 and 18S primers, and their relative mRNA transcript levels were assessed using standard curves. Finally, the amount of transcript was normalized to 18S RNA and compared among samples.
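The arithmetic behind the relative standard curve method can be sketched as follows. This is an illustrative outline under stated assumptions, not the analysis pipeline used in this study: the function names are hypothetical, and every Ct value and dilution below is invented for demonstration. A line Ct = slope × log10(quantity) + intercept is fitted to a dilution series for each primer pair, sample quantities are interpolated from their Ct values, and the target quantity is divided by the 18S quantity.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate a relative quantity from a Ct value on a fitted curve."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, reference_ct, target_curve, reference_curve):
    """Target quantity divided by reference (e.g. 18S) quantity."""
    target_quantity = quantity_from_ct(target_ct, *target_curve)
    reference_quantity = quantity_from_ct(reference_ct, *reference_curve)
    return target_quantity / reference_quantity


# Hypothetical dilution series (log10 relative quantity) and Ct readings.
dilutions = [0.0, 1.0, 2.0, 3.0]
lpr4_curve = fit_standard_curve(dilutions, [30.0, 26.7, 23.4, 20.1])
rrna_curve = fit_standard_curve(dilutions, [20.0, 16.6, 13.2, 9.8])

# Hypothetical sample: LpR4 Ct of 25.0 and 18S Ct of 14.0.
print(normalized_expression(25.0, 14.0, lpr4_curve, rrna_curve))
```

Comparing the normalized values among samples then gives the relative transcript levels, independent of differences in the amount of input cDNA.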

Protein expression
The cDNAs of LpR1 and LpR4 without their signal sequences were cloned into pIVEX2.3d and pIVEX1.3d WG vectors (Roche) between NcoI and SacI sites and in-frame with the C-terminal His6 tag. After sequence verification, the constructed plasmids were used for in vitro cell-free protein expression with the RTS 100 Escherichia coli HY Kit or the RTS 100 Wheat Germ CEFC Kit (Roche). Expression was verified by Western blotting using anti-His6 antibody (Roche) and subsequent detection using alkaline phosphatase-conjugated secondary antibody and chemiluminescence (Western Breeze; Invitrogen). Upon confirmation of protein expression, 25 µl of each RTS reaction product was diluted and ultrafiltered (Microcon YM50; Millipore) to remove low molecular weight proteins. Protein refolding was carried out as reported (19).

Lp collection and ligand binding
Hemolymph from mid fifth instar female larvae (nonvitellogenic) was collected, hemocytes were removed, and Lp was collected by KBr density gradient ultracentrifugation (20). HDLp (d = 1.0635 g/ml), which formed a clear yellow band, was collected, desalted, and used immediately for the binding assay.
After refolding, expressed proteins were separated on 7.5% SDS-PAGE gels under nonreducing conditions and then electrophoretically transferred to a polyvinylidene difluoride membrane (Immun-blot; Bio-Rad). After blocking, the membrane was incubated with 20 µg/ml HDLp in binding buffer (20 mM HEPES, 150 mM NaCl, and 2 mM CaCl2, pH 7.5) containing 0.5% BSA and then incubated with silkworm anti-HDLp antibody. Bound antibodies were detected with alkaline phosphatase-labeled anti-rabbit IgG and chemiluminescence as described above (Western Breeze; Invitrogen).

Isolation of genomic clones and sequencing
A silkworm bacterial artificial chromosome (BAC) library (21) was initially screened using an ~1 kb probe extending from nucleotides 670 to 1,645 of the LpR1 full-length cDNA, resulting in two overlapping BAC clones, 3C2 and 13C23, which lacked exons 1 and 2. Further screening with an ~0.45 kb probe located at the 5' region (nucleotides 40-480) of the LpR1 cDNA yielded a positive BAC clone, 25A14, which contained the missing two exons and overlapped with 3C2. These three overlapping BAC clones were used to characterize the silkworm LpR gene structure. The protocols for BAC hybridization and processing of BAC clones for subcloning, sequencing, and contig generation were as described previously (21). Gaps were estimated by PCR and also by analyzing the silkworm whole-genome shotgun sequence (accession number BAAB00000000; http://sgp.dna.affrc.go.jp/).

Southern blot
Genomic DNA was extracted from the posterior silk glands of fifth instar larvae (day 4) according to a standard protocol. Five micrograms of genomic DNA was digested with EcoRI, KpnI, and SacI and electrophoresed on a 0.8% agarose gel. After depurination and denaturation, the DNA was transferred onto a nylon membrane (Roche) and then exposed to ultraviolet light for 3 min. Labeling of the probe, hybridization, and detection were performed using the ECL system according to the manufacturer's instructions (Amersham). A DNA probe common to all LpR isoforms was amplified by PCR with the following sense and antisense primer set: AGACTAGATCAATTCCAATGC and CGGTCGAATGGAGAATGAGC. Isoform-specific probes were also used for Southern hybridization.

General methods
Unless indicated otherwise, all molecular biology techniques were performed essentially as described (22). Analytical grade chemicals were used for all experiments. Protein estimation was performed with the BCA Protein Assay Kit (Pierce) using BSA as the standard. Phylogenetic analysis was done using ClustalX (23) and PAUP (4.0 b10 Win; Sinauer Associates). Genetyx (Software Development Co.) was used for nucleotide and amino acid analyses.

RESULTS
cDNA structure
Using degenerate primers based on the conserved regions of vertebrate LDLRs, we cloned four isoforms of silkworm LpR by PCR using mRNA isolated from vitellogenic ovaries. The inferred amino acid sequence of each isoform showed a predicted signal peptide of 37 amino acids followed by five functional domains (Figs. 1, 2B): 1) an N-terminal LBD consisting of eight cysteine-rich repeats; 2) an EGF precursor homology domain including EGF A, B, and C cysteine-rich repeats and five copies of the characteristic (Y/F)WXD tetrapeptide; 3) a putative O-linked sugar domain; 4) a hydrophobic transmembrane domain; and 5) a cytoplasmic domain with a well-conserved internalization signal, FDNPVY, responsible for the endocytosis of the receptor via coated pits. Comparison of the O-linked sugar domain among isoforms revealed LpR1 to have an additional 27 amino acids in this region, whereas comparison of the cytoplasmic domain showed LpR4 to differ in the last 18 amino acids, with 5 of them protruding as a 3' tail. The BmLpR cDNA is characterized by short 5' and long 3' untranslated regions (UTRs). All four variants have the same start codon, and LpR1, LpR2, and LpR3 share the same stop codon, whereas LpR4 has a different stop codon. Additionally, LpR4 bears a longer 3' UTR as a result of the presence of a 533 bp insertion. No canonical poly(A) signal was seen in the silkworm LpR cDNA. The predicted molecular masses of LpR1, LpR2, LpR3, and LpR4 are 100.97, 98.16, 98.47, and 98.71 kDa, respectively, including signal peptides. We first isolated LpR1, and the alternative isoforms were identified in the course of analyzing expression levels of LpR1 mRNA by RT-PCR in various tissues. Subsequent cDNA sequence verification and further genomic data analysis confirmed the presence of each isoform.

Tissue-specific expression
To elucidate the expression pattern of LpR variants (Fig. 3A), primers specific to each variant were used for RT-PCR. Each RT-PCR product yielded a discrete single band of the expected length, the sequence of which was verified. LpR1 expression was detected in most tissues at various developmental stages, with highest transcript levels seen in the pupal and adult ovary; a weak band was also observed in the nonvitellogenic ovary. Notably, LpR1 expression was also prominent in the brain and central nervous system. LpR2 transcripts were expressed at high levels in pupal fat body, pupal and adult ovary, and pupal Malpighian tubule and at moderate levels in larval tissues. Tissues of brain and the central nervous system were excluded from LpR2 expression assays because of the similarity of its sequence with LpR4 and other isoforms. By contrast, LpR3 transcripts showed comparatively low abundance in most tissues except in fat body and ovary, with barely detectable amounts in silk gland, brain, and the central nervous system. Interestingly, LpR4 transcripts were detected exclusively in the brain and central nervous system, with highest levels in the pupal tissues. This was confirmed by the real-time PCR assay, which showed a relative increase in LpR4 transcripts in pupal brain on days 4 and 6 (Fig. 3B). Unlike the other isoforms, LpR3 could not be isolated or amplified as a full-length cDNA from any tissue, whereas LpR1 was frequently isolated from vitellogenic ovary and brain, LpR2 from pupal fat body, and LpR4 from the brain and central nervous system, showing their tissue-specific abundance.

Genomic structure
To obtain the complete structure of the LpR gene, we screened a silkworm genomic BAC library using ~1 and ~0.45 kb cDNA fragments, which yielded three positive overlapping clones, 25A14, 3C2, and 13C23 (Fig. 2A). Table 1 and Fig. 2 show the exon/intron structure. Sequence analysis revealed that the BmLpR gene is composed of 16 exons interrupted by 15 introns and spanning >122 kb (Fig. 2A). The second intron was the largest, spanning >65 kb. Although the first two clones overlapped, several small contigs representing different orientations of 3C2 at the overlapping positions made it difficult to estimate the size of the second intron beyond 65 kb. All other isoforms except LpR1 had 15 exons separated by 14 introns. The sequences of the exon/intron boundaries of the BmLpR isoforms have conserved GT/AG sites except for introns 5 and 10 (see supplementary table). Comparison of nucleotide sequences among all variants showed that exons 1-12 encode a common cDNA region except for exon 4 of LpR3, which had a different [...]. These data strongly suggest that all four isoforms were generated from alternative splicing of a single gene; this notion was further supported by Southern hybridization results. As shown in Fig. 4, genomic DNA digested with three different enzymes and hybridized with a common LpR probe resulted in a discrete single band in each restriction digest. Isoform-specific probes also showed the same pattern (data not shown) in several restriction digests. Additionally, preliminary linkage analysis revealed that the silkworm LpR gene is located on chromosome 5 (data not shown).
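The exon and intron counts reported here, together with the 27 additional amino acids found only in LpR1, obey two simple bookkeeping rules: a transcript with n exons has n - 1 introns, and an exon can be included or skipped without shifting the reading frame only if its length is a multiple of three. The sketch below checks these rules; it is an illustration with hypothetical function names, not part of the analysis in this study, and it assumes that the LpR1-specific segment is exactly 27 codons (81 nucleotides) long.

```python
def intron_count(exon_count):
    """Number of introns separating a given number of exons."""
    if exon_count < 1:
        raise ValueError("a transcript needs at least one exon")
    return exon_count - 1


def cassette_exon_effect(exon_length_nt):
    """Return (in_frame, codons) for including or skipping an exon.

    in_frame is True when the exon length is a multiple of three, so that
    the downstream reading frame is unchanged; codons is the number of
    amino acids gained or lost in that case, otherwise None.
    """
    in_frame = exon_length_nt % 3 == 0
    codons = exon_length_nt // 3 if in_frame else None
    return in_frame, codons


# LpR1 (16 exons) versus the other isoforms (15 exons).
print(intron_count(16), intron_count(15))

# Assumed length of the LpR1-specific segment: 27 amino acids x 3 nucleotides.
print(cassette_exon_effect(27 * 3))
```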

Functional expression
Both LpR1 and LpR4 cDNAs without their signal sequences were fused with vectors containing a C-terminal His6 tag, expressed using cell-free systems, and detected using poly-His6 antibody. The expressed proteins showed the expected molecular masses of 98.1 and 96.1 kDa (His6 tag and other linker amino acids contributed 1.1 kDa) for LpR1 and LpR4, respectively (Fig. 5A). Upon incubation with Lp and subsequent detection using anti-Lp antibody, both expressed receptor proteins showed binding to Lp (Fig. 5B). No binding was detected in the control reaction, in which Lp antibody was omitted (data not shown).

DISCUSSION
The deduced amino acid sequence of the BmLpR cDNA revealed that it encodes a protein homolog of the [...] (12), which has only seven repeats. The amino acid sequence homology of silkworm LpR1 with L. migratoria (7) and A. aegypti (11,12) is 67% and 62%, respectively. The O-linked sugar domain of the silkworm LpR variants excluding LpR1 is marked by an in-frame deletion of 81 nucleotides, resulting in the loss of 27 amino acids. Hence, LpR1 seems to have a full glycosylation domain and was highly expressed in the vitellogenic ovary, possibly indicating its role in mediating the uptake of Lp into developing oocytes. Similarly, mosquito AaLpRov (11), a splice variant with a longer O-linked sugar domain than AaLpRfb (12), is highly expressed in vitellogenic stages. By contrast, in laying hens, LR8, which lacks the O-linked sugar domain, is expressed in oocytes, whereas the somatic tissues express a variant that contains this domain (24). Although the function of the O-linked sugar domain in LDLR is not precisely known, it is marked by many additions and deletions and is the most divergent among LDLR family members (25). Unlike the sugar domain, the cytoplasmic domain is the most conserved among insect LpRs. BmLpR, except LpR4, showed 72.58% homology with locust and mosquito LpR cytoplasmic domains. The differential expression of each form of silkworm LpR variant seemed to be under specific control at different developmental stages. LpR1 was present in high amounts in vitellogenic ovary, the LpR2 transcript was more confined to pupal fat body, and LpR3 was most highly expressed in larval and pupal fat body but seemed less abundant than other variants in other tissues and stages examined. Moreover, LpR3 has a variant exon 4b, which is different from the corresponding exon 4a of other isoforms. Our data do not support the possibility that it is an artifact. RT-PCR revealed that it was transcribed, and the PCR product was verified for its specific sequence.
In addition, exon 4b of LpR3 and the corresponding common exon 4a of all other isoforms lie close to each other in the genome. Exon 4b of LpR3 is nearly 400 nucleotides upstream of exon 4a of all other isoforms and is present in the same contigs of 3C2 and 13C23 BAC clones (data not shown). Another possibility is that it forms part of a highly homologous gene, such as vitellogenin receptor, a member of the LDLR superfamily. However, the present sequence of LpR3 did not show additional repeats in the ligand binding and EGF domains, as seen in other insect vitellogenin receptor genes. Thus, the presence of variant exon 4b of LpR3 is probably attributable to mutually exclusive splicing, in which one segment of the sequence is substituted for another so that each isoform has its own unique subsequence (26). However, confirmation of this needs additional experimental data.
Most notably, LpR4 appeared to be a brain-specific receptor. Although LpR4 mRNA levels were high in pupal brain on days 4 and 6, no direct correlation was seen between this and hemolymph prothoracicotrophic hormone titer, which showed a prolonged peak in early pupal-adult development and was closely correlated with ecdysteroid titer (27). Although the transcript start site has not been determined for the silkworm LpR isoforms, it seems from their gene structure that they may share a common regulatory region; if so, the splicing events may contribute to their tissue- and stage-specific expression. VLDLR exists in variant forms arising from differential splicing, but the precise role of each spliced variant is not well understood (1). Alternative splicing plays a key role in the regulation of gene expression during many developmental processes, ranging from sex determination to apoptosis (28), and it could produce receptors with different affinities or specificities. It appears, as in other insects, that BmLpR variants may have a role in lipid metabolism, but their involvement in other functions cannot be ruled out. The recent discovery of other potential ligands for several LDLR members has expanded their possible function far beyond lipid metabolism (1).
Although the amino acid sequence revealed LpR structural domains that are highly similar to those of vertebrate VLDLR, the genomic structure is strikingly different. A distinctive feature of the BmLpR gene is its enormous intron size. The total size of the BmLpR gene (>122 kb, with 16 exons) is at least three and two times larger than human and mouse VLDLR (40 and 50 kb with 19 exons), respectively (29,30). This relatively larger size (2- to 3-fold) mainly comes from the first two introns of BmLpR. A common feature among BmLpR and human/mouse VLDLR is that the first two introns are larger than the rest. However, unlike BmLpR, the remaining introns in VLDLR are <2 kb. Interestingly, a BLAST search indicated that the BmLpR introns are marked by the presence of several transposable elements. This is in good agreement with recent reports on the whole-genome shotgun sequence of B. mori (31,32), which revealed that the silkworm genome contains many repetitive sequences derived from transposons. Because the presence of long introns is an uncommon feature of vertebrate VLDLR genes, analysis of other large genes in B. mori might help to clarify the role of such long introns or to explain the selective forces underlying their presence.
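The fold differences quoted above follow directly from the reported gene sizes. The short calculation below is a sketch with hypothetical names that only restates that arithmetic; because 122 kb is a lower bound for BmLpR (the second intron could not be sized beyond 65 kb), the resulting ratios are minimum estimates.

```python
# Reported gene sizes in kb; the BmLpR value is a lower bound (>122 kb).
BMLPR_MIN_KB = 122
VLDLR_KB = {"human": 40, "mouse": 50}


def fold_larger(gene_kb, reference_kb):
    """Ratio of one gene size to a reference gene size."""
    return gene_kb / reference_kb


# Minimum fold difference of BmLpR relative to each vertebrate VLDLR gene.
min_fold = {species: fold_larger(BMLPR_MIN_KB, size_kb)
            for species, size_kb in VLDLR_KB.items()}

for species, ratio in min_fold.items():
    print(f"BmLpR is at least {ratio:.2f}-fold larger than {species} VLDLR")
```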
Cysteine-rich repeats of both LBD (I, II, III, VII, and VIII) and EGF are encoded by individual exons in VLDLR, whereas in BmLpR this is true only of LBD I, II, and III repeats. Comparatively, vertebrate VLDLRs have more exons whose boundaries coincide with the borders of repeats and are characterized by short introns, thus highlighting the evolutionary distance between insects and vertebrates. Despite the differences in intron size, phylogenetic analysis (see supplementary figure) of complete coding regions of lipoprotein receptors revealed a high degree of conservation between insect LpRs and other LDLR family members, showing their divergence from a common ancestor, in addition to indicating the existence of a functional parallel among eukaryotes. Moreover, it is evident that there exists a distant relationship between insect LpRs and insect yolk protein/vitellogenin receptors, as reported previously (11).
Data from RT-PCR, cDNA and genomic structure, Southern blot, and the number of positive BAC clones all showed LpR variants to be generated from a single-copy gene in the silkworm genome by alternative splicing. Additionally, the silkworm LpR gene has been located on chromosome 5. In contrast, the two variants of the human receptor genes, LDLR and VLDLR, are located on separate chromosomes (19 and 9), despite their high similarity in exon/intron structure (29). Although several LDLR family members have been predicted from the genomic sequence of D. melanogaster (33), lack of further genetic characterization of LpR genes from Drosophila and Anopheles makes a comparison difficult, at least for now, between Diptera and Lepidoptera.
LpR4 is novel for its exclusive expression in the brain and central nervous system, unique cytoplasmic tail, and different splicing at the 3' region. The cytoplasmic domain of LpR4 contained 66 amino acids, of which 18 were unique, 5 of them protruding as a tail. Similarly, human apolipoprotein E receptor 2, a member of the LDLR family that is expressed predominantly in the brain, is characterized by the presence of an additional 59 amino acids in its cytoplasmic domain compared with LDLR and VLDLR (34). Also, its cytoplasmic tail shows a minimal rate of endocytosis (35), and its proline-rich insertion binds to adaptor proteins linking the reelin pathway to the Jun N-terminal kinase signaling pathway (36). Although LpR4 does not contain a proline-rich cytoplasmic tail, a BLAST search of its specific sequence showed homology to several signal-transducing genes, such as protein kinase, guanine nucleotide binding proteins, and frizzled homolog 6, thus indicating that it may also participate in signaling events. Future studies may reveal the interacting partners for its cytoplasmic domain. Recently, Arrow, an LRP involved in Wingless/Wnt signal transduction in conjunction with Frizzled-class proteins, was isolated from an embryonic cDNA library of Drosophila (37). However, Arrow is homologous to vertebrate LRP5 and LRP6, whose domain structures are different from those of the core members of the LDLR family (2).
The increased expression of LpR1 and LpR4 transcripts in the brain and the central nervous system of the silkworm is interesting. The presence of a perfect internalization signal, FDNPVY, in the cytoplasmic domain of both isoforms strongly suggests their endocytic competence. Binding results showed that both isoforms bind to Lp, although we have not investigated whether they have preferential binding specificities. The question arises, why does the brain possess a different receptor variant of its own? As noted, recent developments in lipoprotein receptor research revealed that several LDLR family members play important roles in signal transduction cascades through adaptor molecules that bind to their cytoplasmic tails (38). This suggests the likelihood that LpR4, with its unique cytoplasmic tail, may have roles in the endocytosis of other ligands and/or the cellular signaling required for the functional maintenance of the brain and central nervous system. Although the exact role(s) of this receptor remains to be determined, our findings may prompt studies on the diverse functions of lipoprotein receptors in invertebrates.