The G protein–coupled receptors (GPCRs) constitute a large and ancient superfamily of integral cell membrane proteins that play a central role in signal transduction and are activated by an equally diverse array of ligands. GPCRs share a seven hydrophobic α-helical domain structure and transduce signals through coupling to guanine nucleotide–binding regulatory proteins (G proteins). The seven hydrophobic domains are likely to span the membrane and are linked by three extracellular loops that alternate with three intracellular loops. The extracellular NH2 terminus is usually glycosylated and the cytoplasmic COOH terminus is generally phosphorylated. The presence of a large diversity of GPCR genes may be a characteristic of eukaryotic genomes since >1,000 GPCRs have been identified in the Caenorhabditis elegans genome, representing >5% of its total number of genes (Bargmann 1998).
The completion of the sequencing of the Drosophila melanogaster genome allows the analysis of its full repertoire of GPCRs for the first time. Do Drosophila GPCRs have counterparts in other phyla, or do they reflect a highly specialized insect biology? The Drosophila genome contains ∼200 genes coding for GPCRs, including neurotransmitter and hormone receptors, and olfactory and putative taste receptors (Adams et al. 2000; Clyne et al. 2000; Rubin et al. 2000). We have identified 100 genes in the Drosophila genome that code for putative neurotransmitter and hormone GPCRs and atypical seven-transmembrane domain (7 TM) proteins, 68 of which are described here for the first time (Fig. 1, red). These genes were manually curated after the use of gene prediction programs Genie and Genscan (Adams et al. 2000), resulting in an enhanced definition of predicted gene structures.
Drosophila GPCRs are classified into four families: rhodopsin-like (Fig. 1 A); secretin-like (Fig. 1 B); metabotropic glutamate–like (Fig. 1 C); and atypical 7 TM proteins (Fig. 1 D). This classification is based on primary and secondary structure predictions, sequence analysis using profile hidden Markov models, and sequence homology searches using BLAST. Despite the greater number and diversity of GPCRs in vertebrates and C. elegans as compared with Drosophila, the data point to conservation of hormone and neurotransmitter receptors across phyla, suggesting ancient evolutionary origins.
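The structure-prediction step mentioned above can be illustrated with a hydropathy scan, a common first pass for locating candidate membrane-spanning segments. The sketch below is an assumption for illustration only: the text does not state which prediction method was used. It uses the Kyte-Doolittle hydropathy scale with a 19-residue window and a cutoff of 1.6, values conventionally associated with that scale. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (illustrative choice of scale; the
# analysis described in the text may have used a different method).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Merge overlapping or abutting windows whose mean reaches the threshold.

    Returns (start, end) pairs, 0-based with end exclusive.
    """
    segments = []
    for i, mean in enumerate(hydropathy_profile(seq, window)):
        if mean < threshold:
            continue
        start, end = i, i + window
        if segments and start <= segments[-1][1]:
            # Extend the previous segment instead of opening a new one.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Hypothetical toy sequence: seven hydrophobic stretches separated by
# charged linkers. It is not a real receptor sequence.
toy = ("DK" * 15 + "LIV" * 7) * 7
segments = candidate_tm_segments(toy)
print(len(segments), "candidate membrane-spanning segments")
```

A scan of this kind only flags hydrophobic stretches. Assigning a protein to a GPCR family still depends on the profile hidden Markov model and BLAST comparisons described above.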
Rhodopsin-like Receptor Family
The rhodopsin-like family encompasses receptors for a large variety of stimuli, such as biogenic amine neurotransmitters, neuropeptides, peptide hormones, light, nucleotides, prostaglandins, leukotrienes, chemotactic peptides, and chemokines. Although their ligands vary considerably in structure, the rhodopsin-like GPCRs show sequence conservation within their seven putative TM domains.
The Drosophila photopigments form three subgroups: (i) Rh1, Rh2, and Rh6, which are related to long wavelength–absorbing invertebrate visual pigments; (ii) Rh3, Rh4, and Rh5, which belong to a group of short wavelength–absorbing invertebrate visual pigments (Salcedo et al. 1999); and (iii) CG5648, a newly identified Drosophila opsin (Fig. 1). Subgroups i and ii are more closely related to each other than to CG5648. Drosophila opsins are quite distinct from vertebrate opsins and are more closely related to other insect and mollusk opsins and to melanopsin, a dermal opsin from Xenopus laevis (Provencio et al. 1998). This level of sequence homology suggests that invertebrate opsins and melanopsin may share a common functional basis and evolutionary origin. Functionally, the chromophore of vertebrate retinal opsins must be reisomerized into the 11-cis isomer, whereas invertebrate photopigments retain a covalently linked chromophore (Gärtner and Towner 1995).
GPCRs for Biogenic Amines, Related Compounds, and Purines
This large group comprises receptors for classical neurotransmitters and neuromodulators. Its members may share a common evolutionary ancestor and are present in both vertebrate and invertebrate lineages (Venter et al. 1988). Of the 21 receptors identified in this group, 11 are described here for the first time (Fig. 1). The biogenic amine GPCRs share high levels of sequence similarity within species and across phyla. As a result, many of the newly described biogenic amine GPCRs cannot easily be classified into subgroups as defined by their putative ligands. Furthermore, it has been suggested that these receptors have changed substrate specificities during evolution (Peroutka and Howell 1994).
Insects, and Drosophila in particular, have proven to be ideal experimental organisms for the study of the roles of biogenic amine signaling in development, learning, and addiction. Serotonin (5-HT) is involved in circadian rhythms, locomotion, feeding, learning, and memory in invertebrates. The 5-HT2 receptor is known to play an early role in coordinating cell movements during gastrulation in Drosophila (Colas et al. 1999). Dopamine plays a role in the responses of Drosophila to nicotine and ethanol (Bainton et al. 2000). Targeted expression of either stimulatory or inhibitory G-α subunits in dopaminergic and serotoninergic neurons blocks behavioral sensitization to repeated cocaine exposures (Li et al. 2000). Octopamine and tyramine are monoamines thus far identified in arthropods and mollusks. Octopamine has been implicated in the establishment of associative learning in the honeybee (Hammer and Menzel 1998) and tyramine is essential for sensitization to cocaine in Drosophila (McClung and Hirsh 1999). We identified Drosophila receptors for most biogenic amines, with the exception of histamine. In fact, no histamine receptors have been cloned from invertebrates. However, histamine is thought to be the neurotransmitter for Drosophila photoreceptors (Hardie 1987). Therefore, one or more of the unclassifiable biogenic amine receptors may function as a histamine receptor (Fig. 1). Considerable evidence supports the existence of purinergic transmission in invertebrates, but the corresponding receptors have not been cloned. The newly identified gene CG9753 encodes a receptor that shares homology with vertebrate adenosine receptors and may constitute the first invertebrate purinergic GPCR.
We identified 25 putative peptide GPCRs (Fig. 1), 18 of which represent newly discovered genes. The Drosophila peptide GPCRs were assigned to nine different ligand types. Approximately 30 different types of peptide GPCRs have been identified in vertebrates. Thus, there appears to be a paucity of peptide receptor types in Drosophila, suggesting that there will be fewer cognate peptide hormones in Drosophila than in vertebrates. Drosophila peptide GPCRs also appear to be more closely related to vertebrate than to C. elegans peptide GPCRs. This finding is surprising given the extensive differences between insects and vertebrates in growth and hormonal regulation.
Sequence analyses of the novel putative Drosophila peptide GPCRs suggest roles for them in regulation of growth, fluid balance, visceral functions, and sexual development. Allatostatin is a 15–amino acid insect neuropeptide that inhibits juvenile hormone synthesis (Bendena et al. 1999). The receptors for LH, FSH, and TSH belong to a family of GPCRs characterized by large NH2-terminal extracellular domains containing leucine-rich repeats, which are important for interaction with glycoprotein ligands (Hsu et al. 1998). A mutant phenotype is known for only one Drosophila peptide GPCR: the rickets mutation leads to developmental defects that suggest a role for this receptor in limb development (Ashburner et al. 1999). The gene rickets (rk) bears homology to vertebrate leucine-rich repeat–containing GPCRs. Another putative hormone receptor gene, CG6111, encodes a protein related to mammalian vasopressin receptors. Three novel Drosophila genes code for putative growth hormone secretagogue (GHS) receptors: CG8784, CG8795 (two closely related genes located in tandem on opposite strands of chromosome 3R), and CG9918. The vertebrate GHS receptors are involved in regulation of growth hormone release and their endogenous ligand is unknown. The presence of GHS-like receptors in Drosophila is provocative and should help to elucidate the identity of their ligands and the functions of their vertebrate homologues.
Fourteen Drosophila GPCRs, 12 of which are newly described here, did not show significant sequence homology to functionally characterized receptors and were included in the orphan receptor group (Fig. 1). Most of these orphan GPCRs showed higher degrees of sequence identity to C. elegans than to vertebrate GPCRs. One explanation is that their vertebrate homologues have not yet been identified. Alternatively, these orphan GPCRs may play developmental or physiological roles common to C. elegans and Drosophila.
Secretin-like Receptor Family
The secretin-like family includes receptors for many hormones such as secretin, calcitonin, vasoactive intestinal peptide, and parathyroid hormone and related peptides. The secretin-like receptors are characterized by long NH2-terminal domains containing five conserved cysteine residues that may form disulfide bonds and by short third cytoplasmic domains. We identified three novel GPCRs related to vertebrate calcitonin receptors (Fig. 1). Calcitonin receptors are involved in the regulation of Ca2+ homeostasis in vertebrates. Two receptors, encoded by CG8422 and CG12370, are related to insect diuretic hormone receptors (Fig. 1). Insect diuretic hormones are a group of peptides involved in the regulation of fluid and ion secretion (Reagan 1994). The newly identified Drosophila diuretic hormone receptors share 57% sequence identity, suggestive of a gene duplication. One novel latrophilin-like receptor gene was also identified (CG8639). Latrophilins are a heterogeneous group of Ca2+-independent receptors for α-latrotoxin, a potent presynaptic neurotoxin that stimulates massive neurotransmitter exocytosis leading to nerve terminal degeneration (Holz and Habener 1998). The endogenous ligands for latrophilins are unknown and may be involved in control of synaptic exocytosis. Genes CG11318 and CG15556 define another subgroup in the secretin-like receptor family, coding for two novel receptors that share 41% sequence identity. These GPCRs are distantly related to the HE6 receptor, a human receptor of unknown function specifically expressed in the epididymis (Osterhoff et al. 1997).
Methuselah-like Receptor Family
Methuselah is a Drosophila GPCR involved in modulation of life span and stress response. The mutant line methuselah, with a heterozygous mutation in the mth gene, showed increased average life span and enhanced resistance to various forms of stress (Lin et al. 1998). The Methuselah receptor is also essential for normal development since flies homozygous for the mth mutation displayed pre-adult lethality. No counterparts for mth have been identified in vertebrates or C. elegans. We have identified 10 novel genes related to mth in the Drosophila genome (Fig. 2 and Fig. 3). Methuselah is most closely related to Mth-like 2 (CG17795; 60% sequence identity). Two gene clusters were identified in this family. The genes CG17084, CG17061, and mth form a cluster on chromosome 3L. CG6530 and CG6536 are located in tandem on chromosome 2R and share 76% sequence identity at the protein level, indicating a fairly recent duplication. CG16992 and CG7674 predict truncated receptors but their classification as potential pseudogenes needs experimental confirmation. Identification of the ligands for the Methuselah-like receptors should be of major biological interest.
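Percent sequence identities such as the 60% and 76% values quoted for the Methuselah-like receptors depend on how aligned positions are counted. The text does not specify the convention used, so the following is a minimal sketch under one common assumption: identity is the fraction of matching residues among aligned columns in which neither sequence has a gap. The function name and the short example strings are hypothetical and do not correspond to real receptor sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length. Other conventions (for example, dividing by the
    full alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 6 ungapped columns, 5 of them identical.
identity = percent_identity("MKT-LLVA", "MRTALL-A")
print(f"{identity:.1f}% identity")
```

Because the denominator changes with the convention, identity values are only comparable when they are computed in the same way from the same kind of alignment.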
Metabotropic Glutamate Receptor–like Family
The ligands for the metabotropic glutamate–like GPCRs include calcium ions and the amino acid neurotransmitters glutamate and γ-aminobutyric acid (GABA). Glutamate is a major excitatory neurotransmitter in invertebrates, whereas GABA is generally released from inhibitory synaptic terminals. The metabotropic glutamate–like GPCRs are characterized by very long NH2-terminal extracellular domains containing ∼17 conserved cysteine residues that may form disulfide bonds. Eight members of the metabotropic glutamate receptor–like family were identified in the Drosophila genome; seven of them are described here for the first time (Fig. 1). The novel metabotropic glutamate and GABA-B receptor–like genes show very high degrees of sequence conservation with their vertebrate homologues, suggesting similar roles in synaptic function.
Atypical 7 TM Proteins
The Frizzled-like proteins, Starry night (Flamingo) and Bride of sevenless, are defined here as atypical 7 TM proteins, a group of receptors that share the typical topology of GPCRs but show no sequence conservation with members of the other GPCR families (Fig. 1). These receptors are involved in tissue polarity and cell–cell signaling but their signal transduction pathways are unclear. However, there is evidence that a rat homologue of the Frizzled-like group couples to G proteins (Slusarski et al. 1997). We identified a novel atypical 7 TM protein gene, CG4626, which encodes a Frizzled-like protein that is more closely related to mammalian Frizzled 4 than to other Drosophila Frizzled-like proteins.
Starry night (Stan) is a complex protein containing 7 TM domains and several cadherin, EGF-like, and laminin G domains. The stan gene may have evolved from the combination of ancestral genes coding for a secretin-like GPCR and a cell adhesion molecule. In Drosophila, Stan is implicated in establishment of tissue polarity (Taylor et al. 1998). A novel atypical 7 TM protein that may be distantly related to secretin-like GPCRs is encoded by CG20776, which contains multiple TM domains and several leucine-rich repeats thought to be involved in protein–protein interactions. Bride of sevenless (Boss) is another atypical 7 TM protein that might be distantly related to the metabotropic glutamate–like GPCRs.
In conclusion, GPCRs constitute a very large superfamily of proteins that play a central role in eukaryotic signal transduction. The families of typical GPCRs include the rhodopsin-like, secretin-like, and metabotropic glutamate–like receptors, fungal mating pheromone receptors, Dictyostelium cAMP receptors, and C. elegans chemoreceptors. Additionally, there are three putative (or atypical) GPCR families: the Frizzled-like receptors and the Drosophila olfactory and putative taste receptors (Clyne et al. 2000; Rubin et al. 2000). All the different GPCR families share the same seven membrane–spanning domain topology. The evolutionary relationship between the different families is uncertain since there are no significant degrees of sequence similarity between them. It is likely that they have evolved independently and convergently adopted the G protein signal transduction pathway.
Our analysis focused on the typical Drosophila GPCRs, particularly the neurotransmitter and hormone GPCRs, and how they compare with those found in vertebrates and C. elegans. Most of the 100 Drosophila GPCRs described in Fig. 1 show a high degree of sequence conservation with vertebrate GPCRs. Only eight Drosophila GPCRs appear to be more closely related to C. elegans than to vertebrate receptors. We have identified 68 novel Drosophila GPCRs including the mth-like receptors, a unique subfamily of GPCRs that appears to have no counterpart in vertebrates or C. elegans. There is evidence indicating that the mth-like receptor subfamily plays an important role in Drosophila development, stress response, and regulation of life span (Lin et al. 1998).
There has been a large expansion and diversification of chemoreceptors in C. elegans. There is also evidence of an expansion of the peptide receptors in vertebrates and odorant receptors in mammals. Drosophila GPCRs have not expanded to a similar degree: in particular there appears to be a lower number of peptide receptors than expected. This is somewhat surprising, since it has been suggested that peptide transmitters predate biogenic amines in evolution (Walker et al. 1996). In C. elegans, the expansion of GPCR genes is mirrored by an expansion in G protein subunits: 20 α-, 2 β-, and 2 γ-subunit genes have been identified in the C. elegans genome (Bargmann 1998). In contrast, the Drosophila genome contains only 6 α-, 3 β-, and 2 γ-subunit genes.
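The comparison of G protein subunit repertoires can be made concrete with a small tally. The sketch below uses only figures quoted in this article, with two simplifying assumptions: the ">1,000" C. elegans GPCR count is treated as exactly 1,000 (a lower bound), and the "∼200" Drosophila count as exactly 200. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GProteinRepertoire:
    species: str
    gpcr_genes: int  # approximate count quoted in the text
    alpha: int
    beta: int
    gamma: int

    @property
    def subunit_genes(self):
        """Total number of G protein subunit genes."""
        return self.alpha + self.beta + self.gamma

    @property
    def gpcrs_per_alpha(self):
        """GPCR genes per alpha-subunit gene (a rough ratio, not a coupling map)."""
        return self.gpcr_genes / self.alpha


# ">1,000" is treated as 1,000 and "~200" as 200 (assumptions, see lead-in).
worm = GProteinRepertoire("C. elegans", 1000, alpha=20, beta=2, gamma=2)
fly = GProteinRepertoire("D. melanogaster", 200, alpha=6, beta=3, gamma=2)

for rep in (worm, fly):
    print(f"{rep.species}: {rep.subunit_genes} subunit genes, "
          f"about {rep.gpcrs_per_alpha:.0f} GPCR genes per alpha-subunit gene")
```

These ratios say nothing about which receptors couple to which subunits. They only restate the gene counts in a form that makes the difference in scale between the two genomes easier to see.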
The organization of the GPCR genes in the Drosophila genome differs in several respects from that of other eukaryotic genomes analyzed to date. GPCR genes form large clusters in the genomes of C. elegans and mammals. In contrast, only small clusters of GPCR genes were identified in the Drosophila genome: six consisting of two genes and one of three genes. A substantial proportion of vertebrate GPCR genes are thought to be intronless, but only 5 of the 100 Drosophila GPCR genes described here were predicted to be intronless. The C. elegans and mammalian genomes contain a large number of GPCR pseudogenes. We identified only eight genes in the Drosophila genome that appear to code for incomplete GPCRs, but their identity as pseudogenes will require further experimental investigation.
Now that the full repertoire of Drosophila GPCRs is known, the next step is to match the newly identified receptors with their cognate ligands and biological functions. Systematic mutation of the Drosophila GPCRs will help determine their roles in development, neural function, and behavior and may also yield insights into the functions and mutational pathologies of their vertebrate homologues. For example, it is becoming clear that substantial overlap exists in the biological components of addiction in vertebrates and flies; consequently Drosophila should prove invaluable as a model for the study of addiction. Although it has served as a model organism for nearly a century, Drosophila has now been cast in a new role, which should further the investigation of the mechanisms of development, neural function, and disease, for which the analyses of GPCRs will prove crucial.
The authors would like to thank Kristin Scott for sharing information on Drosophila GPCRs; and Judith Brody, Leslie Vosshall, Harold Gainer, and Joseph Campbell for their helpful comments about the manuscript.
Abbreviations used in this paper: GPCR, G protein–coupled receptor; G protein, guanine nucleotide–binding protein; TM, transmembrane.