Annotated Bibliography
Positive and Negative Selection in the DAZ Gene Family
Article in Molecular Biology and Evolution · May 2001
DOI: 10.1093/oxfordjournals.molbev.a003831 · Source: PubMed
Joseph P. Bielawski and Ziheng Yang Department of Biology, Galton Laboratory, University College London, London, England
Because a microdeletion containing the DAZ gene is the most frequently observed deletion in infertile men, the DAZ gene was considered a strong candidate for the azoospermia factor. A recent evolutionary analysis, however, suggested that DAZ was free from functional constraints and consequently played little or no role in human spermatogenesis. The major evidence for this surprising conclusion is that the nonsynonymous substitution rate is similar to the synonymous rate and to the rate in introns. In this study, we reexamined the evolution of the DAZ gene family by using maximum-likelihood methods, which accommodate variable selective pressures among sites or among branches. The results suggest that DAZ is not free from functional constraints. Most amino acids in DAZ are under strong selective constraint, while a few sites are under diversifying selection with nonsynonymous/synonymous rate ratios (dN/dS) well above 1. As a result, the average dN/dS ratio over sites is not a sensible measure of selective pressure on the protein. Lineage-specific analysis indicated that human members of this gene family were evolving by positive Darwinian selection, although the evidence was not strong.
Introduction
Azoospermia is the most common form of infertility in human males (Shinka and Nakahori 1996). A locus on the human Y chromosome, the azoospermic factor (AZF), is believed to contain a gene, or genes, crucial to proper differentiation of male germ cells. The observation that microdeletions at three different loci of AZF occur in 5%–15% of infertile men supports this hypothesis (Ferlin et al. 1999). One of these loci (AZFc) encodes the Deleted in AZoospermia (DAZ) gene. Because AZFc is the most frequently observed deletion in infertile men, it was considered a strong candidate for the azoospermia factor (Ferlin et al. 1999). Genes from a number of different pathways, however, are required for normal spermatogenesis (Elliott and Cooke 1997).
DAZ is located on the Y chromosome, but it is closely related to the autosomal gene DAZL1. While DAZL1 is present in all vertebrates, DAZ is found only in Old World Monkeys. Thus, DAZ [Yq11.23] is believed to have evolved via translocation of DAZL1 [3p24] to the Y chromosome (Saxena et al. 1996; Gromoll et al. 1999) some time after the divergence of Old and New World monkeys; Kumar and Hedges (1998) dated that divergence to about 40 MYA. After the translocation event, DAZ underwent a series of rearrangements and a modified copy was amplified, yielding a Y gene cluster.
DAZ and DAZL1 have a functional role in fertility. Both DAZ and DAZL1 are expressed exclusively in germ cells (Cooke et al. 1996; Ruggiu et al. 1997; Gromoll et al. 1999), and in humans DAZ expression is highest in spermatogonia (Menke, Mutter, and Page 1997). Experimental elimination of DAZL1 in mice results in termination of germ cell development beyond the spermatogonial stage (Ruggiu et al. 1997). Moreover, Y-encoded human DAZ can complement the sterile phenotype of DAZL1 null mice, yielding a partial recovery of spermatogenesis, which suggests the same or similar target mRNA for DAZ and DAZL1 during spermatogenesis (Slee et al. 1999). Although the specific functions of DAZ and DAZL1 are unknown, the presence of RNA recognition motifs suggests that these genes could be involved in controlling the cell cycle switch from mitotic to meiotic cell division (Gromoll et al. 1999); this cell cycle switch is controlled by RNA-binding proteins in yeast (Watanabe et al. 1997).
Key words: DAZ, DAZL1, gene family, maximum likelihood, codon model, positive selection.
Address for correspondence and reprints: J. P. Bielawski, Department of Biology, University College London, 4 Stephenson Way, London NW1 2HE, United Kingdom. E-mail: [email protected].
Mol. Biol. Evol. 18(4):523–529. 2001 © 2001 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
Surprisingly, a recent evolutionary analysis of the DAZ family (DAZ and DAZL1 genes) indicated a lack of functional constraints on DAZ. Agulnik et al. (1998) found a high rate of nonsynonymous substitution, similar rates between exons and introns, and similar rates among the three codon positions. They concluded that there were no functional constraints on evolution of DAZ and that patterns of sequence divergence were due to neutral drift. They hypothesized that Y-linked DAZ played little role in human spermatogenesis.
The nonsynonymous-to-synonymous rate ratio (dN/dS) provides a sensitive measure of selective pressure on the protein. However, when selection pressure varies among amino acid sites, the average dN/dS ratio might not be very informative about the evolutionary processes affecting the gene. The objective of this study was to investigate the role of both purifying and positive selection on the DAZ gene family by using maximum-likelihood methods that accommodate differences in selective pressures among sites (Nielsen and Yang 1998; Yang et al. 2000). Our findings indicated that DAZ was not free of functional constraints and that other explanations for its rapid rate of nonsynonymous evolution must be considered. There has been considerable debate as to whether rapid evolution in gene families is caused by positive Darwinian selection after gene duplication (Ohta 1993) or by relaxation, but not complete loss, of functional constraints in redundant genes (Kimura 1983; Li 1985). In the latter case, a new function might evolve when formerly neutral substitutions convey a selective advantage in a novel environment or genetic background (Dykhuizen and Hartl 1980). We also examined variable selective pressures among lineages (Yang 1998; Yang and Nielsen 1998), and our findings suggest that both models could have played a role in the evolution of the DAZ gene family.
[Figure 1: tree of DAZL1 (Mus, Macaca, Human) and DAZ (Macaca, Human) with branches labeled a–g; scale bar, 0.1.]
FIG. 1. Phylogeny for the DAZ gene family. This topology was recovered from a maximum-likelihood analysis of nucleotide sequences and also from least-squares analysis of synonymous divergence. Branch lengths are proportional to the mean number of nucleotide substitutions per codon as inferred under model D: free ratios (Yang 1998). All analyses were conducted using unrooted topologies; this topology is rooted for convenience. Numbers in parentheses are branch-specific ω ratios estimated under model D.
Materials and Methods
Sequence Data
Two data sets were compiled, reflecting a trade-off between more characters versus more taxa. Data set 1 was composed of 618 bp of DNA sequence (after exclusion of gaps) from five representatives of the DAZ gene family (fig. 1). Sequences of DAZL1 were from Homo sapiens (GenBank accession number U066078), Macaca mulatta (AF053608), and Mus musculus (U046694), and sequences of DAZ were from H. sapiens (NM004081) and Macaca fascicularis (AJ012216). Sequences included exons 1–6, A7, C8, and 10. Macaca fascicularis DAZ (AJ012216) contains an intragenic duplication of exons 2–6 and multiple copies of exons 7 and 8. We sampled the 3′ copy of exons 2–6, which is 99% similar to the 5′ copy. The "A" copy of exon 7 and the "C" copy of exon 8 were sampled because each predates divergence of the human and Macaca lineages (Gromoll et al. 1999). Data set 1 was used to investigate variation in selective pressure among lineages. To study variation in selective pressure among sites, however, more sequences were needed. Hence, a second data set was compiled consisting of 11 members of the DAZ gene family (fig. 2), but only 291 bp of DNA sequence. Included in data set 2 were exons 3–5 and portions of exons 2 and 6. This data set was composed almost entirely of the RNA recognition domain, which spans exons 2–5. Data set 2 included DAZL1 from Cebus apella (AF053608), H. sapiens (U066078), M. mulatta (AF053608), and Papio hamadryas (AF053607); a single copy of DAZ from H. sapiens (NM004081), Pan troglodytes (AF072324), and M. fascicularis (AJ012216); and two DAZ clones (C1 and C2) from P. hamadryas (C1: AF07230; C2: AF07321) and M. mulatta (C1: AF072322; C2: AF072323). Clones from P. hamadryas and M. mulatta were divergent copies from a multicopy DAZ array on the Y chromosome (Agulnik et al. 1998). Saxena et al. (2000) recently reported that human DAZ genes occur as a four-copy array in the AZFc region of the Y chromosome. However, they found that the four copies differed by only a single, silent, transition in exon 7A, so only one copy was included in our analysis.
Data Analysis
Tree topologies were estimated using maximum likelihood (ML) under the general time-reversible (GTR) model with a discrete gamma model (dΓ) of rate variation among sites (Yang 1994a, 1994b). Trees also were estimated by least-squares from synonymous divergences estimated by ML under a codon model of evolution (Goldman and Yang 1994). The PAUP* computer program (Swofford 2000) was used for conducting tree searches.
We implemented four nested models of variable selective pressures among branches (Yang 1998; Yang and Nielsen 1998). Model A was the simplest and assumed the same ω ratio for all branches. Models B and C were based on the prediction that a gene family evolves under different selective pressures following gene duplication. Model B assumed two ω ratios: one for the branch predating the translocation to the Y chromosome (fig. 1; branch a), and a second for branches postdating the translocation (branches b–g). Model C assumed three ω ratios: one for branch a, one for DAZL1 branches postdating the translocation (branches b–d), and one for all DAZ branches (branches e–g). Model D (free ratios) assumed an independent ω ratio for each branch of a topology and was employed to evaluate the potential for positive selection in any one branch of the tree.
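The nesting of the four branch models can be written out explicitly. The following is a minimal sketch, not the codeml implementation: it only records which branches share an ω parameter under each model, counts the free ω parameters, and checks that a simpler model is a special case of a richer one. The parameter names (`w0`, `w1`, ...) and helper functions are illustrative assumptions.

```python
# Branch labels a-g follow figure 1 of the paper.
BRANCHES = list("abcdefg")

# Each model maps every branch to the (hypothetical) name of the omega
# parameter that branch uses; branches with the same name share one ratio.
MODELS = {
    "A": {b: "w0" for b in BRANCHES},
    "B": {b: ("w0" if b == "a" else "w1") for b in BRANCHES},
    "C": {b: ("w0" if b == "a" else "w1" if b in "bcd" else "w2")
          for b in BRANCHES},
    "D": {b: f"w{i}" for i, b in enumerate(BRANCHES)},
}


def n_omega(model):
    """Number of distinct branch-specific omega parameters in a model."""
    return len(set(MODELS[model].values()))


def df_between(simple, richer):
    """Degrees of freedom for comparing two nested branch models."""
    return n_omega(richer) - n_omega(simple)


def is_nested(simple, richer):
    """True if branches sharing an omega under `richer` also share one under `simple`."""
    seen = {}
    for b in BRANCHES:
        r, s = MODELS[richer][b], MODELS[simple][b]
        if seen.setdefault(r, s) != s:
            return False
    return True


for name in "ABCD":
    print(name, n_omega(name))
```

Under these assumptions the parameter counts agree with the p column reported for models A to D, and the degrees of freedom used in the likelihood ratio tests follow as differences between those counts.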
ML models (Yang and Nielsen 2000; Yang et al. 2000) also permit testing and identification of selective pressures at individual codon sites. We implemented three such models: M3 (discrete), M7 (beta), and M8 (beta&ω). M3 assumed two site classes with the proportions f0 and f1 and ratios ω0 and ω1 estimated from the data. M7 assumed that ω ratios were distributed among sites according to a beta distribution. Depending on parameters p and q, the beta distribution can take a variety of shapes within the interval (0, 1). M8, an extension of M7, added an extra class of sites having an ω parameter freely estimated from the data. Positive selection was indicated when an ω parameter of M3 or M8 was >1. The likelihood ratio test was used to compare a one-ratio model (M0) with M3 and to compare M7 with M8. If there were sites with ω > 1, Bayesian methods were used to calculate the posterior probability that a site fell into each site class; sites with high probabilities for ω > 1 were likely to be under positive Darwinian selection (Yang et al. 2000).
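The posterior probability that a site belongs to a given class follows from Bayes' rule: the class proportion acts as the prior and is weighted by the likelihood of the site's data under that class. The sketch below illustrates the calculation for a two-class model such as M3. The proportions are the tree 1 estimates reported in table 2; the per-class site likelihoods are hypothetical numbers chosen only for illustration, since in practice they come from the phylogenetic likelihood calculation.

```python
def site_class_posteriors(proportions, site_likelihoods):
    """Posterior probability of each site class for one codon site.

    proportions      -- estimated class proportions (f0, f1, ...), summing to 1
    site_likelihoods -- likelihood of the site's data under each class's omega
    """
    joint = [f * lk for f, lk in zip(proportions, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Class proportions under M3 for tree 1 (f0 = 0.97, f1 = 0.03).
proportions = [0.97, 0.03]

# Hypothetical likelihoods of one site under omega0 < 1 and omega1 > 1.
hypothetical_likelihoods = [1.0e-6, 5.0e-5]

posteriors = site_class_posteriors(proportions, hypothetical_likelihoods)
print(posteriors)
```

With these made-up likelihoods the site would be assigned to the ω > 1 class with probability above 0.50 even though that class has a small prior proportion, which is the situation in which a site is listed as a candidate for positive selection.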
All ML analyses of codon models were performed using the codeml program of the PAML package (Yang 1999). The models employed correction for transition/transversion rate bias and codon usage bias, features of DNA sequence evolution that have a significant effect on the estimation of substitution rates (Yang and Nielsen 1998, 2000).
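One way to model codon usage bias is the F3×4 scheme, in which the frequency of each codon is built from the base composition at the three codon positions. The sketch below shows that construction under stated assumptions: the universal genetic code defines the stop codons, frequencies are renormalized over the 61 sense codons, and the position-specific base frequencies are hypothetical values, not estimates from the DAZ data.

```python
from itertools import product

BASES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # universal genetic code (assumption)


def f3x4(pos_freqs):
    """Codon frequencies from base frequencies at the three codon positions.

    pos_freqs -- list of three dicts mapping base -> frequency, one per position
    Returns a dict over the 61 sense codons, renormalized to sum to 1.
    """
    raw = {}
    for codon in map("".join, product(BASES, repeat=3)):
        if codon in STOP_CODONS:
            continue
        raw[codon] = (pos_freqs[0][codon[0]]
                      * pos_freqs[1][codon[1]]
                      * pos_freqs[2][codon[2]])
    total = sum(raw.values())
    return {codon: value / total for codon, value in raw.items()}


# Hypothetical base compositions, for illustration only.
example_positions = [
    {"T": 0.20, "C": 0.25, "A": 0.30, "G": 0.25},
    {"T": 0.30, "C": 0.20, "A": 0.30, "G": 0.20},
    {"T": 0.25, "C": 0.30, "A": 0.15, "G": 0.30},
]
example_freqs = f3x4(example_positions)
print(len(example_freqs), round(sum(example_freqs.values()), 6))
```

This scheme needs only nine free frequency parameters (three per position), whereas treating all 61 sense-codon frequencies as free requires 60, which is why the simpler scheme suits short alignments.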
[Figure 2. Panel A: maximum-likelihood tree from GTR+dΓ. Panel B: least-squares tree from synonymous distances. Both panels show DAZL1 sequences (Cebus apella, Papio hamadryas, Homo sapiens, Macaca mulatta) and DAZ sequences (Macaca fascicularis, Papio hamadryas C1 and C2, Macaca mulatta C1 and C2, Pan troglodytes, Homo sapiens), with the translocation event labeled; scale bar, 0.1.]
FIG. 2. Candidate topologies for the DAZ gene family in primates. A, Tree topology recovered from a maximum-likelihood analysis under the GTR substitution matrix combined with a gamma correction for among-sites rate variation. B, Tree topology recovered from least-squares analysis of synonymous divergence. Branch lengths are proportional to the mean number of nucleotide substitutions per codon as inferred under model M8: beta&ω (Yang et al. 2000). All analyses were conducted using unrooted topologies; these topologies are rooted for convenience.
Results
Variable Selection Pressure Among Lineages
Phylogenetic analyses of data set 1 by different tree reconstruction methods yielded the same tree topology (fig. 1), and this topology was used to analyze variable selective pressures among lineages. Four models (A–D) were fitted by ML to data set 1 (table 1). The estimate of ω for model A (ω = 0.295) represented an average over all codon sites and branches. Model A was then compared with model B, which assumed that selective constraints changed after translocation of DAZL1 to the Y chromosome. Twice the difference in their likelihood scores (2δ) was compared with a χ² distribution with degrees of freedom equal to the difference between models in number of parameters. This likelihood ratio test indicated that model B provided a significantly better fit to these data (2δ = 17.8, df = 1, P = 0.000025). The dN/dS ratio after the translocation, ω1 = 0.51, is significantly higher than that prior to the translocation, ω0 = 0.10.
Model B assumed that both autosomal DAZL1 and the copy that was translocated to the Y chromosome (DAZ) experienced the same change in selective constraints after the translocation event. This simple model was compared with a more complex model (model C) in which changes in selective constraints after the translocation were allowed to differ between DAZL1 and DAZ (table 1). Estimates under model C indicate very different ω values for DAZ and DAZL1 after duplication (table 1). However, the likelihood of model C was not significantly better than that of model B (2δ = 1.12, df = 1, P = 0.29).
Because positive selection at any one point in the phylogeny could have affected our results, we applied the free-ratios model (model D) to the same data. The likelihood score under model D was significantly better than that obtained for model B (2δ = 14.2, df = 5, P = 0.014). Branches b, d, and g exhibited ω values >1 (table 1). Use of the simpler but less realistic F3×4 model, which calculates codon frequencies by using base composition at the three codon positions, produced similar results. Note that ω for branch g was slightly less than 1 under the F3×4 model (ω6 = 0.99), whereas it was greater than 1 under the F61 model (ω6 = 1.144).

Table 1
Log Likelihood Scores and Parameter Estimates Under Models of Variable Selection Pressures Among Lineages

Model            p   Parameters for Branches                         ℓ
A: One ratio     1   ω0 = 0.295 for all branches                     −1,442.44
B: Two ratios    2   ω0 = 0.102 for branch a                         −1,433.52
                     ω1 = 0.513 for branches b, c, d, e, f, and g
C: Three ratios  3   ω0 = 0.103 for branch a                         −1,432.96
                     ω1 = 0.290 for branches b, c, and d
                     ω2 = 0.574 for branches e, f, and g
D: Free ratios   7   ω0 = 0.100 for branch a                         −1,426.40
                     ω1 = 3.474 for branch b
                     ω2 = 0.001 for branch c
                     ω3 = 1.444 for branch d
                     ω4 = 0.350 for branch e
                     ω5 = 0.355 for branch f
                     ω6 = 1.144 for branch g

NOTE: Analyses were conducted using κ as a free parameter and the F61 model of equilibrium codon frequencies. p is the number of branch-specific ω parameters. ω ratios greater than 1 are in bold. Branches are defined in figure 1.
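The three likelihood ratio tests among the branch models can be reproduced from the log likelihoods and parameter counts reported in table 1. The sketch below is an illustration only and is not part of the original analysis: `chi2_sf` is a helper written here from the closed-form upper tail of the χ² distribution for integer degrees of freedom, so that no statistics library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        total = sum(h ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-h) * total
    # Odd df: complementary error function plus a finite correction sum.
    k = (df - 1) // 2
    total = sum(h ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, k + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * total


# Log likelihoods and numbers of omega parameters from table 1.
LNL = {"A": -1442.44, "B": -1433.52, "C": -1432.96, "D": -1426.40}
N_OMEGA = {"A": 1, "B": 2, "C": 3, "D": 7}


def lrt(null, alt):
    """Return (2*delta, df, P) for comparing a null model with a richer one."""
    stat = 2.0 * (LNL[alt] - LNL[null])
    df = N_OMEGA[alt] - N_OMEGA[null]
    return stat, df, chi2_sf(stat, df)


for null, alt in [("A", "B"), ("B", "C"), ("B", "D")]:
    stat, df, p = lrt(null, alt)
    print(f"{null} vs {alt}: 2delta = {stat:.2f}, df = {df}, P = {p:.2g}")
```

The statistics computed this way agree with the values quoted in the text to the reported precision.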
Variable Selection Pressure Among Sites
Phylogenetic analysis of data set 2 under the nucleotide model GTR+dΓ recovered a topology in which divergent copies of DAZ from the same species were not monophyletic, indicating that divergent copies of DAZ originated in an early amplification event and persisted in multiple lineages (fig. 2A). This result is similar to that obtained in a previous analysis of the DAZ gene family (Agulnik et al. 1998). We also inferred a tree topology from synonymous divergences (fig. 2B). This tree, although different from the tree obtained from the nucleotide analysis, also indicated that some copies of DAZ originated from an early amplification event and persisted to the present day. Both trees also indicate a clear bifurcation between all DAZL1 and DAZ sequences, supporting the hypothesis that a single translocation event gave rise to the Y-encoded DAZ. To investigate the impact of tree topology, models of variable ω values among sites were analyzed using both topologies in figure 2 (table 2). The small size of data set 2 (291 bp; 97 codons) prevented use of the parameter-rich model of empirical codon frequencies (the F61 model), and the F3×4 model was used instead.
The discrete model (M3), which allowed two site classes with independent ω ratios, provided a significant improvement over the one-ratio model (M0) regardless of the tree topology assumed (table 3). The selective pressure is not uniform among amino acid sites. Estimates of parameters under M3 suggest that most sites (95%–97%) are under selective constraint, with ω0 = 0.35–0.37, while a few sites (3%–5%) are evolving by positive selection, with ω1 close to 6. Both models that allowed for the presence of positively selected sites, M3 and M8, indicated that some variation in selective pressure was due to positive selection (table 2). Likelihood ratio tests indicated that these models fit the data better than models in which positively selected sites were not allowed (table 3). It is also noteworthy that regardless of model or topology, ω values for sites not subject to positive Darwinian selection were well below 1 (table 2), indicating evolution by purifying selection.

Table 2
Log Likelihood Scores and Parameter Estimates for Four Models of Variable ω's Among Sites and Two Tree Topologies

Model            Parameter Estimates                                 Positively Selected Sites   ℓ
Tree 1 (fig. 2A)
  M0: One ratio  ω = 0.47                                            None                        −747.19
  M3: Discrete   ω0 = 0.37, f0 = 0.97; ω1 = 5.66 (f1 = 0.03)         26, 28, 42                  −742.72
  M7: Beta       p = 0.63, q = 0.78                                  Not allowed                 −745.84
  M8: Beta&ω     p = 2.2, q = 3.1, f0 = 0.98; ω1 = 12.47 (f1 = 0.02) 28                          −742.70
Tree 2 (fig. 2B)
  M0: One ratio  ω = 0.47                                            None                        −758.47
  M3: Discrete   ω0 = 0.35, f0 = 0.95; ω1 = 5.96 (f1 = 0.05)         26, 28, 42, 91              −748.90
  M7: Beta       p = 0.30, q = 0.37                                  Not allowed                 −754.99
  M8: Beta&ω     p = 107, q = 197, f0 = 0.95; ω1 = 5.70 (f1 = 0.05)  26, 28, 42, 91              −748.91

NOTE: p and q are parameters of the beta distribution. f is the proportion of sites assigned to an individual ω category or to a beta distribution with shape parameters p and q. The proportion f1 (in parentheses) is not a free parameter. Positively selected sites are those with posterior probabilities (P) > 0.50, and those with P > 0.95 are in bold.

Table 3
Likelihood Ratio Statistic (2δ) for Comparing Models of Variable ω's Among Sites

                   M3 vs. M0   M8 vs. M7
Tree 1 (fig. 2A)   8.94*       6.82*
Tree 2 (fig. 2B)   19.14*      12.16*

NOTE: See table 2 for model parameters.
* Significant at the 5% level (χ² = 5.99, df = 2).
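Both site-model comparisons add two parameters to the null model (ω1 and a class proportion), so the tests use two degrees of freedom. For df = 2 the upper tail of the χ² distribution has a simple closed form, P = exp(−x/2), which makes the significance threshold easy to check by hand. The sketch below applies it to the likelihood ratio statistics as printed in table 3; the function names are illustrative.

```python
import math

# Likelihood ratio statistics as printed in table 3.
LRT_STATS = {
    ("tree 1", "M3 vs. M0"): 8.94,
    ("tree 1", "M8 vs. M7"): 6.82,
    ("tree 2", "M3 vs. M0"): 19.14,
    ("tree 2", "M8 vs. M7"): 12.16,
}


def p_value_df2(stat):
    """Upper-tail chi-square probability for df = 2."""
    return math.exp(-stat / 2.0)


def critical_value_df2(alpha):
    """Chi-square critical value for df = 2 at significance level alpha."""
    return -2.0 * math.log(alpha)


print(f"5% critical value: {critical_value_df2(0.05):.2f}")
for (tree, comparison), stat in LRT_STATS.items():
    print(f"{tree}, {comparison}: P = {p_value_df2(stat):.2g}")
```

All four statistics exceed the 5% critical value, consistent with the asterisks in table 3.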
Agulnik et al. (1998) hypothesized that there were no functional constraints on DAZ sequences. To test this hypothesis specifically, we reanalyzed only the DAZ sequences of data set 2. The results were consistent with the previous analysis of data set 2; ω values for those sites not subject to positive Darwinian selection were well below 1 (e.g., for tree 1, discrete model: ω0 = 0.43, f0 = 0.93, ω1 = 3.4, f1 = 0.07; beta&ω model: p = 98, q = 122, f0 = 0.94, ω1 = 4.1, f1 = 0.06).
Discussion
Maximum-likelihood analysis of the DAZ gene family revealed significant variation in selective pressures among lineages and among sites. The majority of sites are clearly subject to purifying selection, with the nonsynonymous rate being well below the synonymous rate. A small fraction of sites exhibit nonsynonymous rates almost six times the synonymous rate, indicating the action of positive Darwinian selection. Lineage-specific analyses indicated that following the translocation of an autosomal copy of DAZL1 to the Y chromosome, both loci experienced increased rates of nonsynonymous substitution. In DAZL1 this was due, at least in part, to early evolution by positive Darwinian selection. Later, DAZL1 of M. fascicularis returned to evolution by purifying selection, whereas DAZL1 of humans continued to evolve by positive Darwinian selection. Although there was also an increase in nonsynonymous substitution in DAZ, its early evolution was most consistent with purifying selection. Nevertheless, more recent evolution in DAZ shows the same pattern as DAZL1, with human evolution by positive Darwinian selection and M. fascicularis evolution by purifying selection.
Based on an evolutionary analysis of the DAZ gene family, Agulnik et al. (1998) concluded that there were no functional constraints on the evolution of DAZ in primates and questioned the role of DAZ in human spermatogenesis. Our findings, however, indicate that the majority of sites in DAZ are subject to purifying selection. This notion is supported by the observation of intact reading frames for all the sampled exons of DAZ; there are no frameshift mutations or premature stop codons. Moreover, complementation of sterile-phenotype DAZL1 mice by human DAZ strongly suggests a functional role for human DAZ in spermatogenesis (Slee et al. 1999). These observations, taken together with expression patterns of DAZ, lead us to conclude that DAZ is not free from functional constraints in primates and that the DAZ gene is likely to have functional importance in human spermatogenesis.
The pairwise approach used by Agulnik et al. (1998) is the most common method of computing synonymous and nonsynonymous rates. This approach, however, averages rates over all sites and also over the entire time interval that separates a pair of sequences. Agulnik et al. (1998) did not observe dN/dS ratios in excess of 1 in most comparisons because evolution by positive selection occurred at a subset of sites and only in certain lineages of DAZ. This example is not unique, as the pairwise approach also failed to detect positive selection in HIV (Leigh Brown 1997; Crandall et al. 1999; Zanotto et al. 1999). Moreover, the same effect led to the incorrect conclusion that the κ-casein gene was free from functional constraint (Ward, Honeycutt, and Derr 1997). These studies indicate that for some proteins the traditional approach to estimating the dN/dS ratio might not provide a sensible measure of selective pressure.
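A short calculation illustrates how averaging over sites can hide a positively selected class. Using the M3 estimates reported in table 2, the proportion-weighted mean of ω stays below 1 for both tree topologies even though one site class has ω well above 1. This is only a back-of-the-envelope sketch; a weighted mean of class ratios is not the same quantity as a pairwise dN/dS estimate or the one-ratio ML estimate.

```python
# (proportion, omega) pairs for the two M3 site classes, from table 2.
M3_ESTIMATES = {
    "tree 1": [(0.97, 0.37), (0.03, 5.66)],
    "tree 2": [(0.95, 0.35), (0.05, 5.96)],
}


def mean_omega(classes):
    """Proportion-weighted mean of omega across site classes."""
    return sum(f * w for f, w in classes)


for tree, classes in M3_ESTIMATES.items():
    largest = max(w for _, w in classes)
    print(f"{tree}: mean omega = {mean_omega(classes):.4f}, "
          f"largest class omega = {largest}")
```

In both cases the weighted mean falls between 0.5 and 0.65, so a single averaged ratio would suggest constraint and give no hint of the small class of sites evolving with ω near 6.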
Gene duplication is considered an important mechanism for functional divergence (Ohno 1970; Ohta 1993). However, the process by which duplicated genes acquire new functions is less clear. There is often an acceleration of the rate of evolution following gene duplication (Li 1985; Ohta 1993, 1994). Accelerated rates could initially be driven by positive Darwinian selection for functional divergence (Ohta 1993, 1994) or by relaxation of selective constraints. In the latter case, it is thought that random fixation of neutral changes eventually leads to a novel function in one or both copies. This model was referred to as the "Dykhuizen-Hartl effect" by Zhang, Rosenberg, and Nei (1998). Our findings are consistent with both models. The elevated rate of nonsynonymous substitution in autosomal DAZL1 following its duplication appears to result from the action of positive Darwinian selection. However, elevated rates of nonsynonymous substitution in DAZ immediately following its origin via the translocation event appear to result from decreased levels of purifying selection, suggesting a possible role for the Dykhuizen-Hartl effect in the early stages of DAZ evolution.
Our findings are consistent with several studies of gene families in which positive Darwinian selection was shown to be at least partially responsible for a rate increase following gene duplication (Ohta 1993, 1994; Zhang, Rosenberg, and Nei 1998; Duda and Palumbi 1999; Rooney and Zhang 1999; Schmidt, Goodman, and Grossman 1999). However, in the only other case in which the relative contribution of both models was investigated, the Dykhuizen-Hartl effect was ruled out (Zhang, Rosenberg, and Nei 1998). Although in this respect our findings appear to differ, it is important to point out that the method we employed to accommodate variation in selective pressures among lineages calculated ω as an average across all sites. Because an episode of positive selection at a subset of sites could elevate ω at a specific branch without causing it to exceed 1, it is not possible to completely rule out the positive-selection model. More complex models which can simultaneously accommodate rate variation among sites and lineages are under development and might be useful in distinguishing between positive selection and the Dykhuizen-Hartl effect.
Darwinian selection appears to be a relatively common feature of mammalian reproductive proteins (Karn and Nachman 1999; Rooney and Zhang 1999; Wyckoff, Wang, and Wu 2000). Our findings indicate the DAZ gene family represents another example of this pattern. However, not all lineages of the DAZ gene family are presently evolving by positive Darwinian selection; both DAZ and DAZL1 of M. fascicularis are evolving by purifying selection. With regard to the difference between humans and M. fascicularis, it is interesting to note that the mature DAZ protein of humans has only 1 processed copy of exon 8, whereas the mature DAZ protein of M. fascicularis has 10 processed copies of exon 8. The DNA sequence for DAZ in both species contains multiple copies of both exons 7 and 8. However, at some point in the human lineage, a mutation disabled the splice sites of exons 8A and 8D. Because all but one copy of exon 8 in present-day human DAZ are descended from the disabled copies of exons 8A and 8D, the mature DAZ protein of humans includes only one processed copy of exon 8 (Gromoll et al. 1999). Because the open reading frame was preserved in M. fascicularis despite several duplication and rearrangement events, multiple copies of exons 7 and 8 must have evolved functional importance during divergence of this gene family (Gromoll et al. 1999). It is tempting to speculate that in the human lineage a loss of processing of all but one copy of exon 8 initiated adaptive evolution at other sites in both DAZ and DAZL1 to maintain proper spermatogenesis. Additional sequences of DAZ and DAZL1 from a variety of primate species are needed to understand the role of positive selection in functional divergence of the DAZ gene family.
Acknowledgments
We thank Katherine A. Dunn for helpful comments and discussion. We also thank the associate editor and two anonymous reviewers for their constructive comments. This study was supported by a Biotechnology and Biological Sciences Research Council grant (31/G10434) to Z.Y.
LITERATURE CITED
AGULNIK, A. I., A. ZHARKIKH, H. BOETTGER-TONG, T. BOURGERON, K. MCELREAVEY, and C. E. BISHOP. 1998. Evolution of the DAZ gene family suggests that Y-linked DAZ plays little, or a limited, role in spermatogenesis but underlines a recent African origin for human populations. Hum. Mol. Genet. 7:1371–1377.
COOKE, H. J., M. LEE, S. KERR, and M. RUGGIU. 1996. A murine homologue of the human DAZ gene is autosomal and expressed only in male and female gonads. Hum. Mol. Genet. 5:513–516.
CRANDALL, K. A., C. R. KELSEY, H. IMAMICHI, H. C. LANE, and N. P. SALZMAN. 1999. Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection. Mol. Biol. Evol. 16:372–382.
DUDA, T. F., and S. R. PALUMBI. 1999. Molecular genetics of ecological diversification: duplication and rapid evolution of toxin genes of the venomous gastropod Conus. Proc. Natl. Acad. Sci. USA 96:6820–6823.
DYKHUIZEN, D., and D. L. HARTL. 1980. Selective neutrality of 6PGD allozymes in E. coli and the effects of genetic background. Genetics 96:801–817.
ELLIOTT, D. J., and H. J. COOKE. 1997. The molecular genetics of male infertility. Bioessays 19:801–809.
FERLIN, A., E. MORO, A. GAROLLA, and C. FORESTA. 1999. Human male infertility and Y chromosome deletions: role of the AZF-candidate genes DAZ, RBM and DFFRY. Hum. Reprod. 14:1710–1716.
GOLDMAN, N., and Z. YANG. 1994. A codon based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725–736.
GROMOLL, J., G. F. WEINBAUER, H. SKALETSKY, S. SCHLATT, M. ROCCHIETTI-MARCH, D. C. PAGE, and E. NIESCHLAG. 1999. The Old World monkey DAZ (Deleted in AZoospermia) gene yields insights into the evolution of the DAZ gene cluster on the human Y chromosome. Hum. Mol. Genet. 8:2017–2024.
KARN, R. C., and M. W. NACHMAN. 1999. Reduced nucleotide variability at an androgen-binding protein locus (Abpa) in house mice: evidence for positive natural selection. Mol. Biol. Evol. 16:1192–1197.
KIMURA, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge, England.
KUMAR, S., and B. HEDGES. 1998. A molecular timescale for vertebrate evolution. Nature 392:917–920.
LEIGH BROWN, A. J. 1997. Analysis of HIV-1 env gene reveals evidence for a low effective number in the viral population. Proc. Natl. Acad. Sci. USA 94:1862–1865.
LI, W.-H. 1985. Accelerated evolution following gene duplication and its implications for the neutralist-selectionist controversy. Pp. 333–352 in T. OHTA and K. AOKI, eds. Population genetics and molecular evolution. Japan Scientific Press, Tokyo.
MENKE, D. B., G. L. MUTTER, and D. C. PAGE. 1997. Expression of DAZ, an azoospermia factor candidate, in human spermatogonia. Am. J. Hum. Genet. 60:237–241.
NIELSEN, R., and Z. YANG. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936.
OHNO, S. 1970. Evolution by gene duplication. Springer-Verlag, Berlin.
OHTA, T. 1993. Pattern of nucleotide substitution in growth hormone-prolactin gene family: a paradigm for evolution by gene duplication. Genetics 134:1271–1276.
OHTA, T. 1994. Further examples of evolution by gene duplication revealed through DNA sequence comparisons. Genetics 138:1331–1337.
ROONEY, A. P., and J. ZHANG. 1999. Rapid evolution of primate sperm protein: relaxation of functional constraint or positive Darwinian selection? Mol. Biol. Evol. 16:706–710.
RUGGIU, M., R. SPEED, M. TAGGART, S. J. MCKAY, F. KILANOWSKI, P. SAUNDERS, J. DORIN, and H. J. COOKE. 1997. The mouse DAZL1a gene encodes a cytoplasmic protein essential for gametogenesis. Nature 389:73–77.
SAXENA, R., L. G. BROWN, T. HAWKINS et al. (11 co-authors). 1996. The DAZ gene cluster on the human Y chromosome arose from an autosomal gene that was transposed, repeatedly amplified and pruned. Nat. Genet. 14:292–299.
SAXENA, R., J. W. A. DE VRIES, S. REPPING, R. K. ALAGAPPAN, H. SKALETSKY, L. G. BROWN, P. MA, E. CHEN, J. M. N. HOOVERS, and D. C. PAGE. 2000. Four DAZ genes in two clusters found in the AZFc region of the human Y chromosome. Genomics 67:256–267.
SCHMIDT, T. R., M. GOODMAN, and L. I. GROSSMAN. 1999. Molecular evolution of the COX7A gene family in primates. Mol. Biol. Evol. 16:619–626.
SHINKA, T., and Y. NAKAHORI. 1996. The azoospermic factor on the Y chromosome. Acta Paediatr. Jpn. 38:399–404.
SLEE, R., B. GRIMES, R. M. SPEED, M. TAGGART, S. M. MAGUIRE, A. ROSS, N. I. MCGILL, P. T. SAUNDERS, and H. J. COOKE. 1999. A human DAZ transgene confers partial rescue of the mouse Dazl null phenotype. Proc. Natl. Acad. Sci. USA 96:8040–8045.
SWOFFORD, D. L. 2000. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer, Sunderland, Mass.
WARD, T. J., R. L. HONEYCUTT, and J. N. DERR. 1997. Nucleotide sequence evolution at the kappa-casein locus: evidence for positive selection within the family Bovidae. Genetics 147:1863–1872.
WATANABE, Y., S. SHINOZAKI-YABANA, Y. CHIKASHIGE, Y. HIRAOKA, and M. YAMAMOTO. 1997. Phosphorylation of RNA-binding protein controls cell cycle switch from mitotic to meiotic in fission yeast. Nature 386:187–190.
WYCKOFF, G. J., W. WANG, and C.-I. WU. 2000. Rapid evolution of male reproductive genes in the descent of man. Nature 403:304–309.
YANG, Z. 1994a. Estimating the pattern of nucleotide substitution. J. Mol. Evol. 39:105–111.
YANG, Z. 1994b. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39:306–314.
YANG, Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15:568–573.
YANG, Z. 1999. Phylogenetic analysis by maximum likelihood (PAML). Version 2. University College London, England.
YANG, Z., and R. NIELSEN. 1998. Synonymous and nonsynonymous rate variation in nuclear genes of mammals. J. Mol. Evol. 46:409–418.
YANG, Z., and R. NIELSEN. 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 17:32–43.
YANG, Z., R. NIELSEN, N. GOLDMAN, and A.-M. K. PEDERSEN. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.
ZANOTTO, P. M. DE A., E. G. KALLAS, R. F. DE SOUZA, and E. C. HOLMES. 1999. Genealogical evidence for positive selection in the nef gene of HIV-1. Genetics 153:1077–1089.
ZHANG, J., H. F. ROSENBERG, and M. NEI. 1998. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc. Natl. Acad. Sci. USA 95:3708–3713.
EDWARD HOLMES, reviewing editor
Accepted November 20, 2000