Common name             Species                  Reference
Atlantic cod            Gadus morhua             Star et al. (2011)
Chinese alligator       Alligator sinensis       Star et al. (2011)
Galapagos tortoise      Chelonoidis nigra        Loire et al. (2013)
Saker falcon            Falco cherrug            Zhan et al. (2013)
Puerto Rican parrot     Amazona vittata          Oleksyk et al. (2012)
Tasmanian devil         Sarcophilus harrisii
Chimpanzee              Pan troglodytes          Mikkelsen et al. (2005)
Bonobo                  Pan paniscus             Prüfer et al. (2012)
Gorilla                 Gorilla gorilla          Scally et al. (2012)
Sumatran orang-utan     Pongo abelii             Locke et al. (2011)
Bornean orang-utan      Pongo pygmaeus           Locke et al. (2011)
Giant panda             Ailuropoda melanoleuca   Li et al. (2009)
Polar bear              Ursus maritimus
Tiger                   Panthera tigris          Cho et al. (2013)
Tibetan antelope        Pantholops hodgsonii     Ge et al. (2013)
Yangtze River dolphin   Lipotes vexillifer       Zhou et al. (2013)
2 Common Applications of Molecular Markers in Conservation Genetics
The first and arguably most important step in genetic management is to define objectives clearly, including what to manage and how. This multifaceted process ideally includes a thorough understanding of the ecology and the evolutionary and population history of the population or species, together with a clear consensus on the management goals. The determination of “what to manage” can be based on criteria such as morphological and behavioral traits, geographic distribution, molecular genetic variables, uniqueness, ecology, economic concerns, cultural importance, population structure and size, and probability of extinction. Here we focus on direct measures of genetic variation and briefly summarize how genetic markers and next-generation technologies can provide insights that inform and assist the conservation management of genetic resources.
Conservation genetics has historically focused on observed differences (or levels of genetic divergence) among individuals and populations, subspecies, or breeds of domestic animals as might be quantified by (1) estimating mean observed and expected heterozygosity averaged over the typed loci, (2) the average number of alleles, or (3) allelic richness (Luikart and Cornuet 1998). When groups of individuals do not have a large number of fixed differences (for example populations that are recently isolated or relatively inbred), molecular differentiation is estimated by differences in allele frequencies among populations as specific differences will be rare and most common alleles will be shared across groups (e.g. MacHugh et al. 1998; Balloux and Lugon-Moulin 2002; Laval et al. 2002). However, sometimes demographic events will be as or more important than time, as genetic drift and shifts in allele frequencies will become more significant through incidences of inbreeding, genetic bottlenecks, and/or increasing amounts of admixture (Lande 1988).
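As a minimal sketch of the first two summary statistics listed above, the following Python function computes observed heterozygosity, expected heterozygosity, and the number of alleles for a single diploid locus. The function name and input layout are illustrative assumptions, not taken from any cited program; the expected heterozygosity shown is the simple uncorrected form (1 minus the sum of squared allele frequencies), without a small-sample correction, and in practice the values would be averaged over all typed loci.

```python
from collections import Counter

def locus_summary(genotypes):
    """Summarize one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual
    (a hypothetical input layout chosen for this sketch).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n
    # Allele counts over all 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity under Hardy-Weinberg proportions (uncorrected).
    h_exp = 1.0 - sum(p * p for p in freqs)
    return {"n_alleles": len(counts), "h_obs": h_obs, "h_exp": h_exp}
```

A marked deficit of observed relative to expected heterozygosity at many loci is one of the signals that the inbreeding measures discussed later in the chapter build upon.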
Fixed genetic differences are used as the basis to classify and identify species, especially those that are difficult to distinguish from only morphological features (e.g. in cryptic species). For example, DNA barcoding, which uses sequence variation from standardized regions of the mitochondrial DNA, has become a widely used system for cataloging animal biodiversity and has been instrumental in the discovery of several new species. Barcoding is being organized globally through several international initiatives, including the International Barcode of Life project and the Consortium for the Barcode of Life (CBOL) and DNA barcode databases such as the Barcode of Life Data Systems and the International Nucleotide Sequence Database Collaboration.
When subspecies, populations, or breeds are difficult to define based on geographic location, morphological, ecological or evolutionary criteria (Waples and Gaggiotti 2006), reliance on molecular genetic criteria becomes even more crucial. It is also often advantageous to identify groups without prior information on their genetic structure (e.g. without preassigned population or subspecific assignment), and to identify individuals with genetic heritage from more than one of these groups. Multi-locus clustering analyses, such as those employed in the program STRUCTURE (Pritchard et al. 2000), use multi-locus genotypes and specific ancestry models to estimate the fraction of the genome of each individual that belongs to each cluster. They can also be used to assign ‘unknown’ individuals to populations (Manel et al. 2003; Paetkau et al. 2004; Stella et al. 2008; Toro et al. 2009) and are especially useful when natural barriers to gene flow are not obvious, such as with marine species (Primmer 2009). For example, the management of commercial fisheries such as Atlantic salmon (Griffiths et al. 2010) and lake sturgeon (Bott et al. 2009) has utilized genetic analyses to facilitate the identification of source populations and thus help avoid overexploitation. Similar approaches have been used for monitoring the source of animal products being sold in markets (Baker 2008; Chapman et al. 2009; Kochzius et al. 2010) or distinguishing among similar-looking species, either as adults or only during specific stages of development (Kon et al. 2007; Ogden 2008).
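The idea of assigning an ‘unknown’ individual to a source population can be illustrated with a simple frequency-based likelihood calculation. The sketch below is not the STRUCTURE algorithm, which fits a Bayesian ancestry model; it assumes instead that reference allele frequencies are already known for each candidate population, that loci are independent, and that genotypes follow Hardy-Weinberg proportions. The function names, the data layout, and the small frequency floor used for alleles absent from a reference sample are all assumptions made for illustration.

```python
import math

def log_likelihood(genotype, allele_freqs, floor=0.01):
    """Log-likelihood of a multi-locus genotype in one reference population.

    genotype: list of (allele_a, allele_b) tuples, one per locus.
    allele_freqs: list of dicts (allele -> frequency), one per locus.
    floor: assumed minimum frequency for alleles unseen in the reference sample.
    """
    total = 0.0
    for (a, b), freqs in zip(genotype, allele_freqs):
        pa = max(freqs.get(a, 0.0), floor)
        pb = max(freqs.get(b, 0.0), floor)
        # Hardy-Weinberg genotype probability: p^2 (homozygote) or 2pq (heterozygote).
        prob = pa * pa if a == b else 2.0 * pa * pb
        total += math.log(prob)
    return total

def assign_individual(genotype, populations):
    """Return the best-supported population and the score of every candidate.

    populations: dict mapping population name -> per-locus allele frequencies.
    """
    scores = {name: log_likelihood(genotype, freqs)
              for name, freqs in populations.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

In use, the difference between the two highest scores gives a rough indication of how decisive the assignment is; an individual of mixed ancestry would typically score similarly in more than one population.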
These methods are also very effective for identifying animals that have escaped from captivity into the wild, confirming the origin of these escapes, and establishing the extent of introgression into the wild population. For example, Kidd et al. (2009) used a panel of microsatellites and admixture analyses to document hybridization between wild mink and domestic mink (Neovison vison) that had escaped from farms, a process that alters the evolutionary integrity of the wild populations. As another example, genetic markers were used to document genetic introgression in the Florida panther (Puma concolor coryi) from released individuals of Central American origin and from a captive group of pumas from a small animal exhibit, in addition to the intentional release of pumas from Texas (Johnson et al. 2010).
Once genetic groups have been defined, the classic Wright’s F-statistic is commonly used to partition genetic variation into a within-subpopulation component (average subpopulation inbreeding coefficient FIS) and a between-subpopulation component (fixation index FST), depending upon the genetic markers used, their mutation rates, and the sampling scheme (see e.g. Cockerham and Weir 1984; Holsinger and Weir 2009). Although these estimates should not be strictly compared with each other or among studies, FST values between 0.05 and 0.3 are typical among populations or breeds, with values above 0.15 often interpreted as evidence of significant differentiation (Frankham et al. 2002). For comparisons among populations with sequence data, analysis of molecular variance (AMOVA) (Excoffier et al. 1992) is often the most appropriate method.
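The partition described above can be written compactly in terms of heterozygosities. The notation below is a common textbook formulation and is an assumption here rather than something defined in this chapter: H_I is the mean observed heterozygosity of individuals within subpopulations, H_S the mean expected heterozygosity within subpopulations, and H_T the expected heterozygosity of the pooled total population.

```latex
\begin{align}
F_{IS} &= 1 - \frac{H_I}{H_S} \\
F_{ST} &= 1 - \frac{H_S}{H_T} \\
F_{IT} &= 1 - \frac{H_I}{H_T} \\
(1 - F_{IT}) &= (1 - F_{IS})\,(1 - F_{ST})
\end{align}
```

As a purely illustrative calculation with made-up values H_I = 0.27, H_S = 0.30 and H_T = 0.40, one obtains FIS = 0.10, FST = 0.25 and FIT = 0.325, and the last identity holds because 0.90 × 0.75 = 0.675 = 1 − 0.325.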
3 Inbreeding, Relatedness, Effective Population Size, and Gene Flow
An assessment of inbreeding is one of the primary concerns of all efforts to conserve genetic diversity. Depending upon the type of available data, inbreeding is estimated from pedigrees or from molecular data with the Wright FIS inbreeding coefficient (Frankham et al. 2002). With high-density genome-wide SNP or sequence data, comparatively long stretches of homozygosity are a sign of inbreeding (McQuillan et al. 2008; Kirin et al. 2010). Inbreeding is inversely correlated with effective population size (Ne), or the number of individuals for which random breeding in an ideal population would generate the same dispersion of allele frequencies or amount of inbreeding as that observed in the real population (Wang and Whitlock 2003; Charlesworth and Willis 2009). Ne is extensively used as a criterion for determining the risk status of populations and is invariably much less than the census population size. Ne is also correlated with the degree of relatedness among individuals and, to some extent, populations, and is best estimated using a large number of genetic markers (Oliehoek et al. 2006; Toro et al. 2009). For example, Tapio et al. (2010) used this approach to estimate relatedness among non-pedigreed cryo-banked Yakutian cattle bulls, a breed of cattle native to Siberia with only ~1,200 purebred individuals remaining, and showed that these cryo-banked samples harbored unique allelic variation of potential use for enhancing the genetic diversity of the remaining purebred population.
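The use of long homozygous stretches as a sign of inbreeding can be sketched as a scan along one chromosome. The Python below is a deliberately simplified illustration: it treats any heterozygous call as ending a run, whereas dedicated tools generally tolerate a few heterozygous or missing calls per window. The function names and the default thresholds (1 Mb and 50 SNPs) are assumptions chosen for the example, not values from the cited studies.

```python
def runs_of_homozygosity(positions, is_het, min_length_bp=1_000_000, min_snps=50):
    """Find runs of consecutive homozygous SNPs on one chromosome.

    positions: sorted base-pair coordinates of the typed SNPs.
    is_het: parallel sequence of booleans, True where the individual is heterozygous.
    Returns a list of (start_bp, end_bp) tuples for runs passing both thresholds.
    """
    runs = []
    start = None  # index of the first SNP in the run currently being extended
    # A sentinel heterozygous call closes any run still open at the chromosome end.
    for i, het in enumerate(list(is_het) + [True]):
        if not het and start is None:
            start = i
        elif het and start is not None:
            first, last = positions[start], positions[i - 1]
            if last - first >= min_length_bp and i - start >= min_snps:
                runs.append((first, last))
            start = None
    return runs

def f_roh(runs, covered_length_bp):
    """Fraction of the covered genome that lies in runs of homozygosity."""
    return sum(end - begin for begin, end in runs) / covered_length_bp
```

The resulting proportion is one simple genomic measure of individual inbreeding; because recombination breaks runs down over generations, longer runs are generally taken to reflect more recent common ancestry of the two parental haplotypes.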
Past population dynamics, such as population expansions and bottlenecks, can also be inferred from patterns of genetic variation. Most recently, these approaches have included the analysis of individual whole-genome sequences with a pairwise sequentially Markovian coalescent model (PSMC, Li and Durbin 2011) or with the program BEAST using a Bayesian coestimation of time to most recent common ancestor, evolutionary rates, and past population dynamics (Drummond et al. 2005). The latter approach demonstrated, for example, that the domesticated water buffalo, yak, gayal, and bovine recently experienced a rapid population increase that was not observed in the wild African buffalo (Finlay et al. 2007).
Following the identification of populations, conservation units, species or other subdivisions, managers are often most interested in estimating levels of gene flow. Gene flow may be advantageous, for example in efforts to increase Ne and reduce the effects of inbreeding, but can be problematic if it leads to hybridization and undesired admixture or introgression. Gene flow or introgression can be detected from discordant results between autosomal and sex-linked markers, or from the clustering analyses described above, as implemented in STRUCTURE (Pritchard et al. 2000).
Because introgression or hybridization can be a major threat to conservation in the form of outbreeding depression (e.g., Templeton 1986), early detection is fundamental for effective conservation strategies. For example, statistical analyses of genetic data from reintroduced Arabian oryx (Oryx leucoryx) demonstrated that outbreeding depression was affecting juvenile survival (Marshall and Spalton 2000). Molecular marker data have also demonstrated gene flow from wild populations to domesticated animals, for instance, from jungle fowl to domesticated populations of Vietnamese chicken (Berthouly et al. 2009). There are also an increasing number of examples of natural introgression and hybridization, for example among California tiger salamanders (Fitzpatrick et al. 2009), between American bison and domestic cattle (Halbert and Derr 2007), among Darwin’s finches (Grant and Grant 2010), between African forest and savanna elephants (Roca et al. 2005), and between brown and polar bears (Miller et al. 2012). See Box 5.1.
Box 5.1 Genomics and Population Genomics of the Giant Panda
The giant panda (Ailuropoda melanoleuca) is an endangered ursid found in mountain habitats across several provinces of western China. Unique among the bear family, giant pandas are almost entirely herbivorous, feeding almost exclusively on bamboo. The population size of the species is estimated to be between 2,500 and 3,000 individuals based on molecular genetic analyses (Zhan et al. 2006). With a low rate of fecundity combined with loss of habitat, the giant panda faces a precarious future. As a result, the conservation management of the giant panda in both the wild and in captivity has received much attention.
In 2010, the genome of the giant panda became the first mammalian genome to be sequenced and assembled de novo (Li et al. 2010). Genome size was estimated to be 2.4 gigabases and in comparison with the genomes of the dog and human, it was found that the giant panda had a relatively low rate of divergence. Remarkably, however, the giant panda genome was found to have a high number (2.7 million) of heterozygous SNPs, with a rate of heterozygosity almost twice that found in the human genome. This finding confirmed earlier molecular genetic studies based on many fewer markers that suggested that giant pandas still retain a relatively high level of genetic diversity and little evidence of inbreeding (Zhang et al. 2007). Given the low rate of fecundity, the genome was also used to identify many genes involved in reproduction and gonad development.
Once a reference genome of a species has been generated, additional individuals from different populations can be sequenced at lower coverage and mapped against the reference, thereby providing a population genomic perspective of genetic diversity and historical demography. This was done for the giant panda by Zhou et al. (2013), who sequenced 34 pandas at ~4.7-fold coverage from the three main areas where this species is found in western China. Although only two subspecies of giant panda have been recognized, analyses of genetic structure resolved giant pandas into three genetic clusters, with the isolated population in the Qinling Mountains being the most distinct and estimated to have diverged about 300,000 years ago. The other subspecies was resolved into two genetic populations that diverged more recently, about 2,700 years ago. Analysis of historical demography suggested that giant pandas underwent two population bottlenecks and two population expansions. Most interestingly, however, analyses of genome-wide SNP diversity made it possible to distinguish signatures of local adaptation from neutral diversity by locating genes that showed evidence of directional selection among the three populations. Population genomic studies such as this are of great interest to conservationists because genes of adaptive significance within species can be identified, quantified, and used to prioritize populations of conservation importance, as the neutral and adaptive components of genomic diversity may not always be correlated (Bonin et al. 2006, 2007).
4 Adaptation and Selection
Conservation strategies that focus on genetic variation are increasingly also interested in identifying genotypes associated with advantageous traits or phenotypes and in preserving adaptive variation (or in identifying maladaptive, deleterious variants associated with disadvantageous phenotypes). Fundamentally, this involves distinguishing positive or negative selection from neutral variation that is the product of genetic drift (Joost et al. 2007; Novembre and Di Rienzo 2009), or distinguishing between events that affect only a specific region of the genome (selection) versus the entire genome (drift).
A traditional method to identify positive selection is to compare allele frequencies of different populations with markers near genes of interest. With the increased availability of variable genetic markers from larger numbers of unrelated individuals (e.g. 30–50) from contrasting groups and available software packages such as Bayescan (Foll and Gaggiotti 2008), Lositan (Antao et al. 2008) and Mcheza (Antao and Beaumont 2011), FST values that differ significantly from the rest of the genome are used to identify selection: high values suggest positive or negative selection and low values suggest balancing selection (Slatkin 2008). These approaches were used to detect candidate loci for adaptation along a gradient of altitude in the common frog (Rana temporaria). Other methods that take advantage of genomic sequence data and analyses of mutational patterns in extended regions of linkage disequilibrium (LD) are also becoming increasingly accessible (Prasad et al. 2008; Joost et al. 2007; Rubin et al. 2010).
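The logic of an FST-based genome scan can be sketched as follows. This is not the method implemented in Bayescan, Lositan or Mcheza, which rely on model-based or simulation-based null distributions; it is a crude empirical-tail version that assumes biallelic loci, two populations of equal weight, and known allele frequencies. Function names and the 5 % tail are assumptions made for illustration.

```python
def two_population_fst(p1, p2):
    """FST at one biallelic locus from the allele frequency in each population."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0.0:
        return 0.0                             # monomorphic locus: no differentiation
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population
    return (h_t - h_s) / h_t

def flag_high_fst_loci(freq_pairs, top_fraction=0.05):
    """Return indices of the loci in the upper tail of the FST distribution.

    freq_pairs: list of (p1, p2) allele frequencies, one pair per locus.
    top_fraction: assumed share of loci to report as candidates.
    """
    fst = [two_population_fst(p1, p2) for p1, p2 in freq_pairs]
    n_flag = max(1, round(len(fst) * top_fraction))
    ranked = sorted(range(len(fst)), key=lambda i: fst[i], reverse=True)
    return sorted(ranked[:n_flag]), fst
```

Because a fixed fraction of loci is always reported, the output should be read only as a list of candidates for follow-up; demographic history alone can generate high-FST loci, which is the reason the dedicated programs compare observed values against an explicit neutral expectation.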
5 Molecular Markers in Conservation and Reproductive Sciences
Molecular genetic techniques are steadily becoming integrated into the reproductive sciences. Traditionally, reproductive management has relied upon phenotypic variation or measurable inherited characteristics for assisted selection or breeding for desired traits, starting with the earliest efforts of animal domestication. The use of genetic markers has largely been pioneered in agricultural species and model organisms such as mice, chicken, cattle, pig, horse, and dog. However, with the increased accessibility of genomic resources for non-model organisms and through comparative research, the opportunities and feasibility for integrating and enhancing reproductive technologies with genomic approaches are increasing rapidly.
5.1 Molecular Marker-Assisted Selection
Marker-assisted selection (MAS) is one of the most promising tools for linking reproductive techniques and genetics. MAS requires a genetic marker linked to a gene or genomic region containing quantitative trait loci. These markers can be used both to identify quantitative traits and to select mating pairs. Marker-assisted mate selection is especially useful for increasing the efficiency of genetic improvement when a phenotype is difficult to screen (e.g. resistance to an infectious disease), is expressed late in life (e.g. late-onset diseases), or is expressed in only one sex.
To date, MAS has mostly been implemented in large-scale agriculture systems in developed countries (e.g. Dekkers 2004), but it may become an effective technique for group or population management of wild and semi-wild populations, for example to estimate genomic breeding values in gregarious species such as birds, fish, and ungulates, or in species where following pedigrees is impractical. One of the best-described early cases of MAS involved genetic resistance to transmissible spongiform encephalopathy in sheep, where polymorphisms in the PrP gene were linked with susceptibility to scrapie, and which led to breeding programs for disease-resistant sheep in the European Union (Hunter et al. 1996). In fish, molecular markers have been used to study and identify individuals with resistance to infectious pancreatic necrosis and infectious salmon anemia in Atlantic salmon (Moen et al. 2009; Jieying Li et al. 2011). In addition, markers have been developed in fish for traits including growth, spawning time, sex determination, abiotic stress tolerance, and disease resistance (Loukovitis et al. 2011). Improved methods to isolate, sequence and interpret differences among pathogens will increase the power of diagnostics, the efficacy of treatments, and the ability to monitor wildlife diseases such as rabies, distemper, bluetongue disease, avian influenza, foot-and-mouth disease, and CSV, and to link these to variation and outcomes in affected individuals (Hoffmann et al. 2009). See Box 5.2.
Box 5.2 The Tasmanian Devil
The Tasmanian devil (Sarcophilus harrisii) is the largest carnivorous marsupial in the world and an island endemic. Populations of the species are declining due to various threats such as disease epidemics and loss of habitat. The greatest threat facing Tasmanian devils, however, is a highly infectious and transmissible cancer known as Devil Facial Tumor Disease (DFTD). This clonal cancer is transmitted through natural physical contact among Tasmanian devils (biting), and individuals that contract the cancer develop infections within months and suffer a 100% mortality rate. Without intervention to stop the spread of DFTD, Tasmanian devils may go extinct (McCallum et al. 2007).
Two independent research groups sequenced the genome of the Tasmanian devil and its DFTD cancer (Miller et al. 2011; Murchison et al. 2012). Among the major findings was that genome-wide genetic diversity is quite low in devils, but that population genetic substructure exists across the range of the species, providing useful information that can be applied in captive breeding of healthy individuals not yet exposed to DFTD. The genome of the cancer revealed a large set of unique SNPs and copy-number variants, along with chromosomal rearrangements, that together suggest a distinct mutational process shaping the DFTD genome. Moreover, examination of protein-coding genes revealed 138 amino acid variants found only in the tumor genome compared to the normal genome of the host. The studies of the Tasmanian devil and DFTD genomes, which also include assessing genetic diversity within the species, provide a particularly strong example of the multifaceted applications of genomic data (Ryder 2005).
The implementation of genetic improvement using molecular tools in the conservation context has largely focused on highly managed captive populations, including the avoidance of inbreeding, selection against maladaptive traits, increasing genetic variation in highly inbred populations, and maintaining “pure” individuals and populations that represent recognized subspecies and species. For example, more tigers live in captivity than in the wild, and the captive population holds genetic variation in pure and hybrid subspecies that has not been documented in the wild (Luo et al. 2008). Marker-assisted introgression might also be an efficient method of introducing desirable traits, such as disease resistance, into wild populations. For example, over 200 amphibian species worldwide are declining from a fungal skin disease caused by Batrachochytrium dendrobatidis (Berger et al. 1998; Lips et al. 2006). However, since the response to the disease varies both within (Tobler and Schmidt 2010; Kriger and Hero 2006) and among (Stuart et al. 2004; Woodhams et al. 2007) species, the identification of resistance genes or gene markers would provide management options for improving amphibian conservation strategies.
However, because we know only a very few of the many possible loci that are critical to individual fitness and population viability, and are not likely to be able to predict what genetic variation will be important for adaptation, survival and overall fitness in the future, it is prudent to act with caution. Variation that is advantageous in one environment (e.g. a captive setting) might well be linked with adaptations that are deleterious in some wild conditions, and vice versa. Therefore, if we were to model conservation programs on the basis of only the few loci about which we have some knowledge, this would quite likely affect genetic variability in unknown ways at other important loci (Hedrick 2001; Lacy 2000).
6 Genomics and Advancing Reproductive Sciences
6.1 Genetic Management and Reproductive Technologies
Reproductive technologies are an important and increasingly relied-upon management tool for both wild and captive populations. These include assisted reproduction techniques (artificial insemination, in vitro methods, etc.), which have been used to enhance gene flow between isolated wild populations, between captive and wild populations, and between different institutions, and to ensure the genetic representation of individuals that otherwise would not breed naturally (Pukazhenthi and Wildt 2004; Comizzoli et al. 2009; Wildt et al. 2010). Among the growing number of examples of captive populations that have had important roles in augmenting or establishing wild populations are Puerto Rican parrots (Brock and White 1992), California condors (Geyer et al. 1993), Micronesian kingfishers (Haig et al. 1995), whooping cranes (Jones et al. 2002), primate species [e.g., lion-tailed macaques (Morin and Ryder 1991), bonobos (Reinartz and Boese 1977)], black-footed ferrets (Cain et al. 2011), and Iberian lynx (Vargas et al. 2008; Gañán et al. 2010).
In addition to avoiding the loss of genetic variation, the major concerns in highly managed populations are the risk of genetic drift, which can result in the loss of alleles, and the risk of adaptation to non-natural conditions. With captive animals, this includes attempting to prevent or mitigate adaptation to conditions of captivity (i.e. they must retain a certain degree of wildness) in addition to preventing the loss of overall genetic diversity. However, we have almost no understanding of specific levels of genetic variation or specific genotypes that are associated with survival in the wild, especially in changing environments. Therefore, the preservation of the most genetic variation possible or of equal representation of founder stock has been the default goal of genetic captive management plans in most cases.
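The effect of drift on a small managed population can be made concrete with the standard population-genetic expectation that heterozygosity declines by a factor of 1 − 1/(2Ne) per generation in an idealized, closed, randomly mating population. The short Python sketch below applies that textbook relationship; it is offered as an illustration rather than as a description of any particular management plan, and the function name is hypothetical.

```python
def heterozygosity_retained(ne, generations):
    """Expected fraction of initial heterozygosity remaining after drift.

    ne: effective population size (assumed constant over time).
    generations: number of discrete generations elapsed.
    Assumes an idealized closed population with no mutation, migration or selection.
    """
    if ne <= 0:
        raise ValueError("effective population size must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - 1.0 / (2.0 * ne)) ** generations
```

Under these assumptions, a population held at Ne = 10 would be expected to retain roughly 60 % of its initial heterozygosity after 10 generations, whereas one held at Ne = 50 would retain about 90 %, which illustrates why maximizing founder representation and effective size is a common default goal.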
Increasingly, genetic potential for future generations is being promoted through the establishment of germplasm banks and viably frozen cell lines and through testing of advanced assisted reproductive techniques. For example, cryobanking of germplasm is being used in almost all livestock species (Mazur et al. 2008) and increasingly in wildlife species as well (Comizzoli et al. 2009; Swanson et al. 2007). However, the implementation of these tools has been slow, in part because of the lack of fundamental knowledge of the complex reproductive biologies of these species (Andrabi and Maxwell 2007).
One of the most fertile areas for genomics and conservation is the development of tools and approaches that will facilitate the discovery of mechanisms of important life history and adaptive traits in populations and species. This is occurring most rapidly in model organisms and closely related species. For example, genetic markers and genes of complex traits associated with growth rate, milk production, and disease are being identified in many domestic animals (Fan et al. 2010). Similarly, comparative genomic techniques are being used to identify candidate genes involved in life history traits, development, and behavior in fish species such as the Atlantic salmon (Li et al. 2011; Miller et al. 2011; Sarropoulou and Fernandes 2011) and bluefin tuna (Nakamura et al. 2013). Increased efforts are needed to develop assemblies and sequences from non-model organisms (Ekblom and Galindo 2011), specifically with the goal of elucidating the genomic underpinning of important evolutionary traits.
6.2 Genomics and Insights on Functional and Adaptive Variation of Reproductive Traits
At the cellular level, animal reproduction, and thus fitness and survival, is fundamentally tied to the sperm and egg and to producing offspring that in turn successfully propagate. Therefore, genomic techniques will probably have their most significant and fundamental influence on conservation by contributing to our understanding of reproductive biology across the wide diversity of plants and animals. These genome-level approaches will include proteomic and transcriptomic methods to enhance our understanding of reproductive physiology and the evolutionary mechanisms involved in reproductive isolation, gamete incompatibility, and associated pathologies.
One of the most powerful approaches will be to leverage the power of comparative genomics, or the study of patterns of variation across a range of individuals and/or organisms. These comparative methods allow insights into large-scale genomic re-arrangements, the conservation of functional elements and the tracking of evolutionary phylogenies through the examination of both closely and distantly related species. As an example, the characterization of marsupial genomes is providing insights on the shared and unique evolutionary history of reproductive genes in marsupials and eutherians, including the identification of highly modified reproductive genes, mammary gland-specific genes, and genes likely associated with other unique reproductive traits including long embryonic diapause (Frankenberg et al. 2011; Renfree et al. 2011; Pharo et al. 2012). Across diverse groups, especially groups like the carnivores with well-described model organisms (e.g. the domestic cat and dog; Table 5.1), comparing and contrasting reproductive patterns will be especially informative (Amstislavsky et al. 2012).
The process of comparative genomics is iterative, because once candidate genes are identified in one species they can be tested in others. For example, variable markers from 14 candidate genes, some shared among diverse species from fly to human, have helped lead to the identification of genes associated with female and male fertility rates (Li et al. 2012). Correlating conserved and divergent phenotypes with their corresponding genetic patterns, including differences among rapidly and slowly evolving genes, gene loss, the number of gene copies, and the number of intact functional genes in gene families, will then provide hypotheses for formal testing. When combined or followed up with analyses of proteomic data, this approach will also provide hypotheses for interactions among proteins, such as those involved in sperm-egg interactions.
Comparative genomic methods take advantage of the multiple mechanisms by which species maximize adaptive potential under diverse evolutionary scenarios. For example, among vertebrates there is a wide range of degrees of reproductive isolation, with some species diverging rapidly and developing strong mechanisms of reproductive isolation (e.g. hybrid infertility) compared with other groups, such as parthenogenetic lizards, where hybridization may be a common recurrent mechanism for maintaining evolutionary potential and mitigating the effects of inbreeding (Fujita and Moritz 2009). Other areas of comparative genomic research, such as comparisons between normal and diseased tissues, will also provide a synergistic approach that will assist in the management of inherited diseases through improved diagnosis and therapies (e.g. in horses as in Rosnaha et al. 2010). Finally, comparative genomic techniques and the application of metagenomic technologies and approaches to the study of whole “ecosystems” or biomes, such as the NIH Human Microbiome project, will also provide insights on the range of functional and abnormal systems and the role of microbiota in diverse settings such as in reproductive systems (Aagaard et al. 2012).
6.3 Sex Determination
Among vertebrates, gonadal development at the cellular level is conserved. However, the embryonic gonad is the only organ that is capable of producing two unique and complex adult organs, as it can produce either the testis or the ovary through two distinct pathways. In mammals and birds, chromosomal sex determination is virtually universal, but in other groups sex is determined or strongly influenced by environmental factors, such as temperature, hormones, and a variety of chemicals (Parma and Radi 2012; Ungewitter and Yao 2013). These differences have large evolutionary repercussions. For example, in fish species where population sex ratios are controlled by inherited, environmental, and biochemical elements, population dynamics and selection patterns can vary greatly both temporally and geographically (Piferrer et al. 2012).
In most species, sex determination is closely tied with the equally sophisticated processes of sperm and egg production, whether occurring in the fetus or adult. Genomic methods are beginning to elucidate many of the steps involved in these processes, largely by documenting the genes that are expressed in reproductive tissues and linking these patterns with genetic variation. These approaches have helped determine that, at the molecular level, vertebrate testis-specific genes generally evolve more rapidly, and thus are more diverged, than ovary genes. In turn, reproductive genes appear to evolve significantly faster than non-reproductive genes. However, functional orthologs of reproductive genes have thus far shown similar rates of evolutionary divergence across all vertebrate orders (Grassa and Kulathinal 2011).
6.4 Spermatogenesis, Oogenesis, and Fertilization
Spermatogenesis varies among species, but occurs in a series of complex steps involving hundreds of genes that are functionally active at specific times in specific tissues during development (Chocu et al. 2012). In mammals, sperm cells start forming during embryonic development and the pool of sperm stem cells is established shortly after birth (Govindaraju et al. 2012). Although many of these processes occur within the testis, they also include post-gonadal modifications controlled by genetic variation that influence sperm motility, intrauterine interactions with the female, sperm capacitation, egg binding, and sperm penetration, which in aggregate determine levels of male fertility. Because a successful sperm also interacts with a wide variety of environments and must match a specific female genotype, individual male success also depends on maintaining a certain level of genetic and phenotypic variation while preserving many conserved functional motifs.
In aggregate, this complexity ensures that male fertility (and infertility, a common concern of conservation genetics) is multigenic, and that normal function can be altered in numerous ways. Comparative genomic techniques to elucidate differences among normal and abnormal spermatozoa and the associated metabolic and signaling pathways promise to improve our understanding of these fundamental processes and to provide biomarkers to assist managers and scientists in predicting the probability of successful fertilization. Our understanding of male reproductive biology is increasing at a more rapid pace through new methods such as single-cell (single-sperm) sequencing and through the growing number of Y-chromosomes that are being sequenced. Traditionally, Y-chromosomes have not been completely sequenced because their highly repetitive genomic architecture can be difficult to interpret (Hughes and Rozen 2012).
In contrast with sperm, the structure and contents of the egg have been relatively conserved across vertebrates for millions of years, and these features are the main factors impacting successful zygotic growth. However, there are specific details, especially those related to sperm-egg interactions, that tend to be very species-specific and more rapidly evolving (i.e. less conserved) (Claw and Swanson 2012). For example, the rapid evolution of the egg’s extracellular barriers suggests that this is an important evolutionary feature and mechanism for ensuring species-specificity and the establishment of pre-zygotic barriers (Swanson et al. 2001; Swanson and Vacquier 2002).