

Bioinformatics Advance Access originally published online on April 15, 2008
Bioinformatics 2008 24(11):1408-1409; doi:10.1093/bioinformatics/btn179

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Forward-time simulations of non-random mating populations using simuPOP

Bo Peng * and Christopher I. Amos

Department of Epidemiology, The University of Texas, M. D. Anderson Cancer Center, 1155 Pressler Blvd, Houston, TX, 77030, USA

*To whom correspondence should be addressed.


    ABSTRACT

Summary: Computer simulations play an important role in studies of non-random mating populations. Because of implementation difficulties, only very limited types of non-random mating schemes are provided in the currently available simulation programs. Starting with version 0.8.5, simuPOP provides a few mating schemes that can be used to simulate arbitrary non-random mating models. This article describes the concepts and methods behind these mating schemes and demonstrates their uses in a few examples, including partial self-mating, positive assortative mating, non-random outbreeding, and simulation of overlapping generations in age-structured populations.

Availability: simuPOP is freely available at http://simupop.sourceforge.net, distributed under a GPL license. Cited examples are in the doc/cookbook directory of a simuPOP distribution.

Contact: bpeng{at}mdanderson.org


    1 INTRODUCTION
In most analyses of natural populations, mating is assumed to occur at random. However, non-random mating can be important in studies of some species of plants and animals, even humans. For example, many plants reproduce by varying degrees of self-fertilization. Different levels of inbreeding exist in animal populations. Although mating in human populations can be considered random for most traits, non-random mating clearly exists for traits such as skin color.

Population simulations are widely used in studies of non-random mating, partly because of the difficulties in analyzing these mating schemes theoretically (Caballero and Hill, 1992). Although coalescent-based (backward-time) simulation methods (Kingman, 1982) are frequently used in simulating genetic data, they are incapable of simulating non-random mating because they are based on the Fisher–Wright mating model. In contrast, forward-time simulation methods can simulate, at least in theory, arbitrary mating schemes.

Whereas random mating is well defined in a population, non-random mating can occur in a variety of forms, such as positive or negative assortative mating, selfing, and mating in age-structured populations. Currently available simulation programs such as easyPOP (Balloux, 2001) provide no or very limited support for non-random mating. Researchers who are interested in the simulation of non-random mating populations must usually write their own programs.

Starting with version 0.8.5, simuPOP (Peng and Kimmel, 2005) provides a few mating schemes that can be used to simulate arbitrary non-random mating models. This article describes the concepts and methods behind these mating schemes and demonstrates their uses in a few examples.


    2 METHODS
simuPOP is a forward-time population genetics simulation environment based on the Python scripting language. It provides a large number of Python objects, including populations, operators (objects that manipulate populations), mating schemes and simulators, and a mechanism to evolve populations forward in time. The user must write a Python script to glue different pieces together and form a simulation. Using a large number of operators and mating schemes, simuPOP is capable of simulating almost arbitrarily complex evolutionary processes.

Since its first release in 2005, simuPOP has improved considerably in the areas of efficiency, population manipulation, pedigree tracking and ascertainment, as exemplified by the introduction of binary modules, a number of new operators, and the concepts of information field and virtual subpopulation (VSP). The latter two concepts pave the way for the implementation of arbitrary non-random mating schemes.

Information fields are floating-point numbers that can be attached to individuals in a population. Because different simulation scenarios require different auxiliary information, simuPOP does not attach any information field to a population by default. An arbitrary number of information fields, such as age and geographic location, can be attached and used by operators and mating schemes.

A VSP refers to a group of individuals in a subpopulation who share the same property. For example, all male individuals, all individuals at age 20 and all affected individuals in a subpopulation can be defined as VSPs. Unlike subpopulations that strictly separate individuals, VSPs vary easily as individual properties change. Multiple VSPs can be defined in the same subpopulation. Also, they can overlap with each other and do not have to sum up to the whole subpopulation. A number of VSP splitters are provided that define VSPs by individual properties such as sex, affection status, genotype at given loci or information fields.
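The idea of a VSP can be illustrated with a minimal plain-Python sketch. It does not use the simuPOP API: the `Individual` class, the `splitter` dictionary and the `vsp` function are hypothetical names introduced only for this illustration, and a splitter is modeled simply as a set of named predicates that are evaluated whenever the VSP is requested.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """Hypothetical stand-in for an individual (not a simuPOP class)."""
    sex: str                                   # "M" or "F"
    affected: bool = False
    info: dict = field(default_factory=dict)   # information fields, e.g. {"age": 20.0}


# A "splitter" is modeled as a mapping from VSP name to a membership predicate.
splitter = {
    "male": lambda ind: ind.sex == "M",
    "age20": lambda ind: ind.info.get("age") == 20.0,
    "affected": lambda ind: ind.affected,
}


def vsp(subpop, name):
    """Return the current members of a VSP; membership is evaluated at call time."""
    return [ind for ind in subpop if splitter[name](ind)]


subpop = [
    Individual("M", False, {"age": 20.0}),   # belongs to both "male" and "age20"
    Individual("F", True, {"age": 20.0}),
    Individual("M", True, {"age": 35.0}),
    Individual("F", False, {"age": 19.0}),   # belongs to no VSP yet
]

males = vsp(subpop, "male")
age20 = vsp(subpop, "age20")

# VSPs need not cover the whole subpopulation.
uncovered = [ind for ind in subpop
             if not any(pred(ind) for pred in splitter.values())]

# Membership follows individual properties: after a birthday the last
# individual joins the "age20" VSP without any explicit reassignment.
subpop[3].info["age"] += 1.0
age20_after = vsp(subpop, "age20")
```

Because membership is recomputed from individual properties, no bookkeeping is needed when ages or affection status change, which is the property that distinguishes VSPs from ordinary subpopulations in the description above.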

A homogeneous mating scheme determines how to produce offspring from a (virtual) subpopulation. It is composed of a parent chooser and an offspring generator. A parent chooser selects a parent or parents from a parental (virtual) subpopulation. The choice can be random (with or without alpha individuals, with or without replacement, and with weights if selection is enabled), sequential, dictated by a specified pedigree, or made by a user-defined Python function. The last type is called a hybrid parent chooser and is capable of simulating arbitrary parent choosing schemes. An offspring generator takes one or two parents chosen by a parent chooser and passes their genotypes to one or more offspring with help from during-mating operators such as a recombinator. Mendelian, selfing, haplodiploid and clone offspring generators are currently provided. Users can define a mating scheme explicitly or use pre-defined mating schemes such as random, monogamous, or polygamous mating, selfing, haplodiploid mating in hymenoptera and preferential mating in animal communities with alpha individuals.
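The division of labor between a parent chooser and an offspring generator can be sketched in plain Python as follows. This is an illustration of the concept only, not simuPOP code: the function names are hypothetical, an individual is represented as a pair of homologous chromosome copies, and recombination is deliberately left out.

```python
import random


def random_parent_chooser(males, females, rng):
    """Parent chooser: yield (father, mother) pairs at random, with replacement."""
    while True:
        yield rng.choice(males), rng.choice(females)


def mendelian_offspring(father, mother, rng):
    """Offspring generator: pass one randomly chosen homologous copy from each
    parent to the child (no recombination in this sketch)."""
    return (rng.choice(father), rng.choice(mother))


def mate(males, females, n_offspring, rng,
         chooser=random_parent_chooser, generator=mendelian_offspring):
    """A homogeneous mating scheme: one chooser combined with one generator."""
    parents = chooser(males, females, rng)
    offspring = []
    for _ in range(n_offspring):
        father, mother = next(parents)
        offspring.append(generator(father, mother, rng))
    return offspring


# Each individual is a pair of chromosome copies; each copy is a tuple of alleles.
males = [((0, 0), (0, 1))]
females = [((1, 1), (1, 0))]
offspring = mate(males, females, 5, random.Random(1))
```

Swapping in a different `chooser` or `generator` changes the mating scheme without touching the loop in `mate`, which mirrors the composition described above.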

Another layer of complexity can be achieved by applying multiple homogeneous mating schemes to multiple VSPs. Using such a heterogeneous mating scheme, a parent can be involved in more than one mating scheme or in no mating scheme at all.


    3 RESULTS
We have used a few examples to demonstrate the use of these mating schemes. These examples are distributed with simuPOP and will be included in an upcoming simuPOP cookbook.

Some species can mate in more than one way. For example, partial self-fertilization occurs frequently among plants. This can be simulated using a heterogeneous mating scheme that applies random mating and selfing to two VSPs defined by proportions of parents. The number of offspring produced by each mating scheme is controlled by a weighting scheme, which is described in detail in the simuPOP reference manual.
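A minimal plain-Python sketch of partial self-fertilization is given here for illustration. It is not the cookbook example and does not use the simuPOP API. It assumes a single locus, hermaphroditic parents, and a weighting in which each of the two parental groups contributes offspring in proportion to its size; the actual weighting scheme is the one described in the simuPOP reference manual. All function names are hypothetical.

```python
import random


def selfing(parents, n, rng):
    """Each offspring receives both alleles from a single parent."""
    offspring = []
    for _ in range(n):
        parent = rng.choice(parents)
        offspring.append((rng.choice(parent), rng.choice(parent)))
    return offspring


def random_mating(parents, n, rng):
    """Each offspring receives one allele from each of two randomly chosen
    parents (the same parent may be drawn twice in this simplified version)."""
    offspring = []
    for _ in range(n):
        first, second = rng.choice(parents), rng.choice(parents)
        offspring.append((rng.choice(first), rng.choice(second)))
    return offspring


def partial_selfing(parents, self_rate, rng):
    """Heterogeneous scheme: split parents into two groups by proportion and
    apply a different homogeneous mating scheme to each group."""
    shuffled = parents[:]
    rng.shuffle(shuffled)
    n_self = round(self_rate * len(parents))
    selfers, outcrossers = shuffled[:n_self], shuffled[n_self:]
    offspring = []
    if selfers:
        offspring += selfing(selfers, len(selfers), rng)
    if outcrossers:
        offspring += random_mating(outcrossers, len(outcrossers), rng)
    return offspring


# 100 heterozygous parents, 40% of which reproduce by selfing.
parents = [(0, 1)] * 100
offspring = partial_selfing(parents, 0.4, random.Random(2008))
```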

This technique is used in another example in which assortative mating is applied to part of the parental population. In this example, random mating is used to populate a specific proportion of the offspring generation. The rest of the offspring are produced by assortative mating between individuals with identical or similar genotypes at a locus. Figure 1 plots the change of the number of heterozygotes in a population using different types of positive assortative mating schemes. Similarly, assortative mating through phenotype can be achieved by defining VSPs by traits.
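As a rough guide to what such a simulation should show, the expected heterozygosity can be followed with a simple deterministic recursion. The sketch below is not taken from the article and rests on several assumptions: one biallelic locus, an infinite population with constant allele frequency p, assortative pairs with identical genotypes drawn in proportion to genotype frequencies (so that only heterozygote pairs produce heterozygotes, half of the time), and a fraction `assort` of offspring produced this way. Under these assumptions H(t+1) = (1 - assort) * 2p(1 - p) + assort * H(t) / 2. The function name is hypothetical.

```python
def heterozygosity_trajectory(h0, p, assort, generations):
    """Expected heterozygote frequency under partial positive assortative mating.

    h0      initial heterozygote frequency
    p       allele frequency (assumed constant)
    assort  fraction of offspring produced by assortative mating
    """
    h = h0
    trajectory = [h]
    for _ in range(generations):
        from_random = (1 - assort) * 2 * p * (1 - p)   # Hardy-Weinberg share
        from_assortative = assort * h / 2              # Aa x Aa gives 1/2 Aa
        h = from_random + from_assortative
        trajectory.append(h)
    return trajectory


# Complete assortative mating halves the heterozygote frequency each generation.
complete = heterozygosity_trajectory(0.5, 0.5, 1.0, 3)

# With partial assortative mating the frequency settles at an equilibrium.
partial = heterozygosity_trajectory(0.5, 0.5, 0.5, 100)
```

A finite simulated population will fluctuate around these expectations because of genetic drift, so the recursion is a sanity check rather than a substitute for the simulation.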


Fig. 1. Number of heterozygotes of populations under partial assortative mating.

 
An age-structured population can be simulated using non-random mating schemes. In an example with overlapping generations, individuals are initially assigned random ages ranging from 0 to 9. Individuals are classified into youth, adult and senior VSPs according to their ages. During each iteration, individual ages are increased by one before mating happens. Using a heterogeneous mating scheme, individuals who reach age 10 die, the others are kept, and offspring produced by adults are added.
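One iteration of such an age-structured model can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the distributed example: the adult age range (3 to 6) is hypothetical because the text does not give the VSP boundaries, the population is kept at a constant size by replacing the dead with newborns, and individuals are plain dictionaries with a single-locus genotype.

```python
import random

MAX_AGE = 10                  # individuals reaching this age die
ADULT_MIN, ADULT_MAX = 3, 6   # hypothetical boundaries of the adult VSP


def next_generation(pop, size, rng):
    """Age everyone, remove the oldest, and let adults fill the vacancies."""
    for ind in pop:
        ind["age"] += 1                           # ageing happens before mating
    survivors = [ind for ind in pop if ind["age"] < MAX_AGE]
    adults = [ind for ind in survivors
              if ADULT_MIN <= ind["age"] <= ADULT_MAX]
    newborns = []
    if adults:
        for _ in range(max(0, size - len(survivors))):
            first, second = rng.choice(adults), rng.choice(adults)
            genotype = (rng.choice(first["genotype"]),
                        rng.choice(second["genotype"]))
            newborns.append({"age": 0, "genotype": genotype})
    return survivors + newborns


# 50 individuals with ages 0..9 (five of each age), all heterozygous.
pop = [{"age": i % 10, "genotype": (0, 1)} for i in range(50)]
pop = next_generation(pop, 50, random.Random(3))
```

Survivors are copied unchanged into the next generation, which is what makes the generations overlap; only the newborns come from a mating event.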

If the boundaries of mating groups are difficult to define, a user-defined parent chooser can be used. For example, although male and female pilot whales stay with their natal pods, members of the same pod rarely mate with each other. Male whales mate with females from other pods when two or more pods meet or when adult males pay short visits to other pods (Amos et al., 1993). Our example mimics this behavior by selecting a random male from a pod and then randomly selecting a female from another pod, using a user-defined Python function.
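The logic of such a parent-choosing function can be sketched as a plain Python generator. This is not the distributed example and does not show how the function is registered with simuPOP; the data layout (a dictionary mapping pod names to lists of males and females) and the function name are assumptions made for the illustration. A pod is drawn first and a male second, so males in small pods are chosen more often than males in large pods.

```python
import random


def pod_parent_chooser(pods, rng):
    """Yield (father, mother) pairs in which the mother never comes from the
    father's own pod."""
    eligible = [
        pod for pod, members in pods.items()
        if members["M"] and any(other != pod and pods[other]["F"]
                                for other in pods)
    ]
    if not eligible:
        raise ValueError("no male has a female available in another pod")
    while True:
        male_pod = rng.choice(eligible)
        female_pods = [pod for pod in pods
                       if pod != male_pod and pods[pod]["F"]]
        father = rng.choice(pods[male_pod]["M"])
        mother = rng.choice(pods[rng.choice(female_pods)]["F"])
        yield father, mother


# Individuals are (pod, name) tuples so that the pod of origin stays visible.
pods = {
    "A": {"M": [("A", "m1"), ("A", "m2")], "F": [("A", "f1")]},
    "B": {"M": [("B", "m1")], "F": [("B", "f1"), ("B", "f2")]},
    "C": {"M": [], "F": [("C", "f1")]},
}
chooser = pod_parent_chooser(pods, random.Random(7))
pairs = [next(chooser) for _ in range(20)]
```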


    4 DISCUSSION
Non-random mating schemes, especially those using a hybrid parent chooser, are slower than regular mating schemes. To test the performance of different mating schemes, we evolved a population with $10^{5}$ individuals (2000 SNP markers on 2 chromosomes) for 100 generations, with mutation ($\mu = 10^{-6}$), recombination and gene conversion ($r = 10^{-4}$). On a Linux workstation with a 3.7 GHz Xeon CPU and 4 GB RAM, the simulation completes in 8.4, 8.9 and 12.9 min for a standard random mating scheme, a partial assortative mating scheme, and a mating scheme with a hybrid parent chooser, respectively.

If intense computation is involved in the selection of parents, implementing the parent choosing function in C++ is recommended. This technique is demonstrated in an example in which the geographically closest females are chosen for random males. This can be easily extended to spatially continuous non-random mating schemes in which an individual is more likely to mate with a spouse in its vicinity.
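The nearest-neighbor rule itself is simple, as the following plain-Python sketch shows. The distributed example implements the choosing function in C++ for speed. The version here only illustrates the logic, with individuals reduced to hypothetical (x, y) coordinates and a linear scan over all females for every chosen male, which is exactly the kind of cost that motivates a compiled implementation.

```python
import math
import random


def nearest_female_chooser(males, females, rng):
    """Yield (male, female) pairs: a random male and the female closest to him."""
    while True:
        male = rng.choice(males)
        female = min(
            females,
            key=lambda f: math.hypot(f[0] - male[0], f[1] - male[1]),
        )
        yield male, female


# Individuals are represented only by their (x, y) locations.
males = [(0.0, 0.0), (10.0, 10.0)]
females = [(1.0, 0.0), (9.0, 9.0), (5.0, 5.0)]
chooser = nearest_female_chooser(males, females, random.Random(11))
pairs = [next(chooser) for _ in range(10)]
```

Replacing `min` with a random draw weighted by a decreasing function of distance would give the spatially continuous variant mentioned above, in which nearby mates are more likely but not certain.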


    ACKNOWLEDGEMENTS
The authors thank Dr. Marek Kimmel for his suggestions in the application of non-random mating.

Funding: This research was supported by a Cancer Prevention Fellowship provided by the Jerry and Maury Rubenstein Foundation through The University of Texas M. D. Anderson Cancer Center and grant R01CA133996-01 from the National Institutes of Health.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Martin Bishop

Received on March 7, 2008; revised on April 4, 2008; accepted on April 14, 2008

    REFERENCES

    Amos B, et al. Social structure of pilot whales revealed by analytical DNA profiling. Science (1993) 260:670–672.

    Balloux F. Easypop, a computer program for the simulation of population genetics. J. Hered (2001) 92:301–302.

    Caballero A, Hill WG. Effective size of non random mating populations. Genetics (1992) 130:909–916.

    Kingman J. The coalescent. Stochastic Processes Appl (1982) 13:235–248.

    Peng B, Kimmel M. simuPOP: a forward-time population genetics simulation environment. Bioinformatics (2005) 21:3686–3687.

