DemoDivMS

Authors

DemoDivMS predicts the genetic diversity at a microsatellite DNA marker in a finite population, for various mutation models and for variable population size.
The program runs quickly. It is written in Fortran and uses NAG numerical routines. It creates small output files, on the order of 50 KB.

Download

Important: the NAG library is required to use DemoDivMS.

Download source code (version 0.2). You can then easily compile and run it under Linux/Unix.

Description of the model

The goal of the program is to provide a tool that describes the expected current genetic diversity at a microsatellite marker from the past history of a population.

The user describes the mutation process of the marker and the demographic history of the population; the program estimates the final genetic diversity using coalescent theory adapted to variable effective size. For any pair of alleles sampled in the population, the program provides the probabilities P_0, P_1, P_2, …, P_d, … that the two alleles differ by D = 0, 1, 2, …, d, … repeats of the microsatellite motif. Associated measures of diversity are also given: the mean and variance of |D|, and the values of theta (the parameter 4 N µ) derived from the expected heterozygosity or from the distribution of D.

Mutation process

The microsatellite marker is assumed to follow a symmetrical Stepwise Mutation Model: when a mutation occurs, the number of motifs is increased or decreased by a number k of steps, with equal probabilities. Available options are the single step mutation model, the geometric model (the probability that a mutation leads to k steps is (1-c) c^(k-1) for some c < 1), and any user-defined model (a mutation leads to k steps with probability m_k).
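The three step-size distributions described above can be sketched in Python as follows. This is an illustration only, not the program's Fortran code; the function names and the truncation bound kmax for the geometric model are assumptions.

```python
def single_step():
    """Single step model: a mutation always changes the allele by one motif."""
    return {1: 1.0}


def geometric(c, kmax=200):
    """Geometric model: P(k steps) = (1 - c) * c**(k - 1).

    The infinite support is truncated at kmax (an assumption of this sketch).
    """
    if not 0.0 < c < 1.0:
        raise ValueError("c must satisfy 0 < c < 1")
    return {k: (1.0 - c) * c ** (k - 1) for k in range(1, kmax + 1)}


def user_defined(pairs):
    """User-defined model from (k, m_k) pairs; the m_k must sum to 1."""
    dist = dict(pairs)
    if abs(sum(dist.values()) - 1.0) > 1e-9:
        raise ValueError("the probabilities m_k must sum to 1")
    return dist


def signed_step_probs(dist):
    """Symmetry: a step of size k is +k or -k with equal probability."""
    signed = {}
    for k, p in dist.items():
        signed[k] = 0.5 * p
        signed[-k] = 0.5 * p
    return signed
```

For instance, user_defined([(1, 0.95), (2, 0.05)]) corresponds to the user-defined example given later in this page.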

Past demography

The demographic history is given as a chain of J time intervals [t_i, t_(i+1)] during which the population size remains equal to N_i. It is further assumed that the population size was constant and equal to N_J before time t_J in the past.

DemoDivMS_fig1.png
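As an illustration of this piecewise-constant history, the following Python sketch returns the population size at a given time in the past. The function name is hypothetical, and assigning a boundary time t_i to the older interval is an assumption of the sketch.

```python
from bisect import bisect_right


def size_at(t, sizes, times):
    """Population size t generations before present.

    sizes = [N_0, N_1, ..., N_J], times = [t_1, ..., t_J] (increasing).
    N_0 holds on [0, t_1), N_i on [t_i, t_(i+1)), and N_J before t_J.
    """
    if len(sizes) != len(times) + 1:
        raise ValueError("need one more size than change times")
    if t < 0:
        raise ValueError("t is measured backwards from the present (t >= 0)")
    return sizes[bisect_right(times, t)]
```

With sizes [1000, 500, 1000] and times [100, 200], the size is 1000 at t = 50, 500 at t = 150 and 1000 again at t = 250.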

 

 

Migration

An option is available to account for recurrent immigration from a large population of constant structure. Two additional parameters are needed:

  • the size of this external population;
  • the rate of immigration, given either as a constant number of immigrants per generation, or as a constant rate of immigration (in that case the mean number of immigrants is variable and proportional to the sizes N_i of the population in the past).
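The difference between the two immigration options can be sketched as follows. The option codes (1 for a constant rate, 2 for a constant number) follow the prompt shown in the example session; taking the number of immigrants as exactly rate × N_i is an assumption, since the text only states proportionality.

```python
def immigrants_per_generation(option, value, sizes):
    """Mean number of immigrants per generation in each demographic period.

    option 1: value is a constant rate, so the number scales with each N_i.
    option 2: value is a constant number of immigrants in every period.
    """
    if option == 1:
        return [value * n for n in sizes]
    if option == 2:
        return [float(value) for _ in sizes]
    raise ValueError("option must be 1 (constant rate) or 2 (constant number)")
```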

 

Software use

Restrictions on the Model

a) No more than 50 time events t_J (Jmax <= 50).
b) No more than 10 steps for a user-defined mutation model (K <= 10).
c) No more than 1,000,000 generations between t0 and tJ.
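A set of parameters can be checked against these three restrictions before running the program. The helper below is a hypothetical sketch, not part of DemoDivMS; it assumes t_0 = 0 and that the times are listed in increasing order.

```python
def check_restrictions(times, user_steps=None):
    """Return a list of violated restrictions (empty if all are satisfied).

    times      : [t_1, ..., t_J], generations before present, increasing
    user_steps : list of (k, m_k) pairs for a user-defined model, or None
    """
    problems = []
    if len(times) > 50:
        problems.append("more than 50 time events")
    if user_steps is not None and len(user_steps) > 10:
        problems.append("more than 10 steps in the user-defined model")
    if times and times[-1] > 1_000_000:
        problems.append("more than 1,000,000 generations between t_0 and t_J")
    return problems
```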

Input of parameters

The user is asked for the name of the output file (max: 20 characters), and for the values that define the mutation model and the past demography of the population.

>Give the name of output file

>Enter mutation model

To use the Single Step Model, enter 1.
To use the Geometric model, enter the ratio c (0 < c < 1).
To use the Special Model, enter the number K of possible steps (1 < K <= 10), then enter the K pairs of values (k, m_k) (note: m_1 + m_2 + … + m_K = 1).
For HELP, enter a negative number.
To STOP, enter 0.

>Give the current effective size N_0 of the population

>Give the mutation rate µ

>Give the total number of events (J)

If the effective size has been constant, enter 0.
If the size has varied, enter the number J of size changes, then list first the past effective sizes N_1, N_2, ..., N_J and then the past times t_1, t_2, ..., t_J at which the size changed.

>Choose Migration (1) or No migration (0)

[Optional, if the Migration Model is chosen]

>Give the size of the large external population

>Choose the model of migration

  • Constant rate or constant number of immigrants
  • then the value of that constant rate or constant number of immigrants
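Because the program is driven by a fixed sequence of prompts, the answers can be prepared in advance as plain text. The sketch below builds such an answer sequence in Python. The function name is hypothetical, the order of answers follows the example session shown in this page, and it is an assumption that the size and time lists are not requested when J = 0.

```python
def build_answers(outfile, model, n0, mu, sizes=(), times=(), migration=None):
    """Build the text of answers for one run, one answer per line.

    model     : 1 (single step), a float c (geometric),
                or a list of (k, m_k) pairs (user-defined model)
    sizes     : past sizes N_1..N_J; times: change times t_1..t_J
    migration : None, or (external_size, option, value)
    """
    if len(outfile) > 20:
        raise ValueError("output file name is limited to 20 characters")
    if len(sizes) != len(times):
        raise ValueError("sizes and times must both have J entries")
    lines = [outfile]
    if isinstance(model, list):
        lines.append(str(len(model)))                 # number K of steps
        lines += [f"{k} {m}" for k, m in model]       # one (k, m_k) pair per line
    else:
        lines.append(str(model))
    lines += [str(n0), str(mu), str(len(times))]
    if times:
        lines.append(" ".join(str(n) for n in sizes))
        lines.append(" ".join(str(t) for t in times))
    if migration is None:
        lines.append("0")
    else:
        ext_size, option, value = migration
        lines += ["1", str(ext_size), f"{option} {value}"]
    return "\n".join(lines) + "\n"
```

For the bottleneck example of the next section, build_answers("outfile", 1, 1000, 0.001, [500, 1000], [100, 200]) yields the same answers as those typed in the session, apart from the help request.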

Example

We consider a population of present effective size 1000 that underwent a bottleneck between the 100th and 200th generations before the present. The global mutation rate is 0.001.

DemoDivMS_fig2.png

The details are shown below for the Single Step Model (SSM). For the two other models, only the prompts and answers that differ from the SSM session are shown.

Single Step Model

The following is an example of a DemoDivMS session with the SSM.

  Microsatellite allelic diversity after demographic changes
  Calculus of Pr( D = d / µ, N_0, N_1, ...), 
the probability that 2 microsatellite alleles show a difference of  d  repeat motifs
  according to past demographic fluctuations
  Give name of output file (max 20 characters)
outfile
  Calculating the  Pr( D = d / µ, N_0, N_1, ...) ...
  Enter mutation model / Stop
                  STOP : answer 0
     Single Step Model : answer 1
     Geometric model   : give the ratio c (0 < c < 1)
     Special Model     : enter the number K of possible steps (K > 1)
                  HELP : enter a negative number
-1
  Stepwise Mutation Models
  Single Step (answer= 1), one motif is added or withdrawn
  Geometric model (answer a real number c, 0<c<1 )
        under mutation the number of added or withdrawn motifs
        is k with probability equal to  (1-c)*c^(k-1)
  Special model (answer an integer number K, 1 < K <= 10)
        under mutation the number of added or withdrawn motifs
        is k with probability  m_k. 
        K pairs of values (k, m_k) must be given
  Enter mutation model / Stop
                  STOP : answer 0
     Single Step Model : answer 1
     Geometric model   : give the ratio c (0 < c < 1)
     Special Model     : enter the number K of possible steps (K > 1)
                  HELP : enter a negative number
1
  Give the current effective size N_0 of the population
1000 
 Give the mutation rate µ
.001
  Modelling past variations of population size
size N
    !
N_J !                                ___________
    !                               !
N_2 !              ____  ....
    !            !
N_1 !       _____!
    !      !
    !      !
N_0 !______!
    !
    !
    !                                
    !____________________________________________
    t_0=0     t_1   t_2     ....      t_J     time in the past
 Give the total number of time events (Jmax)
 0 if the effective size has been constant
2
  Give the series of N_1, N_2, ... N_Jmax.
500 1000
 Give the series of times  t_1, t_2,...,t_Jmax.
100 200
Model with immigration ?
if immigration: answer 1, otherwise: 0
0

Geometric Model

The following is an example of DemoDivMS input for the geometric model.

 Enter mutation model / Stop
STOP : answer 0
Single Step Model : answer 1
Geometric model   : give the ratio c (0 < c < 1)
Special Model     : enter the number K of possible steps (K > 1)
HELP : enter a negative number
.05

User-defined model assuming either 1 or 2 steps

The following is an example of DemoDivMS input for a user-defined model.

 Enter mutation model / Stop
STOP : answer 0
Single Step Model : answer 1
Geometric model   : give the ratio c (0 < c < 1)
Special Model     : enter the number K of possible steps (K > 1)
HELP : enter a negative number
2
For each case, give the number of steps and the probability
case   1: how many steps, which proba ??
1 .95
case   2: how many steps, which proba ??
2 .05

Choosing a migration model

The following is an example of DemoDivMS input for a model with immigration.

Model with immigration ?
if immigration: answer 1, otherwise: 0
1
Give the size of the large external population
5000
Give the immigration rate. Two options are proposed:
First option : constant rate
(the number of immigrants is proportional to the current population size)
Second option: constant number of immigrants
give the option (1 or 2), and the right number
2 1

 

Outfile:

The following is an example of an output file.

  Microsatellite allelic diversity  after demographic changes
Calculus of Pr( D = d / µ, N_0, N_1, ...),
the probability that 2 microsatellite alleles
show a difference of  d  repeat motifs
according to past demographic fluctuations
*************** 
Single Step Mutation model, µ = 0.00100
Current effective size   1000.
Number of past demographic events =  2
Series of past effective sizes
1000.    500.   1000.
Series of times when size changed
0.    100.    200.
No migration, the population is isolated
Largest value of D used =      13
Total probability of unconsidered events = 0.00008
Expected homozygosity Pr(D=0) = 0.35453
Probabilities Pr(D=1) to Pr(D=13)
0.16378 0.07966 0.03966 0.01982 0.00991 0.00495 0.00248 0.00124 0.00062 0.00031
0.00015 0.00008 0.00004
Mean and standard error of |D|:   1.27944   1.47130
Corresponding values of theta = 4Nµ under constant population size and SSM,
from homozygosity :   3.47797
from mean of |D|  :   3.71460
from mean of D^2  :   3.80168
*************** 
Single Step Mutation model, µ = 0.00100
Current effective size   1000.
Number of past demographic events =  2
Series of past effective sizes
1000.    500.   1000.
Series of times when size changed
0.    100.    200.
Assuming immigration from a large population in equilibrium, of constant size
Nefext =      5000 (theta =    20.00 )
Constant number of immigrants per generation =   1.000
Largest value of D used =      29
Total probability of unconsidered events = 0.00007
Expected homozygosity Pr(D=0) = 0.25224
Probabilities Pr(D=1) to Pr(D=29)
0.11867 0.07265 0.05015 0.03596 0.02610 0.01901 0.01387 0.01012 0.00739 0.00539
0.00393 0.00287 0.00210 0.00153 0.00112 0.00081 0.00059 0.00043 0.00032 0.00023
0.00017 0.00012 0.00009 0.00007 0.00005 0.00003 0.00003 0.00002 0.00001
Mean and standard error of |D|:   2.59948   3.06586
Corresponding values of theta = 4Nµ under constant population size and SSM,
from homozygosity :   7.35863
from mean of |D|  :  13.99735
from mean of D^2  :  16.15678
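The "corresponding values of theta" printed above can be cross-checked with the classical relations for a population of constant size under the SSM: homozygosity 1/sqrt(1 + 2 theta), and E[D^2] = theta. The Python sketch below applies them. That the program uses exactly these relations is an inference from the printed numbers, which they reproduce to rounding; the function names are hypothetical, and the second value printed after "Mean and standard error of |D|" is treated here as a standard deviation.

```python
import math


def theta_from_homozygosity(h0):
    """Invert Pr(D=0) = 1 / sqrt(1 + 2*theta) (constant size, SSM)."""
    return 0.5 * (1.0 / h0 ** 2 - 1.0)


def theta_from_mean_square(mean_abs, sd_abs):
    """Under constant size and SSM, E[D^2] = theta.

    E[D^2] is rebuilt from the mean and standard deviation of |D|.
    """
    return sd_abs ** 2 + mean_abs ** 2


def mean_abs_from_theta(theta):
    """E|D| under constant size and SSM, where Pr(D=k) = P0 * r**|k|."""
    root = math.sqrt(1.0 + 2.0 * theta)
    r = (1.0 + theta - root) / theta
    return 2.0 * r / (root * (1.0 - r) ** 2)
```

For the first block of the output file, theta_from_homozygosity(0.35453) is close to 3.478, theta_from_mean_square(1.27944, 1.47130) is close to 3.802, and mean_abs_from_theta(3.71460) is close to 1.279.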

 

Mathematical background


Calculations are based on the joint analysis of the drift process in a finite population and of the mutation process at a microsatellite marker.

Drift in a population with variable size is studied by assuming that the population has gone through successive periods during which its size remained constant, so that the global analysis of the coalescent is split into such periods. Calculations are then easy within each period and tractable when joining periods, which finally yields analytical expressions for most parameters of interest.
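The splitting into periods can be illustrated with the probability that two gene copies have not yet coalesced t generations back in time, which is a product of one factor per period. The Python sketch below assumes a diploid Wright-Fisher population with 2 N_i gene copies in period i (consistent with theta = 4 N µ); it illustrates the principle and is not the program's algorithm.

```python
def prob_no_coalescence(sizes, times, t):
    """Probability that two gene copies have distinct ancestors t generations ago.

    sizes = [N_0, ..., N_J], times = [t_1, ..., t_J] (increasing).
    Each generation spent in period i contributes a factor 1 - 1/(2*N_i).
    """
    bounds = [0] + list(times)
    p = 1.0
    for i, n in enumerate(sizes):
        start = bounds[i]
        end = bounds[i + 1] if i + 1 < len(bounds) else float("inf")
        span = min(t, end) - start      # generations spent in period i
        if span <= 0:
            break
        p *= (1.0 - 1.0 / (2.0 * n)) ** span
    return p
```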


The mutation model assumes that, when an allele mutates, the number of microsatellite motifs changes with probabilities that do not depend on the number of motifs in the allele. Although restrictive, this formulation allows analytical solutions to be derived using generating functions of the distributions, for any symmetrical mutation model. The single step mutation model, the geometric model and any user-defined stepwise mutation model are implemented in the program.
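For a symmetrical model in which a mutation adds or removes k motifs with probability m_k, the Fourier transform of the signed step distribution reduces to a sum of cosines, phi(x) = sum over k of m_k cos(k x). A minimal Python sketch follows; the function name and the dictionary representation {k: m_k} are assumptions.

```python
import math


def step_transform(dist, x):
    """phi(x) = sum_k m_k * cos(k*x) for a symmetrical step distribution.

    dist maps each step size k >= 1 to its probability m_k; the cosine
    comes from averaging exp(+i*k*x) and exp(-i*k*x) with weight 1/2 each.
    """
    return sum(m * math.cos(k * x) for k, m in dist.items())
```

For the single step model this gives phi(x) = cos(x).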

The basic calculations concern the joint probabilities that a set of alleles are copies of a single ancestral gene some time ago, and that they show some difference in their state (number of microsatellite motifs). The results allow both the distribution of coalescence times and the distribution of distances between alleles to be calculated, for a large class of stepwise mutation models and for any ancestral demography. The present program provides only the probabilities of distances between alleles, but the theory also allows the distributions of coalescence times to be calculated.

From a technical point of view, analytical solutions are obtained for the Fourier transform F(x) of the distribution of distances D between pairs of alleles (the distance between 2 alleles is the difference between the numbers of motifs they bear):

formule1.png


the summation being taken over all negative, zero and positive integers k (noting that P(D=k) = P(D=-k)). Analytical expressions of the function F(x) are given in the cited reference (Chevalet & Nikolic 2010), as a function of the mutation model and of the demographic history assumed for the population. The numerical value of the probability that two alleles are at distance k is then obtained in this program by numerical integration:

formule2.png
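The two formulas are available here only as images. Under the usual convention for the Fourier transform of a distribution on the integers, and using the symmetry P(D=k) = P(D=-k), a form consistent with the description is the following; the normalisation and sign conventions of the original formulas may differ.

```latex
F(x) = \sum_{k=-\infty}^{+\infty} P(D=k)\, e^{i k x}
     = P(D=0) + 2 \sum_{k \ge 1} P(D=k) \cos(k x),
\qquad
P(D=k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(x)\, e^{-i k x}\, dx
       = \frac{1}{\pi} \int_{0}^{\pi} F(x) \cos(k x)\, dx .
```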


Additional indices of diversity are calculated from this distribution, such as the expectation and variance of the distances between alleles in a sample, and the values of theta = 4 N µ that would give the same diversity measures under the basic assumptions of constant population size and a single step mutation model.
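The inversion step can be tried out on the simplest case, a population of constant size under the SSM, for which the classical closed form F(x) = 1 / (1 + theta (1 - cos x)) holds. The Python sketch below uses this closed form as a stand-in for the general expressions of the reference, and a plain trapezoidal rule in place of the NAG integration routine; the function name and the number of grid points are assumptions.

```python
import math


def pair_distance_probs(theta, kmax, n=2048):
    """Pr(D=k), k = 0..kmax, for constant size and SSM, by Fourier inversion.

    Uses Pr(D=k) = (1/pi) * integral over [0, pi] of F(x) * cos(k*x) dx,
    with F(x) = 1 / (1 + theta * (1 - cos x)), and the trapezoidal rule
    on n intervals (very accurate here because F is smooth and periodic).
    """
    h = math.pi / n
    probs = []
    for k in range(kmax + 1):
        total = 0.0
        for j in range(n + 1):
            x = j * h
            weight = 0.5 if j in (0, n) else 1.0   # half weight at both ends
            total += weight * math.cos(k * x) / (1.0 + theta * (1.0 - math.cos(x)))
        probs.append(total * h / math.pi)
    return probs
```

With theta = 4 the probabilities are 1/3, 1/6, 1/12, …, and the second moment of D, obtained by summing over both signs of k, equals theta.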



References

Nikolic N, Butler J, Baglinière J-L, Laughton R, McMyn IAG, Chevalet C. 2009. An examination of genetic diversity and effective population size in Atlantic salmon. Genetics Research. 91: 1-18.

Chevalet C, Nikolic N. 2010. Distribution of coalescent times and distances between microsatellite alleles with changing effective population size. Theoretical Population Biology. 77(3): 152-163.