Package 'StAMPP' reference manual

Title:	Statistical Analysis of Mixed Ploidy Populations
Description:	Allows users to calculate pairwise Nei's Genetic Distances (Nei 1972), pairwise Fixation Indexes (Fst) (Weir & Cockerham 1984) and also Genomic Relationship matrices following Yang et al. (2010) in mixed and single ploidy populations. Bootstrapping across loci is implemented during Fst calculation to generate confidence intervals and p-values around pairwise Fst values. StAMPP utilises SNP genotype data of any ploidy level (with the ability to handle missing data) and is coded to utilise multithreading where available to allow efficient analysis of large datasets. StAMPP is able to handle genotype data from genlight objects, allowing integration with other packages such as adegenet. Please refer to LW Pembleton, NOI Cogan & JW Forster, 2013, Molecular Ecology Resources, 13(5), 946-952. <doi:10.1111/1755-0998.12129> for the appropriate citation and user manual. Thank you in advance.
Authors:	LW Pembleton
Maintainer:	LW Pembleton <[email protected]>
License:	GPL-3
Version:	1.6.3.9000
Built:	2025-02-24 03:17:25 UTC
Source:	https://github.com/lpembleton/stampp

Example genotype input format

Description

A data frame containing Solcap potato genotype data in tetraploid and diploid format, provided as a small example of the input format required by StAMPP

Usage

data(potato)

Format

A data frame with 30 rows and 48 variables:

Sample: Sample names
Pop: Population name
Ploidy: Ploidy level
Format: Format of genotype data
solcap_snp_c1_1: genotype data
solcap_snp_c1_1000: genotype data
solcap_snp_c1_10000: genotype data
solcap_snp_c1_10001: genotype data
solcap_snp_c1_10011: genotype data
solcap_snp_c1_10012: genotype data
solcap_snp_c1_10031: genotype data
solcap_snp_c1_10042: genotype data
solcap_snp_c1_10050: genotype data
solcap_snp_c1_10054: genotype data
solcap_snp_c1_10109: genotype data
solcap_snp_c1_10130: genotype data
solcap_snp_c1_10157: genotype data
solcap_snp_c1_10202: genotype data
solcap_snp_c1_10252: genotype data
solcap_snp_c1_10253: genotype data
solcap_snp_c1_10255: genotype data
solcap_snp_c1_1029: genotype data
solcap_snp_c1_10295: genotype data
solcap_snp_c1_10297: genotype data
solcap_snp_c1_10351: genotype data
solcap_snp_c1_10384: genotype data
solcap_snp_c1_10397: genotype data
solcap_snp_c1_10457: genotype data
solcap_snp_c1_10491: genotype data
solcap_snp_c1_10492: genotype data
solcap_snp_c1_10494: genotype data
solcap_snp_c1_10579: genotype data
solcap_snp_c1_10646: genotype data
solcap_snp_c1_10669: genotype data
solcap_snp_c1_10715: genotype data
solcap_snp_c1_10737: genotype data
solcap_snp_c1_10743: genotype data
solcap_snp_c1_10762: genotype data
solcap_snp_c1_10855: genotype data
solcap_snp_c1_10873: genotype data
solcap_snp_c1_10879: genotype data
solcap_snp_c1_10900: genotype data
solcap_snp_c1_10932: genotype data
solcap_snp_c1_1094: genotype data
solcap_snp_c1_11137: genotype data
solcap_snp_c1_11144: genotype data
solcap_snp_c1_11196: genotype data
solcap_snp_c1_11206: genotype data
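
The layout described above can be inspected directly once the package is installed. The following is a minimal sketch, assuming StAMPP is available in the local library; it only prints the dimensions, the first few columns, and a cross-tabulation of the Ploidy and Format columns.

```r
# Inspect the example data layout (runs only if StAMPP is installed)
if (requireNamespace("StAMPP", quietly = TRUE)) {
  data(potato, package = "StAMPP")
  # Rows are samples; the first four columns are metadata, the rest are SNPs
  print(dim(potato))
  print(potato[1:3, 1:6])
  # How many samples are stored at each ploidy level and genotype format
  print(table(potato$Ploidy, potato$Format))
}
```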

Source

The example genotype data is a subset of data from the publicly available Solcap potato dataset, which was re-scored in GenomeStudio in diploid and tetraploid formats

Smaller example genotype input format

Description

A data frame containing Solcap potato genotype data in tetraploid and diploid format, provided as a smaller example of the input format required by StAMPP

Usage

data(potato.mini)

Format

A data frame with 6 rows and 48 variables:

Sample: Sample names
Pop: Population name
Ploidy: Ploidy level
Format: Format of genotype data
solcap_snp_c1_1: genotype data
solcap_snp_c1_1000: genotype data
solcap_snp_c1_10000: genotype data
solcap_snp_c1_10001: genotype data
solcap_snp_c1_10011: genotype data
solcap_snp_c1_10012: genotype data
solcap_snp_c1_10031: genotype data
solcap_snp_c1_10042: genotype data
solcap_snp_c1_10050: genotype data
solcap_snp_c1_10054: genotype data
solcap_snp_c1_10109: genotype data
solcap_snp_c1_10130: genotype data
solcap_snp_c1_10157: genotype data
solcap_snp_c1_10202: genotype data
solcap_snp_c1_10252: genotype data
solcap_snp_c1_10253: genotype data
solcap_snp_c1_10255: genotype data
solcap_snp_c1_1029: genotype data
solcap_snp_c1_10295: genotype data
solcap_snp_c1_10297: genotype data
solcap_snp_c1_10351: genotype data
solcap_snp_c1_10384: genotype data
solcap_snp_c1_10397: genotype data
solcap_snp_c1_10457: genotype data
solcap_snp_c1_10491: genotype data
solcap_snp_c1_10492: genotype data
solcap_snp_c1_10494: genotype data
solcap_snp_c1_10579: genotype data
solcap_snp_c1_10646: genotype data
solcap_snp_c1_10669: genotype data
solcap_snp_c1_10715: genotype data
solcap_snp_c1_10737: genotype data
solcap_snp_c1_10743: genotype data
solcap_snp_c1_10762: genotype data
solcap_snp_c1_10855: genotype data
solcap_snp_c1_10873: genotype data
solcap_snp_c1_10879: genotype data
solcap_snp_c1_10900: genotype data
solcap_snp_c1_10932: genotype data
solcap_snp_c1_1094: genotype data
solcap_snp_c1_11137: genotype data
solcap_snp_c1_11144: genotype data
solcap_snp_c1_11196: genotype data
solcap_snp_c1_11206: genotype data

Source

The example genotype data is a subset of data from the publicly available Solcap potato dataset, which was re-scored in GenomeStudio in diploid and tetraploid formats

Convert StAMPP genotype data to genlight object

Description

Converts a StAMPP formatted allele frequency data frame generated from the stamppConvert function to a genlight object for use in other packages

Usage

stampp2genlight(geno, pop = TRUE)

Arguments

`geno`	a data frame containing allele frequency data generated from stamppConvert
`pop`	logical. True if population IDs are present in the StAMPP genotype data, False if population IDs are absent.

Details

StAMPP only exports to genlight objects because, unlike genpop and genloci objects, they are able to handle mixed ploidy datasets. The genlight object allows integration between StAMPP and other common R packages such as adegenet
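
As an illustration of that integration, the sketch below converts the example data to a genlight object and reads it back with standard adegenet accessors (nInd, nLoc, indNames, pop, ploidy). It assumes both StAMPP and adegenet are installed and is skipped otherwise.

```r
# Query a StAMPP-derived genlight object with adegenet accessors
if (requireNamespace("StAMPP", quietly = TRUE) &&
    requireNamespace("adegenet", quietly = TRUE)) {
  library(StAMPP)
  library(adegenet)
  data(potato.mini, package = "StAMPP")
  potato.freq <- stamppConvert(potato.mini, "r")
  potato.genlight <- stampp2genlight(potato.freq, TRUE)
  print(nInd(potato.genlight))      # number of individuals
  print(nLoc(potato.genlight))      # number of loci
  print(indNames(potato.genlight))  # individual IDs
  print(pop(potato.genlight))       # population IDs
  print(ploidy(potato.genlight))    # per-individual ploidy levels
}
```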

Value

An object of class genlight which contains genotype data, individual IDs, population IDs (if present) and ploidy levels

Author(s)

Luke Pembleton <lpembleton at barenbrug.com>

Examples

# Import genotype data and convert to allele frequencies
data(potato.mini, package="StAMPP")
potato.freq <- stamppConvert(potato.mini, "r")
# Convert the StAMPP formatted allele frequency data frame to a genlight object
potato.genlight <- stampp2genlight(potato.freq, TRUE)

Analysis of Molecular Variance

Description

Calculates an AMOVA based on the genetic distance matrix from stamppNeisD() using the amova() function from the package PEGAS for exploring within and between population variation

Usage

stamppAmova(dist.mat, geno, perm = 100)

Arguments

`dist.mat`	the matrix of genetic distances between individuals generated from stamppNeisD()
`geno`	a data frame containing allele frequency data generated from stamppConvert, or a genlight object containing genotype data, individual IDs, population IDs and ploidy levels
`perm`	the number of permutations for the tests of hypotheses

Details

Uses the formula distance ~ populations, to calculate an AMOVA for population differentiation and within & between population variation. This function uses the amova function from the PEGAS package.
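
To show what a distance ~ populations model looks like when passed to pegas directly, here is a small sketch on simulated data. The objects distance, pops and traits are hypothetical toy inputs, not StAMPP output, and the call illustrates the general pegas interface rather than the exact internals of stamppAmova. It runs only if pegas is installed.

```r
# AMOVA on a toy distance matrix using pegas::amova
if (requireNamespace("pegas", quietly = TRUE)) {
  set.seed(1)
  # Hypothetical data: 12 individuals in 3 populations, 5 numeric traits each
  pops <- factor(rep(c("popA", "popB", "popC"), each = 4))
  # Shift each population's trait means so that populations differ
  traits <- matrix(rnorm(12 * 5), nrow = 12) + as.integer(pops)
  distance <- dist(traits)
  # nperm plays the role of the 'perm' argument described above
  fit <- pegas::amova(distance ~ pops, nperm = 100)
  print(fit)
}
```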

Value

An object of class "amova" which is a list containing a table of sum of square deviations (SSD), mean square deviations (MSD) and the number of degrees of freedom as well as the variance components

Author(s)

Luke Pembleton <lpembleton at barenbrug.com>

References

Paradis E (2010) pegas: an R package for population genetics with an integrated-modular approach. Bioinformatics 26, 419-420. <doi:10.1093/bioinformatics/btp696>

Examples

# Import genotype data and convert to allele frequencies
data(potato.mini, package="StAMPP")
potato.freq <- stamppConvert(potato.mini, "r")
# Calculate genetic distance between individuals
potato.D.ind <- stamppNeisD(potato.freq, FALSE, "standard")
# Calculate AMOVA
stamppAmova(potato.D.ind, potato.freq, 100)

Import and Convert

Description

Imports biallelic AB formatted or allele A frequency genotype data. If the data is imported in biallelic AB format, this function also converts it to allele frequencies

Usage

stamppConvert(genotype.file, type = "csv")

Arguments

`genotype.file`	the genotype input file. This should be an R matrix object or a file path for a csv file containing the genotype data in either biallelic AB format or allele 'A' frequency format, or a genlight object containing genotype data
`type`	the type of file the genotype data is being imported from; "csv" = comma separated file, "r" = data frame in the R workspace, "genlight" = genlight object.
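
The idea behind converting biallelic AB genotypes to allele A frequencies can be sketched in base R. The helper ab_to_freq is hypothetical and not part of StAMPP; it assumes genotypes are plain strings of "A" and "B" whose length equals the ploidy, and it ignores missing data handling.

```r
# Hypothetical helper: allele A frequency = (number of A alleles) / ploidy
ab_to_freq <- function(g) {
  g <- as.character(g)
  # Count occurrences of "A" in each genotype string
  n_a <- lengths(regmatches(g, gregexpr("A", g, fixed = TRUE)))
  # The string length is taken as the ploidy of the sample
  n_a / nchar(g)
}

# Diploid and tetraploid genotypes can be mixed in one vector
freqs <- ab_to_freq(c("AA", "AB", "BB", "AAAB", "AABB", "BBBB"))
print(freqs)  # 1.00 0.50 0.00 0.75 0.50 0.00
```

Because the frequency is expressed relative to each sample's own ploidy, diploid and tetraploid samples end up on the same 0 to 1 scale.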

Value

An object of class data.frame which contains allele frequency data for use in other StAMPP functions

Author(s)

Luke Pembleton <lpembleton at barenbrug.com>

Examples

# Import example data into the R workspace
data(potato.mini, package="StAMPP")
# Convert to allele frequencies
potato.freq <- stamppConvert(potato.mini, "r")

Fst Computation

Description

This function calculates pairwise Fst values between populations, along with confidence intervals and p-values, according to the method proposed by Wright (1949) and updated by Weir and Cockerham (1984)

Usage

stamppFst(geno, nboots = 100, percent = 95, nclusters = 1)

Arguments

`geno`	a data frame containing allele frequency data generated from stamppConvert, or a genlight object containing genotype data, individual IDs, population IDs and ploidy levels
`nboots`	number of bootstraps to perform across loci to generate confidence intervals and p-values
`percent`	the percentile to calculate the confidence interval around
`nclusters`	number of processor threads or cores to use during calculations.

Details

If possible, using multiple processing threads or cores is recommended to assist in calculating Fst values over a large number of bootstraps.
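
Bootstrapping across loci can be illustrated with a short base R sketch. The per-locus vectors num and den are simulated stand-ins for among-population and total variance components, and the mapping of percent = 95 to the 2.5% and 97.5% quantiles is an assumption about how the interval is formed; none of this is taken from the StAMPP source.

```r
set.seed(42)
n_loci <- 200
# Hypothetical per-locus components: 'num' among populations, 'den' total
num <- rgamma(n_loci, shape = 2, rate = 40)
den <- num + rgamma(n_loci, shape = 2, rate = 4)

# Multi-locus estimate as a ratio of sums over the selected loci
fst_ratio <- function(idx) sum(num[idx]) / sum(den[idx])

nboots <- 100
percent <- 95
point <- fst_ratio(seq_len(n_loci))
# Resample loci with replacement and recompute the estimate each time
boots <- replicate(nboots, fst_ratio(sample.int(n_loci, replace = TRUE)))
# Percentile interval around the bootstrap distribution
alpha <- (100 - percent) / 200
ci <- quantile(boots, c(alpha, 1 - alpha), names = FALSE)
print(point)
print(ci)
```

Each bootstrap replicate is independent of the others, which is why this kind of calculation benefits from spreading replicates over several threads or cores.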

Value

A list with the following components:

`Fsts`	a matrix of pairwise Fst values between populations
`Pvalues`	a matrix of p-values for each of the pairwise Fst values contained in the 'Fsts' matrix
`Bootstraps`	a data frame of each Fst value generated during bootstrapping and the associated confidence intervals

If nboots < 2, no bootstrapping is performed and therefore only a matrix of Fst values is returned.

Author(s)

Luke Pembleton <lpembleton at barenbrug.com>

References

Wright S (1949) The Genetical Structure of Populations. Annals of Human Genetics 15, 323-354. <doi:10.1111/j.1469-1809.1949.tb02451.x>

Weir BS, Cockerham CC (1984) Estimating F-Statistics for the Analysis of Population Structure. Evolution 38, 1358-1370. <doi:10.2307/2408641>

Examples

# Import genotype data and convert to allele frequencies
data(potato.mini, package="StAMPP")
potato.freq <- stamppConvert(potato.mini, "r")
# Calculate pairwise Fst values between each population
potato.fst <- stamppFst(potato.freq, 100, 95, 1)

Genomic Relationship Calculation

Description

This function calculates a genomic relationship matrix following the method described by Yang et al. (2010)

Usage

stamppGmatrix(geno)

Arguments

`geno`	a data frame containing allele frequency data generated from stamppConvert, or a genlight object containing genotype data, individual IDs, population IDs and ploidy levels

Value

An object of class matrix which contains the genomic relationship values between each individual
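
For orientation, the diploid form of the Yang et al. (2010) estimator can be written in a few lines of base R. This is a sketch for the diploid case only, on simulated allele counts with no missing data; the function name yang_grm is hypothetical, and the way StAMPP generalises the calculation to mixed ploidy is not shown here.

```r
# Hypothetical helper: diploid genomic relationship matrix (Yang et al. 2010)
yang_grm <- function(x) {
  # x: individuals x loci matrix of allele A counts (0, 1 or 2)
  p <- colMeans(x) / 2                # allele A frequency per locus
  keep <- p > 0 & p < 1               # drop monomorphic loci (zero variance)
  x <- x[, keep, drop = FALSE]
  p <- p[keep]
  het <- 2 * p * (1 - p)              # expected genotype variance per locus
  z <- sweep(x, 2, 2 * p, "-")        # centre each locus on its mean
  w <- sweep(z, 2, sqrt(het), "/")    # scale each locus
  G <- tcrossprod(w) / ncol(x)        # off-diagonal relationships
  # Diagonal: 1 + mean of (x^2 - (1 + 2p) x + 2p^2) / (2p(1 - p))
  d <- x^2 - sweep(x, 2, 1 + 2 * p, "*")
  d <- sweep(d, 2, 2 * p^2, "+")
  diag(G) <- 1 + rowMeans(sweep(d, 2, het, "/"))
  G
}

set.seed(7)
# Simulated genotypes: 5 individuals, 50 biallelic loci
geno <- matrix(rbinom(5 * 50, size = 2, prob = 0.4), nrow = 5)
G <- yang_grm(geno)
print(round(G, 3))
```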

Author(s)

Luke Pembleton <lpembleton at barenbrug.com>

References

Yang J, Benyamin B, McEvoy BP, et al (2010) Common SNPs explain a large proportion of the heritability for human height. Nat Genet 42, 565-569. <doi:10.1038/ng.608>

Examples

# Import genotype data and convert to allele frequencies
data(potato.mini, package="StAMPP")
potato.freq <- stamppConvert(potato.mini, "r")
# Calculate genomic relationship values between each individual
potato.fst <- stamppGmatrix(potato.freq)

Genetic Distance Calculation

Description

This function calculates Nei's genetic distance (Nei 1972) between populations or individuals

Usage

stamppNeisD(geno, pop = TRUE, measure = "standard")

Arguments

`geno`	a data frame containing allele frequency data generated from stamppConvert, or a genlight object containing genotype data, individual IDs, population IDs and ploidy levels
`pop`	logical. TRUE if genetic distance should be calculated between populations, FALSE if it should be calculated between individuals
`measure`	a character string defining the distance measure to use: "standard" for Nei's standard genetic distance (1972) or "DA" for Nei's DA distance (1983).

Value

An object of class matrix which contains the genetic distance between each population or individual
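
Nei's standard genetic distance for a single pair can be sketched from allele A frequencies at biallelic loci. The helper neis_d is hypothetical and covers only the "standard" measure for complete data; it is meant to show the formula, not the package's implementation.

```r
# Hypothetical helper: Nei's (1972) standard distance, D = -ln(Jxy / sqrt(Jx * Jy))
neis_d <- function(px, py) {
  # px, py: allele A frequencies at the same biallelic loci
  jx  <- sum(px^2 + (1 - px)^2)              # homozygosity in x
  jy  <- sum(py^2 + (1 - py)^2)              # homozygosity in y
  jxy <- sum(px * py + (1 - px) * (1 - py))  # shared identity between x and y
  -log(jxy / sqrt(jx * jy))
}

pop1 <- c(0.10, 0.50, 0.90, 0.30)
pop2 <- c(0.20, 0.40, 0.70, 0.60)
d12 <- neis_d(pop1, pop2)
print(d12)
```

The distance is zero for identical frequency vectors and symmetric in its two arguments.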

Author(s)

Luke Pembleton <lpembleton at barenbrug.com>

References

Nei M (1972) Genetic Distance between Populations. The American Naturalist 106, 283-292.

Examples

# Import genotype data and convert to allele frequencies
data(potato.mini, package="StAMPP")
potato.freq <- stamppConvert(potato.mini, "r")
# Calculate genetic distance between individuals
potato.D.ind <- stamppNeisD(potato.freq, FALSE, "standard")
# Calculate genetic distance between populations
potato.D.pop <- stamppNeisD(potato.freq, TRUE, "standard")

Export to Phylip Format

Description

Converts the genetic distance matrix generated with stamppNeisD into Phylip format and exports it as a text file

Usage

stamppPhylip(distance.mat, file = "")

Arguments

`distance.mat`	the matrix containing the genetic distances generated from stamppNeisD to be converted into Phylip format
`file`	the file path and name to save the Phylip format matrix as

Details

The exported Phylip formatted text file can be easily imported into software packages such as DARWin (Perrier & Jacquemound-Collet 2006) to generate neighbour joining trees
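
For readers unfamiliar with the layout, the sketch below writes a small distance matrix in the common square PHYLIP form: a first line with the number of taxa, then one line per taxon with a name padded to 10 characters followed by its distances. The function write_phylip_square is hypothetical, and the exact spacing and precision used by stamppPhylip may differ.

```r
# Hypothetical helper: write a square PHYLIP distance matrix
write_phylip_square <- function(d, file) {
  d <- as.matrix(d)
  labels <- rownames(d)
  if (is.null(labels)) labels <- paste0("taxon", seq_len(nrow(d)))
  rows <- vapply(seq_len(nrow(d)), function(i) {
    # Names are truncated or padded to exactly 10 characters
    name <- sprintf("%-10s", substr(labels[i], 1, 10))
    paste(name, paste(sprintf("%.6f", d[i, ]), collapse = " "))
  }, character(1))
  # Header line holds the number of taxa
  writeLines(c(as.character(nrow(d)), rows), file)
  invisible(file)
}

d <- matrix(c(0.0, 0.1, 0.2,
              0.1, 0.0, 0.3,
              0.2, 0.3, 0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("popA", "popB", "popC"), NULL))
out <- tempfile(fileext = ".txt")
write_phylip_square(d, out)
phylip_lines <- readLines(out)
cat(phylip_lines, sep = "\n")
```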

Author(s)

Luke Pembleton <lpembleton at barenbrug.com>

References

Perrier X, Jacquemound-Collet JP (2006) DARWin - Dissimilarity Analysis and Representation for Windows. Agricultural Research for Development

Examples

# Import genotype data and convert to allele frequencies
data(potato.mini, package="StAMPP")
potato.freq <- stamppConvert(potato.mini, "r")
# Calculate genetic distance between populations
potato.D.pop <- stamppNeisD(potato.freq, TRUE, "standard")
# Export the genetic distance matrix in Phylip format
## Not run: stamppPhylip(potato.D.pop, file="potato_distance.txt")
