GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures


BioData Mining, 5:16

First Online: 01 October 2012; Received: 24 April 2012; Accepted: 14 September 2012


Background

Geneticists who look beyond single-locus disease associations require additional strategies for detecting complex multi-locus effects. Epistasis, a multi-locus masking effect, presents a particular challenge and has been the target of bioinformatic development. Thorough evaluation of new algorithms calls for simulation studies in which known disease models are sought. To date, the best methods for generating simulated multi-locus epistatic models rely on genetic algorithms. However, such methods are computationally expensive, difficult to adapt to multiple objectives, and unlikely to yield models with a precise form of epistasis, which we refer to as pure and strict. Purely and strictly epistatic models constitute the worst case in terms of detecting disease associations, since such associations may only be observed if all n loci are included in the disease model. This makes them an attractive gold standard for simulation studies considering complex multi-locus effects.
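To make the idea of a model with no single-locus signal concrete, the following minimal sketch checks whether a two-locus penetrance table has identical marginal penetrances at both loci. It is an illustration only, not the GAMETES algorithm. The function names, the example table (a familiar XOR-like model with both minor allele frequencies at 0.5), and the assumption of Hardy-Weinberg Equilibrium and independent loci are choices made here for the example.

```python
import math

def hwe_genotype_freqs(maf):
    """Frequencies of genotypes carrying 0, 1, 2 minor alleles under HWE."""
    p = 1.0 - maf
    return [p * p, 2 * p * maf, maf * maf]

def marginal_penetrances(table, maf_a, maf_b):
    """Single-locus penetrances obtained by averaging over the other locus.

    table[i][j] is the (hypothetical) probability of disease given i minor
    alleles at locus A and j minor alleles at locus B.
    """
    fa = hwe_genotype_freqs(maf_a)
    fb = hwe_genotype_freqs(maf_b)
    marg_a = [sum(fb[j] * table[i][j] for j in range(3)) for i in range(3)]
    marg_b = [sum(fa[i] * table[i][j] for i in range(3)) for j in range(3)]
    return marg_a, marg_b

def has_no_main_effects(table, maf_a, maf_b, tol=1e-12):
    """True when every single-locus genotype has the same disease risk."""
    marg_a, marg_b = marginal_penetrances(table, maf_a, maf_b)
    values = marg_a + marg_b
    return all(math.isclose(v, values[0], abs_tol=tol) for v in values)

# Illustrative XOR-like table (an assumption for this example, MAF = 0.5).
EXAMPLE_TABLE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

if __name__ == "__main__":
    print(marginal_penetrances(EXAMPLE_TABLE, 0.5, 0.5))
    print(has_no_main_effects(EXAMPLE_TABLE, 0.5, 0.5))
```

In this example every marginal penetrance equals 0.05, so neither locus is informative on its own and the association appears only when both loci are considered jointly. With two loci the only proper subsets are the single loci themselves, so this one check covers them all; models with more loci would also need every multi-locus subset checked.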

Results

We introduce GAMETES, a user-friendly software package and algorithm which generates complex biallelic single nucleotide polymorphism (SNP) disease models for simulation studies. GAMETES rapidly and precisely generates random, pure, strict n-locus models with specified genetic constraints. These constraints include heritability, minor allele frequencies of the SNPs, and population prevalence. GAMETES also includes a simple dataset simulation strategy which may be utilized to rapidly generate an archive of simulated datasets for given genetic models. We highlight the utility and limitations of GAMETES with an example simulation study using MDR, an algorithm designed to detect epistasis.
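As a rough illustration of how the constraints named above relate to a penetrance table, the sketch below computes population prevalence and a heritability value for a two-locus model and then draws a small dataset from it. Everything here is an assumption made for illustration: the function names, the example table, the heritability formula (variance of penetrance around K divided by K(1 - K), a commonly used definition), and the naive sampling scheme, which is not the dataset simulation strategy implemented in GAMETES.

```python
import random

def hwe_genotype_freqs(maf):
    """Frequencies of genotypes carrying 0, 1, 2 minor alleles under HWE."""
    p = 1.0 - maf
    return [p * p, 2 * p * maf, maf * maf]

def prevalence(table, maf_a, maf_b):
    """Population prevalence K: penetrance averaged over all genotypes."""
    fa, fb = hwe_genotype_freqs(maf_a), hwe_genotype_freqs(maf_b)
    return sum(fa[i] * fb[j] * table[i][j]
               for i in range(3) for j in range(3))

def heritability(table, maf_a, maf_b):
    """Assumed definition: sum_g f_g * (pen_g - K)^2 / (K * (1 - K))."""
    fa, fb = hwe_genotype_freqs(maf_a), hwe_genotype_freqs(maf_b)
    k = prevalence(table, maf_a, maf_b)
    var = sum(fa[i] * fb[j] * (table[i][j] - k) ** 2
              for i in range(3) for j in range(3))
    return var / (k * (1.0 - k))

def simulate(table, maf_a, maf_b, n, seed=0):
    """Naive sampler: draw genotypes under HWE, then disease status.

    Returns a list of (genotype_a, genotype_b, status) tuples.
    """
    rng = random.Random(seed)
    fa, fb = hwe_genotype_freqs(maf_a), hwe_genotype_freqs(maf_b)
    rows = []
    for _ in range(n):
        a = rng.choices([0, 1, 2], weights=fa)[0]
        b = rng.choices([0, 1, 2], weights=fb)[0]
        status = 1 if rng.random() < table[a][b] else 0
        rows.append((a, b, status))
    return rows

# Illustrative XOR-like table (an assumption for this example, MAF = 0.5).
EXAMPLE_TABLE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

if __name__ == "__main__":
    print("K  =", prevalence(EXAMPLE_TABLE, 0.5, 0.5))
    print("h2 =", heritability(EXAMPLE_TABLE, 0.5, 0.5))
    print(simulate(EXAMPLE_TABLE, 0.5, 0.5, n=5))
```

For this table the prevalence is 0.05 and the heritability under the assumed definition is about 0.053, which falls in the low-heritability range that simulation studies typically target.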

Conclusions

GAMETES is a fast, flexible, and precise tool for generating complex n-locus models with random architectures. While GAMETES has a limited ability to generate models with higher heritabilities, it is proficient at generating the lower heritability models typically used in simulation studies evaluating new algorithms. In addition, the GAMETES modeling strategy may be flexibly combined with any dataset simulation strategy. Beyond dataset simulation, GAMETES could be employed to pursue theoretical characterization of genetic models and epistasis.

Keywords

GAMETES, SNP, Epistasis, Simulation, Model, Genetics

Abbreviations

SNP: Single nucleotide polymorphism

EA: Evolutionary Algorithm

GAMETES: Genetic Architecture Model Emulator for Testing and Evaluating Software

MLG: Multi-locus genotype

HWE: Hardy-Weinberg Equilibrium

MAF: Minor Allele Frequency

K: Population Prevalence

Electronic supplementary material

The online version of this article (doi:10.1186/1756-0381-5-16) contains supplementary material, which is available to authorized users.


Authors: Ryan J Urbanowicz, Jeff Kiralis, Nicholas A Sinnott-Armstrong, Tamra Heberling, Jonathan M Fisher, Jason H Moore

