Datasets¶
1000 Genomes EUR (REF_G1K-EUR_0.001)¶
Location: https://d3o0p4nu4e38rq.cloudfront.net/downloads/reference/1.0/REF_G1K-EUR_0.001.tar.gz
This is a reference dataset used for MAF filtering, LD pruning and
phasing. It’s based on the data from release 3 of 1000 Genomes
Project. It includes all
biallelic SNPs with MAF > 0.001
for unrelated invidiuals from ‘EUR’
superpopulation.
VCF
: all sample genotypes (separate file per chromosome)sample.txt
: list of included EUR samplesersa-mask.tsv
: list of regions with excessive IBD (generated with ersa for this sample)plink.chrALL.GRCh37.map.gz
: genetic map (included for convenience)
TrueFamily CEU (TFCeu)¶
Location: https://d3o0p4nu4e38rq.cloudfront.net/downloads/examples/0.2/TFCeu.tar.gz
This is synthetic dataset with simulated genotypes based on unrelated
individuals from CEU population of 1000 Genomes
Project. The pedigree is
defined in g1k_ceu_family_15_2.ped
and includes 15 generations.
TF-CEU-15-2.vcf.gz
: VFC file for the simulated genotypesg1k_ceu_family_15_2.ped
: pedigreeTF-CEU-15-2.true.rel
: true relations