This database presents true haplotypes and LD structures of the Japanese genome, determined using DNA samples obtained from complete hydatidiform moles (CHMs).
Data name | README |
---|---|
Description of data contents | HTML file to describe "D-HaploDB" data. |
File | README_e.html (English) |
Data name | SNP List (Phase II) |
---|---|
Description of data contents | A list of SNPs in D2 (Phase II). SNP genotypes from D1 (Phase I, Perlegen 281K SNPs) and those determined with the Affymetrix 500K Array for the 74 overlapping CHM samples were merged and quality-controlled. LD bins were then determined. |
File | dhaplo_d2_snp_list.zip (16.6MB) |
Data item | Description |
---|---|
RefSNP ID | RefSNP ID (rs number) given by dbSNP. (Linked to dbSNP in Quick Search) |
Affy/Perlegen ID | SNP ID given by Affymetrix or Perlegen |
Chromosome | Chromosome on which each SNP resides |
Position | Chromosomal nucleotide position (NCBI Build 35) of each SNP |
Alleles | Alleles |
MAF | Minor allele frequency |
Genotypes | Genotypes for the 74 CHM samples |
LD bin | Name of LD bin (Linked to LD bin list in Quick Search) |
tagSNP | Flag indicating whether the SNP is a tagSNP. 1: tagSNP; 0: non-tagSNP; -: SNP not included in the LD bin calculation (MAF < 0.05) |
Best tagSNP | Flag indicating whether the SNP is the best tagSNP. 1: best tagSNP; 0: non-best tagSNP; -: SNP not included in the LD bin calculation (MAF < 0.05) |
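
The three-valued tagSNP and Best tagSNP flags described above can be decoded as in the following minimal sketch. The function name `decode_tag_flag` is hypothetical, and the sketch assumes the flag has already been extracted from a row as a string; the delimiter and column order of the downloaded file are not specified here.

```python
from typing import Optional


def decode_tag_flag(flag: str) -> Optional[bool]:
    """Decode a tagSNP / Best tagSNP flag (hypothetical helper).

    "1" -> True  (tagSNP or best tagSNP)
    "0" -> False (not a tagSNP / not the best tagSNP)
    "-" -> None  (SNP excluded from the LD bin calculation, MAF < 0.05)
    """
    flag = flag.strip()
    if flag == "1":
        return True
    if flag == "0":
        return False
    if flag == "-":
        return None
    raise ValueError(f"unexpected flag value: {flag!r}")
```

Returning `None` for "-" keeps excluded SNPs distinct from SNPs that were evaluated but not selected as tagSNPs.
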
Data name | SNP List (Phase III) |
---|---|
Description of data contents | A list of SNPs in D3 (Phase III). The data are essentially the same as those described in Kukita et al. (2010), but include two additional samples (CHM010 and CHM035): although these were excluded at the QC steps in the previous report, their genotype data were judged to be acceptable. LD bins were then determined. |
File | dhaplo_d3_snp_list.zip (23.9MB) |
Data item | Description |
---|---|
RefSNP ID | RefSNP ID (rs number) given by dbSNP. (Linked to dbSNP in Quick Search) |
Affy/Perlegen ID | SNP ID given by Affymetrix |
Chromosome | Chromosome on which each SNP resides |
Position | Chromosomal nucleotide position (NCBI Build 36) of each SNP |
Alleles | Alleles |
MAF | Minor allele frequency |
Genotypes | Genotypes for the 87 CHM samples |
LD bin | Name of LD bin (Linked to LD bin list in Quick Search) |
tagSNP | Flag indicating whether the SNP is a tagSNP. 1: tagSNP; 0: non-tagSNP; -: SNP not included in the LD bin calculation (MAF < 0.05) |
Best tagSNP | Flag indicating whether the SNP is the best tagSNP. 1: best tagSNP; 0: non-best tagSNP; -: SNP not included in the LD bin calculation (MAF < 0.05) |
Data name | LD bin list (Phase II) |
---|---|
Description of data contents | LD bin list of D2 (Phase II). An LD bin is a group of SNPs that mutually show high LD (r2 > 0.8). See below for details. |
File | dhaplo_d2_ld_bin_list.zip (3.0MB) |
Data item | Description |
---|---|
LD bin | Name of LD bin |
Chromosome | Chromosome on which each LD bin resides (Chr1 - Chr22, ChrX) |
Position Start | Start position of LD bin (nucleotide position in each chromosome, according to NCBI Build 35) |
Position End | End position of LD bin (nucleotide position in each chromosome, according to NCBI Build 35) |
SNPs Count | Number of SNPs in LD bin |
tagSNPs Count | Number of tagSNPs in LD bin |
Best tagSNP | tagSNP that showed the highest mean r2, given by RefSNP ID (rs number) |
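
To make the r2 > 0.8 criterion concrete, the sketch below computes pairwise r2 between two biallelic SNPs using the standard definition r2 = D^2 / (pA(1 - pA) pB(1 - pB)), with D = pAB - pA pB. It assumes each CHM sample contributes a single haplotype, so alleles can be counted directly without phasing. The function name and the one-character-per-sample allele representation are assumptions made for illustration, missing genotypes are not handled, and this is not necessarily the procedure used to build the database.

```python
from typing import Sequence


def r_squared(snp_a: Sequence[str], snp_b: Sequence[str]) -> float:
    """Return r^2 between two biallelic SNPs given one allele per sample."""
    if len(snp_a) != len(snp_b) or len(snp_a) == 0:
        raise ValueError("both SNPs need the same non-zero number of samples")
    n = len(snp_a)
    # Pick an arbitrary reference allele at each SNP; r^2 does not depend on the choice.
    ref_a, ref_b = snp_a[0], snp_b[0]
    p_a = sum(a == ref_a for a in snp_a) / n
    p_b = sum(b == ref_b for b in snp_b) / n
    p_ab = sum(a == ref_a and b == ref_b for a, b in zip(snp_a, snp_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom
```

Under the definition given above, two SNPs would belong to the same bin only if this value exceeds 0.8 for the pair.
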
Data name | LD bin list (Phase III) |
---|---|
Description of data contents | LD bin list of D3 (Phase III). An LD bin is a group of SNPs that mutually show high LD (r2 > 0.8). See below for details. |
File | dhaplo_d3_ld_bin_list.zip (3.1MB) |
Data item | Description |
---|---|
LD bin | Name of LD bin |
Chromosome | Chromosome on which each LD bin resides (Chr1 - Chr22, ChrX) |
Position Start | Start position of LD bin (nucleotide position in each chromosome, according to NCBI Build 36) |
Position End | End position of LD bin (nucleotide position in each chromosome, according to NCBI Build 36) |
SNPs Count | Number of SNPs in LD bin |
tagSNPs Count | Number of tagSNPs in LD bin |
Best tagSNP | tagSNP that showed the highest mean r2, given by RefSNP ID (rs number) (Link to dbSNP available in quick search) |
Data name | Genotype Data (Phase II) |
---|---|
Description of data contents | A list of SNP genotypes in D2 (Phase II). SNP genotypes from D1 (Phase I, Perlegen 281K SNPs) and those determined with the Affymetrix 500K Array for the 74 overlapping CHM samples were merged and quality-controlled. |
File | mole_info_DhaploD2.txt.gz (13.7MB) |
Data Item | Description |
---|---|
rs | RefSNP accession ID (rs number) |
chr | Chromosome on which the SNP resides (1 - 22, X) |
pos | Nucleotide position of the SNP on the chromosome |
allele1 | allele 1 |
allele2 | allele 2 |
gtype | genotypes of 74 samples of CHMs |
Data name | Genotype Data (Phase III) |
---|---|
Description of data contents | Genotype data (876K SNPs, 87 samples). Essentially the same as described in Kukita et al. (2010), except that two additional samples (CHM010 and CHM035) are included. No CNV information is included in the download data. |
File | mole_info_DhaploD3.txt.gz (23.8MB) |
Data Item | Description |
---|---|
chr | Chromosome number (1-22,X) |
sample | Sample (CHM) name |
rs | RefSNP accession ID (rs number) |
pos | Nucleotide position on chromosome |
allele1 | allele 1 |
allele2 | allele 2 |
gtype | genotypes of 87 CHM samples |
ss | Unique ID, given by Affymetrix |
Data name | LD_bin Data (Phase II) |
---|---|
Description of data contents | Results of LD bin calculations for the D2 (Phase II) data set. The file is in GFF format and contains two kinds of lines, distinguished by column #3. |
File | bin_2R80M5.gff.gz (12.1MB) |
Column Number | Definition in GFF format | Description |
---|---|---|
#1 | seqname | Chromosome on which the SNP resides (e.g. Chr1) |
#2 | source | name of dataset (e.g. CHM_2R80M5Z) |
#3 | feature | Type of line: SNP information (LD_BIN) or LD bin boundary (LD_BIN_BOUNDARIES) |
#4 | start | Chromosomal position of SNP or start position of bin (NCBI Build 35) |
#5 | end | Chromosomal position of SNP or end position of bin |
#6 | score | LD_BIN line: 2 for the best tagSNP, 1 for a tagSNP, and 0 for any other SNP. LD_BIN_BOUNDARIES line: always "." |
#7 | strand | always "+" |
#8 | frame | always "." |
#9 | attributes | This column contains the following items. |
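
The nine-column layout above can be read with a small parser such as the following sketch. It assumes tab-separated fields, as is usual for GFF, and an integer score on LD_BIN lines. The names `GffRecord` and `parse_gff_line` are hypothetical, and column 9 is kept as a raw string because its items are not listed here.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str          # chromosome, e.g. "Chr1"
    source: str           # dataset name
    feature: str          # "LD_BIN" or "LD_BIN_BOUNDARIES"
    start: int
    end: int
    score: Optional[int]  # None when the column holds "."
    strand: str
    frame: str
    attributes: str       # left unparsed (contents not documented here)


def parse_gff_line(line: str) -> GffRecord:
    """Split one GFF line into its nine columns (assumes tab separation)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attributes = fields
    return GffRecord(
        seqname, source, feature, int(start), int(end),
        None if score == "." else int(score),
        strand, frame, attributes,
    )
```

Checking `record.feature` then separates per-SNP lines (`LD_BIN`) from bin boundary lines (`LD_BIN_BOUNDARIES`).
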
Data name | LD_bin Data (Phase III) |
---|---|
Description of data contents | Results of LD bin calculations for the D3 (Phase III) data set. The file is in GFF format and contains two kinds of lines, distinguished by column #3. |
File | bin_3R80M5Zb36.gff.gz (12.8MB) |
Column Number | Definition in GFF format | Description |
---|---|---|
#1 | seqname | Chromosome on which the SNP resides (e.g. Chr1) |
#2 | source | name of dataset (e.g. CHM_3R80M5Z) |
#3 | feature | Type of line: SNP information (LD_BIN) or LD bin boundary (LD_BIN_BOUNDARIES) |
#4 | start | Chromosomal position of SNP or start position of bin (NCBI Build 36) |
#5 | end | Chromosomal position of SNP or end position of bin |
#6 | score | LD_BIN line: 2 for the best tagSNP, 1 for a tagSNP, and 0 for any other SNP. LD_BIN_BOUNDARIES line: always "." |
#7 | strand | always "+" |
#8 | frame | always "." |
#9 | attributes | This column contains the following items. |
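
Because the file is distributed gzip-compressed, it can be streamed without unpacking it first. The sketch below tallies lines by feature type and, for LD_BIN lines, by the score classes listed above. It assumes tab-separated columns; the function names and the label strings are hypothetical.

```python
import gzip
from collections import Counter
from typing import Iterable

# Score classes for LD_BIN lines, as described in the column table.
SCORE_LABELS = {"2": "best tagSNP", "1": "tagSNP", "0": "other SNP"}


def tally_gff_lines(lines: Iterable[str]) -> Counter:
    """Count bin boundary lines and LD_BIN lines per score class."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comment lines
        fields = line.rstrip("\n").split("\t")
        feature, score = fields[2], fields[5]
        if feature == "LD_BIN_BOUNDARIES":
            counts["bin boundary"] += 1
        elif feature == "LD_BIN":
            counts[SCORE_LABELS.get(score, "unknown score")] += 1
        else:
            counts["unknown feature"] += 1
    return counts


def tally_gff_file(path: str) -> Counter:
    """Stream a gzip-compressed GFF file in text mode and tally its lines."""
    with gzip.open(path, "rt") as handle:
        return tally_gff_lines(handle)
```

For example, `tally_gff_file("bin_3R80M5Zb36.gff.gz")` would return the counts for the Phase III file once it has been downloaded to the working directory.
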
You may use this database in compliance with the terms and conditions of the license described below. The license specifies the license terms regarding the use of this database and the requirements you must follow in using this database.
Tomoko Tahira
Kinjo Gakuin University
E-mail: ttahira[at]kinjo-u[dot]ac[dot]jp
Date | Update contents |
---|---|
2016/12/13 | Description of the original site is updated. |
2011/09/22 | D-HaploDB English archive site is opened. |
2005/07/20 | D-HaploDB (http://orca.gen.kyushu-u.ac.jp/) is released. |