About This Database

Gene Name Thesaurus

Data description
Data name
Gene Name Thesaurus
DOI
10.18908/lsdba.nbdc00966-001
Description of data contents
Curators with expertise in biological research edit the gene names found in various databases and articles to show the associations between them.
Data file
File name: dictionary.zip
File URL:
File size: 4.6 MB
Simple search URL
http://togodb.biosciencedbc.jp/togodb/view/lsdb_gene_thesaurus#en
Data acquisition method

We extracted synonyms described in databases such as Entrez Gene, Swiss-Prot and HGNC.

Data analysis method

1. Collect gene names automatically from synonym information fields in various gene/genome databases.
2. Curators with expertise in biological research confirm the name variants for each gene and associate them. They also delete names that cannot be associated unambiguously (polysemous names, acronyms shared by different genes, etc.).
3. Extract words that describe gene names from MEDLINE abstracts and collect names not yet registered.
4. Evaluate how well the dictionary detects gene names.
5. Add undetected words to the dictionary and repeat steps 4-5 using another literature set.
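
The steps above can be sketched in Python as follows. This is only a minimal illustration, not the curators' actual procedure: the manual review in step 2 is replaced here by a simple automatic rule (drop any name that maps to more than one gene), and the function names, gene IDs, and sample mentions are all hypothetical.

```python
from collections import defaultdict


def build_thesaurus(records):
    """Group names by gene and drop names shared by several genes.

    records: iterable of (gene_id, name) pairs, as might be collected
    from the synonym fields of source databases (step 1).
    The ambiguity rule below is an assumed stand-in for manual curation (step 2).
    """
    names_by_gene = defaultdict(set)
    genes_by_name = defaultdict(set)
    for gene_id, name in records:
        key = name.strip().lower()  # case-insensitive comparison (an assumption)
        names_by_gene[gene_id].add(key)
        genes_by_name[key].add(gene_id)
    ambiguous = {n for n, genes in genes_by_name.items() if len(genes) > 1}
    thesaurus = {g: names - ambiguous for g, names in names_by_gene.items()}
    return thesaurus, ambiguous


def detection_recall(thesaurus, annotated_mentions):
    """Fraction of annotated mentions found in the dictionary (step 4),
    plus the sorted list of missed names to consider adding (step 5)."""
    known = set().union(*thesaurus.values())
    mentions = [m.strip().lower() for m in annotated_mentions]
    found = sum(1 for m in mentions if m in known)
    missed = sorted({m for m in mentions if m not in known})
    recall = found / len(mentions) if mentions else 0.0
    return recall, missed


# Hypothetical input: "CAT" is shared by two genes and is therefore removed.
records = [
    ("GENE_A", "p53"), ("GENE_A", "TP53"),
    ("GENE_B", "CAT"), ("GENE_B", "catalase"),
    ("GENE_C", "CAT"),
]
thesaurus, ambiguous = build_thesaurus(records)
recall, missed = detection_recall(thesaurus, ["TP53", "catalase", "trp53"])
```

In this toy run, the missed list is what would be reviewed and added before repeating the evaluation on a different literature set.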

Number of data entries

Gene family: 12,110 genes; 27,923 names
Human: 27,959 genes; 145,623 names
Mouse: 48,545 genes; 173,375 names
Rat: 17,319 genes; 61,801 names
Zebrafish: 24,230 genes; 60,270 names
Fruit fly: 30,708 genes; 96,934 names
Nematode: 25,304 genes; 96,220 names
Budding yeast: 7,359 genes; 29,533 names
Fission yeast: 7,943 genes; 15,431 names
Bacillus subtilis: 4,206 genes; 14,816 names

Data detail
Data item: Description
SWISS-PROT_ID: ID in SWISS-PROT
EntrezGene_ID: ID in EntrezGene
Other_ID: ID in a database other than the above
Gene_Name: Gene name
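
As a rough sketch of how these four data items might be used, the Python snippet below groups gene names by EntrezGene_ID. The layout of the files inside dictionary.zip is not described here, so the tab-delimited format with a header row is an assumption, and the sample rows are made-up placeholders rather than real entries.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only; the real file layout may differ.
SAMPLE = (
    "SWISS-PROT_ID\tEntrezGene_ID\tOther_ID\tGene_Name\n"
    "SP_EXAMPLE_1\t1001\t\texample gene alpha\n"
    "SP_EXAMPLE_1\t1001\t\tEGA\n"
    "\t1002\tDB_EXAMPLE:7\texample gene beta\n"
)


def names_by_entrez_id(handle):
    """Collect every Gene_Name listed under each EntrezGene_ID.

    Assumes a tab-delimited table whose header matches the data items above.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["EntrezGene_ID"]].append(row["Gene_Name"])
    return dict(grouped)


grouped = names_by_entrez_id(io.StringIO(SAMPLE))
```

Keying on EntrezGene_ID is just one choice; the same grouping could be done on SWISS-PROT_ID or Other_ID when the Entrez field is empty.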