PGDBj - Ortholog DB

2017/03/07

Web Site: http://pgdbj.jp/pages/index.html?dir=&page=od&ln=en
HTTPS Site: https://dbarchive.biosciencedbc.jp/data/pgdbj-ortholog-db/

A database of orthologous relationships among genes, determined computationally from similarities between their amino acid sequences.

README Content

  1. Database Component
  2. Data Description
  3. License
  4. Update History
  5. Literature
  6. Contact address

1. Database Component

  1. README
  2. Protein (Viridiplantae)
  3. Protein (Cyanobacteria)
  4. Cluster (Viridiplantae)
  5. Cluster (Cyanobacteria)
  6. Taxon (Viridiplantae)
  7. Taxon (Cyanobacteria)

2. Data Description

2.1 README

Data name README
Description of data contents HTML file to describe "PGDBj - Ortholog DB" data.
File README_e.html (English)

2.2 Protein (Viridiplantae)

Data name Protein (Viridiplantae)
Description of data contents

Amino acid sequences of Viridiplantae (green plants) obtained from the NCBI Reference Sequence Database, together with their NCBI GI numbers, Reference Sequence IDs, and annotations. For each taxon, the ID of the cluster to which the amino acid sequence belongs is given.

File pgdbj_ortholog_db_viridiplantae_protein.zip (200 MB)

Data items are the following:
Data item             Description
GI number             NCBI GI number of Amino Acid sequence
RefSeq ID             NCBI Reference Sequence ID
Chr                   Chromosome number or genomic ID
Start Position        Start Position
Stop Position         Stop Position
Cluster (Kingdom)     Cluster ID (rank: Kingdom)
Cluster (Phylum)      Cluster ID (rank: Phylum)
Cluster (No rank 1)   Cluster ID (rank: No rank 1)
Cluster (No rank 2)   Cluster ID (rank: No rank 2)
Cluster (No rank 3)   Cluster ID (rank: No rank 3)
Cluster (No rank 4)   Cluster ID (rank: No rank 4)
Cluster (No rank 5)   Cluster ID (rank: No rank 5)
Cluster (No rank 6)   Cluster ID (rank: No rank 6)
Cluster (No rank 7)   Cluster ID (rank: No rank 7)
Cluster (No rank 8)   Cluster ID (rank: No rank 8)
Cluster (No rank 9)   Cluster ID (rank: No rank 9)
Cluster (No rank 10)  Cluster ID (rank: No rank 10)
Cluster (Class)       Cluster ID (rank: Class)
Cluster (No rank 11)  Cluster ID (rank: No rank 11)
Cluster (Subclass)    Cluster ID (rank: Subclass)
Cluster (No rank 12)  Cluster ID (rank: No rank 12)
Cluster (Order)       Cluster ID (rank: Order)
Cluster (Family)      Cluster ID (rank: Family)
Cluster (No rank 13)  Cluster ID (rank: No rank 13)
Cluster (Subfamily)   Cluster ID (rank: Subfamily)
Cluster (No rank 14)  Cluster ID (rank: No rank 14)
Cluster (Tribe)       Cluster ID (rank: Tribe)
Cluster (Subtribe)    Cluster ID (rank: Subtribe)
Cluster (Genus)       Cluster ID (rank: Genus)
Cluster (Subgenus)    Cluster ID (rank: Subgenus)
Cluster (Species)     Cluster ID (rank: Species)
Cluster (Subspecies)  Cluster ID (rank: Subspecies)
Cluster (Forma)       Cluster ID (rank: Forma)
Cluster (No rank 15)  Cluster ID (rank: No rank 15)
Annotation            Annotation of protein
Organism              Organism
AA sequence           Amino acid sequence
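
Because each protein record carries one cluster ID per taxonomic rank, the records can be grouped into clusters at any chosen rank. The following is a minimal sketch, assuming the records have already been parsed into dictionaries keyed by the data item names above. The on-disk layout of the file inside the zip archive is not described in this README, so parsing is left out. The function name `group_by_cluster`, the treatment of blank cells, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def group_by_cluster(rows, rank_column="Cluster (Genus)", id_column="RefSeq ID"):
    """Map each cluster ID found in rank_column to the protein IDs it contains."""
    groups = defaultdict(list)
    for row in rows:
        cluster_id = (row.get(rank_column) or "").strip()
        if not cluster_id:
            # Assumption: a blank cell means the protein has no cluster at this rank.
            continue
        groups[cluster_id].append(row[id_column])
    return dict(groups)


# Made-up rows for illustration only; these are not records from the database.
rows = [
    {"RefSeq ID": "NP_000001.1", "Cluster (Genus)": "3701.0"},
    {"RefSeq ID": "NP_000002.1", "Cluster (Genus)": "3701.0"},
    {"RefSeq ID": "NP_000003.1", "Cluster (Genus)": ""},
]
genus_clusters = group_by_cluster(rows)
```

Passing a different `rank_column`, such as "Cluster (Family)", groups the same records at a coarser rank.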

2.3 Protein (Cyanobacteria)

Data name Protein (Cyanobacteria)
Description of data contents

Amino acid sequences of Cyanobacteria (blue-green algae) obtained from the NCBI Reference Sequence Database, together with their NCBI GI numbers, Reference Sequence IDs, and annotations. For each taxon, the ID of the cluster to which the amino acid sequence belongs is given.

File pgdbj_ortholog_db_cyanobacteria_protein.zip (106 MB)

Data items are the following:
Data item             Description
GI number             NCBI GI number of Amino Acid sequence
RefSeq ID             NCBI Reference Sequence ID
Chr                   Chromosome number or genomic ID
Start Position        Start Position
Stop Position         Stop Position
Cluster (Phylum)      Cluster ID (rank: Phylum)
Cluster (Class)       Cluster ID (rank: Class)
Cluster (Order)       Cluster ID (rank: Order)
Cluster (No rank 1)   Cluster ID (rank: No rank 1)
Cluster (Family)      Cluster ID (rank: Family)
Cluster (Genus)       Cluster ID (rank: Genus)
Cluster (No rank 2)   Cluster ID (rank: No rank 2)
Cluster (Species)     Cluster ID (rank: Species)
Cluster (No rank 3)   Cluster ID (rank: No rank 3)
Cluster (Subspecies)  Cluster ID (rank: Subspecies)
Cluster (No rank 4)   Cluster ID (rank: No rank 4)
Annotation            Annotation of protein
Organism              Organism
AA sequence           Amino acid sequence

2.4 Cluster (Viridiplantae)

Data name Cluster (Viridiplantae)
Description of data contents

Clusters of amino acid sequences of Viridiplantae (green plants) obtained from the NCBI Reference Sequence Database. Following the phylogenetic tree, clusters were generated for the Viridiplantae taxon and for each of its sub-taxa from the results of all-against-all BLAST searches among the amino acid sequences. Within a given taxon, each amino acid sequence belongs to exactly one cluster.

File pgdbj_ortholog_db_viridiplantae_cluster.zip (16.6 MB)

Data items are the following:
Data item     Description
Cluster ID    The cluster ID is composed of a Taxonomy ID and a serial number beginning with “0”. For instance, “cluster ID: 33090.0” means the protein belongs to the cluster ranked 0th among the clusters in the “taxon: 33090”. This cluster ID is uniquely assigned by the PGDBj Ortholog Database.
Cluster size  Number of proteins affiliated with the Cluster
Supercluster  Next supercluster
Subcluster    Next subcluster
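
The cluster ID format described above (a Taxonomy ID, a period, and a serial number starting from 0) can be split into its two parts. The following is a minimal sketch; the names `ClusterId` and `parse_cluster_id` are hypothetical and not part of the database, and the strict validation is an assumption about what counts as a well-formed ID.

```python
from typing import NamedTuple


class ClusterId(NamedTuple):
    taxonomy_id: int  # NCBI Taxonomy ID of the taxon the cluster was built in
    serial: int       # serial number of the cluster within that taxon, from 0


def parse_cluster_id(text: str) -> ClusterId:
    """Split a cluster ID such as '33090.0' into Taxonomy ID and serial number."""
    taxon, sep, serial = text.strip().partition(".")
    if not sep or not taxon.isdecimal() or not serial.isdecimal():
        raise ValueError(f"not a cluster ID: {text!r}")
    return ClusterId(int(taxon), int(serial))


# The example ID given in the table above.
example = parse_cluster_id("33090.0")
```

Here `example.taxonomy_id` is 33090 and `example.serial` is 0.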

2.5 Cluster (Cyanobacteria)

Data name Cluster (Cyanobacteria)
Description of data contents

Clusters of amino acid sequences of Cyanobacteria (blue-green algae) obtained from the NCBI Reference Sequence Database. Following the phylogenetic tree, clusters were generated for the Cyanobacteria taxon and for each of its sub-taxa from the results of all-against-all BLAST searches among the amino acid sequences. Within a given taxon, each amino acid sequence belongs to exactly one cluster.

File pgdbj_ortholog_db_cyanobacteria_cluster.zip (8.2 MB)

Data items are the following:
Data item     Description
Cluster ID    The cluster ID is composed of a Taxonomy ID and a serial number beginning with “0”. For instance, “cluster ID: 33090.0” means the protein belongs to the cluster ranked 0th among the clusters in the “taxon: 33090”. This cluster ID is uniquely assigned by the PGDBj Ortholog Database.
Cluster size  Number of proteins affiliated with the Cluster
Supercluster  Next supercluster
Subcluster    Next subcluster
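
The Supercluster item links each cluster to the next cluster above it, so the chain from a cluster up to the top-level taxon can be followed link by link. The following is a minimal sketch, assuming the Supercluster value is itself a cluster ID and is empty for a top-level cluster; this README does not state either point. The function name `supercluster_chain` and the sample IDs are hypothetical.

```python
def supercluster_chain(cluster_id, supercluster_of):
    """Return cluster_id followed by its superclusters, nearest first."""
    chain = [cluster_id]
    seen = {cluster_id}
    while True:
        parent = supercluster_of.get(chain[-1])
        if not parent:
            # Assumption: an empty or missing value marks a top-level cluster.
            return chain
        if parent in seen:
            raise ValueError(f"cycle detected at cluster {parent}")
        chain.append(parent)
        seen.add(parent)


# Made-up links for illustration only: cluster ID -> Supercluster value.
supercluster_of = {
    "1117.0": "",
    "1118.2": "1117.0",
    "1129.5": "1118.2",
}
chain = supercluster_chain("1129.5", supercluster_of)
```

The cycle check is defensive only; a well-formed hierarchy should never trigger it.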

2.6 Taxon (Viridiplantae)

Data name Taxon (Viridiplantae)
Description of data contents

Phylogenetic relationships among the recursively generated clusters in "Cluster (Viridiplantae)."

File pgdbj_ortholog_db_viridiplantae_taxon.zip (4.0 KB)

Data items are the following:
Data item           Description
Taxonomy name       NCBI Taxonomy name
Taxonomy ID         NCBI Taxonomy ID
Taxonomy rank       NCBI Taxonomy rank
Number of clusters  Number of clusters affiliated with the Taxon
Number of proteins  Number of proteins affiliated with the Taxon
Higher taxon        Next suprageneric taxon
Lower taxon         Next infrageneric taxon
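
The Higher taxon item is enough to rebuild the taxon hierarchy as a tree. The following is a minimal sketch, assuming the records have been parsed into dictionaries keyed by the data item names above and that Higher taxon holds the parent's Taxonomy ID, empty for the root; this README does not confirm how the value is encoded. The function names `children_index` and `render` and the sample rows are hypothetical.

```python
from collections import defaultdict


def children_index(taxa):
    """Map each parent Taxonomy ID (None for the root level) to its child IDs."""
    children = defaultdict(list)
    for taxon in taxa:
        # Assumption: an empty "Higher taxon" value marks the root taxon.
        parent = taxon.get("Higher taxon") or None
        children[parent].append(taxon["Taxonomy ID"])
    return children


def render(children, root=None, depth=0):
    """Return the hierarchy as indented lines, parents before children."""
    lines = []
    for taxon_id in children.get(root, []):
        lines.append("  " * depth + taxon_id)
        lines.extend(render(children, taxon_id, depth + 1))
    return lines


# Illustrative rows only; they are not copied from the taxon file.
taxa = [
    {"Taxonomy ID": "33090", "Higher taxon": ""},
    {"Taxonomy ID": "35493", "Higher taxon": "33090"},
    {"Taxonomy ID": "3041", "Higher taxon": "33090"},
]
outline = render(children_index(taxa))
```

Joining `outline` with newlines gives an indented view of the taxa in input order.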

2.7 Taxon (Cyanobacteria)

Data name Taxon (Cyanobacteria)
Description of data contents

Phylogenetic relationships among the recursively generated clusters in "Cluster (Cyanobacteria)."

File pgdbj_ortholog_db_66_cyanobacteria_taxon.zip (7.4 KB)

Data items are the following:
Data item           Description
Taxonomy name       NCBI Taxonomy name
Taxonomy ID         NCBI Taxonomy ID
Taxonomy rank       NCBI Taxonomy rank
Number of clusters  Number of clusters affiliated with the Taxon
Number of proteins  Number of proteins affiliated with the Taxon
Higher taxon        Next suprageneric taxon
Lower taxon         Next infrageneric taxon

3. License

Last updated : 2017/03/07

You may use this database in compliance with the terms and conditions of the license described below. The license specifies the license terms regarding the use of this database and the requirements you must follow in using this database.

 

Creative Commons License

This database is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
If you use data from this database, please be sure to attribute this database as follows: "PGDBj - Ortholog DB © Akihiro Nakaya (Osaka University) licensed under CC Attribution-Share Alike 4.0 International".

The summary of the Creative Commons Attribution-Share Alike 4.0 International is found here.

With regard to this database, you are licensed to:

  1. freely access part or whole of this database, and acquire data;
  2. freely redistribute part or whole of the data from this database; and
  3. freely create and distribute database and other adapted materials based on part or whole of the data from this database,

under the license, as long as you comply with the following conditions:

  1. You must attribute this database in the manner specified by the author or licensor when distributing part or whole of this database or any adapted material.
  2. You must distribute any adapted material based on part or whole of the data from this database under CC Attribution-Share Alike 4.0 (or later), or CC Attribution-Share Alike Compatible License (the list is here).
  3. You need to contact the Licensor shown below to request a license for use of this database or any part thereof not licensed under the license.

Department of Genome Informatics, Graduate School of Medicine, Osaka University
2-2 Yamadaoka, Suita, Osaka 565-0871, JAPAN
E-mail: pgdbj[at]kazusa[dot]or[dot]jp / nakaya[at]gi[dot]med[dot]osaka-u[dot]ac[dot]jp

About Providing Links to This Database

You may freely link to any content in this database. Note, however, that the contents may change without notice.


4. Update History

Date        Update contents
2017/03/07  PGDBj Ortholog DB (Release66 ver.) was released as Archive V2.
            The data were updated.
2016/07/29  Database Description page was updated.
              • The URL of the Whole data download
              • The URL of The original website information
2014/05/12  PGDBj Ortholog DB (Release57 ver.) English archive site was opened.
            (Archive V1)
2012/08/01  PGDBj Ortholog DB (http://pgdbj.jp/ortholog-db.html) was opened.

5. Literature

Erika Asamizu, Hisako Ichihara, Akihiro Nakaya, Yasukazu Nakamura, Hideki Hirakawa, Takahiro Ishii, Takuro Tamura, Kaoru Fukami-Kobayashi, Yukari Nakajima and Satoshi Tabata
Plant Genome DataBase Japan (PGDBj): A Portal Website for the Integration of Plant Genome-Related Databases
Plant Cell Physiol (2014) 55 (1): e8.
PMID: 24363285

Akihiro Nakaya, Hisako Ichihara, Erika Asamizu, Shirasawa Sachiko, Yasukazu Nakamura, Satoshi Tabata and Hideki Hirakawa
Plant Genome DataBase Japan (PGDBj).
Methods Mol Biol (2017) 1533: 45-77.
PMID: 27987164


6. Contact address

If you have any questions about "PGDBj - Ortholog DB", please contact:

Department of Genome Informatics, Graduate School of Medicine, Osaka University
2-2 Yamadaoka, Suita, Osaka 565-0871, JAPAN
E-mail: pgdbj[at]kazusa[dot]or[dot]jp / nakaya[at]gi[dot]med[dot]osaka-u[dot]ac[dot]jp
