Cluster (Cyanobacteria)

Data description

Data name

Cluster (Cyanobacteria)

DOI

10.18908/lsdba.nbdc01194-02-005.V002

Description of data contents

Clusters of amino acid sequences of Cyanobacteria (blue-green algae) obtained from the NCBI Reference Sequence Database. Clusters were generated along a phylogenetic tree, for the Cyanobacteria taxon and for each of its sub-taxa, using the results of all-against-all BLAST searches among the amino acid sequences. Within a given taxon, each amino acid sequence belongs to exactly one cluster.

Data file

File name :

pgdbj_ortholog_db_cyanobacteria_cluster.zip

File URL :

https://dbarchive.biosciencedbc.jp/data/pgdbj-ortholog-db/LATEST/pgdbj_ortholog_db_cyanobacteria_cluster.zip

File size :

8.2 MB

Simple search URL

http://togodb.biosciencedbc.jp/togodb/view/pgdbj_ortholog_db_cyanobacteria_cluster#en

Data acquisition method

The data in "Protein (Cyanobacteria)" were used.

Data analysis method

Along a phylogenetic tree obtained from the NCBI Taxonomy Database, clusters in lower taxa (subclusters) were recursively aggregated to form clusters in a taxon (superclusters).
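
The recursive aggregation described above can be illustrated with a minimal sketch. The merge rule used here is an assumption: the source does not state how subclusters are combined, so this sketch simply joins two subclusters whenever any pair of their sequences is linked (for example by a BLAST hit). The names `aggregate`, `children`, `leaf_clusters` and `linked` are hypothetical and are not part of the PGDBj Ortholog Database.

```python
from collections import defaultdict


def aggregate(taxon, children, leaf_clusters, linked):
    """Return the clusters of `taxon` as a list of frozensets of sequence IDs.

    children:      dict mapping a taxon to its child taxa (the tree)
    leaf_clusters: dict mapping a leaf taxon to its initial clusters
    linked:        iterable of (seq_a, seq_b) pairs treated as related
                   (ASSUMPTION: stands in for the unstated merge criterion)
    """
    kids = children.get(taxon, [])
    if not kids:
        # Leaf taxon: its clusters are taken as given.
        return [frozenset(c) for c in leaf_clusters.get(taxon, [])]

    # Recurse first: collect the subclusters of every lower taxon.
    subclusters = [
        c for kid in kids
        for c in aggregate(kid, children, leaf_clusters, linked)
    ]

    # Union-find over subcluster indices.
    root = list(range(len(subclusters)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    # Each sequence sits in exactly one subcluster, so this map is well defined.
    owner = {seq: i for i, c in enumerate(subclusters) for seq in c}
    for a, b in linked:
        if a in owner and b in owner:
            ra, rb = find(owner[a]), find(owner[b])
            if ra != rb:
                root[rb] = ra

    # Each group of joined subclusters becomes one supercluster.
    merged = defaultdict(set)
    for i, c in enumerate(subclusters):
        merged[find(i)] |= c
    return [frozenset(s) for s in merged.values()]
```

Because subclusters are only ever joined, never split, every sequence still belongs to exactly one cluster in each taxon, which matches the property stated in the data description.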

Number of data entries

1,095,715 entries

Data detail

Data item	Description
Cluster ID	The cluster ID consists of a Taxonomy ID and a serial number starting from “0”, separated by a period. For instance, “cluster ID: 33090.0” means that the protein belongs to the cluster ranked 0th among the clusters in “taxon: 33090”. Cluster IDs are uniquely assigned by the PGDBj Ortholog Database.
Cluster size	Number of proteins belonging to the cluster
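Supercluster	The next supercluster, i.e. the cluster in a higher taxon into which this cluster was aggregated
Subcluster	The next subcluster, i.e. a cluster in a lower taxon that was aggregated into this cluster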
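
As an illustration of the cluster ID format described in the table, the following minimal sketch splits an ID such as “33090.0” into its Taxonomy ID and serial number. The names `ClusterID` and `parse_cluster_id` are hypothetical helpers introduced for this example only; they are not provided by the database.

```python
from typing import NamedTuple


class ClusterID(NamedTuple):
    taxonomy_id: int  # NCBI Taxonomy ID of the taxon
    serial: int       # serial number within that taxon, starting from 0


def parse_cluster_id(text: str) -> ClusterID:
    """Split a cluster ID of the form '<taxonomy id>.<serial>'."""
    taxon, sep, serial = text.strip().partition(".")
    if not sep or not taxon.isdecimal() or not serial.isdecimal():
        raise ValueError(f"not a valid cluster ID: {text!r}")
    return ClusterID(int(taxon), int(serial))


cid = parse_cluster_id("33090.0")
print(cid.taxonomy_id, cid.serial)
```

Keeping the two parts as integers makes it straightforward to group clusters by taxon or to sort them by serial number.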