| Data name | Database name | DOI | Description of data contents | Data file | Simple search URL | Data acquisition method | Data analysis method | Number of data entries | Data detail |
|---|---|---|---|---|---|---|---|---|---|
| Designation of organism group | Gclust Server | 10.18908/lsdba.nbdc00464-007 | Defines how the 95 organism species are grouped. The first line specifies the number of organism species, and the final line contains "//END". Lines starting with "#" are comments. The data are provided as a tab-delimited text file. | grp_def1 (1KB) | - | - | - | - | Data detail |
| Parameters for Organism Grouping | Gclust Server | 10.18908/lsdba.nbdc00464-008 | Specifies, for each organism group, the threshold ratio of species in the group that must show homology for a cluster to be allocated to that group. For example, with a value of 0.5, a cluster is assigned to the "Plants" group if it contains sequences from four or more of the seven species in that group. | pat_def1 (1KB) | - | - | - | - | Data detail |
| Prefix list for each organism | Gclust Server | 10.18908/lsdba.nbdc00464-006 | List of the organism prefixes used in Gclust. Each prefix is prepended to the sequence IDs of the corresponding organism. The first line specifies the number of organism species (95). From the second line onward, one prefix is listed per line, and the last line contains "//END". | prefix_all95 (1KB) | - | - | - | - | Data detail |
| Proteins in similarity relationship with the cluster | Gclust Server | 10.18908/lsdba.nbdc00464-003 | Protein sequences that are similar to a clustered sequence from the 95 organism species but are not themselves clustered. The data are provided as a CSV text file. | gclust_related.zip (69MB) | - | Sequence data described in "Amino acid sequences of predicted proteins and their annotation for 95 organism species". | - | 14,444,047 entries | Data detail |
| Sequence ID and annotation information | Gclust Server | 10.18908/lsdba.nbdc00464-005 | A tab-delimited text file listing the ID, length, and annotation of the amino acid sequences of the predicted proteins for the 95 organism species. | all95.p.table.zip (7.28MB) | - | - | - | - | Data detail |
List of Data Metadata
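
The file layouts described in the table can be illustrated with a short Python sketch. It is not taken from Gclust itself. It assumes a prefix list laid out as stated above (species count on the first line, one prefix per line, then "//END") and reads the grouping threshold as "present species divided by group size is at least the threshold", which matches the four-of-seven example for a value of 0.5. The function names, the sample prefixes, and the species labels are hypothetical.

```python
def parse_prefix_list(text):
    """Parse a prefix list: count on line 1, one prefix per line, then //END.

    Assumption: blank lines are ignored; no other syntax is handled.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty prefix list")
    declared = int(lines[0])  # first line: number of organism species
    prefixes = []
    for ln in lines[1:]:
        if ln == "//END":
            break
        prefixes.append(ln)
    else:
        # The loop finished without meeting the terminator line.
        raise ValueError("missing //END terminator")
    if len(prefixes) != declared:
        raise ValueError(f"expected {declared} prefixes, found {len(prefixes)}")
    return prefixes


def belongs_to_group(species_in_cluster, group_species, threshold):
    """Return True if enough of the group's species occur in the cluster.

    Assumption: the comparison is 'ratio >= threshold'; the source only
    gives the example of 4 of 7 species for a threshold of 0.5.
    """
    present = len(set(species_in_cluster) & set(group_species))
    return present / len(group_species) >= threshold


# Hypothetical three-species prefix list in the described layout.
sample = "3\nATH\nOSA\nPPA\n//END\n"
print(parse_prefix_list(sample))

# Hypothetical "Plants" group of seven species.
plants = [f"plant{i}" for i in range(1, 8)]
print(belongs_to_group(plants[:4], plants, 0.5))  # 4 of 7 -> True
print(belongs_to_group(plants[:3], plants, 0.5))  # 3 of 7 -> False
```

Whether the real threshold comparison is inclusive cannot be determined from the description alone; for a seven-species group and a value of 0.5, both readings give the same result.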