About This Database

EST sequences and their annotation (amino acid sequence and results of homology search)

Data description
Data name
EST sequences and their annotation (amino acid sequence and results of homology search)
DOI
10.18908/lsdba.nbdc00419-001
Description of data contents
Sequences of cDNA clones of Dictyostelium discoideum and their annotations: amino acid sequences and homology search results (target DBs: dicty EST-DB, DNA-DB and protein-DB). Links are provided to the Atlas database (http://dictycdb.biol.tsukuba.ac.jp/~tools/bin/ISH/index.html), which contains images depicting the localization of clones in Dictyostelium discoideum, to the National BioResource Project (http://www.nbrp.jp/) and to dictyBase (http://dictybase.org/). A link to the table of contigs containing the EST sequences is also provided.
For each clone, four sequence categories are defined: 5' EST sequence, 3' EST sequence, 5' EST-3' EST-ligated sequence and full-length cDNA sequence. If both the 5' EST sequence and the 3' EST sequence are available, they are joined into a single ligated sequence in which the 5' EST sequence and the 3' EST sequence are separated by ten hyphens. If the two sequences overlap, the sequence obtained by merging them at the overlap is treated as the full-length sequence. For some clones whose EST sequences do not overlap, the full-length sequence was obtained by sequencing the gap region. The letters F, Z, P and E are appended to the Clone ID to identify the 5' EST sequence, 3' EST sequence, 5' EST-3' EST-ligated sequence and full-length cDNA sequence, respectively. For each clone, these sequences are stored on a single line.
One of these sequences is selected as the representative sequence, in the priority order full-length cDNA sequence, 5' EST-3' EST-ligated sequence, 5' EST sequence, 3' EST sequence, and is used for the BLAST-based homology search. The representative sequence is searched with blastn against the clone sequences of dicty_cDB and against DNA sequences in the public database, and with blastx against protein sequences in the public database; the top 10 hits of each search are stored.
The data are provided as a CSV-format text file.
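The ID suffix convention and the priority rule for the representative sequence described above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the dictionary layout and the clone ID "ABC123" are made up for the example and are not part of the database or its file format.

```python
# Suffix letters and their meaning, as stated in the description above.
SUFFIX_TO_CATEGORY = {
    "F": "5' EST sequence",
    "Z": "3' EST sequence",
    "P": "5' EST-3' EST-ligated sequence",
    "E": "full-length cDNA sequence",
}

# Priority order for the representative sequence:
# full-length, then ligated, then 5' EST, then 3' EST.
PRIORITY = ("E", "P", "F", "Z")


def split_sequence_id(seq_id):
    """Split a sequence ID into (clone_id, category) using its last letter.

    Hypothetical helper: assumes the ID is the Clone ID plus one suffix letter.
    """
    if len(seq_id) < 2 or seq_id[-1] not in SUFFIX_TO_CATEGORY:
        raise ValueError(f"unrecognized sequence ID: {seq_id!r}")
    return seq_id[:-1], SUFFIX_TO_CATEGORY[seq_id[-1]]


def pick_representative(clone_id, sequences):
    """Return (sequence_id, sequence) for the highest-priority available sequence.

    `sequences` maps a suffix letter to a sequence string; missing or empty
    entries are treated as unavailable. Returns None if nothing is available.
    """
    for suffix in PRIORITY:
        seq = sequences.get(suffix)
        if seq:
            return clone_id + suffix, seq
    return None
```

For a clone that has only the two EST reads, `pick_representative("ABC123", {"F": "ACGT", "Z": "TTGA"})` falls through the first two priorities and returns the 5' EST entry.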
Data file
File name :
dicty_cdb_clone.zip
File URL :
File size :
181MB
Simple search URL
http://togodb.biosciencedbc.jp/togodb/view/dicty_cdb_clone#en
Data acquisition method

Capillary sequencer

Data analysis method

Clones obtained from 11 different cDNA libraries originating from five developmental stages were sequenced.

Number of data entries

97,337 entries

Data detail
Data item Description
IDs and Links

-

Library

14 different sequenced cDNA libraries (AF, AH, CF, CH, FC, FC-IC, FCL, SF, SH, SL, SS, VF, VH and VS) derived from five developmental stages.

Clone ID

ID of cDNA clone

Atlas ID

ID of Atlas database (http://dictycdb.biol.tsukuba.ac.jp/~tools/bin/ISH/index.html) and link to Atlas database

NBRP ID

ID of cDNA clone covering full-length ORF provided by the National BioResource Project (http://www.nbrp.jp/). The link to the "National BioResource Project (NBRP) Dictyostelium discoideum" gene database (http://nenkin.lab.nig.ac.jp/genes?locale=en) is provided in the TogoDB edition.

dictyBase ID

ID of Protein Coding Gene in dictyBase (http://dictybase.org/). The link to dictyBase is provided in the TogoDB edition.

Link to Contig

Link to contig containing EST (TogoDB edition only)

Representative Seq. and Annotation

-

Representative seq. ID

ID of DNA sequence used in homology search. Among these sequences, one sequence is selected as the representative DNA sequence by prioritizing 1) full-length cDNA sequence, 2) 5' EST-3'EST-ligated sequence, 3) 5' EST sequence and 4) 3' EST sequence in this order.

Representative DNA sequence

Representative DNA sequence

sequence update

Last update of representative DNA sequence

Translated Amino Acid sequence

Amino acid sequence translated from representative DNA sequence

Translated Amino Acid sequence (All Frames)

Amino acid sequences resulting from translation in all six reading frames of DNA sequence

Homology vs CSM-cDNA

List of top 10 hits in blastn search against the clone sequence in dicty_cDB

own update

Last update of homology search against CSM (a set of clone sequences)

Homology vs DNA

List of top 10 hits in blastn search against DNA sequences in public database

dna update

Last update of homology search against DNA sequences in public database

Homology vs Protein

List of top 10 hits in blastx search against protein sequences in public database

protein update

Last update of homology search against protein sequences in public database

PSORT

The results of PSORT (http://psort.ims.u-tokyo.ac.jp/), which is a program to predict the subcellular localization of proteins.

Sequences

-

5' end seq. ID

ID of 5' EST sequence. "F" is added to the end of ID.

5' end seq.

5' EST sequence. FASTA format.

Length of 5' end seq.

Length of 5' EST sequence.

3' end seq. ID

ID of 3' EST sequence. "Z" is added to the end of ID.

3' end seq.

3' EST sequence. FASTA format.

Length of 3' end seq.

Length of 3' EST sequence.

Connected seq. ID

ID of 5' EST-3' EST-ligated sequence. "P" is added to the end of ID.

Connected seq.

5' EST sequence joined to the 3' EST sequence by ten gap characters (-).

Length of connected seq.

Length of 5' EST-3' EST-ligated sequence. (Gap characters (-) are not counted.)

Full length Seq ID

ID of full-length cDNA. "E" is added to the end of ID.

Full length Seq.

Full-length cDNA sequence. FASTA format.

Length of full length seq.

Length of full-length cDNA.
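The "Connected seq." and "Length of connected seq." items can be illustrated with a short Python sketch: the two EST sequences are joined by ten hyphens, and the length is counted without the gap characters. The function names and the example sequences are assumptions made for this illustration only.

```python
# Ten hyphens separate the 5' EST from the 3' EST in a connected sequence.
GAP = "-" * 10


def connect(five_prime, three_prime):
    """Join a 5' EST sequence and a 3' EST sequence with ten gap characters."""
    return five_prime + GAP + three_prime


def connected_length(connected):
    """Length of a connected sequence, not counting gap characters (-)."""
    return len(connected) - connected.count("-")


connected = connect("ACGT", "GGCC")
print(connected)                    # ACGT----------GGCC
print(connected_length(connected))  # 8
```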