Expressed Sequence Tags Database

dbEST (Nature Genetics 4:332-3;1993) is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or Expressed Sequence Tags, from a number of organisms.

Sequence Tagged Sites

The STS division file contains sequence and mapping data on short genomic landmark sequences or Sequence Tagged Sites.

DNA Data Bank of Japan

DDBJ (DNA Data Bank of Japan) began DNA data bank activities in earnest in 1986 at the National Institute of Genetics (NIG) with the endorsement of the Ministry of Education, Science, Sport and Culture. From the beginning, DDBJ has functioned as one of the three International DNA Databases. The other two members are EBI (European Bioinformatics Institute; responsible for the EMBL database) in Europe and NCBI (National Center for Biotechnology Information; responsible for the GenBank database) in the USA.

EMBL Nucleotide Sequence Database

The European Bioinformatics Institute (EBI) maintains and distributes the EMBL Nucleotide Sequence database, Europe's primary nucleotide sequence data resource.

Entrez Genome

The whole genomes of over 1000 viruses and over 100 microbes can be found in Entrez Genome. The genomes represent both completely sequenced organisms and those for which sequencing is in progress. All three main domains of life - bacteria, archaea, and eukaryota - are represented, as well as many viruses and organelles.

Entrez Nucleotide Database

The Nucleotide database contains sequence data from GenBank, EMBL, and DDBJ, the members of the tripartite international collaboration of sequence databases. EMBL is the European Molecular Biology Laboratory at Hinxton Hall, UK; DDBJ is the DNA Data Bank of Japan in Mishima, Japan. Sequences are also incorporated from the Genome Sequence Data Base (GSDB), Santa Fe, NM. Patent sequences are incorporated through arrangements with the U.S. Patent and Trademark Office (USPTO) and, via the collaborating international databases, from other international patent offices.

GenBank

GenBank is a database of nucleotide sequences from more than 130,000 organisms. Records that are annotated with coding region (CDS) features also include amino acid translations. GenBank belongs to an international collaboration of sequence databases, which also includes EMBL and DDBJ.
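
As a rough illustration of what an amino acid translation of a CDS feature involves, the following minimal sketch translates a coding sequence with the standard genetic code. The function name `translate_cds` and the choice to stop at the first stop codon are assumptions made for this example; it does not reproduce how GenBank itself generates translations, and it ignores alternative genetic codes and partial codons.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDDDEEEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(dna: str) -> str:
    """Translate a coding sequence (hypothetical helper for illustration).

    Reads codons from position 0, stops at the first stop codon, and
    writes 'X' for any codon containing a non-ACGT character.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":  # stop codon ends the translation
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate_cds("ATGGCCTAA"))
```

Building the table from the base order and a 64-letter string keeps the sketch short; a production tool would normally rely on an established library and the genetic code table specified in the record.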

Genomes Server at the EBI

The first completed genomes from viruses, phages and organelles were deposited into the EMBL Database in the early 1980s. Since then, molecular biology's shift toward obtaining the complete sequences of as many genomes as possible, combined with major developments in sequencing technology, has resulted in hundreds of complete genome sequences being added to the database, including Archaea, Bacteria and Eukaryota. These web pages give access to a large number of complete genomes.

H-Invitational Database (H-InvDB)

H-Invitational Database (H-InvDB) is a human gene database with integrative annotation of 41,118 full-length cDNA clones currently available from six high-throughput cDNA sequencing projects. The database represents 21,037 cDNA clusters and describes their gene structures, functions, novel alternative splicing isoforms, non-coding functional RNAs, functional domains, sub-cellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs in relation to orphan diseases, gene expression profiling, and comparative results with mouse full-length cDNAs in the context of molecular evolution.

Homologous Processed Pseudogenes Database

Hoppsigen is a nucleic acid database of homologous processed pseudogenes.

Influenza Sequence Database

The Influenza Sequence Database is a curated database of nucleotide sequences. It is intended to provide the research community with easy sequence deposit and retrieval capabilities, together with tools tailored, in particular, to the analysis of hemagglutinin and neuraminidase sequences.
This database is operated by the University of California for the US Department of Energy.

Mammalian Gene Collection

The goal of the Mammalian Gene Collection (MGC), a trans-NIH initiative, is to provide full-length open reading frame (FL-ORF) clones for human, mouse, and rat genes. All MGC sequences are deposited in GenBank and the clones can be purchased from distributors of the IMAGE consortium.

Marine Microbial Biodiversity Database

Micro-Mar is a database for dynamic representation of marine microbial biodiversity.

Neisseria meningitidis / MLST

Database about Neisseria meningitidis.

Nucleosome Positioning Region Database (NPRD)

The Nucleosome Positioning Region Database (NPRD) compiles the available experimental data on the locations and characteristics of nucleosome formation sites (NFSs). It is the first curated NFS-oriented database.

Pig EST Data Explorer

Database of full-length enriched cDNA libraries and ESTs in pigs.

PopSet

A PopSet is a set of DNA sequences that have been collected to analyse the evolutionary relatedness of a population. The population could originate from different members of the same species, or from organisms of different species. PopSets are submitted to GenBank via Sequin, often as a sequence alignment.

Reference Sequence (RefSeq)

The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products, for major research organisms.

Third Party Annotation Sequence Database

A TPA sequence is derived or assembled from primary sequence data currently found in the DDBJ/EMBL/GenBank International Nucleotide Sequence Collaboration Databases. It can be genomic or mRNA sequence, and can be assembled or derived from primary genomic and/or mRNA sequences.

UniGene

UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.

UniSTS

UniSTS is an NCBI resource that reports information about markers, or Sequence Tagged Sites (STS).
For each marker, UniSTS displays the primer sequences, product size, and mapping information, as well as cross references to LocusLink, dbSNP, RHdb, GDB, MGD, and the Entrez Map Viewer. The marker report also lists GenBank and RefSeq records that contain the primer sequences, as determined by Electronic PCR (e-PCR). Marker data, e-PCR data and mapping data are available from the FTP site.
UniSTS integrates marker and mapping data from public resources including GenBank, RHdb, GDB, various human maps (Genethon genetic map, Marshfield genetic map, Whitehead RH map, Whitehead YAC map, Stanford RH map, NHGRI chr 7 physical map, WashU chrX physical map), and various mouse maps (Whitehead RH map, Whitehead YAC map, Jackson Laboratory's MGD map).
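
The idea behind locating a marker's primers in a sequence record and reporting a product size can be sketched as follows. This is a simplified illustration only, not the e-PCR program: it assumes exact primer matches, a forward primer on the given strand, and a reverse primer that binds the opposite strand downstream. The names `reverse_complement` and `find_product` are hypothetical helpers introduced for this example.

```python
from typing import Optional, Tuple

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_product(template: str, forward: str, reverse: str) -> Optional[Tuple[int, int]]:
    """Locate the first exact primer-pair hit on the template.

    Returns (0-based start of the forward primer, product size in bp),
    or None when either primer site is not found. Mismatches, the
    reverse strand of the template, and multiple hits are not handled.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on
    # the given strand is the reverse complement of the primer.
    site = reverse_complement(reverse)
    site_start = template.find(site, start + len(forward))
    if site_start == -1:
        return None
    return start, site_start + len(site) - start


if __name__ == "__main__":
    print(find_product("GGACGTACTTTTTCCATGAAA", "ACGTAC", "TCATGG"))
```

A real search would typically also tolerate a small number of mismatches and check that the product size falls within the range expected for the marker.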