Nucleotide Databases
GenBank [Benson et al. 2007] is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. As of February 2005 it held 46,849,831,226 bases in 42,734,478 sequence records (loci); see the GenBank growth statistics at http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html. As an example, you may view the record for a Saccharomyces cerevisiae gene. A new release is made every two months. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data on a daily basis.
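A GenBank record can also be retrieved programmatically rather than viewed in a browser. The following is a minimal sketch, assuming access through the NCBI E-utilities EFetch service, which the text above does not mention. The function names and the accession string EXAMPLE_ACCESSION are hypothetical placeholders, not identifiers taken from this document; substitute a real accession number before fetching.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities EFetch service (an assumption of this
# sketch; the surrounding text only describes viewing records in a browser).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_record_url(accession: str) -> str:
    """Build an EFetch URL requesting one record in GenBank flat-file format."""
    params = {
        "db": "nuccore",     # nucleotide database
        "id": accession,     # accession number of the record
        "rettype": "gb",     # GenBank flat-file format
        "retmode": "text",   # plain text rather than XML
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_genbank_record(accession: str) -> str:
    """Download the record text. Requires network access; not called here."""
    with urlopen(genbank_record_url(accession), timeout=30) as response:
        return response.read().decode("utf-8")


# "EXAMPLE_ACCESSION" is a hypothetical placeholder, not a real accession.
print(genbank_record_url("EXAMPLE_ACCESSION"))
```

Building the URL is kept separate from the download so that the request can be inspected without contacting NCBI. Calling fetch_genbank_record with a valid accession should return the annotated flat-file text of that record.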
The EMBL Nucleotide Sequence Database [Kulikova et al. 2007] is Europe's primary nucleotide sequence resource. Its main sources of DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects, and patent applications.