Data Formats
In the previous chapter you learned about different databases that store a variety of information. Each database has its own data format, even though some of them, such as GenBank, EMBL and DDBJ, hold exactly the same information.
To make use of a database, the information of interest often has to be parsed from its flat file and converted into a different format that sequence analysis programs can read.
The data formats discussed in this section are:
  • GenBank
  • EMBL
  • XML
You will see examples of each of these formats, and you will process and convert some entries.
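As a first taste of the parse-and-convert step described above, here is a minimal sketch in Python. It assumes a simplified GenBank-style record with a LOCUS line, an ORIGIN section and a closing "//" line, and it writes the sequence out in FASTA format, which is chosen here only as a common example of a target format. The function name genbank_to_fasta and the toy record are made up for illustration; real entries contain many more fields, and this sketch ignores all of them.

```python
def genbank_to_fasta(record_text):
    """Convert one simplified GenBank-style record to a FASTA string.

    Illustrative sketch only: it reads the locus name and the sequence
    and ignores every other field of the record.
    """
    name = None
    seq_parts = []
    in_origin = False
    for line in record_text.splitlines():
        if line.startswith("LOCUS"):
            # The second whitespace-separated field is the locus name.
            name = line.split()[1]
        elif line.startswith("ORIGIN"):
            # Everything after this line, up to "//", is sequence data.
            in_origin = True
        elif line.startswith("//"):
            # "//" marks the end of the record.
            break
        elif in_origin:
            # Sequence lines look like "        1 atggccattg taatgggccg";
            # drop the leading position number and keep the blocks.
            seq_parts.extend(line.split()[1:])
    if name is None:
        raise ValueError("no LOCUS line found")
    sequence = "".join(seq_parts).upper()
    return f">{name}\n{sequence}\n"


# Hypothetical toy record, not a real database entry.
EXAMPLE_RECORD = """\
LOCUS       TOY0001                 24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Made-up example sequence.
ORIGIN
        1 atggccattg taatgggccg ctga
//
"""

fasta = genbank_to_fasta(EXAMPLE_RECORD)
print(fasta)
```

The sketch reads the record line by line and keeps only what the target format needs, which is the general pattern behind most flat-file conversions.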