Bielefeld University, Center of Biotechnology, Institute of Bioinformatics, BiBiServ
 
EMBL Data Format - Exercise 1
grep, agrep, cut, sort and wc are very useful UNIX tools for getting a quick overview of text files or extracting specific information from them. Use these tools to answer the following questions:
  1. Browse to the EBI website and download an EMBL-formatted file of all nucleotide sequences from rat-kangaroos (Potoroidae).
  2. How many sequences are available? Did the download succeed? (Sometimes huge batch downloads break, so it is always a good idea to check if all sequences were retrieved!)
  3. How many of the entries have the feature key 5'UTR?
  4. How many tRNA genes are annotated, and which ones?
  5. A very useful tool for converting files between different formats is readseq. Try the web version at http://www.ebi.ac.uk/cgi-bin/readseq.cgi to convert the file to other formats and extract features.
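
One possible way to approach questions 2 to 4 is sketched below. It is only a sketch under stated assumptions, not a reference solution: the file name potoroidae.embl and all function names are hypothetical, the feature key is assumed to start in column 6 of each FT line, and tRNA features are assumed to carry a /product="tRNA-..." qualifier, which may not hold for every entry. The sketch also uses awk and uniq in addition to the tools named above.

```shell
#!/bin/sh
# Hypothetical file name; replace it with the file you downloaded.
EMBL_FILE="${EMBL_FILE:-potoroidae.embl}"

# Every EMBL entry starts with an "ID" line ...
count_entries() {
    grep -c '^ID ' "$1"
}

# ... and ends with a "//" line. If both counts agree, the download
# was probably not cut off in the middle of an entry.
count_terminators() {
    grep -c '^//' "$1"
}

# Count feature-table lines carrying a given feature key
# (assumption: the key starts in column 6, right after "FT   ").
count_feature() {
    grep -c "^FT   $2 " "$1"
}

# Count entries that contain the given feature key at least once.
# An entry with several such features is counted only once.
count_entries_with_feature() {
    awk -v key="$2" '
        /^ID /                           { seen = 0 }
        $1 == "FT" && $2 == key && !seen { n++; seen = 1 }
        END                              { print n + 0 }
    ' "$1"
}

# List tRNA product names with their frequencies
# (assumption: tRNA features have a /product="tRNA-..." qualifier).
list_trna_products() {
    grep '^FT  */product="tRNA' "$1" | cut -d'"' -f2 | sort | uniq -c
}

# Print a short report only if the file is actually present.
if [ -f "$EMBL_FILE" ]; then
    echo "ID lines:            $(count_entries "$EMBL_FILE")"
    echo "// lines:            $(count_terminators "$EMBL_FILE")"
    echo "entries with 5'UTR:  $(count_entries_with_feature "$EMBL_FILE" "5'UTR")"
    echo "tRNA features:       $(count_feature "$EMBL_FILE" tRNA)"
    list_trna_products "$EMBL_FILE"
fi
```

Comparing the number of ID lines with the number of // lines is a quick sanity check for a truncated download. Note that grep -c counts matching lines, so the per-feature count and the per-entry count can differ whenever one entry has several features of the same kind.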