Each GenBank entry includes a concise description of the sequence, the scientific name and taxonomy of the source organism, and a table of features that identifies coding regions and other sites of biological significance, such as transcription units, sites of mutations or modifications, and repeats. Protein translations for coding regions are included in the feature table. Bibliographic references are included, along with a link to the Medline unique identifier for all published sequences. Each sequence entry is composed of lines of several types, each with its own format, which together record the data that make up the entry.

The overall goal of the feature table design is to provide an extensive vocabulary for describing features, within a flexible framework for manipulating them. The Feature Table documentation sets out the shared rules that allow the three collaborating databases (GenBank, EMBL, and DDBJ) to exchange data on a daily basis. The range of features to be represented is diverse, including regions which:
- perform a biological function,
- affect or are the result of the expression of a biological function,
- interact with other molecules,
- affect replication of a sequence,
- affect or are the result of recombination of different sequences,
- are a recognizable repeated unit,
- have secondary or tertiary structure,
- exhibit variation, or have been revised or corrected.

An entry is built from the following line types:
- LOCUS: Short name for this sequence (Maximum of 32 characters).
- DEFINITION: Definition of sequence (Maximum of 80 characters).
- ACCESSION: Accession number of the entry.
- VERSION: Version of the entry.
- DBSOURCE: Shows the source, the date of creation and last modification of the database entry.
- KEYWORDS: Keywords for the entry.
- AUTHORS: Authors for the work.
- TITLE: Title of the publication.
- JOURNAL: Journal reference for the entry.
- MEDLINE: Medline ID.
- COMMENT: Lines of comments.
- SOURCE: The organism from which the sequence was derived.
- ORGANISM: Full name of organism (Maximum of 80 characters).
- AUTHORS: Authors of this sequence (Maximum of 80 characters).
- ACCESSION: ID Number for this sequence (Maximum of 80 characters).
- FEATURES: Features of the sequence.
- ORIGIN: Beginning of sequence data.
- //: End of sequence data (last line of the entry).
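
As a rough illustration of how these line types can be told apart, the sketch below groups the lines of one entry under their top-level keyword. It assumes the column layout of the flat file is intact, with top-level keywords written in capitals starting in column 1 and subkeywords and continuation lines indented. The function name `split_sections` and the shortened `SAMPLE` entry are illustrative only and are not part of any GenBank tool.

```python
import re

# Assumption: a top-level keyword is a run of capital letters (or the "//"
# terminator) starting in column 1; everything indented belongs to the
# keyword above it.
KEYWORD_RE = re.compile(r"^([A-Z]+|//)(?:\s+(.*))?$")


def split_sections(record_text):
    """Group the lines of one flat-file entry under their top-level keyword."""
    sections = []
    for line in record_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = KEYWORD_RE.match(line)
        if match:
            keyword, rest = match.group(1), match.group(2) or ""
            sections.append((keyword, [rest] if rest else []))
        elif sections:
            # Indented subkeyword or continuation line.
            sections[-1][1].append(line.strip())
    return sections


# Shortened, hypothetical excerpt used only to exercise the function.
SAMPLE = """\
LOCUS       MMU22421                2235 bp    DNA     linear   ROD 23-MAR-1995
DEFINITION  Mus musculus obesity protein (ob) gene, complete cds.
ACCESSION   U22421
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
ORIGIN
        1 atgtgctgga gacccctgtg
//
"""

sections = split_sections(SAMPLE)
keywords = [keyword for keyword, _ in sections]
print(keywords)
```

A list of pairs is used rather than a dictionary because some keywords, such as REFERENCE, can occur more than once in an entry.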

The following sample entry illustrates these line types:
LOCUS       MMU22421                2235 bp    DNA     linear   ROD 23-MAR-1995
DEFINITION  Mus musculus obesity protein (ob) gene, complete cds.
ACCESSION   U22421
VERSION     U22421.1  GI:726296
KEYWORDS    .
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia;
            Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 2235)
  AUTHORS   Chehab,F.F. and Lim,M.E.
  TITLE     Genomic organization and sequence of the mouse obesity gene
  JOURNAL   Unpublished (1995)
REFERENCE   2  (bases 1 to 2235)
  AUTHORS   Chehab,F.F. and Lim,M.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (09-MAR-1995) Farid F. Chehab, Laboratory Medicine,
            University of California, San Francisco, 505 Parnassus Avenue, San
            Francisco, CA 94143-0134, USA
FEATURES             Location/Qualifiers
     source          1..2235
                     /organism="Mus musculus"
                     /mol_type="genomic DNA"
                     /strain="C57BL/6J"
                     /db_xref="taxon:10090"
                     /chromosome="6"
     gene            join(1..144,1876..2235)
                     /gene="ob"
     CDS             join(1..144,1876..2235)
                     /gene="ob"
                     /codon_start=1
                     /product="obesity protein"
                     /protein_id="AAA64213.1"
                     /db_xref="GI:726297"
                     /translation="MCWRPLCRFLWLWSYLSYVQAVPIQKVQDDTKTLIKTIVTRIND
                     ISHTQSVSAKQRVTGLDFIPGLHPILSLSKMDQTLAVYQQVLTSLPSQNVLQIANDLE
                     NLRDLLHLLAFSKSCSLPQTSGLQKPESLDGVLEASLYSTEVVALSRLQGSLQDILQQ
                     LDVSPEC"
     intron          145..1875
                     /gene="ob"
     repeat_region   449..585
     misc_feature    1876..1879
                     /gene="ob"
                     /note="slippage of acceptor site results in inclusion or
                     exclusion of glutamine at amino acid position 49"
ORIGIN
        1 atgtgctgga gacccctgtg tcggttcctg tggctttggt cctatctgtc ttatgttcaa
       61 gcagtgccta tccagaaagt ccaggatgac accaaaaccc tcatcaagac cattgtcacc
      121 aggatcaatg acatttcaca cacggtagga gtcttatggg gggacaaaga tgtaggacta
      181 gaaccagagt ctgagaaaca tgtcatgcac ctcctagaag ctgagagttt ataagcctcg
      241 agtgtacatt atctctggtc atggctcttg tcactgctgc ctgctgaaat acagggctga
      301 gtggttccat ttctaaaccc agcatctaga ctgctcagct gtactgccag tatcgcatga
      361 ttctaatcct aagccacctt agggaattta acttctctct tatactccca ttaagaaacc
      421 ataaggtgtc gggcgtggtg gcacatgccc tctaatccca gaactcggga ggcagaggca
      481 ggtggatttc tgagttcaag gccagcctgg tctacaaaat gagttccagg acagccaggg
      541 ctatacagag aaaccctgtc tcgaaaaacc aaaaaagaag ccataaggtt ctttgatatc
      601 ataaggccat gctcattttc cctctgccac aggaaaccca gcccttggtg gctagctgag
      661 catgtaaggt acacatcaga cctgggagaa cctgggttcc tccctgcttc cacagaccac
      721 cctctcccct tccttagccc cctgtttctg cctctctcat tctctttcat ccatgaaact
      781 acttccttga atttagtacc cagggcataa gaatccctaa aggtcatggt gtcccattga
      841 cacgtggaca gcttcccagc agtgtctcta ctgggcagga ggagcagtag gtttctaatg
      901 gtttttagct acagcttctg cccaccgctc acccactttt caaagtcaca cagaaaacaa
      961 cctttccctc ctttacaacc agtccttgtg tagctgctga tagtggtcgg tgcccaccat
     1021 gttcttcctc cgaggcccag cagcctacat cttcagccat ttcctcagat gtatctaagc
     1081 tatgtgcata tcaccatatc tgcttcccat ctgcaagatc ttaggccagt tctccggtgg
     1141 gttttaaacc ttgattttac catcttgatg agggagacat catatcatat caccaagttg
     1201 ttctaaggct taaatggggt gtagtgaaag actttcttgt ggagccatct ggagactact
     1261 atgtctcctg accagtgtgc gtgtctcaca gtgtggcctt ggcagctagg agaagtcaga
     1321 tattcagaat caagggacag cttaatataa gagacttatg cggagaaagt tctcatcatc
     1381 tctcgacaag agtcatcagg gctgcacatg gagaggccca actacccaaa tgtgggtgga
     1441 aatgagagga agccagtggg gaaagccctt cctggtaacc agactcagca gagtgggggg
     1501 ggggggcacg gctttgaccc taatgaggga gaaccacaga agagtatgac taggagggag
     1561 agatctgata agggcaggag gctagagaga atataaggaa taaagagcta tggctggttc
     1621 ttcacggata tcattggaga aaggaattac tcaagactaa tcagaagtga agggtggagt
     1681 gactcggaat gatcagaaag tccgggagac cagctccgtg gcttccagtc agctgatgac
     1741 aggaagtaag gacctggacc aggaaggtga gaaggaagga ggtagcccag gttcacagat
     1801 gtaatgtaga gctctggagc ccgatgctcc ctgccacttg ctaaaacacc tcttgttctt
     1861 cttcctcctc catatcagtc ggtatccgcc aagcagaggg tcactggctt ggacttcatt
     1921 cctgggcttc accccattct gagtttgtcc aagatggacc agactctggc agtctatcaa
     1981 caggtcctca ccagcctgcc ttcccaaaat gtgctgcaga tagccaatga cctggagaat
     2041 ctccgagacc tcctccatct gctggccttc tccaagagct gctccctgcc tcagaccagt
     2101 ggcctgcaga agccagagag cctggatggc gtcctggaag cctcactcta ctccacagag
     2161 gtggtggctt tgagcaggct gcagggctct ctgcaggaca ttcttcaaca gttggatgtt
     2221 agccctgaat gctga
//
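
The CDS location in the sample entry, `join(1..144,1876..2235)`, describes a coding region spliced from two spans. The sketch below shows one way such a location string might be read and its spliced length computed. It handles only plain `start..end` spans and a `join(...)` of them; `complement(...)`, partial-end markers, and remote references are deliberately left out. The helper names `parse_location` and `spliced_length` are illustrative, not taken from any existing library.

```python
import re

# A plain span such as "1876..2235" (1-based, inclusive on both ends).
SPAN_RE = re.compile(r"(\d+)\.\.(\d+)")


def parse_location(location):
    """Return a list of (start, end) pairs for a simple feature location.

    Assumption: only plain spans and join(...) of plain spans are supported;
    anything else raises ValueError rather than being guessed at.
    """
    text = location.strip()
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]
    spans = []
    for part in text.split(","):
        match = SPAN_RE.fullmatch(part.strip())
        if match is None:
            raise ValueError(f"unsupported location: {location!r}")
        spans.append((int(match.group(1)), int(match.group(2))))
    return spans


def spliced_length(spans):
    """Total number of bases covered by the spans."""
    return sum(end - start + 1 for start, end in spans)


cds_spans = parse_location("join(1..144,1876..2235)")
cds_length = spliced_length(cds_spans)
codons = cds_length // 3

print(cds_spans)   # [(1, 144), (1876, 2235)]
print(cds_length)  # 504
print(codons)      # 168
```

The 168 codons correspond to the 167 residues given in the /translation qualifier plus the terminating stop codon. The first span ends after 48 complete codons (144 bases), so residue 49 is encoded at the start of the second span, which is the position referred to in the misc_feature note.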