Drosophila melanogaster genome annotation
release 4.0
date 20041119

DATA CONTENTS

Table of chromosome dna sizes

Chromosome    rel. 4      rel. 3
------------------------------------
2L            22407834    22217931   (~200 Kb added)
2R            20766785    20302755   (~400 Kb added at start/centromere)
3L            23771897    23352213   (~400 Kb added at end/centromere)
3R            27905053    27890790   (~10 Kb added)
4              1281640     1237870
X             22224390    21780003   (~100 Kb added at start; ~200 Kb added at end)

Feature counts in release 4, 3 compared
annotation features (r40 nov04, r322 oct 04, r320 March 04)

Table of D. mel. genome feature counts per release.

Feature                              400     322     320
------------------------------------------------------------
BAC                                    0     949     949  << gone? no BAC pix in gbrowse (from match:clonelocator:scaffoldBACs)
CDS                                18715   18747   18746
DNA_motif                              5       5       5
EST                                    0       0  304257  moved to match_
RNA_motif                              0       1       0
aberration_junction                   86      86      87
cDNA_clone                             0       0   10204  moved to match_
chromosome_arm                         6       6       0
chromosome_band                        0    5715       0  << need new cyto-sequence mapping table
enhancer                              27      27      27
five_prime_UTR                     14360   15769   13608
gene                               13472   13472   13473
insertion_site                       457     457     424
intron                             16135   16153   16199
mRNA                               19301   19302   18810
match_HDP                            139    2448       0
match_RNAiHDP                        110      40       0
match_assembly_path                  434       0       0  ? same as scaffold, but 3 fewer
match_blastx_aa_SP.hyp.dros            0     354       0
match_blastx_aa_SP.real.dros           0   22163       0
match_blastx_aa_SPTR.dmel         207911       0       0  SPTR.dmel ~ SPTR.dros
match_blastx_aa_SPTR.dros              0   68846       0  SPTR.dmel ~ SPTR.dros
match_blastx_aa_SPTR.insect        16610    7492       0
match_blastx_aa_SPTR.othinv        21451   12471       0
match_blastx_aa_SPTR.othvert       18036   11774       0
match_blastx_aa_SPTR.plant         11997    9609       0
match_blastx_aa_SPTR.primate       20850   16345       0
match_blastx_aa_SPTR.rodent        21644   16081       0
match_blastx_aa_SPTR.worm          13765   12679       0
match_blastx_aa_SPTR.yeast          5593    5211       0
match_blastx_aa_TR.real.dros           0   43823       0
match_fgenesh                          0   14837       0
match_genie                        11063   13794       0
match_genscan                      17811   19189       0
match_repeat_runner_seg             9198       0       0
match_repeatmasker                 11758       0       0  see repeat_region
match_sim4_na_ARGs.dros             1062       0       0
match_sim4_na_ARGsCDS.dros           984       0       0
match_sim4_na_DGC_dros              5159   15270       0
match_sim4_na_EST.all_nr.dros          0  267828       0  na_EST -> na_dbEST.same,diff
match_sim4_na_cDNA.dros                0   10319       0
match_sim4_na_dbEST.diff.dmel      82910       0       0
match_sim4_na_dbEST.same.dmel     159793       0       0
match_sim4_na_gadfly_dmel_r2       14249   14389       0
match_sim4_na_gb.dmel              26531   14977       0
match_sim4_na_gb.tpa.dmel           2214       0       0
match_sim4_na_pe.dros                  0    3201       0
match_sim4_na_smallRNA.dros           98       0       0
match_sim4_na_transcript_dme..     19001       0       0
match_sim4_na_transcript_dme..     18799       0       0
match_sim4tandem_na_gb.dmel        23748       0       0
match_tblastx_na_agambiae         101190       0       0
match_tblastx_na_dbEST.insect      34107   16818       0
match_tblastx_na_dpse             263465       0       0
match_tblastx_na_unigene.rod..         0   11707       0
mature_peptide                         7       7       8
ncRNA                                 70      70      65
oligo                                  0  197726  193813  << no more oligos ?
orthologous_region                     0   12101       0  << can we copy from dmel3/dpse?
point_mutation                       485     485     476
polyA_site                           107     107     101
processed_transcript                   0       0   16748
protein                                0       0  233812
protein_binding_site                  90      90      85
pseudogene                            40      40      39
rRNA                                  96      96      85
region                                30      30      28
regulatory_region                    137     137     136
repeat_region                         1^    4652    3390  see match_repeat
rescue_fragment                      136     136     135
scaffold                             437     437     437
sequence_variant                     232     232     225
signal_peptide                         0       0       1
snRNA                                 28      28      28
snoRNA                                28      28      28
syntenic_region                        0    1230       0  << need new mapping of this to dmelr4
tRNA                                 288     288     288
tRNA_trnascan                        295     297      --
three_prime_UTR                    14683   16777   15493
transcription_start_site           36921   35737   16997
transposable_element                1571    1572    1567
transposable_element_inserti..      4680    3257    4566
------------------------------------------------------------

Table of D. mel. genome feature counts per release.

Feature                              400     322     320  << r4 located using r3 seq-cyto map
------------------------------------------------------------
cyto_insertion                     16363   16363   21379
cytobreakpoint_inv                  4565    4565    4565
cytobreakpoint_other                 791     791     791
cytobreakpoint_ttp                  6243    6243    6243
cytodeleted_segment                11073   11073   11073
cytoduplicated_segment               880     880     880
cytogene                            5671    5671    6683
------------------------------------------------------------

--   == data not available for this feature
gene = protein coding gene; other features with gene-models
       (and transcripts) are pseudogene, rRNA, snRNA, snoRNA, tRNA, ncRNA

-------

Data are from Postgres Chado database, release 4.0, v 3, 19 nov 2004
Copy at ftp://flybase.net/genomes/Drosophila_melanogaster/dmel_r4.0_20041119/pgsql/chado_r4*.gz

BULK FILE SET

See ftp://flybase.net/genomes/Drosophila_melanogaster/dmel_r4.0_20041119/
(will become /current/ soon)

  blast/  - NCBI blast database set for selected fasta/ feature sets.
  dna/    - contains dna raw format files per chromosome-arm
  fasta/  - dna and protein data per chromosome and feature type;
            and -all- files which catenate each chromosome set.
            chromosome dna in fasta format
  gff/    - GFF v3 standard feature files per chromosome
  gnomap/ - Gnomap standard feature files per chromosome (drive genome map views)

  These two contain chromosome locations of above listed features
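
The per-feature counts above could in principle be re-derived from the gff/ files by tallying the feature type column. What follows is only a minimal sketch under stated assumptions, not the procedure used to build the tables (those counts come from the Chado database). It assumes standard tab-delimited GFF v3 with nine columns and the feature type in column 3. The function name count_feature_types, the helper row, and the sample records are hypothetical and exist only for illustration.

```python
from collections import Counter


def count_feature_types(lines):
    """Tally GFF v3 records by feature type (column 3).

    Assumes tab-delimited, nine-column GFF v3 lines. Comment and
    directive lines (starting with '#') are skipped, and reading
    stops at a '##FASTA' directive, which marks embedded sequence.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line; ignore it
        counts[cols[2]] += 1
    return counts


def row(seqid, ftype, start, end, attrs):
    """Build one hypothetical GFF v3 record for the demo below."""
    return "\t".join(
        [seqid, "example", ftype, str(start), str(end), ".", "+", ".", attrs]
    )


# Hypothetical records, not taken from the release files.
SAMPLE_GFF = [
    "##gff-version 3",
    row("4", "gene", 100, 900, "ID=g1"),
    row("4", "mRNA", 100, 900, "ID=t1;Parent=g1"),
    row("4", "mRNA", 100, 700, "ID=t2;Parent=g1"),
    row("4", "tRNA", 1500, 1572, "ID=r1"),
    "##FASTA",
    ">4",
    "ACGT",
]

counts = count_feature_types(SAMPLE_GFF)
for ftype in sorted(counts):
    # Same layout as the tables: name column, then a right-aligned count.
    print(f"{ftype:<32}{counts[ftype]:>8}")
```

To run this against a real file, pass an open text file handle instead of SAMPLE_GFF. The file names under gff/ are not given in these notes, so none is shown here. Counts obtained this way may differ from the tables if the GFF files name or group features differently from the database.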