Index of /download/sequence/C_albicans_SC5314/Assembly20

Name                     Last modified      Size
Parent Directory                            -
current/                 2023-06-29 09:53   -
archived_as_released/    2023-06-29 09:53   -
archive/                 2023-10-16 10:53   -

The /download/sequence/C_albicans_SC5314/Assembly20 directory contains sequence from
the C. albicans chromosome sequencing project (Assembly 20), and derivatives thereof.
After Oct 6 2008 the files in this directory are NOT updated.

All files are gzip compressed. Several freely available programs
can decompress gzipped files on Windows.  The software and other
useful information are available on these web sites:
 
- WinZip (http://www.winzip.com/)
- Stuffit (http://www.stuffit.com/)
- Gzip (http://www.gzip.org/)
    
and the gzip user's manual:
http://www.math.utah.edu/docs/info/gzip_toc.html
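
The files can also be decompressed without a separate tool. The
following is a minimal sketch, assuming Python 3 is installed; the
function name decompress_gzip is hypothetical and is not part of any
CGD software. It uses only the standard-library gzip and shutil modules.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gzip(src, dest=None):
    """Decompress a .gz file and return the path of the output file."""
    src = Path(src)
    # Assumption of this sketch: the default output name is the input
    # name with its final suffix (".gz") removed.
    dest = Path(dest) if dest is not None else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        # Copy in chunks so large sequence files are not loaded into memory.
        shutil.copyfileobj(fin, fout)
    return dest
```

For example, decompress_gzip("Ca20_chromosomes.fasta.gz") would write
Ca20_chromosomes.fasta next to the downloaded file.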

Additional sequence documentation is found on the CGD web site at:
http://www.candidagenome.org/help/SequenceHelp.shtml

------------------------------------------------

/Assembly20/

This directory contains sequence files for Assembly 20.

** PLEASE NOTE: Assembly 20 Sequence Advisory
** 
** posted October 19, 2006, updated October 25, 2006
** 
** The collaborative group who generated Assembly 20 has discovered that 
** the sequence traces that they had been using to fill some of the gaps 
** and determine overlaps between Assembly 19 contigs were derived from 
** strain WO-1, rather than from the reference strain, SC5314. 
** 
** Please see http://www.candidagenome.org/help/Assembly20_Advisory.shtml
** for the latest information and status updates.


/Assembly20/current/        

This directory contains the last version of the sequences
as of Oct 6 2008.
                             

sequence of Assembly 20 chromosomes:

	Ca20_chromosomes.fasta.gz

Contains DNA sequences for haploid Assembly 20, and the mitochondrial
chromosome sequence generated in the original SGTC sequencing project.
Haplotype information was not preserved during generation of Assembly
20.  The sequences called 'chromosomes' by the Assembly 20
collaborators may more precisely be described as 'reftigs' because
they are mosaics of haplotypes, rather than representative of a single
haploid genome in the sequenced strain.  Please note Assembly 20
issues described in detail at:
http://www.candidagenome.org/help/SequenceHelp.shtml
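
The FASTA files in this directory can be read directly in compressed
form. The sketch below assumes Python 3 and standard FASTA layout
(a header line starting with ">" followed by one or more sequence
lines); the function names read_fasta_gz and sequence_lengths are
hypothetical. It makes no assumption about the content of the header
lines beyond treating the first word as the record identifier.

```python
import gzip


def read_fasta_gz(path):
    """Yield (header, sequence) pairs from a gzip-compressed FASTA file."""
    header, chunks = None, []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequence_lengths(path):
    """Map each record identifier (first word of the header) to its length."""
    return {
        (header.split() or [""])[0]: len(seq)
        for header, seq in read_fasta_gz(path)
    }
```

Calling sequence_lengths("Ca20_chromosomes.fasta.gz") would give one
length per record in the file, which is a quick way to check that a
download is complete.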


sequence with introns for all ORFs:
	orf_genomic_assembly_20.fasta.gz


sequence with no introns for all ORFs:
	orf_coding_assembly_20.fasta.gz                   


sequences with introns, plus 1000 bp of untranslated region upstream
and downstream, for all ORFs:
	orf_genomic_1000_assembly_20.fasta.gz


translation of all ORF regions:
	orf_trans_all_assembly_20.fasta.gz                


sequences from the systematic C. albicans sequence for the following
feature types: CEN, rRNA, tRNA, snRNA, snoRNA, ncRNA 
(other types will be added in future):

	other_features_genomic_assembly_20.fasta.gz                  
	other_features_no_introns_assembly_20.fasta.gz 


genomic sequence for the above features plus 1000 bp upstream and downstream sequence:

	other_features_genomic_1000_assembly_20.fasta.gz 
	

/Assembly20/current/EMBL_format/ 

This directory contains current gene and sequence data from the
C. albicans Assembly 20 genome in EMBL file format.  Files in this
directory are NOT updated and reflect the Assembly 20
information at CGD as of Oct 6 2008.


/Assembly20/archive/

This directory contains archived versions of the Assembly 20
sequences.  Before Oct 6 2008 the sequences were checked 
for changes weekly and a new file was added whenever there had been changes.