Index of /download/sequence/C_albicans_SC5314/Assembly19
Name Last modified Size Description
Parent Directory -
current/ 2023-06-29 09:53 -
archived_as_released/ 2023-06-29 09:53 -
archive/ 2023-10-16 10:49 -
SC5314_traces/ 2023-06-29 09:53 -
The /download/sequence/ directory contains sequence from the
C. albicans genome sequencing project, and derivatives thereof.
Current files are generated weekly and reflect the most current
information at CGD. Most of the files in this directory are in FASTA
format; current Assembly 20 and Assembly 21 files in EMBL format may
be downloaded from the /Assembly20/current/EMBL_format/
and /Assembly21/current/EMBL_format/ subdirectories, respectively.
All files are gzip compressed. Several freely available software
options can decompress gzipped files on Windows. The software and
other useful information are available on these web sites:
- WinZip (http://www.winzip.com/)
- Stuffit (http://www.stuffit.com/)
- Gzip (http://www.gzip.org/)
and the gzip user's manual:
http://www.math.utah.edu/docs/info/gzip_toc.html
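The gzipped FASTA files can also be read directly, without decompressing
them to disk first. The following is a minimal sketch in Python using only
the standard library. The function names parse_fasta and read_gzipped_fasta
are hypothetical, and the sketch assumes only generic FASTA conventions (a
'>' header line followed by sequence lines), not any CGD-specific header
layout.

```python
import gzip
from typing import Dict, TextIO


def parse_fasta(handle: TextIO) -> Dict[str, str]:
    """Parse FASTA text into a dict mapping record ID to sequence.

    The record ID is taken to be the first whitespace-delimited token
    after '>' (an assumption about the header layout).
    """
    records: Dict[str, str] = {}
    current_id = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            chunks = []
        elif current_id is not None:
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


def read_gzipped_fasta(path: str) -> Dict[str, str]:
    """Open a .gz file in text mode and parse it as FASTA."""
    with gzip.open(path, "rt") as handle:
        return parse_fasta(handle)
```

For example, read_gzipped_fasta("Ca19-mtDNA.fa.gz") would return the
records in that file once it has been downloaded to the working directory.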
Additional sequence documentation is found on the CGD web site at:
http://www.candidagenome.org/help/SequenceHelp.shtml
------------------------------------------------
/Assembly19/
This directory contains sequence files for Assembly 19.
Note that for the directories below, all files are available in both
haploid and diploid versions for Assembly 19; the haploid versions
are identified by '_haploid' in the file name. The haploid
versions of these files were created by omitting features mapping to
contigs in the assembly whose names begin with 'Contig19-20', as these
contigs contain the second copy of the alleles identified in the
diploid Assembly 19.
/Assembly19/current/
This directory contains the most current version of the sequences
(Assembly 19 files are not updated routinely):
sequence of Assembly 19 contigs, supercontigs and mitochondrial DNA:
Ca19-allContigs.fa.gz
Ca19-mtDNA.fa.gz
Ca19-supercontigs.fasta.gz
sequence with introns for all ORFs:
orf_genomic_assembly_19.fasta.gz
orf_genomic_haploid_assembly_19.fasta.gz
sequence with no introns for all ORFs:
orf_coding_assembly_19.fasta.gz
orf_coding_haploid_assembly_19.fasta.gz
sequences with introns and 1000 bp of untranslated region upstream
and downstream for all ORFs:
orf_genomic_1000_assembly_19.fasta.gz
orf_genomic_1000_haploid_assembly_19.fasta.gz
translation of all ORF regions:
orf_trans_all_assembly_19.fasta.gz
orf_trans_all_haploid_assembly_19.fasta.gz
sequences from the systematic C. albicans sequence for the following
feature types: CEN, rRNA, tRNA, snRNA, snoRNA, ncRNA
(other types will be added in future):
other_features_genomic_assembly_19.fasta.gz
other_features_genomic_haploid_assembly_19.fasta.gz
other_features_no_introns_assembly_19.fasta.gz
other_features_no_introns_haploid_assembly_19.fasta.gz
genomic sequence for the above features plus 1000 bp upstream and
downstream sequence:
other_features_genomic_1000_assembly_19.fasta.gz
other_features_genomic_1000_haploid_assembly_19.fasta.gz
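The "1000 bp upstream and downstream" files above contain each feature
together with its flanking sequence. The sketch below shows one way such a
region could be cut from a contig in Python. It assumes 1-based inclusive
coordinates on the forward strand and assumes that flanks are truncated at
the ends of the contig; neither assumption is stated in this document, and
the function name with_flanks is hypothetical.

```python
def with_flanks(contig_seq: str, start: int, end: int, flank: int = 1000) -> str:
    """Return the feature at start..end plus up to `flank` bp on each side.

    Coordinates are 1-based and inclusive (an assumption). Flanks are
    clipped where the feature lies near either end of the contig.
    """
    if not (1 <= start <= end <= len(contig_seq)):
        raise ValueError("coordinates fall outside the contig")
    # Convert to 0-based, half-open slice bounds and clip to the contig.
    left = max(start - 1 - flank, 0)
    right = min(end + flank, len(contig_seq))
    return contig_seq[left:right]


# Toy contig: 5 bp upstream, a 3 bp feature, 5 bp downstream.
toy_contig = "AAAAA" + "CCC" + "GGGGG"
region = with_flanks(toy_contig, 6, 8, flank=2)
```

With a flank of 2, region is the 3 bp feature plus 2 bp on each side. With
the default flank of 1000, the whole toy contig is returned because both
flanks are clipped.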
/Assembly19/archived_as_released/
This directory contains DNA sequences for diploid Assembly 19 as
produced by the Candida Sequencing project at the Stanford Genome
Technology Center. These files are simply copies of the SGTC files,
and are here for archival purposes:
Ca19AnnotatedDec2004.DNA.seq
Ca-Assembly19.alleles.gz
Ca-Assembly19.orf.gz
/Assembly19/archive/
This directory contains archived versions of the Assembly 19
sequences.
/Assembly19/SC5314_traces/
This directory contains the original SC5314 sequence trace files and
quality scores generated by the Stanford Genome Technology Center
(SGTC). The Candida albicans server at the SGTC has been taken
offline as of October 2006, and these sequence data were provided by
the SGTC to CGD.