Index of /download/sequence
Name Last modified Size Description
Parent Directory -
Assembly4/ 05-Mar-2010 15:35 -
Assembly5/ 05-Mar-2010 15:35 -
Assembly6/ 05-Mar-2010 15:35 -
Assembly19/ 05-Mar-2010 15:35 -
Assembly20/ 05-Mar-2010 15:35 -
Assembly21/ 05-Mar-2010 15:35 -
The /download/sequence/ directory contains sequence from the
C. albicans genome sequencing project, and derivatives thereof.
Current files are generated weekly and reflect the most current
information at CGD. Most of the files in this directory are in FASTA
format; current Assembly 20 and Assembly 21 files in EMBL format may
be downloaded from the /Assembly20/current/EMBL_format/ and
/Assembly21/current/EMBL_format/ subdirectories, respectively.
All files are gzip compressed. Several freely available software
options exist for decompressing gzipped files on Windows. The
software and other useful information are available on these web sites:
- WinZip (http://www.winzip.com/)
- Stuffit (http://www.stuffit.com/)
- Gzip (http://www.gzip.org/)
and the gzip user's manual:
http://www.math.utah.edu/docs/info/gzip_toc.html
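Gzipped FASTA files can also be read directly from a script, without
decompressing them to disk first. The following is a minimal sketch
using only the Python standard library; the function names and any
file name passed to them are illustrative assumptions, not names of
files or tools provided by CGD.

```python
import gzip


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines.

    The header is returned without its leading '>' character; sequence
    lines belonging to one record are joined into a single string.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


def read_gzipped_fasta(path):
    """Yield (header, sequence) pairs from a gzip-compressed FASTA file.

    `path` is a hypothetical local file name, e.g. a file downloaded
    from one of the directories described below.
    """
    with gzip.open(path, "rt") as handle:
        yield from read_fasta(handle)
```

Because both functions are generators, records are produced one at a
time, so a large file need not be held in memory all at once.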
Additional sequence documentation is found on the CGD web site at:
http://www.candidagenome.org/help/SequenceHelp.shtml
------------------------------------------------
/Assembly21/
This directory contains sequence files for Assembly 21.
/Assembly21/current/
This directory contains the most current version of the sequences;
files are updated weekly.
/Assembly21/current/EMBL_format/
This directory contains current gene and sequence data from the
C. albicans Assembly 21 genome in EMBL file format. Files in this
directory are generated weekly and reflect the most current
information at CGD.
/Assembly21/archived_as_released/
This directory contains Candida albicans Assembly 21 (A21), as
released to CGD by the A21 collaborators and described in van het Hoog
et al., 2007: http://genomebiology.com/content/pdf/gb-2007-8-4-r52.pdf
/Assembly21/archive/
This directory contains archived versions of the Assembly 21
sequences. The sequences are checked for changes weekly and a new
file is added whenever there has been a change. The date of the
update is included in the filename.
------------------------------------------------
/Assembly20/
This directory contains sequence files for Assembly 20.
After October 2008 the files in this directory are NOT updated.
** PLEASE NOTE: Assembly 20 Sequence Advisory
**
** posted October 19, 2006, updated October 25, 2006
**
** The collaborative group who generated Assembly 20 has discovered that
** the sequence traces that they had been using to fill some of the gaps
** and determine overlaps between Assembly 19 contigs were derived from
** strain WO-1, rather than from the reference strain, SC5314.
**
** Please see http://www.candidagenome.org/help/Assembly20_Advisory.shtml
** for the latest information and status updates.
/Assembly20/current/
This directory contains the last updated version of the sequences,
as of Oct 6 2008.
/Assembly20/current/EMBL_format/
This directory contains current gene and sequence data from the
C. albicans Assembly 20 genome in EMBL file format. Files in this
directory are NOT updated and reflect the Assembly 20
information at CGD as of Oct 6 2008.
/Assembly20/archive/
This directory contains archived versions of the Assembly 20
sequences. Before Oct 6 2008 the sequences were checked for changes
weekly and a new file was added whenever there had been changes.
------------------------------------------------
/Assembly19/
This directory contains sequence files for Assembly 19.
Note that for the directories below, all files are available in both
haploid and diploid versions for Assembly 19, the haploid versions
being identified with '_haploid' in the file name. The haploid
versions of these files were created by omitting features mapping to
contigs in the assembly whose names begin with 'Contig19-20', as these
contigs contain the second copy of the alleles identified in the
diploid Assembly 19.
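The haploid filtering rule described above can be sketched as follows.
This is only an illustration of the rule, not the script used by CGD:
the representation of features as (feature name, contig name) pairs
and the function names are assumptions made for the example.

```python
# Contigs whose names begin with this prefix carry the second copy of
# the alleles in the diploid Assembly 19 (prefix taken from the text above).
SECOND_ALLELE_PREFIX = "Contig19-20"


def is_haploid_contig(contig_name):
    """Return True if the contig is kept in the haploid file versions."""
    return not contig_name.startswith(SECOND_ALLELE_PREFIX)


def haploid_features(features):
    """Keep only features that map to contigs in the haploid set.

    `features` is an iterable of (feature_name, contig_name) pairs;
    this layout is hypothetical and chosen only for illustration.
    """
    return [(name, contig) for name, contig in features
            if is_haploid_contig(contig)]
```

For example, given pairs with made-up names such as
("featureA", "Contig19-10001") and ("featureB", "Contig19-20001"),
only the first would be kept.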
/Assembly19/current/
This directory contains the most current version of the sequences;
files are updated weekly.
/Assembly19/archived_as_released/
This directory contains DNA sequences for diploid assembly 19 as
produced by the Candida Sequencing project at the Stanford Genome
Technology Center. These files are simply copies of the SGTC files,
and are here for archival purposes.
/Assembly19/archive/
This directory contains archived versions of the Assembly 19
sequences. The sequences are checked for changes weekly and a new
file is added whenever there have been changes.
/Assembly19/SC3514_traces/
This directory contains the original SC5314 sequence trace files and
quality scores generated by the Stanford Genome Technology Center
(SGTC). The Candida albicans server at the SGTC was taken
offline in October 2006, and these sequence data were provided by
the SGTC to CGD.
------------------------------------------------
/Assembly6/
/Assembly5/
/Assembly4/
These directories contain archived files from Assemblies 6, 5, and 4
of the Candida albicans (strain SC5314) genome from the Stanford
Genome Technology Center (see Jones et al. (2004) The diploid genome
sequence of Candida albicans. Proc Natl Acad Sci U S A
101(19):7329-34. URL:
http://www.pnas.org/cgi/content/full/101/19/7329). The Candida
albicans server at the SGTC was taken offline in October 2006, and
these sequence data were provided by the SGTC to CGD for archival
purposes.