Download Data
Files are described in detail in their associated README documents.
|
GO Annotations File |
The gene_association.cgd file contains the Gene Ontology (GO) curation from CGD. Please note that this file contains ALL of the CGD GO curation, whereas the gene_association file that is available on the GO consortium (GOC) web site, http://www.geneontology.org/, has been filtered according to GOC guidelines. The file contains curation for current ORFs only (i.e., ORFs that have been deleted from Assembly 21 have been omitted from this file).
Download the File|View the README
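As a rough illustration of working with this file, the sketch below groups GO IDs by gene. It assumes gene_association.cgd follows the GO Consortium's tab-delimited gene association (GAF) layout, in which comment lines start with "!" and the object ID and GO ID sit in the second and fifth columns; the README remains the authoritative description. The function name and the sample rows are made up for illustration.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, Set

# 0-based column positions, assuming the standard GAF column order.
DB_OBJECT_ID = 1
GO_ID = 4
ASPECT = 8


def go_terms_by_gene(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect the set of GO IDs annotated to each DB object ID."""
    terms: Dict[str, Set[str]] = defaultdict(set)
    # Drop blank lines and header/comment lines, which start with "!".
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("!"))
    for row in csv.reader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) <= ASPECT:
            continue  # skip truncated or malformed rows
        terms[row[DB_OBJECT_ID]].add(row[GO_ID])
    return dict(terms)


# Made-up rows for illustration only; real rows come from the downloaded file.
sample = [
    "!gaf-version: 2.0",
    "CGD\tID1\tSYM1\t\tGO:0000001\tREF\tIDA\t\tP",
    "CGD\tID1\tSYM1\t\tGO:0000002\tREF\tIEA\t\tF",
]
annotations = go_terms_by_gene(sample)
```

To run this on the real file, pass an open file handle instead of the sample list, for example `go_terms_by_gene(open("gene_association.cgd"))`, where the local path is an assumption.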
|
Chromosomal Feature File |
The chromosomal_feature file contains information such as names and aliases (synonyms), descriptions, and any S. cerevisiae orthologs for the chromosomal features (including protein-encoding and non-coding genes) in CGD. ORFs present in Assemblies 19, 20, and/or 21 are included in this file.
Download the File|View the README
|
Sequence Files |
Files containing chromosome, contig, ORF, protein, and intergenic sequences from Candida and Candida-related strains and species are available for download from this directory. For C. albicans SC5314, current sequence data and archived sequence data from previous assemblies of the genome sequence are available. Before downloading data from Assembly 20, please see the Assembly 20 Sequence Advisory.
Go to the Sequence Download Directory|View the README
|
Phenotype data |
The phenotype_data.tab file contains the CGD phenotype curation, in a tab-delimited table format.
Download the File|View the README
|
Analyses of Assembly 21 |
ORFs were mapped from Assemblies 20 and 19 to Assembly 21, and the Assembly 21 files were processed at CGD to identify and classify changes that occurred between assemblies, and to identify other issues, as described in detail on the Sequence documentation page. Files containing all of these analyses (ORF lists, sequences, and/or alignments) are available.
Go to the Directory|View the README
|
Analyses of Assembly 20 |
The Assembly 20 files were processed at CGD to identify and classify changes that occurred between Assembly 19 and Assembly 20, and to identify other features in which users may be interested (e.g., introns/gaps/reading-frame adjustments), as described in detail on the Sequence documentation page. Files containing all of these analyses (ORF lists, sequences, and/or alignments) are available.
Go to the Directory|View the README
|
Supplementary Data for Assembly 19 |
The supplement to the C. albicans genomic sequencing paper, Jones et al., 2004. Access information about heterozygous polymorphism, raw contigs from the Phrap assembly, sequence omitted from diploid Assembly 19 (rDNA, mtDNA, etc.), and more.
Go to the Index Page
|
Assembly 19 Contig Diagrams |
These PDF files depict the assembly of Contig19s from Contig6s by the Stanford Genome Technology Center (SGTC). These files were originally made available from the Candida web server at the SGTC, and copies are archived here at CGD.
Download the Files|View the README
|
Mappings to external resources |
These files contain mappings between CGD features and sequences from external resources, such as the UniProt/Swiss-Prot, RefSeq, and Entrez Gene databases.
Download the Files|View the README
|
Mappings of Historic Contigs and ORFs |
These files summarize the BLAST-based mapping of ORFs and contigs from older assemblies onto the Assembly 19 contigs and the Assembly 21 and Assembly 20 chromosomes.
Go to the Directory|View the README
|
Assembly 6 Aliases |
The orf19_orf6_mapping file provides a mapping between the orf6 names assigned during Assembly 6 of the genome sequence and the orf19 names assigned during Assembly 19.
Download the File|View the README
|
Assembly 4 Aliases |
The orf4_orf19_mapping file provides a mapping between the identifiers from Assembly 4 of the genome sequence and the orf19 names assigned during Assembly 19.
Download the File|View the README
|
GFF files |
This directory contains files with information about the features in CGD in Generic Feature Format (GFF), as displayed in the GBrowse Genome Browser in CGD. The information contained in these files includes the CGD annotation in GFF format, the historic C. albicans genome assembly mapping files, gap regions in Assembly 21, introns in 5' UTRs from Mitrovich et al. (2007), and SNPs from Forche et al. (2004).
Go to the Directory|View the README
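For readers who want to process these files programmatically, here is a minimal sketch of a GFF reader. It assumes the nine tab-separated columns of the GFF specification and GFF3-style `key=value` attributes separated by semicolons; older GFF versions write attributes differently, so check the README before relying on it. `Feature` and `parse_gff` are hypothetical names introduced here.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Feature(NamedTuple):
    seqid: str
    feature_type: str
    start: int
    end: int
    strand: str
    attributes: Dict[str, str]


def parse_gff(lines: Iterable[str]) -> Iterator[Feature]:
    """Yield one Feature per well-formed, nine-column GFF line."""
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line or line.startswith("#"):
            continue  # blank line, comment, or directive
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip malformed lines
        attributes: Dict[str, str] = {}
        # Assumes GFF3-style "key=value" pairs separated by ";".
        for pair in cols[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attributes[key.strip()] = value.strip()
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6], attributes)
```

A typical use would be filtering by feature type, for example `[f for f in parse_gff(handle) if f.feature_type == "gene"]`, where `handle` is an open file from the GFF directory.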
|
Orthologs and Best Hits |
The Orthologs directory contains the mappings between C. albicans genes and the predicted orthologs in S. cerevisiae, as well as the raw input and output files from the InParanoid ortholog computation. S. cerevisiae ortholog predictions are available for Assembly 21, Assembly 20, and Assembly 19. The Orthologs directory also contains positional orthology mappings between C. albicans and C. dubliniensis, provided by John Gamble and Matthew Berriman at the Wellcome Trust Sanger Institute. The Best Hits directory contains mappings between C. albicans and S. cerevisiae at a level of similarity below that required by the strict criteria used to determine orthology.
Go to the Orthologs Directory|Go to the Best Hits Directory
|
Pathway Files |
Download files with information about metabolic pathways from CGD.
Go to the Pathways Directory|View the README
|
Protein Domain Predictions |
Output of IprScan domain predictions for all CGD proteins.
Go to the Domains Directory|View the README
|
Codon Usage Table |
The C_albicans_codon_usage file contains a table of calculated codon usage frequencies.
View the File and README
|
Miscellaneous Annotation Files |
This directory contains files that were constructed by CGD in response to a specific request, but which may be useful to other members of the research community. The C_albicans_codon_usage.tab file contains a codon usage table (for Assembly 21). The CGD_GO_genespring_format.tab file contains Gene Ontology curation in a format for use with GeneSpring software. The orf6_short_desc.txt and orf6_long_desc_plus_GO.txt files contain basic information about genes in CGD from Assemblies 6 and 19 (e.g., orf6 and orf19 names, description information, and GO terms).
Go to the Directory|View the README
|
Community-contributed Data Files |
Files contributed by members of the community. UAU1_nondisruptable.txt (from Aaron Mitchell) is a list of C. albicans genes in which no UAU1 insertions were obtained among at least 12 independent transformants, suggesting that these genes may be essential (but conclusive demonstration of essentiality requires additional experimentation).
Go to the Directory|View the README
|
GO Slim file |
The goslim_candida.obo file contains the subset of GO that is used with the GO Slim Term Mapper tool. The file is best viewed with OBO-Edit, a tool available at the Gene Ontology Consortium website.
View the obo file
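If a graphical viewer is not at hand, the term list can also be pulled out with a few lines of code. The sketch below assumes the usual OBO flat-file layout, in which each term is a `[Term]` stanza containing `id:` and `name:` tag lines; it ignores every other tag and stanza type. `parse_obo_terms` is a hypothetical helper, not part of any CGD or GO tool.

```python
from typing import Dict, Iterable


def parse_obo_terms(lines: Iterable[str]) -> Dict[str, str]:
    """Map each term ID to its name, reading only [Term] stanzas."""
    terms: Dict[str, str] = {}
    in_term = False
    term_id = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            term_id = None
        elif in_term and line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif in_term and term_id and line.startswith("name:"):
            terms[term_id] = line[len("name:"):].strip()
    return terms
```

Calling `parse_obo_terms(open("goslim_candida.obo"))` (the local path is an assumption) would return a dictionary of slim term IDs and names.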
|
Candida GO Slim Annotations File |
The GOslim_gene_association.cgd file contains GO Slim annotations for Candida genes, using the CGD GO Slim instead of the entire Gene Ontology. A GO Slim is a small subset of terms from the Gene Ontology, which is intended to provide a general overview without all the fine-grained detail contained in the GO itself. For the actual CGD GO curation, please use the gene_association.cgd file.
Download the File|View the README
|
Datasets archived at CGD |
Access the large-scale datasets from the CGD web site.
View the index|Browse the ftp directory
|
Batch Download Tool |
Simultaneously retrieve multiple types of data for a list of gene or feature names.
Go to the Batch Download Tool page
|
Browse Downloads |
Browse the Download directories on the CGD web site.
Go to the top-level directory
|