Download Data

Files are described in detail in their associated README documents.


GO Annotations File
The gene_association.cgd file contains the Gene Ontology (GO) curation from CGD. Please note that this file contains ALL of the CGD GO curation, whereas the gene_association file that is available on the GO consortium (GOC) web site, http://www.geneontology.org/, has been filtered according to GOC guidelines. The file contains curation for current ORFs only (i.e., ORFs that have been deleted from C. albicans Assembly 21 have been omitted from this file).
Download the File|View the README
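As a rough illustration of working with the GO annotations, the sketch below groups GO IDs by gene. It assumes that gene_association.cgd follows the standard tab-delimited GO Annotation File (GAF) layout, with "!" comment lines, the object ID in column 2, and the GO ID in column 5; the README should be checked to confirm this. The function name and the sample record are hypothetical.

```python
import csv
import io
from collections import defaultdict

# 0-based column positions in the standard GAF layout. That the CGD file
# uses this layout is an assumption; confirm it against the README.
GAF_OBJECT_ID = 1
GAF_GO_ID = 4


def go_terms_by_gene(handle):
    """Map each DB object ID to the set of GO IDs annotated to it."""
    terms = defaultdict(set)
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines, "!" comment/header lines, and short rows.
        if not row or row[0].startswith("!") or len(row) <= GAF_GO_ID:
            continue
        terms[row[GAF_OBJECT_ID]].add(row[GAF_GO_ID])
    return dict(terms)


# Made-up record for illustration only; not real CGD data.
SAMPLE_GAF = (
    "!gaf-version: 2.0\n"
    "CGD\tEXAMPLE_ID_1\tEXAMPLE1\t\tGO:0008150\tREF:0\tND\t\tP\n"
)

if __name__ == "__main__":
    print(go_terms_by_gene(io.StringIO(SAMPLE_GAF)))
```

To use it on a downloaded copy, pass an open file handle in place of the in-memory sample.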

Chromosomal Feature File
The chromosomal_feature file contains information such as names and aliases (synonyms), descriptions, and any S. cerevisiae orthologs for the chromosomal features (including protein-encoding and non-coding genes) in CGD. The information for each curated species in CGD is contained in a separate chromosomal_feature file.
Go to the Chromosomal Feature File Directory|View the README

Sequence Files
Files containing chromosome, contig, ORF, protein, and intergenic sequences from Candida and Candida-related strains and species are available for download from this directory. For species for which older versions of the sequence and annotation are available (including C. albicans SC5314), both current sequence data and archived sequence data may be downloaded.
Go to the Sequence Download Directory|View the README

Phenotype data
The phenotype_data.tab file contains the CGD phenotype curation, in a tab-delimited table format. The phenotype information for each curated species in CGD is contained in a separate file.
Go to the Directory|View the README
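Because the column layout of phenotype_data.tab is documented only in its README, a cautious first step is to load the rows and check how many columns each one has before relying on any particular field. The sketch below does only that; the function names and the sample text are hypothetical, and no column meanings are assumed.

```python
import csv
import io
from collections import Counter


def read_tab_rows(handle):
    """Return the non-empty rows of a tab-delimited file as lists of strings."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


def column_count_histogram(rows):
    """Count how many rows have each number of columns."""
    return Counter(len(row) for row in rows)


# Made-up rows for illustration; the real column layout is in the README.
SAMPLE_TAB = "a\tb\tc\nd\te\tf\ng\th\n"

if __name__ == "__main__":
    rows = read_tab_rows(io.StringIO(SAMPLE_TAB))
    print(column_count_histogram(rows))
```

If the histogram shows more than one column count, some rows probably need special handling (for example, header lines or trailing empty fields).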

Datasets archived at CGD
Access the large-scale datasets from the CGD web site.
View the index|Browse the Download Directory

Analyses of C. albicans Assembly 21
ORFs were mapped from Assemblies 20 and 19 to Assembly 21, and the Assembly 21 files were processed at CGD to identify and classify changes that occurred between assemblies, and to identify other issues, as described in detail on the Sequence documentation page. Files containing all of these analyses (ORF lists, sequences, and/or alignments) are available.
Go to the Directory|View the README

Analyses of C. albicans Assembly 20
Assembly 20 files were processed at CGD to identify and classify changes that occurred between Assembly 19 and Assembly 20, and to identify other features in which users may be interested (e.g., introns/gaps/reading-frame adjustments), as described in detail on the Sequence documentation page. Files containing all of these analyses (ORF lists, sequences, and/or alignments) are available.
Go to the Directory|View the README

Supplementary Data for C. albicans Assembly 19
This is the supplement to the C. albicans genomic sequencing paper (Jones et al., 2004). It provides information about heterozygous polymorphism, raw contigs from the Phrap assembly, sequence omitted from diploid Assembly 19 (rDNA, mtDNA, etc.), and more.
Go to the Index Page

C. albicans Assembly 19 Contig Diagrams
These PDF files depict the assembly of Contig19s from Contig6s by the Stanford Genome Technology Center (SGTC). These files were originally made available from the Candida web server at the SGTC, and copies are archived here at CGD.
Download the Files|View the README

Mappings to external resources
These files contain mappings between CGD features and sequences from external resources, such as the UniProt/Swiss-Prot, RefSeq, and Entrez Gene databases.
Download the Files|View the README

Mappings of Historic C. albicans Contigs and ORFs
These files summarize the BLAST-based mapping of ORFs and contigs from older assemblies onto the Assembly 19 contigs and the Assembly 21 and Assembly 20 chromosomes.
Go to the Directory|View the README

C. albicans Assembly 6 Aliases
The orf19_orf6_mapping file provides a mapping between the orf6 names assigned during Assembly 6 of the genome sequence and the orf19 names assigned during Assembly 19.
Download the File|View the README

C. albicans Assembly 4 Aliases
The orf4_orf19_mapping file provides a mapping between the identifiers from Assembly 4 of the genome sequence and the orf19 names assigned during Assembly 19.
Download the File|View the README

GFF files
This directory contains files with information about the features in CGD and features in historic versions of the genome assembly in Generic Feature Format (GFF), as displayed in the GBrowse Genome Browser in CGD.
Go to the Directory|View the README
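For readers who want to process the GFF files directly, the sketch below parses a single feature line. It assumes the nine tab-separated columns common to GFF and GFF3-style "key=value" attributes separated by semicolons; whether the CGD files use that attribute style should be checked against the README. The function name and the sample line are hypothetical.

```python
from urllib.parse import unquote


def parse_gff_line(line):
    """Parse one GFF feature line into a dict, or return None if it is not one."""
    if not line.strip() or line.startswith("#"):
        return None  # blank line, comment, or directive
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return None  # not a standard nine-column feature line
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    # Assumes GFF3-style attributes: key=value pairs separated by ";".
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = unquote(value.strip())
    return {
        "seqid": seqid,
        "type": ftype,
        "start": int(start),  # 1-based, inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


# Made-up feature line for illustration only.
SAMPLE_GFF = "chrExample\tsrc\tgene\t100\t200\t.\t+\t.\tID=feat1;Name=Example%20gene\n"

if __name__ == "__main__":
    print(parse_gff_line(SAMPLE_GFF))
```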

Orthologs and Best Hits
The Orthologs directory contains the mappings between CGD genes and predicted orthologs among Candida-related species and in S. cerevisiae and S. pombe. The Orthologs directory also contains positional orthology mappings between C. albicans and C. dubliniensis, provided by John Gamble and Matthew Berriman at the Wellcome Trust Sanger Institute. The Best Hits directory contains mappings between C. albicans and S. cerevisiae at a level of similarity below that required by the strict criteria used to determine orthology.
Go to the Orthologs Directory|Go to the Best Hits Directory

Pathway Files
Download files with information about metabolic pathways from CGD.
Go to the Pathways Directory|View the README

Protein Domain Predictions
Output of InterProScan (IprScan) domain predictions for all CGD proteins.
Go to the Domains Directory|View the README

Codon Usage Table
The C_albicans_codon_usage file contains a table of calculated codon usage frequencies.
View the File and README
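To make the idea of a codon usage frequency concrete, the sketch below counts codons in a single in-frame coding sequence and reports each codon's share of the total. This is only an illustration of the general calculation; the method and units used to build the C_albicans_codon_usage file are described in its README and may differ. The function name and the sample sequence are hypothetical.

```python
from collections import Counter


def codon_frequencies(cds):
    """Return each codon's fraction of all codons in an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    # Made-up sequence: four codons (ATG, CTG, CTG, TAA).
    print(codon_frequencies("ATGCTGCTGTAA"))
```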

Miscellaneous Annotation Files
This directory contains files that were constructed by CGD in response to a specific request, but which may be useful to other members of the research community. The C_albicans_codon_usage.tab file contains a codon usage table (for Assembly 21). The CGD_GO_genespring_format.tab file contains Gene Ontology curation in a format for use with GeneSpring software. The orf6_short_desc.txt and orf6_long_desc_plus_GO.txt files contain basic information about genes in CGD from Assemblies 6 and 19 (e.g., orf6 and orf19 names, description information, and GO terms).
Go to the Directory|View the README

Community-contributed Data Files
Files contributed by members of the community. UAU1_nondisruptable.txt (from Aaron Mitchell) is a list of C. albicans genes in which no UAU1 insertions were obtained among at least 12 independent transformants, suggesting that these genes may be essential (but conclusive demonstration of essentiality requires additional experimentation). The GSEA_Nantel_2012 directory contains files for running Gene Set Enrichment Analysis (GSEA) on Candida albicans data, provided to CGD by Andre Nantel.
Go to the Directory|View the README

GO Slim file
The goslim_candida.obo file contains the subset of GO that is used with the GO Slim Term Mapper tool. The file is best viewed with OBO-Edit, a tool available at the Gene Ontology Consortium website.
View the obo file
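The OBO file is plain text, so its term IDs and names can also be listed without OBO-Edit. The sketch below is a minimal reader that assumes the usual OBO flat-file layout ("[Term]" stanzas containing "id:" and "name:" lines); it ignores every other tag and is not a full OBO parser. The function name and the sample text are hypothetical.

```python
def parse_obo_terms(text):
    """Return {term id: term name} for the [Term] stanzas in OBO-format text."""
    stanzas = []
    current = None  # dict for the [Term] stanza being read, else None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza header; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                stanzas.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current.setdefault(tag, value)  # keep the first value of each tag
    return {s["id"]: s.get("name", "") for s in stanzas if "id" in s}


# Small hand-written sample in OBO layout, for illustration only.
SAMPLE_OBO = """format-version: 1.2

[Term]
id: GO:0008150
name: biological_process

[Typedef]
id: part_of
name: part of
"""

if __name__ == "__main__":
    print(parse_obo_terms(SAMPLE_OBO))
```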

Candida GO Slim Annotations File
The GOslim_gene_association.cgd file contains GO Slim annotations for CGD genes, using the CGD GO Slim instead of the entire Gene Ontology. A GO Slim is a small subset of terms from the Gene Ontology, which is intended to provide a general overview without all the fine-grained detail contained in the GO itself. For the actual CGD GO curation, please use the gene_association.cgd file.
Download the File|View the README

Batch Download Tool
Simultaneously retrieve multiple types of data for a list of gene or feature names.
Go to the Batch Download Tool page

Browse Downloads
Browse the Download directories on the CGD web site.
Go to the top-level directory


