The file orf19_orf6_mapping.txt maps the names of the Open Reading Frames (ORFs) identified in C. albicans SC5314 Assembly 19 to the names of the ORFs in Assembly 6. The mapping was generated by a BLAST search of the haploid set of orf19 predicted proteins (file available at http://www.candidagenome.org/download/sequence/genomic_sequence/orf_protein/orf_trans_all_haploid.fasta, as of October 27, 2005) against the orf6 predicted proteins (file from the Stanford Genome Technology Center, downloaded from http://www.candidagenome.org/download/sequence/genomic_sequence/archived_assemblies/Ca-Assembly6.orf_trans). The best hit, and any hits with >90% identity, were retained. The pairs were then screened: if the orf6 in an orf19-orf6 pairing had a more significant hit to a different orf19, the less significant pairing was removed. Where multiple orf6 matches remained for a single orf19, some manual curation was performed to remove pairs with less significant E-values. An attempt was made to check that adjacent orf6 ORFs aligned with adjacent orf19 ORFs; however, this did not prove useful as a validation measure because of apparent regions of misassembly in Assembly 6. Note that this is not necessarily a 1-to-1 mapping; some ORFs have multiple matches.
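The automated part of the selection described above can be sketched in Python as follows. This is only an illustration of the two filtering rules, not the script that produced the file: the function name select_pairs, the tuple layout, the assumption of one row per orf19-orf6 pair, and the assumption that percent identity is on a 0-100 scale are all hypothetical, and the manual curation step is not reproduced.

```python
from collections import defaultdict


def select_pairs(hits, identity_cutoff=90.0):
    """Filter BLAST hits given as (orf19, orf6, evalue, pcident) tuples.

    Hypothetical helper: assumes one tuple per orf19-orf6 pair and
    pcident expressed on a 0-100 scale.
    """
    # Group the hits by query (orf19) identifier.
    by_orf19 = defaultdict(list)
    for hit in hits:
        by_orf19[hit[0]].append(hit)

    # Rule 1: per orf19, keep the best hit (lowest E-value) plus any
    # other hit whose percent identity exceeds the cutoff.
    retained = []
    for orf19_hits in by_orf19.values():
        best = min(orf19_hits, key=lambda h: h[2])
        for h in orf19_hits:
            if h is best or h[3] > identity_cutoff:
                retained.append(h)

    # Rule 2: per orf6, find the most significant (lowest) E-value among
    # the retained pairs, then drop pairings that are less significant.
    best_e = {}
    for _, orf6, evalue, _ in retained:
        best_e[orf6] = min(evalue, best_e.get(orf6, float("inf")))
    return [h for h in retained if h[2] <= best_e[h[1]]]
```

Pairs with tied E-values for the same orf6 are all kept in this sketch, which is consistent with the mapping not being strictly 1-to-1.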
The file of pairings contains the following columns:

Column  Description
1       The orf19 identifier
2       The Assembly 19 Contig from which the orf19 ORF derives
3       The orf6 identifier
4       The Assembly 6 Contig from which the orf6 ORF derives
5       E, the expectation or E-value
6       N, the number of scores considered jointly in computing E
7       Sprime, the normalized alignment score, expressed in units of bits
8       S, the raw alignment score
9       alignlen, the overall length of the alignment including any gaps
10      nident, the number of identical letter pairs
11      npos, the number of letter pairs contributing a positive score
12      nmism, the number of mismatched letter pairs
13      pcident, percent identity over the alignment length (as a fraction of alignlen)
14      pcpos, percent positive letter pairs over the alignment length (as a fraction of alignlen)
15      qgaps, number of gaps in the query sequence
16      qgaplen, total length of all gaps in the query sequence
17      sgaps, number of gaps in the subject sequence
18      sgaplen, total length of all gaps in the subject sequence
19      qframe, the reading frame in the query sequence (+0 for protein sequences in BLASTP and TBLASTN searches)
20      qstart, the starting coordinate of the alignment in the query sequence
21      qend, the ending coordinate of the alignment in the query sequence
22      sframe, the reading frame in the subject sequence (+0 for protein sequences in BLASTP and BLASTX searches)
23      sstart, the starting coordinate of the alignment in the subject sequence
24      send, the ending coordinate of the alignment in the subject sequence
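As an illustration of how the 24 columns might be consumed, the following Python sketch reads the file into a dictionary keyed by orf19 identifier. It assumes the file is tab-delimited with one pairing per line, which is not stated here; the names read_mapping, orf19, orf19_contig, orf6 and orf6_contig are hypothetical labels for columns 1 to 4, while the remaining names follow the column descriptions. Values are left as strings.

```python
import csv
import io

# Labels for the 24 columns; the first four are hypothetical names.
COLUMNS = [
    "orf19", "orf19_contig", "orf6", "orf6_contig",
    "E", "N", "Sprime", "S", "alignlen", "nident", "npos", "nmism",
    "pcident", "pcpos", "qgaps", "qgaplen", "sgaps", "sgaplen",
    "qframe", "qstart", "qend", "sframe", "sstart", "send",
]


def read_mapping(handle, delimiter="\t"):
    """Return {orf19: [record, ...]} from an open text handle.

    Assumes a tab-delimited layout; lines that do not have exactly
    24 fields (for example blank lines) are skipped.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != len(COLUMNS):
            continue
        record = dict(zip(COLUMNS, row))
        # An orf19 may pair with several orf6 ORFs, so store a list.
        mapping.setdefault(record["orf19"], []).append(record)
    return mapping


# Usage with a made-up line; real rows come from orf19_orf6_mapping.txt.
example = "\t".join(["orf19.1", "Contig19-1", "orf6.10", "Contig6-1"] + ["0"] * 20)
mapping = read_mapping(io.StringIO(example + "\n"))
```

Because the mapping is not necessarily 1-to-1, each orf19 key holds a list of records rather than a single record.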