Index of /download/Assembly20notes/Advisory
Name                                  Last modified     Size  Description
Parent Directory                                        -
suspect_WO1_regions_reduced.gff       2023-06-29 09:53  41K
suspect_WO1_regions.gff               2023-06-29 09:53  26K
ORFsWithinSuspectRegions_reduced.txt  2023-06-29 09:53  85K
ORFsWithinSuspectRegions.txt          2023-06-29 09:53  77K
This directory contains files with additional information related to the Assembly
20 Sequence Advisory. For the latest information and status updates, please see:
http://www.candidagenome.org/help/Assembly20_Advisory.shtml
------------------------------------
suspect_WO1_regions.gff
Lists the regions that the BRI flagged as potentially derived from WO-1,
with the chromosomal coordinates of each region.
This file is in Generic Feature Format
(GFF, http://www.sequenceontology.org/gff3.shtml).
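
As a rough illustration of reading such a file, the sketch below pulls the
sequence name and the start/end coordinates out of GFF lines. It assumes only
the standard nine tab-delimited GFF columns (seqid in column 1, start in
column 4, end in column 5). The names Region and parse_gff_regions, and the
example line, are hypothetical and are not taken from the actual file.

```python
from typing import Iterable, List, NamedTuple


class Region(NamedTuple):
    seqid: str   # chromosome or contig name (GFF column 1)
    start: int   # 1-based, inclusive (GFF column 4)
    end: int     # 1-based, inclusive (GFF column 5)


def parse_gff_regions(lines: Iterable[str]) -> List[Region]:
    """Collect (seqid, start, end) from GFF lines, skipping comments."""
    regions = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and "#" comment/directive lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Ignore lines too short to hold the coordinate columns.
        if len(cols) < 5:
            continue
        regions.append(Region(cols[0], int(cols[3]), int(cols[4])))
    return regions


# Made-up input for illustration only; not real data from the advisory files.
example = [
    "##gff-version 3",
    "chrA\texample\tregion\t1001\t3500\t.\t.\t.\tNote=hypothetical",
]
regions = parse_gff_regions(example)
```

To read the real file, pass an open file handle instead of the example list,
e.g. parse_gff_regions(open("suspect_WO1_regions.gff")).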
------------------------------------
ORFsWithinSuspectRegions.txt
Lists the ORFs and non-ORF features (e.g., tRNA) that are affected by the suspect
regions (i.e., fully or partly contained within a suspect region). Includes
chromosomal coordinates of the ORF/feature and the suspect region that overlaps
it, with additional descriptive information about each ORF.
This file is in tab-delimited text format.
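
The distinction between "fully" and "partly" contained can be sketched as a
simple interval test. This is an illustration only, not the procedure CGD
used to build the file. It assumes 1-based inclusive coordinates on the same
chromosome, and the function name classify_overlap is hypothetical.

```python
def classify_overlap(feat_start: int, feat_end: int,
                     reg_start: int, reg_end: int) -> str:
    """Return "full", "partial", or "none" for a feature vs. a suspect region."""
    # Features on the reverse strand may be listed with start > end,
    # so normalize to low/high coordinates first.
    lo, hi = sorted((feat_start, feat_end))
    if hi < reg_start or lo > reg_end:
        return "none"      # no shared bases
    if lo >= reg_start and hi <= reg_end:
        return "full"      # feature lies entirely inside the region
    return "partial"       # feature extends past one or both region ends
```

For example, a feature at 100-400 against a suspect region at 50-300 is
classified as "partial" under these assumptions.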
------------------------------------
suspect_WO1_regions_reduced.gff
Lists the regions that are potentially derived from WO-1, with their chromosomal
coordinates. The BRI identified as "suspect" the gaps between contigs, which may have been
filled with sequence from WO-1, plus the 1 kb regions flanking each gap, in which the BRI
may have changed the SC5314 sequence based on WO-1 sequence. CGD compared the
1 kb flanking parts of each suspect region to Contig19 sequences and reduced the size
of the suspect region wherever the sequence was clearly the same as the original sequence
from SC5314.
Specifically, this was accomplished as follows. Beginning from the side
of the suspect flanking region furthest from the gap (the side that abuts the non-suspect
region of the contig), a 100 bp section was compared to the corresponding Assembly 19
contig by BLAST. If the sequence matched perfectly, that section was considered "no longer
suspect," and the adjoining 100 bp section of the suspect flanking region was compared
to the Assembly 19 contig in turn. Iterations continued, reducing the suspect region in
100 bp increments, for as long as the 100 bp section of the Assembly 20 flanking region and
the corresponding Assembly 19 contig showed 100% identity. At the first sequence discrepancy,
the entire 100 bp section containing it, together with all of the flanking region between
that section and the gap, remained classified as "suspect." The part of the flanking region
that aligned perfectly with the contig was removed from the suspect list. The resulting
regions are the ones that now appear with the label "Suspect WO1" in the Genome Browser on
the CGD web site.
------------------------------------
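The 100 bp reduction procedure described for suspect_WO1_regions_reduced.gff
can be sketched as follows. This is a simplified illustration, not CGD's
actual pipeline: it replaces the BLAST comparison with a plain string
comparison and assumes the Assembly 20 flank and the corresponding Assembly 19
sequence are already aligned position by position, with index 0 at the end
furthest from the gap. The function name cleared_length is hypothetical.

```python
def cleared_length(flank_a20: str, flank_a19: str, window: int = 100) -> int:
    """Return how many bases, counted from the end furthest from the gap,
    can be removed from the suspect list.

    Both strings must be oriented so that index 0 is the end furthest
    from the gap. Exact string equality stands in for a 100% identity
    BLAST match (an assumption of this sketch).
    """
    cleared = 0
    # Walk toward the gap one full window at a time.
    while cleared + window <= len(flank_a20):
        a20 = flank_a20[cleared:cleared + window]
        a19 = flank_a19[cleared:cleared + window]
        if a20 != a19:
            # First discrepancy: this window and everything between it
            # and the gap stay suspect.
            break
        cleared += window
    # Any trailing stretch shorter than one window is never cleared.
    return cleared


# Made-up 1 kb flank with a single difference at position 350.
a19 = "ACGT" * 250
a20 = a19[:350] + "T" + a19[351:]
still_suspect = len(a20) - cleared_length(a20, a19)
```

In this made-up case the first three 100 bp sections match, so 300 bp are
cleared and the remaining 700 bp next to the gap stay suspect.
------------------------------------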
ORFsWithinSuspectRegions_reduced.txt
Lists the ORFs and non-ORF features (e.g., tRNA) that are affected by the suspect
regions (i.e., fully or partly contained within a suspect region) after the
regions have been reduced by CGD, as described above. Includes chromosomal coordinates of the
ORF/feature and the suspect region that overlaps it, with additional descriptive
information about each ORF.
This file is in tab-delimited text format.
------------------------------------