Contents
- Description
- Using Gene/Sequence Resources
- Finding Chromosomal Coordinates
- Accessing Gene/Sequence Resources
- Other Relevant Links
Gene/Sequence Resources (GSR) serves as a central point for accessing much of the information available at CGD for 1) a named DNA sequence, 2) a specified chromosomal region or list of regions, or 3) a raw DNA or protein sequence. This information includes biological information, table/map displays, and sequence analysis and retrieval options. Once you have specified a sequence name or region(s), GSR presents only those options that are available for your entry.
CGD's Gene/Sequence Resources tool and Batch Download tool both allow you to retrieve sequences in batch for a list of regions. The difference between the batch options of these two tools is that GSR retrieves the entire nucleotide sequence between the coordinates specified in a list, while Batch Download retrieves only the sequences of the features (protein-coding and RNA genes, centromeres, etc.) that are annotated within the specified regions.
Whenever possible, selecting one of the available options for your entered sequence name or region takes you directly to the results for that sequence. In other cases, it takes you to the resource with the sequence already pasted in (e.g., BLAST, Restriction Analysis). When a list of sequence regions is entered, you will be presented with a link allowing you to download a file containing the sequences.
Note: Only ONE of the options listed below may be filled out at a time in order for the submission to be processed.
If at any time you decide you would like to change your selection, you may use the "Reset Form" button found at the bottom of the page to erase your entries.
All of the options retrieve the Watson strand (with ascending chromosomal coordinates) of the specified gene or region. To retrieve the Crick strand, click the "Use the reverse complement" checkbox for that option.
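The relationship between the two strands can be illustrated with a short Python sketch. It is not part of CGD; the function name `reverse_complement` is hypothetical, and only the unambiguous bases and N are handled (other IUPAC ambiguity codes pass through unchanged).

```python
# Complement table for A, C, G, T and N in upper and lower case.
# Other IUPAC ambiguity codes are not translated in this sketch.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(watson: str) -> str:
    """Return the Crick-strand sequence, read 5' to 3', for a Watson-strand sequence."""
    # Complement every base, then reverse so the result reads 5' to 3'.
    return watson.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGGCC"))  # prints GGCCAT
```

Checking the "Use the reverse complement" box should give the equivalent result for the region you request, without any local processing.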
With this input option, you can enter a gene name (e.g., ACT1), ORF name (e.g., orf19.5007), or CGDID (e.g., CAL0001571), and then click the "Submit form" button. You can also enter the first few characters of a named DNA sequence followed by the wildcard character (*). A query that matches multiple names returns a list of sequences from which you can then select a single sequence.
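As a rough illustration of how a trailing wildcard selects names, the Python sketch below matches a query against a small hand-written list. The list, the function name `match_names`, and the case-insensitive comparison are assumptions made for the example; CGD's actual matching rules may differ.

```python
from fnmatch import fnmatchcase


def match_names(query: str, names: list[str]) -> list[str]:
    """Return the names matched by a query that may contain the * wildcard.

    Assumption: matching is case-insensitive. Note that fnmatch also gives
    special meaning to '?' and '[...]', which the GSR form may not.
    """
    pattern = query.upper()
    return [name for name in names if fnmatchcase(name.upper(), pattern)]


if __name__ == "__main__":
    # Illustrative list only, not a dump of CGD names.
    names = ["ACT1", "ACC1", "ACE2", "ADE2"]
    print(match_names("AC*", names))  # prints ['ACT1', 'ACC1', 'ACE2']
```

A query without a wildcard matches only the exact name, which corresponds to the single-sequence case described above.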
You may also retrieve information about flanking sequences upstream and/or downstream of the entered gene/sequence name. To do this, type the length of the flanking region you would like to retrieve in the boxes (upstream and/or downstream) below where you entered the sequence name. Note: negative numbers are not accepted in these boxes. If you would like to retrieve part of an ORF, use the chromosomal coordinates in retrieval option 2.
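If you want to work out by hand which chromosomal coordinates a flanked request would cover, the arithmetic might look like the Python sketch below. It is an illustration under stated assumptions, not a description of how GSR computes the region: coordinates are taken to be 1-based with start <= end, "upstream" is taken relative to the gene's own strand, and the function name `flanked_region` and the "W"/"C" strand codes are hypothetical.

```python
def flanked_region(start: int, end: int, upstream: int = 0,
                   downstream: int = 0, strand: str = "W") -> tuple[int, int]:
    """Return (left, right) Watson-strand coordinates for a feature plus flanks.

    Assumptions: 1-based coordinates with start <= end; strand is "W" (Watson)
    or "C" (Crick); the left edge is clamped to 1. The right edge is not
    clamped because the chromosome length is not known here.
    """
    if upstream < 0 or downstream < 0:
        raise ValueError("flank lengths must not be negative")
    if strand not in ("W", "C"):
        raise ValueError("strand must be 'W' or 'C'")
    if strand == "W":
        left, right = start - upstream, end + downstream
    else:
        # On the Crick strand, upstream lies at higher Watson coordinates.
        left, right = start - downstream, end + upstream
    return max(1, left), right


if __name__ == "__main__":
    print(flanked_region(1000, 2000, upstream=500, downstream=100))  # prints (500, 2100)
```

The resulting pair could then be entered as left and right coordinates in retrieval option 2.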
Select "Submit form" to bring up the list of further options available for viewing and retrieving information about the named sequence you entered.
This entry option allows you to specify a region by choosing the chromosome number or contig name from the pull-down menu and entering coordinates, if desired. To use this option, simply select a chromosome or contig using the pull-down menu, and then enter the left and right chromosomal basepair coordinates in the boxes provided below. If no coordinates are entered, the first 100,000 bp of the chromosome or contig will be retrieved.
This option will retrieve the Watson strand, regardless of the order in which the coordinates are entered. If you would like to retrieve or manipulate the reverse complement of the sequence, check the "Use the Reverse Complement" box.
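The two rules above (coordinate order does not matter, and missing coordinates default to the first 100,000 bp) can be sketched in Python as follows. The function name `normalize_region` is hypothetical, coordinates are assumed to be 1-based, and rejecting a request with only one coordinate is an assumption of this sketch rather than documented GSR behavior.

```python
from typing import Optional

DEFAULT_SPAN = 100_000  # default length stated in the help text


def normalize_region(left: Optional[int] = None,
                     right: Optional[int] = None) -> tuple[int, int]:
    """Return (low, high) Watson-strand coordinates for a requested region."""
    if left is None and right is None:
        # No coordinates entered: first 100,000 bp (assuming 1-based positions).
        return (1, DEFAULT_SPAN)
    if left is None or right is None:
        # Assumption: this sketch refuses to guess the missing coordinate.
        raise ValueError("enter both coordinates or neither")
    if left < 1 or right < 1:
        raise ValueError("coordinates must be positive")
    # Order does not matter; the lower coordinate always comes first.
    return (min(left, right), max(left, right))


if __name__ == "__main__":
    print(normalize_region(20455, 1356))  # prints (1356, 20455)
    print(normalize_region())             # prints (1, 100000)
```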
Select "Submit form" to bring up the list of further options available for viewing and retrieving information about the sequence you selected.
Note that the entire sequences of the chromosomes and contigs are also available for download from CGD's Sequence Download directory.
If you would like to retrieve DNA sequences from multiple regions, create a file with the following tab- or space-separated columns: chromosome or contig name, start coordinate, and end coordinate.
For example:
Ca21chr3_C_albicans_SC5314 1356 20455
Ca21chr4_C_albicans_SC5314 11331 18001
Ca21chr6_C_albicans_SC5314 9856 100010
Contig19-10109 4600 24000
Contig19-10216 200310 220546
Sequence coordinates (columns 2 and 3) are optional in this file; if no coordinates are present, the first 100,000 nucleotides of the chromosome or contig will be retrieved.
Use the "Browse" button to locate the file on your computer and click the Submit button to upload it. The resulting page presents a link allowing you to download a compressed file in FASTA format containing the sequences you requested.
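If the list of regions comes from a script or spreadsheet, the upload file could be generated along the lines of the Python sketch below. The function name `format_regions` is hypothetical, and allowing only one field (name) or three fields (name plus both coordinates) per row is an assumption based on the example above.

```python
def format_regions(regions) -> str:
    """Return tab-separated text with one region per line.

    Each region is either (name,) or (name, start, end).
    Assumption: a row with exactly one coordinate is treated as an error.
    """
    lines = []
    for region in regions:
        if len(region) not in (1, 3):
            raise ValueError(f"expected 1 or 3 fields, got {len(region)}: {region!r}")
        lines.append("\t".join(str(field) for field in region))
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    example = [
        ("Ca21chr3_C_albicans_SC5314", 1356, 20455),
        ("Contig19-10109", 4600, 24000),
        ("Contig19-10216",),  # no coordinates: first 100,000 nucleotides
    ]
    print(format_regions(example), end="")
```

Writing the returned text to a plain-text file gives something that can be selected with the "Browse" button.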
This entry option allows you to enter a raw DNA or protein sequence for which you would like to retrieve information. First, use the pull-down menu to select the type of sequence you would like to enter, DNA or protein. Next, position the cursor in the entry box and either type or paste in a sequence.
Note that the sequence entered must be provided in RAW format, without comments (numbers are okay).
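A sequence copied from a FASTA or similar file usually carries a header line that must be removed before pasting. The Python sketch below shows one way to do that; the function name `to_raw_sequence` is hypothetical, and treating lines that start with ">" or ";" as comments is an assumption based on common FASTA conventions. Digits and whitespace are stripped as well for tidiness, although the form accepts numbers.

```python
import re


def to_raw_sequence(text: str) -> str:
    """Strip header/comment lines, digits, and whitespace from pasted sequence text."""
    # Drop FASTA-style header (">") and comment (";") lines.
    body = [line for line in text.splitlines() if not line.startswith((">", ";"))]
    # Remove position numbers and any whitespace, leaving only sequence letters.
    return re.sub(r"[\d\s]", "", "".join(body))


if __name__ == "__main__":
    pasted = ">seq1 example header\n1 ATGC 5\nGGTA\n"
    print(to_raw_sequence(pasted))  # prints ATGCGGTA
```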
Select "Submit form" to bring up the list of further options available for viewing and retrieving information about the sequence you entered.
After you submit 1) a named sequence, 2) a chromosomal region or list of regions, or 3) a raw DNA or protein sequence, a page is returned that lists all information, displays, analyses, and sequence retrieval options available for the sequence. Choose one of these options; each is described below. Please note that some of them may not be available for your selected entry, depending on the type of sequence you have entered.
If you decide you would like to change your original selection, click the [Change Selection or Coordinates] button. This will bring back the Gene/Sequence Resources entry page with your information still filled in. You can then modify and re-submit the form.
Availability: This option is available when a gene or ORF name is entered on the Gene/Sequence Resources entry page.
Availability: This option is available when a gene or ORF name is entered on the Gene/Sequence Resources entry page.
Availability: This option is available when a gene, ORF, or chromosomal region is entered.
Availability: This option is available when a gene, ORF, or chromosomal region is entered.
Availability: This option is available when a gene name, ORF, chromosomal region, raw DNA sequence, or raw protein sequence is entered on the Gene/Sequence Resources page.
Availability: This option is available when a gene name, ORF, chromosomal region, or raw DNA sequence is entered on the Gene/Sequence Resources page.
Availability: This option is available when a gene name, ORF, chromosomal region, or raw DNA sequence is entered on the Gene/Sequence Resources page.
The Decorated FASTA format is generated by GBrowse and shows sequence features by using various color schemes. The following sequence features are highlighted with the associated decorations as described below:
Availability: This option is available when a gene or ORF name or chromosomal region is entered on the Gene/Sequence Resources page.
Availability: This option is available when a gene name or ORF is entered on the Gene/Sequence Resources entry page.
Availability: This option is available when a gene or ORF name is entered on the Gene/Sequence Resources entry page.
Go to Gene/Sequence Resources