Advanced Biomedical Computational Science (ABCS) | non-B DB

General

What is non-B DB?
- non-B DB is a database that stores candidate non-B DNA structures that are systematically identified from the genomic regions of several mammalian species. non-B DB also allows researchers to intersect non-B DNA information with positions of known variation in the genome to assess the possible disruption of these structures as a function of genotypic variation.

What is the goal of non-B DB?
- Several recent publications have provided significant evidence that non-B DNA structures may play a role in DNA instability and mutagenesis, leading to both DNA rearrangements and increased mutation rates, which are hallmarks of cancer. The goal of non-B DB is to accelerate cancer research by helping researchers find therapeutic treatments for several cancer types, including but not limited to breast, ovarian, and prostate cancer and glioblastoma.

What are non-B DNAs?
- DNA exists in many possible conformations that include the A-DNA, B-DNA, and Z-DNA forms; of these, B-DNA is the most common form found in cells. The DNAs that do not fall into a right-handed Watson-Crick double-helix are known as non-B DNAs and comprise cruciform, triplex, slipped (hairpin) structures, tetraplex (G-quadruplex), left-handed Z-DNA, and others.

Which non-B DNA motifs are included in non-B DB?
- Currently, non-B DB provides the most complete list of alternative DNA motif predictions available, including Z-DNA motifs, quadruplex forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine.pyrimidine) tracts.

What are the advantages of non-B DB?
- Because non-B DB contains information on all the main forms of non-B DNA, it is the most comprehensive database of its kind; other existing databases cover only one type of non-B DNA. In addition to its wider scope, non-B DB is built on the most recent genome assembly available (e.g. hg19 (build 37) for human). non-B DB also enables users to perform different types of queries and to visualize the results using GBrowse 2.0, which supports multiple data sources.

What are the search criteria used for locating potential non-B DNA motifs?

We use rather broad, general identification methods based exclusively on sequence features. Because the database is flexible, subsequent filtering of the results is straightforward, but our current criteria are expected to produce both false positives and false negatives.

Input from the community regarding enhanced algorithms for the detection or scoring of identified motifs is most welcome and may be incorporated into the system if appropriate. Please feel free to contact us.

DNA feature	Search criteria	Subset forming non-B DNA	Search criteria for subset
Inverted Repeat	10–100 nt with reverse complement within 100 nt spacer	Cruciform_Motif	Spacer = 0–3 nt
Mirror Repeat	10–100 nt mirrored within 100 nt spacer	Triplex_Motif	90% purine or pyrimidine and 0–8 nt spacer
Direct Repeat	10–50 nt repeated within 5 nt spacer	Slipped_Motif	Spacer = 0 nt
G-Quadruplex Forming Repeats	4 or more G-tracts (3–7 Gs) separated by 1–7 nt spacers; preference for short spacers with Cs and/or Ts	Whole set	As per the whole set
Z-DNA Repeat	G followed by Y (C or T) for at least 10 nt; one strand must be alternating Gs	Whole set	As per the whole set
A-Phased Repeats	3 or more A-tracts (3–5 As) 10 nt on center each; spacers between equal-sized A-tracts must contain some non-As	Whole set	As per the whole set
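
As a rough illustration of how a purely sequence-based criterion can be written down, the G-quadruplex row of the table could be approximated with a regular expression. This is only a sketch and not the algorithm used by non-B DB: the function name `find_gq_candidates` is hypothetical, only the strand given is scanned, any base is allowed in a spacer, and the stated preference for short C/T-rich spacers is ignored.

```python
import re

# 4 or more G-tracts (3-7 Gs each) separated by 1-7 nt spacers.
# Assumption: a spacer may contain any of A, C, G, T.
GQ_PATTERN = re.compile(r"G{3,7}(?:[ACGT]{1,7}G{3,7}){3,}")


def find_gq_candidates(sequence):
    """Return (start, end, matched_sequence) tuples, 1-based inclusive."""
    seq = sequence.upper()
    return [(m.start() + 1, m.end(), m.group()) for m in GQ_PATTERN.finditer(seq)]


if __name__ == "__main__":
    for hit in find_gq_candidates("ttGGGAGGGAGGGAGGGtt"):
        print(hit)
```

Because the pattern is greedy and non-overlapping, it reports one candidate per region rather than every possible folding, so it will not reproduce counts of alternative structures.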

Are there any examples of how to use non-B DB?
- Yes, please visit our HELP section, where you will find some examples.

What are the acronyms used in non-B DB?

Acronym	Description
A	Adenine
ABCC	Advanced Biomedical Computing Center
APR	A-Phased Repeat
C	Cytosine
CMP	composition
DAS	Distributed Annotation System
DR	Direct Repeat
G	Guanine
GFF	General Feature Format
GQFR	G-Quadruplex Forming Repeat
IR	Inverted Repeat
MB	Megabyte
MR	Mirror Repeat
nBMST	non-B DNA Motif Search Tool
R	Purine
rsrd	right strand right direction (in reference to DNA strand orientation)
STR	Short Tandem Repeat
T	Thymine
wswd	wrong strand wrong direction (in reference to DNA strand orientation)
Y	Pyrimidine
ZDM	Z-DNA Motif

How can I register for the non-B DB resource?
- Please visit the registration page.

Genomic Database Search Tools

What is the difference between Search by Feature and Search by Location?
- Search by Feature allows you to search the non-B DB by specifying one or more genomic features.
  Search by Location allows you to get location-specific annotations. Enter chromosome, start, and stop information and all the genomic features found within the region will be returned in a tabular format.
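
  Conceptually, a location search is an interval filter over annotated features. The sketch below illustrates that idea only; it is not the site's implementation or API. The `Feature` type and `features_in_region` function are hypothetical, coordinates are assumed to be 1-based and inclusive, and "within the region" is assumed to mean "overlapping the region".

```python
from typing import NamedTuple


class Feature(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumption)
    stop: int   # 1-based, inclusive (assumption)
    kind: str


def features_in_region(features, chrom, start, stop):
    """Return features on `chrom` that overlap the closed interval [start, stop]."""
    return [
        f for f in features
        if f.chrom == chrom and f.start <= stop and f.stop >= start
    ]


if __name__ == "__main__":
    # Toy records for illustration only; they are not taken from the database.
    toy = [
        Feature("chr8", 1603, 1630, "G-Quadruplex_Forming_Repeat"),
        Feature("chr8", 5000, 5020, "Direct_Repeat"),
    ]
    for f in features_in_region(toy, "chr8", 1600, 1700):
        print(f)
```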

What is GFF format?
- GFF stands for General Feature Format and is used for describing genes and other localized features associated with DNA, RNA, and protein sequences.

What is DAS, which I keep seeing in database names such as nonbDAS?

DAS stands for Distributed Annotation System. The four DAS sources available on non-B DB are as follows:

DAS	Description
abccDAS	This database includes features computed at the ABCC including PuPys, STRs, composition, physical DNA characteristics, gene based synteny blocks (GBSB), etc.
mappingDAS	This database includes features remapped to the reference genome using gmap from other data sources such as RefSeq, Ensembl, MGC, UniGene, miRBase, etc.
ncbiDAS	This database includes features derived from the NCBI genomes directories including genes, SNPs, cytogenetic markers, assembly information, RepeatMasker elements, etc.
nonbDAS	This database includes alternative DNA structure predictions, including Z-DNA motifs, G-quadruplex forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively.

Can I download the results?
- Yes. On your result page, click on "Tab-Delimited File" or "GFF File". This opens a page that you can save and then open in a text editor or Microsoft Excel.

What are non-B feature attributes?
- A Phased Repeat
  - Composition (Equals, Not Equal To)
  - Sequence (Equals, Not Equal To)
  - Tracts (Equals, Not Equal To, Greater Than, Less Than)
- Direct Repeat
  - Spacer (Equals, Not Equal To, Greater Than, Less Than)
  - Repeat (Equals, Not Equal To, Greater Than, Less Than)
  - Composition (Equals, Not Equal To)
  - Sequence (Equals, Not Equal To)
  - Subset (Equals, Not Equal To)
- G Quadruplex Motif
  - Composition (Equals, Not Equal To)
  - Sequence (Equals, Not Equal To)
  - Islands (Equals, Not Equal To, Greater Than, Less Than)
  - Runs (Equals, Not Equal To, Greater Than, Less Than)
  - Max (Equals, Not Equal To, Greater Than, Less Than)
- Inverted Repeat
  - Spacer (Equals, Not Equal To, Greater Than, Less Than)
  - Repeat (Equals, Not Equal To, Greater Than, Less Than)
  - Perms (Equals, Not Equal To, Greater Than, Less Than)
  - Minloop (Not relevant, should be removed as an option)
  - Composition (Equals, Not Equal To)
  - Sequence (Equals, Not Equal To)
  - Subset (Equals, Not Equal To)
- Mirror Repeat
  - Spacer (Equals, Not Equal To, Greater Than, Less Than)
  - Repeat (Equals, Not Equal To, Greater Than, Less Than)
  - Perms (Equals, Not Equal To, Greater Than, Less Than)
  - Minloop (Not relevant, should be removed as an option)
  - Composition (Equals, Not Equal To)
  - Sequence (Equals, Not Equal To)
  - Subset (Equals, Not Equal To)
- Short Tandem Repeat
  - Composition (Equals, Not Equal To)
  - Sequence (Equals, Not Equal To)
  - Length (Equals, Not Equal To, Greater Than, Less Than)
  - Type (Equals, Not Equal To, Greater Than, Less Than)
- Z DNA Motif
  - Composition (Equals, Not Equal To)
  - Sequence (Equals, Not Equal To)
  - Subset (being removed from current version)
  - Length (Equals, Not Equal To, Greater Than, Less Than)
  - Score (being removed from current version)

Miscellaneous

What about questions that are not answered in this FAQ section?
- There are two ways to contact us:
  
  (1) At the top of this FAQ page, click on Submit a Question.
  
  (2) On the side menu, click on Contact us.

non-B Motif Search Tool (nBMST)

Is there sample data to try out nBMST?
- Yes. On the nBMST page, there are two example inputs for demonstration purposes: a single FASTA sequence and a set of multiple FASTA sequences.

How can I download all the results in just one click?
- If you selected more than one non-B motif, the result page shows a clickable image labeled "Download all files for this job". This comprehensive download includes all the GFF files, tab-delimited files, and PNG images for all the non-B motifs you selected.

Where can I find the actual sequences of the non-B DNA motifs found in my sequence?
- In the GFF output file, the last column contains the actual sequence information.
  
  For example,
  
  chr8 ABCC G-Quadruplex_Forming_Repeat 1603 1630 . + . ID=chr8_1603_1630_GQFR_rsrd;Length=28;BestStruct=G4-N7-G4-N2-G4-N2-G4(CGGTACT/GT/AC);NbrStructs=8;Sequence=GGGGCGGTACTGGGGGTGGGGACGGGGG;BestScore=16;
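
  To pull the motif sequence out of such a line programmatically, the last column can be split into key=value pairs. The following is a minimal sketch; `parse_gff_line` is a hypothetical helper, not part of nBMST. GFF files are tab-delimited, but the sketch splits on any whitespace so that it also accepts the line as displayed above, which assumes that no field contains spaces.

```python
def parse_gff_line(line):
    """Parse one GFF line into a dict; column 9 becomes an `attributes` dict."""
    # Split into at most 9 columns on any whitespace (tabs in a real GFF file).
    fields = line.strip().split(None, 8)
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    attributes = {}
    for pair in fields[8].split(";"):
        if pair.strip():  # skip the empty piece after a trailing ';'
            key, _, value = pair.partition("=")
            attributes[key.strip()] = value.strip()
    return {
        "seqid": fields[0],
        "source": fields[1],
        "type": fields[2],
        "start": int(fields[3]),
        "end": int(fields[4]),
        "strand": fields[6],
        "attributes": attributes,
    }


if __name__ == "__main__":
    example = (
        "chr8 ABCC G-Quadruplex_Forming_Repeat 1603 1630 . + . "
        "ID=chr8_1603_1630_GQFR_rsrd;Length=28;"
        "BestStruct=G4-N7-G4-N2-G4-N2-G4(CGGTACT/GT/AC);NbrStructs=8;"
        "Sequence=GGGGCGGTACTGGGGGTGGGGACGGGGG;BestScore=16;"
    )
    record = parse_gff_line(example)
    print(record["attributes"]["Sequence"])
```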

Can I specify the repeat length, the spacer (loop) length, and the inclusion of mismatches in nBMST?
- In this release (nBMST v1.0), these parameters are not yet implemented. However, these features will be available in the next release.

Can I submit a batch of sequences on nBMST?
- Yes, you can submit a batch of sequences. Each sequence must be preceded by a header line that begins with ">".
  Example:
  >sequence 1
  actgggg
  >sequence 2
  aaaaaaaaagggggggggcccccccccggg
  >sequence 3
  ttgggggcccgggg
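
  Before submitting, it can be useful to check that a batch splits into the records you expect. The sketch below shows one way to do that locally; `split_fasta` is a hypothetical helper and does not reflect how nBMST itself parses input.

```python
def split_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    batch = """>sequence 1
actgggg
>sequence 2
aaaaaaaaagggggggggcccccccccggg
>sequence 3
ttgggggcccgggg
"""
    for name, seq in split_fasta(batch):
        print(name, len(seq))
```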

Why do I see two types of motifs in one selection? For instance, mirror repeats and triplex motifs. Why can't I select only one type of motif (i.e. only mirror repeats)?
- Our computational algorithms search for patterns matching a certain type of non-B DNA and its subset. Therefore, when you search for mirror repeats, you will also be given the subset of triplex forming motifs. The same applies for inverted repeats and the subset of cruciform motifs, and direct repeats and the subset of slipped motifs.
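
  The relationship between a motif type and its subset can be sketched as a simple rule applied to each hit, using the subset criteria listed in the search criteria table in the General section. This is an illustration under stated assumptions, not the actual nBMST code: the function names are hypothetical, and "90% purine or pyrimidine" is interpreted here as at least 90% of the repeat's bases on one strand being purines (A/G) or at least 90% being pyrimidines (C/T).

```python
def is_mostly_purine_or_pyrimidine(seq, threshold_pct=90):
    """True if at least threshold_pct percent of bases are A/G, or are C/T."""
    seq = seq.upper()
    purines = sum(base in "AG" for base in seq)
    pyrimidines = sum(base in "CT" for base in seq)
    # Integer arithmetic avoids floating-point edge cases at exactly 90%.
    return len(seq) > 0 and max(purines, pyrimidines) * 100 >= threshold_pct * len(seq)


def subset_label(feature, spacer, repeat_seq=""):
    """Return the subset name for a hit, or None if it is only in the parent set."""
    if feature == "Inverted Repeat" and 0 <= spacer <= 3:
        return "Cruciform_Motif"
    if feature == "Direct Repeat" and spacer == 0:
        return "Slipped_Motif"
    if (feature == "Mirror Repeat" and 0 <= spacer <= 8
            and is_mostly_purine_or_pyrimidine(repeat_seq)):
        return "Triplex_Motif"
    return None


if __name__ == "__main__":
    print(subset_label("Mirror Repeat", 5, "AGAGAGAGAG"))
    print(subset_label("Mirror Repeat", 5, "ACGTACGTAC"))
```

  Every hit belongs to the parent set regardless of the label, which is why a search for mirror repeats returns both kinds of result.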

Why do I need to enter CAPTCHA information?
- This is to reduce and potentially prevent spambots from overwhelming our system. To avoid having to enter CAPTCHA information, please register and log in before submitting a job.

How will I know if no motifs are found in the DNA sequence I submitted?
- The output file will be empty, and the result page will state this explicitly.

Is there a size limit for files I upload?
- Yes, we allow up to 20 MB of data to be uploaded. If you need to analyze a larger file, please feel free to contact us. We will be happy to assist you.

How long will my results be stored on the server?
- The results will be stored on our server for 3 days for non-registered users and 6 months for registered users.

How long will my job take?
- Turnaround time for results will vary depending on the size of the sequence(s), the number and types of the non-B DNA motifs selected, and the computing resources available at the ABCC at the time of submission.
  
  In cases where input sequences are very large and/or multiple motifs are selected, an email address is recommended to avoid waiting on-line. A notification email will be sent when the job is completed.
  
  It should be noted that the algorithms for mirror and inverted repeats are the most computationally intensive, and therefore take more time to complete than the rest of the motifs. Thus, in cases where quick results are desired for large sequences, it is recommended that separate jobs be submitted for these two motif types.

Motifs Visualization

What is Motifs Visualization?
- It allows users to view non-B DNA motifs at the whole-genome level in Circos plots, and at the level of individual or multiple chromosomes in R plots.

non-B DB v2.0