The 5X (8.1 Gb) directory contains the original data used to generate 
the Regulatory Potential Track.  Each file is a two-column file: the first 
column is the offset (one-based) on the chromosome and the second column 
is the data value at that offset.
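
As an illustration of the file layout described above, here is a minimal
Python sketch for reading such a file. It assumes the two columns are
separated by whitespace, which this README does not state explicitly. The
function names and the sample lines are hypothetical and are not taken from
the actual data files.

```python
from typing import Iterable, Iterator, Tuple


def read_rp_scores(lines: Iterable[str]) -> Iterator[Tuple[int, float]]:
    """Yield (offset, score) pairs from two-column text lines.

    Assumption: columns are whitespace-separated. Offsets are kept
    one-based, exactly as stored in the file.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        yield int(fields[0]), float(fields[1])


def to_zero_based(offset: int) -> int:
    """Convert a one-based chromosome offset to a zero-based index."""
    return offset - 1


# Hypothetical sample lines, for illustration only.
sample = ["1001\t0.0042", "1002\t-0.0013", ""]
parsed = list(read_rp_scores(sample))
```

To read a real file, pass an open file handle to read_rp_scores in place of
the sample list.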

==========================================================================
Description:

This track displays regulatory potential (RP) score, computed from alignments
of human (hg17), chimpanzee (panTro1), mouse (mm5), rat (rn3), and dog
(canFam1).

RP scores compare frequencies of short alignment patterns between known
regulatory elements and neutral DNA. Results from a calibration study
investigating the sensitivity and specificity of RP scores on the hemoglobin
beta gene cluster suggest a threshold of ~0.0 for identifying new putative
regulatory elements.

The default viewing range for this track is 0 to 0.01. Score values below the
lower bound of 0 indicate resemblance to alignment patterns typical of neutral
DNA, while score values above the upper bound of 0.01 indicate very marked
resemblance to alignment patterns typical of regulatory elements in the
training set. The range of RP scores from 0 to 0.01 contains the prediction
threshold suggested by calibration studies, and provides an effective
visualization of the score for most genomic loci. However, the user can
specify different viewing ranges if desired. Note: Absence of a score value at
a given location indicates lack of sufficient alignment -- scores are computed
for all regions of the human genome in which no region of more than 100 bases
lacks alignment in two or more non-human species.
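
The threshold and the default viewing range described above can be sketched
in Python as follows. This is only an illustration: the function names are
hypothetical, the choice of a strict "greater than" comparison at the ~0.0
threshold is an assumption, and clamping is used here merely to show which
values fall outside the default range, not to describe how the browser draws
them.

```python
def is_putative_regulatory(score: float, threshold: float = 0.0) -> bool:
    """Return True if the score exceeds the suggested ~0.0 threshold.

    Assumption: the comparison is strict (score > threshold).
    """
    return score > threshold


def clamp_to_view(score: float, lo: float = 0.0, hi: float = 0.01) -> float:
    """Limit a score to the default viewing range of 0 to 0.01."""
    return max(lo, min(hi, score))
```

For example, a score of -0.3 falls below the range and maps to 0, while a
score of 0.05 falls above it and maps to 0.01.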

This track may be configured in a variety of ways to highlight different
aspects of the displayed information. Click the "Graph configuration help"
link for an explanation of the configuration options.

==========================================================================
Methods:

The comparison employs log-ratios of transition probabilities from two
variable-order Markov models. Training the score entails selecting an
appropriate alphabet (alignment column symbols) and maximal order (length of
the longest patterns = order + 1) for the Markov models, and estimating their
transition probabilities, based on alignment data from known regulatory
elements and ancestral repeats. The scores in this track are computed using a
6-symbol alphabet and a maximal order of 2.
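
To make the idea of a log-ratio of transition probabilities concrete, here is
a simplified Python sketch. It is not the method used to compute this track:
it uses fixed-order (order 2) Markov models rather than variable-order ones,
the six symbols "ABCDEF" are placeholders for the real alignment column
alphabet (which is not listed here), add-one pseudocounts and natural
logarithms are assumptions, and the training strings are toy data.

```python
import math
from collections import defaultdict
from typing import Callable, Iterable, List

ALPHABET = "ABCDEF"  # hypothetical stand-in for the 6-symbol alphabet
ORDER = 2            # maximal order stated in the text

TransitionProb = Callable[[str, str], float]


def train(seqs: Iterable[str], alphabet: str = ALPHABET,
          order: int = ORDER, pseudo: float = 1.0) -> TransitionProb:
    """Estimate P(symbol | previous `order` symbols) with pseudocounts."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1.0

    def prob(context: str, symbol: str) -> float:
        c = counts.get(context, {})
        total = sum(c.values()) + pseudo * len(alphabet)
        return (c.get(symbol, 0.0) + pseudo) / total

    return prob


def log_ratios(seq: str, p_reg: TransitionProb, p_neu: TransitionProb,
               order: int = ORDER) -> List[float]:
    """Per-position log-ratio of regulatory vs. neutral transition probability."""
    return [
        math.log(p_reg(seq[i - order:i], seq[i]) / p_neu(seq[i - order:i], seq[i]))
        for i in range(order, len(seq))
    ]


# Toy training data, for illustration only.
p_reg = train(["ABABAB"])   # stands in for known regulatory elements
p_neu = train(["ABCABC"])   # stands in for ancestral repeats
ratios = log_ratios("ABA", p_reg, p_neu)
```

A positive log-ratio means the pattern is more probable under the regulatory
model than under the neutral model; a negative value means the opposite.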

In the track, score values are displayed using a system of overlapping windows
of size 100 bp along sufficiently alignable portions of the human sequence.
Log-ratios are added over positions in a window, and the sum is normalized for
length. 
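
The windowing step described above can be sketched in Python as follows. The
window size of 100 comes from the text; the step between consecutive windows
is not stated here, so the default of 1 is an assumption, as is normalizing
by dividing the sum by the number of positions in the window. The function
name is hypothetical.

```python
from typing import List, Sequence


def window_scores(log_ratios: Sequence[float], window: int = 100,
                  step: int = 1) -> List[float]:
    """Sum log-ratios over each overlapping window and divide by its length.

    Assumption: `step` (the offset between window starts) is illustrative.
    Sequences shorter than one window yield no scores.
    """
    scores = []
    for start in range(0, len(log_ratios) - window + 1, step):
        chunk = log_ratios[start:start + window]
        scores.append(sum(chunk) / len(chunk))
    return scores
```

With a small window for readability, window_scores([1.0, 2.0, 3.0, 4.0, 5.0],
window=4) averages positions 1-4 and then positions 2-5.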

==========================================================================
Credits:

Work on RP scores is performed by members of the Comparative Genomics and
Bioinformatics Center at Penn State University. More information on this
research and the collection of known regulatory elements used in training the
score can be found at this site.

==========================================================================
References:

Blanchette, M., Kent, W.J., Riemer, C., Elnitski, L., Smit, A.F.A., 
Roskin, K.M., Baertsch, R., Rosenbloom, K., Clawson, H., Green, E.D., 
Haussler, D. and Miller, W. Aligning multiple genomic sequences with the 
threaded blockset aligner. Genome Res. 14(4), 708-715 (2004).

King, D.C., Taylor, J., Elnitski, L., Chiaromonte, F., Miller, W. and
Hardison, R.C.  Evaluation of regulatory potential and conservation scores 
for detecting cis-regulatory modules in aligned mammalian genome sequences. 
Genome Res. 15(8), 1051-1060 (2005).

Kolbe, D., Taylor, J., Elnitski, L., Eswara, P., Li, J., Miller, W., 
Hardison, R.C. and Chiaromonte, F. Regulatory potential scores from 
genome-wide three-way alignments of human, mouse, and rat. 
Genome Res. 14(4), 700-707 (2004). 