The 2X (2.2 Gb) and 3X (1.9 Gb) directories contain the original
data used to generate the Regulatory Potential Tracks.  Each file
has two columns: the first is the one-based offset on the
chromosome and the second is the data value at that offset.
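
As an illustration of the file layout described above, here is a minimal
Python sketch that reads offset/value pairs. It assumes the two columns
are separated by whitespace (the delimiter is not stated in this README);
the function name read_scores and the sample values are made up for the
example.

```python
import io

def read_scores(handle):
    """Yield (offset, value) pairs from a two-column score file.

    Assumes whitespace-separated columns; offsets are one-based.
    """
    for line in handle:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        yield int(fields[0]), float(fields[1])

# Made-up sample data, not taken from the actual files.
sample = io.StringIO("1001\t0.0123\n1002\t-0.0045\n")
scores = list(read_scores(sample))

# Convert one-based offsets to zero-based positions if a tool needs them.
zero_based = [(offset - 1, value) for offset, value in scores]
```

For a real file, pass an open file handle instead of the in-memory
sample.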

==========================================================================
Description:

The 3X Regulatory Potential Track displays the 3-way regulatory potential 
(RP) score, computed from alignments of human (hg16, Jul. 2003), 
mouse (mm3, Feb. 2003) and rat (rn3, Jun. 2003). 

The 2X track displays the 2-way regulatory potential (RP) score computed
from alignments of human (hg16, Jul. 2003) and mouse (mm4, Oct. 2003).

RP scores compare frequencies of short alignment patterns between 
regulatory elements and neutral DNA.

Score values at or below 0 indicate resemblance to alignment patterns
typical of neutral DNA. Score values at or above 0.1 indicate marked
resemblance to alignment patterns typical of regulatory elements.
Absence of a score value at a given location indicates lack of
alignment at that location (3-way for the 3X track, 2-way for the 2X
track).

Preliminary results from a calibration study investigating sensitivity
and specificity of the 3-way RP score on the hemoglobin beta gene
cluster suggest a threshold of approximately 0.0002 for identifying
new putative regulatory elements.
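
One way to apply such a threshold to the score files is sketched below
in Python. It groups consecutive offsets whose score is at or above the
threshold into intervals. The function name runs_above, the use of
"at or above" as the comparison, and the sample values are assumptions
made for the example, not part of the published method.

```python
def runs_above(scores, threshold):
    """Return (start, end) runs of consecutive one-based offsets
    whose score is at or above the threshold (ends inclusive).

    A gap in the offsets (no score, hence no alignment) ends a run.
    """
    runs = []
    start = prev = None
    for offset, value in scores:
        if value >= threshold:
            if start is None:
                start = offset
            elif offset != prev + 1:
                runs.append((start, prev))
                start = offset
            prev = offset
        else:
            if start is not None:
                runs.append((start, prev))
            start = prev = None
    if start is not None:
        runs.append((start, prev))
    return runs

# Made-up (offset, score) pairs, sorted by offset.
example = [(10, 0.001), (11, 0.002), (12, -0.1), (20, 0.0003), (22, 0.0005)]
candidates = runs_above(example, threshold=0.0002)
```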

==========================================================================
Methods:

The comparison employs log-ratios of transition probabilities from two
Markov models. Training the score entails selecting an appropriate
alphabet (alignment column symbols) and order (pattern length =
order + 1) for the Markov models, then estimating their transition
probabilities from alignment data for known regulatory elements and
for ancestral repeats (the latter serving as the model of neutral DNA).
The 3-way RP score uses a 10-symbol alphabet and order 2.
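
The per-position log-ratio can be sketched in Python as follows. This is
a toy illustration, not the trained models: it uses a 2-symbol alphabet
instead of the 10-symbol one, invented transition probabilities, and the
natural logarithm (the log base is not stated here). The regulatory
model is placed in the numerator, in keeping with positive scores
indicating resemblance to regulatory elements. All names are
hypothetical.

```python
import math
from itertools import product

ORDER = 2  # patterns of length ORDER + 1 = 3

def log_ratios(symbols, p_reg, p_neu, order=ORDER):
    """Return log(p_reg / p_neu) of the transition into each position,
    given the preceding `order` symbols as context."""
    out = []
    for i in range(order, len(symbols)):
        context = tuple(symbols[i - order:i])
        sym = symbols[i]
        out.append(math.log(p_reg[context][sym] / p_neu[context][sym]))
    return out

# Toy transition tables over the alphabet {"A", "B"} (invented numbers).
contexts = list(product("AB", repeat=ORDER))
p_reg = {ctx: {"A": 0.8, "B": 0.2} for ctx in contexts}
p_neu = {ctx: {"A": 0.5, "B": 0.5} for ctx in contexts}

ratios = log_ratios("AABA", p_reg, p_neu)
```

The first ORDER positions have no full context, so the sketch produces
len(symbols) - ORDER values.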

In the track, score values are displayed using a system of overlapping
windows of size 100 bp along aligned portions of the human sequence.
Log-ratios are added over positions in a window and the sum is normalized 
for length.  
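
The windowing step can be sketched in Python as below. The window size
of 100 comes from the text; the step between window starts is not given
here, so the default of 50 is an assumption, as is reading "normalized
for length" as division by the number of positions in the window. The
function name window_scores is hypothetical.

```python
def window_scores(log_ratios, window=100, step=50):
    """Return (start_index, score) for overlapping windows, where the
    score is the sum of log-ratios in the window divided by its length."""
    scores = []
    last_start = max(len(log_ratios) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = log_ratios[start:start + window]
        if chunk:
            scores.append((start, sum(chunk) / len(chunk)))
    return scores

# Small made-up input with a short window so the result is easy to check.
toy = [1.0] * 4 + [3.0] * 4
toy_scores = window_scores(toy, window=4, step=2)
```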

==========================================================================
Credits:

Work on RP scores is performed by members of the Comparative Genomics 
and Bioinformatics Center at Penn State University (http://www.bx.psu.edu/). 
More information on this research and the collection of known regulatory 
elements used in training the score can be found at this site. 

Mouse and rat sequence data were provided by the Mouse and Rat Genome
Sequencing Consortia. The alignment data were created in collaboration with 
the UCSC Genome Bioinformatics group.

==========================================================================
References:

Blanchette M., Kent W.J., Riemer C., Elnitski L., Smit A.F.A., Roskin K.M.,
Baertsch R., Rosenbloom K., Clawson H., Green E.D., Haussler D. and
Miller W. (2004) Aligning Multiple Genomic Sequences with the Threaded
Blockset Aligner. Genome Res. 14(4):708-715.

Kolbe D., Taylor J., Elnitski L., Eswara P., Li J., Miller W., Hardison R.C.
and Chiaromonte F. (2004) Regulatory potential scores from genome-wide 3-way
alignments of human, mouse and rat. Genome Res. 14(4):700-707. 