This directory contains human/chimpanzee "reciprocal best" alignments
made using the Jul. 2003 human assembly (NCBI Build 34, UCSC hg16)
and the 13 Nov. 2003 Arachne 4x draft chimp assembly from
the Broad Institute, MIT/Harvard, with sequence provided by
the Broad Institute and Washington University, St. Louis.
The Chimpanzee Genome Sequencing project was sponsored by
NHGRI.
The alignments are in 'axt' format. Each alignment
contains three lines and is separated from the next
alignment by a blank line:
Line 1 - summarizes the alignment.
Line 2 - contains the human sequence with inserts.
Line 3 - contains the chimp sequence with inserts.
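
The three-line layout above can be read with a short loop. The following
is a minimal Python sketch, not part of the original pipeline; the
function name read_axt_blocks and the SAMPLE records are hypothetical
and exist only for illustration.

```python
import io

# Hypothetical sample data in the three-line layout described above.
SAMPLE = """0 chr1 101 110 scaffold_1 201 209 + 950
ACGTACGTAC
ACGTAC-TAC

1 chr1 501 505 scaffold_2 11 15 - 480
GGCAT
GGCAT
"""

def read_axt_blocks(handle):
    """Yield (summary, human_seq, chimp_seq) tuples from an axt text stream."""
    lines = []
    for raw in handle:
        line = raw.rstrip("\n")
        if not line.strip():
            # Blank lines only separate alignments; skip them.
            continue
        lines.append(line)
        if len(lines) == 3:
            yield tuple(lines)
            lines = []
    if lines:
        raise ValueError("truncated axt block at end of input")

blocks = list(read_axt_blocks(io.StringIO(SAMPLE)))
for summary, human_seq, chimp_seq in blocks:
    print(summary)
    print("  human:", human_seq)
    print("  chimp:", chimp_seq)
```

For the compressed files in this directory, the same function should work
on a handle opened with gzip.open(path, "rt") from the Python standard
library.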
The summary line contains 9 blank-separated fields with the
following meanings:
1 - Alignment number. The first alignment in a file
is numbered 0, the next 1, and so forth.
2 - Human chromosome.
3 - Start in human chromosome. The first base is
numbered 1.
4 - End in human chromosome. The end base is included.
5 - Chimp scaffold.
6 - Start in chimp scaffold.
7 - End in chimp scaffold.
8 - Chimp strand. If this is '-', the chimp start/end fields are
relative to the reverse-complemented chimp scaffold.
9 - Blastz score. The scoring matrix blastz uses is:
          A     C     G     T
    A    91  -114   -31  -123
    C  -114   100  -125   -31
    G   -31  -125   100  -114
    T  -123   -31  -114    91
with a gap open penalty of 400 and a gap extension
penalty of 30. The minimum score for an alignment
to be kept was 3000 for the first pass, and then
2200 for the second pass, which just restricts
the search space to the regions between two alignments
found in the first pass.
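
The field list and scoring parameters above can be sketched in Python as
follows. This is an illustration under stated assumptions, not the blastz
implementation: the names AxtSummary, parse_summary, chimp_forward_coords
and score_alignment are hypothetical; chimp coordinates are assumed to be
1-based and inclusive like the human ones; a gap of length k is assumed
to cost 400 + 30*k; and any column containing a character other than
A, C, G, T or '-' is assumed to score 0.

```python
from typing import NamedTuple

class AxtSummary(NamedTuple):
    number: int          # field 1, first alignment is 0
    human_chrom: str     # field 2
    human_start: int     # field 3, first base is 1
    human_end: int       # field 4, inclusive
    chimp_scaffold: str  # field 5
    chimp_start: int     # field 6
    chimp_end: int       # field 7
    chimp_strand: str    # field 8, '+' or '-'
    score: int           # field 9

def parse_summary(line):
    """Split a summary line into its 9 blank-separated fields."""
    f = line.split()
    if len(f) != 9:
        raise ValueError("expected 9 fields, got %d" % len(f))
    return AxtSummary(int(f[0]), f[1], int(f[2]), int(f[3]),
                      f[4], int(f[5]), int(f[6]), f[7], int(f[8]))

def chimp_forward_coords(summary, scaffold_size):
    """Map chimp start/end to the forward strand.

    Assumes 1-based inclusive coordinates; scaffold_size is not stored
    in the axt file and must be supplied by the caller.
    """
    if summary.chimp_strand == "+":
        return summary.chimp_start, summary.chimp_end
    return (scaffold_size - summary.chimp_end + 1,
            scaffold_size - summary.chimp_start + 1)

BASES = "ACGT"
MATRIX = [
    [91, -114, -31, -123],
    [-114, 100, -125, -31],
    [-31, -125, 100, -114],
    [-123, -31, -114, 91],
]
GAP_OPEN = 400
GAP_EXTEND = 30

def score_alignment(human, chimp):
    """Score two aligned rows with the matrix and gap penalties above."""
    if len(human) != len(chimp):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_side = None  # which row currently holds a run of '-'
    for h, c in zip(human.upper(), chimp.upper()):
        if h == "-" or c == "-":
            side = "human" if h == "-" else "chimp"
            if side != gap_side:
                score -= GAP_OPEN      # charged once per gap
                gap_side = side
            score -= GAP_EXTEND        # charged for every gap column
        else:
            gap_side = None
            if h in BASES and c in BASES:
                score += MATRIX[BASES.index(h)][BASES.index(c)]
            # Assumption: other characters (such as N) contribute 0.
    return score

example = parse_summary("1 chr1 501 505 scaffold_2 11 15 - 480")
print(example.human_chrom, example.human_start, example.human_end)
print(chimp_forward_coords(example, scaffold_size=100))
print(score_alignment("ACGTA", "AC-TA"))
```

A score recomputed this way may differ from field 9 wherever the
assumptions above do not match how blastz treated the same columns.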
The alignments were done with blastz, which is available
from Webb Miller's group at Pennsylvania State University (PSU).
Each chromosome was divided into 10,010,000-base chunks with 10,000
bases of overlap. The .lav format blastz output, which does not
include the sequence, was converted to .axt with lavToAxt.
The axtBest subset covers 40% of the human genome.
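
The chunking scheme described above (10,010,000-base chunks with 10,000
bases of overlap, so consecutive chunks start 10,000,000 bases apart) can
be sketched in Python. The function name chunk_ranges and the use of
0-based, half-open ranges are assumptions made for illustration; the
original pipeline may have handled chromosome ends differently.

```python
CHUNK_SIZE = 10_010_000
OVERLAP = 10_000

def chunk_ranges(chrom_length, chunk_size=CHUNK_SIZE, overlap=OVERLAP):
    """Yield 0-based, half-open (start, end) chunks with the given overlap."""
    if chrom_length <= 0:
        return
    step = chunk_size - overlap  # distance between consecutive chunk starts
    start = 0
    while True:
        end = min(start + chunk_size, chrom_length)
        yield start, end
        if end == chrom_length:
            break
        start += step

# Hypothetical 25,000,000-base chromosome.
chunks = list(chunk_ranges(25_000_000))
for start, end in chunks:
    print(start, end)
```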
Name                   Last modified      Size
chr1.axt.gz            2003-12-19 13:37   87M
chr1_random.axt.gz     2003-12-19 13:39   259K
chr2.axt.gz            2003-12-19 13:39   96M
chr2_random.axt.gz     2003-12-19 13:40   48K
chr3.axt.gz            2003-12-19 13:40   80M
chr3_random.axt.gz     2003-12-19 13:40   4.2K
chr4.axt.gz            2003-12-19 13:40   76M
chr4_random.axt.gz     2003-12-19 13:40   41K
chr5.axt.gz            2003-12-19 13:41   72M
chr5_random.axt.gz     2003-12-19 13:41   356
chr6.axt.gz            2003-12-19 13:41   68M
chr6_random.axt.gz     2003-12-19 13:41   131K
chr7.axt.gz            2003-12-19 13:41   60M
chr7_random.axt.gz     2003-12-19 13:41   70K
chr8.axt.gz            2003-12-19 13:41   58M
chr8_random.axt.gz     2003-12-19 13:41   81K
chr9.axt.gz            2003-12-19 13:42   45M
chr9_random.axt.gz     2003-12-19 13:42   261K
chr10.axt.gz           2003-12-19 13:37   52M
chr10_random.axt.gz    2003-12-19 13:37   180K
chr11.axt.gz           2003-12-19 13:38   52M
chr12.axt.gz           2003-12-19 13:38   51M
chr13.axt.gz           2003-12-19 13:38   39M
chr13_random.axt.gz    2003-12-19 13:38   4.1K
chr14.axt.gz           2003-12-19 13:38   35M
chr15.axt.gz           2003-12-19 13:38   32M
chr15_random.axt.gz    2003-12-19 13:38   40K
chr16.axt.gz           2003-12-19 13:38   30M
chr17.axt.gz           2003-12-19 13:39   29M
chr17_random.axt.gz    2003-12-19 13:39   146K
chr18.axt.gz           2003-12-19 13:39   31M
chr18_random.axt.gz    2003-12-19 13:39   685
chr19.axt.gz           2003-12-19 13:39   19M
chr19_random.axt.gz    2003-12-19 13:39   5.6K
chr20.axt.gz           2003-12-19 13:39   24M
chr21.axt.gz           2003-12-19 13:40   13M
chr22.axt.gz           2003-12-19 13:40   12M
chrM.axt.gz            2003-12-19 13:42   3.8K
chrUn_random.axt.gz    2003-12-19 13:42   519K
chrX.axt.gz            2003-12-19 13:42   38M
chrX_random.axt.gz     2003-12-19 13:42   117K
chrY.axt.gz            2003-12-19 13:42   3.9M
md5sum.txt             2003-12-19 19:20   2.0K