This directory contains human/mouse alignments made using the Jun 2003 human assembly (also known as build 34) vs. the Oct 2003 mouse assembly (also known as mm4 or NCBI Mouse Build 32). The axtTight directory contains relatively stringent human/mouse alignments, filtered so that only the best alignment for any given region of the human genome is used.

The alignments are in 'axt' format. Each alignment contains three lines and is separated from the next alignment by a blank line:

  Line 1 - summarizes the alignment.
  Line 2 - contains the human sequence with inserts.
  Line 3 - contains the mouse sequence with inserts.

The summary line contains 9 blank-separated fields with the following meanings:

  1 - Alignment number. The first alignment in a file is numbered 0, the next 1, and so forth.
  2 - Human chromosome.
  3 - Start in human chromosome. The first base is numbered 1.
  4 - End in human chromosome. The end base is included.
  5 - Mouse chromosome.
  6 - Start in mouse.
  7 - End in mouse.
  8 - Mouse strand. If this is '-', the mouse start/end fields are relative to the reverse-complemented mouse chromosome.
  9 - Blastz score.

The scoring matrix blastz uses is:

           A     C     G     T
     A    91  -114   -31  -123
     C  -114   100  -125   -31
     G   -31  -125   100  -114
     T  -123   -31  -114    91

with a gap open penalty of 400 and a gap extension penalty of 30. The minimum score for an alignment to be kept was 3000 for the first pass and 2200 for the second pass. The second pass restricts the search space to the regions between two alignments found in the first pass.

The alignments were done with blastz, which is available from Webb Miller's group at Pennsylvania State University (PSU). Each chromosome was divided into 10,010,000-base chunks with 10,000 bases of overlap. The .lav format blastz output, which does not include the sequence, was converted to .axt with PSU's lavToAxt.
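The axt record layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official parser: the names AxtBlock, read_axt, human_span and the SAMPLE records are hypothetical and made up for illustration, and skipping lines that start with '#' is an assumption that is not stated in this README.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class AxtBlock(NamedTuple):
    """One axt alignment: the 9 summary fields plus the two sequence lines."""
    number: int
    human_chrom: str
    human_start: int   # 1-based
    human_end: int     # inclusive
    mouse_chrom: str
    mouse_start: int
    mouse_end: int
    mouse_strand: str  # '-' means coordinates are on the reverse complement
    score: int
    human_seq: str
    mouse_seq: str


def read_axt(handle: TextIO) -> Iterator[AxtBlock]:
    """Yield alignments from an open text handle, three lines per block."""
    group = []
    for raw in handle:
        line = raw.strip()
        # Skip block separators; skipping '#' lines is an assumption.
        if not line or line.startswith("#"):
            continue
        group.append(line)
        if len(group) == 3:
            fields = group[0].split()
            if len(fields) != 9:
                raise ValueError(f"expected 9 summary fields: {group[0]!r}")
            num, h_chrom, h_start, h_end, m_chrom, m_start, m_end, strand, score = fields
            yield AxtBlock(int(num), h_chrom, int(h_start), int(h_end),
                           m_chrom, int(m_start), int(m_end), strand,
                           int(score), group[1], group[2])
            group = []


def human_span(block: AxtBlock) -> int:
    """Number of human bases covered; the end base is included."""
    return block.human_end - block.human_start + 1


# Made-up records for illustration only; not taken from the real files.
SAMPLE = """\
0 chr1 1000 1009 chr4 5000 5008 + 3400
ACGTACGTAC
ACGTAC-TAC

1 chr1 2000 2004 chr4 7000 7005 - 3500
ACG-TA
ACGGTA
"""

blocks = list(read_axt(io.StringIO(SAMPLE)))
for b in blocks:
    print(b.number, b.human_chrom, human_span(b), b.mouse_strand, b.score)
```

For one of the compressed files listed in this directory, the same function should work on a handle opened in text mode with the standard library, for example gzip.open("chr22.axt.gz", "rt").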
The axtTight alignments were processed with subsetAxt from Jim Kent using the matrix:

           A     C     G     T
     A   100  -200  -100  -200
     C  -200   100  -200  -100
     G  -100  -200   100  -200
     T  -200  -100  -200   100

with a gap open penalty of 2000 and a gap extension penalty of 50. The minimum score was 3400. The axtTight subset covers 6% of the human genome while axtBest covers 40%.
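To make the effect of the stringent matrix concrete, here is a rough sketch of how a pair of aligned sequence lines could be scored with it. This is not the subsetAxt implementation. It rests on several assumptions: a run of n consecutive gap columns costs gap open + n * gap extension, a gap in either sequence continues the same run, and characters outside A, C, G, T score 0. The name score_alignment is hypothetical.

```python
# Matrix and penalties as given for the axtTight subset.
TIGHT_MATRIX = {
    "A": {"A": 100, "C": -200, "G": -100, "T": -200},
    "C": {"A": -200, "C": 100, "G": -200, "T": -100},
    "G": {"A": -100, "C": -200, "G": 100, "T": -200},
    "T": {"A": -200, "C": -100, "G": -200, "T": 100},
}
GAP_OPEN = 2000
GAP_EXTEND = 50
MIN_SCORE = 3400


def score_alignment(human_seq, mouse_seq, matrix=TIGHT_MATRIX,
                    gap_open=GAP_OPEN, gap_extend=GAP_EXTEND):
    """Score two gapped sequence lines of equal length, column by column."""
    if len(human_seq) != len(mouse_seq):
        raise ValueError("sequence lines must have the same length")
    score = 0
    in_gap = False
    for h, m in zip(human_seq.upper(), mouse_seq.upper()):
        if h == "-" or m == "-":
            # Assumed convention: the first gap column pays open + extend,
            # each further column in the same run pays extend only.
            score -= gap_extend if in_gap else gap_open + gap_extend
            in_gap = True
        else:
            in_gap = False
            # Assumption: symbols outside ACGT (such as N) contribute 0.
            score += matrix.get(h, {}).get(m, 0)
    return score


print(score_alignment("ACGT", "ACGT"))      # four matches
print(score_alignment("ACGT", "ACTT"))      # one G/T mismatch
print(score_alignment("AC--GT", "ACTTGT"))  # one gap of length 2
print(score_alignment("ACGT", "ACGT") >= MIN_SCORE)
```

With a gap open penalty of 2000 against a match score of 100, a single gap cancels roughly twenty matching bases, which is consistent with the tight subset covering far less of the genome than axtBest.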
Name                   Last modified      Size  Description

Parent Directory                             -
chrY.axt.gz            2003-11-03 16:51   279K
chrX_random.axt.gz     2003-11-03 16:51    94K
chrX.axt.gz            2003-11-03 16:51   6.5M
chrUn_random.axt.gz    2003-11-03 16:51    79K
chrM.axt.gz            2003-11-03 16:51   8.3K
chr22.axt.gz           2003-11-03 16:49   1.2M
chr21.axt.gz           2003-11-03 16:49   973K
chr20.axt.gz           2003-11-03 16:49   2.4M
chr19.axt.gz           2003-11-03 16:48   1.9M
chr18_random.axt.gz    2003-11-03 16:48    212
chr18.axt.gz           2003-11-03 16:48   2.5M
chr17_random.axt.gz    2003-11-03 16:48    49K
chr17.axt.gz           2003-11-03 16:48   4.1M
chr16.axt.gz           2003-11-03 16:47   3.6M
chr15_random.axt.gz    2003-11-03 16:47    36K
chr15.axt.gz           2003-11-03 16:47   3.7M
chr14.axt.gz           2003-11-03 16:47   3.8M
chr13_random.axt.gz    2003-11-03 16:47    10K
chr13.axt.gz           2003-11-03 16:47   3.2M
chr12.axt.gz           2003-11-03 16:46   4.7M
chr11.axt.gz           2003-11-03 16:46   5.5M
chr10_random.axt.gz    2003-11-03 16:46    29K
chr10.axt.gz           2003-11-03 16:46   5.0M
chr9_random.axt.gz     2003-11-03 16:51    73K
chr9.axt.gz            2003-11-03 16:51   4.6M
chr8_random.axt.gz     2003-11-03 16:51    16K
chr8.axt.gz            2003-11-03 16:51   4.7M
chr7_random.axt.gz     2003-11-03 16:51   7.6K
chr7.axt.gz            2003-11-03 16:51   5.7M
chr6_random.axt.gz     2003-11-03 16:50    36K
chr6.axt.gz            2003-11-03 16:50   6.1M
chr5_random.axt.gz     2003-11-03 16:50   3.2K
chr5.axt.gz            2003-11-03 16:50   7.0M
chr4_random.axt.gz     2003-11-03 16:50    13K
chr4.axt.gz            2003-11-03 16:50   6.0M
chr3_random.axt.gz     2003-11-03 16:49    36K
chr3.axt.gz            2003-11-03 16:49   7.9M
chr2_random.axt.gz     2003-11-03 16:49    11K
chr2.axt.gz            2003-11-03 16:48   9.9M
chr1_random.axt.gz     2003-11-03 16:48   161K
chr1.axt.gz            2003-11-03 16:45   9.6M