This file is from:
http://hgdownload.cse.ucsc.edu/goldenPath/panTro3/multiz12way/README.txt
This directory contains compressed multiple alignments of the
following assemblies to the chimp genome (panTro3, Oct. 2010):
Assemblies used in these alignments:

  - Chimp      Pan troglodytes         Oct. 2010   panTro3
  - Human      Homo sapiens            Feb. 2009   hg19/GRCh37
  - Orangutan  Pongo pygmaeus abelii   July 2007   ponAbe2
  - Rhesus     Macaca mulatta          Jan. 2006   rheMac2
  - Marmoset   Callithrix jacchus      Mar. 2009   calJac3
  - Mouse      Mus musculus            July 2007   mm9
  - Rat        Rattus norvegicus       Nov. 2004   rn4
  - Horse      Equus caballus          Sep. 2007   equCab2
  - Dog        Canis lupus familiaris  May. 2005   canFam2
  - Opossum    Monodelphis domestica   Oct. 2006   monDom5
  - Chicken    Gallus gallus           May. 2006   galGal3
  - Zebrafish  Danio rerio             Jul. 2010   danRer7
These alignments were prepared using the methods described in the
track description file:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=panTro3&g=cons12way
based on the phylogenetic tree: 12way.nh.
Files in this directory:
- 12way.nh - phylogenetic tree used during the multiz multiple alignment
- panTro3.commonNames.12way.nh - same as 12way.nh with the UCSC database
names replaced by the common name for the species
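
As a rough illustration of how the tree files can be inspected, the
following shell sketch lists the leaf names in a Newick file such as
12way.nh. The function name newick_leaves is hypothetical, and the sketch
assumes plain, unquoted leaf labels, each followed by ":branch_length", and
no labels on internal nodes. It is not a general Newick parser.

```shell
# List the leaf (assembly) names of a Newick tree file, one per line.
# Assumption: leaf labels are unquoted and each is followed by ":length".
newick_leaves() {
    # Drop spaces and tabs, start a new line after every "(" or ",",
    # then keep only lines that begin with "label:".
    tr -d ' \t' < "$1" \
        | tr '(,' '\n\n' \
        | sed -n 's/^\([A-Za-z][A-Za-z0-9_]*\):.*/\1/p'
}
```

Example use: newick_leaves 12way.nh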
The "alignments" directory contains compressed FASTA alignments
for the CDS regions of the chimp genome (panTro3, Oct. 2010)
aligned to the other assemblies listed above.
The multiz12way.maf.gz file contains all the alignments for all chromosomes
and contigs in the chimp genome, with additional annotations to indicate
gap context and genomic breaks for the sequence in the underlying genome
assemblies. Beware: the compressed size of this file is 11 Gb, and the
uncompressed size is more than 56 Gb.
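
Because of its size, it may be preferable to stream this file rather than
decompress it to disk. The following is a minimal sketch, assuming gzip and
awk are available and relying on the MAF convention that each alignment
block starts with a line whose first field is "a". The helper name
count_maf_blocks is hypothetical.

```shell
# Count alignment blocks in a gzip-compressed MAF file without writing
# the decompressed data to disk.
count_maf_blocks() {
    # A MAF alignment block begins with a line whose first field is "a".
    gzip -dc < "$1" | awk '$1 == "a" { n++ } END { print n + 0 }'
}
```

Example use: count_maf_blocks multiz12way.maf.gz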
The maf/upstream*.maf.gz files contain alignments in regions upstream of
annotated transcription starts for Ensembl genes with annotated 5' UTRs.
These files differ from the standard MAF format: they display
alignments that extend from start to end of the upstream region in
chimp, whether or not alignments actually exist. In situations where no
alignments exist or the alignments of one or more species are missing,
dot (".") is used as a placeholder. Multiple regions of an assembly's
sequence may align to a single region in chimp; therefore, only the
species name is displayed in the alignment data and no position information
is recorded. The alignment score is always zero in these files. These files
are updated weekly.
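
One way to see which species are missing from each upstream alignment is to
look for rows whose sequence consists only of the dot placeholder. The
sketch below assumes each row is an "s" line with the species name in the
second field and the sequence text in the last field. That layout and the
helper name maf_missing_species are assumptions, not something this README
specifies.

```shell
# Print "block_number<TAB>species" for every row whose sequence text is
# made up entirely of "." placeholders.
# Assumption: rows are "s" lines, species in field 2, sequence in last field.
maf_missing_species() {
    gzip -dc < "$1" | awk '
        $1 == "a" { block++ }
        $1 == "s" && $NF ~ /^\.+$/ { print block "\t" $2 }'
}
```

Example use: maf_missing_species FILE.maf.gz, where FILE.maf.gz stands for
one of the maf/upstream*.maf.gz files.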
For a description of multiple alignment format (MAF), see
http://genome.ucsc.edu/goldenPath/help/maf.html.
PhastCons conservation scores for these alignments are available at:
http://hgdownload.cse.ucsc.edu/goldenPath/panTro3/phastCons12way
PhyloP conservation scores for these alignments are available at:
http://hgdownload.cse.ucsc.edu/goldenPath/panTro3/phyloP12way
---------------------------------------------------------------
To download a large file or multiple files from this directory, we recommend
that you use rsync or ftp rather than downloading the files via our website.
There is approximately 31 Gb of compressed data in this directory.
Via rsync:

    rsync -av --progress \
        rsync://hgdownload.cse.ucsc.edu/goldenPath/panTro3/multiz12way/ ./

Via FTP:

    ftp hgdownload.cse.ucsc.edu
    user name: anonymous
    password: <your email address>
    go to the directory goldenPath/panTro3/multiz12way

To download multiple files from the UNIX command line, use the "mget" command.

    mget <filename1> <filename2> ...
    - or -
    mget -a (to download all the files in the directory)

Use the "prompt" command to toggle the interactive mode if you do not want
to be prompted for each file that you download.
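
After downloading, file integrity can be checked against the md5sum.txt
file in this directory. The following is a minimal sketch, assuming GNU
md5sum is installed and that md5sum.txt uses the standard
"checksum  filename" layout with names relative to the download directory.
The helper name verify_downloads is hypothetical.

```shell
# Verify downloaded files against md5sum.txt in the given directory.
# Returns non-zero if any listed file is missing or has a wrong checksum.
verify_downloads() {
    # Use a subshell so the caller's working directory is left unchanged.
    ( cd "$1" && md5sum -c md5sum.txt )
}
```

Example use: verify_downloads . (run from the download directory). Note
that the check also fails for any file listed in md5sum.txt that was not
downloaded.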
---------------------------------------------------------------
All the files in this directory are freely usable for any
purpose. For data use restrictions regarding the individual
genome assemblies, see http://genome.ucsc.edu/goldenPath/credits.html.
Name                          Last modified     Size
12way.nh                      2011-03-18 11:36  306
alignments/                   2011-08-18 08:36  -
maf/                          2019-11-06 11:10  -
md5sum.txt                    2011-05-09 10:06  204
multiz12way.maf.gz            2011-05-03 10:42  5.1G
panTro3.commonNames.12way.nh  2011-03-18 11:46  305