This file is from:
http://hgdownload.cse.ucsc.edu/goldenPath/rn6/multiz20way/README.txt
This directory contains compressed multiple alignments of the
following assemblies to the Rat genome (rn6, Jul. 2014):
Assemblies used in these alignments:
Common name     Scientific name           Date       Assembly
Rat             Rattus norvegicus         Jul. 2014  (RGSC 6.0/rn6) reference
Mouse           Mus musculus              Dec. 2011  (GRCm38/mm10)
Prairie vole    Microtus ochrogaster      Oct. 2012  (MicOch1.0/micOch1)
Guinea pig      Cavia porcellus           Feb. 2008  (Broad/cavPor3)
Rabbit          Oryctolagus cuniculus     Apr. 2009  (Broad/oryCun2)
Human           Homo sapiens              Dec. 2013  (GRCh38/hg38)
Chimp           Pan troglodytes           May 2016   (Pan_tro 3.0/panTro5)
Rhesus          Macaca mulatta            Nov. 2015  (BCM Mmul_8.0.1/rheMac8)
Tarsier         Tarsius syrichta          Sep. 2013  (Tarsius_syrichta-2.0.1/tarSyr2)
Dog             Canis lupus familiaris    Sep. 2011  (Broad CanFam3.1/canFam3)
Panda           Ailuropoda melanoleuca    Dec. 2009  (BGI-Shenzhen 1.0/ailMel1)
Cat             Felis catus               Nov. 2014  (ICGSC Felis_catus_8.0/felCat8)
Cow             Bos taurus                Jun. 2014  (Bos_taurus_UMD_3.1.1/bosTau8)
Opossum         Monodelphis domestica     Oct. 2006  (Broad/monDom5)
Platypus        Ornithorhynchus anatinus  Feb. 2007  (ASM227v2/ornAna2)
Chicken         Gallus gallus             Dec. 2015  (Gallus_gallus-5.0/galGal5)
Turkey          Meleagris gallopavo       Nov. 2014  (Turkey_5.0/melGal5)
X. tropicalis   Xenopus tropicalis        Sep. 2012  (JGI 7.0/xenTro7)
Zebrafish       Danio rerio               Sep. 2014  (GRCz10/danRer10)
Elephant shark  Callorhinchus milii       Dec. 2013  (Callorhinchus_milii-6.1.3/calMil1)
These alignments were prepared using the methods described in the
track description file:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=rn6&g=cons20way
based on the phylogenetic tree: rn6.20way.nh.
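The .nh tree files are in Newick format, so the species in the tree can be listed with a few lines of Python. The following is a minimal sketch, not part of the UCSC tooling: the function name newick_leaf_names is hypothetical, and it assumes unquoted labels, as is typical for UCSC database names such as rn6 or mm10.

```python
import re
from typing import List


def newick_leaf_names(newick_text: str) -> List[str]:
    """Return the leaf labels of a Newick tree in left-to-right order.

    A leaf label is any name that directly follows "(" or ",".
    Labels that follow ")" belong to internal nodes and are skipped.
    Assumption: labels are unquoted and contain no whitespace.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick_text)
```

For a downloaded tree, something like newick_leaf_names(open("rn6.20way.nh").read()) should return the database names in tree order.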
Files in this directory:
- rn6.20way.nh - phylogenetic tree used during the multiz multiple alignment
- rn6.20way.commonNames.nh - same as rn6.20way.nh with the UCSC database
names replaced by the common name for the species
- rn6.20way.scientificNames.nh - same as rn6.20way.nh with the UCSC database
names replaced by the scientific name for the species
- upstream*.ensGene.maf.gz - alignments of regions upstream of Ensembl genes
- rn6.20way.maf.gz - the multiple alignments on the Rat genome
- md5sum.txt - MD5 checksums of these files, for verifying that downloads completed correctly
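Downloads can be checked against md5sum.txt from Python as well as with the md5sum command. The sketch below assumes md5sum.txt uses the usual md5sum output layout (a hex digest, whitespace, then a file name without spaces); the function names md5_of_file and check_downloads are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(md5sum_path, directory=".") -> Dict[str, bool]:
    """Map each file named in md5sum_path to True if its checksum matches.

    Assumption: each line is "<hex digest> <file name>"; other lines
    are ignored. Missing files are reported as False.
    """
    results = {}
    for line in Path(md5sum_path).read_text().splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        expected, name = parts
        name = name.lstrip("*")  # md5sum marks binary mode with "*"
        target = Path(directory) / name
        results[name] = (
            target.is_file() and md5_of_file(target) == expected.lower()
        )
    return results
```

Reading in chunks matters here because rn6.20way.maf.gz is several gigabytes and should not be loaded into memory at once.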
The "alignments" directory contains compressed FASTA alignments
of the CDS regions of the ensGene gene track (v86/Oct. 2016 version)
of the rat genome (rn6, Jul. 2014) aligned to the assemblies.
The rn6.20way.maf.gz file contains all the alignments for the chromosomes
in the rat genome, including additional annotations to indicate gap
context and genomic breaks for the sequence in the underlying
genome assemblies. Note that the compressed size of the
MAF file is 7.2 GB; uncompressed, it is more than 46 GB.
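Because of the file's size, it is usually more practical to stream it through gzip than to decompress it to disk. The sketch below reads alignment blocks one at a time. It relies only on the layout described in the MAF documentation ("a" lines start a block; "s" lines carry source, start, size, strand, source size, and text) and skips the annotation lines; the names SeqLine and iter_maf_blocks are hypothetical.

```python
import gzip
from typing import Iterable, Iterator, List, NamedTuple


class SeqLine(NamedTuple):
    """One "s" line of a MAF alignment block."""
    src: str
    start: int
    size: int
    strand: str
    src_size: int
    text: str


def iter_maf_blocks(lines: Iterable[str]) -> Iterator[List[SeqLine]]:
    """Yield the "s" lines of each alignment block, one block at a time.

    Header lines and annotation lines (for example "i" and "e") are
    skipped, so only one block is held in memory at any moment.
    """
    block = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "a":
            if block is not None:
                yield block
            block = []
        elif fields[0] == "s" and block is not None and len(fields) == 7:
            block.append(SeqLine(fields[1], int(fields[2]), int(fields[3]),
                                 fields[4], int(fields[5]), fields[6]))
    if block is not None:
        yield block


def count_blocks(path) -> int:
    """Count alignment blocks in a gzip-compressed MAF file."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in iter_maf_blocks(handle))
```

Calling count_blocks("rn6.20way.maf.gz") walks the whole file once; expect this to take a long time on the full alignment.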
The upstream*.ensGene.maf.gz files contain alignments in regions upstream of
annotated transcription starts for version v86/Oct. 2016 Ensembl genes
with annotated 5' UTRs. These files differ from the standard
MAF format: they display
alignments that extend from start to end of the upstream region in
the rat, whether or not alignments actually exist. Where no
alignments exist, or the alignments of one or more species are missing,
a dot (".") is used as a placeholder. Multiple regions of an assembly's
sequence may align to a single region in rat; therefore, only the
species name is displayed in the alignment data and no position information
is recorded. The alignment score is always zero in these files.
For a description of multiple alignment format (MAF), see
http://genome.ucsc.edu/goldenPath/help/maf.html.
PhastCons conservation scores for these alignments are available at:
http://hgdownload.cse.ucsc.edu/goldenPath/rn6/phastCons20way
PhyloP conservation scores for these alignments are available at:
http://hgdownload.cse.ucsc.edu/goldenPath/rn6/phyloP20way
---------------------------------------------------------------
To download a large file or multiple files from this directory, we recommend
that you use rsync or ftp rather than downloading the files via our website.
There is approximately 7.9 GB of compressed data in this directory.
Via rsync:
    rsync -av --progress \
        rsync://hgdownload.cse.ucsc.edu/goldenPath/rn6/multiz20way/ ./
Via FTP:
    ftp hgdownload.cse.ucsc.edu
    user name: anonymous
    password: <your email address>
    go to the directory goldenPath/rn6/multiz20way
To download multiple files from the UNIX command line, use the "mget" command.
    mget <filename1> <filename2> ...
    - or -
    mget -a (to download all the files in the directory)
Use the "prompt" command to toggle the interactive mode if you do not want
to be prompted for each file that you download.
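When only one or two files are needed, the rsync form shown earlier can be pointed at a single file instead of the whole directory. The following sketch wraps that call from Python; it assumes rsync is installed and on the PATH, reuses the server path and flags given in this README, and the names rsync_command and fetch are hypothetical.

```python
import subprocess
from typing import List

# Server path taken from the rsync example in this README.
RSYNC_BASE = "rsync://hgdownload.cse.ucsc.edu/goldenPath/rn6/multiz20way/"


def rsync_command(filename: str, dest: str = "./") -> List[str]:
    """Build the rsync argument list for one file in this directory."""
    return ["rsync", "-av", "--progress", RSYNC_BASE + filename, dest]


def fetch(filename: str, dest: str = "./") -> None:
    """Download one file; raises CalledProcessError if rsync fails."""
    subprocess.run(rsync_command(filename, dest), check=True)
```

For example, fetch("md5sum.txt") followed by fetch("upstream1000.ensGene.maf.gz") retrieves the checksum list and the smallest upstream file without pulling the 7.2G alignment.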
---------------------------------------------------------------
All the files in this directory are freely usable for any
purpose. For data use restrictions regarding the individual
genome assemblies, see http://genome.ucsc.edu/goldenPath/credits.html.
Name                          Last modified     Size
alignments/                   2017-01-30 11:01  -
md5sum.txt                    2017-01-24 15:16  406
rn6.20way.commonNames.nh      2017-01-24 13:25  674
rn6.20way.nh                  2017-01-24 13:25  683
rn6.20way.scientificNames.nh  2017-01-24 13:25  862
upstream1000.ensGene.maf.gz   2017-01-24 14:53  52M
upstream2000.ensGene.maf.gz   2017-01-24 14:57  103M
upstream5000.ensGene.maf.gz   2017-01-24 15:02  220M
rn6.20way.maf.gz              2017-01-22 10:39  7.2G