This directory contains download files of the Saccharomyces
cerevisiae genome sequence and associated annotations. The data
is based on sequence dated June 2008 in the Saccharomyces Genome
Database (http://www.yeastgenome.org/) and was obtained from the site
http://downloads.yeastgenome.org/sequence/genomic_sequence/chromosomes/fasta/
The S288C strain was used in this sequencing project.
Files included in this directory:
sacCer2.2bit - contains the complete genome sequence in the 2bit file format.
The utility program, twoBitToFa (available from the kent src tree),
can be used to extract .fa file(s) from this file. A pre-compiled
version of the command line tool can be found at:
http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
See also:
http://genome.ucsc.edu/admin/git.html
http://genome.ucsc.edu/admin/jk-install.html
chromAgp.tar.gz - contains the list of accession identifiers for
each chromosome, unpacking to one file per chromosome.
chromFa.tar.gz - The assembly sequence in one file per chromosome.
No masking has been applied to these sequences.
There are NO RepeatMasker .out files for this assembly.
chromTrf.tar.gz - Tandem Repeats Finder locations, filtered to keep repeats
with period less than or equal to 12, and translated into UCSC's BED
format (one file per chromosome).
est.fa.gz - S. cerevisiae ESTs in GenBank. This sequence data is updated once a
week via automatic GenBank updates.
md5sum.txt - checksums of files in this directory
mrna.fa.gz - S. cerevisiae mRNA from GenBank. This sequence data is updated
once a week via automatic GenBank updates.
sgdGene.upstream*.fa.gz - Sequences upstream of Saccharomyces Genome Database
genes, in three separate files covering 1000, 2000 and 5000 bases.
sacCer2.chrom.sizes - Two-column tab-separated text file containing assembly
sequence names and sizes.
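As a rough illustration of how sacCer2.chrom.sizes and sacCer2.2bit can be
used together, the shell sketch below sums the sizes column and prints one
twoBitToFa command per sequence. The function names are hypothetical helpers,
not part of the kent tools. The sketch assumes a POSIX shell with awk, the
two-column tab-separated layout described above, and twoBitToFa's -seq=name
option. The commands are only printed, not run.

```shell
# Sum the second column of a chrom.sizes file to get the total
# assembly length in bases.
total_bases() {
    awk -F '\t' 'NF >= 2 { total += $2 } END { print total + 0 }' "$1"
}

# Print (do not run) one twoBitToFa command per sequence named in a
# chrom.sizes file. Pipe the output to sh to actually run the commands.
#   $1 = chrom.sizes file, $2 = 2bit file
print_extract_commands() {
    awk -F '\t' -v tb="$2" \
        'NF >= 2 { printf "twoBitToFa -seq=%s %s %s.fa\n", $1, tb, $1 }' \
        "$1"
}
```

Typical use would be "total_bases sacCer2.chrom.sizes", or
"print_extract_commands sacCer2.chrom.sizes sacCer2.2bit | sh" once
twoBitToFa is on your PATH.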
------------------------------------------------------------------
If you plan to download a large file or multiple files from this
directory, we recommend that you use ftp rather than downloading the
files via our website. To do so, ftp to hgdownload.cse.ucsc.edu
[username: anonymous, password: your email address], then cd to the
directory goldenPath/sacCer2/bigZips. To download multiple files, use
the "mget" command:
mget <filename1> <filename2> ...
- or -
mget -a (to download all the files in the directory)
Alternative methods to ftp access:
Using an rsync command to download the entire directory:
rsync -avzP rsync://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/ .
For a single file, e.g. chromFa.tar.gz:
rsync -avzP \
    rsync://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/chromFa.tar.gz .
Or with wget, all files:
wget --timestamping \
    'ftp://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/*'
With wget, a single file:
wget --timestamping \
    'ftp://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips/chromFa.tar.gz' \
    -O chromFa.tar.gz
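To fetch a handful of named files rather than one file or the whole
directory, the rsync command can be wrapped in a small loop. This is only a
sketch: BASE_URL is taken from the rsync examples in this README, while
bigzips_url, fetch_files and the DRY_RUN variable are hypothetical names
introduced here for illustration.

```shell
BASE_URL='rsync://hgdownload.cse.ucsc.edu/goldenPath/sacCer2/bigZips'

# Build the rsync URL for one file in this directory.
bigzips_url() {
    printf '%s/%s\n' "$BASE_URL" "$1"
}

# Download each named file into the current directory.
# Set DRY_RUN=1 to print the rsync commands instead of running them.
fetch_files() {
    for name in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "rsync -avzP $(bigzips_url "$name") ."
        else
            rsync -avzP "$(bigzips_url "$name")" . || return 1
        fi
    done
}
```

For example, "DRY_RUN=1 fetch_files chromFa.tar.gz md5sum.txt" shows the
commands that would be run. Drop DRY_RUN=1 to perform the downloads.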
To unpack the *.tar.gz files:
tar xvzf <file>.tar.gz
To uncompress the fa.gz files:
gunzip <file>.fa.gz
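Before unpacking, a downloaded file can be checked against md5sum.txt. The
sketch below assumes the GNU md5sum utility is installed and that md5sum.txt
uses the usual "digest  filename" layout. Neither assumption is confirmed by
this README, and verify_md5 is a hypothetical helper name.

```shell
# Check one file against its entry in an md5sum-style checksum list.
# Returns 0 when the digests match, non-zero when they differ or when
# the file has no entry in the list.
#   $1 = file to check, $2 = checksum list (default: md5sum.txt)
verify_md5() {
    file=$1
    list=${2:-md5sum.txt}
    # Look up the expected digest; a leading "*" on the file name
    # (binary-mode marker) is ignored.
    expected=$(awk -v f="$file" \
        '{ n = $2; sub(/^\*/, "", n) } n == f { print $1; exit }' "$list")
    actual=$(md5sum "$file" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A possible use is "verify_md5 chromFa.tar.gz && tar xvzf chromFa.tar.gz",
which unpacks the archive only when the checksum matches.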
All the files in this directory are freely available for public use.
Name                        Last modified      Size
xenoRefMrna.fa.gz.md5       2019-10-17 21:04   52
xenoRefMrna.fa.gz           2019-10-17 21:04   331M
upstream5000.fa.gz.md5      2019-10-17 21:04   53
upstream5000.fa.gz          2019-10-17 21:04   73K
upstream2000.fa.gz.md5      2019-10-17 21:04   53
upstream2000.fa.gz          2019-10-17 21:04   30K
upstream1000.fa.gz.md5      2019-10-17 21:04   53
upstream1000.fa.gz          2019-10-17 21:04   16K
sgdGene.upstream5000.fa.gz  2009-07-28 12:37   8.1M
sgdGene.upstream2000.fa.gz  2009-07-28 12:37   3.8M
sgdGene.upstream1000.fa.gz  2009-07-28 12:37   2.1M
sacCer2.fa.gz               2020-01-23 02:26   3.6M
sacCer2.chrom.sizes         2009-02-03 14:05   242
sacCer2.2bit                2009-02-03 14:05   2.9M
mrna.fa.gz.md5              2019-10-17 21:00   45
mrna.fa.gz                  2019-10-17 21:00   111K
md5sum.txt                  2012-01-09 13:19   434
genes/                      2020-02-05 13:47   -
est.fa.gz.md5               2019-10-17 21:04   44
est.fa.gz                   2019-10-17 21:04   6.2M
chromTrf.tar.gz             2009-02-24 15:40   20K
chromFa.tar.gz              2009-02-24 15:40   3.6M
chromAgp.tar.gz             2009-02-24 15:40   711