Wednesday, May 22, 2013

Compiling Cufflinks

Reference from http://cufflinks.cbcb.umd.edu/tutorial.html
Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.

Preliminary Compilation
  1. Compiling and Installing Boost 1.53
  2. Compiling SamTools
  3. Compiling Eigen
Step 1: Download Cufflinks

Download the Cufflinks source archive (cufflinks-2.1.1.tar.gz is used below).

Step 2: Compile Cufflinks
# tar -zxvf cufflinks-2.1.1.tar.gz
# cd cufflinks-2.1.1
# ./configure --prefix=/usr/local/cufflinks/ \
--with-boost=/usr/local/boost \
--with-eigen=/usr/local/include
# make
# make install
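
After make install, it can be handy to confirm that the binary landed where expected and to put it on the PATH. The following is only a sketch: it assumes the same /usr/local/cufflinks prefix passed to configure above, and the variable names CUFFLINKS_PREFIX and CUFFLINKS_BIN are made up for illustration. The PATH change lasts for the current shell session only.

```shell
# Assumed install prefix, matching the --prefix given to ./configure above.
CUFFLINKS_PREFIX=/usr/local/cufflinks
CUFFLINKS_BIN="$CUFFLINKS_PREFIX/bin"

# Prepend the bin directory to PATH unless it is already listed.
case ":$PATH:" in
  *":$CUFFLINKS_BIN:"*) ;;
  *) PATH="$CUFFLINKS_BIN:$PATH"; export PATH ;;
esac

# Report whether the cufflinks executable exists at the expected location.
if [ -x "$CUFFLINKS_BIN/cufflinks" ]; then
  echo "cufflinks found in $CUFFLINKS_BIN"
else
  echo "cufflinks not found in $CUFFLINKS_BIN; check the make install output"
fi
```

If the second message appears, the most likely cause is a different --prefix or a failed make install.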

Step 3: Testing the installation

Download the test data, then run Cufflinks on it:
# /usr/local/cufflinks/bin/cufflinks ./test_data.sam

You should see the following output:
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./test_data.sam doesn't appear to be a valid BAM file, trying SAM...
[13:23:15] Inspecting reads and determining fragment length distribution.
> Processed 1 loci.                            [*************************] 100%
Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges.  
It is recommended that correct paramaters (--frag-len-mean and --frag-len-std-dev) be provided.
> Map Properties:
>       Total Map Mass: 102.50
>       Read Type: 75bp x 75bp
>       Fragment Length Distribution: Truncated Gaussian (default)
>                     Estimated Mean: 200
>                  Estimated Std Dev: 80
[13:23:15] Assembling transcripts and estimating abundances.
> Processed 1 loci.                            [*************************] 100%



Verify that the file transcripts.gtf is in the current directory and looks like this (your file will have GTF attributes, omitted here for clarity):
test_chromosome Cufflinks       exon    53      250     1000    +       . 
test_chromosome Cufflinks       exon    351     400     1000    +       . 
test_chromosome Cufflinks       exon    501     550     1000    +       .
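
Rather than checking the file by eye, a quick count of the exon records can be compared against the three shown in the listing above. This is a rough sketch only: count_exons is a hypothetical helper (not part of Cufflinks), and it assumes the GTF file is tab-separated with the feature type in the third column.

```shell
# count_exons FILE: print how many records in a GTF file have "exon"
# as their feature type (third tab-separated column).
# Hypothetical helper for illustration; not shipped with Cufflinks.
count_exons() {
  awk -F '\t' '$3 == "exon" { n++ } END { print n + 0 }' "$1"
}

# Check the file in the current directory, if it is there.
if [ -f transcripts.gtf ]; then
  echo "exon records: $(count_exons transcripts.gtf)"
else
  echo "transcripts.gtf not found in $(pwd)"
fi
```

A count other than the one shown in the listing suggests the run did not complete or was pointed at a different input.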
