October 15, 2020

Bookmarks: Tutorials for bioinformatics & computational science tools

Links to free tools and tutorials. To be updated occasionally...

Reading in large CSV spreadsheets (e.g. 1-2 GB)

  • Optimize how you read in data. Use fread() from the "data.table" R package instead of base read.csv(). It's faster and lets you import only certain columns or rows.
  • Plan on about 16 GB of RAM to read in a 2 GB spreadsheet. In my experience, 8 GB of RAM results in out-of-memory errors.
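
A rough sketch of the "import only certain columns or rows" idea. In R, fread() does this through its select and nrows arguments. The Python version below uses only the standard library; the function name read_selected, the column names, and the sample data are all made up for illustration.

```python
import csv
import io


def read_selected(handle, columns, max_rows=None):
    """Keep only the named columns, and stop after max_rows data rows."""
    reader = csv.DictReader(handle)
    rows = []
    for i, record in enumerate(reader):
        if max_rows is not None and i >= max_rows:
            break  # stop early so the rest of the file is never parsed
        rows.append({name: record[name] for name in columns})
    return rows


# Tiny in-memory stand-in for a large spreadsheet (hypothetical data).
demo = io.StringIO(
    "gene_id,sample_a,sample_b,notes\n"
    "G1,10,12,x\n"
    "G2,0,3,y\n"
    "G3,7,7,z\n"
)
subset = read_selected(demo, columns=["gene_id", "sample_b"], max_rows=2)
print(subset)
```

Because rows are streamed one at a time and unwanted columns are dropped immediately, memory use tracks the size of the subset you keep rather than the size of the whole file.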

Annotating omics data with Ensembl's BioMart tool (R, Perl)


RNA-seq workflow (Unix, R)


Comparing edgeR, DESeq2, and limma

DESeq2 differential expression analysis (R)


edgeR workflow

Machine learning 101 (Matlab, Python)


Comparing sequences

  • BLAST - compare your sequence to another given sequence, to the human genome, or to other genomes or transcriptomes
    • Nucleotide BLAST (blastn) - compare nucleotide sequences
    • blastx - give it a DNA/RNA sequence, which is translated and compared to protein sequences
    • tblastn - give it a protein sequence to be compared to translated DNA/RNA sequences
    • Protein BLAST (blastp) - compare protein sequences
  • ClustalW - input sequences in FASTA format and align them
    • Reduce the "Gap Extension Penalty" to zero if you're comparing DNA to mRNA. Long gaps then cost no more than short ones, so introns can interrupt the alignment, which the default penalties discourage.
    • Use this when you want to see the full alignment, not just the short windows of good alignment that BLAST provides
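
To see why the gap extension penalty matters for DNA vs. mRNA, here is a toy global alignment scorer with affine gap penalties, where a gap of length L costs gap_open + gap_extend * (L - 1). This is not ClustalW's actual algorithm or its default penalties, just a minimal Python sketch with made-up sequences and scores.

```python
def affine_global_score(a, b, match=1, mismatch=-1, gap_open=-10, gap_extend=-0.5):
    """Best global alignment score of a vs. b with affine gap penalties."""
    neg = float("-inf")
    n, m = len(a), len(b)
    # M: alignment ends with a[i-1] paired to b[j-1]
    # X: alignment ends with a[i-1] opposite a gap in b
    # Y: alignment ends with b[j-1] opposite a gap in a
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + gap_extend * (i - 1)
    for j in range(1, m + 1):
        Y[0][j] = gap_open + gap_extend * (j - 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])


# Hypothetical sequences: two 8-base exons separated by a 20-base intron.
exon1, intron, exon2 = "ACGTACGT", "GTAAGTTTTTTTTTTTTTAG", "GGCCTTAA"
genomic = exon1 + intron + exon2
mrna = exon1 + exon2

score_default = affine_global_score(genomic, mrna, gap_extend=-0.5)
score_free = affine_global_score(genomic, mrna, gap_extend=0)
print(score_default)  # -3.5
print(score_free)     # 6
```

With a zero extension penalty, the 20-base intron costs only the single gap-opening penalty, so the alignment that keeps both exons intact scores well no matter how long the intron is.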

Enrichment Analysis - methods with no coding required

  • Ingenuity Pathway Analysis (QIAGEN) - click on "Resources" and search for webinars. The software is free, but the license is not. Cedars-Sinai has an institutional license that you can request through EIS.
  • Gene Ontology - free but less detailed than IPA

Data visualization


Statistical models



General links

  • Pak Yu's GitHub: https://sfpacman.github.io/cookbook/index.html

How to format final figures for publication

  • General figure guidelines
  • File types and file sizes
    • TIFF images with LZW compression to reduce the file size
    • PDF files for vector images
  • Not...