August 6, 2022

R code: Install code for my favorite R packages for bioinformatics

I am updating my R version and re-installing a bunch of things. Here is the code so I can copy/paste more easily in the future.

## For opening large csv files faster than read.csv()
## These are essential for my DNA methylation work when I have >800,000 rows of data

install.packages(c("vroom", "data.table"))
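To check that both readers work after installation, here is a minimal sketch that writes a tiny CSV to a temporary file and reads it back with each package. The table contents and the variable names (`meth_dt`, `meth_tb`) are made up for illustration; each read is skipped if the package is not installed.

```r
## Write a small example table to a temporary CSV file (toy data, not real probes)
tmp <- tempfile(fileext = ".csv")
toy <- data.frame(probe = c("cg01", "cg02", "cg03"),
                  beta  = c(0.12, 0.87, 0.45))
write.csv(toy, tmp, row.names = FALSE)

## Read it back with data.table::fread(), which returns a data.table
if (requireNamespace("data.table", quietly = TRUE)) {
  meth_dt <- data.table::fread(tmp)
}

## Read it back with vroom::vroom(), which returns a tibble
if (requireNamespace("vroom", quietly = TRUE)) {
  meth_tb <- vroom::vroom(tmp, show_col_types = FALSE)
}
```

For a real file, replace `tmp` with the path to the csv.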


## For opening Excel spreadsheets
install.packages(c("gdata", "openxlsx"))


## For data cleaning, organizing, subsetting, and using the pipe operator (%>%). Installing tidyverse installs several powerful packages at once (tidyr, dplyr, stringr, ggplot2, and more).

install.packages("tidyverse")

install.packages("lubridate")  ## for dates


## For plotting, data visualization

install.packages(c("ggplot2", "ggrepel", "ggpubr"))

install.packages("reshape2")  ## useful to re-format data for plotting

install.packages("gplots") ## for function heatmap.2

install.packages(c("plotly", "heatmaply", "htmlwidgets")) ## for interactive plots and saving to html

install.packages("ggExtra") ## for ggMarginal function

install.packages(c("viridis", "RColorBrewer")) ## for pretty color ranges

install.packages("scales") ## for prettier axis breaks



## For advanced data visualization with Dr Gu's ComplexHeatmap package
## (Warning: this can take a few minutes if the server is slow)

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("ComplexHeatmap")
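The `if (!require(...))` guard above appears many times in this post. One way to avoid repeating it is a small helper that installs only what is missing. This is a sketch, not part of any package: the names `find_missing()` and `install_if_missing()` are hypothetical.

```r
## Hypothetical helper: return the packages in `pkgs` that cannot be loaded
## from the current library paths.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

## Hypothetical helper: call `installer` only on the missing packages.
## `installer` defaults to install.packages(); pass BiocManager::install
## for Bioconductor packages (BiocManager itself must already be installed).
install_if_missing <- function(pkgs, installer = install.packages) {
  missing_pkgs <- find_missing(pkgs)
  if (length(missing_pkgs) > 0) installer(missing_pkgs)
  invisible(missing_pkgs)
}

## Usage (commented out so that running this block does not hit the network):
## install_if_missing(c("vroom", "data.table"))
## install_if_missing(c("DESeq2", "limma", "edgeR"), installer = BiocManager::install)
```

Note that `requireNamespace()` loads a package's namespace as a side effect, which is harmless here but slower than a plain lookup for long package lists.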


## For combining plots into multipanel figures or creating/extracting multi-page pdf files

install.packages("cowplot")  ## my favorite for aligning plots, general intro

install.packages("patchwork") ## for combining plots, used by Seurat

install.packages("pdftools")


## For easy Manhattan and QQ plots (GWAS and DNA methylation data)

install.packages("qqman")


## For genomics annotation from Ensembl release 90 specifically

install.packages("devtools")

devtools::install_github("stephenturner/annotables")


## For genomics annotations from the Ensembl BioMart servers

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("biomaRt")


## For differential expression analysis (RNA-seq data)

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install(c("DESeq2", "limma", "edgeR"))


## Advanced image processing with magick, and image-to-text with tesseract

install.packages("magick")

install.packages("tesseract")

install.packages("jpeg")


## Notification sounds for RStudio with beepr (I use it on Windows; the package itself is cross-platform). I love using this when I'm running some code over lunch. I can hear when it's done!

install.packages("beepr")
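One possible way to use it is sketched below: a hypothetical wrapper, `notify_when_done()`, runs a task and then calls `beepr::beep()` if beepr is available. The wrapper name and the toy task are assumptions for illustration only.

```r
## Hypothetical wrapper: run `task` (a function with no arguments),
## play the default beepr sound if the package is installed, and return the result.
notify_when_done <- function(task) {
  result <- task()
  if (requireNamespace("beepr", quietly = TRUE)) {
    beepr::beep()
  }
  result
}

## Toy example: Sys.sleep() stands in for a long-running analysis step
answer <- notify_when_done(function() {
  Sys.sleep(1)
  sum(1:10)
})
```

In an interactive session, simply adding `beepr::beep()` as the last line of a long script has the same effect.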


## Venn diagrams and upset plots

install.packages("ggVennDiagram") ## really nice for Venn diagrams

if(!require(devtools)) install.packages("devtools")

devtools::install_github("krassowski/complex-upset") ## for better upset plots


## Web scraping with rvest (tutorial1, tutorial2)

install.packages(c("rvest", "httr"))



## ----------- Single cell RNA-seq analysis ---------------------

## To handle data imports

install.packages("fftwtools") ## dependency

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("S4Vectors") ## dependency for rhdf5 and schard

BiocManager::install("rhdf5")  ## dependency for schard

devtools::install_github("cellgeni/schard") ## schard allows opening h5ad files into other formats

install.packages("anndata")  ## to load .h5ad files from scanpy


## Single cell RNA-seq analysis packages

BiocManager::install("multtest")  ## dependency for Seurat FindConservedMarkers() function

install.packages("qqconf") ## dependency for metap

install.packages("metap") ## dependency for Seurat FindConservedMarkers() function

install.packages("Seurat") # tool from Satija lab

BiocManager::install("scran") # tutorial for scran


## Quality control: doublet finder

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("scDblFinder")

BiocManager::install("BiocSingular")  ## calculates doublet score


## Cell recognition method

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("SingleR") 


## Phylogenetic tree for clusters

install.packages("ape")  ##  dependency for function Seurat::PlotClusterTree


## Install Azimuth and Azimuth dependencies

if (!requireNamespace("remotes", quietly = TRUE))

     install.packages("remotes")

remotes::install_github("stuart-lab/signac", ref = "master")  # previously ref = "develop"


if (!require("BiocManager", quietly = TRUE))

     install.packages("BiocManager")

BiocManager::install("BSgenome.Hsapiens.UCSC.hg38")

BiocManager::install("TFBSTools")


devtools::install_github("satijalab/seurat", "seurat5")

devtools::install_github("satijalab/seurat-data", "seurat5")

remotes::install_github('satijalab/azimuth', ref = 'master')

# OLDER VERSION: devtools::install_github("satijalab/azimuth", "seurat5")  


## Cell proportion analysis

devtools::install_github("rpolicastro/scProportionTest") ## permutation testing, citation

BiocManager::install("speckle") ## propeller method, t-test for 2 groups, ANOVA test for 3+ groups, strictest method because it takes into account biological replicates



## ----------- DNA methylation packages ---------------------

## For methylation array analysis (missMethyl has a gometh function for gene ontology analysis that accounts for the number of probes per gene)

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("missMethyl")


## Differentially methylated region analysis (DNA methylation data) - three methods

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("ChAMP")

BiocManager::install("bumphunter")

BiocManager::install("DMRcate")


## ----------- Python and R play nice ----------------------

install.packages("renv")  ## to control the environment, see advice from scanpy



## ----------- Requires extra software ---------------------

## Per Seurat message: "For a (much!) faster implementation of the Wilcoxon Rank Sum Test, (default method for FindMarkers) please install the presto package". If using Windows, install Git for Windows first, linked from microsoft.com.

# install.packages("devtools")

# devtools::install_github("immunogenomics/presto")


## ----------- Notes for Linux ---------------------

## Linux installations of R packages may require additional Linux tools. Check red text error messages and install necessary dependencies from the Linux terminal, then try again in RStudio.

## You can also install R packages from the Linux terminal. Example for Debian/Ubuntu and the R package "fftwtools":
##  sudo apt-get install r-cran-fftwtools
