August 6, 2022

R code: Install code for my favorite R packages for bioinformatics

I am updating my R version and re-installing a bunch of things. Here is the code so I can copy/paste more easily in the future.

## For opening large csv files faster than read.csv()
## These are essential for my DNA methylation work when I have >800,000 rows of data

install.packages(c("vroom", "data.table"))
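As a rough sketch of how these two readers are called (the probe IDs and values below are made up, and the file is a tiny temporary one so the snippet runs anywhere), both take a file path and guess the delimiter and column types:

```r
## Write a small throwaway CSV; a real methylation table would have >800,000 rows.
## The probe names and beta values are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(probe = c("cg01", "cg02", "cg03"),
                     beta  = c(0.12, 0.85, 0.47)),
          csv_path, row.names = FALSE)

## data.table::fread() returns a data.table
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::fread(csv_path)
  print(dim(dt))
}

## vroom::vroom() returns a tibble
if (requireNamespace("vroom", quietly = TRUE)) {
  tb <- vroom::vroom(csv_path, show_col_types = FALSE)
  print(dim(tb))
}
```

The `requireNamespace()` guards only keep the sketch from failing if one of the packages is not installed yet.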


## For opening Excel spreadsheets
install.packages(c("gdata", "openxlsx"))


## For data cleaning, organizing, subsetting, and the pipe (%>%) syntax. Installing tidyverse installs several powerful packages at once (tidyr, dplyr, stringr, ggplot2, and more).

install.packages("tidyverse")

install.packages("lubridate")  ## for dates


## For plotting, data visualization

install.packages(c("ggplot2", "ggrepel", "ggpubr"))

install.packages("reshape2")  ## useful to re-format data for plotting

install.packages("gplots") ## for function heatmap.2

install.packages(c("plotly", "heatmaply", "htmlwidgets")) ## for interactive plots and saving to html

install.packages("ggExtra") ## for ggMarginal function

install.packages(c("viridis", "RColorBrewer")) ## for pretty color ranges

install.packages("scales") ## for prettier axis breaks



## For advanced data visualization with Dr Gu's ComplexHeatmap package
## (Warning: this can take a few minutes if the server is slow.)

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("ComplexHeatmap")
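The `if (!require(...))` guard above is repeated throughout this post. As a minimal sketch using base R only (the helper names `missing_packages` and `install_if_missing` are my own and not from any package), the same idea can be wrapped in a function that installs just the packages that are not already present:

```r
## Hypothetical helper: return the names in `pkgs` that are not installed
## in the current R library paths.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

## Hypothetical helper: call `installer` only on the missing packages.
## `installer` defaults to install.packages (CRAN).
install_if_missing <- function(pkgs, installer = install.packages) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    installer(to_install)
  }
  invisible(to_install)
}
```

For example, `install_if_missing(c("vroom", "data.table"))` would skip anything already installed. For Bioconductor packages, passing `installer = BiocManager::install` should work the same way, assuming BiocManager itself is installed.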


## For combining plots into multipanel figures or creating/extracting multi-page pdf files

install.packages("cowplot")  ## my favorite for aligning plots, general intro

install.packages("patchwork") ## for combining plots, used by Seurat

install.packages("pdftools")


## For easy Manhattan and QQ plots (GWAS and DNA methylation data)

install.packages("qqman")
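A minimal sketch of what the two qqman plots expect as input. The data frame here is simulated with made-up values; real GWAS or methylation results would replace it, with the column names passed through the `chr`, `bp`, `p`, and `snp` arguments:

```r
set.seed(1)

## Simulated results (invented values): 3 chromosomes, 100 sites each
sim <- data.frame(
  SNP = paste0("site", 1:300),
  CHR = rep(1:3, each = 100),
  BP  = rep(seq(1000, by = 1000, length.out = 100), times = 3),
  P   = runif(300)
)

if (requireNamespace("qqman", quietly = TRUE)) {
  ## Manhattan plot: one point per site, ordered by chromosome and position
  qqman::manhattan(sim, chr = "CHR", bp = "BP", p = "P", snp = "SNP")
  ## QQ plot of observed vs. expected p-values
  qqman::qq(sim$P)
}
```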


## For genomics annotation from Ensembl release 90 specifically

install.packages("devtools")

devtools::install_github("stephenturner/annotables")


## For genomics annotations from the Ensembl BioMart servers

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("biomaRt")


## For differential expression analysis (RNA-seq data)

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install(c("DESeq2", "limma", "edgeR"))


## Advanced image processing with magick, and image-to-text with tesseract

install.packages("magick")

install.packages("tesseract")

install.packages("jpeg")


## Notification sounds for RStudio (Windows only) with beepr. I love using this when I'm running some code over lunch. I can hear when it's done!

install.packages("beepr")
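A minimal sketch of how I would use it: put a `beepr::beep()` call right after the slow step. The `slow_job` function below is a made-up stand-in for real analysis code:

```r
## Hypothetical stand-in for a long-running job; replace with real analysis code
slow_job <- function() {
  Sys.sleep(1)
  "done"
}

result <- slow_job()

## Play the default notification sound once the job has finished
if (requireNamespace("beepr", quietly = TRUE)) {
  beepr::beep()
}
```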


## Venn diagrams and upset plots

install.packages("ggVennDiagram") ## really nice for Venn diagrams

if(!require(devtools)) install.packages("devtools")

devtools::install_github("krassowski/complex-upset") ## for better upset plots


## Web scraping with rvest (tutorial1, tutorial2)

install.packages(c("rvest", "httr"))



## ----------- Single cell RNA-seq analysis ---------------------

## To handle data imports

install.packages("fftwtools") ## dependency

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("S4Vectors") ## dependency for rhdf5 and schard

BiocManager::install("rhdf5")  ## dependency for schard

devtools::install_github("cellgeni/schard") ## schard allows opening h5ad files into other formats

install.packages("anndata")  ## to load .h5ad files from scanpy


## Single cell RNA-seq analysis packages

BiocManager::install("multtest")  ## dependency for Seurat FindConservedMarkers() function

install.packages("qqconf") ## dependency for metap

install.packages("metap") ## dependency for Seurat FindConservedMarkers() function

install.packages("Seurat") # tool from Satija lab

BiocManager::install("scran") # tutorial for scran


## Quality control: doublet finder

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("scDblFinder")

BiocManager::install("BiocSingular")  ## calculates doublet score


## Cell recognition method

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("SingleR") 


## Phylogenetic tree for clusters

install.packages("ape")  ##  dependency for function Seurat::PlotClusterTree


## Install Azimuth and Azimuth dependencies

if (!requireNamespace("remotes", quietly = TRUE))

     install.packages("remotes")

remotes::install_github("stuart-lab/signac", ref = "master")  # previously ref = "develop"


if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("BSgenome.Hsapiens.UCSC.hg38")

BiocManager::install("TFBSTools")


devtools::install_github("satijalab/seurat", ref = "seurat5")

devtools::install_github("satijalab/seurat-data", ref = "seurat5")

remotes::install_github('satijalab/azimuth', ref = 'master')

# OLDER VERSION: devtools::install_github("satijalab/azimuth", "seurat5")  


## Cell proportion analysis

devtools::install_github("rpolicastro/scProportionTest") ## permutation testing, citation

BiocManager::install("speckle") ## propeller method, t-test for 2 groups, ANOVA test for 3+ groups, strictest method because it takes into account biological replicates



## ----------- DNA methylation packages ---------------------

## For methylation array analysis (missMethyl has a gometh function for gene ontology analysis that accounts for the number of probes per gene)

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("missMethyl")


## Differentially methylated region analysis (DNA methylation data) - three methods

if (!require("BiocManager", quietly = TRUE))

    install.packages("BiocManager")

BiocManager::install("ChAMP")

BiocManager::install("bumphunter")

BiocManager::install("DMRcate")


## ----------- Python and R play nice ----------------------

install.packages("renv")  ## to control the environment, see advice from scanpy



## ----------- Requires extra software ---------------------

## Per Seurat message: "For a (much!) faster implementation of the Wilcoxon Rank Sum Test, (default method for FindMarkers) please install the presto package". If using Windows, install Git for Windows first, linked from microsoft.com.

# install.packages("devtools")

# devtools::install_github("immunogenomics/presto")


## ----------- Notes for Linux ---------------------

## Linux installations of R packages may require additional Linux tools. Check red text error messages and install necessary dependencies from the Linux terminal, then try again in RStudio.

## You can also install R packages from the Linux terminal. Example for Debian/Ubuntu and the R package "fftwtools":
##  sudo apt-get install r-cran-fftwtools
