June 12, 2023

Study questions for sequencing research articles

Study questions for a sequencing article:

What is the article citation? Include the full article title, a PubMed link to the article, and a link to the article's original publisher.

Who is the corresponding author and what is their institutional affiliation? The corresponding author is usually the last author on the author list. They will have an asterisk or another symbol by their name. This is the person who oversees the project and makes publication decisions.

What samples did the study use (tissue type, gestational age, and source of sample)? Define any acronyms used. Some examples:

  • Placenta - but at what gestational age? This is usually given in weeks. If pre-delivery, were the samples from terminations or from continuing pregnancies? 
  • Cord blood - this is collected at delivery. Were the pregnancies "full term" or "pre-term" or otherwise described?
  • Plasma - when collected?
  • Serum - when collected?

Did they collect the samples themselves? If they are analyzing data from another paper, what is the citation for that other study? Check methods to see if they indicate an NCBI GEO accession ID (begins with "GSE").
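If the methods section is long, a quick text search can help spot GEO series accessions. The snippet below is only a minimal sketch: the function name and the example sentence are made up for illustration, and GSE000000 is a placeholder, not a real dataset. It assumes you have already copied the methods text into a string.

```python
import re

# GEO series accessions are "GSE" followed by digits.
# Requiring digits avoids matching "GSEA" (gene set enrichment analysis).
GSE_PATTERN = re.compile(r"\bGSE\d+\b")

def find_geo_series(text):
    """Return unique GSE accessions in order of first appearance."""
    seen = []
    for match in GSE_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen

# Hypothetical methods sentence with a placeholder accession.
methods = (
    "Raw data were deposited in NCBI GEO under accession GSE000000. "
    "GSEA was run on the ranked gene list (see GSE000000)."
)
print(find_geo_series(methods))
```

A hit only tells you that an accession is mentioned. You still need to read the surrounding sentence to see whether the authors generated that dataset or reused it from another paper.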

What sequencing or array method(s) did they use? Some examples:
  • total RNA-sequencing: sequences all RNAs after a depletion of ribosomal RNAs.
  • mRNA-sequencing: only sequences the polyadenylated RNAs (with polyA tails) using oligo(dT) primers or probes during the library preparation.
  • miRNA-sequencing, microRNA-sequencing, or small RNA-sequencing: sequences small RNAs within a specific size range.
  • Bisulfite sequencing: a sequencing-based method to measure DNA methylation.
  • DNA methylation array: an array-based method to measure DNA methylation, but which one? The two most common arrays are the 450k array (older) and the EPIC array (newer, nearly twice the size at over 850k sites). Some papers also use custom arrays.
  • Vocabulary to help distinguish methods:
    • "Total RNA" = all RNA isolated from the sample before any depletion or enrichment step. The use of the term "total RNA" does not indicate the sequencing method.
    • Library preparation is the step that converts RNA into cDNA before RNA-sequencing. The cDNA is what is actually sequenced.
    • DNA is also bisulfite converted for methylation arrays, so the word "bisulfite" in the methods does not by itself indicate that methylation was measured with bisulfite sequencing.

Did they use bulk sequencing, single cell sequencing, single nuclei sequencing, or multiple methods? Bulk means a whole sample was processed at once. Single cell and single nuclei sequencing require separating (or "dissociating") the tissue and capturing individual cells or nuclei.

How were the samples stored between collection and nucleotide isolation (or collection and tissue dissociation)?

  • How were samples frozen? Did they use a preservative or cryoprotectant such as RNAlater, CryoStor CS10, CryoStor CS5, DMSO, or glycerol? Do they specify a percentage, such as 10% DMSO?
  • What temperature was used to store samples long-term?

For bulk studies - What method was used for DNA or RNA isolation? Was it a commercial kit (which one) or a phenol/chloroform method or something else?

For single cell or single nuclei studies - Was there a red blood cell (RBC) lysis step? Was there a cell sorting step and if so, what population of cells did they capture and how did they identify it?

What study groups were compared in the analysis? Describe how the groups were defined.  

How many samples are in each group? Define any acronyms or uncommon words. Note that some articles list two numbers: the starting number of samples, then the number of samples they used for the final analysis after describing their criteria for inclusion/exclusion. What was the final number analyzed in sequencing or array analysis?

How did they define significance? What statistics and thresholds did they use?
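Many papers report an FDR or "adjusted p-value" instead of a raw p-value. To see why the distinction matters, here is a minimal sketch of the Benjamini-Hochberg adjustment, one common way to compute an FDR. The p-values are made up for illustration, and a given paper may use a different correction (Bonferroni, for example), so check the methods for what was actually done.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for five hypothetical genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(pvals)

raw_hits = [p for p in pvals if p < 0.05]
fdr_hits = [p for p, q in zip(pvals, fdr) if q < 0.05]
print(len(raw_hits), "genes pass p<0.05")
print(len(fdr_hits), "genes pass FDR<0.05")
```

In this toy example, four genes pass a raw p<0.05 cutoff but only two pass FDR<0.05, which is why it helps to record both the statistic and the threshold a paper used.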

How many significant genes, probes, and/or regions did they find?

  • RNA-seq studies will report differentially expressed genes (DEGs) and use words like "upregulated" and "downregulated", or "[group1]-biased" or "[group2]-biased" to indicate direction of higher expression.
  • DNA methylation studies will report differentially methylated probes (DMPs) and regions (DMRs).
  • The exact vocabulary might vary from paper to paper (e.g. "CpG sites" instead of "probes").

What genes did they highlight? Pick 1-3 genes and briefly discuss the main results. Keep in mind that saying a gene is "significant" doesn't give enough information. What about it is significant? In what direction?

  • Bad: "ZNF300 was significant."
  • Good: "ZNF300 gene expression was significantly higher in females, compared to males."
  • Better: "In first trimester human placenta, ZNF300 gene expression was 1.58-fold higher in females, compared to males, with FDR<0.05." 
  • (You don't need to repeat the sample type or species every time, but it's helpful to clarify at least once. The magnitude of the change is helpful to record, too, especially when other genes might be >10-fold different.)
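When recording the magnitude of a change, note that many papers report log2 fold change rather than fold change. A minimal sketch for converting between the two is below. The function names are my own, and it assumes the paper's log2 fold change uses the same direction you are describing (for example, female over male).

```python
import math

def to_log2_fc(fold_change):
    """Convert a fold change (e.g. 1.58) to log2 fold change."""
    return math.log2(fold_change)

def to_fold_change(log2_fc):
    """Convert a log2 fold change back to a fold change."""
    return 2 ** log2_fc

print(round(to_log2_fc(1.58), 2))  # 0.66
print(round(to_log2_fc(10), 2))    # 3.32
print(to_fold_change(-1))          # 0.5, i.e. half the expression
```

A negative log2 fold change means lower expression in the first group, so always check which group is the reference before writing down the direction.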


What were their main conclusions? Write 1-5 bullet point notes or sentences. What are the highlights of the paper? What did they accomplish and find?


--------------------------------------------------------

Last updated May 2, 2024
