Get DESeq2 results using reduced dispersion matrices

This function calls DESeq2::DESeq and DESeq2::results on a pre-existing DESeqDataSet object and returns a DESeqResults table for one or more pairwise comparisons. However, unlike a standard call to DESeq2::results using the contrast argument, this function subsets the dataset so that DESeq2 only estimates dispersion for the samples being compared, and not for all samples present.

getDESeqResults(
  dds,
  contrast.numer,
  contrast.denom,
  comparisons = NULL,
  sizeFactors = NULL,
  alpha = 0.1,
  lfcShrink = FALSE,
  args.DESeq = NULL,
  args.results = NULL,
  args.lfcShrink = NULL,
  ncores = getOption("mc.cores", 2L),
  quiet = FALSE
)

Arguments

dds: A DESeqDataSet object, produced using either getDESeqDataSet from this package or DESeqDataSet from DESeq2. If dds was not created using getDESeqDataSet, dds must be made with design = ~condition such that a unique condition level exists for each sample/treatment condition.
contrast.numer: A string naming the condition to use as the numerator in the DESeq2 comparison, typically the perturbative condition.
contrast.denom: A string naming the condition to use as the denominator in the DESeq2 comparison, typically the control condition.
comparisons: As an optional alternative to supplying a single contrast.numer and contrast.denom, users can supply a list of character vectors containing numerator-denominator pairs, e.g. list(c("B", "A"), c("C", "A"), c("C", "B")). comparisons can also be a dataframe in which each row is a comparison, the first column contains the numerators, and the second column contains the denominators.
sizeFactors: A vector containing DESeq2 sizeFactors to apply to each sample. Each sample's readcounts are divided by its respective DESeq2 sizeFactor. A warning will be generated if the DESeqDataSet already contains sizeFactors, and the previous sizeFactors will be over-written.
alpha: The significance threshold passed to DESeqResults, which is used for independent filtering of results (see DESeq2 documentation).
lfcShrink: Logical indicating if log2FoldChanges and their standard errors should be shrunk using lfcShrink. LFC shrinkage is very useful for making fold-change values meaningful, as low-expression/high variance genes are given low fold-changes. Set to FALSE by default.
args.DESeq: Additional arguments passed to DESeq, given as a list of argument-value pairs, e.g. list(fitType = "local", useT = TRUE). All arguments given here will be passed to DESeq except for object and parallel. If no arguments are given, all defaults will be used.
args.results: Additional arguments passed to DESeq2::results, given as a list of argument-value pairs, e.g. list(altHypothesis = "greater", lfcThreshold = 1.5). All arguments given here will be passed to results except for object, contrast, alpha, and parallel. If no arguments are given, all defaults will be used.
args.lfcShrink: Additional arguments passed to lfcShrink, given as a list of argument-value pairs. All arguments given here will be passed to lfcShrink except for dds, coef, contrast, and parallel. If no arguments are given, all defaults will be used.
ncores: The number of cores to use for parallel processing. Multicore processing is only used if more than one comparison is being made (i.e. argument comparisons is used), and the number of cores utilized will not be greater than the number of comparisons being performed.
quiet: If TRUE, all output messages from calls to DESeq and results will be suppressed, although passing option quiet in args.DESeq will supersede this option for the call to DESeq.

Value

For a single comparison, the output is the DESeqResults result table. If a comparisons is used to make multiple comparisons, the output is a named list of DESeqResults objects, with elements named following the pattern "X_vs_Y", where X is the name of the numerator condition, and Y is the name of the denominator condition.

Errors when `ncores > 1`

If this function returns an error, set ncores = 1. Whether or not this occurs can depend on whether users are using alternative BLAS libraries (e.g. OpenBLAS or Apple's Accelerate framework) and/or how DESeq2 was installed. This is because some DESeq2 functions (e.g. nbinomWaldTest) use C code that can be compiled to use parallelization, and this conflicts with our use of process forking (via the parallel package) when ncores > 1.

Author

Mike DeBerardine

Examples

#--------------------------------------------------#
# getDESeqDataSet
#--------------------------------------------------#
suppressPackageStartupMessages(require(DESeq2))
data("PROseq") # import included PROseq data
data("txs_dm6_chr4") # import included transcripts

# divide PROseq data into 6 toy datasets
ps_a_rep1 <- PROseq[seq(1, length(PROseq), 6)]
ps_b_rep1 <- PROseq[seq(2, length(PROseq), 6)]
ps_c_rep1 <- PROseq[seq(3, length(PROseq), 6)]

ps_a_rep2 <- PROseq[seq(4, length(PROseq), 6)]
ps_b_rep2 <- PROseq[seq(5, length(PROseq), 6)]
ps_c_rep2 <- PROseq[seq(6, length(PROseq), 6)]

ps_list <- list(A_rep1 = ps_a_rep1, A_rep2 = ps_a_rep2,
                B_rep1 = ps_b_rep1, B_rep2 = ps_b_rep2,
                C_rep1 = ps_c_rep1, C_rep2 = ps_c_rep2)

# make flawed dataset (ranges in txs_dm6_chr4 not disjoint)
#    this means there is double-counting
# also using discontinuous gene regions, as gene_ids are repeated
dds <- getDESeqDataSet(ps_list,
                       txs_dm6_chr4,
                       gene_names = txs_dm6_chr4$gene_id,
                       ncores = 1)

dds
#> class: DESeqDataSet 
#> dim: 111 6 
#> metadata(1): version
#> assays(1): counts
#> rownames(111): FBgn0267363 FBgn0266617 ... FBgn0039924 FBgn0027101
#> rowData names(2): tx_name gene_id
#> colnames(6): A_rep1 A_rep2 ... C_rep1 C_rep2
#> colData names(2): condition replicate

#--------------------------------------------------#
# getDESeqResults
#--------------------------------------------------#

res <- getDESeqResults(dds, "B", "A")
#> estimating size factors
#> estimating dispersions
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing

res
#> log2 fold change (MLE): condition B vs A 
#> Wald test p-value: condition B vs A 
#> DataFrame with 111 rows and 6 columns
#>                baseMean log2FoldChange     lfcSE        stat    pvalue
#>               <numeric>      <numeric> <numeric>   <numeric> <numeric>
#> FBgn0267363    0.254249     -1.4228145  4.997084 -0.28472898  0.775852
#> FBgn0266617   10.800541      0.1563201  1.033936  0.15118944  0.879826
#> FBgn0265633    2.057706      0.0175256  2.125748  0.00824443  0.993422
#> FBgn0264617   66.127353      0.2087670  0.467572  0.44649198  0.655242
#> FBgn0085432 4678.846158     -0.2333084  0.236723 -0.98557490  0.324342
#> ...                 ...            ...       ...         ...       ...
#> FBgn0039928     685.650       0.126857  0.258212    0.491292  0.623220
#> FBgn0052017     130.983       0.239103  0.418935    0.570741  0.568175
#> FBgn0002521     174.375      -0.110680  0.342843   -0.322829  0.746825
#> FBgn0039924     161.131       0.322486  0.456555    0.706346  0.479973
#> FBgn0027101     361.983       0.233889  0.294518    0.794142  0.427113
#>                  padj
#>             <numeric>
#> FBgn0267363  0.997413
#> FBgn0266617  0.997413
#> FBgn0265633  0.997413
#> FBgn0264617  0.997413
#> FBgn0085432  0.997413
#> ...               ...
#> FBgn0039928  0.997413
#> FBgn0052017  0.997413
#> FBgn0002521  0.997413
#> FBgn0039924  0.997413
#> FBgn0027101  0.997413

reslist <- getDESeqResults(dds, comparisons = list(c("B", "A"), c("C", "A")),
                           ncores = 1)
names(reslist)
#> [1] "B_vs_A" "C_vs_A"

reslist$B_vs_A
#> log2 fold change (MLE): condition B vs A 
#> Wald test p-value: condition B vs A 
#> DataFrame with 111 rows and 6 columns
#>                baseMean log2FoldChange     lfcSE        stat    pvalue
#>               <numeric>      <numeric> <numeric>   <numeric> <numeric>
#> FBgn0267363    0.254249     -1.4228145  4.997084 -0.28472898  0.775852
#> FBgn0266617   10.800541      0.1563201  1.033936  0.15118944  0.879826
#> FBgn0265633    2.057706      0.0175256  2.125748  0.00824443  0.993422
#> FBgn0264617   66.127353      0.2087670  0.467572  0.44649198  0.655242
#> FBgn0085432 4678.846158     -0.2333084  0.236723 -0.98557490  0.324342
#> ...                 ...            ...       ...         ...       ...
#> FBgn0039928     685.650       0.126857  0.258212    0.491292  0.623220
#> FBgn0052017     130.983       0.239103  0.418935    0.570741  0.568175
#> FBgn0002521     174.375      -0.110680  0.342843   -0.322829  0.746825
#> FBgn0039924     161.131       0.322486  0.456555    0.706346  0.479973
#> FBgn0027101     361.983       0.233889  0.294518    0.794142  0.427113
#>                  padj
#>             <numeric>
#> FBgn0267363  0.997413
#> FBgn0266617  0.997413
#> FBgn0265633  0.997413
#> FBgn0264617  0.997413
#> FBgn0085432  0.997413
#> ...               ...
#> FBgn0039928  0.997413
#> FBgn0052017  0.997413
#> FBgn0002521  0.997413
#> FBgn0039924  0.997413
#> FBgn0027101  0.997413

# or using a dataframe
reslist <- getDESeqResults(dds, comparisons = data.frame(num = c("B", "C"),
                                                         den = c("A", "A")),
                           ncores = 1)
reslist$B_vs_A
#> log2 fold change (MLE): condition B vs A 
#> Wald test p-value: condition B vs A 
#> DataFrame with 111 rows and 6 columns
#>                baseMean log2FoldChange     lfcSE        stat    pvalue
#>               <numeric>      <numeric> <numeric>   <numeric> <numeric>
#> FBgn0267363    0.254249     -1.4228145  4.997084 -0.28472898  0.775852
#> FBgn0266617   10.800541      0.1563201  1.033936  0.15118944  0.879826
#> FBgn0265633    2.057706      0.0175256  2.125748  0.00824443  0.993422
#> FBgn0264617   66.127353      0.2087670  0.467572  0.44649198  0.655242
#> FBgn0085432 4678.846158     -0.2333084  0.236723 -0.98557490  0.324342
#> ...                 ...            ...       ...         ...       ...
#> FBgn0039928     685.650       0.126857  0.258212    0.491292  0.623220
#> FBgn0052017     130.983       0.239103  0.418935    0.570741  0.568175
#> FBgn0002521     174.375      -0.110680  0.342843   -0.322829  0.746825
#> FBgn0039924     161.131       0.322486  0.456555    0.706346  0.479973
#> FBgn0027101     361.983       0.233889  0.294518    0.794142  0.427113
#>                  padj
#>             <numeric>
#> FBgn0267363  0.997413
#> FBgn0266617  0.997413
#> FBgn0265633  0.997413
#> FBgn0264617  0.997413
#> FBgn0085432  0.997413
#> ...               ...
#> FBgn0039928  0.997413
#> FBgn0052017  0.997413
#> FBgn0002521  0.997413
#> FBgn0039924  0.997413
#> FBgn0027101  0.997413