Differentially Expressed Heterogeneous Overdispersion Gene Test for Count Data

Introduction

DEHOGT is designed to handle overdispersion in count data using a generalized linear model (GLM) framework. The package supports quasi-Poisson and negative binomial models, making it useful for differential expression analysis of RNA-seq and other count-based data types.
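
To make the two model families concrete, the sketch below fits a quasi-Poisson and a negative binomial GLM to one simulated, overdispersed gene. It uses only base R's `glm()` and `MASS::glm.nb()` and is not DEHOGT's internal code. The two-group design, the means, and the `size` parameter are assumptions chosen for illustration.

```r
# Illustrative only: base-R fits for a single simulated gene, not DEHOGT internals.
library(MASS)  # provides glm.nb(); MASS ships with standard R installations

set.seed(1)
group <- rep(0:1, each = 20)               # hypothetical two-group design
mu <- ifelse(group == 1, 20, 10)           # assumed true mean per group
counts <- rnbinom(40, mu = mu, size = 2)   # overdispersed: var = mu + mu^2 / size

# Quasi-Poisson: Poisson mean model with a freely estimated dispersion parameter
fit_qp <- glm(counts ~ group, family = quasipoisson(link = "log"))
dispersion <- summary(fit_qp)$dispersion   # values well above 1 indicate overdispersion

# Negative binomial: extra variance is modeled through the parameter theta
fit_nb <- glm.nb(counts ~ group)

dispersion                                 # estimated quasi-Poisson dispersion
fit_nb$theta                               # estimated negative binomial theta
coef(summary(fit_qp))["group", ]           # group effect on the log scale
```

Under a plain Poisson model the dispersion would be fixed at 1. Both fits above relax that constraint, which is the situation the package targets.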

Installation

if (!require("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("DEHOGT")

Example Workflow

In this example, we simulate gene expression data and perform differential expression analysis using the quasi-Poisson model. We also show how to incorporate covariates and normalization factors.

## Simulate gene expression data (100 genes, 10 samples)
data <- matrix(rpois(1000, 10), nrow = 100, ncol = 10)

## Randomly assign treatment groups (balanced, so both groups are always present)
treatment <- sample(rep(0:1, each = 5))

## Load DEHOGT package
library(DEHOGT)

## Run the function with 2 CPU cores
result <- dehogt_func(data, treatment, num_cores = 2)

## Display results
head(result$pvals)
## [1] 0.8872323 0.4171896 0.5574052 0.5831349 0.2603625 0.1662555

## Example: adding covariates and normalization factors
covariates <- matrix(rnorm(1000), nrow = 100, ncol = 10)
norm_factors <- rep(1, 10)

## Run with covariates and normalization factors
result_cov <- dehogt_func(
    data, treatment,
    covariates = covariates,
    norm_factors = norm_factors,
    num_cores = 2
)

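
The example above passes `rep(1, 10)` as normalization factors, which leaves the counts unscaled. As a rough sketch of a non-trivial choice, the block below derives per-sample factors from library sizes. Treating `norm_factors` as library-size scaling factors is an assumption made here, not something the package documentation is quoted as stating, and the name `result_norm` is hypothetical.

```r
# Illustrative sketch: library-size scaling factors.
# Assumption: `norm_factors` holds one multiplicative factor per sample.
set.seed(42)
counts <- matrix(rpois(1000, 10), nrow = 100, ncol = 10)  # genes x samples

lib_sizes <- colSums(counts)                  # total counts per sample
norm_factors <- lib_sizes / mean(lib_sizes)   # rescaled so the factors average 1

round(norm_factors, 3)

# Hypothetical use, mirroring the call above (requires DEHOGT and `treatment`):
# result_norm <- dehogt_func(counts, treatment, norm_factors = norm_factors, num_cores = 2)
```

Dividing by the mean library size keeps the factors centered on 1, so a perfectly balanced experiment reduces to the `rep(1, 10)` case.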
Session Info

sessionInfo()

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] DEHOGT_0.99.0    BiocStyle_2.33.1

loaded via a namespace (and not attached):
 [1] doParallel_1.0.17   cli_3.6.3           knitr_1.48
 [4] rlang_1.1.4         xfun_0.47           jsonlite_1.8.8
 [7] buildtools_1.0.0    htmltools_0.5.8.1   maketools_1.3.0
[10] sys_3.4.2           sass_0.4.9          rmarkdown_2.28
[13] evaluate_0.24.0     jquerylib_0.1.4     MASS_7.3-61
[16] fastmap_1.2.0       yaml_2.3.10         foreach_1.5.2
[19] lifecycle_1.0.4     BiocManager_1.30.25 compiler_4.4.1
[22] codetools_0.2-20    digest_0.6.37       R6_2.5.1
[25] parallel_4.4.1      bslib_0.8.0         tools_4.4.1
[28] iterators_1.0.14    cachem_1.1.0