Welcome to de_toolkit’s documentation!

Introduction

de_toolkit is a suite of bioinformatics tools for differential expression analysis and other count-based high-throughput sequencing workflows. Each tool is either implemented directly in Python or provided as a convenience wrapper around R packages using rpy2.

The toolkit is both a Python module and a command line interface that wraps the primary module functions so they are easy to integrate into workflows. For instance, to perform DESeq2 normalization of a counts matrix contained in the file counts_matrix.tsv, you could run on the command line:

detk-norm deseq counts_matrix.tsv > norm_counts_matrix.tsv

The counts in counts_matrix.tsv are normalized using the DESeq2 method and written to norm_counts_matrix.tsv, which is equivalent to the following R code:

library(DESeq2)

# TODO: add actual equivalent code
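
Since the R equivalent above is still a TODO, here is a minimal Python sketch of the median-of-ratios size factor idea that DESeq2 normalization is based on. It is not de_toolkit's or DESeq2's code. The function names `size_factors` and `normalize` and the dict-of-lists input layout are assumptions made purely for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample by the median-of-ratios method.

    counts: dict mapping gene name -> list of raw counts, one per sample
            (hypothetical layout chosen for this sketch).
    """
    n_samples = len(next(iter(counts.values())))

    # Log geometric mean of each gene across samples. Genes with a zero
    # count in any sample are skipped, since their geometric mean is zero.
    log_geo_means = {
        gene: sum(math.log(c) for c in row) / len(row)
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }

    factors = []
    for j in range(n_samples):
        # Log ratio of each usable gene's count to its geometric mean.
        log_ratios = [
            math.log(counts[gene][j]) - lgm
            for gene, lgm in log_geo_means.items()
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(counts)
    return {
        gene: [c / f for c, f in zip(row, factors)]
        for gene, row in counts.items()
    }
```

In this sketch, a sample sequenced twice as deeply as another gets a size factor twice as large, so dividing by the factors puts the samples on a common scale.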

Module Documentation

The following functionality is (or will be) implemented by the package (items in italics are not yet implemented):

norm - Normalizing Count Matrices
- DESeq2
- trimmed mean
- reference norm
- library size
- FPKM
- user supplied
de - Differential Expression
- DESeq2
- Firth’s Logistic Regression
- t-test
outlier - Outlier Identification
- entropy
- Cook’s distance
transform - Count Transformation
- DESeq2 Variance Stabilizing Transform
- RUVSeq transformation
- trim
- shrink
filter - Filtering Count Matrices
- nonzero
- mean
stats - Count Matrix Statistics
- summary
- dist
- PCA
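
To make two of the simpler entries above concrete, the sketch below shows one common reading of library size normalization and of a nonzero filter in plain Python. The definitions used here (dividing each count by its sample total, and keeping only genes with no zero counts) are assumptions for illustration, and the function names are hypothetical. The package's own behaviour and options may differ.

```python
def library_size_normalize(counts):
    """Divide each count by the total count of its sample (column).

    counts: dict mapping gene name -> list of raw counts, one per sample
            (hypothetical layout chosen for this sketch).
    """
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        # Samples with a total of zero are left as all zeros.
        gene: [c / t if t else 0.0 for c, t in zip(row, totals)]
        for gene, row in counts.items()
    }


def filter_nonzero(counts):
    """Keep only genes whose counts are nonzero in every sample."""
    return {
        gene: row
        for gene, row in counts.items()
        if all(c != 0 for c in row)
    }
```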

Installing

conda package

We suggest installing this package with conda from the bubhub channel:

conda install -c bubhub de_toolkit

Manual installation

If conda is not available, ensure the following packages are installed and available in your environment:

Python packages (python>=3.5)
- docopt
- pandas
- numpy
R packages (R>=3.2)
- docopt
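
As a quick sanity check for a manual installation, the following sketch reports which of the Python packages listed above cannot be found in the current environment. It uses only the standard library. The helper name `missing_modules` is hypothetical, and the check does not cover the R packages.

```python
from importlib.util import find_spec

# Python packages named in the list above.
REQUIRED_MODULES = ["docopt", "pandas", "numpy"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("All required Python packages were found.")
```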

The following packages are only required to use the corresponding submodule functions:

R packages
- DESeq2 (bioconductor)
- RUVSeq (bioconductor)
- logistf (CRAN)

We suggest using anaconda to create an environment that contains the necessary software, e.g.:

conda create -n de_toolkit python=3.5

./install_conda_packages.sh

# if you want to use the R functions (Firth, DESeq2, etc.)
Rscript install_r_packages.sh

During development, install the toolkit with the setup.py script before running it:

python setup.py install

This should make detk and its subtools available on the command line. Whenever you change the code, you will need to run this command again.
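
One way to confirm that the install step worked is to look the commands up on the PATH. The sketch below does this with the standard library. The helper name `locate_tools` is hypothetical, and `detk` and `detk-norm` are simply the command names mentioned in this document.

```python
import shutil


def locate_tools(names):
    """Map each command name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools(["detk", "detk-norm"]).items():
        print(name, "->", path or "not found on PATH")
```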
