Enrichr Algorithm
Purpose and Usage
Scenario 1: Enrichment analysis of significantly differentially expressed genes obtained from SDAS DEG analysis
SDAS geneSetEnrichment enrichr \
    -i de_t-test.anno_rctd.SmoothMuscle-vs-Endo.sig_filtered.csv -o outdir \
    --species human
Scenario 2: Enrichment analysis of significantly differentially expressed genes using only databases of interest
SDAS geneSetEnrichment enrichr \
    -i de_t-test.anno_rctd.SmoothMuscle-vs-Endo.sig_filtered.csv -o outdir \
    --gmt sdas_deg_enrichment/lib/GSEADB/KEGG_2021_Human.gmt
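Scenario 2 passes a GMT file to --gmt. To inspect which terms such a file contains before running the analysis, the following minimal sketch reads the usual GMT layout (one gene set per line, tab-separated: set name, description, then gene symbols). The function name `read_gmt` and the two-line sample content are hypothetical and only for illustration; this is not part of SDAS.

```python
import io


def read_gmt(handle):
    """Parse GMT text into {term: set_of_genes}.

    Each GMT line is tab-separated: term name, description, gene1, gene2, ...
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        term, _description, *genes = fields
        gene_sets[term] = {g for g in genes if g}
    return gene_sets


# Hypothetical two-term sample; real files such as KEGG_2021_Human.gmt are much larger.
sample = io.StringIO(
    "Term A\t\tGENE1\tGENE2\tGENE3\n"
    "Term B\t\tGENE2\tGENE4\n"
)
gene_sets = read_gmt(sample)
for term, genes in gene_sets.items():
    print(term, len(genes))
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `read_gmt(open(path))` with `path` pointing at the GMT file.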
Input Parameter Description
| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -i / --input | Yes | | CSV file of significant differentially expressed genes (DEGs) used as the Enrichr input. |
| -o / --output | Yes | | The GSEApy output directory. Default: the current working directory. |
| --species | No | human | Use the built-in GMT database: human or mouse. Default: human. For more databases, see https://amp.pharm.mssm.edu/modEnrichr. |
| --gmt | No | | Enrichr library name(s) or GMT file path(s). Separate multiple entries with ",". Default: the built-in database selected by --species. |
| --cut_off | No | 1 | Adjusted p-value cutoff used for generating plots. Default: 0.05. |
| --background | No | | Choose the background from one of the following. (1) A BioMart dataset name, e.g. 'hsapiens_gene_ensembl'. (2) A total gene number, e.g. 20000. Only works for GMT file input. (3) A text file containing the background gene list (one gene per row). Gene identifiers must match those in your input (-i). (4) Default: None, meaning all genes in the gene set input (--gmt) are used as the background. |
| --top_term | No | 10 | Number of top terms shown in the plot. Default: 10. |
| -v / --verbose | No | | Increase output verbosity and print the progress of the job. Default: False. |
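To see why the --background choice matters, the sketch below computes an over-representation p-value for the same overlap under two different background sizes. It assumes a hypergeometric test, which is a common choice for this kind of analysis; the parameter descriptions above do not state which test SDAS applies, and all numbers are made up for illustration.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from a background of M genes,
    n of which belong to the gene set."""
    upper = min(n, N)
    total = comb(M, N)
    favorable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical numbers: 200 input genes, a 100-gene term, 10 genes in common.
overlap, term_size, n_input = 10, 100, 200

p_large_bg = hypergeom_sf(overlap, 20000, term_size, n_input)  # background of 20000 genes
p_small_bg = hypergeom_sf(overlap, 5000, term_size, n_input)   # background of 5000 genes

print(f"background 20000: p = {p_large_bg:.3g}")
print(f"background  5000: p = {p_small_bg:.3g}")
```

With a smaller background, the same overlap is less surprising, so the p-value is larger. This is why a background that matches the genes actually measured in the experiment is usually preferable to a generic one.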
Output Results Display
| File | Description |
| --- | --- |
| enrichment_{database}.UP.csv | Enrichment analysis results for upregulated genes |
| enrichment_{database}.DOWN.csv | Enrichment analysis results for downregulated genes |
| enrichment_{database}.pdf/png | Bar plot of top enriched pathways for up/down genes |
Enrichment analysis results for up/downregulated genes:
enrichment_{database}.UP.csv and enrichment_{database}.DOWN.csv are the result files for the enrichment analysis of upregulated and downregulated genes, respectively. Each file contains the columns Gene_set, Term, Overlap, P-value, Adjusted P-value, Odds Ratio, Combined Score, and Genes. These give the database name, the pathway name, the number of input genes that overlap the pathway relative to the pathway size (e.g. 43/45), the raw p-value, the adjusted p-value, the odds ratio, the combined score, and the names of the overlapping genes from the input list.
| Gene_set | Term | Overlap | P-value | Adjusted P-value | Odds Ratio | Combined Score | Genes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| KEGG_2021_Human.gmt | ABC transporters | 43/45 | 0.00026880161509715636 | 0.002529897553855589 | 5.888896293211162 | 48.41577841510133 | ABCA3;ABCB4;... |
| KEGG_2021_Human.gmt | AGE-RAGE signaling pathway in diabetic complications | 90/100 | 0.00011055669162020043 | 0.0013606977430178514 | 2.928493469422023 | 26.678523151510976 | AKT1;PLCB1;... |
| KEGG_2021_Human.gmt | AMPK signaling pathway | 107/120 | 6.61066069653723e-05 | 0.0009525689822440243 | 2.709298083129859 | 26.074940028325518 | AKT1;CREB3;... |
| ... | ... | ... | ... | ... | ... | ... | ... |
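In the example rows above, the Combined Score equals the negative natural logarithm of the P-value multiplied by the Odds Ratio. The sketch below checks that relationship on the three rows shown and also splits the Overlap field into its two counts. The relationship is inferred from these example values rather than stated by the SDAS documentation, so treat it as an observation and not a specification; the helper name `split_overlap` is hypothetical.

```python
import math

# (Term, Overlap, P-value, Odds Ratio, Combined Score) copied from the example rows.
rows = [
    ("ABC transporters", "43/45",
     0.00026880161509715636, 5.888896293211162, 48.41577841510133),
    ("AGE-RAGE signaling pathway in diabetic complications", "90/100",
     0.00011055669162020043, 2.928493469422023, 26.678523151510976),
    ("AMPK signaling pathway", "107/120",
     6.61066069653723e-05, 2.709298083129859, 26.074940028325518),
]


def split_overlap(overlap):
    """Split an Overlap string such as '43/45' into two integers."""
    hits, size = overlap.split("/")
    return int(hits), int(size)


for term, overlap, p_value, odds_ratio, combined in rows:
    hits, size = split_overlap(overlap)
    recomputed = -math.log(p_value) * odds_ratio
    print(f"{term}: overlap {hits}/{size} = {hits / size:.2f}, "
          f"combined score {combined:.3f}, recomputed {recomputed:.3f}")
```

Because the score multiplies significance by effect size, a term with a moderate p-value but a high odds ratio (such as ABC transporters here) can outrank a term with a smaller p-value.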
Enrichment analysis result bar plot:
enrichment_{database}.pdf/png is the bar plot of the enrichment results. Different colors distinguish the top enriched pathways for upregulated and downregulated genes.
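The --cut_off and --top_term parameters together determine which terms appear in the bar plot. The sketch below illustrates that kind of selection, assuming terms are first filtered by Adjusted P-value and then ranked by ascending Adjusted P-value. The actual ranking used by SDAS is not stated here, the function name `select_top_terms` is hypothetical, and the inline CSV is an abbreviated stand-in for enrichment_{database}.UP.csv that assumes a header row with the column names listed above.

```python
import csv
import io


def select_top_terms(rows, cut_off=0.05, top_term=10):
    """Keep rows with Adjusted P-value <= cut_off, smallest first,
    and return at most top_term of them."""
    kept = [r for r in rows if float(r["Adjusted P-value"]) <= cut_off]
    kept.sort(key=lambda r: float(r["Adjusted P-value"]))
    return kept[:top_term]


# Abbreviated stand-in for a result file, using values from the example rows.
text = """Term,Adjusted P-value
ABC transporters,0.002529897553855589
AGE-RAGE signaling pathway in diabetic complications,0.0013606977430178514
AMPK signaling pathway,0.0009525689822440243
"""
rows = list(csv.DictReader(io.StringIO(text)))

top = select_top_terms(rows, cut_off=0.05, top_term=2)
for r in top:
    print(r["Term"], r["Adjusted P-value"])
```

Lowering `cut_off` removes weaker terms before ranking, while `top_term` only caps how many of the remaining terms are drawn.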
