Enrichr Algorithm

Purpose and Usage

  • Scenario 1: Enrichment analysis of significantly differentially expressed genes obtained from SDAS DEG analysis

    SDAS geneSetEnrichment enrichr \
    -i de_t-test.anno_rctd.SmoothMuscle-vs-Endo.sig_filtered.csv -o outdir \
    --species human
  • Scenario 2: Enrichment analysis of significantly differentially expressed genes using only databases of interest

    SDAS geneSetEnrichment enrichr \
    -i de_t-test.anno_rctd.SmoothMuscle-vs-Endo.sig_filtered.csv -o outdir \
    --gmt sdas_deg_enrichment/lib/GSEADB/KEGG_2021_Human.gmt
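
When the same analysis has to be repeated over several DEG files, the two invocations above can be assembled programmatically. The following is a minimal sketch that only builds and prints the command lines, using the flags shown in the two scenarios. The helper name `build_enrichr_cmd` is hypothetical, and actually running a command assumes the `SDAS` executable is available on `PATH`.

```python
import shlex

def build_enrichr_cmd(deg_csv, outdir, species=None, gmt=None):
    """Assemble an `SDAS geneSetEnrichment enrichr` command line (hypothetical helper)."""
    cmd = ["SDAS", "geneSetEnrichment", "enrichr", "-i", deg_csv, "-o", outdir]
    if species is not None:
        cmd += ["--species", species]   # Scenario 1: built-in database
    if gmt is not None:
        cmd += ["--gmt", gmt]           # Scenario 2: only the databases of interest
    return cmd

deg = "de_t-test.anno_rctd.SmoothMuscle-vs-Endo.sig_filtered.csv"
scenario1 = build_enrichr_cmd(deg, "outdir", species="human")
scenario2 = build_enrichr_cmd(
    deg, "outdir", gmt="sdas_deg_enrichment/lib/GSEADB/KEGG_2021_Human.gmt"
)

# Print shell-quoted commands for inspection.
print(shlex.join(scenario1))
print(shlex.join(scenario2))
# To execute one of them: subprocess.run(scenario1, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems when file names contain spaces.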

Input Parameter Description

| Enrichr Parameter | Required | Default Value | Description |
| --- | --- | --- | --- |
| -i / --input | Yes | | CSV file of significantly differentially expressed genes used as input to Enrichr. |
| -o / --output | Yes | | The GSEApy output directory. Default: the current working directory. |
| --species | No | human | Use the built-in GMT database: human or mouse. Default: human. For more databases, see https://amp.pharm.mssm.edu/modEnrichr. |
| --gmt | No | | Enrichr library name(s) or GMT file(s) to use. Separate multiple names with ",". Default: the built-in database selected by --species. |
| --cut_off | No | 1 | Adjusted p-value cutoff, used for generating plots. Default: 0.05. |
| --background | No | | Background gene set. Choose one of the following: (1) a BioMart dataset name, e.g. 'hsapiens_gene_ensembl'; (2) a total gene number, e.g. 20000 (only works for GMT file input); (3) a text file containing the background gene list, one gene per row, with the same gene identifiers as the input (-i); (4) default: None, meaning all genes in the (-g) input are used as the background. |
| --top_term | No | 10 | Number of top terms shown in the plot. Default: 10. |
| -v / --verbose | No | | Increase output verbosity and print the progress of the job. Default: False. |
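
To illustrate why the --background choice matters, the sketch below computes an over-representation p-value with a hypergeometric test for two different total gene numbers. This is an assumption about the kind of test involved, not a statement of what SDAS computes internally. The function name `hypergeom_pvalue` and all gene counts are made up for illustration.

```python
from math import comb

def hypergeom_pvalue(overlap, term_size, n_input, n_background):
    """P(X >= overlap) when n_input genes are drawn without replacement from
    n_background genes, term_size of which belong to the term."""
    total = comb(n_background, n_input)
    upper = min(term_size, n_input)
    favorable = sum(
        comb(term_size, i) * comb(n_background - term_size, n_input - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total

# Illustrative numbers only: 10 of 200 input genes hit a 45-gene term.
p_20000 = hypergeom_pvalue(10, 45, 200, 20000)  # background given as a total gene number
p_5000 = hypergeom_pvalue(10, 45, 200, 5000)    # a smaller, more restrictive background
print(f"background 20000: {p_20000:.3e}")
print(f"background  5000: {p_5000:.3e}")
```

Under these assumptions, the same overlap is less significant against the smaller background, because more hits are expected by chance.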

Output Results Display

| Enrichr Result File | Description |
| --- | --- |
| enrichment_{database}.UP.csv | Enrichment analysis results for upregulated genes |
| enrichment_{database}.DOWN.csv | Enrichment analysis results for downregulated genes |
| enrichment_{database}.pdf/png | Bar plot of top enriched pathways for up/down genes |

  • Enrichment analysis results for up/downregulated genes: enrichment_{database}.UP/DOWN.csv are the result files for enrichment analysis of up- and downregulated genes, respectively. The files contain the columns Gene_set, Term, Overlap, P-value, Adjusted P-value, Odds Ratio, Combined Score, Genes, representing the database name, specific pathway name, number and proportion of overlapping genes in the input list, original p-value, adjusted p-value, odds ratio, combined score, and the specific overlapping gene names in the input list.

| Gene_set | Term | Overlap | P-value | Adjusted P-value | Odds Ratio | Combined Score | Genes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| KEGG_2021_Human.gmt | ABC transporters | 43/45 | 0.00026880161509715636 | 0.002529897553855589 | 5.888896293211162 | 48.41577841510133 | ABCA3;ABCB4;... |
| KEGG_2021_Human.gmt | AGE-RAGE signaling pathway in diabetic complications | 90/100 | 0.00011055669162020043 | 0.0013606977430178514 | 2.928493469422023 | 26.678523151510976 | AKT1;PLCB1;... |
| KEGG_2021_Human.gmt | AMPK signaling pathway | 107/120 | 6.61066069653723e-05 | 0.0009525689822440243 | 2.709298083129859 | 26.074940028325518 | AKT1;CREB3;... |
| ... | ... | ... | ... | ... | ... | ... | ... |
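
The result columns can be post-processed with a few lines of Python. The sketch below parses the three example rows shown above, splits Overlap into its two counts, and checks one relationship visible in the sample data: Combined Score matches -ln(P-value) × Odds Ratio. That relationship is an observation from these rows, not a documented SDAS formula. The sketch also assumes the real file is comma-separated with exactly these headers, and the Genes values are kept truncated as in the example.

```python
import csv
import io
import math

# The three example rows from the table above, written as CSV text.
sample = """Gene_set,Term,Overlap,P-value,Adjusted P-value,Odds Ratio,Combined Score,Genes
KEGG_2021_Human.gmt,ABC transporters,43/45,0.00026880161509715636,0.002529897553855589,5.888896293211162,48.41577841510133,ABCA3;ABCB4;...
KEGG_2021_Human.gmt,AGE-RAGE signaling pathway in diabetic complications,90/100,0.00011055669162020043,0.0013606977430178514,2.928493469422023,26.678523151510976,AKT1;PLCB1;...
KEGG_2021_Human.gmt,AMPK signaling pathway,107/120,6.61066069653723e-05,0.0009525689822440243,2.709298083129859,26.074940028325518,AKT1;CREB3;...
"""

parsed = []
for row in csv.DictReader(io.StringIO(sample)):
    hits, total = (int(x) for x in row["Overlap"].split("/"))
    p_value = float(row["P-value"])
    odds_ratio = float(row["Odds Ratio"])
    parsed.append({
        "term": row["Term"],
        "hits": hits,
        "total": total,
        "overlap_fraction": hits / total,
        "combined_score": float(row["Combined Score"]),
        # Relationship observed in the sample rows (an assumption, see lead-in).
        "implied_score": -math.log(p_value) * odds_ratio,
    })

for r in parsed:
    print(r["term"], r["hits"], r["total"],
          round(r["combined_score"], 3), round(r["implied_score"], 3))
```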

  • Enrichment analysis result bar plot: enrichment_{database}.pdf/png. Different colors represent the top enriched pathways for up- and downregulated genes.
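
How the plotted terms relate to --cut_off and --top_term can be sketched as a filter followed by a sort. The exact selection rule used by SDAS is not stated in this document, so the sketch assumes that terms with an adjusted p-value at or below the cutoff are kept and the smallest values are shown first. The function name `select_top_terms` is hypothetical, and the rows reuse the example values from the result table.

```python
def select_top_terms(rows, cut_off, top_term):
    """Keep terms with Adjusted P-value <= cut_off and return the top_term most
    significant ones (assumed selection rule, hypothetical helper)."""
    kept = [r for r in rows if r["Adjusted P-value"] <= cut_off]
    kept.sort(key=lambda r: r["Adjusted P-value"])
    return kept[:top_term]

rows = [
    {"Term": "ABC transporters", "Adjusted P-value": 0.002529897553855589},
    {"Term": "AGE-RAGE signaling pathway in diabetic complications",
     "Adjusted P-value": 0.0013606977430178514},
    {"Term": "AMPK signaling pathway", "Adjusted P-value": 0.0009525689822440243},
]

top = select_top_terms(rows, cut_off=0.05, top_term=10)
strict = select_top_terms(rows, cut_off=0.001, top_term=10)
print([r["Term"] for r in top])
print([r["Term"] for r in strict])
```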
