Hotspot Algorithm

Purpose

Use the Hotspot algorithm to identify spatially co-expressed gene sets (modules).

Usage

SDAS coexpress hotspot -i st.h5ad -o outdir --bin_size 100 \
--layer raw_counts \
--selected_genes top5000  \
--moran_path ./moran.csv \
--n_cpus 8 \
--seed 42  \
--fdr_cutoff 0.05 \
--model bernoulli
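When the command is launched from a pipeline script, it can be convenient to assemble the arguments as a list rather than a shell string. The following is a minimal sketch, assuming `SDAS` is on the PATH; `build_hotspot_command` is a hypothetical helper, not part of SDAS, and it only passes through the flags documented on this page.

```python
def build_hotspot_command(input_h5ad, outdir, bin_size, **options):
    """Assemble the argument list for `SDAS coexpress hotspot`.

    Hypothetical helper: each keyword option is mapped one-to-one onto a
    long flag, e.g. layer="raw_counts" becomes `--layer raw_counts`.
    """
    cmd = [
        "SDAS", "coexpress", "hotspot",
        "-i", str(input_h5ad),
        "-o", str(outdir),
        "--bin_size", str(bin_size),
    ]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_hotspot_command(
    "st.h5ad", "outdir", 100,
    layer="raw_counts", selected_genes="top5000", moran_path="./moran.csv",
    n_cpus=8, seed=42, fdr_cutoff=0.05, model="bernoulli",
)
print(" ".join(cmd))

# To actually run it (requires SDAS to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing a list avoids shell quoting problems when paths contain spaces.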

Input Parameter Description

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -i / --input | Yes | | Stereo-seq h5ad file; must contain the raw expression matrix |
| -o / --output | Yes | | Output directory |
| --bin_size | Yes | 50 | Bin size (resolution) of the input h5ad: 20, 50, 100, 200, or cellbin. Must be consistent with the input h5ad; required for plotting and calculation |
| --layer | No | | Layer that holds the raw expression matrix in the h5ad (e.g. layers['raw_counts']) |
| --selected_genes | No | top5000 | Gene selection mode: full (all genes) or topn (top n genes ranked by Moran's I) |
| --moran_path | No | | Path to a precomputed Moran's I CSV file |
| --n_cpus | No | 8 | Number of parallel jobs, for a speedup on multi-core machines |
| --seed | No | 42 | Random seed |
| --fdr_cutoff | No | 0.05 | FDR threshold for statistical testing of spatially highly variable genes and co-expression gene sets |
| --model | No | normal | Null model for the gene and module test statistics: danb (depth-adjusted negative binomial), bernoulli (models probability of detection), normal (depth-adjusted normal), none (assumes data has been pre-standardized) |

Output Results Display

| Result File | Description |
| --- | --- |
| <input_name>_hotspot.module.csv | CSV of spatially highly variable genes (gene symbol + gene ID) and the co-expression gene set (module) each gene belongs to |
| <input_name>_hotspot.h5ad | h5ad file containing the co-expression gene set results (adata.obsm['module_score_hotspot']) |
| <input_name>_hotspot_module_score_hotspot.png/pdf | Spatial heatmap of gene set scoring for co-expression gene sets |
| <input_name>_hotspot.all_coex_heatmap.png/pdf | Similarity heatmap of co-expression gene sets |
| <input_name>_hotspot.moran.csv | Written when topn is used for calculation: Moran's I and P values for all genes |

  • Result CSV of co-expression gene sets: <input_name>_hotspot.module.csv (comma-separated). It lists the spatially highly variable genes that Hotspot identified and the co-expression gene set (module) each gene is assigned to.

| geneid | real_gene_name | FDR | Module |
| --- | --- | --- | --- |
| ENSG00000163209 | SPRR3 | 0.0 | Module-1 |
| ENSG00000151632 | AKR1C2 | 0.0 | Module-1 |
| ENSG00000170345 | FOS | 0.0 | Module-1 |
| ENSG00000164433 | FABP5 | 0.0 | Module-1 |
| ENSG00000120129 | DUSP1 | 0.0 | Module-1 |

  • Spatial heatmap of gene set scoring for co-expression gene sets, <input_name>_hotspot_module_score_hotspot.png/pdf: visualizes the spatial distribution of every co-expression gene set (module). Color intensity indicates the expression level of the gene set.

  • Similarity heatmap of co-expression gene sets, <input_name>_hotspot.all_coex_heatmap.png/pdf: shows how the co-expression gene sets (modules) cluster by similarity. Color indicates the similarity between gene sets, with red indicating high similarity.

Result Interpretation

  • Co-expression gene sets are numbered starting from Module1. The label Module-1 (or no module) marks genes that do not meet the clustering requirements for any co-expression gene set.
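The labeling convention can be applied programmatically when post-processing the module CSV. The sketch below uses only the Python standard library to group genes by module and set aside unassigned ones. It assumes the column names real_gene_name and Module from the result CSV, and that unassigned genes carry the literal label Module-1 or an empty value; the function name, gene names, and sample rows are placeholders for illustration, not real results.

```python
import csv
import io
from collections import defaultdict

# Assumption: unassigned genes are labeled "Module-1" or left empty.
UNASSIGNED_LABELS = {"Module-1", ""}


def group_genes_by_module(csv_file):
    """Return (modules, unassigned) read from an open module CSV file."""
    modules = defaultdict(list)
    unassigned = []
    for row in csv.DictReader(csv_file):
        label = (row.get("Module") or "").strip()
        gene = row["real_gene_name"]
        if label in UNASSIGNED_LABELS:
            unassigned.append(gene)
        else:
            modules[label].append(gene)
    return dict(modules), unassigned


# Placeholder rows, not real output.
sample = io.StringIO(
    "geneid,real_gene_name,FDR,Module\n"
    "ID0001,GENE_A,0.0,Module1\n"
    "ID0002,GENE_B,0.0,Module1\n"
    "ID0003,GENE_C,0.0,Module2\n"
    "ID0004,GENE_D,0.0,Module-1\n"
)
modules, unassigned = group_genes_by_module(sample)
print(modules)
print(unassigned)
```

For a real run, open the result file with `open(path, newline="")` and pass the handle to the function in place of `sample`.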

Parameter Tuning Suggestions

  • For bin20/bin50 samples in which the number of genes is below 200, or for other special samples, few spatially highly variable genes or co-expression gene sets may be identified. In that case, change --model from normal to bernoulli and set --fdr_cutoff to 0.05.
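This rule of thumb can be encoded as a small helper in a pipeline script. The following is a sketch under stated assumptions: `suggest_model` is a hypothetical function, `n_genes` stands for whichever gene-count metric you use for the sample, and the 200-gene threshold and bin sizes are taken from the suggestion above. It does not cover the "other special samples" case, which still needs manual judgment.

```python
def suggest_model(bin_size, n_genes, few_results=True):
    """Return (model, fdr_cutoff) following the tuning suggestion.

    bin_size    -- 20, 50, 100, 200 or "cellbin"
    n_genes     -- gene count for the sample (metric chosen by the caller)
    few_results -- True if a run with the default model found few
                   spatially variable genes or co-expression gene sets
    """
    low_gene_count = str(bin_size) in {"20", "50"} and n_genes < 200
    if low_gene_count and few_results:
        return "bernoulli", 0.05
    # Otherwise keep the documented defaults.
    return "normal", 0.05


print(suggest_model(20, 150))
print(suggest_model(100, 150))
```

The returned values can be passed to `--model` and `--fdr_cutoff` when rerunning the analysis.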
