NeST Algorithm

Purpose

Use the NeST algorithm to identify spatial co-expression gene sets.

Usage

SDAS coexpress nest -i st.h5ad -o outdir --bin_size 100 \
    --layer raw_counts \
    --selected_genes top5000 \
    --moran_path ./moran.csv \
    --n_cpus 8 \
    --seed 42 \
    --hotspot_min_size 30 \
    --hotspot_min_samples 4 \
    --min_cells 100

Input Parameter Description

| Parameter | Required | Default | Description |
|---|---|---|---|
| -i / --input | Yes | - | Stereo-seq h5ad file; must contain the raw expression matrix |
| -o / --output | Yes | - | Output directory |
| --bin_size | Yes | 50 | Bin size (resolution) of the data: 20, 50, 100, 200, or cellbin. Must be consistent with the input h5ad; required for plotting and calculation |
| --layer | No | - | Layer of the h5ad that holds the raw expression matrix (e.g. layers['raw_counts']) |
| --selected_genes | No | top5000 | Gene selection mode: full (all genes) or topn (the top n genes ranked by Moran's I) |
| --moran_path | No | - | Path to a precomputed Moran's I CSV file |
| --n_cpus | No | 8 | Number of parallel jobs, for a speedup on multi-core machines |
| --seed | No | 42 | Random seed |
| --hotspot_min_size | No | 30 | single_hotspot: minimum number of spots/cells to form a single-gene hotspot |
| --hotspot_min_samples | No | 4 | single_hotspot: minimum number of neighboring spots/cells covered by DBSCAN |
| --min_cells | No | 100/30 | coexpress_hotspot: minimum number of spots/cells to form a module in Module QC. Default: 100 for cellbin/bin20/bin50; 30 for bin100/bin200 |
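The resolution-dependent default of --min_cells can be summarized in a few lines. This is only an illustrative sketch of the rule stated in the table: the function name default_min_cells is hypothetical and is not part of SDAS.

```python
# Hypothetical helper (not part of SDAS): mirrors the documented
# default of --min_cells for each supported --bin_size value.
def default_min_cells(bin_size) -> int:
    key = str(bin_size).lower()
    if key in {"cellbin", "20", "50"}:
        return 100  # fine resolutions: cellbin, bin20, bin50
    if key in {"100", "200"}:
        return 30   # coarse resolutions: bin100, bin200
    raise ValueError(f"unsupported bin size: {bin_size!r}")


print(default_min_cells("cellbin"))
print(default_min_cells(100))
```

Passing --min_cells explicitly, as in the usage example above, overrides this default.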

Output Results Display

| Result File | Description |
|---|---|
| <input_name>_nest.module.csv | CSV listing each spatially highly variable gene (gene symbol and gene ID) and the co-expression gene set (module) it belongs to |
| <input_name>_nest.h5ad | h5ad file containing the co-expression gene set results (adata.obsm['module_score_nest']) |
| <input_name>_nest_module_score_nest.png/pdf | Spatial heatmap of gene set scores for the co-expression gene sets |
| <input_name>_nest.all_coex_hotspots/_nest.all_coex_structure.png/pdf | Spatial location and hierarchical structure of the co-expression gene sets |
| <input_name>_nest.separate_coex_hotspots.png/pdf | Spatial location and gene count of the co-expression gene sets |
| <input_name>_nest.moran.csv | Written when topn is used for calculation: Moran's I and P values for all genes |

  • Result CSV of co-expression gene sets: <input_name>_nest.module.csv (comma-separated). It lists the spatially highly variable genes that NeST identified and the co-expression gene set (module) each gene belongs to.

| Module | geneid | real_gene_name |
|---|---|---|
| Module0 | ENSG00000130649 | EPAS1 |
| Module0 | ENSG00000102882 | CHCHD3 |
| Module0 | ENSG00000179144 | MDGA2 |

  • Spatial heatmap of gene set scoring for co-expression gene sets <input_name>_nest_module_score_nest.png/pdf: Visualizes the spatial distribution patterns of all co-expression gene sets (Modules). The color intensity in the figure indicates the expression level of the co-expression gene set.

  • Spatial location and hierarchical structure of co-expression gene sets <input_name>_nest.all_coex_hotspots.png/pdf; <input_name>_nest.all_coex_structure.png/pdf: Shows the hierarchical relationship between different co-expression gene sets (Modules). The color in the figure indicates the spatial regions where different co-expression gene sets are located.

  • Spatial location and gene count of co-expression gene sets <input_name>_nest.separate_coex_hotspots.png/pdf: Visualizes the spatial regions where all co-expression gene sets (Modules) are located and the number of genes contained.
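For downstream work, the module CSV described above can be grouped by module with the Python standard library. The sketch below assumes that the file's header row uses the column names shown in the example table (Module, geneid, real_gene_name). It reads the three example rows from an in-memory string rather than from a real output file.

```python
import csv
import io
from collections import defaultdict

# Example rows copied from the table above; a real run would open
# <input_name>_nest.module.csv instead of this in-memory sample.
SAMPLE = """Module,geneid,real_gene_name
Module0,ENSG00000130649,EPAS1
Module0,ENSG00000102882,CHCHD3
Module0,ENSG00000179144,MDGA2
"""


def genes_by_module(handle):
    """Group gene symbols by module, preserving file order."""
    modules = defaultdict(list)
    for row in csv.DictReader(handle):
        modules[row["Module"]].append(row["real_gene_name"])
    return dict(modules)


modules = genes_by_module(io.StringIO(SAMPLE))
for name, genes in modules.items():
    print(name, len(genes), ", ".join(genes))
```

To read an actual result, pass an open file handle, for example open(path, newline=""), in place of the io.StringIO object.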

Result Interpretation

  • Co-expression gene sets are numbered starting from Module0. "No Module" marks genes that do not meet the clustering requirements of any co-expression gene set.

Parameter Tuning Suggestions

  • If few spatial co-expression gene sets are identified, for example in bin20/bin50 samples with fewer than 200 genes or in other special samples, lower --hotspot_min_size to 10.

  • If few spatial co-expression gene sets are identified, lower --min_cells to 10.

  • If the identified patterns are too fine and a NumPy "Unable to allocate X GiB array" error occurs, increase --hotspot_min_size and --hotspot_min_samples.
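When trying several threshold settings, it can help to assemble the command line programmatically. The sketch below only builds and prints a command string. It uses flags documented on this page, the helper name nest_command is hypothetical, and nothing is executed.

```python
import shlex


# Hypothetical helper (not part of SDAS): assembles a NeST command line
# from the required arguments plus optional flag overrides.
def nest_command(input_h5ad, outdir, bin_size, **overrides):
    cmd = ["SDAS", "coexpress", "nest",
           "-i", input_h5ad, "-o", outdir, "--bin_size", str(bin_size)]
    for flag, value in overrides.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


# Lowered thresholds, as suggested above for runs that yield few gene sets.
cmd = nest_command("st.h5ad", "outdir", 50, hotspot_min_size=10, min_cells=10)
print(shlex.join(cmd))
```

The printed string can be pasted into a shell, or the list can be passed to subprocess.run on a machine where SDAS is installed.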
