Squidpy Algorithm

Purpose

Use the Squidpy toolkit for comprehensive spatial omics analysis, including cell interaction patterns (neighborhood enrichment), co-occurrence analysis, and evaluation of spatial distribution patterns of cell types (Ripley's statistics).

Usage

SDAS spatialRelate squidpy -i st.h5ad -o outdir --label_key anno_cell2location \
--spatial_coords_scale 0.5 \
--bin_size 50.0 \
--n_neighs 9 \
--coord_type grid \
--n_perms_enrichment 1000 \
--cooccurrence_interval '100,1000,10' \
--n_simulations_ripley 100 \
--ripley_modes 'F,G,L' \
--n_cpus 24 \
--seed 42
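
When the tool is driven from a script, the same call can be assembled programmatically. The sketch below is only an illustration: build_sdas_command is a hypothetical helper, not part of SDAS, and it uses only the flags shown in the command above.

```python
import shlex


def build_sdas_command(input_h5ad, outdir, label_key, **options):
    """Assemble the 'SDAS spatialRelate squidpy' call as an argument list.

    Hypothetical helper for illustration; option names are passed through
    unchanged, so they must match the flags documented on this page.
    """
    cmd = [
        "SDAS", "spatialRelate", "squidpy",
        "-i", input_h5ad,
        "-o", outdir,
        "--label_key", label_key,
    ]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_sdas_command(
    "st.h5ad", "outdir", "anno_cell2location",
    spatial_coords_scale=0.5, bin_size=50.0, n_neighs=9, coord_type="grid",
    n_perms_enrichment=1000, cooccurrence_interval="100,1000,10",
    n_simulations_ripley=100, ripley_modes="F,G,L", n_cpus=24, seed=42,
)

# Print a shell-safe version of the command for logging.
print(shlex.join(cmd))
# To execute it (requires SDAS on PATH): subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with values such as the comma-separated interval string.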

Input Parameter Description

| Parameter | Required | Default | Description |
|---|---|---|---|
| -i / --input | Yes | | Path to the input AnnData h5ad file. The file must contain spatial coordinates (e.g., in adata.obsm['spatial']) and a cell type or region cluster annotation (e.g., in adata.obs). |
| -o / --output | Yes | | Output directory for all analysis results and plots. It is created if it does not exist. |
| --label_key | Yes | | Column name in adata.obs used for cell type or region cluster annotation. |
| --spatial_coords_scale | No | 0.5 | Global scaling factor applied to spatial coordinates. For example, set to 0.5 to convert pixel coordinates to micrometers for Stereo-seq data. |
| --min_cells_per_type | No | 20 | Minimum number of cells (or spots) required for each cell type/cluster to be included in the analysis. Too small a value may lead to unstable results. |
| --n_cpus | No | 8 | Number of CPU cores to use for parallel computation. Increasing this value can speed up analysis for large datasets. |
| --bin_size | No | 50.0 | Diameter of spots in the original coordinate units. This value, multiplied by spatial_coords_scale, determines the point size in spatial scatter plots. |
| --n_neighs | No | 9 | Number of neighbors used to construct the spatial graph (sq.gr.spatial_neighbors). Affects neighborhood enrichment and co-occurrence analysis. |
| --coord_type | No | grid | Coordinate type for Squidpy's spatial_neighbors. Use 'grid' for regular arrays (faster) and 'generic' for irregular spatial data. |
| --n_perms_enrichment | No | 1000 | Number of permutations for neighborhood enrichment analysis. Higher values yield more robust statistics but increase computation time. |
| --cooccurrence_interval | No | "100,1000,10" | Distance interval for co-occurrence analysis, in the format "start,end,num_points". For example, "50,250,10" analyzes 10 evenly spaced distances from 50 to 250 units. |
| --n_simulations_ripley | No | 100 | Number of simulations for Ripley's statistics (F, G, L functions). Affects the accuracy of confidence intervals. |
| --ripley_modes | No | "F,G,L" | Comma-separated list of Ripley statistics modes to calculate. Supported values: "F", "G", "L". |
| --seed | No | 42 | Random seed for reproducibility. Must be a non-negative integer. |

Notes:

  • For sparse data or few cell types, consider increasing n_perms_enrichment and n_simulations_ripley for more stable results.

  • Adjust cooccurrence_interval and bin_size based on cell size and spatial distribution for optimal visualization and analysis.
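
To make the "start,end,num_points" format of cooccurrence_interval concrete, the following sketch expands such a string into evenly spaced distances. The function name parse_cooccurrence_interval is hypothetical, and whether SDAS expands the string in exactly this way is an assumption; the result for the default value does match the distances in the example co-occurrence table on this page.

```python
def parse_cooccurrence_interval(spec):
    """Expand "start,end,num_points" into a list of evenly spaced distances."""
    start_text, end_text, num_text = spec.split(",")
    start, end, num_points = float(start_text), float(end_text), int(num_text)
    if num_points < 2 or end <= start:
        raise ValueError(f"invalid interval specification: {spec!r}")

    step = (end - start) / (num_points - 1)
    distances = [start + i * step for i in range(num_points - 1)]
    distances.append(end)  # append the end point exactly, avoiding rounding drift
    return distances


print(parse_cooccurrence_interval("100,1000,10"))
# [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0]
```

With the default scale of 0.5 and bin_size of 50.0, the first distance of 100 units would correspond to a few spot diameters, which is one way to sanity-check a chosen interval against cell size.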

Output Results Display

| Result File | Description |
|---|---|
| <input_name>_squidpy_spatial_scatter.pdf/png | Spatial distribution plot colored by --label_key. |
| <input_name>_squidpy_nhood_enrichment_zscores.csv | Z-score matrix of neighborhood enrichment analysis. |
| <input_name>_squidpy_nhood_enrichment.pdf/png | Heatmap visualization of the neighborhood enrichment Z-score matrix. |
| <input_name>_squidpy_co_occurrence_scores_full.csv | Main quantitative result. This file is in long table format and records the raw co-occurrence score for each cell type pair at each distance bin. Fields: from_celltype, to_celltype, distance, distance_interval_index, cooccurrence_score. Suitable for custom analysis and visualization. |
| <input_name>_squidpy_co_occurrence_all_types.pdf | Trends of co-occurrence probability for all cell type pairs as distance changes (multi-page). |
| <input_name>_squidpy_co_occurrence_plots_png/ | Individual PNG plots for each cell type's co-occurrence trend. |
| <input_name>_squidpy_co_occurrence_<celltype>.png | Co-occurrence trend plot for a single cell type. |
| <input_name>_squidpy_ripley_<mode>_function.pdf/png | Ripley's statistics result plots for each specified mode. |
| <input_name>_squidpy_processed.h5ad | Final AnnData object after Squidpy analysis. |

  • Neighborhood enrichment Z-score matrix: <input_name>_squidpy_nhood_enrichment_zscores.csv shows the spatial proximity strength between cell types. The Z-score indicates enrichment or depletion relative to a random distribution.

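| | B | Endo | Epi | Fib | Mye | NK | T |
|---|---|---|---|---|---|---|---|
| B | 0.0 | -0.5 | -1.2 | 0.8 | 1.5 | 2.1 | -0.3 |
| Endo | -0.5 | 0.0 | 0.7 | -0.2 | -1.1 | 0.4 | 0.9 |
| Epi | -1.2 | 0.7 | 0.0 | 1.8 | -0.6 | -0.8 | 0.3 |
| Fib | 0.8 | -0.2 | 1.8 | 0.0 | 0.5 | -0.1 | 1.2 |
| Mye | 1.5 | -1.1 | -0.6 | 0.5 | 0.0 | 0.7 | -0.4 |
| NK | 2.1 | 0.4 | -0.8 | -0.1 | 0.7 | 0.0 | 1.6 |
| T | -0.3 | 0.9 | 0.3 | 1.2 | -0.4 | 1.6 | 0.0 |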
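
A matrix like the one above can be scanned for the most enriched and most depleted pairs with a few lines of Python. This is a minimal sketch that assumes the CSV stores cell type names in the first row and first column; the exact layout of the SDAS output file is not confirmed here. The example values are symmetric, so only the upper triangle is read; check both directions if your matrix is not symmetric.

```python
import csv
import io

# Example values copied from the matrix above; in practice, read the
# *_squidpy_nhood_enrichment_zscores.csv file instead.
ZSCORES_CSV = """\
,B,Endo,Epi,Fib,Mye,NK,T
B,0.0,-0.5,-1.2,0.8,1.5,2.1,-0.3
Endo,-0.5,0.0,0.7,-0.2,-1.1,0.4,0.9
Epi,-1.2,0.7,0.0,1.8,-0.6,-0.8,0.3
Fib,0.8,-0.2,1.8,0.0,0.5,-0.1,1.2
Mye,1.5,-1.1,-0.6,0.5,0.0,0.7,-0.4
NK,2.1,0.4,-0.8,-0.1,0.7,0.0,1.6
T,-0.3,0.9,0.3,1.2,-0.4,1.6,0.0
"""


def rank_pairs(csv_text):
    """Return (type_a, type_b, zscore) tuples sorted from most enriched to most depleted."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    cell_types = rows[0][1:]
    pairs = []
    for i, row in enumerate(rows[1:]):
        # Upper triangle only: skip the diagonal and mirrored entries.
        for j in range(i + 1, len(cell_types)):
            pairs.append((row[0], cell_types[j], float(row[j + 1])))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


ranked = rank_pairs(ZSCORES_CSV)
print("most enriched:", ranked[:3])
print("most depleted:", ranked[-3:])
```

For the example values, the strongest enrichment is between B and NK (2.1) and the strongest depletion is between B and Epi (-1.2).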

  • Co-occurrence raw result table: <input_name>_squidpy_co_occurrence_scores_full.csv records the raw co-occurrence score for each cell type pair at each distance bin, suitable for custom analysis and visualization.

| from_celltype | to_celltype | distance | distance_interval_index | cooccurrence_score |
|---|---|---|---|---|
| B | B | 100.0 | 0 | 0.5327037 |
| B | B | 200.0 | 1 | 0.5388954 |
| B | B | 300.0 | 2 | 0.47285548 |
| B | B | 400.0 | 3 | 0.44997987 |
| B | B | 500.0 | 4 | 0.41206497 |
| B | Endo | 100.0 | 0 | 0.12482233 |
| B | Endo | 200.0 | 1 | 0.1646443 |
| B | Endo | 300.0 | 2 | 0.15779616 |
| B | Endo | 400.0 | 3 | 0.20279443 |
| B | Epi | 100.0 | 0 | 0.06660331 |
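
Because the file is in long format, one curve per cell type pair can be rebuilt by grouping rows on (from_celltype, to_celltype). The sketch below uses only the Python standard library and the example rows shown above; the column names come from the field list on this page, while the helper name load_curves is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Example rows copied from the table above; in practice, open the
# *_squidpy_co_occurrence_scores_full.csv file instead.
SCORES_CSV = """\
from_celltype,to_celltype,distance,distance_interval_index,cooccurrence_score
B,B,100.0,0,0.5327037
B,B,200.0,1,0.5388954
B,B,300.0,2,0.47285548
B,B,400.0,3,0.44997987
B,B,500.0,4,0.41206497
B,Endo,100.0,0,0.12482233
B,Endo,200.0,1,0.1646443
B,Endo,300.0,2,0.15779616
B,Endo,400.0,3,0.20279443
B,Epi,100.0,0,0.06660331
"""


def load_curves(csv_text):
    """Group long-format rows into {(from, to): [(distance, score), ...]}."""
    curves = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        pair = (row["from_celltype"], row["to_celltype"])
        curves[pair].append((float(row["distance"]), float(row["cooccurrence_score"])))
    for points in curves.values():
        points.sort()  # order each curve by increasing distance
    return dict(curves)


curves = load_curves(SCORES_CSV)
for (source, target), points in curves.items():
    peak_distance, peak_score = max(points, key=lambda point: point[1])
    print(f"{source} -> {target}: peak score {peak_score:.3f} at distance {peak_distance:g}")
```

The resulting dictionary can be passed to any plotting library to draw custom trend plots for selected pairs.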

  • Neighborhood enrichment heatmap: <input_name>_squidpy_nhood_enrichment.png visualizes the Z-score matrix; color indicates spatial proximity strength.

  • Spatial scatter plot: <input_name>_squidpy_spatial_scatter.png shows spatial distribution of cell types colored by type.

  • Co-occurrence trend plot: <input_name>_squidpy_co_occurrence_B.png shows co-occurrence probability of B cells with other types as distance changes.

  • Ripley's F function analysis: <input_name>_squidpy_ripley_F_function.png shows spatial distribution pattern of cell types compared to random expectation.

Result Interpretation

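  • Neighborhood enrichment Z-score: A positive value indicates enrichment (the two cell types are neighbors more often than expected at random); a negative value indicates depletion (they tend to be apart). The larger the absolute value, the stronger the deviation from random.

  • Co-occurrence analysis: Shows how the co-occurrence score of each cell type pair changes with distance, which helps identify the spatial scale at which cell types associate.

  • Ripley's statistics: Evaluate the spatial distribution pattern of each cell type. For the L function, a curve above the random expectation indicates aggregation, and a curve below it indicates dispersion.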
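
The idea behind a permutation-based enrichment Z-score can be illustrated on a toy graph. The sketch below is a conceptual illustration written for this page, not Squidpy's implementation: it counts edges between two labels, shuffles the labels n_perms times to build a null distribution, and reports (observed - mean) / std. The function names and the toy data are hypothetical.

```python
import random
import statistics


def count_pair_edges(edges, labels, type_a, type_b):
    """Count graph edges whose endpoints carry the labels type_a and type_b."""
    wanted = {type_a, type_b}
    return sum(1 for u, v in edges if {labels[u], labels[v]} == wanted)


def enrichment_zscore(edges, labels, type_a, type_b, n_perms=1000, seed=42):
    """Z-score of the observed edge count against a label-permutation null."""
    rng = random.Random(seed)
    observed = count_pair_edges(edges, labels, type_a, type_b)
    shuffled = list(labels)
    null_counts = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # break any link between position and label
        null_counts.append(count_pair_edges(edges, shuffled, type_a, type_b))
    mean = statistics.mean(null_counts)
    std = statistics.pstdev(null_counts)
    return (observed - mean) / std


# Toy data: a chain of 20 spots, the first 10 labeled "A" and the last 10 "B".
labels = ["A"] * 10 + ["B"] * 10
edges = [(i, i + 1) for i in range(len(labels) - 1)]

z_same = enrichment_zscore(edges, labels, "A", "A", n_perms=200, seed=0)
z_cross = enrichment_zscore(edges, labels, "A", "B", n_perms=200, seed=0)
print(f"A-A z-score: {z_same:.2f} (positive: A spots neighbor each other)")
print(f"A-B z-score: {z_cross:.2f} (negative: A and B rarely touch)")
```

In this toy layout the two labels form separate blocks, so same-label edges are over-represented and cross-label edges are under-represented relative to shuffled labels. More permutations give a smoother null distribution, which is why n_perms_enrichment trades run time for stability.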

Parameter Tuning Suggestions

  • If there are few cell types or the data are sparse, increase n_perms_enrichment and n_simulations_ripley for more stable results.

  • n_perms_enrichment: Number of permutations for neighborhood enrichment, can be reduced for large datasets to save time.

  • n_simulations_ripley: Number of simulations for Ripley's statistics, affects confidence interval accuracy.

  • cooccurrence_interval: Distance interval for co-occurrence analysis, adjust based on cell size and spatial distribution.

  • bin_size: Affects point size in scatter plot, adjust for data density and visualization needs.
