Squidpy Algorithm
Purpose
Use the Squidpy toolkit for comprehensive spatial omics analysis, including cell interaction patterns (neighborhood enrichment), co-occurrence analysis, and evaluation of spatial distribution patterns of cell types (Ripley's statistics).
Usage
SDAS spatialRelate squidpy -i st.h5ad -o outdir --label_key anno_cell2location \
--spatial_coords_scale 0.5 \
--bin_size 50.0 \
--n_neighs 9 \
--coord_type grid \
--n_perms_enrichment 1000 \
--cooccurrence_interval '100,1000,10' \
--n_simulations_ripley 100 \
--ripley_modes 'F,G,L' \
--n_cpus 24 \
--seed 42
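For orientation, the flags above can be read as arguments to Squidpy's graph functions. The sketch below is an assumption about that mapping, not the SDAS source code. plan_squidpy_calls and run_plan are hypothetical helper names. Only sq.gr.spatial_neighbors is named in this document; the other three are the standard Squidpy functions for these analyses. Coordinate scaling, filtering, plotting, and file output are left out.

```python
def plan_squidpy_calls(label_key, distances, n_neighs=9, coord_type="grid",
                       n_perms=1000, n_simulations=100,
                       ripley_modes=("F", "G", "L"), seed=42):
    """Return (function_name, kwargs) pairs for squidpy.gr, in run order.

    Hypothetical mapping of the CLI flags onto Squidpy keyword arguments.
    """
    plan = [
        ("spatial_neighbors", {"coord_type": coord_type, "n_neighs": n_neighs}),
        ("nhood_enrichment", {"cluster_key": label_key, "n_perms": n_perms, "seed": seed}),
        ("co_occurrence", {"cluster_key": label_key, "interval": list(distances)}),
    ]
    # One Ripley run per requested mode.
    for mode in ripley_modes:
        plan.append(("ripley", {"cluster_key": label_key, "mode": mode,
                                "n_simulations": n_simulations, "seed": seed}))
    return plan


def run_plan(adata, plan):
    """Execute a plan on an AnnData object (requires numpy and squidpy)."""
    import numpy as np
    import squidpy as sq  # imported lazily so planning works without Squidpy

    for name, kwargs in plan:
        if "interval" in kwargs:
            # co_occurrence accepts an array of distances
            kwargs = {**kwargs, "interval": np.asarray(kwargs["interval"], dtype=float)}
        getattr(sq.gr, name)(adata, **kwargs)
    return adata


# Distances 100, 200, ..., 1000, matching --cooccurrence_interval '100,1000,10'
plan = plan_squidpy_calls("anno_cell2location",
                          distances=[100.0 + 100.0 * i for i in range(10)])
```

Separating the plan from its execution keeps the parameter mapping inspectable without loading any data; run_plan(adata, plan) would then apply it to an AnnData object read from the input h5ad file.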
Input Parameter Description
| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -i / --input | Yes | | Path to the input AnnData h5ad file. The file must contain spatial coordinates (e.g., in adata.obsm['spatial']) and cell type or region cluster annotation (e.g., in adata.obs). |
| -o / --output | Yes | | Output directory for all analysis results and plots. Will be created if it does not exist. |
| --label_key | Yes | | Column name in adata.obs used for cell type or region cluster annotation. |
| --spatial_coords_scale | No | 0.5 | Global scaling factor applied to spatial coordinates. For example, set to 0.5 to convert pixel coordinates to micrometers for Stereo-seq data. |
| --min_cells_per_type | No | 20 | Minimum number of cells (or spots) required for each cell type/cluster to be included in the analysis. Too small a value may lead to unstable results. |
| --n_cpus | No | 8 | Number of CPU cores to use for parallel computation. Increasing this value can speed up analysis for large datasets. |
| --bin_size | No | 50.0 | Diameter of spots in the original coordinate units. This value, multiplied by spatial_coords_scale, determines the point size in spatial scatter plots. |
| --n_neighs | No | 9 | Number of neighbors used to construct the spatial graph (sq.gr.spatial_neighbors). Affects neighborhood enrichment and co-occurrence analysis. |
| --coord_type | No | grid | Coordinate type for Squidpy's spatial_neighbors. Use 'grid' for regular arrays (faster), 'generic' for irregular spatial data. |
| --n_perms_enrichment | No | 1000 | Number of permutations for neighborhood enrichment analysis. Higher values yield more robust statistics but increase computation time. |
| --cooccurrence_interval | No | "100,1000,10" | Distance interval for co-occurrence analysis, in the format "start,end,num_points". For example, "50,250,10" means analyze 10 evenly spaced distances from 50 to 250 units. |
| --n_simulations_ripley | No | 100 | Number of simulations for Ripley's statistics (F, G, L functions). Affects the accuracy of confidence intervals. |
| --ripley_modes | No | "F,G,L" | Comma-separated list of Ripley statistics modes to calculate. Supported values: "F", "G", "L". |
| --seed | No | 42 | Random seed for reproducibility. Must be a non-negative integer. |
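The effect of --min_cells_per_type can be pictured with a small sketch. It assumes the threshold is inclusive (a type with exactly 20 cells is kept) and that excluded types are simply dropped before analysis; neither detail is stated here, and types_to_keep is a hypothetical helper name.

```python
from collections import Counter


def types_to_keep(labels, min_cells_per_type=20):
    """Return the set of labels that occur at least min_cells_per_type times."""
    counts = Counter(labels)
    return {cell_type for cell_type, n in counts.items() if n >= min_cells_per_type}


# Toy annotation column: 35 T cells, 20 B cells, 4 NK cells
labels = ["T"] * 35 + ["B"] * 20 + ["NK"] * 4
kept = types_to_keep(labels)

# Cells whose type falls below the threshold are excluded (assumed behavior)
filtered = [label for label in labels if label in kept]
```

In this toy input the 4 NK cells fall below the default threshold, so only T and B cells remain.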
Notes:
For sparse data or few cell types, consider increasing n_perms_enrichment and n_simulations_ripley for more stable results.
Adjust cooccurrence_interval and bin_size based on cell size and spatial distribution for optimal visualization and analysis.
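To see what a given --cooccurrence_interval value expands to before running the tool, the "start,end,num_points" format can be sketched as follows. It assumes both endpoints are included, as the "50,250,10" example in the parameter description suggests; parse_interval is a hypothetical helper, not part of SDAS.

```python
def parse_interval(spec):
    """Expand 'start,end,num_points' into evenly spaced distances (endpoints included)."""
    parts = [p.strip() for p in spec.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 'start,end,num_points', got {spec!r}")
    start, end, num = float(parts[0]), float(parts[1]), int(parts[2])
    if num < 2 or end <= start:
        raise ValueError("need end > start and at least 2 points")
    step = (end - start) / (num - 1)
    return [start + i * step for i in range(num)]


# Default value from the usage example: 100, 200, ..., 1000
distances = parse_interval("100,1000,10")

# Spot diameter after scaling, using the defaults --bin_size 50.0 and
# --spatial_coords_scale 0.5. How this value is turned into a marker size
# in the plot is not specified here.
scaled_spot_diameter = 50.0 * 0.5
```

With the default values, the distances are spaced 100 units apart and the scaled spot diameter is 25 units, so the smallest distance spans roughly four spots.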
Output Results Display
| Output file | Description |
| --- | --- |
| <input_name>_squidpy_spatial_scatter.pdf/png | Spatial distribution plot colored by --label_key. |
| <input_name>_squidpy_nhood_enrichment_zscores.csv | Z-score matrix of neighborhood enrichment analysis. |
| <input_name>_squidpy_nhood_enrichment.pdf/png | Heatmap visualization of neighborhood enrichment Z-score matrix. |
| <input_name>_squidpy_co_occurrence_scores_full.csv | Main quantitative result. This file is in long table format, recording the raw co-occurrence score for each cell type pair at each distance bin. Fields include from_celltype, to_celltype, distance, distance_interval_index, cooccurrence_score. Suitable for custom analysis and visualization. |
| <input_name>_squidpy_co_occurrence_all_types.pdf | Trends of co-occurrence probability for all cell type pairs as distance changes (multi-page). |
| <input_name>_squidpy_co_occurrence_plots_png/ | Individual PNG plots for each cell type's co-occurrence trend. |
| <input_name>_squidpy_co_occurrence_<celltype>.png | Co-occurrence trend plot for a single cell type. |
| <input_name>_squidpy_ripley_<mode>_function.pdf/png | Ripley's statistics result plots for each specified mode. |
| <input_name>_squidpy_processed.h5ad | Final AnnData object after Squidpy analysis. |
Neighborhood enrichment Z-score matrix: <input_name>_squidpy_nhood_enrichment_zscores.csv shows the spatial proximity strength between cell types. The Z-score indicates enrichment or depletion compared to a random distribution.

| | B | Endo | Epi | Fib | Mye | NK | T |
| --- | --- | --- | --- | --- | --- | --- | --- |
| B | 0.0 | -0.5 | -1.2 | 0.8 | 1.5 | 2.1 | -0.3 |
| Endo | -0.5 | 0.0 | 0.7 | -0.2 | -1.1 | 0.4 | 0.9 |
| Epi | -1.2 | 0.7 | 0.0 | 1.8 | -0.6 | -0.8 | 0.3 |
| Fib | 0.8 | -0.2 | 1.8 | 0.0 | 0.5 | -0.1 | 1.2 |
| Mye | 1.5 | -1.1 | -0.6 | 0.5 | 0.0 | 0.7 | -0.4 |
| NK | 2.1 | 0.4 | -0.8 | -0.1 | 0.7 | 0.0 | 1.6 |
| T | -0.3 | 0.9 | 0.3 | 1.2 | -0.4 | 1.6 | 0.0 |
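A quick way to read such a matrix is to rank the cell type pairs by Z-score. The sketch below uses only the example values shown above and assumes the CSV has cell type names in both the header row and the first column; the actual file layout may differ, and load_matrix and ranked_pairs are hypothetical helper names.

```python
import csv
import io
from itertools import combinations

# Example values copied from the matrix above (assumed CSV layout)
ZSCORES_CSV = """\
,B,Endo,Epi,Fib,Mye,NK,T
B,0.0,-0.5,-1.2,0.8,1.5,2.1,-0.3
Endo,-0.5,0.0,0.7,-0.2,-1.1,0.4,0.9
Epi,-1.2,0.7,0.0,1.8,-0.6,-0.8,0.3
Fib,0.8,-0.2,1.8,0.0,0.5,-0.1,1.2
Mye,1.5,-1.1,-0.6,0.5,0.0,0.7,-0.4
NK,2.1,0.4,-0.8,-0.1,0.7,0.0,1.6
T,-0.3,0.9,0.3,1.2,-0.4,1.6,0.0
"""


def load_matrix(text):
    """Parse the CSV text into a nested dict: matrix[row_type][col_type] -> z-score."""
    rows = list(csv.reader(io.StringIO(text)))
    names = rows[0][1:]
    return {row[0]: dict(zip(names, map(float, row[1:]))) for row in rows[1:]}


def ranked_pairs(matrix):
    """Return (type_a, type_b, z) for each unordered pair, highest z first.

    Only the upper triangle is read, which is enough for this symmetric example.
    """
    pairs = [(a, b, matrix[a][b]) for a, b in combinations(list(matrix), 2)]
    return sorted(pairs, key=lambda p: p[2], reverse=True)


ranked = ranked_pairs(load_matrix(ZSCORES_CSV))
most_enriched, most_depleted = ranked[0], ranked[-1]
```

For the example matrix, the most enriched pair is B and NK (2.1) and the most depleted pair is B and Epi (-1.2).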
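Co-occurrence raw result table: <input_name>_squidpy_co_occurrence_scores_full.csv records the raw co-occurrence score for each cell type pair at each distance bin, suitable for custom analysis and visualization.

| from_celltype | to_celltype | distance | distance_interval_index | cooccurrence_score |
| --- | --- | --- | --- | --- |
| B | B | 100.0 | 0 | 0.5327037 |
| B | B | 200.0 | 1 | 0.5388954 |
| B | B | 300.0 | 2 | 0.47285548 |
| B | B | 400.0 | 3 | 0.44997987 |
| B | B | 500.0 | 4 | 0.41206497 |
| B | Endo | 100.0 | 0 | 0.12482233 |
| B | Endo | 200.0 | 1 | 0.1646443 |
| B | Endo | 300.0 | 2 | 0.15779616 |
| B | Endo | 400.0 | 3 | 0.20279443 |
| B | Epi | 100.0 | 0 | 0.06660331 |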
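As an example of the custom analysis this long-format table supports, the sketch below finds the distance at which each cell type pair reaches its highest co-occurrence score. It uses only the excerpt rows shown above and assumes the CSV header row uses the documented field names verbatim; peak_per_pair is a hypothetical helper name.

```python
import csv
import io

# Excerpt rows copied from the table above
COOC_CSV = """\
from_celltype,to_celltype,distance,distance_interval_index,cooccurrence_score
B,B,100.0,0,0.5327037
B,B,200.0,1,0.5388954
B,B,300.0,2,0.47285548
B,B,400.0,3,0.44997987
B,B,500.0,4,0.41206497
B,Endo,100.0,0,0.12482233
B,Endo,200.0,1,0.1646443
B,Endo,300.0,2,0.15779616
B,Endo,400.0,3,0.20279443
B,Epi,100.0,0,0.06660331
"""


def peak_per_pair(text):
    """Map (from_celltype, to_celltype) -> (distance, score) at the highest score."""
    best = {}
    for row in csv.DictReader(io.StringIO(text)):
        pair = (row["from_celltype"], row["to_celltype"])
        score = float(row["cooccurrence_score"])
        if pair not in best or score > best[pair][1]:
            best[pair] = (float(row["distance"]), score)
    return best


peaks = peak_per_pair(COOC_CSV)
```

In this excerpt the B to B score peaks at distance 200, while the B to Endo score peaks at distance 400. Because the excerpt is truncated, these peaks only illustrate the approach.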
Neighborhood enrichment heatmap: <input_name>_squidpy_nhood_enrichment.png visualizes the Z-score matrix; color indicates spatial proximity strength.

Spatial scatter plot: <input_name>_squidpy_spatial_scatter.png shows the spatial distribution of cell types, colored by type.

Co-occurrence trend plot: <input_name>_squidpy_co_occurrence_B.png shows the co-occurrence probability of B cells with other types as distance changes.

Ripley's F function analysis: <input_name>_squidpy_ripley_F_function.png shows the spatial distribution pattern of cell types compared to random expectation.

Result Interpretation
Neighborhood enrichment Z-score: A positive value means enrichment (the two cell types tend to be neighbors), a negative value means depletion (they tend to be apart), and a larger absolute value means a more significant deviation from random.
Co-occurrence analysis: Shows how the co-occurrence score of each cell type pair changes with distance, which helps reveal spatial interactions and the distances at which they occur.
Ripley's statistics: Evaluate the spatial distribution pattern of each cell type. An L function above the random expectation indicates aggregation; one below it indicates dispersion.
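The idea behind the neighborhood enrichment Z-score can be illustrated with a simplified permutation test: count the graph edges that join two cell types, then compare that count with counts obtained after shuffling the labels. This is a conceptual sketch on a toy chain of cells, not Squidpy's implementation; count_pairs and enrichment_zscore are hypothetical helper names.

```python
import random
from statistics import mean, pstdev


def count_pairs(labels, edges, a, b):
    """Count edges that join a cell of type a with a cell of type b."""
    return sum(1 for i, j in edges if {labels[i], labels[j]} == {a, b})


def enrichment_zscore(labels, edges, a, b, n_perms=200, seed=42):
    """Z-score of the observed a-b edge count against label permutations."""
    rng = random.Random(seed)
    observed = count_pairs(labels, edges, a, b)
    shuffled = list(labels)
    null_counts = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # break the link between labels and positions
        null_counts.append(count_pairs(shuffled, edges, a, b))
    sd = pstdev(null_counts)
    if sd == 0:
        return float("nan")
    return (observed - mean(null_counts)) / sd


# Toy tissue: a chain of 40 cells, T cells on the left, B cells on the right
labels = ["T"] * 20 + ["B"] * 20
edges = [(i, i + 1) for i in range(len(labels) - 1)]

z_same = enrichment_zscore(labels, edges, "T", "T")   # segregated, so enriched
z_mixed = enrichment_zscore(labels, edges, "T", "B")  # rarely adjacent, so depleted
```

Because the two types are fully segregated in this toy input, the T to T score comes out positive and the T to B score negative, matching the interpretation given above. More permutations give a more stable estimate of the null distribution, which is what n_perms_enrichment controls.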
Parameter Tuning Suggestions
If there are few cell types or the data is sparse, increase n_perms_enrichment and n_simulations_ripley for more stable results.
n_perms_enrichment: Number of permutations for neighborhood enrichment; can be reduced for large datasets to save time.
n_simulations_ripley: Number of simulations for Ripley's statistics; affects confidence interval accuracy.
cooccurrence_interval: Distance interval for co-occurrence analysis; adjust based on cell size and spatial distribution.
bin_size: Affects point size in the scatter plot; adjust for data density and visualization needs.