Squidpy Algorithm

Purpose

Use the Squidpy toolkit for comprehensive spatial omics analysis, including cell interaction patterns (neighborhood enrichment), co-occurrence analysis, and evaluation of spatial distribution patterns of cell types (Ripley's statistics).

Usage

SDAS spatialRelate squidpy -i st.h5ad -o outdir --label_key anno_cell2location \
--spatial_coords_scale 0.5 \
--bin_size 50.0 \
--n_neighs 9 \
--coord_type grid \
--n_perms_enrichment 1000 \
--cooccurrence_interval '100,1000,10' \
--n_simulations_ripley 100 \
--ripley_modes 'F,G,L' \
--n_cpus 24 \
--seed 42
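
For orientation, the flags above appear to correspond to public Squidpy functions. The sketch below is an assumption about that mapping, not the actual SDAS implementation: the function name `run_squidpy_steps`, the way the scale factor is applied, and the conversion of the interval into an array are all hypothetical. It needs `squidpy`, `anndata`, and `numpy`, which are imported only when the function is called.

```python
def run_squidpy_steps(
    h5ad_path,
    label_key,
    spatial_coords_scale=0.5,
    n_neighs=9,
    coord_type="grid",
    n_perms=1000,
    interval=(100.0, 1000.0, 10),
    n_simulations=100,
    ripley_modes=("F", "G", "L"),
    n_cpus=1,
    seed=42,
):
    """Hypothetical mapping of the CLI flags onto Squidpy calls."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import anndata as ad
    import numpy as np
    import squidpy as sq

    adata = ad.read_h5ad(h5ad_path)
    # Squidpy expects the cluster column to be categorical.
    adata.obs[label_key] = adata.obs[label_key].astype("category")
    # Assumption: the scale factor is applied directly to the coordinates.
    adata.obsm["spatial"] = np.asarray(adata.obsm["spatial"]) * spatial_coords_scale

    # Spatial graph, then permutation-based neighborhood enrichment.
    sq.gr.spatial_neighbors(adata, n_neighs=n_neighs, coord_type=coord_type)
    sq.gr.nhood_enrichment(
        adata, cluster_key=label_key, n_perms=n_perms, seed=seed, n_jobs=n_cpus
    )

    # Co-occurrence over evenly spaced distances ("start,end,num_points").
    start, end, num = interval
    sq.gr.co_occurrence(
        adata, cluster_key=label_key, interval=np.linspace(start, end, int(num))
    )

    # One Ripley statistic per requested mode.
    for mode in ripley_modes:
        sq.gr.ripley(
            adata,
            cluster_key=label_key,
            mode=mode,
            n_simulations=n_simulations,
            seed=seed,
        )
    return adata
```

In Squidpy, the results of these calls are stored in `adata.uns` under keys derived from the label column, for example `adata.uns["<label_key>_nhood_enrichment"]["zscore"]`.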

Input Parameter Description

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -i / --input | Yes | | Path to the input AnnData h5ad file. The file must contain spatial coordinates (e.g., in adata.obsm['spatial']) and cell type or region cluster annotation (e.g., in adata.obs). |
| -o / --output | Yes | | Output directory for all analysis results and plots. Will be created if it does not exist. |
| --label_key | Yes | | Column name in adata.obs used for cell type or region cluster annotation. |
| --spatial_coords_scale | | 0.5 | Global scaling factor applied to spatial coordinates. For example, set to 0.5 to convert pixel coordinates to micrometers for stereo-seq data. |
| --min_cells_per_type | | | Minimum number of cells (or spots) required for each cell type/cluster to be included in the analysis. Too small a value may lead to unstable results. |
| --n_cpus | | | Number of CPU cores to use for parallel computation. Increasing this value can speed up analysis for large datasets. |
| --bin_size | | 50.0 | Diameter of spots in the original coordinate units. This value, multiplied by spatial_coords_scale, determines the point size in spatial scatter plots. |
| --n_neighs | | | Number of neighbors used to construct the spatial graph (sq.gr.spatial_neighbors). Affects neighborhood enrichment and co-occurrence analysis. |
| --coord_type | | grid | Coordinate type for Squidpy's spatial_neighbors. Use 'grid' for regular arrays (faster), 'generic' for irregular spatial data. |
| --n_perms_enrichment | | 1000 | Number of permutations for neighborhood enrichment analysis. Higher values yield more robust statistics but increase computation time. |
| --cooccurrence_interval | | "100,1000,10" | Distance interval for co-occurrence analysis, in the format "start,end,num_points". For example, "50,250,10" means analyze 10 evenly spaced distances from 50 to 250 units. |
| --n_simulations_ripley | | 100 | Number of simulations for Ripley's statistics (F, G, L functions). Affects the accuracy of confidence intervals. |
| --ripley_modes | | "F,G,L" | Comma-separated list of Ripley statistics modes to calculate. Supported values: "F", "G", "L". |
| --seed | | | Random seed for reproducibility. Must be a non-negative integer. |
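
Two of the descriptions above involve small calculations: expanding the "start,end,num_points" string into distances, and deriving the scatter-plot point size from bin_size and spatial_coords_scale. The sketch below illustrates both. The helper names `parse_interval` and `plot_point_size` are hypothetical, and treating both endpoints as inclusive is an assumption based on the wording "evenly spaced distances from 50 to 250".

```python
def parse_interval(spec):
    """Parse "start,end,num_points" into a list of evenly spaced distances."""
    start_s, end_s, num_s = spec.split(",")
    start, end, num = float(start_s), float(end_s), int(num_s)
    if num < 2 or end <= start:
        raise ValueError(f"invalid interval spec: {spec!r}")
    # Assumption: both endpoints are included, as with numpy.linspace.
    step = (end - start) / (num - 1)
    return [start + i * step for i in range(num)]


def plot_point_size(bin_size, spatial_coords_scale):
    """Point size as described for --bin_size: bin_size * spatial_coords_scale."""
    return bin_size * spatial_coords_scale


distances = parse_interval("100,1000,10")
print(distances)                   # ten distances, 100.0 through 1000.0
print(plot_point_size(50.0, 0.5))  # 25.0
```

With the default values, the distances step by 100 units, which matches the 100.0, 200.0, 300.0 sequence in the example co-occurrence table.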

Notes:

For sparse data or few cell types, consider increasing n_perms_enrichment and n_simulations_ripley for more stable results.
Adjust cooccurrence_interval and bin_size based on cell size and spatial distribution for optimal visualization and analysis.

Output Results Display

| Result File | Description |
| --- | --- |
| <input_name>_squidpy_spatial_scatter.pdf/png | Spatial distribution plot colored by --label_key. |
| <input_name>_squidpy_nhood_enrichment_zscores.csv | Z-score matrix of neighborhood enrichment analysis. |
| <input_name>_squidpy_nhood_enrichment.pdf/png | Heatmap visualization of neighborhood enrichment Z-score matrix. |
| <input_name>_squidpy_co_occurrence_scores_full.csv | Main quantitative result. This file is in long table format, recording the raw co-occurrence score for each cell type pair at each distance bin. Fields include from_celltype, to_celltype, distance, distance_interval_index, cooccurrence_score. Suitable for custom analysis and visualization. |
| <input_name>_squidpy_co_occurrence_all_types.pdf | Trends of co-occurrence probability for all cell type pairs as distance changes (multi-page). |
| <input_name>_squidpy_co_occurrence_plots_png/ | Individual PNG plots for each cell type's co-occurrence trend. |
| <input_name>_squidpy_co_occurrence_<celltype>.png | Co-occurrence trend plot for a single cell type. |
| <input_name>_squidpy_ripley_<mode>_function.pdf/png | Ripley's statistics result plots for each specified mode. |
| <input_name>_squidpy_processed.h5ad | Final AnnData object after Squidpy analysis. |

Neighborhood enrichment Z-score matrix: <input_name>_squidpy_nhood_enrichment_zscores.csv shows the spatial proximity strength between cell types. Each Z-score indicates enrichment or depletion of a cell type pair relative to a random distribution.

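| | | Endo | Epi | Fib | Mye | | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| | 0.0 | -0.5 | -1.2 | 0.8 | 1.5 | 2.1 | -0.3 |
| Endo | -0.5 | 0.0 | 0.7 | -0.2 | -1.1 | 0.4 | 0.9 |
| Epi | -1.2 | 0.7 | 0.0 | 1.8 | -0.6 | -0.8 | 0.3 |
| Fib | 0.8 | -0.2 | 1.8 | 0.0 | 0.5 | -0.1 | 1.2 |
| Mye | 1.5 | -1.1 | -0.6 | 0.5 | 0.0 | 0.7 | -0.4 |
| | 2.1 | 0.4 | -0.8 | -0.1 | 0.7 | 0.0 | 1.6 |
| | -0.3 | 0.9 | 0.3 | 1.2 | -0.4 | 1.6 | 0.0 |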
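
A common follow-up is to rank cell type pairs by Z-score. The sketch below does this with the standard library, using the labelled Endo/Epi/Fib/Mye portion of the example matrix above as inline data. The CSV layout (a header row of cell types, with row labels in the first column) is an assumption about the output file, and `rank_pairs` is a hypothetical helper name.

```python
import csv
import io

# Labelled 4x4 excerpt of the example matrix; in practice, open the
# <input_name>_squidpy_nhood_enrichment_zscores.csv file instead.
# Assumed layout: header row of cell types, first column holds row labels.
CSV_TEXT = """,Endo,Epi,Fib,Mye
Endo,0.0,0.7,-0.2,-1.1
Epi,0.7,0.0,1.8,-0.6
Fib,-0.2,1.8,0.0,0.5
Mye,-1.1,-0.6,0.5,0.0
"""


def rank_pairs(handle):
    """Return (type_a, type_b, z) for each unordered pair, highest z first."""
    rows = list(csv.reader(handle))
    columns = rows[0][1:]
    pairs = []
    for i, row in enumerate(rows[1:]):
        # Upper triangle only: the example matrix is symmetric, so each
        # pair would otherwise be listed twice.
        for j in range(i + 1, len(columns)):
            pairs.append((row[0], columns[j], float(row[j + 1])))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


ranked = rank_pairs(io.StringIO(CSV_TEXT))
print("most enriched:", ranked[0])
print("most depleted:", ranked[-1])
```

For this excerpt, the most enriched pair is Epi and Fib (1.8) and the most depleted pair is Endo and Mye (-1.1).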

Co-occurrence raw result table: <input_name>_squidpy_co_occurrence_scores_full.csv records the raw co-occurrence score for each cell type pair at each distance bin, suitable for custom analysis and visualization.

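| from_celltype | to_celltype | distance | distance_interval_index | cooccurrence_score |
| --- | --- | --- | --- | --- |
| | | 100.0 | | 0.5327037 |
| | | 200.0 | | 0.5388954 |
| | | 300.0 | | 0.47285548 |
| | | 400.0 | | 0.44997987 |
| | | 500.0 | | 0.41206497 |
| | Endo | 100.0 | | 0.12482233 |
| | Endo | 200.0 | | 0.1646443 |
| | Endo | 300.0 | | 0.15779616 |
| | Endo | 400.0 | | 0.20279443 |
| | Epi | 100.0 | | 0.06660331 |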
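
Because the file is in long format, it can be regrouped into one distance curve per cell type pair for custom plotting. The sketch below uses only the column names listed above. The inline rows are made up for illustration (the cell type names "A" and "B" and every score are hypothetical), and `load_curves` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; in practice, open
# <input_name>_squidpy_co_occurrence_scores_full.csv instead.
CSV_TEXT = """from_celltype,to_celltype,distance,distance_interval_index,cooccurrence_score
A,A,100.0,0,1.50
A,A,200.0,1,1.20
A,B,200.0,1,0.70
A,B,100.0,0,0.40
"""


def load_curves(handle):
    """Group rows into {(from, to): [(distance, score), ...]} sorted by distance."""
    curves = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (row["from_celltype"], row["to_celltype"])
        point = (float(row["distance"]), float(row["cooccurrence_score"]))
        curves[key].append(point)
    return {key: sorted(points) for key, points in curves.items()}


curves = load_curves(io.StringIO(CSV_TEXT))
for (src, dst), points in curves.items():
    print(src, "->", dst, points)
```

Each resulting list can be passed directly to a plotting library as x (distance) and y (score) values.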

Neighborhood enrichment heatmap: <input_name>_squidpy_nhood_enrichment.png visualizes the Z-score matrix; color indicates spatial proximity strength.

Spatial scatter plot: <input_name>_squidpy_spatial_scatter.png shows the spatial distribution of cells, colored by cell type.

Co-occurrence trend plot: <input_name>_squidpy_co_occurrence_B.png shows how the co-occurrence probability of B cells with other cell types changes with distance.

Ripley's F function analysis: <input_name>_squidpy_ripley_F_function.png shows the spatial distribution pattern of each cell type compared to random expectation.

Result Interpretation

Neighborhood enrichment Z-score: A positive value means enrichment (the two cell types tend to be close), a negative value means depletion (they tend to be apart), and a larger absolute value means a more significant deviation from random.
Co-occurrence analysis: Shows how co-occurrence patterns change with distance, which helps in understanding spatial interactions.
Ripley's statistics: Evaluate the spatial distribution pattern of a cell type; an L function above the random expectation indicates aggregation, and one below it indicates dispersion.
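
To make the L function reading concrete, the sketch below computes a naive Ripley's L at a single radius in pure Python. Under complete spatial randomness K(r) = πr², so L(r) = sqrt(K(r)/π) equals r; values above r suggest aggregation and values below r suggest dispersion. This is a simplified illustration without edge correction or simulation envelopes, not Squidpy's implementation, and the point sets are toy data.

```python
import math


def ripley_l(points, radius, area):
    """Naive Ripley's L at one radius, without edge correction."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Count ordered pairs (i, j), i != j, whose distance is within the radius.
    close_pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= radius
    )
    k = area * close_pairs / (n * (n - 1))
    return math.sqrt(k / math.pi)


area = 100.0  # toy 10 x 10 study region
clustered = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05)]
dispersed = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

radius = 1.0
print("clustered above random:", ripley_l(clustered, radius, area) > radius)
print("dispersed below random:", ripley_l(dispersed, radius, area) < radius)
```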

Parameter Tuning Suggestions

If there are few cell types or the data is sparse, increase n_perms_enrichment and n_simulations_ripley for more stable results.
n_perms_enrichment: Number of permutations for neighborhood enrichment, can be reduced for large datasets to save time.
n_simulations_ripley: Number of simulations for Ripley's statistics, affects confidence interval accuracy.
cooccurrence_interval: Distance interval for co-occurrence analysis, adjust based on cell size and spatial distribution.
bin_size: Affects point size in scatter plot, adjust for data density and visualization needs.
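
To illustrate what the permutation count and seed control, the sketch below is a toy version of a permutation Z-score on a spatial graph: count the edges joining two cell types, shuffle the labels n_perms times, and compare the observed count with the shuffled counts. It is a simplified pure-Python illustration of the general technique, not Squidpy's implementation; the function names and the 4 x 4 grid are hypothetical.

```python
import random
import statistics


def count_pairs(edges, labels, type_a, type_b):
    """Count graph edges that join a type_a node to a type_b node."""
    return sum(1 for u, v in edges if {labels[u], labels[v]} == {type_a, type_b})


def enrichment_zscore(edges, labels, type_a, type_b, n_perms, seed):
    """Z-score of the observed pair count against label permutations."""
    rng = random.Random(seed)
    observed = count_pairs(edges, labels, type_a, type_b)
    shuffled = list(labels)
    null_counts = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)
        null_counts.append(count_pairs(edges, shuffled, type_a, type_b))
    mean = statistics.fmean(null_counts)
    std = statistics.pstdev(null_counts)
    return (observed - mean) / std if std > 0 else float("nan")


# Toy 4 x 4 grid: left two columns are type "A", right two are type "B".
size = 4
labels = ["A" if col < 2 else "B" for row in range(size) for col in range(size)]
edges = []
for row in range(size):
    for col in range(size):
        node = row * size + col
        if col + 1 < size:
            edges.append((node, node + 1))     # horizontal neighbor
        if row + 1 < size:
            edges.append((node, node + size))  # vertical neighbor

z_ab = enrichment_zscore(edges, labels, "A", "B", n_perms=200, seed=42)
z_aa = enrichment_zscore(edges, labels, "A", "A", n_perms=200, seed=42)
print("A-B depleted:", z_ab < 0)
print("A-A enriched:", z_aa > 0)
```

Fewer permutations give a noisier estimate of the null mean and standard deviation, which is why very small n_perms values can make Z-scores unstable. Fixing the seed makes the shuffles, and therefore the Z-scores, reproducible.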
