CRAWDAD Algorithm

Purpose

Use the CRAWDAD (Cell-type Relationship Analysis Workflow Done Across Distances) R package to analyze cell type interactions in spatial transcriptomics data. The analysis evaluates how cell types co-vary spatially across a range of distance scales and identifies cell type pairs that show significant proximity (co-localization) or avoidance.

Usage

SDAS spatialRelate crawdad -i st.h5ad -o outdir --label_key anno_cell2location \
--spatial_coords_scale 0.5 \
--crawdad_scales_definition '100,1000,10' \
--crawdad_perms 10 \
--crawdad_neighborhood_dist 50 \
--n_cpus 24 \
--seed 42

Input Parameter Description

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -i / --input | Yes | - | Path to the input AnnData h5ad file. Must contain spatial coordinates and cell type annotation. |
| -o / --output | Yes | - | Output directory for all analysis results and plots. Will be created if it does not exist. |
| --label_key | Yes | - | Column name in adata.obs for cell type or region cluster annotation. |
| --spatial_coords_scale | No | 0.5 | Global scaling factor for spatial coordinates. For Stereo-seq data, set to 0.5 to convert to micrometers. |
| --min_cells_per_type | No | 20 | Minimum number of cells (or spots) per type/cluster after filtering. Too small a value may lead to unstable results. |
| --crawdad_neighborhood_dist | No | 50 | Neighborhood distance (usually in micrometers) for defining cell neighborhoods in CRAWDAD. Should match cell size and spatial distribution; typically set to 1-3 times the cell diameter. |
| --crawdad_scales_definition | No | "100,1000,10" | Distance scale definition for CRAWDAD, in the format "start,end,num_points". For example, "50,500,10" means analyze 10 distances from 50 to 500 units. The start value should be greater than or equal to crawdad_neighborhood_dist. |
| --crawdad_perms | No | 10 | Number of permutations for calculating interaction p-values in CRAWDAD. Higher values improve p-value accuracy but increase computation time. |
| --n_cpus | No | 8 | Number of CPU cores for parallel computation. More cores reduce runtime for large datasets. |
| --seed | No | 42 | Random seed for reproducibility. Must be a non-negative integer. |
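
To illustrate the "start,end,num_points" format, the following minimal Python sketch expands a scale definition into individual distances and checks the start value against the neighborhood distance. It assumes the points are linearly spaced with both endpoints included, which is consistent with the "50,500,10" example but is not confirmed by this document. The helper expand_scales is hypothetical and is not part of SDAS or CRAWDAD.

```python
def expand_scales(definition: str, neighborhood_dist: float = 50.0) -> list[float]:
    """Expand a "start,end,num_points" string into a list of distances.

    Assumption: linear spacing with both endpoints included.
    """
    parts = [p.strip() for p in definition.split(",")]
    if len(parts) != 3:
        raise ValueError('expected the format "start,end,num_points"')
    start, end, num_points = float(parts[0]), float(parts[1]), int(parts[2])
    if num_points < 1 or end < start:
        raise ValueError("need num_points >= 1 and end >= start")
    # Documented recommendation: start >= crawdad_neighborhood_dist.
    if start < neighborhood_dist:
        raise ValueError("start should be >= crawdad_neighborhood_dist")
    if num_points == 1:
        return [start]
    step = (end - start) / (num_points - 1)
    return [start + i * step for i in range(num_points)]


# Expand the default definition with the default neighborhood distance.
print(expand_scales("100,1000,10", neighborhood_dist=50))
```

Under this assumption, the default "100,1000,10" yields ten distances in steps of 100 units.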

Output Results Display

| Result File | Description |
| --- | --- |
| <input_name>_crawdad_summary_coloc_dotplot.pdf/png | Summary dot plot showing significant co-localization or avoidance between cell type pairs. |
| <input_name>_crawdad_spatial_cluster_plot.pdf/png | Spatial cluster plot colored by cell type (label_key). |
| <input_name>_crawdad_trend_plots_all.pdf | Interaction trend plots (multi-page PDF), each page for a reference cell type. |
| <input_name>_crawdad_trend_plots_png/ | Individual PNG images for each reference cell type. |
| <input_name>_crawdad_trend_plot_ref_<celltype>.png | Combined trend plot for a single reference cell type. |
| <input_name>_crawdad_melted_results.csv | Main quantitative result: detailed interaction analysis between cell types. |
| <input_name>_crawdad_zsig.csv | Bonferroni-corrected Z-score significance threshold (zsig). |
| <input_name>_crawdad_r_stdout.log | R script standard output log (INFO level, with timestamp). |
| <input_name>_crawdad_r_stderr.log | R script standard error log (WARNING/ERROR level, with timestamp and error details). |
| (<input_name>_processed_for_crawdad.h5ad) | (Intermediate file) Preprocessed AnnData file for R script. |

  • Main quantitative result table: <input_name>_crawdad_melted_results.csv holds the detailed interaction results between cell types. Each row gives the Z-score (Z) for one permutation (perm), neighbor cell type (neighbor), distance scale (scale), and reference cell type (reference).

| perm | neighbor | Z | scale | reference |
| --- | --- | --- | --- | --- |
| 1 | B | 1.05670532268942 | 100 | B |
| 2 | B | -1.45780531937283 | 100 | B |
| 3 | B | -2.7310679139005 | 100 | B |
| 4 | B | -1.76600222336501 | 100 | B |
| 5 | B | -0.122214856981382 | 100 | B |
| 6 | B | -0.215950198738648 | 100 | B |
| 7 | B | -1.34972501073846 | 100 | B |
| 8 | B | -1.38061633431226 | 100 | B |
| 9 | B | -1.56577575976306 | 100 | B |
| 10 | B | -1.71982948316056 | 100 | B |
| 1 | Endo | 0.586389142807096 | 100 | B |
| 2 | Endo | 0.953148492138181 | 100 | B |
| 3 | Endo | 1.18876625358954 | 100 | B |
| 4 | Endo | 0.71926735855275 | 100 | B |
| 5 | Endo | 0.752573768964804 | 100 | B |
| 6 | Endo | 1.32419648335896 | 100 | B |
| 7 | Endo | 1.46020965216509 | 100 | B |
| 8 | Endo | 0.819291421388121 | 100 | B |
| 9 | Endo | 1.87180663929578 | 100 | B |
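
Because the table contains one Z-score per permutation, it can help to summarize each (reference, neighbor, scale) combination before reading it. The sketch below averages Z across permutations using only the Python standard library. It assumes the CSV is comma-separated with the header names shown in the table above, and averaging is an illustrative choice rather than a documented SDAS step. The sample rows are copied from the first permutations in the table.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# A few rows copied from the example table; a real run would open
# <input_name>_crawdad_melted_results.csv instead of this string.
SAMPLE_CSV = """perm,neighbor,Z,scale,reference
1,B,1.05670532268942,100,B
2,B,-1.45780531937283,100,B
3,B,-2.7310679139005,100,B
1,Endo,0.586389142807096,100,B
2,Endo,0.953148492138181,100,B
3,Endo,1.18876625358954,100,B
"""


def mean_z_by_pair(handle):
    """Return {(reference, neighbor, scale): mean Z across permutations}."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (row["reference"], row["neighbor"], float(row["scale"]))
        groups[key].append(float(row["Z"]))
    return {key: mean(values) for key, values in groups.items()}


for key, z in mean_z_by_pair(io.StringIO(SAMPLE_CSV)).items():
    print(key, round(z, 3))
```

To apply this to real output, pass an open file handle for the melted results CSV in place of the in-memory sample.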

  • Summary dot plot: <input_name>_crawdad_summary_coloc_dotplot.png shows significant co-localization or avoidance between cell type pairs.

  • Spatial cluster plot: <input_name>_crawdad_spatial_cluster_plot.png shows spatial arrangement of cell types colored by type.

  • Interaction trend plot: <input_name>_crawdad_trend_plot_ref_B.png shows spatial distribution and interaction strength of B cells with other cell types as distance changes. In the trend plot, neighbor cell types that are determined to have significant interactions with the reference cell type based on zsig are colored, while other cell types are displayed in light gray. In the spatial distribution plot, cell types without significant interactions are not displayed.

Result Interpretation

  • Z-score: A positive value indicates co-localization (the cell types tend to appear in the same region); a negative value indicates avoidance (they tend to appear in different regions). The larger the absolute value, the stronger the evidence.

  • Distance scale: Each scale value is one analysis distance derived from crawdad_scales_definition. Smaller scales resolve local relationships, while larger scales capture broader tissue-level patterns.

  • Significance threshold: zsig is the Bonferroni-corrected Z-score threshold reported in <input_name>_crawdad_zsig.csv. A Z-score whose absolute value exceeds zsig is considered significant.
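
The interpretation rules above can be sketched in Python as follows. The threshold function shows the general idea of a two-sided, Bonferroni-corrected Z threshold. The significance level of 0.05 and the number of tests (here, all ordered cell type pairs for a hypothetical 5 cell types) are assumptions, not values confirmed by this document. In practice, use the zsig value written to <input_name>_crawdad_zsig.csv.

```python
from statistics import NormalDist


def bonferroni_z_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Two-sided Z threshold after Bonferroni correction over n_tests tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be >= 1")
    corrected_alpha = alpha / n_tests  # Bonferroni: divide alpha by the test count
    return NormalDist().inv_cdf(1.0 - corrected_alpha / 2.0)


def classify(z: float, zsig: float) -> str:
    """Label a Z-score using the sign and the |Z| > zsig rule."""
    if z > zsig:
        return "co-localization"
    if z < -zsig:
        return "avoidance"
    return "not significant"


n_cell_types = 5  # hypothetical number of cell types
zsig = bonferroni_z_threshold(n_cell_types * n_cell_types)
print(round(zsig, 2), classify(-1.13, zsig), classify(4.2, zsig))
```

Note that a Z-score can look large in isolation and still fall below the corrected threshold once many cell type pairs are tested.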

Parameter Tuning Suggestions

  • If there are few cell types or the data is sparse, increase crawdad_perms for more stable results, or adjust crawdad_scales_definition to fit the characteristics of the data.

  • crawdad_neighborhood_dist: Usually set to 1-3 times the cell diameter (e.g., 20-60 μm for a cell diameter of 20 μm). Should match cell size and spatial distribution.

  • crawdad_perms: Number of permutations. A higher value improves accuracy but also increases computation time.

  • crawdad_scales_definition: Distance scale definition. Adjust it based on cell size and spatial distribution. The start value should be greater than or equal to crawdad_neighborhood_dist.

  • spatial_coords_scale: For Stereo-seq data, set to 0.5 to convert coordinates to micrometers. This is recommended so that distances in the results can be interpreted in micrometers.
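
The suggestions above can be turned into a quick pre-run check. The following Python sketch is illustrative only: scale_coords and tuning_warnings are hypothetical helpers, not part of SDAS, and the cell diameter is a value you would supply from your own knowledge of the tissue.

```python
def scale_coords(coords, factor=0.5):
    """Multiply raw (x, y) coordinates by spatial_coords_scale."""
    return [(x * factor, y * factor) for x, y in coords]


def tuning_warnings(cell_diameter_um, neighborhood_dist, scales_definition):
    """Return a list of warnings based on the tuning suggestions."""
    warnings = []
    low, high = cell_diameter_um, 3 * cell_diameter_um
    # Suggestion: neighborhood distance of 1-3 times the cell diameter.
    if not low <= neighborhood_dist <= high:
        warnings.append(
            f"crawdad_neighborhood_dist={neighborhood_dist} is outside "
            f"the suggested range {low}-{high}"
        )
    # Suggestion: scale start >= neighborhood distance.
    start = float(scales_definition.split(",")[0])
    if start < neighborhood_dist:
        warnings.append(
            f"scale start {start} is below crawdad_neighborhood_dist={neighborhood_dist}"
        )
    return warnings


print(scale_coords([(1000, 2000), (1040, 2100)]))
print(tuning_warnings(20, 50, "100,1000,10"))
```

An empty list means the chosen values agree with the suggestions. Any warnings are hints to review the parameters, not errors.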
