
# Squidpy Algorithm

## Purpose

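Run Squidpy-based spatial analysis on annotated spatial omics data. The module covers three analyses: neighborhood enrichment (which cell types are neighbors more or less often than expected by chance), co-occurrence analysis (how cell type pairs co-occur as distance changes), and Ripley's statistics (whether each cell type is clustered, randomly distributed, or dispersed).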

## Usage

```bash
SDAS spatialRelate squidpy -i st.h5ad -o outdir --label_key anno_cell2location \
--spatial_coords_scale 0.5 \
--bin_size 50.0 \
--n_neighs 9 \
--coord_type grid \
--n_perms_enrichment 1000 \
--cooccurrence_interval '100,1000,10' \
--n_simulations_ripley 100 \
--ripley_modes 'F,G,L' \
--n_cpus 24 \
--seed 42
```

## Input Parameter Description

| Parameter                | Required | Default       | Description                                                                                                                                                                          |
| ------------------------ | -------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **-i / --input**         | **Yes**  |               | Path to the input AnnData h5ad file. The file must contain spatial coordinates (e.g., in `adata.obsm['spatial']`) and cell type or region cluster annotation (e.g., in `adata.obs`). |
| **-o / --output**        | **Yes**  |               | Output directory for all analysis results and plots. Will be created if it does not exist.                                                                                           |
| **--label\_key**         | **Yes**  |               | Column name in `adata.obs` used for cell type or region cluster annotation.                                                                                                          |
| --spatial\_coords\_scale | No       | 0.5           | Global scaling factor applied to spatial coordinates. For example, set to 0.5 to convert pixel coordinates to micrometers for Stereo-seq data.                                       |
| --min\_cells\_per\_type  | No       | 20            | Minimum number of cells (or spots) required for each cell type/cluster to be included in the analysis. Too small a value may lead to unstable results.                               |
| --n\_cpus                | No       | 8             | Number of CPU cores to use for parallel computation. Increasing this value can speed up analysis for large datasets.                                                                 |
| --bin\_size              | No       | 50.0          | Diameter of spots in the original coordinate units. This value, multiplied by `spatial_coords_scale`, determines the point size in spatial scatter plots.                            |
| --n\_neighs              | No       | 9             | Number of neighbors used to construct the spatial graph (`sq.gr.spatial_neighbors`). Affects neighborhood enrichment and co-occurrence analysis.                                     |
| --coord\_type            | No       | grid          | Coordinate type for Squidpy's spatial\_neighbors. Use `'grid'` for regular arrays (faster), `'generic'` for irregular spatial data.                                                  |
| --n\_perms\_enrichment   | No       | 1000          | Number of permutations for neighborhood enrichment analysis. Higher values yield more robust statistics but increase computation time.                                               |
| --cooccurrence\_interval | No       | "100,1000,10" | Distance interval for co-occurrence analysis, in the format `"start,end,num_points"`. For example, `"50,250,10"` means analyze 10 evenly spaced distances from 50 to 250 units.      |
| --n\_simulations\_ripley | No       | 100           | Number of simulations for Ripley's statistics (F, G, L functions). Affects the accuracy of confidence intervals.                                                                     |
| --ripley\_modes          | No       | "F,G,L"       | Comma-separated list of Ripley statistics modes to calculate. Supported values: "F", "G", "L".                                                                                       |
| --seed                   | No       | 42            | Random seed for reproducibility. Must be a non-negative integer.                                                                                                                     |

**Notes:**

* For sparse data or few cell types, consider increasing `n_perms_enrichment` and `n_simulations_ripley` for more stable results.
* Adjust `cooccurrence_interval` and `bin_size` based on cell size and spatial distribution for optimal visualization and analysis.
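
To make the `"start,end,num_points"` format and the point-size rule concrete, here is a minimal sketch in plain Python. The helper names `parse_interval` and `scatter_point_size` are hypothetical and not part of SDAS. The sketch assumes that both endpoints are included in the evenly spaced distances, which matches the 100, 200, 300, ... values in the sample co-occurrence table but is not confirmed by the tool itself.

```python
def parse_interval(spec):
    """Expand "start,end,num_points" into evenly spaced distances.

    Assumption: both endpoints are included (similar to numpy.linspace).
    """
    start_s, end_s, num_s = spec.split(",")
    start, end, num = float(start_s), float(end_s), int(num_s)
    if num < 2 or end <= start:
        raise ValueError(f"invalid interval spec: {spec!r}")
    step = (end - start) / (num - 1)
    return [start + i * step for i in range(num)]


def scatter_point_size(bin_size, spatial_coords_scale):
    """Point size in scaled units: bin_size multiplied by spatial_coords_scale."""
    return bin_size * spatial_coords_scale


# Default settings from the parameter table.
distances = parse_interval("100,1000,10")
print(distances)
print(scatter_point_size(50.0, 0.5))
```

Checking the expanded distances against the size of the tissue before a run is a cheap way to confirm that the interval covers the spatial scale of interest.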

## Output Results Display

| Result File                                           | Description                                                                                                                                                                                                                                                                                             |
| ----------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `<input_name>_squidpy_spatial_scatter.pdf/png`        | Spatial distribution plot colored by --label\_key.                                                                                                                                                                                                                                                      |
| `<input_name>_squidpy_nhood_enrichment_zscores.csv`   | Z-score matrix of neighborhood enrichment analysis.                                                                                                                                                                                                                                                     |
| `<input_name>_squidpy_nhood_enrichment.pdf/png`       | Heatmap visualization of neighborhood enrichment Z-score matrix.                                                                                                                                                                                                                                        |
| `<input_name>_squidpy_co_occurrence_scores_full.csv`  | Main quantitative result. This file is in long table format, recording the raw co-occurrence score for each cell type pair at each distance bin. Fields include from\_celltype, to\_celltype, distance, distance\_interval\_index, cooccurrence\_score. Suitable for custom analysis and visualization. |
| `<input_name>_squidpy_co_occurrence_all_types.pdf`    | Trends of co-occurrence probability for all cell type pairs as distance changes (multi-page).                                                                                                                                                                                                           |
| `<input_name>_squidpy_co_occurrence_plots_png/`       | Individual PNG plots for each cell type's co-occurrence trend.                                                                                                                                                                                                                                          |
| `<input_name>_squidpy_co_occurrence_<celltype>.png`   | Co-occurrence trend plot for a single cell type.                                                                                                                                                                                                                                                        |
| `<input_name>_squidpy_ripley_<mode>_function.pdf/png` | Ripley's statistics result plots for each specified mode.                                                                                                                                                                                                                                               |
| `<input_name>_squidpy_processed.h5ad`                 | Final AnnData object after Squidpy analysis.                                                                                                                                                                                                                                                            |

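* **Neighborhood enrichment Z-score matrix**: `<input_name>_squidpy_nhood_enrichment_zscores.csv` reports how strongly each pair of cell types is spatially associated. Each Z-score compares the observed number of neighboring pairs with the number expected when labels are randomly permuted: positive values indicate enrichment, negative values indicate depletion.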

|      | B    | Endo | Epi  | Fib  | Mye  | NK   | T    |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| B    | 0.0  | -0.5 | -1.2 | 0.8  | 1.5  | 2.1  | -0.3 |
| Endo | -0.5 | 0.0  | 0.7  | -0.2 | -1.1 | 0.4  | 0.9  |
| Epi  | -1.2 | 0.7  | 0.0  | 1.8  | -0.6 | -0.8 | 0.3  |
| Fib  | 0.8  | -0.2 | 1.8  | 0.0  | 0.5  | -0.1 | 1.2  |
| Mye  | 1.5  | -1.1 | -0.6 | 0.5  | 0.0  | 0.7  | -0.4 |
| NK   | 2.1  | 0.4  | -0.8 | -0.1 | 0.7  | 0.0  | 1.6  |
| T    | -0.3 | 0.9  | 0.3  | 1.2  | -0.4 | 1.6  | 0.0  |
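
As an illustration of how this matrix might be post-processed, the sketch below ranks cell type pairs by Z-score using only the standard library. It assumes the CSV stores row labels in the first column and column labels in the header row, and that the matrix is symmetric as in the sample above, so only the upper triangle is read. The inline `SAMPLE` text reuses the sample values; the function name `rank_pairs` is hypothetical.

```python
import csv
import io

# Sample values copied from the table above. In practice, pass an open
# file handle for <input_name>_squidpy_nhood_enrichment_zscores.csv.
SAMPLE = """\
,B,Endo,Epi,Fib,Mye,NK,T
B,0.0,-0.5,-1.2,0.8,1.5,2.1,-0.3
Endo,-0.5,0.0,0.7,-0.2,-1.1,0.4,0.9
Epi,-1.2,0.7,0.0,1.8,-0.6,-0.8,0.3
Fib,0.8,-0.2,1.8,0.0,0.5,-0.1,1.2
Mye,1.5,-1.1,-0.6,0.5,0.0,0.7,-0.4
NK,2.1,0.4,-0.8,-0.1,0.7,0.0,1.6
T,-0.3,0.9,0.3,1.2,-0.4,1.6,0.0
"""


def rank_pairs(handle):
    """Return (zscore, type_a, type_b) tuples sorted from most enriched to most depleted."""
    rows = list(csv.reader(handle))
    columns = rows[0][1:]
    pairs = []
    for i, row in enumerate(rows[1:]):
        # Upper triangle only: assumes a symmetric matrix.
        for j in range(i + 1, len(columns)):
            pairs.append((float(row[1 + j]), row[0], columns[j]))
    return sorted(pairs, reverse=True)


ranked = rank_pairs(io.StringIO(SAMPLE))
for zscore, type_a, type_b in ranked[:3]:
    print(f"{type_a}-{type_b}: {zscore:+.1f}")
```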

* **Co-occurrence raw result table**: `<input_name>_squidpy_co_occurrence_scores_full.csv` records the raw co-occurrence score for each cell type pair at each distance bin, suitable for custom analysis and visualization.

| from\_celltype | to\_celltype | distance | distance\_interval\_index | cooccurrence\_score |
| -------------- | ------------ | -------- | ------------------------- | ------------------- |
| B              | B            | 100.0    | 0                         | 0.5327037           |
| B              | B            | 200.0    | 1                         | 0.5388954           |
| B              | B            | 300.0    | 2                         | 0.47285548          |
| B              | B            | 400.0    | 3                         | 0.44997987          |
| B              | B            | 500.0    | 4                         | 0.41206497          |
| B              | Endo         | 100.0    | 0                         | 0.12482233          |
| B              | Endo         | 200.0    | 1                         | 0.1646443           |
| B              | Endo         | 300.0    | 2                         | 0.15779616          |
| B              | Endo         | 400.0    | 3                         | 0.20279443          |
| B              | Epi          | 100.0    | 0                         | 0.06660331          |

* **Neighborhood enrichment heatmap**: `<input_name>_squidpy_nhood_enrichment.png` visualizes the Z-score matrix; color encodes the strength of spatial proximity between cell types.

<figure><img src="/files/NHzbrkUkCutakdkKNHHJ" alt="" width="375"><figcaption></figcaption></figure>
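
The idea behind a permutation-based enrichment Z-score can be sketched in a few lines. This is a simplified illustration, not Squidpy's implementation: it counts graph edges joining two labels, shuffles the labels to build a null distribution, and standardizes the observed count. The function names and the toy 4 x 4 grid are assumptions made for the example.

```python
import random
from statistics import mean, pstdev


def pair_count(edges, labels, a, b):
    """Count graph edges that join a cell labeled `a` with a cell labeled `b`."""
    return sum(1 for i, j in edges if {labels[i], labels[j]} == {a, b})


def enrichment_zscore(edges, labels, a, b, n_perms=1000, seed=42):
    """Z-score of the observed edge count against label-permuted counts."""
    rng = random.Random(seed)
    observed = pair_count(edges, labels, a, b)
    shuffled = list(labels)
    null_counts = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)
        null_counts.append(pair_count(edges, shuffled, a, b))
    spread = pstdev(null_counts)
    return (observed - mean(null_counts)) / spread if spread > 0 else 0.0


# Toy 4 x 4 grid: the two left columns are "A", the two right columns are "B".
side = 4
labels = ["A" if col < 2 else "B" for row in range(side) for col in range(side)]
edges = []
for row in range(side):
    for col in range(side):
        idx = row * side + col
        if col + 1 < side:
            edges.append((idx, idx + 1))     # right-hand neighbor
        if row + 1 < side:
            edges.append((idx, idx + side))  # neighbor below

z_same = enrichment_zscore(edges, labels, "A", "A")
z_mixed = enrichment_zscore(edges, labels, "A", "B")
print(f"A-A: {z_same:.2f}, A-B: {z_mixed:.2f}")
```

Because the two labels occupy separate halves of the grid, the A-A score comes out positive and the A-B score negative, which mirrors the warm and cool cells in the heatmap.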

* **Spatial scatter plot**: `<input_name>_squidpy_spatial_scatter.png` shows the spatial distribution of cells (or spots), colored by the `--label_key` annotation.

<figure><img src="/files/2N2DUCdsxYoCjc8Aj9FS" alt="" width="375"><figcaption></figcaption></figure>

* **Co-occurrence trend plot**: `<input_name>_squidpy_co_occurrence_B.png` shows how the co-occurrence probability of B cells with each other cell type changes with distance.

<figure><img src="/files/uRO9hbYzHmcO8KiRJznF" alt="" width="375"><figcaption></figcaption></figure>

* **Ripley's F function analysis**: `<input_name>_squidpy_ripley_F_function.png` compares the spatial distribution pattern of each cell type with the random expectation.

<figure><img src="/files/0iIkWiETpLBsA6W8ClYw" alt="" width="375"><figcaption></figcaption></figure>

## Result Interpretation

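* **Neighborhood enrichment Z-score**: A positive value indicates enrichment (the two cell types are neighbors more often than expected by chance); a negative value indicates depletion (they are neighbors less often than expected). The larger the absolute value, the stronger the deviation from a random arrangement.
* **Co-occurrence analysis**: Shows how the co-occurrence score of each cell type pair changes with distance, which helps identify the spatial scale at which cell types tend to appear together.
* **Ripley's statistics**: Evaluate whether each cell type is clustered, randomly distributed, or dispersed. For the L function, a curve above the random expectation indicates aggregation and a curve below it indicates dispersion. The G function reads the same way. For the F (empty-space) function the direction is reversed: clustered points leave larger empty areas, so a curve below the random expectation indicates aggregation.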
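
To illustrate the reading of the L function, the sketch below computes a naive estimate for a uniform and a clustered point pattern. It is a simplified illustration under stated assumptions, not what the tool runs: there is no edge correction, the 100 x 100 study area and the simulated points are invented for the example, and the function name `ripley_l` is hypothetical. Under complete spatial randomness, L(r) is approximately r.

```python
import math
import random


def ripley_l(points, radius, area):
    """Naive Ripley's L: sqrt(K / pi), with K estimated without edge correction."""
    n = len(points)
    close_pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= radius
    )
    k = area * close_pairs / (n * (n - 1))
    return math.sqrt(k / math.pi)


rng = random.Random(42)
area = 100.0 * 100.0

# 200 points placed uniformly at random in a 100 x 100 square.
uniform = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]

# 200 points in 5 tight clumps (standard deviation 3 around each center).
centers = [(rng.uniform(20, 80), rng.uniform(20, 80)) for _ in range(5)]
clustered = [
    (cx + rng.gauss(0, 3), cy + rng.gauss(0, 3))
    for cx, cy in centers
    for _ in range(40)
]

radius = 5.0
l_uniform = ripley_l(uniform, radius, area)
l_clustered = ripley_l(clustered, radius, area)
print(f"r = {radius}: uniform L = {l_uniform:.2f}, clustered L = {l_clustered:.2f}")
```

The uniform pattern stays close to the random expectation, while the clustered pattern lies well above it, which is the "aggregation" case in the interpretation above.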

## Parameter Tuning Suggestions

* If there are few cell types or the data are sparse, increase `n_perms_enrichment` and `n_simulations_ripley` for more stable results.
* `n_perms_enrichment`: Number of permutations for neighborhood enrichment. It can be reduced for large datasets to save time, at the cost of less stable Z-scores.
* `n_simulations_ripley`: Number of simulations for Ripley's statistics. More simulations give more accurate confidence intervals but take longer.
* `cooccurrence_interval`: Distance interval for co-occurrence analysis. Choose the start, end, and number of points so that the distances cover the scale of interest given cell size and spatial distribution.
* `bin_size`: Controls the point size in the scatter plot. Adjust it to suit the data density and visualization needs.

