InferCNV Algorithm

Purpose

The inferCNV tool is used to infer copy number variations (CNVs) from spatial transcriptomics data, helping to reveal genomic variation characteristics in different tissue regions. This tool can also perform CNV inference on scRNA-Seq data.

Usage

  • Starting from the h5ad file produced by cell annotation, convert it to an rds file:

SDAS dataProcess h5ad2rds -i st.h5ad -o outdir

  • Run inferCNV:

SDAS infercnv -i st.rds --h5ad st.h5ad --bin_size 50 --slice_key batch \
-o outdir --label_key anno_cell2location --species human \
--ref_group_names B,T --min_counts_per_cell 200
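
The two steps above can also be assembled from a script. The following Python sketch only builds the command lines from flags documented in this section and prints them; it does not run SDAS. The helper names `build_h5ad2rds_command` and `build_infercnv_command` are hypothetical and not part of SDAS, and the rule that `--h5ad` and `--bin_size` are needed only in stRNA mode is taken from the parameter descriptions.

```python
import shlex


def build_h5ad2rds_command(h5ad_path, out_dir):
    """Return the argument list for the h5ad-to-rds conversion step."""
    return ["SDAS", "dataProcess", "h5ad2rds", "-i", h5ad_path, "-o", out_dir]


def build_infercnv_command(rds_path, out_dir, label_key, run_mode="stRNA",
                           h5ad_path=None, bin_size=None, **options):
    """Return the argument list for `SDAS infercnv`.

    Hypothetical helper: it only arranges flags documented in this section.
    """
    cmd = ["SDAS", "infercnv", "--run_mode", run_mode,
           "-i", rds_path, "-o", out_dir, "--label_key", label_key]
    if run_mode == "stRNA":
        # --h5ad and --bin_size are required in stRNA mode only.
        if h5ad_path is None or bin_size is None:
            raise ValueError("stRNA mode needs both h5ad_path and bin_size")
        cmd += ["--h5ad", h5ad_path, "--bin_size", str(bin_size)]
    # Remaining keyword arguments map to --<name> <value> pairs.
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


convert_cmd = build_h5ad2rds_command("st.h5ad", "outdir")
infercnv_cmd = build_infercnv_command(
    "st.rds", "outdir", "anno_cell2location",
    h5ad_path="st.h5ad", bin_size=50,
    slice_key="batch", species="human",
    ref_group_names="B,T", min_counts_per_cell=200,
)
print(shlex.join(convert_cmd))
print(shlex.join(infercnv_cmd))
```

To actually execute a command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where SDAS is installed.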

Input Parameter Description

| Parameter | Required | Default Value | Description |
|---|---|---|---|
| --run_mode | No | stRNA | Choose spatial transcriptomics (stRNA) or single-cell (scRNA) mode |
| -i / --input | Yes | | Input Stereo-seq or scRNA data in Seurat rds format; must contain the raw expression matrix |
| -o / --output | Yes | | Output directory containing all results |
| --h5ad | Yes | | h5ad file (e.g., sample.h5ad) used for the spatial heatmap; not required in scRNA mode |
| --bin_size | Yes | | Bin size; controls the spot size in the spatial heatmap (e.g., 20, 50, 100); not required in scRNA mode |
| --label_key | Yes | | Annotation field in the rds metadata containing cell type or grouping information |
| --ref_group_names | No | | Reference group names specifying the normal cell/sample groups; if not set, all cells are used as the reference (not recommended) |
| --gene_order_file | No | | Tab-delimited file giving the position of each gene along each chromosome in the genome |
| --cluster_heatmap | No | False | Whether to cluster the CNV heatmap (True or False) |
| --species | No | human | Species whose gene locations are used (human or mouse). Ignored when --gene_order_file is set |
| --slice_key | No | sampleID | Column name in h5ad.obs identifying the slice when there are multiple slices |
| --gene_symbol_key | No | real_gene_name | Key of the gene symbol in meta.data. If set to '_index', the rownames of the rds file are treated as gene symbols |
| --assay | No | | Assay name in the rds used for CNV calculation; if not set, the default assay is used |
| --cutoff | No | 0.02 | Cut-off for the minimum average read counts per gene among reference cells/bins |
| --min_counts_per_cell | No | 100 | Minimum counts allowed per cell/bin |
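
To make the two filtering thresholds concrete, here is a minimal pure-Python sketch of what `--cutoff` and `--min_counts_per_cell` describe, applied to a toy count matrix. It illustrates the parameter descriptions only and is not SDAS's actual implementation: the function name, the toy data, the order of the two steps, and the choice to keep values exactly equal to a threshold are all assumptions.

```python
def filter_counts(counts, is_reference, cutoff=0.02, min_counts_per_cell=100):
    """Apply the two thresholds to a genes x cells count matrix.

    counts: dict mapping gene name -> list of raw counts, one per cell/bin.
    is_reference: list of bools marking reference cells/bins.
    Returns (kept_genes, kept_cell_indices).
    """
    n_cells = len(is_reference)
    # Drop cells/bins whose total counts fall below min_counts_per_cell.
    totals = [sum(row[j] for row in counts.values()) for j in range(n_cells)]
    kept_cells = [j for j in range(n_cells) if totals[j] >= min_counts_per_cell]
    # Average each gene over the remaining reference cells/bins only.
    ref_cells = [j for j in kept_cells if is_reference[j]]
    if not ref_cells:
        raise ValueError("no reference cells/bins left after filtering")
    kept_genes = []
    for gene, row in counts.items():
        mean_ref = sum(row[j] for j in ref_cells) / len(ref_cells)
        if mean_ref >= cutoff:
            kept_genes.append(gene)
    return kept_genes, kept_cells


# Toy data (assumed): 3 genes x 4 cells, the first two cells are reference.
toy_counts = {
    "GENE_A": [120, 90, 5, 150],
    "GENE_B": [0, 0, 1, 40],
    "GENE_C": [30, 20, 2, 10],
}
toy_is_reference = [True, True, False, False]
genes, cells = filter_counts(toy_counts, toy_is_reference)
print(genes, cells)
```

In this toy case the third cell is dropped because its total count is below 100, and GENE_B is dropped because its average over the reference cells is below 0.02.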

Output Results

| Output File | Description |
|---|---|
| <input_name>_run.final.infercnv_obj.rds | rds object containing the CNV matrix for all genes and spots |
| <input_name>_CNV_score.csv | CNV score for each spot |
| <input_name>_CNV_ref.png/pdf | CNV expression heatmap for reference cells (not output if ref_group_names is None) |
| <input_name>_CNV_obs.png/pdf | CNV expression heatmap for observed cells |
| <input_name>_CNV_score.png/pdf | Spatial heatmap of CNV score (one per slice for multiple slices, not output in scRNA mode) |

  • CNV Expression Heatmap for Reference Cells: <input_name>_CNV_ref.png/pdf. X-axis: spot, Y-axis: gene, color: CNV intensity

  • CNV Expression Heatmap for Observed Cells: <input_name>_CNV_obs.png/pdf. X-axis: spot, Y-axis: gene, color: CNV intensity

    • --cluster_heatmap False: Spots are not clustered

    • --cluster_heatmap True: Spots are clustered

  • Spatial Heatmap of CNV Score: <input_name>_CNV_score.png/pdf. Color indicates CNV intensity

  • CNV Score Text File: <input_name>_CNV_score.csv. Higher values indicate stronger CNV intensity

| spot_id | CNV_score |
|---|---|
| 429496737600_D03663C6 | 0.0018 |
| 429496737700_D03663C6 | 0.0015 |
| 429496737800_D03663C6 | 0.0031 |
| ... | ... |
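
As a sketch of downstream use, the following Python snippet reads a CNV score table and ranks spots from highest to lowest score. It assumes the CSV is comma-delimited with the header `spot_id,CNV_score` as shown above; the inline sample reuses the three rows listed in the table, and the function name `rank_spots_by_cnv` is hypothetical.

```python
import csv
import io


def rank_spots_by_cnv(handle):
    """Return (spot_id, score) pairs sorted from highest to lowest CNV score."""
    reader = csv.DictReader(handle)
    rows = [(row["spot_id"], float(row["CNV_score"])) for row in reader]
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


# Inline sample built from the rows shown in the table above.
sample = io.StringIO(
    "spot_id,CNV_score\n"
    "429496737600_D03663C6,0.0018\n"
    "429496737700_D03663C6,0.0015\n"
    "429496737800_D03663C6,0.0031\n"
)
ranked = rank_spots_by_cnv(sample)
for spot_id, score in ranked:
    print(spot_id, score)
```

For a real run, an open file handle for <input_name>_CNV_score.csv can be passed in place of the `io.StringIO` sample.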
