hdWGCNA Algorithm
Purpose
Use the hdWGCNA algorithm to identify spatially co-expressed gene sets (modules).
Usage
SDAS coexpress hdwgcna -i st.h5ad -o outdir --bin_size 100 \
--input_layer raw_counts \
--selected_genes top5000 \
--moran_path ./moran.csv \
--n_cpus 8 \
--seed 42 \
--knn_neighbors 50 \
--max_shared_cells 15 \
--soft_power 8
Input Parameter Description
| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| -i / --input | Yes | | Stereo-seq h5ad file; must contain the raw expression matrix |
| -o / --output | Yes | | Output directory |
| --bin_size | Yes | 50 | Bin size (resolution): 20, 50, 100, 200, or cellbin; must be consistent with the input h5ad |
| --layer | No | | Layer of the h5ad that holds the raw expression matrix (e.g. layers['raw_counts']) |
| --selected_genes | No | top5000 | Gene selection mode: full (all genes) or topn (the top n genes ranked by Moran's I) |
| --moran_path | No | | Path to a precomputed Moran's I csv file |
| --n_cpus | No | 8 | Number of parallel jobs, for a speedup on multi-core machines |
| --seed | No | 42 | Random seed |
| --knn_neighbors | No | 50 | Metacell construction: number of neighboring cells covered by the KNN algorithm |
| --max_shared_cells | No | 15 | Metacell construction: maximum number of cells shared between two metacells |
| --soft_power | No | | Soft power used in network construction. By default, the lowest soft_power with a scale-free topology model fit of 0.8 is selected automatically |
Output Results Display
| File | Description |
| --- | --- |
| <input_name>_hdwgcna.module.csv | Result csv mapping each spatially highly variable gene (gene symbol + gene ID) to its co-expression gene set (module) |
| <input_name>_hdwgcna.module_score.csv | Result csv of gene set scores for the co-expression gene sets |
| <input_name>_hdwgcna.coexpress.rds | rds file containing the co-expression gene set results |
| <input_name>_hdwgcna.module_score.png/pdf | Spatial heatmap of gene set scores for the co-expression gene sets |
| <input_name>_hdwgcna.all_coex_dendrogram.png/pdf | Dendrogram of similarity between co-expression gene sets |
| <input_name>_hdwgcna.softpowers.png/pdf | Bar chart of soft_power values for network construction |
| <input_name>_hdwgcna.moran.csv | Written when topn is used: Moran's I and P values for all genes |
Result csv of co-expression gene sets: <input_name>_hdwgcna.module.csv (comma-separated). It lists the spatially highly variable genes that hdWGCNA identified and the co-expression gene set (module) each gene belongs to. kME measures how strongly a gene's expression pattern correlates with the module eigengene (ME) of the module it belongs to. The closer the kME value is to 1 or -1, the more likely the gene is a hub gene.
Example rows (gene symbol, gene ID, module, module color, then the kME values):
A2M,ENSG00000175899,Module1,green,0.47946868988301,-0.107096403482606,-0.178114022165641,0.0676792398874597,0.095966109797419,-0.0907050325056857,-0.0529390531160642,-0.150612945887371,0.0878907827651177,0.0249952108382643
A2M-AS1,ENSG00000237094,Module1,green,0.54370397007705,-0.150011910577089,-0.254597937099371,0.0926882061841318,0.140032173496191,-0.115227951266487,-0.101675353602963,-0.222107282189061,0.0803636102659976,0.0426306888623326
A2ML1,ENSG00000166535,Module2,yellow,0.0404144692736028,0.479908573141937,0.194701680726881,-0.327610748128114,0.0430624759042059,0.429681007497005,-0.342984504779987,0.145625804577339,-0.386999928188458,0.08281144751312791
A2MP1,ENSG00000256069,grey,grey,-0.046660656715667,0.20294339804614,0.284819067476003,-0.0506850476403686,-0.205976941174478,0.244779685854094,0.000250607520833238,0.170101997387916,-0.0177549796818324,0.0639042087827032
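To make the kME interpretation concrete, here is a minimal Python sketch that ranks hub-gene candidates from rows shaped like the sample above. It is an illustration only, not part of SDAS. It assumes the column order shown in the sample (gene symbol, gene ID, module, color, then kME values) and no header row; the real file may differ. Because the mapping from kME columns to modules is not stated here, the sketch uses each gene's largest absolute kME as a proxy for its own-module kME. The function name `rank_hub_candidates` is hypothetical.

```python
import csv
import io

# Rows copied from the sample table; a real file may include a header row.
SAMPLE_MODULE_CSV = """\
A2M,ENSG00000175899,Module1,green,0.47946868988301,-0.107096403482606,-0.178114022165641,0.0676792398874597,0.095966109797419,-0.0907050325056857,-0.0529390531160642,-0.150612945887371,0.0878907827651177,0.0249952108382643
A2M-AS1,ENSG00000237094,Module1,green,0.54370397007705,-0.150011910577089,-0.254597937099371,0.0926882061841318,0.140032173496191,-0.115227951266487,-0.101675353602963,-0.222107282189061,0.0803636102659976,0.0426306888623326
A2ML1,ENSG00000166535,Module2,yellow,0.0404144692736028,0.479908573141937,0.194701680726881,-0.327610748128114,0.0430624759042059,0.429681007497005,-0.342984504779987,0.145625804577339,-0.386999928188458,0.08281144751312791
A2MP1,ENSG00000256069,grey,grey,-0.046660656715667,0.20294339804614,0.284819067476003,-0.0506850476403686,-0.205976941174478,0.244779685854094,0.000250607520833238,0.170101997387916,-0.0177549796818324,0.0639042087827032
"""


def rank_hub_candidates(csv_text):
    """Return (gene, module, strongest |kME|) tuples, strongest first.

    Genes in the grey module are skipped because they are not assigned
    to any co-expression gene set.
    """
    ranked = []
    for row in csv.reader(io.StringIO(csv_text)):
        gene, _gene_id, module, _color = row[:4]
        if module == "grey":
            continue
        # Assumption: every remaining column is a kME value.
        strongest = max(abs(float(value)) for value in row[4:])
        ranked.append((gene, module, strongest))
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked


if __name__ == "__main__":
    for gene, module, kme in rank_hub_candidates(SAMPLE_MODULE_CSV):
        print(f"{gene}\t{module}\t{kme:.3f}")
```

With only four sample rows the ranking is not biologically meaningful; on a full result file the same approach gives a quick shortlist of hub-gene candidates per module.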
Result csv of gene set scoring for co-expression gene sets: <input_name>_hdwgcna.module_score.csv (comma-separated). Each value is the score of one co-expression gene set (module); high scores indicate high expression of the module and low scores indicate low expression.
Example rows (bin ID, then the score of each module):
2200_16100,-3.23688863476392,-4.34756288337066,-2.3278151796256,-8.21694142422341,-14.8112682710791,-9.12253218247156,-10.174563894144,-3.09447240000024,0.481660736850741,3.91787079378259
2200_17200,5.77873502485046,0.783016254503074,1.06582091429724,-6.03050203635639,-3.71256039305597,-0.825856084852031,-3.67468239887104,-2.09159016878048,-2.639251117267012,5.41583186417414
2300_16700,7.90521666109811,2.93759207152763,-0.391450035802177,-3.02639637030598,1.63013439679168,1.66371621513915,-1.51360146647437,-0.8975499248414,-4.66703690157902,1.40723191567521
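As an illustration of how the score table can be read, the following Python sketch finds the highest-scoring module column for each bin. It is a sketch under assumptions rather than SDAS functionality: it assumes no header row, that the first field is a bin ID of the form `x_y`, and that every remaining field is one module's score. Column positions are reported as 1-based indices because the module names of the columns are not shown in the sample. The function name `top_module_per_bin` is hypothetical.

```python
import csv
import io

# Rows copied from the sample table; a real file may include a header row.
SAMPLE_SCORE_CSV = """\
2200_16100,-3.23688863476392,-4.34756288337066,-2.3278151796256,-8.21694142422341,-14.8112682710791,-9.12253218247156,-10.174563894144,-3.09447240000024,0.481660736850741,3.91787079378259
2200_17200,5.77873502485046,0.783016254503074,1.06582091429724,-6.03050203635639,-3.71256039305597,-0.825856084852031,-3.67468239887104,-2.09159016878048,-2.639251117267012,5.41583186417414
2300_16700,7.90521666109811,2.93759207152763,-0.391450035802177,-3.02639637030598,1.63013439679168,1.66371621513915,-1.51360146647437,-0.8975499248414,-4.66703690157902,1.40723191567521
"""


def top_module_per_bin(csv_text):
    """Map (x, y) -> (1-based score column, score) for the highest score."""
    best_by_bin = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumption: the bin ID encodes the coordinates as "x_y".
        x, y = (int(part) for part in row[0].split("_"))
        scores = [float(value) for value in row[1:]]
        best_index = max(range(len(scores)), key=scores.__getitem__)
        best_by_bin[(x, y)] = (best_index + 1, scores[best_index])
    return best_by_bin


if __name__ == "__main__":
    for (x, y), (column, score) in top_module_per_bin(SAMPLE_SCORE_CSV).items():
        print(f"bin ({x}, {y}): score column {column}, score {score:.2f}")
```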
Spatial heatmap of gene set scoring for co-expression gene sets (<input_name>_hdwgcna.module_score.png/pdf): visualizes the spatial distribution patterns of all co-expression gene sets (modules). The color intensity in the figure indicates the expression level of the co-expression gene set.

Bar chart of soft_power values for network construction (<input_name>_hdwgcna.softpowers.png/pdf): shows the effect of different soft_power values on network construction. By default, the lowest soft_power whose scale-free topology model fit reaches 0.8 is selected automatically.
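The default selection rule can be sketched in a few lines of Python. This only illustrates the rule as described (lowest soft_power whose scale-free topology fit reaches 0.8) and is not the SDAS or hdWGCNA implementation. The fit values below are made up for the example, and the function name `pick_soft_power` is hypothetical.

```python
def pick_soft_power(fit_by_power, threshold=0.8):
    """Return the lowest soft power whose fit is >= threshold, else None."""
    eligible = [power for power, fit in fit_by_power.items() if fit >= threshold]
    return min(eligible) if eligible else None


# Hypothetical scale-free topology fit values, keyed by soft power.
EXAMPLE_FITS = {1: 0.12, 2: 0.35, 4: 0.61, 6: 0.78, 8: 0.83, 10: 0.88, 12: 0.86}

if __name__ == "__main__":
    print(pick_soft_power(EXAMPLE_FITS))        # default threshold of 0.8
    print(pick_soft_power(EXAMPLE_FITS, 0.75))  # a lowered threshold
```

Lowering the threshold in this sketch selects a smaller soft power, which mirrors what passing a smaller `--soft_power` value does when the automatic choice yields too few modules.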

Dendrogram of similarity between co-expression gene sets (<input_name>_hdwgcna.all_coex_dendrogram.png/pdf): shows the hierarchical clustering dendrogram of similarity between the co-expression gene sets (modules).

Result Interpretation
The co-expression gene sets are numbered starting from Module1. Grey marks genes that do not meet the clustering requirements of any co-expression gene set.
Parameter Tuning Suggestions
If the number of genes in bin20/50 samples is below 200, or for other special samples, and few spatial co-expression gene sets are identified, try lowering the threshold by setting soft_power manually, guided by the soft_power test chart (<input_name>_hdwgcna.softpowers.png/pdf). You can also customize the knn_neighbors and max_shared_cells parameters to obtain more interpretable results.