Survival Analysis Module
Purpose
This module uses IOBR, survival, survminer, and other R packages to run univariate survival analysis on immune infiltration or gene set scoring results combined with survival information, and outputs standardized survival curve plots.
Input File Examples
input
Feature scoring/immune infiltration result file: each row is a sample, identified by its sample name, and each column is a feature such as an immune cell fraction or a gene set score. The file is tab-separated.
Sample1    0.123    0.456
Sample2    0.234    0.567
Sample3    0.345    0.678
clinical
Survival information file: each row is a sample, identified by its sample name, and the columns hold survival time, survival status, and other clinical features. The file is tab-separated.
Sample1    1000    1    800     0
Sample2    800     0    600     0
Sample3    1200    1    1000    1
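Because both files are keyed by sample name, the analysis can only use samples that appear in both. The following is a minimal Python sketch of that join, not the module's own code. The header names (`ID`, `Macrophages_M2_CIBERSORT`, `OS.time`, `OS.status`) are assumptions added for illustration, since the example rows above are shown without a header line.

```python
import csv
import io

# Hypothetical file contents. The header row is an assumption; the values
# reuse the first columns of the example rows above.
SCORES_TSV = (
    "ID\tMacrophages_M2_CIBERSORT\n"
    "Sample1\t0.123\nSample2\t0.234\nSample3\t0.345\n"
)
CLINICAL_TSV = (
    "ID\tOS.time\tOS.status\n"
    "Sample1\t1000\t1\nSample2\t800\t0\nSample3\t1200\t1\n"
)

def read_tsv(text):
    """Parse tab-separated text into {sample_name: {column: value}}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    id_col = reader.fieldnames[0]  # first column holds the sample name
    return {row[id_col]: row for row in reader}

def merge_by_sample(scores, clinical):
    """Keep only samples present in both tables and combine their columns."""
    shared = sorted(set(scores) & set(clinical))
    return {s: {**scores[s], **clinical[s]} for s in shared}

merged = merge_by_sample(read_tsv(SCORES_TSV), read_tsv(CLINICAL_TSV))
```

Values stay as strings here; convert them to numbers before any calculation.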
Running Method
SDAS bulkValidate survivalKM --input tme_combine.txt --clinical survival.txt --signature Macrophages_M2_CIBERSORT --project_name survival --time OS.time --status OS.status --time_type day --output survival_output
Input Parameter Description
Parameter         Required    Default      Description
--input           Yes         -            Immune infiltration/scoring result file path
--clinical        Yes         -            Survival information file path
--signature       Yes         -            Feature column name used for survival analysis
--output          Yes         -            Output directory path
--project_name    No          test         Project name (used for output file naming etc.)
--time            No          OS.time      Survival time column name
--status          No          OS.status    Survival status column name (0 = survival/no recurrence, 1 = death/recurrence)
--time_type       No          day          Time unit, default day
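Before running the command, it can help to confirm that the columns named by --signature, --time, and --status exist and that the status column uses the 0/1 coding described above. The following is a hypothetical pre-flight check in Python; `check_inputs` is not part of SDAS, and the row layout (`{sample_name: {column: string_value}}`) is an assumption.

```python
def check_inputs(rows, signature, time_col="OS.time", status_col="OS.status"):
    """Return a list of human-readable problems found in the sample rows.

    rows: {sample_name: {column_name: string_value}} (assumed layout)
    The defaults mirror the documented defaults of --time and --status.
    """
    problems = []
    for sample, row in rows.items():
        # Every sample needs the feature, time, and status columns.
        for col in (signature, time_col, status_col):
            if col not in row:
                problems.append(f"{sample}: missing column {col}")
        # Status must follow the documented 0/1 coding.
        if status_col in row and row[status_col] not in ("0", "1"):
            problems.append(f"{sample}: {status_col} must be 0 or 1")
        # Survival time must be a non-negative number.
        if time_col in row:
            try:
                if float(row[time_col]) < 0:
                    problems.append(f"{sample}: {time_col} is negative")
            except ValueError:
                problems.append(f"{sample}: {time_col} is not numeric")
    return problems

example_rows = {
    "Sample1": {
        "Macrophages_M2_CIBERSORT": "0.123",
        "OS.time": "1000",
        "OS.status": "1",
    }
}
issues = check_inputs(example_rows, "Macrophages_M2_CIBERSORT")
```

An empty list means the rows passed all three checks.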
Output Results Display
survival.png/pdf    Survival curve plot
Survival curve plot:
survival.png/pdf
Shows Kaplan-Meier survival curves for samples grouped by the specified feature, so that survival differences between the high and low groups can be compared.

Differences in grouping methods:
Best cutoff grouping: Find, by a statistical test, the cutoff value that maximizes the survival difference between the two groups. This usually gives the clearest separation between high-risk and low-risk groups (left figure). Because the cutoff is chosen from the data itself, the resulting P-value tends to be optimistic.
Mean grouping: Use the mean score of all samples as the cutoff. This works well when the score distribution is roughly symmetric, but can be less sensitive when it is skewed (middle figure).
Tertile grouping: Divide samples into three equal-sized groups (low, medium, high) by score. The difference between the high and low groups may be less significant than with the previous two methods (right figure), but this approach reduces the influence of extreme values and shows how the middle group behaves.
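The three grouping strategies can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the module's implementation: the text does not say which statistic the module maximizes, so the best-cutoff search here simply scans every observed score and keeps the one with the largest two-group log-rank chi-square. All function names and the sample data are hypothetical.

```python
import math

def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        risk = [i for i in range(len(times)) if times[i] >= t]
        n = len(risk)
        n_high = sum(1 for i in risk if in_high[i])
        dead = [i for i in risk if times[i] == t and events[i] == 1]
        d = len(dead)
        d_high = sum(1 for i in dead if in_high[i])
        obs_minus_exp += d_high - d * n_high / n  # observed minus expected
        if n > 1:  # hypergeometric variance of d_high
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0

def logrank_p(chi2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))

def split_by_mean(scores):
    mean = sum(scores) / len(scores)
    return ["high" if s > mean else "low" for s in scores]

def split_by_tertile(scores):
    """Rank-based thirds; tied scores may land in different groups."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [None] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[rank * 3 // len(scores)]
    return labels

def best_cutoff(scores, times, events):
    """Scan every observed score as a cutoff; keep the largest statistic."""
    best = (None, -1.0)
    for c in sorted(set(scores))[:-1]:  # skip the max so "high" is non-empty
        stat = logrank_chi2(times, events, [s > c for s in scores])
        if stat > best[1]:
            best = (c, stat)
    return best

scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]  # made-up feature scores
times = [10, 9, 8, 3, 2, 1]              # made-up follow-up times
events = [1, 1, 1, 1, 1, 1]              # 1 = event observed
cutoff, chi2 = best_cutoff(scores, times, events)
```

Samples with a score above the cutoff form the high group. Since the mean split is one of the partitions the scan considers, the best-cutoff statistic is never smaller than the mean-split statistic on the same data.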
Meaning of statistical indicators:
P-value: The significance of the survival difference between groups. The smaller the P-value, the less likely it is that the observed difference arose by chance.
Hazard Ratio (HR): The ratio of the risk of the endpoint event (e.g., death) in the high group to that in the low group. HR > 1 means the high group is at higher risk; HR < 1 means it is at lower risk.
95% CI (Confidence Interval): The 95% confidence interval of the HR, reflecting the precision of the estimate. If the interval does not contain 1 (e.g., 1.41-2.64 in the left figure), the HR differs significantly from 1. The narrower the interval, the more precise the estimate.
Cutoff value: The boundary point used in the best cutoff grouping method.
Survival curves: Show how survival probability changes over time for the high group (orange) and the low group (blue). The wider the separation between the curves, the greater the difference between groups.
Risk table (Number at risk): Displayed below the time axis, it lists the number of samples still at risk in each group at each time point (i.e., those that have neither experienced the endpoint event nor been censored before that time point). It helps assess how each group's sample size changes over time.
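To make the survival curve and the risk table concrete, the sketch below computes a Kaplan-Meier estimate and number-at-risk counts in plain Python. It is a simplified illustration, not the code path of the R packages the module relies on. The function names are hypothetical, and the data reuse the first time/status pair of the three example samples above.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events: 1 = endpoint event observed, 0 = censored.
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1 - deaths / at_risk  # product-limit step
        curve.append((t, survival))
    return curve

def number_at_risk(times, checkpoints):
    """Count samples whose follow-up reaches each checkpoint."""
    return [sum(1 for ti in times if ti >= c) for c in checkpoints]

times = [1000, 800, 1200]   # survival time in days
events = [1, 0, 1]          # Sample2 is censored at day 800
curve = kaplan_meier(times, events)
risk_row = number_at_risk(times, [0, 500, 1000, 1500])
```

The censored sample lowers the number at risk after day 800 without producing a step in the curve, which is why the risk table is worth reading alongside the plot.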