SCSE-Biomedical-Computing-Group/spin

SPIN Reproducibility

Build image

docker build \
  --build-arg APP_UID=$(id -u) \
  --build-arg APP_GID=$(id -g) \
  -t spin:latest .
  • APP_UID / APP_GID ensure the app user inside the image matches your host UID/GID so bind-mounted files stay writable.
  • Re-run this command whenever the Dockerfile or Python dependencies change.
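The rebuild rule above can be automated with a small checksum stamp. This is only a sketch: the helper names `needs_rebuild` and `mark_built`, the stamp file, and the dependency file name in the usage comment are hypothetical and not part of this repository.

```shell
# needs_rebuild STAMP FILE...
# Succeeds (exit 0) when no stamp exists yet or when the checksums of the
# listed files differ from the ones recorded in STAMP.
needs_rebuild() {
  nr_stamp=$1
  shift
  nr_current=$(cksum "$@" 2>/dev/null)
  [ ! -f "$nr_stamp" ] || [ "$nr_current" != "$(cat "$nr_stamp")" ]
}

# mark_built STAMP FILE...
# Records the current checksums after a successful build.
mark_built() {
  mb_stamp=$1
  shift
  cksum "$@" > "$mb_stamp"
}

# Example use (requirements.txt is an assumed file name):
#   if needs_rebuild .spin-build.stamp Dockerfile requirements.txt; then
#     docker build --build-arg APP_UID=$(id -u) --build-arg APP_GID=$(id -g) \
#       -t spin:latest . && mark_built .spin-build.stamp Dockerfile requirements.txt
#   fi
```

Pass whichever files actually define the image's dependencies in your checkout.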

Run experiments

docker run --rm \
  --gpus all \
  -v "$(pwd)/spin:/home/app/spin" \
  -v "$(pwd)/spin-experiments:/home/app/spin-experiments" \
  -v "/data1/data_repo/spatial_transcriptomics:/home/app/spin-experiments/data" \
  -w /home/app \
  --env CUDA_VISIBLE_DEVICES=3 \
  --env EXPERIMENT_SCRIPT=./inference_biccn.sh \
  spin:latest
  • --gpus all exposes host GPUs; adjust or drop it for CPU-only runs.
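  • The first two -v mounts keep spin and spin-experiments editable on the host while letting the container read/write logs in real time; the third mounts the host dataset directory at spin-experiments/data (change the host path to match where your data lives).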
  • -w /home/app sets the working directory so relative paths inside scripts resolve correctly.
  • Override EXPERIMENT_SCRIPT per run (e.g., another spin-experiments/*.sh) without rebuilding the image.
  • CUDA_VISIBLE_DEVICES=3 restricts the run to GPU 3; change as needed.
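To switch scripts or GPUs without retyping the full command, the options above can be wrapped in a shell function. This is a minimal sketch: the function name `spin_run`, the `DATA_DIR` and `DRY_RUN` variables, and the default GPU index 0 are assumptions made for illustration. The mounts and image tag are copied from the command above.

```shell
# spin_run SCRIPT [GPU_ID]
# Assembles the docker run command for one experiment script.
# DATA_DIR (hypothetical) overrides the host dataset path.
# DRY_RUN (hypothetical), when non-empty, prints the command instead of running it.
spin_run() {
  spin_script=${1:?usage: spin_run SCRIPT [GPU_ID]}
  spin_gpu=${2:-0}
  spin_data=${DATA_DIR:-/data1/data_repo/spatial_transcriptomics}
  set -- docker run --rm --gpus all \
    -v "$(pwd)/spin:/home/app/spin" \
    -v "$(pwd)/spin-experiments:/home/app/spin-experiments" \
    -v "$spin_data:/home/app/spin-experiments/data" \
    -w /home/app \
    --env "CUDA_VISIBLE_DEVICES=$spin_gpu" \
    --env "EXPERIMENT_SCRIPT=./$spin_script" \
    spin:latest
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 spin_run train_biccn.sh 1` prints the command that would train on the BICCN dataset using GPU 1.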

Logs are written to the logs directory on the host, so you can tail them while the container runs.
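One way to follow the most recent log is sketched below. The helper name `latest_log` is hypothetical, and the log directory is passed as an argument because its exact location depends on your checkout. The helper assumes simple file names without newlines.

```shell
# latest_log [DIR]
# Prints the path of the most recently modified entry in DIR (default: logs).
# Returns 1 when DIR is missing or empty.
latest_log() {
  log_dir=${1:-logs}
  # ls -t sorts by modification time, newest first
  newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1)
  if [ -z "$newest" ]; then
    return 1
  fi
  printf '%s/%s\n' "$log_dir" "$newest"
}

# Example use while a container is running:
#   tail -f "$(latest_log logs)"
```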

Training and Inference Scripts

Pre-requisites

  • Download the Transcriptformer checkpoints (“exemplar” and “sapiens”) into spin-experiments/models/transcriptformer before launching any script.
  • Only the Human Lymph Node dataset is fetched automatically. Other datasets must be prepared manually (see manuscript for sources and preprocessing), and due to size constraints they are not included in this repository.
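A quick pre-flight check can catch a missing checkpoint directory before a long run starts. The sketch below only verifies that the directory exists and is non-empty. It does not validate individual checkpoint files, since their names are not listed here. The function name `check_checkpoints` is hypothetical.

```shell
# check_checkpoints [DIR]
# Succeeds when DIR exists and contains at least one entry.
check_checkpoints() {
  ckpt_dir=${1:-spin-experiments/models/transcriptformer}
  if [ ! -d "$ckpt_dir" ]; then
    echo "missing checkpoint directory: $ckpt_dir" >&2
    return 1
  fi
  if [ -z "$(ls -A "$ckpt_dir")" ]; then
    echo "checkpoint directory is empty: $ckpt_dir" >&2
    return 1
  fi
  echo "found checkpoint files in $ckpt_dir"
}

# Example use from the repository root:
#   check_checkpoints && echo "ready to launch"
```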

Training Scripts

| Script | Dataset / Context | Description |
| --- | --- | --- |
| train_aging.sh | Mouse aging dataset | Trains the SPIN model on the mouse aging spatial transcriptomics dataset. Handles dataset-specific preprocessing, model configuration, and training loop. |
| train_biccn.sh | BICCN mouse brain dataset | Trains SPIN on the BICCN mouse brain (e.g. VISp) dataset, including graph construction and dataset-specific hyperparameters. |
| train_human_breast_cancer.sh | Human breast cancer dataset | Trains SPIN on the human breast cancer spatial transcriptomics dataset with the appropriate cell-type annotations and tissue layout. |
| train_human_crc.sh | Human colorectal cancer (CRC) dataset | Trains SPIN on the human CRC dataset, including dataset-specific preprocessing, train/val split, and training configuration. |
| train_human_lymph_node.sh | Human lymph node dataset | Trains SPIN on the human lymph node spatial transcriptomics dataset (e.g. lymph node / lymphoid tissue). |
| train_human_tonsil_atlas.sh | Human tonsil atlas dataset | Trains SPIN on the Human Tonsil Atlas dataset, using atlas-specific metadata and preprocessing options. |
| train_nsclc.sh | Human NSCLC (lung cancer) dataset | Trains SPIN on the human NSCLC spatial transcriptomics dataset, including dataset-specific graph construction and training hyperparameters. |

Inference Scripts

| Script | Dataset / Context | Description |
| --- | --- | --- |
| inference_biccn.sh | BICCN mouse brain dataset | Runs inference with a trained SPIN model on the BICCN mouse brain dataset to produce cell–cell interaction scores and derived outputs. |
| inference_human_breast_cancer.sh | Human breast cancer dataset | Runs inference on the human breast cancer dataset using a pre-trained model, generating per-cell or population-level interaction readouts. |
| inference_human_crc.sh | Human colorectal cancer (CRC) dataset | Performs inference on the human CRC dataset, loading the trained checkpoint and exporting interaction statistics and summaries. |
| inference_human_lymph_node.sh | Human lymph node dataset | Runs inference on the human lymph node dataset with the corresponding trained model. |
| inference_human_tonsil_atlas.sh | Human tonsil atlas dataset | Runs inference on the Human Tonsil Atlas dataset to compute SPIN interaction scores and downstream metrics. |
| inference_mouse_aging.sh | Mouse aging dataset | Runs inference on the mouse aging dataset using the trained SPIN model. |
| inference_nsclc.sh | Human NSCLC (lung cancer) dataset | Performs inference on the human NSCLC dataset, generating single-cell and/or population-level interaction outputs. |

Adjustable parameters (e.g. paths, hyperparameters, and runtime options) are defined directly in each script; edit them there before launching a run.
