```bash
docker build \
  --build-arg APP_UID=$(id -u) \
  --build-arg APP_GID=$(id -g) \
  -t spin:latest .
```

- `APP_UID`/`APP_GID` ensure the `app` user inside the image matches your host UID/GID so bind-mounted files stay writable.
- Re-run this command whenever the Dockerfile or Python dependencies change.

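To confirm that a build picked up the intended UID, you can compare the UID reported inside the image with the one on the host. The helper below is a minimal sketch: `check_uid` is a hypothetical name, and the Docker invocation in the comment assumes the image ships the standard `id` utility and runs as the `app` user by default.

```bash
# Hypothetical helper: compare a UID reported by the image with the host UID.
check_uid() {
  local image_uid="$1"
  local host_uid="${2:-$(id -u)}"   # default to the current host user
  if [ "$image_uid" = "$host_uid" ]; then
    echo "ok: image and host UID are both $host_uid"
  else
    echo "mismatch: image UID $image_uid, host UID $host_uid" >&2
    return 1
  fi
}

# Usage (needs Docker; assumes the image's default user is the app user):
#   check_uid "$(docker run --rm --entrypoint id spin:latest -u)"
```

A mismatch usually means the image was built by a different user and should be rebuilt with the command above.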
```bash
docker run --rm \
  --gpus all \
  -v "$(pwd)/spin:/home/app/spin" \
  -v "$(pwd)/spin-experiments:/home/app/spin-experiments" \
  -v "/data1/data_repo/spatial_transcriptomics:/home/app/spin-experiments/data" \
  -w /home/app \
  --env CUDA_VISIBLE_DEVICES=3 \
  --env EXPERIMENT_SCRIPT=./inference_biccn.sh \
  spin:latest
```

- `--gpus all` exposes host GPUs; adjust or drop it for CPU-only runs.
- The first two `-v` mounts keep `spin` and `spin-experiments` editable on the host while letting the container read/write logs in real time.
- The third `-v` mount exposes the host data directory at `/home/app/spin-experiments/data`; point its host side at wherever your datasets live.
- `-w /home/app` sets the working directory so relative paths inside scripts resolve correctly.
- Override `EXPERIMENT_SCRIPT` per run (e.g., another `spin-experiments/*.sh`) without rebuilding the image.
- `CUDA_VISIBLE_DEVICES=3` restricts the run to GPU 3; change as needed.

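Because only `EXPERIMENT_SCRIPT` and `CUDA_VISIBLE_DEVICES` usually change between runs, the command can be wrapped in a small function. This is a sketch, not part of the repository: `spin_run`, `DRY_RUN`, and `SPIN_DATA_DIR` are hypothetical names, and the default data path is simply the one used in the command above.

```bash
# Hypothetical wrapper around the docker run command shown above.
spin_run() {
  if [ -z "${1:-}" ]; then
    echo "usage: spin_run <script> [gpu_index]" >&2
    return 2
  fi
  local script="$1"
  local gpu="${2:-0}"
  local data_dir="${SPIN_DATA_DIR:-/data1/data_repo/spatial_transcriptomics}"

  # Reuse the function's positional parameters as the argument vector,
  # which keeps paths containing spaces intact without needing arrays.
  set -- docker run --rm \
    --gpus all \
    -v "$(pwd)/spin:/home/app/spin" \
    -v "$(pwd)/spin-experiments:/home/app/spin-experiments" \
    -v "${data_dir}:/home/app/spin-experiments/data" \
    -w /home/app \
    --env "CUDA_VISIBLE_DEVICES=${gpu}" \
    --env "EXPERIMENT_SCRIPT=./${script}" \
    spin:latest

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # print the command instead of running it
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 spin_run inference_human_crc.sh 1` prints the command that would run the CRC inference script on GPU 1, which is a cheap way to check the mounts before starting a long job.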

Logs land in `logs` on the host, so you can tail them while the container runs.

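One way to follow the most recent log is sketched below. `latest_log` is a hypothetical helper; the log file names are not documented here, so it just picks the newest file in whichever directory you pass it.

```bash
# Hypothetical helper: print the path of the most recently modified file in a directory.
latest_log() {
  local dir="${1:-logs}"
  local newest
  # ls -t sorts by modification time, newest first.
  # Assumes log file names contain no newlines.
  newest="$(ls -t "$dir" 2>/dev/null | head -n 1)"
  if [ -z "$newest" ]; then
    echo "no log files in $dir" >&2
    return 1
  fi
  echo "$dir/$newest"
}

# Usage, from the directory that contains logs:
#   tail -f "$(latest_log logs)"
```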
- Download the Transcriptformer checkpoints (“exemplar” and “sapiens”) into `spin-experiments/models/transcriptformer` before launching any script.
- Only the Human Lymph Node dataset is fetched automatically. Other datasets must be prepared manually (see manuscript for sources and preprocessing), and due to size constraints they are not included in this repository.

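A quick pre-flight check can catch a missing checkpoint directory before a container is started. The sketch below uses a hypothetical function name, `check_checkpoints`, and only verifies that the directory exists and is non-empty, since the exact checkpoint file names are not listed here.

```bash
# Hypothetical pre-flight check: fail early if the checkpoint directory is missing or empty.
check_checkpoints() {
  local dir="${1:-spin-experiments/models/transcriptformer}"
  if [ ! -d "$dir" ]; then
    echo "missing checkpoint directory: $dir" >&2
    return 1
  fi
  # ls -A lists every entry except . and .. so empty output means an empty directory.
  if [ -z "$(ls -A "$dir")" ]; then
    echo "checkpoint directory is empty: $dir" >&2
    return 1
  fi
  echo "checkpoints found in $dir"
}

# Usage, from the repository root:
#   check_checkpoints && echo "ready to launch"
```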

| Training script | Dataset / Context | Description |
|---|---|---|
| `train_aging.sh` | Mouse aging dataset | Trains the SPIN model on the mouse aging spatial transcriptomics dataset. Handles dataset-specific preprocessing, model configuration, and training loop. |
| `train_biccn.sh` | BICCN mouse brain dataset | Trains SPIN on the BICCN mouse brain (e.g. VISp) dataset, including graph construction and dataset-specific hyperparameters. |
| `train_human_breast_cancer.sh` | Human breast cancer dataset | Trains SPIN on the human breast cancer spatial transcriptomics dataset with the appropriate cell-type annotations and tissue layout. |
| `train_human_crc.sh` | Human colorectal cancer (CRC) dataset | Trains SPIN on the human CRC dataset, including dataset-specific preprocessing, train/val split, and training configuration. |
| `train_human_lymph_node.sh` | Human lymph node dataset | Trains SPIN on the human lymph node spatial transcriptomics dataset (e.g. lymph node / lymphoid tissue). |
| `train_human_tonsil_atlas.sh` | Human tonsil atlas dataset | Trains SPIN on the Human Tonsil Atlas dataset, using atlas-specific metadata and preprocessing options. |
| `train_nsclc.sh` | Human NSCLC (lung cancer) dataset | Trains SPIN on the human NSCLC spatial transcriptomics dataset, including dataset-specific graph construction and training hyperparameters. |

| Inference script | Dataset / Context | Description |
|---|---|---|
| `inference_biccn.sh` | BICCN mouse brain dataset | Runs inference with a trained SPIN model on the BICCN mouse brain dataset to produce cell–cell interaction scores and derived outputs. |
| `inference_human_breast_cancer.sh` | Human breast cancer dataset | Runs inference on the human breast cancer dataset using a pre-trained model, generating per-cell or population-level interaction readouts. |
| `inference_human_crc.sh` | Human colorectal cancer (CRC) dataset | Performs inference on the human CRC dataset, loading the trained checkpoint and exporting interaction statistics and summaries. |
| `inference_human_lymph_node.sh` | Human lymph node dataset | Runs inference on the human lymph node dataset with the corresponding trained model. |
| `inference_human_tonsil_atlas.sh` | Human tonsil atlas dataset | Runs inference on the Human Tonsil Atlas dataset to compute SPIN interaction scores and downstream metrics. |
| `inference_mouse_aging.sh` | Mouse aging dataset | Runs inference on the mouse aging dataset using the trained SPIN model. |
| `inference_nsclc.sh` | Human NSCLC (lung cancer) dataset | Performs inference on the human NSCLC dataset, generating single-cell and/or population-level interaction outputs. |

Adjustable parameters (e.g. paths, hyperparameters, and runtime options) are defined directly in each script.
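The script names in the two tables above follow a `<phase>_<dataset>.sh` pattern, with one exception: the mouse aging dataset uses `train_aging.sh` for training but `inference_mouse_aging.sh` for inference. The lookup below is a sketch that encodes this. `script_for` and the dataset keys (such as `mouse_aging`) are hypothetical labels chosen for illustration.

```bash
# Hypothetical lookup: map a phase (train|inference) and a dataset key to a script name.
script_for() {
  local phase="$1"
  local dataset="$2"
  case "$phase" in
    train|inference) ;;
    *) echo "unknown phase: $phase" >&2; return 1 ;;
  esac
  case "$dataset" in
    biccn|human_breast_cancer|human_crc|human_lymph_node|human_tonsil_atlas|nsclc)
      echo "${phase}_${dataset}.sh"
      ;;
    mouse_aging)
      # The two tables name this dataset differently per phase.
      if [ "$phase" = "train" ]; then
        echo "train_aging.sh"
      else
        echo "inference_mouse_aging.sh"
      fi
      ;;
    *) echo "unknown dataset: $dataset" >&2; return 1 ;;
  esac
}

# Usage: pass the result as EXPERIMENT_SCRIPT, e.g.
#   --env EXPERIMENT_SCRIPT="./$(script_for inference human_crc)"
```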