Latent Diffusion Model – LoDoInd (DM4CT)

This repository contains the pretrained latent-space diffusion model used in the
DM4CT benchmark, introduced in "DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction" (ICLR 2026).

Paper: DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction
Project Page: https://dm4ct.github.io/DM4CT/
Codebase: https://github.com/DM4CT/DM4CT

🔬 Model Overview

This model learns a prior over CT reconstruction images in a compressed latent space using a denoising diffusion probabilistic model (DDPM).

Unlike pixel-space diffusion models, this model performs diffusion in the latent space of a pretrained autoencoder (VQ-VAE).

Architecture:
- VQ-VAE (image encoder/decoder)
- 2D UNet operating in latent space
Input resolution (image space): 512 × 512
Channels: 1 (grayscale CT slice)
Training objective: ε-prediction (standard DDPM formulation)
Noise schedule: Linear beta schedule
Training dataset: Industrial CT dataset (LoDoInd)
Intensity normalization: Rescaled to (-1, 1)
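
The ε-prediction objective with a linear beta schedule can be illustrated with a short, dependency-free sketch. The number of timesteps and the schedule endpoints used below (1000, 1e-4, 2e-2) are common DDPM defaults assumed for illustration, not values confirmed for this checkpoint, and all function names are hypothetical. In the actual model the noised quantity would be a VQ-VAE latent rather than a short list of floats.

```python
import math


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Return linearly spaced betas (endpoint values are assumed defaults)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(z0, eps, alpha_bar_t):
    """Forward process: z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * z + noise_scale * e for z, e in zip(z0, eps)]


def eps_prediction_loss(eps_pred, eps):
    """Mean squared error between the predicted and the true noise."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)
```

During training, a network would receive `z_t` and the timestep and be optimized so that its output minimizes `eps_prediction_loss` against the noise that was actually added.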

This model is intended to be combined with data-consistency correction for CT reconstruction tasks.

📊 Dataset: LoDoInd

The model was trained on the industrial CT dataset LoDoInd.

Reconstructed slices were rescaled to the range (-1, 1).
The model learns an unconditional latent prior over CT slices; no specific geometry information is embedded in the weights.
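
A minimal sketch of the kind of linear rescaling described above, and of its inverse for mapping generated samples back to physical intensities. The card does not state how the lower and upper bounds were chosen (per slice, per volume, or dataset-wide), so `lo` and `hi` are left as explicit arguments, and both function names are hypothetical.

```python
def to_model_range(values, lo, hi):
    """Linearly map intensities from [lo, hi] to [-1, 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def from_model_range(values, lo, hi):
    """Inverse mapping: take values in [-1, 1] back to [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return [(v + 1.0) / 2.0 * (hi - lo) + lo for v in values]
```

New measurements should be rescaled with the same bounds that were used for training; otherwise the prior sees intensities outside the distribution it was fitted on.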

🧠 Training Details

Optimizer: AdamW
Learning rate: 1e-4
Hardware: NVIDIA A100 GPU
Training scripts: Available in the DM4CT GitHub repository.

🚀 Usage

You can load and use this model with the diffusers library:

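import torch
from diffusers import LDMPipeline

# Load the pretrained pipeline from the Hugging Face Hub
pipeline = LDMPipeline.from_pretrained(
    "jiayangshi/lodoind_latent_diffusion"
)

# Use the GPU when one is available; otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline.to(device)

# Generate a sample (unconditional prior)
image = pipeline().images[0]
image.save("generated_ct_slice.png")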

Note: For actual CT reconstruction, this prior is typically used with data-consistency guidance as described in the paper.
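
To make the idea of data-consistency guidance concrete, here is a toy sketch of a single gradient step on the least-squares term 0.5 * ||A x - y||^2. It is not the paper's algorithm: the small dense matrix `A` stands in for a CT forward projector, `y` for the measured sinogram, and every function name is hypothetical. In a guided sampler, a step of this kind would typically be interleaved with the reverse diffusion updates.

```python
def matvec(A, x):
    """Compute A @ x for a dense matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T @ r (the adjoint, i.e. back-projection in CT terms)."""
    n_rows, n_cols = len(A), len(A[0])
    return [sum(A[i][j] * r[i] for i in range(n_rows)) for j in range(n_cols)]


def data_consistency_step(x, A, y, step_size):
    """One gradient step on 0.5 * ||A x - y||^2 with respect to x."""
    residual = [p - m for p, m in zip(matvec(A, x), y)]
    grad = rmatvec(A, residual)
    return [v - step_size * g for v, g in zip(x, grad)]


def residual_norm_sq(x, A, y):
    """Squared data misfit ||A x - y||^2, useful for monitoring progress."""
    return sum((p - m) ** 2 for p, m in zip(matvec(A, x), y))
```

The step size must be small enough for the update to be stable (below 2 divided by the largest eigenvalue of A^T A for this quadratic term).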

Citation

@inproceedings{
shi2026dmct,
title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
author={Shi, Jiayang and Pelt, Dani{\"e}l M and Batenburg, K Joost},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=YE5scJekg5}
}

arXiv: 2602.18589