Replies: 1 comment
-
Thanks for your question! A few points:
I’ll move this issue into the Discussions section so others can also find and benefit from the solution.
-
Hello author, thank you very much for your contribution. I want to train on 3D data, but I don't have enough GPU memory (H100, 80 GB). Could you give me some advice? Here is my training code; I hope for your reply.
import torch
from landmarker.models import OriginalSpatialConfigurationNet
from landmarker.losses import GaussianHeatmapL2Loss
from torch.utils.data import DataLoader
from landmarker.heatmap import GaussianHeatmapGenerator
from landmarker.data import LandmarkDataset
import json
import numpy as np
import os
from os.path import join
from monai.transforms import (Compose, RandAffined, RandGaussianNoised,
RandScaleIntensityd, RandAdjustContrastd, RandHistogramShiftd,
ScaleIntensityd, RandSpatialCropd)
def extract_landmarks(json_path):
    with open(json_path, 'r') as f:
        json_file = json.load(f)
    landmarks = []
    points_lst = json_file['markups'][0]['controlPoints']  # a list with 25 dicts
    for point in points_lst:
        position = point['position']
        position = position[::-1]  # reverse the coordinate order
        landmarks.append(position)
    return landmarks
'''
Option 1: Train directly with LandmarkDataset
'''
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
img_path = ''
lab_path = ''
model_save_path = ''
image_paths = []
landmarks_array = []
for file in os.listdir(img_path):
    name = file.split('.')[0]
    image_paths.append(join(img_path, file))
    label = extract_landmarks(join(lab_path, name + '.json'))
    landmarks_array.append(label)
landmarks_array = np.array(landmarks_array)
names = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
         'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y']
#transforms = Compose([RandSpatialCropd])
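# Illustrative sketch (assumption, not from the original post): random 3D crops shrink
# the volume processed per step, which is one common way to cut GPU memory. The keys and
# roi_size below are guesses, and landmark coordinates would also need to be shifted to
# match a cropped volume, so this is a starting point rather than a drop-in fix.
# As in the original code, it is not passed to the dataset below.
crop_transforms = Compose([
    RandSpatialCropd(keys=["image"], roi_size=(256, 96, 96), random_size=False),
    ScaleIntensityd(keys=["image"], minv=0.0, maxv=1.0),
])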
train_dataset = LandmarkDataset(
    imgs=image_paths,           # List of paths to your images
    landmarks=landmarks_array,  # NumPy array of shape (N, C, D)
                                # N = number of samples
                                # C = number of landmark classes
                                # D = spatial dimensions (2 or 3)
    spatial_dims=3,             # 2 for 2D images, 3 for 3D
    transform=None,             # MONAI transforms for preprocessing
    dim_img=(512, 192, 192),    # Target image dimensions 1300, 420, 300
    class_names=names           # List of landmark class names
)
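# Optional sanity check (the "image" key is assumed from the batch access further down):
# one float32 volume at (512, 192, 192) already takes 512 * 192 * 192 * 4 bytes ≈ 72 MiB.
# sample = train_dataset[0]
# print(sample["image"].shape)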
'''
Option 2: Using Built-in Datasets
Load the ISBI2015 cephalometric dataset
'''
data_dir = "path/to/data"
# This loader also needs an import; the path below is assumed from the landmarker docs.
from landmarker.datasets import get_cepha_landmark_datasets
train_ds, test1_ds, test2_ds = get_cepha_landmark_datasets(data_dir)
heatmap_generator = GaussianHeatmapGenerator(
    nb_landmarks=25,              # Number of landmarks
    sigmas=3,                     # Standard deviation for Gaussian distribution
    learnable=True,               # Enable adaptive heatmap parameters
    heatmap_size=(512, 192, 192)  # Output heatmap dimensions 1300, 420, 300
).to(device)
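# Memory note (plain arithmetic, assuming float32): the 25-channel target heatmap alone
# is 25 * 512 * 192 * 192 * 4 bytes ≈ 1.8 GiB per sample, before the network's own
# activations, so full-resolution 3D heatmap regression is memory-hungry by design.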
# Initialize model
model = OriginalSpatialConfigurationNet(
    in_channels=1,    # Number of input channels
    out_channels=25,  # Number of landmarks
    spatial_dim=3
)
# Set up optimizer
optimizer = torch.optim.SGD([
    {'params': model.parameters(), "weight_decay": 1e-3},
    {'params': heatmap_generator.sigmas},
    {'params': heatmap_generator.rotation}
], lr=1e-6, momentum=0.99, nesterov=True)
# Define loss function
criterion = GaussianHeatmapL2Loss(alpha=5)
# Create data loader
train_loader = DataLoader(
    train_dataset,
    batch_size=1,
    shuffle=True,
    num_workers=4
)
# Training loop
model = model.to(device)
for epoch in range(100):
    print(epoch)
    model.train()
    for batch in train_loader:
        images = batch["image"].to(device)
        landmarks = batch["landmark"].to(device)
        # (forward pass, loss and optimizer step were not included in the post;
        #  see the sketch below for what a full step could look like)
print('training finished')
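For reference, here is a minimal sketch of a complete training step under stated assumptions: it assumes that calling heatmap_generator(landmarks) returns the target heatmaps and that GaussianHeatmapL2Loss takes (prediction, sigmas, target heatmaps), so the exact call signatures should be checked against the landmarker docs. Mixed precision (torch.autocast plus a GradScaler) is one further way to reduce activation memory on an 80 GB H100.

scaler = torch.cuda.amp.GradScaler()
model = model.to(device)
for epoch in range(100):
    model.train()
    for batch in train_loader:
        images = batch["image"].to(device)
        landmarks = batch["landmark"].to(device)
        optimizer.zero_grad()
        # autocast runs the forward pass in fp16 where safe, roughly halving activation memory
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            outputs = model(images)                  # predicted heatmaps, (B, 25, D, H, W)
            heatmaps = heatmap_generator(landmarks)  # target Gaussian heatmaps (assumed API)
            loss = criterion(outputs, heatmap_generator.sigmas, heatmaps)  # assumed signature
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
print('training finished')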