
[MAISI] Pretrained weights did not generate MRI volume unconditionally #2028

@Masaaki-75

Describe the bug
Unconditional generation did not produce MR volumes, despite setting modality=9 (which refers to MR T1w).

To Reproduce
Steps to reproduce the behavior:

  1. Select maisi3d-ddpm or maisi3d-rflow as the model.
  2. Set modality to 8 (mri) or 9 (mri_t1).
  3. Run diff_model_infer.py.
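
Step 2 can be sketched as a small config edit. This is only a minimal sketch, assuming the modality code is stored under a `diffusion_unet_inference` key of a JSON config (as the printed args in this report suggest). The file name `config_infer.json` and the helper `set_modality` are hypothetical, and only the two modality codes named in this report are included.

```python
import json
import tempfile
from pathlib import Path

# Modality codes taken from this report; other codes are not listed here.
MODALITY_CODES = {"mri": 8, "mri_t1": 9}


def set_modality(config_path, modality_name):
    """Write the requested modality code into the config file and return the code."""
    code = MODALITY_CODES[modality_name]
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Assumed layout: the code sits under "diffusion_unet_inference".
    config.setdefault("diffusion_unet_inference", {})["modality"] = code
    path.write_text(json.dumps(config, indent=4))
    return code


if __name__ == "__main__":
    # Demo on a throwaway file so the sketch runs without the real repository.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "config_infer.json"  # hypothetical file name
        # 0 is a placeholder starting value, not a documented modality code.
        cfg.write_text(json.dumps({"diffusion_unet_inference": {"modality": 0}}))
        set_modality(cfg, "mri_t1")
        print(json.loads(cfg.read_text())["diffusion_unet_inference"]["modality"])
```

Writing the code through one helper keeps the name-to-code mapping in a single place, so a mistyped integer is less likely to slip into the config.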

Expected behavior
The generated volumes should be MRI volumes.

Screenshots

Image

Additional context
I printed the args just before model inference, which confirms that the modality variable was set to MRI. However, the generated volumes still look like CT volumes:

Namespace(
spatial_dims=3, 
image_channels=1, 
latent_channels=4, 
include_body_region=False, 
mask_generation_latent_shape=[4, 64, 64, 64], 

autoencoder_def={'_target_': 'monai.apps.generation.maisi.networks.autoencoderkl_maisi.AutoencoderKlMaisi', 'spatial_dims': '@spatial_dims', 'in_channels': '@image_channels', 'out_channels': '@image_channels', 'latent_channels': '@latent_channels', 'num_channels': [64, 128, 256], 'num_res_blocks': [2, 2, 2], 'norm_num_groups': 32, 'norm_eps': 1e-06, 'attention_levels': [False, False, False], 'with_encoder_nonlocal_attn': False, 'with_decoder_nonlocal_attn': False, 'use_checkpointing': False, 'use_convtranspose': False, 'norm_float16': True, 'num_splits': 4, 'dim_split': 1}, 

diffusion_unet_def={'_target_': 'monai.apps.generation.maisi.networks.diffusion_model_unet_maisi.DiffusionModelUNetMaisi', 'spatial_dims': '@spatial_dims', 'in_channels': '@latent_channels', 'out_channels': '@latent_channels', 'num_channels': [64, 128, 256, 512], 'attention_levels': [False, False, True, True], 'num_head_channels': [0, 0, 32, 32], 'num_res_blocks': 2, 'use_flash_attention': True, 'include_top_region_index_input': '@include_body_region', 'include_bottom_region_index_input': '@include_body_region', 'include_spacing_input': True, 'num_class_embeds': 128, 'resblock_updown': True, 'include_fc': True}, 

controlnet_def={'_target_': 'monai.apps.generation.maisi.networks.controlnet_maisi.ControlNetMaisi', 'spatial_dims': '@spatial_dims', 'in_channels': '@latent_channels', 'num_channels': [64, 128, 256, 512], 'attention_levels': [False, False, True, True], 'num_head_channels': [0, 0, 32, 32], 'num_res_blocks': 2, 'use_flash_attention': True, 'conditioning_embedding_in_channels': 8, 'conditioning_embedding_num_channels': [8, 32, 64], 'num_class_embeds': 128, 'resblock_updown': True, 'include_fc': True}, 

mask_generation_autoencoder_def={'_target_': 'monai.apps.generation.maisi.networks.autoencoderkl_maisi.AutoencoderKlMaisi', 'spatial_dims': '@spatial_dims', 'in_channels': 8, 'out_channels': 125, 'latent_channels': '@latent_channels', 'num_channels': [32, 64, 128], 'num_res_blocks': [1, 2, 2], 'norm_num_groups': 32, 'norm_eps': 1e-06, 'attention_levels': [False, False, False], 'with_encoder_nonlocal_attn': False, 'with_decoder_nonlocal_attn': False, 'use_flash_attention': False, 'use_checkpointing': True, 'use_convtranspose': True, 'norm_float16': True, 'num_splits': 8, 'dim_split': 1}, 

mask_generation_diffusion_def={'_target_': 'monai.networks.nets.diffusion_model_unet.DiffusionModelUNet', 'spatial_dims': '@spatial_dims', 'in_channels': '@latent_channels', 'out_channels': '@latent_channels', 'channels': [64, 128, 256, 512], 'attention_levels': [False, False, True, True], 'num_head_channels': [0, 0, 32, 32], 'num_res_blocks': 2, 'use_flash_attention': True, 'with_conditioning': True, 'upcast_attention': True, 'cross_attention_dim': 10}, 

mask_generation_scale_factor=1.0055984258651733, 
noise_scheduler={'_target_': 'monai.networks.schedulers.rectified_flow.RFlowScheduler', 'num_train_timesteps': 1000, 'use_discrete_timesteps': False, 'use_timestep_transform': True, 'sample_method': 'uniform', 'scale': 1.4}, 

mask_generation_noise_scheduler={'_target_': 'monai.networks.schedulers.ddpm.DDPMScheduler', 'num_train_timesteps': 1000, 'beta_start': 0.0015, 'beta_end': 0.0195, 'schedule': 'scaled_linear_beta', 'clip_sample': False},
output_dir='output', 
trained_autoencoder_path='models/autoencoder_epoch273.pt', 
trained_diffusion_path='models/diff_unet_3d_rflow.pt', 
trained_controlnet_path='models/controlnet_3d_rflow.pt', 
trained_mask_generation_autoencoder_path='models/mask_generation_autoencoder.pt', 
trained_mask_generation_diffusion_path='models/mask_generation_diffusion_unet.pt', 
all_mask_files_base_dir='datasets/all_masks_flexible_size_and_spacing_4000', 
all_mask_files_json='./configs/candidate_masks_flexible_size_and_spacing_4000.json', 
all_anatomy_size_conditions_json='./configs/all_anatomy_size_condtions.json', 
label_dict_json='./configs/label_dict.json', 
label_dict_remap_json='./configs/label_dict_124_to_132.json', 
num_output_samples=1, 
body_region=['brain'], 
anatomy_list=['brain'], 
controllable_anatomy_size=[], 
num_inference_steps=1000, 
mask_generation_num_inference_steps=1000, 
output_size=[256, 256, 128], 
image_output_ext='.nii.gz', 
label_output_ext='.nii.gz', 
spacing=[0.9375, 0.9375, 1.2109375], 
autoencoder_sliding_window_infer_size=[48, 48, 48], 
autoencoder_sliding_window_infer_overlap=0.6666, 
controlnet='$@controlnet_def', 
diffusion_unet='$@diffusion_unet_def', 
autoencoder='$@autoencoder_def', 
mask_generation_autoencoder='$@mask_generation_autoencoder_def', 
mask_generation_diffusion='$@mask_generation_diffusion_def', 
modality=9, 
random_seed=1995, 
diffusion_unet_inference={'top_region_index': [1, 0, 0, 0], 'bottom_region_index': [1, 0, 0, 0], 'modality': 9, 'spacing': [0.9375, 0.9375, 1.2109375], 'dim': [256, 256, 128], 'num_inference_steps': 1000}
)
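
The manual check of the printed args can be made explicit. This is a minimal sketch, assuming the parsed args are an `argparse.Namespace` shaped like the dump above. `check_modality` is a hypothetical helper, not part of the MAISI scripts, and it maps only the two codes named in this report.

```python
from argparse import Namespace

# Only the codes mentioned in this report; anything else is reported as unknown.
MODALITY_NAMES = {8: "mri", 9: "mri_t1"}


def check_modality(args, expected):
    """Return a list of problems found in the modality settings of args."""
    problems = []
    top_level = getattr(args, "modality", None)
    nested = (getattr(args, "diffusion_unet_inference", None) or {}).get("modality")
    if top_level != expected:
        problems.append(f"args.modality is {top_level!r}, expected {expected}")
    if nested != expected:
        problems.append(
            f"diffusion_unet_inference['modality'] is {nested!r}, expected {expected}"
        )
    if expected not in MODALITY_NAMES:
        problems.append(f"modality code {expected} is not one of {sorted(MODALITY_NAMES)}")
    return problems


# Values copied from the printed args.
args = Namespace(
    modality=9,
    diffusion_unet_inference={"modality": 9, "dim": [256, 256, 128]},
)
print(check_modality(args, expected=9) or "modality settings are consistent")
```

An empty result only shows that the setting reached the argument object. It does not show that the loaded weights respond to that modality code.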
