Setup:
ComfyUI running in a miniconda environment on Linux (Gentoo)
Both the system and the miniconda env are running Python 3.12. (That was a whole other thing I finally figured out, since conda originally installed 3.13 in the env.)
AMD RX 7600 XT GPU
I am trying to run the default workflow. I installed the checkpoint it prompted me to install, and I can get past that first node. But when I run a straight 'python ./main.py' in the env, I get the error "RuntimeError: HIP error: invalid device function".
This is the report that's generated:
ComfyUI Error Report
Error Details
Node ID: 7
Node Type: CLIPTextEncode
Exception Type: RuntimeError
Exception Message: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
Stack Trace
File "/home/me/dev/ComfyUI/execution.py", line 345, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/execution.py", line 220, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/execution.py", line 192, in _map_node_over_list
process_inputs(input_dict, i)
File "/home/me/dev/ComfyUI/execution.py", line 181, in process_inputs
results.append(getattr(obj, func)(**inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/nodes.py", line 69, in encode
return (clip.encode_from_tokens_scheduled(tokens), )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd.py", line 154, in encode_from_tokens_scheduled
pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd.py", line 216, in encode_from_tokens
o = self.cond_stage_model.encode_token_weights(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 677, in encode_token_weights
out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 45, in encode_token_weights
o = self.encode(to_encode)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 288, in encode
return self(tokens)
^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 250, in forward
embeds, attention_mask, num_tokens = self.process_tokens(tokens, device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 204, in process_tokens
tokens_embed = self.transformer.get_input_embeddings()(tokens_embed, out_dtype=torch.float32)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/ops.py", line 225, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/ops.py", line 221, in forward_comfy_cast_weights
return torch.nn.functional.embedding(input, weight, self.padding_idx, self.max_norm, self.norm_type, self.scale_grad_by_freq, self.sparse).to(dtype=output_dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/functional.py", line 2551, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
System Information
ComfyUI Version: 0.3.28
Arguments: ./main.py
OS: posix
Python Version: 3.12.9 | packaged by Anaconda, Inc. | (main, Feb 6 2025, 18:56:27) [GCC 11.2.0]
Embedded Python: false
PyTorch Version: 2.6.0+rocm6.2.4
Devices
Name: cuda:0 AMD Radeon RX 7600 XT : native
Type: cuda
VRAM Total: 17163091968
VRAM Free: 16698211328
Torch VRAM Total: 268435456
Torch VRAM Free: 13270016
Logs
2025-04-16T21:13:14.054083 - Checkpoint files will always be loaded safely.
2025-04-16T21:13:14.730303 - Total VRAM 16368 MB, total RAM 64215 MB
2025-04-16T21:13:14.730363 - pytorch version: 2.6.0+rocm6.2.4
2025-04-16T21:13:14.730567 - AMD arch: gfx1102
2025-04-16T21:13:14.730617 - Set vram state to: NORMAL_VRAM
2025-04-16T21:13:14.730672 - Device: cuda:0 AMD Radeon RX 7600 XT : native
2025-04-16T21:13:15.705738 - Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
2025-04-16T21:13:16.891090 - Python version: 3.12.9 | packaged by Anaconda, Inc. | (main, Feb 6 2025, 18:56:27) [GCC 11.2.0]
2025-04-16T21:13:16.891150 - ComfyUI version: 0.3.28
2025-04-16T21:13:16.894309 - ComfyUI frontend version: 1.15.13
2025-04-16T21:13:16.894840 - [Prompt Server] web root: /home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/comfyui_frontend_package/static
2025-04-16T21:13:17.118634 -
Import times for custom nodes:
2025-04-16T21:13:17.118711 - 0.0 seconds: /home/me/dev/ComfyUI/custom_nodes/websocket_image_save.py
2025-04-16T21:13:17.118744 -
2025-04-16T21:13:17.123422 - Starting server
2025-04-16T21:13:17.123692 - To see the GUI go to: http://127.0.0.1:8188
2025-04-16T21:13:20.894344 - got prompt
2025-04-16T21:13:21.138372 - model weight dtype torch.float16, manual cast: None
2025-04-16T21:13:21.139020 - model_type EPS
2025-04-16T21:13:21.443082 - Using split attention in VAE
2025-04-16T21:13:21.444244 - Using split attention in VAE
2025-04-16T21:13:21.509067 - VAE load device: cuda:0, offload device: cpu, dtype: torch.float32
2025-04-16T21:13:21.580080 - CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
2025-04-16T21:13:21.708025 - Requested to load SD1ClipModel
2025-04-16T21:13:22.081970 - loaded completely 15094.8 235.84423828125 True
2025-04-16T21:13:22.194236 - /home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/functional.py:2551: UserWarning: Ignoring invalid value for boolean flag AMD_SERIALIZE_KERNEL: 3valid values are 0 or 1. (Triggered internally at /pytorch/c10/util/env.cpp:86.)
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
2025-04-16T21:13:22.194412 - !!! Exception during processing !!! HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
2025-04-16T21:13:22.196605 - Traceback (most recent call last):
File "/home/me/dev/ComfyUI/execution.py", line 345, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/execution.py", line 220, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/execution.py", line 192, in _map_node_over_list
process_inputs(input_dict, i)
File "/home/me/dev/ComfyUI/execution.py", line 181, in process_inputs
results.append(getattr(obj, func)(**inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/nodes.py", line 69, in encode
return (clip.encode_from_tokens_scheduled(tokens), )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd.py", line 154, in encode_from_tokens_scheduled
pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd.py", line 216, in encode_from_tokens
o = self.cond_stage_model.encode_token_weights(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 677, in encode_token_weights
out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 45, in encode_token_weights
o = self.encode(to_encode)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 288, in encode
return self(tokens)
^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 250, in forward
embeds, attention_mask, num_tokens = self.process_tokens(tokens, device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/sd1_clip.py", line 204, in process_tokens
tokens_embed = self.transformer.get_input_embeddings()(tokens_embed, out_dtype=torch.float32)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/ops.py", line 225, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/dev/ComfyUI/comfy/ops.py", line 221, in forward_comfy_cast_weights
return torch.nn.functional.embedding(input, weight, self.padding_idx, self.max_norm, self.norm_type, self.scale_grad_by_freq, self.sparse).to(dtype=output_dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/me/miniconda3/envs/comfyenv/lib/python3.12/site-packages/torch/nn/functional.py", line 2551, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
2025-04-16T21:13:22.197586 - Prompt executed in 1.30 seconds
Attached Workflow
Please make sure that workflow does not contain any sensitive information such as API keys or passwords.
{"id":"cff22c39-df37-453f-875d-46730704dc7a","revision":0,"last_node_id":9,"last_link_id":9,"nodes":[{"id":7,"type":"CLIPTextEncode","pos":[413,389],"size":[425.27801513671875,180.6060791015625],"flags":{},"order":3,"mode":0,"inputs":[{"localized_name":"clip","name":"clip","type":"CLIP","link":5}],"outputs":[{"localized_name":"CONDITIONING","name":"CONDITIONING","type":"CONDITIONING","slot_index":0,"links":[6]}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["text, watermark"]},{"id":6,"type":"CLIPTextEncode","pos":[415,186],"size":[422.84503173828125,164.31304931640625],"flags":{},"order":2,"mode":0,"inputs":[{"localized_name":"clip","name":"clip","type":"CLIP","link":3}],"outputs":[{"localized_name":"CONDITIONING","name":"CONDITIONING","type":"CONDITIONING","slot_index":0,"links":[4]}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["beautiful scenery nature glass bottle landscape, , purple galaxy bottle,"]},{"id":5,"type":"EmptyLatentImage","pos":[473,609],"size":[315,106],"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"localized_name":"LATENT","name":"LATENT","type":"LATENT","slot_index":0,"links":[2]}],"properties":{"Node name for S&R":"EmptyLatentImage"},"widgets_values":[512,512,1]},{"id":3,"type":"KSampler","pos":[863,186],"size":[315,262],"flags":{},"order":4,"mode":0,"inputs":[{"localized_name":"model","name":"model","type":"MODEL","link":1},{"localized_name":"positive","name":"positive","type":"CONDITIONING","link":4},{"localized_name":"negative","name":"negative","type":"CONDITIONING","link":6},{"localized_name":"latent_image","name":"latent_image","type":"LATENT","link":2}],"outputs":[{"localized_name":"LATENT","name":"LATENT","type":"LATENT","slot_index":0,"links":[7]}],"properties":{"Node name for S&R":"KSampler"},"widgets_values":[1033950172478245,"randomize",20,8,"euler","normal",1]},{"id":8,"type":"VAEDecode","pos":[1209,188],"size":[210,46],"flags":{},"order":5,"mode":0,"inputs":[{"localized_name":"samples","name":"samples","type":"LATENT","link":7},{"localized_name":"vae","name":"vae","type":"VAE","link":8}],"outputs":[{"localized_name":"IMAGE","name":"IMAGE","type":"IMAGE","slot_index":0,"links":[9]}],"properties":{"Node name for S&R":"VAEDecode"},"widgets_values":[]},{"id":9,"type":"SaveImage","pos":[1451,189],"size":[210,58],"flags":{},"order":6,"mode":0,"inputs":[{"localized_name":"images","name":"images","type":"IMAGE","link":9}],"outputs":[],"properties":{},"widgets_values":["ComfyUI"]},{"id":4,"type":"CheckpointLoaderSimple","pos":[26,474],"size":[315,98],"flags":{},"order":1,"mode":0,"inputs":[],"outputs":[{"localized_name":"MODEL","name":"MODEL","type":"MODEL","slot_index":0,"links":[1]},{"localized_name":"CLIP","name":"CLIP","type":"CLIP","slot_index":1,"links":[3,5]},{"localized_name":"VAE","name":"VAE","type":"VAE","slot_index":2,"links":[8]}],"properties":{"Node name for S&R":"CheckpointLoaderSimple"},"widgets_values":["v1-5-pruned-emaonly-fp16.safetensors"]}],"links":[[1,4,0,3,0,"MODEL"],[2,5,0,3,3,"LATENT"],[3,4,1,6,0,"CLIP"],[4,6,0,3,1,"CONDITIONING"],[5,4,1,7,0,"CLIP"],[6,7,0,3,2,"CONDITIONING"],[7,3,0,8,0,"LATENT"],[8,4,2,8,1,"VAE"],[9,8,0,9,0,"IMAGE"]],"groups":[],"config":{},"extra":{"ds":{"scale":1,"offset":[90.54545801336127,-47.90910339355469]}},"version":0.4}
When I try running test-rocm.py in the same env, it shows clean:
$ python ./test-rocm.py
Checking ROCM support...
GOOD: ROCM devices found: 2
Checking PyTorch...
GOOD: PyTorch is working fine.
Checking user groups...
GOOD: The user chubbypiper is in RENDER and VIDEO groups.
GOOD: PyTorch ROCM support found.
Testing PyTorch ROCM support...
Everything fine! You can run PyTorch code inside of:
---> AMD Ryzen 7 2700X Eight-Core Processor
---> gfx1102
$
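For comparison, a bare-bones check along these lines (just a sketch of my own, not something from test-rocm.py) should force an actual kernel launch on the card, which seems to be where ComfyUI trips:

import torch

# Minimal ROCm smoke test -- sketch only, not part of test-rocm.py.
# A matmul on the GPU forces a real kernel launch, unlike a plain device query.
print(torch.__version__, "HIP:", torch.version.hip)
print("device:", torch.cuda.get_device_name(0))

x = torch.randn(1024, 1024, device="cuda")
y = x @ x                  # actual GPU kernel
torch.cuda.synchronize()   # surface any asynchronous HIP errors right here
print("matmul ok:", y.sum().item())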
I had read elsewhere that running export HIP_VISIBLE_DEVICES=0 should correct this; however, it had no effect on the results.
I had also seen that setting HSA_OVERRIDE_GFX_VERSION could help, so I ran HSA_OVERRIDE_GFX_VERSION=11.0.0 python ./main.py. That does get rid of the HIP error above, but now I get a segmentation fault (core dumped):
To see the GUI go to: http://127.0.0.1:8188
got prompt
model weight dtype torch.float16, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.float32
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load SD1ClipModel
Segmentation fault (core dumped)
$
In addition, test-rocm.py also crashes with a segfault when I run it with HSA_OVERRIDE_GFX_VERSION set.
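For reference, this is roughly what I mean by adding the override: putting the variable in the environment before torch is imported, which should amount to the same thing as prefixing it on the command line (just a sketch, assuming the ROCm runtime only reads it at initialization):

import os

# HSA_OVERRIDE_GFX_VERSION has to be in the environment before the ROCm
# runtime comes up, so set it before importing torch. 11.0.0 tells the
# runtime to treat the gfx1102 card as gfx1100. (Sketch only.)
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "11.0.0"

import torch

x = torch.randn(64, 64, device="cuda")
torch.cuda.synchronize()
print("basic op with override:", x.mean().item())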
Anyone have any suggestions on where I could look next?