Problem #5
Comments
That doesn't look fun. The fatal part is the CUDA out-of-memory error. Are you using an older or workstation GPU?

I'm using a 2060 Super 8GB.
Hmm, did you disable SYSMEM fallback at some point? Otherwise, make sure your NVIDIA driver is up to date. It should happily fall back to shared memory and just slow down if you exceed 8 GB of VRAM (see the quick check below).
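If you want to confirm how much VRAM is actually free before a run, here is a quick sketch using standard PyTorch calls (nothing project-specific):

```python
import torch

# Report free vs. total memory on the first CUDA device;
# torch.cuda.mem_get_info() returns (free_bytes, total_bytes).
free, total = torch.cuda.mem_get_info(0)
print(f"VRAM: {free / 2**30:.2f} GiB free of {total / 2**30:.2f} GiB")
```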
Oh, actually, I think bfloat16 is only supported from Ampere NVIDIA onwards. That's worth checking into. It was changed from float16 to bfloat16 so it would work better on Mac, but that's not helpful if it breaks things for 1000- and 2000-series NVIDIA cards! It was changed in two places in this commit: 80a3efd
line 111: clarity-refiners-ui/app/app.py (line 111 in c0ba342)
line 126: clarity-refiners-ui/app/app.py (line 126 in c0ba342)
Change: switch those two lines back to float16 (or guard on GPU capability, as sketched below) and see if that fixes it.
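A minimal sketch of such a guard, assuming a PyTorch setup like app.py's; `torch.cuda.get_device_capability()` is a real PyTorch call, but the surrounding code is illustrative, not the project's actual lines 111/126:

```python
import torch

# Illustrative only: pick a dtype the GPU supports natively.
# Native bfloat16 needs compute capability >= 8.0 (Ampere); Pascal/Turing
# cards such as the GTX 10xx and RTX 20xx series should use float16 instead.
if torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8:
    dtype = torch.bfloat16
else:
    dtype = torch.float16
```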
These changes work and my problem is solved.
Hello, with every image I try, I get this error:
```
❌ Error during processing: File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 858, in forward
super().forward(*inputs)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 922, in forward
return super().forward(*inputs) + inputs[0]
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 922, in forward
return super().forward(*inputs) + inputs[0]
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\chain.py", line 249, in forward
result = self._call_layer(layer, name, *intermediate_args)
OutOfMemoryError:
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\attentions.py", line 129, in forward
return self._process_attention(
File "C:\pinokio\api\clarity-refiners-ui.git\app\env\lib\site-packages\refiners\fluxion\layers\attentions.py", line 29, in scaled_dot_product_attention
return _scaled_dot_product_attention(
CUDA out of memory. Tried to allocate 7.75 GiB. GPU
(CHAIN) SelfAttention(embedding_dim=320, num_heads=8, inner_dim=320, use_bias=False)
├── (PAR)
│ └── Identity() (x3)
├── (DISTR)
│ └── Linear(in_features=320, out_features=320, device=cuda:0, dtype=bfloat16) (x3)
├── >>> ScaledDotProductAttention(num_heads=8) | SD1UNet.Controlnet.DownBlocks.Chain_2.CLIPLCrossAttention.Chain_2.CrossAttentionBlock.Residual_1.SelfAttention.ScaledDotProductAttention
└── Linear(in_features=320, out_features=320, device=cuda:0, dtype=bfloat16)
0: Tensor(shape=(2, 16128, 320), dtype=bfloat16, device=cuda:0, min=-6.97, max=7.16, mean=-0.05, std=1.51, norm=4841.88, grad=False)
1: Tensor(shape=(2, 16128, 320), dtype=bfloat16, device=cuda:0, min=-6.69, max=7.00, mean=-0.01, std=1.58, norm=5069.41, grad=False)
2: Tensor(shape=(2, 16128, 320), dtype=bfloat16, device=cuda:0, min=-2.86, max=2.66, mean=0.01, std=0.46, norm=1474.05, grad=False)
(RES) Residual()
├── LayerNorm(normalized_shape=(320,), device=cuda:0, dtype=bfloat16)
└── >>> (CHAIN) SelfAttention(embedding_dim=320, num_heads=8, inner_dim=320, use_bias=False) | SD1UNet.Controlnet.DownBlocks.Chain_2.CLIPLCrossAttention.Chain_2.CrossAttentionBlock.Residual_1.SelfAttention
├── (PAR)
│ └── Identity() (x3)
├── (DISTR)
│ └── Linear(in_features=320, out_features=320, device=cuda:0, dtype=bfloat16) (x3)
├── ScaledDotProductAttention(num_heads=8)
└── Linear(in_features=320, out_features=320, device=cuda:0, dtype=bfloat16)
0: Tensor(shape=(2, 16128, 320), dtype=bfloat16, device=cuda:0, min=-2.91, max=2.64, mean=0.01, std=0.54, norm=1737.50, grad=False)
(CHAIN) CrossAttentionBlock(embedding_dim=320, context_embedding_dim=768, context_key=clip_text_embedding, num_heads=8, use_bias=False)
├── >>> (RES) Residual() | SD1UNet.Controlnet.DownBlocks.Chain_2.CLIPLCrossAttention.Chain_2.CrossAttentionBlock.Residual_1 #1
│ ├── LayerNorm(normalized_shape=(320,), device=cuda:0, dtype=bfloat16)
│ └── (CHAIN) SelfAttention(embedding_dim=320, num_heads=8, inner_dim=320, use_bias=False)
│ ├── (PAR) ...
│ ├── (DISTR) ...
│ ├── ScaledDotProductAttention(num_heads=8)
│ └── Linear(in_features=320, out_features=320, device=cuda:0, dtype=bfloat16)
├── (RES) Residual() #2
│ ├── LayerNorm(normalized_shape=(320,), device=cuda:0, dtype=bfloat16)
│ ├── (PAR)
0: Tensor(shape=(2, 16128, 320), dtype=bfloat16, device=cuda:0, min=-1.66, max=1.97, mean=0.01, std=0.24, norm=780.96, grad=False)
(CHAIN)
└── >>> (CHAIN) CrossAttentionBlock(embedding_dim=320, context_embedding_dim=768, context_key=clip_text_embedding, num_heads=8, use_bias=False) | SD1UNet.Controlnet.DownBlocks.Chain_2.CLIPLCrossAttention.Chain_2.CrossAttentionBlock
├── (RES) Residual() #1
│ ├── LayerNorm(normalized_shape=(320,), device=cuda:0, dtype=bfloat16)
│ └── (CHAIN) SelfAttention(embedding_dim=320, num_heads=8, inner_dim=320, use_bias=False) ...
├── (RES) Residual() #2
│ ├── LayerNorm(normalized_shape=(320,), device=cuda:0, dtype=bfloat16)
│ ├── (PAR) ...
│ └── (CHAIN) Attention(embedding_dim=320, num_heads=8, key_embedding_dim=768, value_embedding_dim=768, inner_dim=320, use_bias=False) ...
└── (RES) Residual() #3
├── LayerNorm(normalized_shape=(320,), device=cuda:0, dtype=bfloat16)
0: Tensor(shape=(2, 16128, 320), dtype=bfloat16, device=cuda:0, min=-1.66, max=1.97, mean=0.01, std=0.24, norm=780.96, grad=False)
(RES) CLIPLCrossAttention(channels=320)
├── (CHAIN) #1
│ ├── GroupNorm(num_groups=32, eps=1e-06, channels=320, device=cuda:0, dtype=bfloat16)
│ ├── Conv2d(in_channels=320, out_channels=320, kernel_size=(1, 1), device=cuda:0, dtype=bfloat16)
│ ├── (CHAIN) StatefulFlatten(start_dim=2)
│ │ ├── SetContext(context=flatten, key=sizes)
│ │ └── Flatten(start_dim=2)
│ └── Transpose(dim0=1, dim1=2)
├── >>> (CHAIN) | SD1UNet.Controlnet.DownBlocks.Chain_2.CLIPLCrossAttention.Chain_2 #2
│ └── (CHAIN) CrossAttentionBlock(embedding_dim=320, context_embedding_dim=768, context_key=clip_text_embedding, num_heads=8, use_bias=False)
│ ├── (RES) Residual() #1 ...
0: Tensor(shape=(2, 16128, 320), dtype=bfloat16, device=cuda:0, min=-1.66, max=1.97, mean=0.01, std=0.24, norm=780.96, grad=False)
(CHAIN)
├── (SUM) ResidualBlock(in_channels=320, out_channels=320)
│ ├── (CHAIN)
│ │ ├── GroupNorm(num_groups=32, channels=320, device=cuda:0, dtype=bfloat16) #1
│ │ ├── SiLU() #1
│ │ ├── (SUM) RangeAdapter2d(channels=320, embedding_dim=1280) ...
│ │ ├── GroupNorm(num_groups=32, channels=320, device=cuda:0, dtype=bfloat16) #2
│ │ ├── SiLU() #2
│ │ └── Conv2d(in_channels=320, out_channels=320, kernel_size=(3, 3), padding=(1, 1), device=cuda:0, dtype=bfloat16)
│ └── Identity()
├── >>> (RES) CLIPLCrossAttention(channels=320) | SD1UNet.Controlnet.DownBlocks.Chain_2.CLIPLCrossAttention
0: Tensor(shape=(2, 320, 144, 112), dtype=bfloat16, device=cuda:0, min=-8.19, max=6.47, mean=-0.09, std=0.69, norm=2236.34, grad=False)
(CHAIN) DownBlocks(in_channels=4)
├── (CHAIN) #1
│ ├── Conv2d(in_channels=4, out_channels=320, kernel_size=(3, 3), padding=(1, 1), device=cuda:0, dtype=bfloat16)
│ ├── (RES) Residual()
│ │ ├── UseContext(context=controlnet, key=condition_tile)
│ │ └── (CHAIN) ConditionEncoder() ...
│ └── (PASS)
│ ├── Conv2d(in_channels=320, out_channels=320, kernel_size=(1, 1), device=cuda:0, dtype=bfloat16)
│ └── Lambda(_store_residual(x: torch.Tensor))
├── >>> (CHAIN) (x2) | SD1UNet.Controlnet.DownBlocks.Chain_2 #2
│ ├── (SUM) ResidualBlock(in_channels=320, out_channels=320)
0: Tensor(shape=(2, 320, 144, 112), dtype=bfloat16, device=cuda:0, min=-6.19, max=6.31, mean=-0.04, std=0.67, norm=2141.49, grad=False)
(PASS) Controlnet(name=tile, scale=0.6)
├── (PASS) TimestepEncoder()
│ ├── UseContext(context=diffusion, key=timestep)
│ ├── (CHAIN) RangeEncoder(sinusoidal_embedding_dim=320, embedding_dim=1280)
│ │ ├── Lambda(compute_sinusoidal_embedding(x: jaxtyping.Int[Tensor, '*batch 1']) -> jaxtyping.Float[Tensor, '*batch 1 embedding_dim'])
│ │ ├── Converter(set_device=False)
│ │ ├── Linear(in_features=320, out_features=1280, device=cuda:0, dtype=bfloat16) #1
│ │ ├── SiLU()
│ │ └── Linear(in_features=1280, out_features=1280, device=cuda:0, dtype=bfloat16) #2
│ └── SetContext(context=range_adapter, key=timestep_embedding_tile)
├── Slicing(dim=1, end=4)
0: Tensor(shape=(2, 4, 144, 112), dtype=bfloat16, device=cuda:0, min=-3.31, max=4.03, mean=0.48, std=1.11, norm=433.98, grad=False)
(CHAIN) SD1UNet(in_channels=4)
├── >>> (PASS) Controlnet(name=tile, scale=0.6) | SD1UNet.Controlnet
│ ├── (PASS) TimestepEncoder()
│ │ ├── UseContext(context=diffusion, key=timestep)
│ │ ├── (CHAIN) RangeEncoder(sinusoidal_embedding_dim=320, embedding_dim=1280) ...
│ │ └── SetContext(context=range_adapter, key=timestep_embedding_tile)
│ ├── Slicing(dim=1, end=4)
│ ├── (CHAIN) DownBlocks(in_channels=4)
│ │ ├── (CHAIN) #1 ...
│ │ ├── (CHAIN) (x2) #2 ...
│ │ ├── (CHAIN) #3 ...
0: Tensor(shape=(2, 4, 144, 112), dtype=bfloat16, device=cuda:0, min=-3.31, max=4.03, mean=0.48, std=1.11, norm=433.98, grad=False)
```
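For what it's worth, the 7.75 GiB figure matches the self-attention score matrix at that resolution, so it's the attention map itself that won't fit. A quick back-of-the-envelope check (plain arithmetic, nothing project-specific):

```python
# Attention scores for a (2, 16128, 320) input with 8 heads:
# batch * heads * seq * seq entries, 2 bytes each in bfloat16.
batch, heads, seq = 2, 8, 16128
gib = batch * heads * seq * seq * 2 / 2**30
print(f"{gib:.2f} GiB")  # ~7.75 GiB, matching the failed allocation
```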