Description
I've been using the 1.0B model with the EXL2 backend to great effect, but am having issues with the 0.6B model.
Using `Backend.EXL2` with the 0.6B model throws an error: `RuntimeError: torch.cat(): expected a non-empty list of Tensors`.
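For reference, the error itself is easy to reproduce outside the library: `torch.cat` raises this exact `RuntimeError` whenever it is handed an empty list, which suggests the EXL2 code path is building an empty list of tensors for the 0.6B model somewhere (a guess on my part, not confirmed):

```python
import torch

# Minimal reproduction of the same failure mode: torch.cat on an
# empty list raises the RuntimeError quoted above.
try:
    torch.cat([])
except RuntimeError as e:
    print(e)
```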
Using `Backend.EXL2ASYNC` with the 0.6B model does work, but for some reason VRAM usage spills significantly into shared memory, making it impractical.
My `ModelConfig` settings:

```python
MODEL_CONFIG = outetts.ModelConfig(
    model_path=r"C:\Users\me\.cache\huggingface\hub\models--OuteAI--OuteTTS-1.0-0.6B\snapshots\12345",
    interface_version=outetts.InterfaceVersion.V3,
    backend=outetts.Backend.EXL2,  # or EXL2ASYNC
    device="cuda",
    dtype=torch.bfloat16,
)
```
Running Windows 11 with an RTX 3080 Ti, on the latest versions of the oute and exllama2 libraries.
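In case it helps triage the spillover, here is a small hypothetical helper (not part of outetts) I use to snapshot dedicated VRAM usage via torch's allocator stats. Memory that Windows migrates into shared system memory does not appear in these counters, so a large gap between these numbers and Task Manager's "Dedicated GPU memory" is a quick way to see the spill:

```python
import torch

def report_vram():
    # Snapshot of torch's CUDA allocator stats in MiB. Returns None
    # when CUDA is unavailable so it is safe to call anywhere.
    if not torch.cuda.is_available():
        print("CUDA not available")
        return None
    allocated = torch.cuda.memory_allocated() / 2**20
    reserved = torch.cuda.memory_reserved() / 2**20
    print(f"allocated: {allocated:.1f} MiB, reserved: {reserved:.1f} MiB")
    return allocated, reserved

report_vram()
```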