Issues with 0.6B model using either EXL2 or EXL2ASYNC #91

@zeropointnine

Description

I've been using the 1.0B model with the EXL2 backend to great effect, but am having issues with the 0.6B model.

Using `Backend.EXL2` with the 0.6B model throws an error: `RuntimeError: torch.cat(): expected a non-empty list of Tensors`
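For reference, that exact message is what PyTorch raises when `torch.cat` is handed an empty list. A minimal standalone reproduction (the guess that the EXL2 backend is building an empty tensor list somewhere for this model is an assumption, not something confirmed here):

```python
import torch

# torch.cat requires at least one tensor; an empty list raises the
# same RuntimeError reported above.
try:
    torch.cat([])
except RuntimeError as e:
    print(e)  # torch.cat(): expected a non-empty list of Tensors
```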

Using `Backend.EXL2ASYNC` with the 0.6B model does work, but for some reason VRAM usage spills over significantly into shared system memory, making it impractical.

My `ModelConfig` settings:

```python
MODEL_CONFIG = outetts.ModelConfig(
    model_path=r"C:\Users\me\.cache\huggingface\hub\models--OuteAI--OuteTTS-1.0-0.6B\snapshots\12345",
    interface_version=outetts.InterfaceVersion.V3,
    backend=outetts.Backend.EXL2,  # or EXL2ASYNC
    device="cuda",
    dtype=torch.bfloat16,
)
```

Running Windows 11 on an RTX 3080 Ti, with the latest versions of the outetts and exllamav2 libraries.
