
gpuOffload config parameter does not work #88

Open
skupr opened this issue May 20, 2025 · 6 comments
Labels
bug Something isn't working

Comments

skupr commented May 20, 2025

LM Studio version: 0.3.16 (Build 3)
Python package version: lmstudio==1.3.0

I've written a custom model loading script using the lmstudio-python API, and one important requirement is to specify a custom gpuOffload.

I tried to do it as documented here: https://lmstudio.ai/docs/python/llm-prediction/parameters
But it just doesn't have any effect.
Moreover, the Pylance type checker reports an error:

Argument of type "dict[str, int | float | dict[str, float]]" cannot be assigned to parameter "config" of type "LlmLoadModelConfig | LlmLoadModelConfigDict | None" in function "llm"
  Type "dict[str, int | float | dict[str, float]]" is not assignable to type "LlmLoadModelConfig | LlmLoadModelConfigDict | None"
    "dict[str, int | float | dict[str, float]]" is not assignable to "LlmLoadModelConfig"
    "dict[str, int | float | dict[str, float]]" is not assignable to "LlmLoadModelConfigDict"
    "dict[str, int | float | dict[str, float]]" is not assignable to "None"Pylance[reportArgumentType](https://github.com/microsoft/pylance-release/blob/main/docs/diagnostics/reportArgumentType.md)

I did some research and found that, according to the Python type hints, the more correct form is the following:

    model = lms.llm(
        model_id,
        config={
            "contextLength": context_size,
            "gpu": {
                "ratio": gpu_offload,
                "mainGpu": 1,
                "splitStrategy": "favorMainGpu",
                "disabledGpus": [],
            },
        },
    )

But this produces a bunch of runtime errors.

Field with key load.gpuSplitConfig does not satisfy the schema: [
  {
    "expected": "'evenly' | 'priorityOrder' | 'custom'",
    "received": "undefined",
    "code": "invalid_type",
    "path": [
      "strategy"
    ],
    "message": "Required"
  },
  {
    "code": "invalid_type",
    "expected": "array",
    "received": "undefined",
    "path": [
      "priority"
    ],
    "message": "Required"
  },
  {
    "code": "invalid_type",
    "expected": "array",
    "received": "undefined",
    "path": [
      "customRatio"
    ],
    "message": "Required"
  }
]
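For reference, the error implies the server-side load.gpuSplitConfig field expects a shape like the following (inferred purely from the error message above; illustrative only, since it's the SDK's config translation that fails):

    # Shape implied by the load.gpuSplitConfig validation error
    # (inferred from the error message; illustrative only):
    gpu_split_config = {
        "strategy": "priorityOrder",  # "evenly" | "priorityOrder" | "custom"
        "priority": [1, 0],           # presumably GPU indices in priority order
        "customRatio": [],            # presumably per-GPU ratios for "custom"
    }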
skupr commented May 20, 2025

I also tried another syntax, similar to what's documented for lmstudio-js:

    model = lms.llm(
        model_id,
        config={
            "contextLength": context_size,
            "gpu": {
                "ratio": gpu_offload,
            },
        },
    )

This produces another error:

Traceback (most recent call last):
  File "/home/stask/playground/lms-benchmark/benchmark.py", line 87, in benchmark_llm_batch
    result = benchmark_llm(model_id, ctx_size, gpu_layers, test_prompt)
  File "/home/stask/playground/lms-benchmark/benchmark.py", line 34, in benchmark_llm
    model = lms.llm(
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/sync_api.py", line 1660, in llm
    return get_default_client().llm.model(model_key, ttl=ttl, config=config)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/sync_api.py", line 762, in model
    return self._get_or_load(model_key, ttl, config, on_load_progress)
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/sync_api.py", line 829, in _get_or_load
    endpoint = GetOrLoadEndpoint(
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/json_api.py", line 993, in __init__
    kv_config = load_config_to_kv_config_stack(config, config_type)
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/_kv_config.py", line 330, in load_config_to_kv_config_stack
    return _client_config_to_kv_config_stack(dict_config, TO_SERVER_LOAD_LLM)
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/_kv_config.py", line 313, in _client_config_to_kv_config_stack
    fields = _to_kv_config_stack_base(config, keymap)
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/_kv_config.py", line 302, in _to_kv_config_stack_base
    kv_field = config_field.to_kv_field(server_key, config)
  File "/home/stask/playground/lms-benchmark/venv/lib/python3.10/site-packages/lmstudio/_kv_config.py", line 94, in to_kv_field
    value[key] = containing_value[key]
KeyError: 'mainGpu'

skupr commented May 20, 2025

Please fix the gpuOffload parameter, or provide documentation on how to actually use it.

ncoghlan added the bug label May 22, 2025
ncoghlan (Collaborator) commented May 22, 2025

This is a genuine bug in the SDK. The translation from the named "favorMainGpu" split strategy to the server's "priorityOrder" split configuration is not currently working correctly (without a clear client-side workaround).
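The KeyError traceback above shows the mechanics of the failure: to_kv_field in _kv_config.py copies a fixed set of sub-keys out of the user-supplied gpu dict, so any omitted key (here mainGpu) raises. A simplified sketch of that pattern (not the actual SDK code):

    # Simplified sketch of the pattern at _kv_config.py line 94 (not the real
    # implementation): each expected sub-key is copied unconditionally, so a
    # partial "gpu" dict raises KeyError.
    def to_kv_field(server_key, containing_value, expected_keys):
        value = {}
        for key in expected_keys:
            value[key] = containing_value[key]  # KeyError: 'mainGpu' if omitted
        return server_key, value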

skupr commented May 22, 2025

Thank you for the response!

You mentioned:

(without a clear client-side workaround)

Can you provide more details about the workaround?

ncoghlan (Collaborator) commented
That was poor wording on my part. While the dynamic nature of Python means monkeypatching the SDK to fix the config translation is theoretically possible, actually doing so is complicated enough that it doesn't really count as a viable workaround (and once I've worked out what is wrong in sufficient detail to describe such a patch, I'll have spun a new release with the translation fixed).

skupr commented May 23, 2025

That was poor wording on my part. While the dynamic nature of Python means monkeypatching the SDK to fix the config translation is theoretically possible, actually doing so is complicated enough that it doesn't really count as a viable workaround (and once I've worked out what is wrong in sufficient detail to describe such a patch, I'll have spun a new release with the translation fixed).

Thank you for the information!

BTW, in the meantime I've found a workaround: I just invoke the LMS CLI executable to load the model, and it works as expected.
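In case it helps anyone, the workaround is essentially this (a minimal sketch; the flag names assume a recent lms CLI, so check lms load --help on your version):

    import subprocess

    def load_via_cli(model_id: str, gpu_offload: float, context_size: int) -> None:
        # Load the model through the LMS CLI instead of the Python SDK.
        # Flag names assume a recent `lms` CLI; verify with `lms load --help`.
        subprocess.run(
            [
                "lms", "load", model_id,
                "--gpu", str(gpu_offload),              # offload ratio, 0.0-1.0
                "--context-length", str(context_size),
            ],
            check=True,
        )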
