
Which version of datasets did you use when you trained the model? #8

@pandayummy


I have been trying out your dataset recently, and it keeps failing with the error below.

pip show datasets
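The same version information can be collected from Python, which makes it easy to paste into a report. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` and the list of packages printed are illustrative choices, not something taken from LLaMA-Factory.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Illustrative package list: the library named in the title plus transformers.
    for name in ("datasets", "transformers"):
        print(f"{name}: {installed_version(name)}")
```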

[INFO|2025-03-25 02:48:08] llamafactory.data.loader:143 >> Loading dataset BUAADreamer/llava-med-zh-instruct-60k...
Loading dataset shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 47.21it/s]
Running tokenizer on dataset (num_proc=16):   0%|                                                                                 | 0/56649 [00:00<?, ? examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/cheng/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/cheng/.local/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 678, in _write_generator_to_queue
    for i, result in enumerate(func(**kwargs)):
  File "/home/cheng/.local/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3476, in _map_single
    batch = apply_function_on_filtered_inputs(
  File "/home/cheng/.local/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3338, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/home/cheng/LLaMA-Factory/src/llamafactory/data/processor/supervised.py", line 99, in preprocess_dataset
    input_ids, labels = self._encode_data_example(
  File "/home/cheng/LLaMA-Factory/src/llamafactory/data/processor/supervised.py", line 43, in _encode_data_example
    messages = self.template.mm_plugin.process_messages(prompt + response, images, videos, audios, self.processor)
  File "/home/cheng/LLaMA-Factory/src/llamafactory/data/mm_plugin.py", line 484, in process_messages
    image_seqlen = (height // processor.patch_size) * (
TypeError: unsupported operand type(s) for //: 'int' and 'NoneType'
"""
