set_transform seems to process all the samples on the fly, not in batch_size chunks #6050
Replies: 1 comment 1 reply
-
I logged the process, and at the line train_result = trainer.train(resume_from_checkpoint=checkpoint), prepare_dataset_transform is called, but the input batch size is 1 — it does not seem batched, and it seems to process every sample individually.
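The behaviour described above matches how an on-access transform generally works: the transform receives a batch covering only the rows fetched in that call, so a per-example fetch (e.g. from a default PyTorch DataLoader) hands it a batch of size 1, while a slice access hands it the whole slice. A minimal pure-Python sketch of that semantics (the LazyDataset class here is illustrative, not the actual datasets implementation):

```python
# Sketch of on-access (lazy) transform semantics, mimicking the idea
# behind datasets.Dataset.set_transform. LazyDataset is a hypothetical
# stand-in, not the real library class.

class LazyDataset:
    def __init__(self, columns):
        self._columns = columns      # dict: column name -> list of values
        self._transform = None

    def set_transform(self, fn):
        # The transform is only stored; it runs lazily on access.
        self._transform = fn

    def __len__(self):
        return len(next(iter(self._columns.values())))

    def __getitem__(self, key):
        # Normalize the key to a batch (dict of lists) covering only the
        # requested rows -- this is why a per-example fetch hands the
        # transform a batch of size 1.
        indices = range(len(self))[key] if isinstance(key, slice) else [key]
        batch = {name: [col[i] for i in indices]
                 for name, col in self._columns.items()}
        return self._transform(batch) if self._transform else batch


calls = []  # record the batch size seen by each transform call

def transform(batch):
    calls.append(len(batch["x"]))
    return {"x2": [v * 2 for v in batch["x"]]}

ds = LazyDataset({"x": [1, 2, 3, 4]})
ds.set_transform(transform)

ds[0]         # per-example access -> transform sees a batch of 1
ds[0:3]       # slice access       -> transform sees a batch of 3
print(calls)  # [1, 3]
```

So the batch size the transform sees is driven by how the dataset is indexed, not by the training batch_size; a DataLoader that fetches items one at a time and collates afterwards will trigger the transform once per example.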
-
Two questions about set_transform — asking for help!
Basic info:
datasets 2.13.1
python 3.10
torch 2.0.1
```python
def prepare_dataset_transform(batch):
    # process audio
    sample = batch[audio_column_name]
    array_input = [audio["array"] for audio in batch[audio_column_name]]
    inputs = feature_extractor(
        array_input,
        sampling_rate=sample[0]["sampling_rate"],
        return_attention_mask=forward_attention_mask,
    )
```