Unable to initialize harness with CSV file for text classification task #1195

StellaRK99 · 2025-04-29T22:07:23Z

Hi there, I'm trying to use langtest with OpenAI to do some bias testing. Right now I'm just doing a POC and trying to get things working.

I'm using the imdb.csv file.
I'm using custom for my hub.
I'm using text classification for the task.

Here is how I am initializing the harness, as per the documentation (which seems outdated?):

harness = Harness(
    task="text-classification",
    model={"model": chat_model, "hub": "custom"},
    data={"data_source": 'abs/path/to/the/csv'},
)

chat_model is an instance of a class I wrote that uses the OpenAI client to make a chat completion request. It has a predict() method that does exactly this and returns the prediction as a string.
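For readers following along, a wrapper of the kind described could look roughly like the sketch below. This is not the poster's actual class: the name ChatModelWrapper, the injected complete_fn callable (standing in for the OpenAI chat completion call), the label set, and the prompt wording are all assumptions made for illustration. Whether langtest's custom hub calls predict() with a single string is also an assumption here.

```python
from typing import Callable, Sequence


class ChatModelWrapper:
    """Hypothetical wrapper exposing predict(); not the poster's actual class."""

    def __init__(
        self,
        complete_fn: Callable[[str], str],
        labels: Sequence[str] = ("positive", "negative"),
    ):
        # complete_fn is an assumed stand-in for the chat completion request:
        # it takes a prompt string and returns the model's reply as a string.
        self.complete_fn = complete_fn
        self.labels = tuple(labels)

    def predict(self, text: str) -> str:
        prompt = (
            f"Classify the sentiment of this review as one of: {', '.join(self.labels)}.\n"
            f"Review: {text}\n"
            "Answer with the label only."
        )
        reply = self.complete_fn(prompt).strip().lower()
        # Map a free-form reply such as "Positive." onto a known label.
        for label in self.labels:
            if label in reply:
                return label
        # Fall back to the raw reply if no known label was found.
        return reply


def fake_complete(prompt: str) -> str:
    # Offline stub so the sketch runs without network access or an API key.
    return " Positive." if "great" in prompt.lower() else "Negative"
```

Injecting the completion call as a plain function keeps the wrapper testable offline; in real use, complete_fn would wrap the actual client request.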

When I print(harness.data()), I get an empty list ([]) back, so the harness.generate() call fails:

Traceback (most recent call last):
  File "/Users/redacted.name//Desktop/git/redacted/tests/bias_fairness_tests/test_langtest.py", line 116, in <module>
    print("ABOUT TO GENERATE!!!!", harness.generate())
                                   ^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name/.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/langtest.py", line 315, in generate
    self._testcases = self.__single_dataset_generate(self.data)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name//.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/langtest.py", line 1602, in __single_dataset_generate
    testcases = TestFactory.transform(self.task, dataset, tests, m_data=m_data)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name//.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/transform/base.py", line 106, in transform
    else all_categories[each](data, sub_test_types).transform()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name//.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/transform/accuracy.py", line 67, in transform
    if self._data_handler[0].expected_results is None:
       ~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
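The IndexError above comes from indexing element 0 of an empty list, which matches the empty data reported earlier. One way to narrow this down before constructing the harness is to confirm that the CSV itself is readable and non-empty. The sketch below uses only the Python standard library; the helper name inspect_csv is hypothetical, and it makes no claim about which column names langtest expects for text classification (that would need to be checked against the langtest documentation or example notebooks).

```python
import csv
from pathlib import Path


def inspect_csv(path):
    """Return (column_names, row_count) for a CSV file; raise if it has no data rows."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    with p.open(newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        # fieldnames is None for a completely empty file.
        columns = list(reader.fieldnames or [])
        row_count = sum(1 for _ in reader)
    if row_count == 0:
        raise ValueError(f"{p} contains no data rows (columns found: {columns})")
    return columns, row_count
```

Calling inspect_csv on the same path passed as data_source shows the header names and the number of rows. If this succeeds but the harness data is still empty, the problem is more likely in how the loader interprets the file (for example, column names it does not recognize) than in the file path.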

Another thing I noticed is that when I initialize the harness the way the documentation shows, I get warnings because it expects a DatasetConfig object for the data argument and a ModelConfig object for the model argument. This is not explained in the documentation. Even my attempt to do this did not work:

data_config = [
    DatasetConfig(data_source='/Users/sredacted.name/Desktop/imdb.csv')
]
model_config = [
    ModelConfig(model=get_config('MODEL_NAME'), type='chat', hub='custom')
]

Any help would be greatly appreciated.

chakravarthik27 · 2025-06-03T14:29:52Z

Maintainer

Hi @StellaRK99,

Can you check this notebook? https://github.com/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Custom_Hub_Notebook.ipynb

I hope it helps with your task.

Thank you,
@chakravarthik27
