Unable to initialize harness with CSV file for text classification task #1195
Unanswered
StellaRK99 asked this question in Q&A
Replies: 1 comment
-
Hi @StellaRK99, can you check this notebook: https://github.com/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Custom_Hub_Notebook.ipynb ? I hope it helps with your task. Thank you.
-
Hi there, I'm trying to use langtest with OpenAI to do some bias testing. Right now I'm just doing a POC and trying to get things working.
I'm using the imdb.csv file.
I'm using "custom" for my hub and text classification for the task.
Here is how I am initializing the harness, as per the documentation (which seems outdated?):
harness = Harness(
    task="text-classification",
    model={"model": chat_model, "hub": "custom"},
    data={"data_source": 'abs/path/to/the/csv'},
)
chat_model is an object of a class I made that uses the OpenAI client to make a chat completion request. It has a method called predict(), which does exactly this and returns the prediction as a string.
When I try to print(harness.data()), I get [] back, so the data is empty and the harness.generate() call fails:
Traceback (most recent call last):
  File "/Users/redacted.name//Desktop/git/redacted/tests/bias_fairness_tests/test_langtest.py", line 116, in
    print("ABOUT TO GENERATE!!!!", harness.generate())
                                   ^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name/.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/langtest.py", line 315, in generate
    self._testcases = self.__single_dataset_generate(self.data)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name//.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/langtest.py", line 1602, in __single_dataset_generate
    testcases = TestFactory.transform(self.task, dataset, tests, m_data=m_data)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name//.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/transform/base.py", line 106, in transform
    else all_categories[each](data, sub_test_types).transform()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/redacted.name//.pyenv/versions/3.11.12/lib/python3.11/site-packages/langtest/transform/accuracy.py", line 67, in transform
    if self._data_handler[0].expected_results is None:
       ~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
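The IndexError comes from indexing the first element of an empty dataset, so it may be worth confirming that the CSV itself is readable before handing it to the harness. Below is a minimal sketch using only the Python standard library. `summarize_csv` is a hypothetical helper, not part of langtest, and it makes no assumption about which column names langtest expects.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (header, row_count) for a CSV file, failing loudly if it is unusable.

    Hypothetical helper for debugging; not part of langtest.
    """
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"No CSV file at {csv_path}")

    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)  # None when the file is completely empty
        if header is None:
            raise ValueError(f"{csv_path} is empty")
        # Count only rows that contain at least one non-blank cell.
        row_count = sum(1 for row in reader if any(cell.strip() for cell in row))

    if row_count == 0:
        raise ValueError(f"{csv_path} has a header but no data rows")
    return header, row_count
```

Calling `summarize_csv` on the same path that is passed as `data_source` and printing the result shows the header and the number of data rows. If that looks right while the harness data is still empty, the problem is more likely in how the file is being loaded than in the file itself.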
Another thing I noticed is that when I initialize the harness the way the documentation shows, there are warnings because it expects a DatasetConfig object for the data argument and a ModelConfig object for the model argument. This is not explained in the documentation. Even my attempt to pass those did not work:
data_config = [
    DatasetConfig(data_source='/Users/sredacted.name/Desktop/imdb.csv')
]
model_config = [
    ModelConfig(model=get_config('MODEL_NAME'), type='chat', hub='custom')
]
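For reference, the chat_model object used in the first attempt has roughly the shape sketched below. This is an illustration under stated assumptions, not the actual class: `ChatModelWrapper`, the injected `complete` callable, the prompt wording, and the `positive`/`negative` labels are all hypothetical. In the real code, `complete` would wrap the OpenAI chat completion request.

```python
from typing import Callable, Sequence


class ChatModelWrapper:
    """Hypothetical wrapper exposing predict(text) -> str around a chat completion call."""

    def __init__(
        self,
        complete: Callable[[str], str],
        labels: Sequence[str] = ("positive", "negative"),  # assumed label set
    ):
        # `complete` stands in for the real chat completion request (assumption).
        self._complete = complete
        self._labels = tuple(labels)

    def predict(self, text: str) -> str:
        prompt = (
            f"Classify the sentiment of this review as one of {', '.join(self._labels)}.\n"
            f"Review: {text}\n"
            "Answer with the label only."
        )
        raw = self._complete(prompt)
        # Normalize the reply so it matches one of the known labels when possible.
        answer = raw.strip().lower()
        for label in self._labels:
            if label in answer:
                return label
        return answer
```

Injecting the completion call as a plain function keeps the wrapper testable without network access, for example `ChatModelWrapper(lambda prompt: "positive").predict("Great film")`.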
Any help would be greatly appreciated