
Commit cbfdc33

Merge pull request #1184 from JohnSnowLabs/release/2.6.0
Release/2.6.0
2 parents 60bbe4d + 05e51d6 commit cbfdc33

File tree

18 files changed: 5970 additions & 118 deletions


demo/tutorials/llm_notebooks/Med_Halt_Tests.ipynb

Lines changed: 1940 additions & 0 deletions (large diff not rendered)

demo/tutorials/llm_notebooks/dataset-notebooks/JSL_Medical_LLM.ipynb

Lines changed: 1455 additions & 0 deletions (large diff not rendered)

demo/tutorials/misc/Dataset_Debiasing.ipynb

Lines changed: 683 additions & 0 deletions (large diff not rendered)

demo/tutorials/misc/Evaluation_with_Structured_Outputs.ipynb

Lines changed: 674 additions & 0 deletions (large diff not rendered)

docs/pages/tutorials/LLM_testing_Notebooks/llm_testing_notebooks.md

Lines changed: 3 additions & 1 deletion

@@ -42,4 +42,6 @@ The following table gives an overview of the different tutorial notebooks to tes
 | [**Question Answering Benchmarking**](question_answering_benchmarking): This notebook provides a demo on benchmarking Language Models (LLMs) for Question-Answering tasks. | Hugging Face Inference API | Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/benchmarks/Question-Answering.ipynb) |
 | **Fewshot Model Evaluation**: This notebook provides a demo on Optimize and evaluate your models using few-shot prompt techniques | OpenAI | Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/llm_notebooks/Fewshot_QA_Notebook.ipynb) |
 | **Evaluating NER in LLMs**:In this tutorial, we assess the support for Named Entity Recognition (NER) tasks specifically for Large Language Models (LLMs) | OpenAI | Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/llm_notebooks/NER%20Casual%20LLM.ipynb) |
-| **Swapping Drug Names Test**:In this notebook, we discussed implementing tests that facilitate the swapping of generic drug names with brand names and vice versa. This feature ensures accurate evaluations in medical and pharmaceutical contexts. | OpenAI | Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/llm_notebooks/Swapping_Drug_Names_Test.ipynb) |
+| **Swapping Drug Names Test**:In this notebook, we discussed implementing tests that facilitate the swapping of generic drug names with brand names and vice versa. This feature ensures accurate evaluations in medical and pharmaceutical contexts. | OpenAI | Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/llm_notebooks/Swapping_Drug_Names_Test.ipynb) |
+| **Evaluation with Structured Outputs**: In this notebook, we implement evaluation with structured-output APIs for OpenAI, Ollama, and Azure OpenAI, offering greater flexibility and precision when processing model responses. | OpenAI | Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Evaluation_with_Structured_Outputs.ipynb) |
+| **Med Halt Tests**: In this notebook, we gain insights into an LLM's robustness and reliability under diverse conditions with the upgraded Med Halt tests. | OpenAI | Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/llm_notebooks/Med_Halt_Tests.ipynb) |
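For orientation, the structured-outputs evaluation added above follows the JSON-schema response pattern. Below is a minimal sketch of that pattern against the OpenAI chat completions API, not langtest's internal code; the model name and schema fields are illustrative:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Pin the judge's verdict to a fixed JSON shape so it can be parsed
# programmatically instead of scraped from free-form text.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Judge whether the candidate answer is correct."},
        {"role": "user", "content": "Question: capital of France? Candidate answer: Paris."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "eval_verdict",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "correct": {"type": "boolean"},
                    "reason": {"type": "string"},
                },
                "required": ["correct", "reason"],
                "additionalProperties": False,
            },
        },
    },
)
print(response.choices[0].message.content)  # e.g. {"correct": true, "reason": "..."}
```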

docs/pages/tutorials/miscellaneous_notebooks/miscellaneous_notebooks.md

Lines changed: 2 additions & 1 deletion

@@ -46,4 +46,5 @@ The following table gives an overview of the different tutorial notebooks. In th
 | **Misuse_Test_with_Prometheus_evaluation**: In this Notebook, we discussed about new safety testing features to identify and mitigate potential misuse and safety issues in your models | OpenAI |Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Misuse_Test_with_Prometheus_evaluation.ipynb) |
 | **Visual_QA**: In this Notebook, we discussed about the visual question answering tests to evaluate how models handle both visual and textual inputs, offering a deeper understanding of their versatility. | OpenAI | Visual-Question-Answering (visualqa) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Misuse_Test_with_Prometheus_evaluation.ipynb) |
 | **Add_New_Lines_and_Tabs_Tests**: In this Notebook, we discussed about new tests like inserting new lines and tab characters into text inputs, challenging your models to handle structural changes without compromising accuracy. | Hugging Face/John Snow Labs/Spacy |Text-Classification/Question-Answering/Summarization | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Add_New_Lines_and_Tabs_Tests.ipynb) |
-| **Safety_Tests_With_PromptGuard**: In this Notebook, we discussed about evaluating prompts before they are sent to large language models (LLMs), ensuring harmful or unethical outputs are avoided with PromptGuard. | Hugging Face/John Snow Labs/Spacy | Text-Classification/Question-Answering/Summarization | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Safety_Tests_With_PromptGuard.ipynb) |
+| **Safety_Tests_With_PromptGuard**: In this Notebook, we discussed about evaluating prompts before they are sent to large language models (LLMs), ensuring harmful or unethical outputs are avoided with PromptGuard. | Hugging Face/John Snow Labs/Spacy | Text-Classification/Question-Answering/Summarization | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Safety_Tests_With_PromptGuard.ipynb) |
+| **De-biasing Data Augmentation**: In this notebook, we integrate de-biasing techniques into the data augmentation process, ensuring more equitable and representative model assessments. | Hugging Face/John Snow Labs/Spacy | Text-Classification/Question-Answering | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Dataset_Debiasing.ipynb) |
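The new-lines/tabs robustness tests listed above are typically run through langtest's Harness. A minimal sketch follows; the test keys "add_new_lines" and "add_tabs" are assumed from the notebook title, and the dataset path is a placeholder:

```python
from langtest import Harness

# Sketch: evaluate a Hugging Face classifier against structural
# perturbations of its inputs (inserted newlines and tab characters).
harness = Harness(
    task="text-classification",
    model={"model": "distilbert-base-uncased-finetuned-sst-2-english", "hub": "huggingface"},
    data={"data_source": "path/to/data.csv"},  # placeholder dataset
    config={
        "tests": {
            "defaults": {"min_pass_rate": 0.75},
            "robustness": {
                "add_new_lines": {"min_pass_rate": 0.75},  # assumed test key
                "add_tabs": {"min_pass_rate": 0.75},       # assumed test key
            },
        }
    },
)
harness.generate().run().report()  # generate test cases, run them, summarize
```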

langtest/augmentation/__init__.py

Lines changed: 8 additions & 1 deletion

@@ -1,4 +1,11 @@
 from .base import BaseAugmentaion, AugmentRobustness, TemplaticAugment
 from .augmenter import DataAugmenter
+from .debias import DebiasTextProcessing

-__all__ = ["BaseAugmentaion", "AugmentRobustness", "TemplaticAugment", "DataAugmenter"]
+__all__ = [
+    "DebiasTextProcessing",
+    "BaseAugmentaion",
+    "AugmentRobustness",
+    "TemplaticAugment",
+    "DataAugmenter",
+]
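Only the DebiasTextProcessing export is visible in this diff; the sketch below is hypothetical usage, with constructor arguments and the apply() call as illustrative placeholders (see Dataset_Debiasing.ipynb for the actual API):

```python
import pandas as pd

from langtest.augmentation import DebiasTextProcessing  # new export in 2.6.0

# Hypothetical usage -- argument and method names are placeholders,
# not the documented API.
frame = pd.DataFrame(
    {"text": ["He is a nurse.", "She is a doctor."], "label": ["neutral", "neutral"]}
)
debias = DebiasTextProcessing(model="gpt-4o-mini", hub="openai")  # assumed args
balanced = debias.apply(frame, text_column="text", label_column="label")  # assumed method
```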

langtest/augmentation/base.py

Lines changed: 6 additions & 1 deletion

@@ -361,7 +361,8 @@ def __init__(
             raise ImportError(Errors.E097())

         except Exception as e_msg:
-            raise Errors.E095(e=e_msg)
+            error_message = str(e_msg)
+            raise Exception(Errors.E095(e=error_message))

         if show_templates:
             [print(template) for template in self.__templates]
@@ -610,6 +611,7 @@ def __generate_templates(
         from langtest.augmentation.utils import (
             generate_templates_azoi,  # azoi means Azure OpenAI
             generate_templates_openai,
+            generate_templates_ollama,
         )

         params = model_config.copy() if model_config else {}
@@ -620,5 +622,8 @@ def __generate_templates(
         elif model_config and model_config.get("provider") == "azure":
             return generate_templates_azoi(template, num_extra_templates, params)

+        elif model_config and model_config.get("provider") == "ollama":
+            return generate_templates_ollama(template, num_extra_templates, params)
+
         else:
             return generate_templates_openai(template, num_extra_templates)
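The new branch routes template generation to Ollama whenever model_config carries provider="ollama". A hedged usage sketch follows; only the "provider" key and its routing are confirmed by the diff, while the remaining config keys and constructor parameters are assumptions:

```python
from langtest.augmentation import TemplaticAugment

# Sketch: ask a locally served Ollama model for extra template variants.
# "model", "base_url", and passing model_config to the constructor are
# assumptions for illustration only.
augmenter = TemplaticAugment(
    templates=["The patient was prescribed {drug} for {condition}."],
    task="question-answering",
    generate_templates=True,  # triggers the LLM-backed template generation
    show_templates=True,
    model_config={
        "provider": "ollama",  # dispatches to generate_templates_ollama(...)
        "model": "llama3",
        "base_url": "http://localhost:11434",
    },
)
```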
