|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "<td>\n", |
| 8 | + " <a target=\"_blank\" href=\"https://labelbox.com\" ><img src=\"https://labelbox.com/blog/content/images/2021/02/logo-v4.svg\" width=256/></a>\n", |
| 9 | + "</td>\n" |
| 10 | + ] |
| 11 | + }, |
| 12 | + { |
| 13 | + "cell_type": "markdown", |
| 14 | + "metadata": {}, |
| 15 | + "source": [ |
| 16 | + "<td>\n", |
| 17 | + "<a href=\"https://colab.research.google.com/github/Labelbox/labelbox-notebooks/blob/main/project_configuration/prompt_response_projects.ipynb\" target=\"_blank\"><img\n", |
| 18 | + "src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"></a>\n", |
| 19 | + "</td>\n", |
| 20 | + "\n", |
| 21 | + "<td>\n", |
| 22 | + "<a href=\"https://github.com/Labelbox/labelbox-notebooks/tree/main/project_configuration/prompt_response_projects.ipynb\" target=\"_blank\"><img\n", |
| 23 | + "src=\"https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white\" alt=\"GitHub\"></a>\n", |
| 24 | + "</td>" |
| 25 | + ] |
| 26 | + }, |
| 27 | + { |
| 28 | + "cell_type": "markdown", |
| 29 | + "metadata": {}, |
| 30 | + "source": [ |
| 31 | + "# Prompt and response creation projects\n", |
| 32 | + "\n", |
| 33 | + "This notebook will provide an example workflow of setting up a prompt and response type project with the Labelbox-Python SDK." |
| 34 | + ] |
| 35 | + }, |
| 36 | + { |
| 37 | + "cell_type": "markdown", |
| 38 | + "metadata": {}, |
| 39 | + "source": [ |
| 40 | + "## Set up" |
| 41 | + ] |
| 42 | + }, |
| 43 | + { |
| 44 | + "cell_type": "code", |
| 45 | + "execution_count": null, |
| 46 | + "metadata": {}, |
| 47 | + "outputs": [], |
| 48 | + "source": [ |
| 49 | + "%pip install -q --upgrade \"labelbox[data]\"" |
| 50 | + ] |
| 51 | + }, |
| 52 | + { |
| 53 | + "cell_type": "code", |
| 54 | + "execution_count": null, |
| 55 | + "metadata": {}, |
| 56 | + "outputs": [], |
| 57 | + "source": [ |
| 58 | + "import labelbox as lb" |
| 59 | + ] |
| 60 | + }, |
| 61 | + { |
| 62 | + "cell_type": "markdown", |
| 63 | + "metadata": {}, |
| 64 | + "source": [ |
| 65 | + "## API key and client\n", |
| 66 | + "Please provide a valid API key below to connect to the Labelbox client properly. For more information, please review the [Create API key guide](https://docs.labelbox.com/reference/create-api-key)." |
| 67 | + ] |
| 68 | + }, |
| 69 | + { |
| 70 | + "cell_type": "code", |
| 71 | + "execution_count": null, |
| 72 | + "metadata": {}, |
| 73 | + "outputs": [], |
| 74 | + "source": [ |
| 75 | + "API_KEY = None\n", |
| 76 | + "client = lb.Client(api_key=API_KEY)" |
| 77 | + ] |
| 78 | + }, |
| 79 | + { |
| 80 | + "cell_type": "markdown", |
| 81 | + "metadata": {}, |
| 82 | + "source": [ |
| 83 | + "## Example: Create prompt response projects and ontologies\n", |
| 84 | + "\n", |
| 85 | + "The steps to creating prompt response projects and ontologies through the Labelbox-Python SDK are similar to creating a regular project. However, they vary slightly, and we will showcase the different methods in this example workflow." |
| 86 | + ] |
| 87 | + }, |
| 88 | + { |
| 89 | + "cell_type": "markdown", |
| 90 | + "metadata": {}, |
| 91 | + "source": [ |
| 92 | + "### Create a prompt and response ontology\n", |
| 93 | + "\n", |
| 94 | + "You can create ontologies for prompt and response projects in the same way as other project ontologies using two methods: `client.create_ontology` and `client.create_ontology_from_feature_schemas`. For response creation projects the only difference between other projects is the media_type for the project needs to be set to `lb.MediaType.Text`. For prompt and prompt response creation projects you need to include their respective media type: `lb.MediaType.LLMPromptCreation` and `lb.MediaType.LLMPromptResponseCreation`. Additional you also need to provide an `ontology_kind` parameter, which needs to be set to `lb.OntologyKind.ResponseCreation` this is only applicable for prompt and prompt response creation projects." |
| 95 | + ] |
| 96 | + }, |
| 97 | + { |
| 98 | + "cell_type": "markdown", |
| 99 | + "metadata": {}, |
| 100 | + "source": [ |
| 101 | + "#### Option A: `client.create_ontology`\n", |
| 102 | + "\n", |
| 103 | + "Typically, you create ontologies and generate the associated features simultaneously. Below is an example of creating an ontology for your prompt and response projects using supported classifications; for information on supported annotation types, visit our [prompt and response generation](https://docs.labelbox.com/docs/prompt-and-response-generation-editor) guide.\n", |
| 104 | + "\n", |
| 105 | + "Depending if you were creating a prompt, response, or prompt and response creation projects you would not need certain classifications inside your ontologies. For information on supported annotation types, visit our [prompt and response generation](doc:prompt-and-response-generation-editor#supported-prompt-formats) guide. In this notebook, we will be creating a prompt and response creation ontology. " |
| 106 | + ] |
| 107 | + }, |
| 108 | + { |
| 109 | + "cell_type": "markdown", |
| 110 | + "metadata": {}, |
| 111 | + "source": [ |
| 112 | + "##### Prompt and response creation ontology" |
| 113 | + ] |
| 114 | + }, |
| 115 | + { |
| 116 | + "cell_type": "code", |
| 117 | + "execution_count": null, |
| 118 | + "metadata": {}, |
| 119 | + "outputs": [], |
| 120 | + "source": [ |
| 121 | + "ontology_builder = lb.OntologyBuilder(\n", |
| 122 | + " tools=[],\n", |
| 123 | + " classifications=[\n", |
| 124 | + " lb.PromptResponseClassification(\n", |
| 125 | + " class_type=lb.PromptResponseClassification.Type.PROMPT,\n", |
| 126 | + " name=\"prompt text\",\n", |
| 127 | + " character_min = 1, # Minimum character count of prompt field (optional)\n", |
| 128 | + " character_max = 20, # Maximum character count of prompt field (optional)\n", |
| 129 | + " ),\n", |
| 130 | + " lb.PromptResponseClassification(\n", |
| 131 | + " class_type=lb.PromptResponseClassification.Type.RESPONSE_CHECKLIST,\n", |
| 132 | + " name=\"response checklist feature\",\n", |
| 133 | + " options=[\n", |
| 134 | + " lb.Option(value=\"option 1\", label=\"option 1\"),\n", |
| 135 | + " lb.Option(value=\"option 2\", label=\"option 2\"),\n", |
| 136 | + " ],\n", |
| 137 | + " ),\n", |
| 138 | + " lb.PromptResponseClassification(\n", |
| 139 | + " class_type=lb.PromptResponseClassification.Type.RESPONSE_RADIO,\n", |
| 140 | + " name=\"response radio feature\",\n", |
| 141 | + " options=[\n", |
| 142 | + " lb.Option(value=\"first_radio_answer\"),\n", |
| 143 | + " lb.Option(value=\"second_radio_answer\"),\n", |
| 144 | + " ],\n", |
| 145 | + " ),\n", |
| 146 | + " lb.PromptResponseClassification(\n", |
| 147 | + " class_type=lb.PromptResponseClassification.Type.RESPONSE_TEXT,\n", |
| 148 | + " name=\"response text\",\n", |
| 149 | + " character_min = 1, # Minimum character count of response text field (optional)\n", |
| 150 | + " character_max = 20, # Maximum character count of response text field (optional)\n", |
| 151 | + " )\n", |
| 152 | + " ],\n", |
| 153 | + ")\n", |
| 154 | + "\n", |
| 155 | + "# Create ontology\n", |
| 156 | + "ontology = client.create_ontology(\n", |
| 157 | + " \"Prompt and response ontology\",\n", |
| 158 | + " ontology_builder.asdict(),\n", |
| 159 | + " media_type=lb.MediaType.LLMPromptResponseCreation,\n", |
| 160 | + ")" |
| 161 | + ] |
| 162 | + }, |
| 163 | + { |
| 164 | + "cell_type": "markdown", |
| 165 | + "metadata": {}, |
| 166 | + "source": [ |
| 167 | + "### Option B: `client.create_ontology_from_feature_schemas`\n", |
| 168 | + "Ontologies can also be created with feature schema IDs. This makes your ontologies with existing features compared to generating new features. You can get these features by going to the _Schema_ tab inside Labelbox. (uncomment the below code block for this option)" |
| 169 | + ] |
| 170 | + }, |
| 171 | + { |
| 172 | + "cell_type": "code", |
| 173 | + "execution_count": null, |
| 174 | + "metadata": {}, |
| 175 | + "outputs": [], |
| 176 | + "source": [ |
| 177 | + "# ontology = client.create_ontology_from_feature_schemas(\n", |
| 178 | + "# \"LMC ontology\",\n", |
| 179 | + "# feature_schema_ids=[\"<list of feature schema ids\"],\n", |
| 180 | + "# media_type=lb.MediaType.Conversational,\n", |
| 181 | + "# ontology_kind=lb.OntologyKind.ModelEvaluation,\n", |
| 182 | + "# )" |
| 183 | + ] |
| 184 | + }, |
| 185 | + { |
| 186 | + "cell_type": "markdown", |
| 187 | + "metadata": {}, |
| 188 | + "source": [ |
| 189 | + "### Create response creation projects\n", |
| 190 | + "\n", |
| 191 | + "You can create response creation projects using `client.create_response_creation_project`, which uses the same parameters as `client.create_project` but provides better validation to ensure the project is set up correctly. Additionally, you need to import text data rows to be used as prompts." |
| 192 | + ] |
| 193 | + }, |
| 194 | + { |
| 195 | + "cell_type": "code", |
| 196 | + "execution_count": null, |
| 197 | + "metadata": {}, |
| 198 | + "outputs": [], |
| 199 | + "source": [ |
| 200 | + "response_creation_project = client.create_response_creation_project(\n", |
| 201 | + " name=\"Demo response creation\",\n", |
| 202 | + " description=\"<project_description>\", # optional\n", |
| 203 | + ")" |
| 204 | + ] |
| 205 | + }, |
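| | +  { |
| | +   "cell_type": "markdown", |
| | +   "metadata": {}, |
| | +   "source": [ |
| | +    "The cell below is a minimal sketch of that data row import step, assuming a brand-new dataset: it creates a dataset of text data rows and sends them to the response creation project as a batch. The dataset name, prompt text, global keys, and batch name are placeholder values for illustration only." |
| | +   ] |
| | +  }, |
| | +  { |
| | +   "cell_type": "code", |
| | +   "execution_count": null, |
| | +   "metadata": {}, |
| | +   "outputs": [], |
| | +   "source": [ |
| | +    "# Sketch (assumption): create a dataset of text data rows to serve as prompts\n", |
| | +    "# for the response creation project. Names and global keys are placeholders.\n", |
| | +    "dataset = client.create_dataset(name=\"Demo response creation dataset\")\n", |
| | +    "\n", |
| | +    "data_rows = [\n", |
| | +    "    {\n", |
| | +    "        \"row_data\": \"Write a short poem about data labeling.\",\n", |
| | +    "        \"global_key\": \"response-creation-prompt-1\",\n", |
| | +    "    },\n", |
| | +    "    {\n", |
| | +    "        \"row_data\": \"Summarize the benefits of active learning.\",\n", |
| | +    "        \"global_key\": \"response-creation-prompt-2\",\n", |
| | +    "    },\n", |
| | +    "]\n", |
| | +    "\n", |
| | +    "task = dataset.create_data_rows(data_rows)\n", |
| | +    "task.wait_till_done()\n", |
| | +    "print(task.errors)\n", |
| | +    "\n", |
| | +    "# Send the new data rows to the response creation project as a batch\n", |
| | +    "batch = response_creation_project.create_batch(\n", |
| | +    "    \"Demo response creation batch\",  # each batch in a project must have a unique name\n", |
| | +    "    global_keys=[dr[\"global_key\"] for dr in data_rows],\n", |
| | +    "    priority=5,  # priority between 1 (highest) and 5 (lowest)\n", |
| | +    ")" |
| | +   ] |
| | +  }, |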
| 206 | + { |
| 207 | + "cell_type": "markdown", |
| 208 | + "metadata": {}, |
| 209 | + "source": [ |
| 210 | + "## Create prompt response and prompt creation projects\n", |
| 211 | + "\n", |
| 212 | + "When creating a prompt response or prompt creation project using client.create_prompt_response_generation_project, you do not need to create data rows because they are generated automatically. This method takes the same parameters as the traditional client.create_project but with a few specific additional parameters.\n", |
| 213 | + "\n", |
| 214 | + "Parameters\n", |
| 215 | + "The `client.create_prompt_response_generation_project` method requires the following parameters:\n", |
| 216 | + "\n", |
| 217 | + "- `create_prompt_response_generation_project` parameters:\n", |
| 218 | + "\n", |
| 219 | + " - `name` (required): The name of your new project.\n", |
| 220 | + "\n", |
| 221 | + " - `description`: An optional description of your project.\n", |
| 222 | + "\n", |
| 223 | + " - `media_type` (required): The type of assets this project accepts. Can be either lb.MediaType.LLMPromptCreation or MediaType.LLMPromptResponseCreation, depending on the project type you are setting up.\n", |
| 224 | + "\n", |
| 225 | + " - `dataset_name`: The name of the dataset where the generated data rows will be located. Include this parameter only if you want to create a new dataset.\n", |
| 226 | + "\n", |
| 227 | + " - `dataset_id`: An optional dataset ID of an existing Labelbox dataset. Include this parameter if you want to append it to an existing dataset.\n", |
| 228 | + "\n", |
| 229 | + " - `data_row_count`: The number of data row assets that will be generated and used with your project." |
| 230 | + ] |
| 231 | + }, |
| 232 | + { |
| 233 | + "cell_type": "code", |
| 234 | + "execution_count": null, |
| 235 | + "metadata": {}, |
| 236 | + "outputs": [], |
| 237 | + "source": [ |
| 238 | + "project = client.create_prompt_response_generation_project(\n", |
| 239 | + " name=\"Demo prompt response project\",\n", |
| 240 | + " media_type=lb.MediaType.LLMPromptResponseCreation,\n", |
| 241 | + " dataset_name=\"Demo prompt response dataset\",\n", |
| 242 | + " data_row_count=100,\n", |
| 243 | + ")\n", |
| 244 | + "\n", |
| 245 | + "# Setup project with ontology created above\n", |
| 246 | + "project.connect_ontology(ontology)" |
| 247 | + ] |
| 248 | + }, |
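| | +  { |
| | +   "cell_type": "markdown", |
| | +   "metadata": {}, |
| | +   "source": [ |
| | +    "If you would rather append the generated data rows to an existing dataset, the commented-out sketch below shows the same call with `dataset_id` in place of `dataset_name`. The dataset ID is a placeholder; uncomment and replace it with a real ID to use this variant." |
| | +   ] |
| | +  }, |
| | +  { |
| | +   "cell_type": "code", |
| | +   "execution_count": null, |
| | +   "metadata": {}, |
| | +   "outputs": [], |
| | +   "source": [ |
| | +    "# Sketch (assumption): append generated data rows to an existing dataset\n", |
| | +    "# project = client.create_prompt_response_generation_project(\n", |
| | +    "#     name=\"Demo prompt response project\",\n", |
| | +    "#     media_type=lb.MediaType.LLMPromptResponseCreation,\n", |
| | +    "#     dataset_id=\"<existing_dataset_id>\",\n", |
| | +    "#     data_row_count=100,\n", |
| | +    "# )" |
| | +   ] |
| | +  }, |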
| 249 | + { |
| 250 | + "cell_type": "markdown", |
| 251 | + "metadata": {}, |
| 252 | + "source": [ |
| 253 | + "## Exporting prompt response, prompt or response create project\n", |
| 254 | + "Exporting from a prompt and response type project works the same as exporting from other projects. In this example, your export will be empty unless you create labels within the Labelbox platform. See prompt and response export for a sample export." |
| 255 | + ] |
| 256 | + }, |
| 257 | + { |
| 258 | + "cell_type": "code", |
| 259 | + "execution_count": null, |
| 260 | + "metadata": {}, |
| 261 | + "outputs": [], |
| 262 | + "source": [ |
| 263 | + "# The return type of this method is an `ExportTask`, which is a wrapper of a`Task`\n", |
| 264 | + "# Most of `Task` features are also present in `ExportTask`.\n", |
| 265 | + "\n", |
| 266 | + "export_params = {\n", |
| 267 | + " \"attachments\": True,\n", |
| 268 | + " \"metadata_fields\": True,\n", |
| 269 | + " \"data_row_details\": True,\n", |
| 270 | + " \"project_details\": True,\n", |
| 271 | + " \"label_details\": True,\n", |
| 272 | + " \"performance_details\": True,\n", |
| 273 | + " \"interpolated_frames\": True,\n", |
| 274 | + "}\n", |
| 275 | + "\n", |
| 276 | + "# Note: Filters follow AND logic, so typically using one filter is sufficient.\n", |
| 277 | + "filters = {\n", |
| 278 | + " \"last_activity_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n", |
| 279 | + " \"label_created_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n", |
| 280 | + " \"workflow_status\": \"InReview\",\n", |
| 281 | + " \"batch_ids\": [\"batch_id_1\", \"batch_id_2\"],\n", |
| 282 | + " \"data_row_ids\": [\"data_row_id_1\", \"data_row_id_2\"],\n", |
| 283 | + " \"global_keys\": [\"global_key_1\", \"global_key_2\"],\n", |
| 284 | + "}\n", |
| 285 | + "\n", |
| 286 | + "export_task = project.export(params=export_params, filters=filters)\n", |
| 287 | + "export_task.wait_till_done()\n", |
| 288 | + "\n", |
| 289 | + "# Return a JSON output string from the export task results/errors one by one:\n", |
| 290 | + "def json_stream_handler(output: lb.BufferedJsonConverterOutput):\n", |
| 291 | + " print(output.json)\n", |
| 292 | + "\n", |
| 293 | + "\n", |
| 294 | + "if export_task.has_errors():\n", |
| 295 | + " export_task.get_buffered_stream(stream_type=lb.StreamType.ERRORS).start(\n", |
| 296 | + " stream_handler=lambda error: print(error)\n", |
| 297 | + " )\n", |
| 298 | + "\n", |
| 299 | + "if export_task.has_result():\n", |
| 300 | + " export_json = export_task.get_buffered_stream(\n", |
| 301 | + " stream_type=lb.StreamType.RESULT\n", |
| 302 | + " ).start(stream_handler=json_stream_handler)\n", |
| 303 | + "\n", |
| 304 | + "print(\"file size: \", export_task.get_total_file_size(stream_type=lb.StreamType.RESULT))\n", |
| 305 | + "print(\"line count: \", export_task.get_total_lines(stream_type=lb.StreamType.RESULT))" |
| 306 | + ] |
| 307 | + }, |
| 308 | + { |
| 309 | + "cell_type": "markdown", |
| 310 | + "metadata": {}, |
| 311 | + "source": [ |
| 312 | + "## Clean up\n", |
| 313 | + "\n", |
| 314 | + "This section serves as an optional clean-up step to delete the Labelbox assets created within this guide. You will need to uncomment the delete methods shown." |
| 315 | + ] |
| 316 | + }, |
| 317 | + { |
| 318 | + "cell_type": "code", |
| 319 | + "execution_count": null, |
| 320 | + "metadata": {}, |
| 321 | + "outputs": [], |
| 322 | + "source": [ |
| 323 | + "# project.delete()\n", |
| 324 | + "# response_creation_project.delete()\n", |
| 325 | + "# client.delete_unused_ontology(ontology.uid)\n", |
| 326 | + "# dataset.delete()" |
| 327 | + ] |
| 328 | + } |
| 329 | + ], |
| 330 | + "metadata": { |
| 331 | + "language_info": { |
| 332 | + "name": "python" |
| 333 | + } |
| 334 | + }, |
| 335 | + "nbformat": 4, |
| 336 | + "nbformat_minor": 2 |
| 337 | +} |