From c711c51d25d3eb4c5510d90c9f3b1cea471000d9 Mon Sep 17 00:00:00 2001 From: Gabe <33893811+Gabefire@users.noreply.github.com> Date: Mon, 29 Jul 2024 19:24:06 -0500 Subject: [PATCH 1/3] Rename conversational_LLM_data_generation.ipynb to LLM_data_generation.ipynb --- ...onal_LLM_data_generation.ipynb => LLM_data_generation.ipynb} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename annotation_import/{conversational_LLM_data_generation.ipynb => LLM_data_generation.ipynb} (99%) diff --git a/annotation_import/conversational_LLM_data_generation.ipynb b/annotation_import/LLM_data_generation.ipynb similarity index 99% rename from annotation_import/conversational_LLM_data_generation.ipynb rename to annotation_import/LLM_data_generation.ipynb index d0225cf..2b8a6af 100644 --- a/annotation_import/conversational_LLM_data_generation.ipynb +++ b/annotation_import/LLM_data_generation.ipynb @@ -266,4 +266,4 @@ "execution_count": null } ] -} \ No newline at end of file +} From b691c98322db5738471713a6b6d2d3e62f1e3f16 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" Date: Tue, 30 Jul 2024 00:25:07 +0000 Subject: [PATCH 2/3] :art: Cleaned --- annotation_import/LLM_data_generation.ipynb | 6 +- .../prompt_response_projects.ipynb | 248 +++++------------- 2 files changed, 65 insertions(+), 189 deletions(-) diff --git a/annotation_import/LLM_data_generation.ipynb b/annotation_import/LLM_data_generation.ipynb index 2b8a6af..2ee85cd 100644 --- a/annotation_import/LLM_data_generation.ipynb +++ b/annotation_import/LLM_data_generation.ipynb @@ -16,12 +16,12 @@ "metadata": {}, "source": [ "\n", - "\n", "\n", "\n", "\n", - "\n", "" ], @@ -266,4 +266,4 @@ "execution_count": null } ] -} +} \ No newline at end of file diff --git a/project_configuration/prompt_response_projects.ipynb b/project_configuration/prompt_response_projects.ipynb index 3deab9a..47c1b2e 100644 --- a/project_configuration/prompt_response_projects.ipynb +++ b/project_configuration/prompt_response_projects.ipynb @@ 
-1,16 +1,18 @@ { + "nbformat": 4, + "nbformat_minor": 2, + "metadata": {}, "cells": [ { - "cell_type": "markdown", "metadata": {}, "source": [ - "\n", - " \n", + "", + " ", "\n" - ] + ], + "cell_type": "markdown" }, { - "cell_type": "markdown", "metadata": {}, "source": [ "\n", @@ -22,80 +24,73 @@ "\n", "" - ] + ], + "cell_type": "markdown" }, { - "cell_type": "markdown", "metadata": {}, "source": [ "# Prompt and response creation projects\n", "\n", "This notebook will provide an example workflow of setting up a prompt and response type project with the Labelbox-Python SDK." - ] + ], + "cell_type": "markdown" }, { - "cell_type": "markdown", "metadata": {}, "source": [ "## Set up" - ] + ], + "cell_type": "markdown" }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "%pip install -q --upgrade \"labelbox[data]\"", + "cell_type": "code", "outputs": [], - "source": [ - "%pip install -q --upgrade \"labelbox[data]\"" - ] + "execution_count": null }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "import labelbox as lb", + "cell_type": "code", "outputs": [], - "source": [ - "import labelbox as lb" - ] + "execution_count": null }, { - "cell_type": "markdown", "metadata": {}, "source": [ "## API key and client\n", "Replace the value of `API_KEY` with a valid [API key](https://docs.labelbox.com/reference/create-api-key) to connect to the Labelbox client."
- ] + ], + "cell_type": "markdown" }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "API_KEY = None\nclient = lb.Client(api_key=API_KEY)", + "cell_type": "code", "outputs": [], - "source": [ - "API_KEY = None\n", - "client = lb.Client(api_key=API_KEY)" - ] + "execution_count": null }, { - "cell_type": "markdown", "metadata": {}, "source": [ "## Example: Create prompt and response projects and ontologies\n", "\n", "The steps of creating prompt and response projects and corresponding ontologies using the Labelbox-Python SDK are similar to creating a regular project, and we will describe the slight differences for each scenario." - ] + ], + "cell_type": "markdown" }, { - "cell_type": "markdown", "metadata": {}, "source": [ "### Create a prompt and response ontology\n", "\n", "You can create ontologies for prompt and response projects in the same way as other project ontologies using two methods: `client.create_ontology` and `client.create_ontology_from_feature_schemas`. You also need to include the respective `media_type`: `lb.MediaType.LLMPromptCreation` and `lb.MediaType.LLMPromptResponseCreation`. Additionally, you need to provide an `ontology_kind` parameter set to `lb.OntologyKind.ResponseCreation` that is only applicable for prompt and prompt response creation projects." - ] + ], + "cell_type": "markdown" }, { - "cell_type": "markdown", "metadata": {}, "source": [ "#### Option A: `client.create_ontology`\n", @@ -103,109 +98,55 @@ "Typically, you create ontologies and generate the associated features simultaneously. 
Below is an example of creating an ontology for your prompt and response projects using supported classifications; for information on supported annotation types, see [prompt and response generation](https://docs.labelbox.com/docs/prompt-and-response-generation-editor).\n", "\n", "Depending on whether you are creating a prompt, response, or prompt and response creation project, certain classifications are not needed in your ontology; see [supported prompt formats](doc:prompt-and-response-generation-editor#supported-prompt-formats) for details. In this notebook, we will create a prompt and response creation ontology. " - ] + ], + "cell_type": "markdown" }, { - "cell_type": "markdown", "metadata": {}, "source": [ "##### Prompt and response creation ontology" - ] + ], + "cell_type": "markdown" }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "ontology_builder = lb.OntologyBuilder(\n tools=[],\n classifications=[\n lb.PromptResponseClassification(\n class_type=lb.PromptResponseClassification.Type.PROMPT,\n name=\"prompt text\",\n character_min=1, # Minimum character count of prompt field (optional)\n character_max=\n 20, # Maximum character count of prompt field (optional)\n ),\n lb.PromptResponseClassification(\n class_type=lb.PromptResponseClassification.Type.RESPONSE_CHECKLIST,\n name=\"response checklist feature\",\n options=[\n lb.Option(value=\"option 1\", label=\"option 1\"),\n lb.Option(value=\"option 2\", label=\"option 2\"),\n ],\n ),\n lb.PromptResponseClassification(\n class_type=lb.PromptResponseClassification.Type.RESPONSE_RADIO,\n name=\"response radio feature\",\n options=[\n lb.Option(value=\"first_radio_answer\"),\n lb.Option(value=\"second_radio_answer\"),\n ],\n ),\n lb.PromptResponseClassification(\n class_type=lb.PromptResponseClassification.Type.RESPONSE_TEXT,\n name=\"response text\",\n character_min=\n 1, # Minimum character count of response text field (optional)\n 
character_max=\n 20, # Maximum character count of response text field (optional)\n ),\n ],\n)\n\n# Create ontology\nontology = client.create_ontology(\n \"Prompt and response ontology\",\n ontology_builder.asdict(),\n media_type=lb.MediaType.LLMPromptResponseCreation,\n)", + "cell_type": "code", "outputs": [], - "source": [ - "ontology_builder = lb.OntologyBuilder(\n", - " tools=[],\n", - " classifications=[\n", - " lb.PromptResponseClassification(\n", - " class_type=lb.PromptResponseClassification.Type.PROMPT,\n", - " name=\"prompt text\",\n", - " character_min = 1, # Minimum character count of prompt field (optional)\n", - " character_max = 20, # Maximum character count of prompt field (optional)\n", - " ),\n", - " lb.PromptResponseClassification(\n", - " class_type=lb.PromptResponseClassification.Type.RESPONSE_CHECKLIST,\n", - " name=\"response checklist feature\",\n", - " options=[\n", - " lb.Option(value=\"option 1\", label=\"option 1\"),\n", - " lb.Option(value=\"option 2\", label=\"option 2\"),\n", - " ],\n", - " ),\n", - " lb.PromptResponseClassification(\n", - " class_type=lb.PromptResponseClassification.Type.RESPONSE_RADIO,\n", - " name=\"response radio feature\",\n", - " options=[\n", - " lb.Option(value=\"first_radio_answer\"),\n", - " lb.Option(value=\"second_radio_answer\"),\n", - " ],\n", - " ),\n", - " lb.PromptResponseClassification(\n", - " class_type=lb.PromptResponseClassification.Type.RESPONSE_TEXT,\n", - " name=\"response text\",\n", - " character_min = 1, # Minimum character count of response text field (optional)\n", - " character_max = 20, # Maximum character count of response text field (optional)\n", - " )\n", - " ],\n", - ")\n", - "\n", - "# Create ontology\n", - "ontology = client.create_ontology(\n", - " \"Prompt and response ontology\",\n", - " ontology_builder.asdict(),\n", - " media_type=lb.MediaType.LLMPromptResponseCreation,\n", - ")" - ] + "execution_count": null }, { - "cell_type": "markdown", "metadata": {}, "source": [ "### 
Option B: `client.create_ontology_from_feature_schemas`\n", "You can also create ontologies using feature schema IDs, which build your ontology from existing features instead of generating new ones. You can get these features by going to the _Schema_ tab inside Labelbox." - ] + ], + "cell_type": "markdown" }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "# Uncomment the following code block for this option\n# ontology = client.create_ontology_from_feature_schemas(\n# \"LMC ontology\",\n# feature_schema_ids=[\"\", # optional\n)", + "cell_type": "code", "outputs": [], - "source": [ - "response_creation_project = client.create_response_creation_project(\n", - " name=\"Demo response creation\",\n", - " description=\"\", # optional\n", - ")" - ] + "execution_count": null }, { - "cell_type": "markdown", "metadata": {}, "source": [ "## Create prompt response and prompt creation projects\n", @@ -228,111 +169,46 @@ " - `dataset_id`: An optional dataset ID of an existing Labelbox dataset. Include this parameter if you want to append it to an existing dataset.\n", "\n", " - `data_row_count`: The number of data row assets that will be generated and used with your project."
- ] + ], + "cell_type": "markdown" }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "project = client.create_prompt_response_generation_project(\n name=\"Demo prompt response project\",\n media_type=lb.MediaType.LLMPromptResponseCreation,\n dataset_name=\"Demo prompt response dataset\",\n data_row_count=100,\n)\n\n# Setup project with ontology created above\nproject.connect_ontology(ontology)", + "cell_type": "code", "outputs": [], - "source": [ - "project = client.create_prompt_response_generation_project(\n", - " name=\"Demo prompt response project\",\n", - " media_type=lb.MediaType.LLMPromptResponseCreation,\n", - " dataset_name=\"Demo prompt response dataset\",\n", - " data_row_count=100,\n", - ")\n", - "\n", - "# Setup project with ontology created above\n", - "project.connect_ontology(ontology)" - ] + "execution_count": null }, { - "cell_type": "markdown", "metadata": {}, "source": [ "## Exporting from prompt response, prompt, or response creation projects\n", "Exporting from a prompt and response type project works the same as exporting from other projects. In this example, your export will be empty unless you create labels within the Labelbox platform. See prompt and response export for a sample export."
- ] + ], + "cell_type": "markdown" }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "# The return type of this method is an `ExportTask`, which is a wrapper of a `Task`\n# Most of `Task`'s features are also present in `ExportTask`.\n\nexport_params = {\n \"attachments\": True,\n \"metadata_fields\": True,\n \"data_row_details\": True,\n \"project_details\": True,\n \"label_details\": True,\n \"performance_details\": True,\n \"interpolated_frames\": True,\n}\n\n# Note: Filters follow AND logic, so typically using one filter is sufficient.\nfilters = {\n \"last_activity_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n \"label_created_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n \"workflow_status\": \"InReview\",\n \"batch_ids\": [\"batch_id_1\", \"batch_id_2\"],\n \"data_row_ids\": [\"data_row_id_1\", \"data_row_id_2\"],\n \"global_keys\": [\"global_key_1\", \"global_key_2\"],\n}\n\nexport_task = project.export(params=export_params, filters=filters)\nexport_task.wait_till_done()\n\n\n# Return a JSON output string from the export task results/errors one by one:\ndef json_stream_handler(output: lb.BufferedJsonConverterOutput):\n print(output.json)\n\n\nif export_task.has_errors():\n export_task.get_buffered_stream(stream_type=lb.StreamType.ERRORS).start(\n stream_handler=lambda error: print(error))\n\nif export_task.has_result():\n export_json = export_task.get_buffered_stream(\n stream_type=lb.StreamType.RESULT).start(\n stream_handler=json_stream_handler)\n\nprint(\n \"file size: \",\n export_task.get_total_file_size(stream_type=lb.StreamType.RESULT),\n)\nprint(\n \"line count: \",\n export_task.get_total_lines(stream_type=lb.StreamType.RESULT),\n)", + "cell_type": "code", "outputs": [], - "source": [ - "# The return type of this method is an `ExportTask`, which is a wrapper of a`Task`\n", - "# Most of `Task` features are also present in `ExportTask`.\n", - "\n", - "export_params = {\n", - " \"attachments\": 
True,\n", - " \"metadata_fields\": True,\n", - " \"data_row_details\": True,\n", - " \"project_details\": True,\n", - " \"label_details\": True,\n", - " \"performance_details\": True,\n", - " \"interpolated_frames\": True,\n", - "}\n", - "\n", - "# Note: Filters follow AND logic, so typically using one filter is sufficient.\n", - "filters = {\n", - " \"last_activity_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n", - " \"label_created_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n", - " \"workflow_status\": \"InReview\",\n", - " \"batch_ids\": [\"batch_id_1\", \"batch_id_2\"],\n", - " \"data_row_ids\": [\"data_row_id_1\", \"data_row_id_2\"],\n", - " \"global_keys\": [\"global_key_1\", \"global_key_2\"],\n", - "}\n", - "\n", - "export_task = project.export(params=export_params, filters=filters)\n", - "export_task.wait_till_done()\n", - "\n", - "# Return a JSON output string from the export task results/errors one by one:\n", - "def json_stream_handler(output: lb.BufferedJsonConverterOutput):\n", - " print(output.json)\n", - "\n", - "\n", - "if export_task.has_errors():\n", - " export_task.get_buffered_stream(stream_type=lb.StreamType.ERRORS).start(\n", - " stream_handler=lambda error: print(error)\n", - " )\n", - "\n", - "if export_task.has_result():\n", - " export_json = export_task.get_buffered_stream(\n", - " stream_type=lb.StreamType.RESULT\n", - " ).start(stream_handler=json_stream_handler)\n", - "\n", - "print(\"file size: \", export_task.get_total_file_size(stream_type=lb.StreamType.RESULT))\n", - "print(\"line count: \", export_task.get_total_lines(stream_type=lb.StreamType.RESULT))" - ] + "execution_count": null }, { - "cell_type": "markdown", "metadata": {}, "source": [ "## Clean up\n", "\n", "This section serves as an optional clean-up step to delete the Labelbox assets created within this guide. You will need to uncomment the delete methods shown." 
- ] + ], + "cell_type": "markdown" }, { - "cell_type": "code", - "execution_count": null, "metadata": {}, + "source": "# project.delete()\n# response_creation_project.delete()\n# client.delete_unused_ontology(ontology.uid)\n# dataset.delete()", + "cell_type": "code", "outputs": [], - "source": [ - "# project.delete()\n", - "# response_creation_project.delete()\n", - "# client.delete_unused_ontology(ontology.uid)\n", - "# dataset.delete()" - ] + "execution_count": null } - ], - "metadata": { - "language_info": { - "name": "python" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} + ] +} \ No newline at end of file From 0e6507e121144924ff4e8c738eb4c9ce4b6fbda3 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" Date: Tue, 30 Jul 2024 00:25:34 +0000 Subject: [PATCH 3/3] :memo: README updated --- README.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 9031bec..0b1f0f7 100644 --- a/README.md +++ b/README.md @@ -126,6 +126,11 @@ Welcome to Labelbox Notebooks! These documents are directly linked from our Labe Open In Colab Open In Github + + Prompt response projects + Open In Colab + Open In Github + @@ -155,11 +160,6 @@ Welcome to Labelbox Notebooks! These documents are directly linked from our Labe Open In Colab Open In Github - - Conversational LLM data generation - Open In Colab - Open In Github - Image Open In Colab @@ -180,6 +180,11 @@ Welcome to Labelbox Notebooks! These documents are directly linked from our Labe Open In Colab Open In Github + + LLM data generation + Open In Colab + Open In Github + Text Open In Colab
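A side note on the export cell in the cleaned `prompt_response_projects.ipynb` above: the `filters` date ranges are plain `"YYYY-MM-DD HH:MM:SS"` strings, so a reversed or malformed range only fails once the export task runs server-side. A quick client-side sanity check can surface such mistakes before `project.export` is called. The helper below is a hypothetical sketch (not part of the Labelbox SDK), using only the filter values shown in the notebook:

```python
from datetime import datetime

# Format used by the notebook's date-range filter values
DATE_FMT = "%Y-%m-%d %H:%M:%S"


def validate_date_range(name: str, date_range: list) -> bool:
    """Check that a [start, end] export-filter range is well-formed before exporting."""
    if len(date_range) != 2:
        raise ValueError(f"{name}: expected [start, end], got {date_range!r}")
    # strptime raises ValueError on malformed timestamps
    start, end = (datetime.strptime(value, DATE_FMT) for value in date_range)
    if start >= end:
        raise ValueError(f"{name}: start {date_range[0]} is not before end {date_range[1]}")
    return True


# Same date-range filters as in the notebook's export cell
filters = {
    "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
    "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
}

for key, value in filters.items():
    validate_date_range(key, value)
```

Running a check like this first turns a silent or late server-side failure into an immediate, descriptive exception.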