Update semantic search examples (#10322) (#10323)

opensearch-trigger-bot[bot] · web-flow · commit c6c5b17310e5 · 2025-07-16T21:02:29.000Z
diff --git a/_tutorials/vector-search/index.md b/_tutorials/vector-search/index.md
@@ -16,6 +16,22 @@ vector_search_101:
   - heading: "Getting started with semantic and hybrid search"
     description: "Build your first AI search application"
     link: "/tutorials/vector-search/neural-search-tutorial/"
+ai_search_types:
+  - heading: "Semantic search"
+    description: "Understands the meaning and intent behind a query to deliver more relevant results"
+    link: "/vector-search/ai-search/semantic-search/"
+  - heading: "Hybrid search"
+    description: "Improves relevance by combining keyword-based and semantic search techniques"
+    link: "/vector-search/ai-search/hybrid-search/"
+  - heading: "Multimodal search"
+    description: "Enables searching across different types of data, such as text and images"
+    link: "/vector-search/ai-search/multimodal-search/"
+  - heading: "Neural sparse search"
+    description: "Uses sparse vector representations and deep learning models for efficient retrieval"
+    link: "/vector-search/ai-search/neural-sparse-search/"
+  - heading: "Conversational search with RAG"
+    description: "Combines natural dialogue with retrieval-augmented generation to provide contextual answers"
+    link: "/vector-search/ai-search/conversational-search/"
 other:
   - heading: "Vector operations"
     description: "Learn how to generate embeddings and optimize vector storage"
@@ -36,6 +52,10 @@ Explore the following tutorials to learn about implementing vector search applic
 
 {% include cards.html cards=page.vector_search_101 %}
 
+## AI search types
+
+{% include cards.html cards=page.ai_search_types %}
+
 ## Vector search applications
 
 {% include cards.html cards=page.other %}
diff --git a/_tutorials/vector-search/neural-search-tutorial.md b/_tutorials/vector-search/neural-search-tutorial.md
@@ -51,7 +51,6 @@ PUT _cluster/settings
 {
   "persistent": {
     "plugins.ml_commons.only_run_on_ml_node": "false",
-    "plugins.ml_commons.model_access_control_enabled": "true",
     "plugins.ml_commons.native_memory_threshold": "99"
   }
 }
@@ -106,10 +105,10 @@ For information about choosing a model, see [Further reading](#further-reading).
 
 ### Step 2: Register and deploy the model 
 
-To register the model, provide the model group ID in the register request:
+To register and deploy the model, provide the model group ID in the register request:
 
 ```json
-POST /_plugins/_ml/models/_register
+POST /_plugins/_ml/models/_register?deploy=true
 {
   "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
   "version": "1.0.3",
@@ -205,32 +204,7 @@ The response contains the model information. You can see that the `model_state`
 
 #### Advanced: Registering a custom model
 
-To register a custom model, you must provide a model configuration in the register request. For example, the following is a register request containing the full format for the model used in this tutorial:
-
-```json
-POST /_plugins/_ml/models/_register
-{
-	"name": "sentence-transformers/msmarco-distilbert-base-tas-b",
-	"version": "1.0.1",
-	"description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.",
-	"model_task_type": "TEXT_EMBEDDING",
-	"model_format": "ONNX",
-	"model_content_size_in_bytes": 266291330,
-	"model_content_hash_value": "a3c916f24239fbe32c43be6b24043123d49cd2c41b312fc2b29f2fc65e3c424c",
-	"model_config": {
-		"model_type": "distilbert",
-		"embedding_dimension": 768,
-		"framework_type": "huggingface_transformers",
-		"pooling_mode": "CLS",
-		"normalize_result": false,
-		"all_config": "{\"_name_or_path\":\"old_models/msmarco-distilbert-base-tas-b/0_Transformer\",\"activation\":\"gelu\",\"architectures\":[\"DistilBertModel\"],\"attention_dropout\":0.1,\"dim\":768,\"dropout\":0.1,\"hidden_dim\":3072,\"initializer_range\":0.02,\"max_position_embeddings\":512,\"model_type\":\"distilbert\",\"n_heads\":12,\"n_layers\":6,\"pad_token_id\":0,\"qa_dropout\":0.1,\"seq_classif_dropout\":0.2,\"sinusoidal_pos_embds\":false,\"tie_weights_\":true,\"transformers_version\":\"4.7.0\",\"vocab_size\":30522}"
-	},
-	"created_time": 1676074079195,
-	"url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/msmarco-distilbert-base-tas-b/1.0.1/onnx/sentence-transformers_msmarco-distilbert-base-tas-b-1.0.1-onnx.zip"
-}
-```
-
-For more information, see [Using ML models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/).
+To register a custom model, you must provide a model configuration in the register request. For more information, see [Using ML models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/).
 
 <details markdown="block">
   <summary>
diff --git a/_vector-search/getting-started/auto-generated-embeddings.md b/_vector-search/getting-started/auto-generated-embeddings.md
@@ -23,7 +23,6 @@ PUT _cluster/settings
 {
   "persistent": {
     "plugins.ml_commons.only_run_on_ml_node": "false",
-    "plugins.ml_commons.model_access_control_enabled": "true",
     "plugins.ml_commons.native_memory_threshold": "99"
   }
 }
@@ -261,16 +260,16 @@ You can use automated workflows to create and deploy externally hosted models an
 
 ### Step 1: Register and deploy the model
 
-To register and deploy a model, select the built-in workflow template for the model provider. For more information, see [Supported workflow templates]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-templates/#supported-workflow-templates). Alternatively, to configure a custom model, use [Step 1 of the manual setup](#step-1-register-and-deploy-the-model).
+To register and deploy a model, select the built-in workflow template for the model provider. For more information, see [Supported workflow templates]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-templates/#supported-workflow-templates). Alternatively, to configure a custom model, use [Step 1 of the manual setup](#step-1-register-and-deploy-the-model). Note the model ID; you'll use it in the next step.
 
 ### Step 2: Configure a workflow
 
-Create and provision a semantic search workflow. You must provide the model ID for the configured model. Review your selected workflow template [defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-defaults.json) to determine whether you need to update any of the parameters. For example, if the model dimensionality is different from the default (`1024`), specify the dimensionality of your model in the `output_dimension` parameter. Change the workflow template default text field from `passage_text` to `text` in order to match the manual example:
+Create and provision a semantic search workflow. You must provide the model ID for the model deployed in the previous step. Review your selected workflow template [defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-defaults.json) to determine whether you need to update any of the parameters. For example, if the model dimensionality is different from the default (`1024`), specify the dimensionality of your model in the `output_dimension` parameter. Change the workflow template default text field from `passage_text` to `text` in order to match the manual example:
 
 ```json
 POST /_plugins/_flow_framework/workflow?use_case=semantic_search&provision=true
 {
-    "create_ingest_pipeline.model_id" : "mBGzipQB2gmRjlv_dOoB",
+    "create_ingest_pipeline.model_id" : "aVeif4oB5Vm0Tdw8zYO2",
     "text_embedding.field_map.output.dimension": "768",
     "text_embedding.field_map.input": "text"
 }
