
Commit c86e678

[Docs] v1.66.0-stable fixes (#9953)
* add categories for spend tracking improvements
* xai reasoning usage
* docs tag management
* docs tag based routing
* [Beta] Routing based on request metadata
* docs tag based routing
* docs tag routing
* docs enterprise web search
1 parent eb998ee commit c86e678

File tree

9 files changed: +335 −13 lines changed

docs/my-website/docs/providers/vertex.md

Lines changed: 96 additions & 4 deletions
@@ -347,7 +347,7 @@ Return a `list[Recipe]`
 completion(model="vertex_ai/gemini-1.5-flash-preview-0514", messages=messages, response_format={ "type": "json_object" })
 ```
 
-### **Grounding**
+### **Grounding - Web Search**
 
 Add Google Search Result grounding to vertex ai calls.
 
@@ -358,7 +358,7 @@ See the grounding metadata with `response_obj._hidden_params["vertex_ai_grounding_metadata"]`
 <Tabs>
 <TabItem value="sdk" label="SDK">
 
-```python
+```python showLineNumbers
 from litellm import completion
 
 ## SETUP ENVIRONMENT
@@ -377,14 +377,36 @@ print(resp)
 </TabItem>
 <TabItem value="proxy" label="PROXY">
 
-```bash
+<Tabs>
+<TabItem value="openai" label="OpenAI Python SDK">
+
+```python showLineNumbers
+from openai import OpenAI
+
+client = OpenAI(
+    api_key="sk-1234",  # pass litellm proxy key, if you're using virtual keys
+    base_url="http://0.0.0.0:4000/v1/"  # point to litellm proxy
+)
+
+response = client.chat.completions.create(
+    model="gemini-pro",
+    messages=[{"role": "user", "content": "Who won the world cup?"}],
+    tools=[{"googleSearchRetrieval": {}}],
+)
+
+print(response)
+```
+</TabItem>
+<TabItem value="curl" label="cURL">
+
+```bash showLineNumbers
 curl http://localhost:4000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer sk-1234" \
   -d '{
     "model": "gemini-pro",
     "messages": [
-      {"role": "user", "content": "Hello, Claude!"}
+      {"role": "user", "content": "Who won the world cup?"}
     ],
     "tools": [
       {
@@ -394,12 +416,82 @@ curl http://localhost:4000/v1/chat/completions \
   }'
 
 ```
+</TabItem>
+</Tabs>
 
 </TabItem>
 </Tabs>
 
 You can also use the `enterpriseWebSearch` tool for an [enterprise compliant search](https://cloud.google.com/vertex-ai/generative-ai/docs/grounding/web-grounding-enterprise).
 
+<Tabs>
+<TabItem value="sdk" label="SDK">
+
+```python showLineNumbers
+import litellm
+
+## SETUP ENVIRONMENT
+# !gcloud auth application-default login - run this to add vertex credentials to your env
+
+tools = [{"enterpriseWebSearch": {}}]  # 👈 ADD GOOGLE ENTERPRISE SEARCH
+
+resp = litellm.completion(
+    model="vertex_ai/gemini-1.0-pro-001",
+    messages=[{"role": "user", "content": "Who won the world cup?"}],
+    tools=tools,
+)
+
+print(resp)
+```
+</TabItem>
+<TabItem value="proxy" label="PROXY">
+
+<Tabs>
+<TabItem value="openai" label="OpenAI Python SDK">
+
+```python showLineNumbers
+from openai import OpenAI
+
+client = OpenAI(
+    api_key="sk-1234",  # pass litellm proxy key, if you're using virtual keys
+    base_url="http://0.0.0.0:4000/v1/"  # point to litellm proxy
+)
+
+response = client.chat.completions.create(
+    model="gemini-pro",
+    messages=[{"role": "user", "content": "Who won the world cup?"}],
+    tools=[{"enterpriseWebSearch": {}}],
+)
+
+print(response)
+```
+</TabItem>
+<TabItem value="curl" label="cURL">
+
+```bash showLineNumbers
+curl http://localhost:4000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer sk-1234" \
+  -d '{
+    "model": "gemini-pro",
+    "messages": [
+      {"role": "user", "content": "Who won the world cup?"}
+    ],
+    "tools": [
+      {
+        "enterpriseWebSearch": {}
+      }
+    ]
+  }'
+```
+</TabItem>
+</Tabs>
+
+</TabItem>
+</Tabs>
+
 #### **Moving from Vertex AI SDK to LiteLLM (GROUNDING)**
 
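Note: the grounding docs above mention reading metadata from `response_obj._hidden_params["vertex_ai_grounding_metadata"]`. A minimal sketch of checking that after an SDK call (model name taken from the docs above; the key may be absent when the model does not ground the answer, so the lookup is guarded):

```python
import litellm

# Grounded request via Google Search retrieval, as shown in the diff above.
resp = litellm.completion(
    model="vertex_ai/gemini-1.5-flash-preview-0514",
    messages=[{"role": "user", "content": "Who won the world cup?"}],
    tools=[{"googleSearchRetrieval": {}}],
)

# _hidden_params is a dict on the LiteLLM response; the grounding key is only
# populated when grounding results are returned.
grounding = resp._hidden_params.get("vertex_ai_grounding_metadata")
if grounding:
    print(grounding)
else:
    print("No grounding metadata returned for this response.")
```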

docs/my-website/docs/providers/xai.md

Lines changed: 78 additions & 0 deletions
@@ -176,3 +176,81 @@ Here's how to call a XAI model with the LiteLLM Proxy Server
 </Tabs>
 
 
+## Reasoning Usage
+
+LiteLLM supports reasoning usage for xAI models.
+
+<Tabs>
+
+<TabItem value="python" label="LiteLLM Python SDK">
+
+```python showLineNumbers title="reasoning with xai/grok-3-mini-beta"
+import litellm
+
+response = litellm.completion(
+    model="xai/grok-3-mini-beta",
+    messages=[{"role": "user", "content": "What is 101*3?"}],
+    reasoning_effort="low",
+)
+
+print("Reasoning Content:")
+print(response.choices[0].message.reasoning_content)
+
+print("\nFinal Response:")
+print(response.choices[0].message.content)
+
+print("\nNumber of completion tokens:")
+print(response.usage.completion_tokens)
+
+print("\nNumber of reasoning tokens:")
+print(response.usage.completion_tokens_details.reasoning_tokens)
+```
+</TabItem>
+
+<TabItem value="curl" label="LiteLLM Proxy - OpenAI SDK Usage">
+
+```python showLineNumbers title="reasoning with xai/grok-3-mini-beta"
+import openai
+
+client = openai.OpenAI(
+    api_key="sk-1234",  # pass litellm proxy key, if you're using virtual keys
+    base_url="http://0.0.0.0:4000"  # litellm-proxy-base url
+)
+
+response = client.chat.completions.create(
+    model="xai/grok-3-mini-beta",
+    messages=[{"role": "user", "content": "What is 101*3?"}],
+    reasoning_effort="low",
+)
+
+print("Reasoning Content:")
+print(response.choices[0].message.reasoning_content)
+
+print("\nFinal Response:")
+print(response.choices[0].message.content)
+
+print("\nNumber of completion tokens:")
+print(response.usage.completion_tokens)
+
+print("\nNumber of reasoning tokens:")
+print(response.usage.completion_tokens_details.reasoning_tokens)
+```
+
+</TabItem>
+</Tabs>
+
+**Example Response:**
+
+```shell
+Reasoning Content:
+Let me calculate 101 multiplied by 3:
+101 * 3 = 303.
+I can double-check that: 100 * 3 is 300, and 1 * 3 is 3, so 300 + 3 = 303. Yes, that's correct.
+
+Final Response:
+The result of 101 multiplied by 3 is 303.
+
+Number of completion tokens:
+14
+
+Number of reasoning tokens:
+310
+```
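Note: `reasoning_content` and `completion_tokens_details` only appear when the provider returns them. A minimal defensive sketch, assuming the same `xai/grok-3-mini-beta` call as above:

```python
import litellm

response = litellm.completion(
    model="xai/grok-3-mini-beta",
    messages=[{"role": "user", "content": "What is 101*3?"}],
    reasoning_effort="low",
)

# Fall back gracefully instead of raising if the reasoning fields are missing.
message = response.choices[0].message
print("Reasoning:", getattr(message, "reasoning_content", None))
print("Answer:", message.content)

details = getattr(response.usage, "completion_tokens_details", None)
reasoning_tokens = getattr(details, "reasoning_tokens", None) if details else None
print("Completion tokens:", response.usage.completion_tokens)
print("Reasoning tokens:", reasoning_tokens)
```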
Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
+import Image from '@theme/IdealImage';
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+# [Beta] Routing based on request metadata
+
+Create routing rules based on request metadata.
+
+## Setup
+
+Add the following to your litellm proxy config yaml file.
+
+```yaml showLineNumbers title="litellm proxy config.yaml"
+router_settings:
+  enable_tag_filtering: True # 👈 Key Change
+```
+
+## 1. Create a tag
+
+On the LiteLLM UI, navigate to Experimental > Tag Management > Create Tag.
+
+Create a tag called `private-data` and select only the models that are allowed for requests with this tag. Once created, you will see the tag in the Tag Management page.
+
+<Image img={require('../../img/tag_create.png')} style={{ width: '800px', height: 'auto' }} />
+
+## 2. Test Tag Routing
+
+Now we will test the tag-based routing rules.
+
+### 2.1 Invalid model
+
+This request will fail since we send `tags=private-data` but the model `gpt-4o` is not in the allowed models for the `private-data` tag.
+
+<Image img={require('../../img/tag_invalid.png')} style={{ width: '800px', height: 'auto' }} />
+
+<br />
+
+Here is an example sending the same request using the OpenAI Python SDK.
+
+<Tabs>
+<TabItem value="python" label="OpenAI Python SDK">
+
+```python showLineNumbers
+from openai import OpenAI
+
+client = OpenAI(
+    api_key="sk-1234",
+    base_url="http://0.0.0.0:4000/v1/"
+)
+
+response = client.chat.completions.create(
+    model="gpt-4o",
+    messages=[
+        {"role": "user", "content": "Hello, how are you?"}
+    ],
+    extra_body={
+        "tags": "private-data"
+    }
+)
+```
+
+</TabItem>
+<TabItem value="curl" label="cURL">
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234' \
+-d '{
+  "model": "gpt-4o",
+  "messages": [
+    {
+      "role": "user",
+      "content": "Hello, how are you?"
+    }
+  ],
+  "tags": "private-data"
+}'
+```
+
+</TabItem>
+</Tabs>
+
+<br />
+
+### 2.2 Valid model
+
+This request will succeed since we send `tags=private-data` and the model `us.anthropic.claude-3-7-sonnet-20250219-v1:0` is in the allowed models for the `private-data` tag.
+
+<Image img={require('../../img/tag_valid.png')} style={{ width: '800px', height: 'auto' }} />
+
+Here is an example sending the same request using the OpenAI Python SDK.
+
+<Tabs>
+<TabItem value="python" label="OpenAI Python SDK">
+
+```python showLineNumbers
+from openai import OpenAI
+
+client = OpenAI(
+    api_key="sk-1234",
+    base_url="http://0.0.0.0:4000/v1/"
+)
+
+response = client.chat.completions.create(
+    model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    messages=[
+        {"role": "user", "content": "Hello, how are you?"}
+    ],
+    extra_body={
+        "tags": "private-data"
+    }
+)
+```
+
+</TabItem>
+<TabItem value="curl" label="cURL">
+
+```bash
+curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
+-H 'Content-Type: application/json' \
+-H 'Authorization: Bearer sk-1234' \
+-d '{
+  "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+  "messages": [
+    {
+      "role": "user",
+      "content": "Hello, how are you?"
+    }
+  ],
+  "tags": "private-data"
+}'
+```
+
+</TabItem>
+</Tabs>
+
+## Additional Tag Features
+- [Sending tags in request headers](https://docs.litellm.ai/docs/proxy/tag_routing#calling-via-request-header)
+- [Tag based routing](https://docs.litellm.ai/docs/proxy/tag_routing)
+- [Track spend per tag](cost_tracking#-custom-tags)
+- [Setup Budgets per Virtual Key, Team](users)
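Note: when a tagged request targets a model outside the tag's allow-list (as in 2.1 above), the proxy rejects it. A minimal sketch of handling that rejection client-side, assuming the proxy surfaces it as a standard HTTP API error (the exact status code and message may vary by version):

```python
import openai

client = openai.OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000/v1/"
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",  # not in the private-data tag's allowed models
        messages=[{"role": "user", "content": "Hello, how are you?"}],
        extra_body={"tags": "private-data"},
    )
    print(response.choices[0].message.content)
except openai.APIStatusError as e:
    # The proxy returns an HTTP error when no deployment matches the tag.
    print(f"Tag routing rejected the request ({e.status_code}): {e.message}")
```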

docs/my-website/img/tag_create.png

250 KB

docs/my-website/img/tag_invalid.png

237 KB

docs/my-website/img/tag_valid.png

319 KB
