Commit 47f3b49

update the MLflow tracing guide (#8353)
1 parent dae38d3 commit 47f3b49

5 files changed: +90 -31 lines changed

Binary file not shown.

docs/docs/tutorials/observability/index.md

Lines changed: 90 additions & 31 deletions
@@ -14,27 +14,36 @@ We'll start by creating a simple ReAct agent that uses ColBERTv2's Wikipedia dat

 ```python
 import dspy
-from dspy.datasets import HotPotQA
+import os

-lm = dspy.LM('openai/gpt-4o-mini')
-colbert = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
-dspy.configure(lm=lm, rm=colbert)
+os.environ["OPENAI_API_KEY"] = "{your_openai_api_key}"

-agent = dspy.ReAct("question -> answer", tools=[dspy.Retrieve(k=1)])
+lm = dspy.LM("openai/gpt-4o-mini")
+colbert = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")
+dspy.configure(lm=lm)
+
+
+def retrieve(query: str):
+    """Retrieve the top 3 relevant passages from ColBERTv2."""
+    results = colbert(query, k=3)
+    return [x["text"] for x in results]
+
+
+agent = dspy.ReAct("question -> answer", tools=[retrieve], max_iters=3)
 ```

 Now, let's ask the agent a simple question:

 ```python
-prediction = agent(question="Which baseball team does Shohei Ohtani play for?")
+prediction = agent(question="Which baseball team does Shohei Ohtani play for in June 2025?")
 print(prediction.answer)
 ```

 ```
-Shohei Ohtani plays for the Los Angeles Angels.
+Shohei Ohtani is expected to play for the Hokkaido Nippon-Ham Fighters in June 2025, based on the available information.
 ```

-Oh, this is incorrect. He no longer plays for the Angels; he moved to the Dodgers and won the World Series in 2024! Let's debug the program and explore potential fixes.
+Oh, this is incorrect. He no longer plays for the Hokkaido Nippon-Ham Fighters; he moved to the Dodgers and won the World Series in 2024! Let's debug the program and explore potential fixes.

 ## Using ``inspect_history``

@@ -57,13 +66,16 @@ Your input fields are:

 Response:

-[[ ## Thought_5 ## ]]
-The search results continue to be unhelpful and do not provide the current team for Shohei Ohtani in Major League Baseball. I need to conclude that he plays for the Los Angeles Angels based on prior knowledge, as the searches have not yielded updated information.
+Response:
+
+[[ ## reasoning ## ]]
+The search for information regarding Shohei Ohtani's team in June 2025 did not yield any specific results. The retrieved data consistently mentioned that he plays for the Hokkaido Nippon-Ham Fighters, but there was no indication of any changes or updates regarding his team for the specified date. Given the lack of information, it is reasonable to conclude that he may still be with the Hokkaido Nippon-Ham Fighters unless there are future developments that are not captured in the current data.

-[[ ## Action_5 ## ]]
-Finish[Los Angeles Angels]
+[[ ## answer ## ]]
+Shohei Ohtani is expected to play for the Hokkaido Nippon-Ham Fighters in June 2025, based on the available information.

 [[ ## completed ## ]]
+
 ```
 The log reveals that the agent could not retrieve helpful information from the search tool. However, what exactly did the retriever return? While useful, `inspect_history` has some limitations:

@@ -81,40 +93,73 @@ The log reveals that the agent could not retrieve helpful information from the s
 pip install -U "mlflow>=2.18.0"
 ```

+After installation, spin up your server via the command below.
+
+```
+# It is highly recommended to use a SQL store when using MLflow tracing
+mlflow server --backend-store-uri sqlite:///mydb.sqlite
+```
+
+If you don't specify a different port via the `--port` flag, your MLflow server will be hosted at port 5000.
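Before running the agent, it can help to confirm that the tracking server is actually reachable on the expected port. The following is a minimal sketch using only the standard library, not part of the guide itself: `server_url` and `is_server_up` are hypothetical helper names, and the `/health` path is an assumption about the liveness endpoint exposed by `mlflow server`, so adjust it if your deployment differs.

```python
import urllib.request


def server_url(host: str = "127.0.0.1", port: int = 5000) -> str:
    """Build the tracking URI; 5000 is the default port mentioned above."""
    return f"http://{host}:{port}"


def is_server_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to <base_url>/health answers with HTTP 200.

    The /health path is an assumption about the MLflow server.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError/HTTPError (connection refused, bad status) and timeouts.
        return False
```

If the server was started with a custom port, for example `--port 8080`, the check would be `is_server_up(server_url(port=8080))`.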
+
+Now let's change our code snippet to enable MLflow tracing. We need to:
+
+- Tell MLflow where the server is hosted.
+- Apply `mlflow.autolog()` so that DSPy tracing is automatically captured.
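The two steps in the list could be sketched in isolation as follows. This is a minimal sketch under stated assumptions rather than the guide's own listing: `enable_tracing` is a hypothetical wrapper, the URI assumes a local server on the default port 5000, and it calls the DSPy-specific `mlflow.dspy.autolog()` entry point that the previous version of this guide used instead of the generic `mlflow.autolog()`.

```python
TRACKING_URI = "http://127.0.0.1:5000"  # assumed: local MLflow server on the default port
EXPERIMENT_NAME = "DSPy"


def enable_tracing(tracking_uri: str = TRACKING_URI,
                   experiment: str = EXPERIMENT_NAME) -> None:
    """Point MLflow at the server, select an experiment, and enable DSPy autologging."""
    # Imported lazily so that importing this module does not require MLflow.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri)  # step 1: where the server is hosted
    mlflow.set_experiment(experiment)      # traces are grouped under this experiment
    mlflow.dspy.autolog()                  # step 2: capture DSPy calls as traces
```

Calling `enable_tracing()` once, before running the agent, should be enough for subsequent DSPy calls to show up as traces.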
+
+The full code is below. Let's run it again!
+
 ```python
+import dspy
+import os
 import mlflow

-mlflow.dspy.autolog()
+os.environ["OPENAI_API_KEY"] = "{your_openai_api_key}"

-# This is optional. Create an MLflow Experiment to store and organize your traces.
+# Tell MLflow about the server URI.
+mlflow.set_tracking_uri("http://127.0.0.1:5000")
+# Create a unique name for your experiment.
 mlflow.set_experiment("DSPy")
-```

-Now you're all set! Let's run your agent again:
+lm = dspy.LM("openai/gpt-4o-mini")
+colbert = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")
+dspy.configure(lm=lm)

-```python
-agent(question="Which baseball team does Shohei Ohtani play for?")
-```

-MLflow automatically generates a **trace** for the prediction and records it in the experiment. To explore traces visually, launch the MLflow UI by the following command and access it in your browser:
+def retrieve(query: str):
+    """Retrieve the top 3 relevant passages from ColBERTv2."""
+    results = colbert(query, k=3)
+    return [x["text"] for x in results]

-```bash
-mlflow ui --port 5000
+
+agent = dspy.ReAct("question -> answer", tools=[retrieve], max_iters=3)
+print(agent(question="Which baseball team does Shohei Ohtani play for?"))
 ```

-![DSPy MLflow Tracing](./dspy-tracing.gif)

-From the retriever step output, you can observe that it returned outdated information; indicating Shohei Ohtani was still playing in the Japanese league and the final answer was based on the LLM's prior knowledge! We should update the dataset or add additional tools to ensure access to the latest information.
+MLflow automatically generates a **trace** for each prediction and records it within your experiment. To explore these traces visually, open `http://127.0.0.1:5000/`
+in your browser, then select the experiment you just created and navigate to the Traces tab:

-!!! info Learn more about MLflow
+![MLflow Trace UI](./mlflow_trace_ui.png)

-    MLflow is an end-to-end LLMOps platform that offers extensive features like experiment tracking, evaluation, and deployment. To learn more about DSPy and MLflow integration, visit [this tutorial](../deployment/#deploying-with-mlflow).
+Click on the most recent trace to view its detailed breakdown:
+
+![MLflow Trace View](./mlflow_trace_view.png)

-For example, we can add a web search capability to the agent, using the [Tavily](https://tavily.com/) web search API.
+Here, you can examine the input and output of every step in your workflow. For example, the screenshot above shows the `retrieve` function's input and output. By inspecting the retriever's output, you can see that it returned outdated information, which is not sufficient to determine which team Shohei Ohtani plays for in June 2025. You can also inspect
+other steps, e.g., the language model's input, output, and configuration.
+
+To address the issue of outdated information, you can replace the `retrieve` function with a web search tool powered by [Tavily search](https://www.tavily.com/).

 ```python
-from dspy.predict.react import Tool
 from tavily import TavilyClient
+import dspy
+import mlflow
+
+# Tell MLflow about the server URI.
+mlflow.set_tracking_uri("http://127.0.0.1:5000")
+# Create a unique name for your experiment.
+mlflow.set_experiment("DSPy")

 search_client = TavilyClient(api_key="<YOUR_TAVILY_API_KEY>")

@@ -123,7 +168,7 @@ def web_search(query: str) -> list[str]:
     response = search_client.search(query)
     return [r["content"] for r in response["results"]]

-agent = dspy.ReAct("question -> answer", tools=[Tool(web_search)])
+agent = dspy.ReAct("question -> answer", tools=[web_search])

 prediction = agent(question="Which baseball team does Shohei Ohtani play for?")
 print(prediction.answer)
@@ -133,6 +178,20 @@ print(agent.answer)
 Los Angeles Dodgers
 ```

+Below is a GIF demonstrating how to navigate through the MLflow UI:
+
+![MLflow Trace UI Navigation](./mlflow_trace_ui_navigation.gif)
+
+
+For a complete guide on how to use MLflow tracing, please refer to
+the [MLflow Tracing Guide](https://mlflow.org/docs/3.0.0rc0/tracing).
+
+
+
+!!! info Learn more about MLflow
+
+    MLflow is an end-to-end LLMOps platform that offers extensive features like experiment tracking, evaluation, and deployment. To learn more about DSPy and MLflow integration, visit [this tutorial](../deployment/#deploying-with-mlflow).
+

 ## Building a Custom Logging Solution

@@ -145,7 +204,7 @@ Sometimes, you may want to implement a custom logging solution. For instance, yo
 |`on_adapter_format_start` / `on_adapter_format_end`| Triggered when a `dspy.Adapter` subclass formats the input prompt. |
 |`on_adapter_parse_start` / `on_adapter_parse_end`| Triggered when a `dspy.Adapter` subclass postprocesses the output text from an LM. |

-Heres an example of custom callback that logs the intermediate steps of a ReAct agent:
+Here's an example of a custom callback that logs the intermediate steps of a ReAct agent:

 ```python
 import dspy
@@ -183,4 +242,4 @@ dspy.configure(callbacks=[AgentLoggingCallback()])

 !!! info Handling Inputs and Outputs in Callbacks

-    Be cautious when working with input or output data in callbacks. Mutating them in-place can modify the original data passed to the program, potentially leading to unexpected behavior. To avoid this, its strongly recommended to create a copy of the data before performing any operations that may alter it.
+    Be cautious when working with input or output data in callbacks. Mutating them in-place can modify the original data passed to the program, potentially leading to unexpected behavior. To avoid this, it's strongly recommended to create a copy of the data before performing any operations that may alter it.
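
To illustrate the copy-before-mutate advice, here is a small sketch that uses only the standard library. `redact_inputs` is a hypothetical helper of the kind a callback might apply to the `inputs` dict it receives, and the `"api_key"` field is an invented example, not something DSPy passes by default.

```python
import copy
from typing import Any


def redact_inputs(inputs: dict[str, Any],
                  secret_keys: tuple[str, ...] = ("api_key",)) -> dict[str, Any]:
    """Return a redacted deep copy of `inputs`, leaving the original untouched."""
    safe = copy.deepcopy(inputs)  # copy first, so edits never reach the caller's data
    for key in secret_keys:
        if key in safe:
            safe[key] = "***"
    return safe


original = {"question": "Which team?", "api_key": "sk-123", "history": ["hi"]}
logged = redact_inputs(original)
logged["history"].append("mutated in the logger")  # only the copy changes

print(original["history"])  # ['hi']
print(logged["api_key"])    # ***
```

A deep copy is used rather than a shallow one because nested containers, such as the `history` list here, would otherwise still be shared with the program's own data.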