A Model Context Protocol (MCP) server for querying Weights & Biases data. This server allows an MCP client to:
- query W&B Models runs and sweeps
- query W&B Weave traces, evaluations and datasets
- query wandbot, the W&B support agent, for general W&B feature questions
- write text and charts to W&B Reports
Please first install uv with either:
curl -LsSf https://astral.sh/uv/install.sh | sh
or
brew install uv
Enable the server for a specific project. Run the following in the root of your project dir:
uvx --from git+https://github.com/wandb/wandb-mcp-server -- add_to_client --config_path .cursor/mcp.json && uvx wandb login
Enable the server for all Cursor projects. This command can be run from any directory:
uvx --from git+https://github.com/wandb/wandb-mcp-server -- add_to_client --config_path ~/.cursor/mcp.json && uvx wandb login
uvx --from git+https://github.com/wandb/wandb-mcp-server -- add_to_client --config_path ~/.codeium/windsurf/mcp_config.json && uvx wandb login
claude mcp add wandb -- uvx --from git+https://github.com/wandb/wandb-mcp-server wandb_mcp_server && uvx wandb login
To pass an environment variable to Claude Code, such as an API key:
claude mcp add wandb -e WANDB_API_KEY=your-api-key -- uvx --from git+https://github.com/wandb/wandb-mcp-server wandb_mcp_server
First ensure uv is installed. You might have to use Homebrew to install it, despite uv being available in your terminal. Then run the following:
uvx --from git+https://github.com/wandb/wandb-mcp-server -- add_to_client --config_path "~/Library/Application Support/Claude/claude_desktop_config.json" && uvx wandb login
- Ensure you have uv installed; see the installation instructions for uv above.
- Get your W&B API key here
- Add the following to your MCP client config manually.
{
"mcpServers": {
"wandb": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/wandb/wandb-mcp-server",
"wandb_mcp_server"
],
"env": {
"WANDB_API_KEY": "<insert your wandb key>"
}
}
}
}
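For readers who prefer to script the manual step, the following is a minimal sketch of merging the entry above into an existing client config without disturbing other servers. The function name `add_wandb_server` is hypothetical and this is not how the `add_to_client` helper is implemented; it only assumes the config is a JSON file with a top-level `mcpServers` object, as shown above.

```python
import json
from pathlib import Path


# Hypothetical helper for illustration; not part of wandb-mcp-server.
def add_wandb_server(config_path, api_key):
    """Merge a "wandb" entry into an MCP client config file and return the config."""
    path = Path(config_path).expanduser()
    # Start from the existing config (if any) so other servers are preserved.
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["wandb"] = {
        "command": "uvx",
        "args": [
            "--from",
            "git+https://github.com/wandb/wandb-mcp-server",
            "wandb_mcp_server",
        ],
        "env": {"WANDB_API_KEY": api_key},
    }
    # json.dumps never emits trailing commas, so the file stays valid JSON.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `add_wandb_server(".cursor/mcp.json", "<insert your wandb key>")` from a project root would write the same structure as the manual config above.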
The helper utilities above are inspired by the OpenMCP Server Registry add-to-client pattern.
query_wandb_tool
Execute queries against wandb experiment tracking data including Runs & Sweeps.
query_weave_traces_tool
Queries Weave evaluations and traces with powerful filtering, sorting, and pagination options. Returns either complete trace data or just metadata to avoid overwhelming the LLM context window.
count_weave_traces_tool
Efficiently counts Weave traces matching given filters without returning the trace data. Returns both total trace count and root traces count to understand project scope before querying.
query_wandb_support_bot
Connect your client to wandbot, our RAG-powered support agent, for general help on how to use Weights & Biases products and features.
create_wandb_report_tool
Creates a new W&B Report with markdown text and HTML-rendered visualizations. Provides a permanent, shareable document for saving analysis findings and generated charts.
query_wandb_entity_projects
List the available W&B entities and projects that can be accessed to give the LLM more context on how to write the correct queries for the above tools.
LLMs are not mind readers, so make sure you specify the W&B entity and W&B project to the LLM. Example query for Claude Desktop:
how many openai.chat traces in the wandb-applied-ai-team/mcp-tests weave project? plot the most recent 5 traces over time and save to a report
Questions such as "what is my best evaluation?" are probably too broad. You'll get to an answer faster by asking a more specific question, such as "what eval had the highest f1 score?"
When asking broad, general questions such as "what are my best performing runs/evaluations?" it's always a good idea to ask the LLM to check that it retrieved all the available runs. The MCP tools are designed to fetch the correct amount of data, but LLMs sometimes tend to retrieve only the latest runs or the last N runs.
The add_to_client function accepts a number of flags to enable writing optional environment variables to the server's config file. Below is an example of setting other environment variables that don't have dedicated flags.
# Write the server config file with additional env vars
uvx --from git+https://github.com/wandb/wandb-mcp-server -- add_to_client \
--config_path ~/.codeium/windsurf/mcp_config.json \
--write_env_vars MCP_LOGS_WANDB_ENTITY=my_wandb_entity
# Then login to W&B
uvx wandb login
Arguments passed to --write_env_vars must be space-separated, and the key and value of each environment variable must be separated only by a =.
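To make the expected format concrete, here is a small sketch of how space-separated `KEY=VALUE` arguments map to an environment dictionary. The function `parse_env_vars` is hypothetical and is not taken from the `add_to_client` source; it only illustrates the rule described above.

```python
# Hypothetical helper for illustration; not the add_to_client implementation.
def parse_env_vars(pairs):
    """Turn ["KEY=VALUE", ...] into {"KEY": "VALUE", ...}."""
    env = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        env[key] = value
    return env


print(parse_env_vars(["MCP_LOGS_WANDB_ENTITY=my_wandb_entity"]))
# prints {'MCP_LOGS_WANDB_ENTITY': 'my_wandb_entity'}
```

An argument such as `MCP_LOGS_WANDB_ENTITY = my_wandb_entity` (with spaces around the `=`) would arrive as three separate arguments and be rejected by this sketch.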
Run the server from source by running the below in the root dir:
wandb login && uv run src/wandb_mcp_server/server.py
The full list of environment variables used to control the server's settings can be found in the .env.example file.
Ensure the machine running the MCP server is authenticated to Weights & Biases, either by setting the WANDB_API_KEY environment variable or by running the following to add the key to the .netrc file:
uvx wandb login
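A quick way to check which of the two options is in effect on a machine is sketched below. It assumes the login command stores the key under the host `api.wandb.ai` in `~/.netrc`; that host name and the function `wandb_credentials_source` are assumptions for illustration, not something the server exposes.

```python
import netrc
import os
from pathlib import Path

# Assumption: the default W&B host used as the netrc "machine" entry.
WANDB_HOST = "api.wandb.ai"


def wandb_credentials_source(env=None, netrc_path=None):
    """Return "env", "netrc", or None depending on where a W&B key is found."""
    env = os.environ if env is None else env
    if env.get("WANDB_API_KEY"):
        return "env"
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.exists():
        return None
    try:
        entry = netrc.netrc(str(path)).authenticators(WANDB_HOST)
    except (netrc.NetrcParseError, OSError):
        # An unreadable or malformed file counts as "no credentials".
        return None
    return "netrc" if entry else None


print(wandb_credentials_source())
```

If this prints `None`, neither the environment variable nor a matching .netrc entry was found under the stated assumptions.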
If you encounter an error like this when starting the MCP server:
Error: spawn uv ENOENT
This indicates that the uv package manager cannot be found. Fix this with these steps:
- Install uv using the official installation script:
curl -LsSf https://astral.sh/uv/install.sh | sh
or if using a Mac:
brew install uv
- If the error persists after installation, create a symlink to make uv available system-wide:
sudo ln -s ~/.local/bin/uv /usr/local/bin/uv
- Restart your application or IDE after making these changes.
This ensures that the uv executable is accessible from standard system paths that are typically included in the PATH for all processes.
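To see whether uv is visible on a given search path, a minimal sketch using the standard library's `shutil.which` follows. The function name `locate_uv` is hypothetical. Note that running it in a terminal only reflects the terminal's PATH, which may differ from the PATH a GUI application passes to the processes it spawns.

```python
import shutil
from pathlib import Path


# Hypothetical helper for illustration.
def locate_uv(search_path=None):
    """Return the full path to the uv executable, or None if it is not found.

    search_path is an os.pathsep-separated string of directories; None means
    the PATH of the current process.
    """
    return shutil.which("uv", path=search_path)


found = locate_uv()
if found:
    print(f"uv found at {found}")
else:
    # ~/.local/bin is the source location used in the symlink step above.
    candidate = Path.home() / ".local" / "bin" / "uv"
    if candidate.exists():
        print(f"uv is not on PATH, but it exists at {candidate}")
    else:
        print("uv is not on PATH and was not found in ~/.local/bin")
```

Passing a restricted path, for example `locate_uv("/usr/local/bin:/usr/bin:/bin")`, approximates what a process started outside your shell might see.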
The tests include a mix of unit tests and integration tests that test the tool-calling reliability of an LLM. For now the integration tests only use claude-sonnet-3.7.
Set the appropriate API key in the .env file, e.g.
ANTHROPIC_API_KEY=<my_key>
Run a single test file using pytest with 10 workers:
uv run pytest -s -n 10 tests/test_query_wandb_gql.py
Turn on debug logging for a single sample in one test file:
pytest -s -n 1 "tests/test_query_weave_traces.py::test_query_weave_trace[longest_eval_most_expensive_child]" -v --log-cli-level=DEBUG