The W&B Model Context Protocol (MCP) server exposes your W&B data as a set of tools that any MCP client can call. Connect your IDE or AI agent to the server and ask it to analyze experiments, debug Weave traces, generate reports, inspect artifacts and the registry, and search the W&B documentation, all without writing custom GraphQL or SDK code. Supported clients include Cursor, Visual Studio Code, Claude Code, OpenAI Codex, Gemini CLI, Mistral LeChat, Claude Desktop, and the OpenAI Responses API.
Choose a setup
The hosted MCP server is the default for every deployment. The only thing that differs between deployment types is the URL your client connects to. A local install is an escape hatch for when you need more flexibility than the hosted server offers.

Hosted server (recommended)
A W&B-managed MCP server that your client connects to over HTTP with your W&B API key. No installation, no local process to maintain.
- Multi-tenant Cloud: `https://mcp.withwandb.com/mcp`
- Dedicated Cloud or Self-Managed: `https://<your-instance>/mcp`, once W&B enables it on your deployment.
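As a quick smoke test, you can hit the hosted endpoint directly. The sketch below assumes the Multi-tenant URL and sends a minimal JSON-RPC `tools/list` request; the `Accept` header values come from the MCP streamable-HTTP transport convention, and your MCP client normally handles all of this for you. Some servers require an `initialize` handshake before `tools/list`, so even an error response here confirms the endpoint is reachable.

```shell
# Smoke-test the hosted MCP endpoint (sketch; your MCP client normally does this).
MCP_URL="https://mcp.withwandb.com/mcp"
REQUEST='{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
curl -s "$MCP_URL" \
  -H "Authorization: Bearer ${WANDB_API_KEY:-unset}" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d "$REQUEST" || echo "connection failed; check your network and API key"
```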
Local install (escape hatch)
Run the MCP server on your own machine over STDIO or HTTP. Use a local install when you need air-gapped operation, pinning to a specific release, custom server behavior, active server development, or support for a client that only speaks STDIO. See Run the MCP server locally.
What you can do
Use the MCP server to analyze experiments, debug traces, create reports, manage the registry and artifacts, and answer questions from the W&B docs. The following are representative prompts:

- “Show me the top 5 runs by `eval/accuracy` in `your-team/your-project`.”
- “How did the latency of my hiring agent’s `predict` traces evolve over the last month?”
- “Generate a W&B report comparing decisions made by the hiring agent last week.”
- “What versions of the `production-model` artifact exist, and what changed between `v2` and `v3`?”
- “How do I create a leaderboard in Weave?”
Available tools
The server registers a set of tools grouped by purpose. Each row lists the tool name, when the agent should pick it, and a concrete prompt a user can paste. Availability of `search_wandb_docs_tool` depends on `WANDB_MCP_PROXY_DOCS` (enabled by default).
The tools fall into six groups:

- Discovery
- Experiments and runs
- Weave traces
- Reports
- Artifacts and registry
- Docs

The Discovery tools are listed below. Use these tools first when you don’t know the exact entity, project, or schema to query.
| Tool | Use when | Example prompt |
|---|---|---|
| `list_entities_tool` | You have not specified which entity to use, or need to see what teams and accounts the API key can reach. | “What W&B teams do I have access to?” |
| `query_wandb_entity_projects` | You know the entity but not the project name, or earlier queries failed with project not found. | “List all projects under your-team.” |
| `probe_project_tool` | Starting work on an unfamiliar run-based project and you need to discover available metrics, config keys, and tags. | “Probe your-team/your-project and tell me what metrics are logged.” |
| `infer_trace_schema_tool` | Starting work on an unfamiliar Weave traces project and you need field names, types, and sample values before querying. | “What fields are on the Weave traces in your-team/your-project?” |
Schema-first trace queries
For Weave trace queries, call `infer_trace_schema_tool` first to discover available fields, then call `query_weave_traces_tool` with a precise column list and `detail_level`:
| `detail_level` | When to use |
|---|---|
| `schema` | Structural fields only. Fastest. Use for browsing and counting. |
| `summary` | Truncated inputs and outputs. The default. |
| `full` | Everything untruncated. Use to drill into a small number of specific traces. |
Request `full` only for the traces that matter.
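Put together, a schema-first query ends in a narrow call. The argument shape below is illustrative: `detail_level` comes from the table above, while the other argument names (`entity`, `project`, `columns`) are assumptions about the tool’s input schema; `infer_trace_schema_tool` and the tool description give you the exact fields.

```json
{
  "tool": "query_weave_traces_tool",
  "arguments": {
    "entity": "your-team",
    "project": "your-project",
    "columns": ["op_name", "started_at", "summary"],
    "detail_level": "summary"
  }
}
```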
Recommended workflows
Most real questions need more than one tool. Ask your agent to follow one of these chains.

Explore an unfamiliar project
Use when you land on a new entity or project and don’t know what is logged.

1. `list_entities_tool` to find the entity.
2. `query_wandb_entity_projects` to find the project.
3. `probe_project_tool` for run-based projects, or `infer_trace_schema_tool` for Weave trace projects.
4. A targeted `query_wandb_tool` or `query_weave_traces_tool` call using the discovered keys.
Triage failing LLM calls
Use when the agent needs to find bad traces and understand the sessions that produced them.

1. `query_weave_traces_tool` with a filter on error or exception fields, and `detail_level="summary"`.
2. `resolve_trace_roots_tool` on the resulting `trace_id` list to map each failure to its root session.
3. `query_weave_traces_tool` with `detail_level="full"` on a small number of specific roots to drill in.
4. `create_wandb_report_tool` to document the findings.
Diagnose a bad training run
Use when a run looks wrong and the user wants a health check.

1. `get_run_history_tool` to pull the loss and validation curves.
2. `diagnose_run_tool` for automated convergence, overfitting, and NaN checks.
3. `compare_runs_tool` against a known-good baseline run.
4. `create_wandb_report_tool` with line-plot panels to share the diagnosis.
Summarize evals and compare model versions
Use when the user asks which model version performed best on an evaluation.

1. `summarize_evaluation_tool` for per-scorer pass rates and error counts.
2. `list_artifact_versions_tool` on the relevant model collection.
3. `compare_artifact_versions_tool` between the candidate and current production version.
4. `log_analysis_to_wandb` and `create_wandb_report_tool` to publish the comparison.
Prerequisites
Before you configure any client:

- A W&B API key. Create one at wandb.ai/authorize.
- Set the key as the `WANDB_API_KEY` environment variable, or pass it to your client as a bearer token.
- For Dedicated Cloud, Self-Managed, and local installs against a non-default instance, set the `WANDB_BASE_URL` environment variable to your instance URL.
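In a shell, the prerequisites boil down to two environment variables. The values below are placeholders; the instance URL is a hypothetical example.

```shell
# W&B API key from wandb.ai/authorize (placeholder value).
export WANDB_API_KEY="your-api-key"
# Only for Dedicated Cloud, Self-Managed, or other non-default instances:
export WANDB_BASE_URL="https://wandb.example.com"
```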
Use the hosted server
W&B runs a managed MCP server for every deployment type. You don’t need to install anything. Configure your client to connect over HTTP with a W&B API key in the `Authorization` header.
Connection URL
The URL depends on where your W&B deployment lives.

| Deployment | Server URL |
|---|---|
| Multi-tenant Cloud | https://mcp.withwandb.com/mcp |
| Dedicated Cloud | https://<your-instance>/mcp |
| Self-Managed | https://<your-instance>/mcp |
For Dedicated Cloud and Self-Managed, replace `https://mcp.withwandb.com/mcp` with `https://<your-instance>/mcp` and keep everything else the same.
- Cursor
- VS Code
- Claude Code
- Codex
- Gemini CLI
- Mistral LeChat
- OpenAI Responses API
Install the server in Cursor automatically with the one-click installation link, then replace the placeholder with your W&B API key in the `Authorization` field.

To configure Cursor manually:

1. On macOS, open Cursor > Settings > Cursor Settings. On Windows or Linux, open Preferences > Settings > Cursor Settings.
2. Select Tools and MCP.
3. In Installed MCP Servers, select Add Custom MCP. Cursor opens the `mcp.json` configuration file.
4. Add the server entry to the `mcpServers` object.
5. Restart Cursor.
6. Verify the connection by asking `List my W&B entities`. The agent should call `list_entities_tool` and return your username and any teams.
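A minimal hosted-server entry for the `mcpServers` object might look like the following. This is a sketch: the `url` and `headers` key names follow Cursor’s convention for remote MCP servers, so verify them against current Cursor documentation. For Dedicated Cloud or Self-Managed, swap the URL for `https://<your-instance>/mcp`.

```json
{
  "mcpServers": {
    "wandb": {
      "url": "https://mcp.withwandb.com/mcp",
      "headers": {
        "Authorization": "Bearer <your-wandb-api-key>"
      }
    }
  }
}
```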
Run the MCP server locally
A local MCP server is an escape hatch, not the default path for any particular deployment. Use it when the hosted server doesn’t fit your setup. Common reasons to run locally:

- Air-gapped or offline environments where your client can’t reach a hosted W&B endpoint.
- Pinned version. The hosted server follows the main branch. A local install can pin to a specific release tag.
- Custom server behavior such as changing tool descriptions, adding tools, or setting a non-default response token budget.
- Active development on the server itself.
- STDIO-only clients or clients that require a local process.
For Dedicated Cloud and Self-Managed instances, set `WANDB_BASE_URL` to your instance URL.
Local prerequisites
To run the server locally, make sure you have the following:

- Python 3.11 or higher.
- `uv` is recommended. For installation instructions, see the uv documentation. You can use `pip` instead.
- A W&B API key, set as `WANDB_API_KEY`.
- `WANDB_BASE_URL` set to your instance URL if you use Dedicated Cloud or Self-Managed.
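A quick sanity check for the first two prerequisites (assumes `python3` is on your PATH):

```shell
# Confirm the interpreter meets the 3.11+ requirement.
python3 --version
# Confirm uv is installed; print a hint if it is not.
uv --version || echo "uv not found; see the uv documentation, or use pip instead"
```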
Install the server
Choose an install method:

- uvx (no install)
- uv
- pip
- Install from GitHub
Configure your client
Select your MCP client:

- Cursor
- VS Code
- Claude Code
- Codex
- Claude Desktop
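For these clients, a STDIO entry for a local install could look like the sketch below. The `uvx` invocation is the from-GitHub install method mentioned elsewhere on this page; the `command`/`args`/`env` key names follow the common desktop-client convention, and exact configuration file locations vary by client.

```json
{
  "mcpServers": {
    "wandb": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/wandb/wandb-mcp-server", "wandb_mcp_server"],
      "env": {
        "WANDB_API_KEY": "<your-wandb-api-key>",
        "WANDB_BASE_URL": "https://wandb.example.com"
      }
    }
  }
}
```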
Add the server entry to your client’s `mcp.json` configuration. Omit `WANDB_BASE_URL` to use the default W&B API endpoint.

Run the server with HTTP transport

For web-based clients and for testing, run the server with HTTP transport. See the wandb/wandb-mcp-server command-line reference for the exact invocation.
The following environment variables control authentication, instance routing, and server behavior for local installs. Set them in your client’s `env` block or export them in your shell.
| Variable | Description |
|---|---|
| `WANDB_API_KEY` | W&B API key for authentication. Required. |
| `WANDB_BASE_URL` | Custom W&B instance URL for Dedicated Cloud or Self-Managed. Defaults to `https://api.wandb.ai`. |
| `WANDB_MCP_PROXY_DOCS` | Enable the `search_wandb_docs_tool` documentation search proxy. Default: `true`. |
| `WANDBOT_BASE_URL` | Custom endpoint for the docs search proxy. |
| `MAX_RESPONSE_TOKENS` | Token budget for tool-response truncation. Default: `30000`. |
| `MCP_SERVER_LOG_LEVEL` | Logging verbosity. One of `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
Usage tips
The following practices help you get better results from the MCP server. Start with the general tips, then read the subsection that matches your workload for more specific advice.

For everyone
Follow these practices regardless of your workload:

- Specify the entity and project. MCP tools need an explicit entity and project name. Include both in every question, for example “in `your-team/your-project`”.
- Ask focused questions. Prefer “Which eval had the highest F1 score?” to “What is my best evaluation?”. Specific metrics and time ranges produce better tool calls.
- Verify full retrieval. For broad questions such as “What are my best performing runs?”, ask the agent to confirm it retrieved all available runs rather than only the most recent ones.
- Combine with W&B Skills. W&B Skills teach coding agents how to structure W&B workflows. Skills provide patterns and MCP provides data access, and the two work well together.
For trace-heavy workflows
Follow these practices when working with Weave traces:

- Start with the schema. Call `infer_trace_schema_tool` before `query_weave_traces_tool` so the agent knows which fields and filter values are valid.
- Pick the right `detail_level`. Use `schema` to browse, `summary` (the default) for analysis, and `full` only when drilling into a small number of specific traces.
- Chain `resolve_trace_roots_tool`. After a child-trace query, pass the resulting `trace_id` list to `resolve_trace_roots_tool` to map each trace to its root session in one batched call.
- Prefer `summarize_evaluation_tool` for evals. It aggregates the `Evaluation.evaluate` and `predict_and_score` hierarchy automatically. Only fall back to `query_weave_traces_tool` for raw trace data.
For run-heavy workflows
Follow these practices when working with W&B Models runs:

- Probe before you query. Call `probe_project_tool` on an unfamiliar run-based project to discover metric keys, config keys, and tags before constructing GraphQL.
- Use `get_run_history_tool` for time series. GraphQL doesn’t sample, so for loss curves and other time-series data `get_run_history_tool` is both faster and cheaper.
- Let `compare_runs_tool` do the diff. It returns config and metric deltas with aligned history in a single call, avoiding manual comparison.
- Run a health check first. When a training run looks wrong, call `diagnose_run_tool` before digging into history manually.
For Dedicated Cloud and Self-Managed
Keep these considerations in mind for non-multi-tenant deployments:

- Prefer the hosted server on your instance at `https://<your-instance>/mcp`. It exposes the same tools as the Multi-tenant server with no client-side `WANDB_BASE_URL` needed. Only fall back to a local install if the hosted server isn’t yet enabled.
- When you do run locally against your instance, set `WANDB_BASE_URL` to your instance URL in the client’s `env` block. Without it, the server targets `api.wandb.ai` and returns no data.
- Rate limits on Dedicated Cloud are separate from Multi-tenant. See Dedicated Cloud rate limits for defaults and how to request changes.
For local installs
Keep these considerations in mind when running the server on your own machine:

- Prefer STDIO transport for desktop clients (Cursor, VS Code, Claude Code, Claude Desktop). Only switch to HTTP transport when a client explicitly requires it (for example, the OpenAI Responses API).
- When tool calls fail silently, set `MCP_SERVER_LOG_LEVEL=DEBUG` in the client’s `env` block and recheck the client’s MCP logs.
- If you install from GitHub (`uvx --from git+https://github.com/wandb/wandb-mcp-server wandb_mcp_server`), `uvx` pins to the default branch. Pin an explicit tag by appending `@v0.3.2` to the Git URL when you need a stable version.
Troubleshooting
| Symptom | Cause and fix |
|---|---|
| `401 Unauthorized` or `Invalid API key` | Your W&B API key is missing, malformed, or not authorized for the target entity. Regenerate a key at wandb.ai/authorize and confirm it is passed as a bearer token or set in `WANDB_API_KEY`. |
| Empty results for queries you expect to succeed | The entity or project name is incorrect, or the API key does not have access. Confirm both with the agent and retry. |
| `404 Not Found` or connection refused on `https://<your-instance>/mcp` | The hosted MCP server is not yet enabled on your Dedicated Cloud or Self-Managed instance, or the client is pointed at the wrong URL. Contact W&B support to request enablement, then confirm the URL in Connection URL. |
| `429 Too Many Requests` on Dedicated Cloud | You have hit your instance’s rate limits. See Dedicated Cloud rate limits for how to request higher limits. |
| Local server cannot find `uvx` in Claude Desktop | Use the full path to `uvx` in the `command` field of `claude_desktop_config.json`. |
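For the last row in the table, you can print the absolute path to `uvx` like this. The fallback path in the message is only a typical install location, not a guarantee.

```shell
# Print the absolute path to uvx for claude_desktop_config.json's "command" field.
command -v uvx || echo "uvx not on PATH; it often lives at ~/.local/bin/uvx"
```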
Related
Explore these resources for observability, reusable agent patterns, and advanced server configuration:

- Trace MCP clients and servers with Weave for end-to-end observability of MCP interactions.
- W&B Skills for reusable agent instructions that pair with the MCP server.
- wandb/wandb-mcp-server for the open-source server, command-line reference, and release notes.
- Dedicated Cloud and Self-Managed hosting options.