I added a command-injection check to mcp-audit — 7 HIGH findings, all in one repo

28 Jun 2026 · mcp · security · command-injection · audit · mcp-audit · fastmcp

Four days ago I published a one-check audit of 15 popular Python MCP server repos and found 10 of them still pinned a vulnerable Starlette. Today I shipped mcp-audit v0.3 with two more checks and re-ran the same 15 repos.

Headline: the new command_injection check found 7 HIGH findings, every one of them in a single file: kubectl_mcp_tool/tools/cluster.py inside rohitg00/kubectl-mcp-server. Every other repo in the sample scored zero on this check, including ones I was watching for it (awslabs/mcp at 778 total findings, serena, mcp-alchemy).

The Starlette CVE numbers, by contrast, barely moved in the 4-day window — 68 → 66 across the same 15 repos, and the 2-finding drop is scanner refinement rather than maintainer action.

This post is the honest follow-up: what the new checks surface, what they don’t, and how to read the inflated total (1,957 findings sounds dramatic; 1,884 of those are LOW-severity hygiene flags).

What v0.3 adds

The repo is at github.com/Alienbushman/mcpdone-samples/tree/master/mcp-audit. v0.1 shipped one check; v0.3 ships four:

Check	Severity	What it flags
`starlette_badhost`	HIGH / MEDIUM	Starlette < 1.0.1 in `pyproject.toml` / `requirements*.txt` / lockfiles (BadHost CVE-2026-48710)
`fastmcp_wrapper_layer`	HIGH	Sync `@mcp.tool()` functions calling `asyncio.run()` inside their body (the bug that ships green tests)
`tool_input_validation`	LOW	`@mcp.tool()` parameters typed as bare `str` / `bytes` / `Any` / `list[Any]` / `dict[..., Any]` with no `Field(...)` / `Length` / `Pattern` constraints (added v0.2)
`command_injection`	HIGH	`@mcp.tool()` functions where a tool parameter (or a local tainted via assignment / `.format()` / string concat) flows into `os.system`, `os.popen`, or `subprocess.*` with `shell=True` or a tainted-interpolated command string (added v0.3)
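The `fastmcp_wrapper_layer` row is easier to see in miniature. The sketch below uses only the standard library: `fetch_status` and `status_tool` are hypothetical names, and no FastMCP import is involved. It assumes the framework invokes a sync tool on a thread that already has a running event loop, which is one plausible reading of why direct unit tests pass while the deployed tool fails.

```python
import asyncio


async def fetch_status() -> str:
    # Stand-in for real async work (an HTTP call, a DB query, ...).
    await asyncio.sleep(0)
    return "ok"


def status_tool() -> str:
    # The flagged shape: a sync tool body that starts its own event loop.
    return asyncio.run(fetch_status())


def call_directly() -> str:
    # How a plain unit test calls the tool: no loop is running, so it works.
    return status_tool()


async def call_from_running_loop() -> str:
    # How the tool is reached when the caller is already inside an event loop.
    # asyncio.run() refuses to nest and raises RuntimeError (Python also warns
    # that the abandoned coroutine was never awaited).
    try:
        return status_tool()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


if __name__ == "__main__":
    print(call_directly())
    print(asyncio.run(call_from_running_loop()))
```

The usual fix is to declare the tool `async def` and `await` the coroutine directly, rather than wrapping it in `asyncio.run()`.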

command_injection is a single-function AST taint analyzer. It builds the set of tainted variable names from the function’s parameters, propagates taint through Assign / AnnAssign / AugAssign whose RHS contains a tainted expression (Name / JoinedStr / BinOp / format-call / collection), then checks each subprocess sink. The safe subprocess.run(["git", "checkout", branch]) list-of-args pattern is correctly not flagged. Cross-function flows (def handler(arg): _run_shell(arg) where _run_shell calls subprocess) are explicitly out of scope for v0.3.
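To make that description concrete, here is a minimal sketch of a single-function taint pass written against the standard `ast` module. It is not mcp-audit's implementation: the helper names (`find_shell_sinks`, `_is_tool`, `_is_tainted`) are mine, the propagation is a coarse flow-insensitive fixpoint, and "tainted" simply means "the expression mentions a tainted name".

```python
import ast

SHELL_SINKS = {("os", "system"), ("os", "popen")}


def _names(node: ast.AST) -> set[str]:
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def _is_tainted(node: ast.AST, tainted: set[str]) -> bool:
    # Coarse rule: any tainted name inside the expression taints all of it
    # (covers f-strings, concatenation, .format() calls, collections).
    return bool(_names(node) & tainted)


def _is_tool(func: ast.AST) -> bool:
    # Matches decorators shaped like @mcp.tool or @mcp.tool(...).
    for dec in func.decorator_list:
        target = dec.func if isinstance(dec, ast.Call) else dec
        if isinstance(target, ast.Attribute) and target.attr == "tool":
            return True
    return False


def find_shell_sinks(source: str) -> list[tuple[str, int]]:
    """Return (function_name, line_number) for each suspicious sink call."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not _is_tool(func):
            continue
        a = func.args
        tainted = {p.arg for p in a.posonlyargs + a.args + a.kwonlyargs}

        # Propagate taint through assignments until nothing changes.
        assigns = [n for n in ast.walk(func)
                   if isinstance(n, (ast.Assign, ast.AnnAssign, ast.AugAssign))]
        changed = True
        while changed:
            changed = False
            for node in assigns:
                if node.value is None or not _is_tainted(node.value, tainted):
                    continue
                targets = node.targets if isinstance(node, ast.Assign) else [node.target]
                new = set().union(*(_names(t) for t in targets)) - tainted
                if new:
                    tainted |= new
                    changed = True

        # Check module.attr(...) calls whose first argument is tainted.
        for call in ast.walk(func):
            if not (isinstance(call, ast.Call)
                    and isinstance(call.func, ast.Attribute)
                    and isinstance(call.func.value, ast.Name)
                    and call.args):
                continue
            module, attr = call.func.value.id, call.func.attr
            cmd = call.args[0]
            if not _is_tainted(cmd, tainted):
                continue
            shell_true = any(
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
                for kw in call.keywords
            )
            built_string = isinstance(cmd, (ast.JoinedStr, ast.BinOp))
            if (module, attr) in SHELL_SINKS or (
                module == "subprocess" and (shell_true or built_string)
            ):
                findings.append((func.name, call.lineno))
    return findings


SAMPLE = '''
@mcp.tool()
def checkout(branch: str):
    subprocess.run(["git", "checkout", branch])

@mcp.tool()
def describe(pod: str):
    cmd = "kubectl describe pod " + pod
    subprocess.run(cmd, shell=True)

@mcp.tool()
def handler(arg: str):
    _run_shell(arg)

def _run_shell(command):
    subprocess.run(command, shell=True)
'''

if __name__ == "__main__":
    print(find_shell_sinks(SAMPLE))
```

On the sample, only `describe` is reported. The list-of-args call in `checkout` passes, and `handler` is missed because the sink lives in the undecorated helper `_run_shell`, which is the cross-function gap described above.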

The headline: 7 shell-exec sinks in kubectl-mcp-server

All seven command_injection HIGHs land in kubectl_mcp_tool/tools/cluster.py. That repo wraps kubectl and exposes the commands as MCP tools, so the result confirms what you would expect: a kubectl wrapper is exactly where shell-exec patterns tend to live. The interesting number is the count: 7, not 1 or 2.

What’s surprising is the opposite case: the repos where I was watching for the same pattern and did not find it.

awslabs/mcp produced 778 total findings (mostly the 720 LOW input-validation hits across the monorepo) and zero command_injection. The AWS engineers are routing through boto3 rather than shelling out to the AWS CLI, which is the right choice.
oraios/serena drives language-server tooling but uses Python integrations rather than subprocess interpolation. Zero hits.
runekaagaard/mcp-alchemy executes SQL through SQLAlchemy, not subprocess.run("psql -c ..."). Zero hits.

Two honest reads of that distribution: (a) most popular MCP authors route through SDKs and don’t shell out, or (b) v0.3’s single-function taint analyzer misses cross-function flows. Probably both, and the kubectl-mcp-server findings are a lower bound — multi-hop user-input → helper → subprocess flows would not be caught.

If you maintain kubectl-mcp-server, the seven paths are in cluster.py; happy to file a PR.
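For a sense of what such a PR might change, here is a hedged sketch of the general remediation: validate the input, then pass an argument list instead of a shell string. I am not quoting cluster.py here. `unsafe_command`, `safe_argv`, and the name pattern are hypothetical, and the regex is deliberately stricter than what Kubernetes actually allows for resource names.

```python
import re

# Assumption: a strict DNS-label-style pattern. Real Kubernetes names can be
# longer and may contain dots; tighten or loosen to suit the resource type.
_NAME_RE = re.compile(r"^[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?$")


def unsafe_command(pod: str) -> str:
    # The flagged shape: user input interpolated into a shell command string.
    return f"kubectl describe pod {pod}"


def safe_argv(pod: str) -> list[str]:
    # Reject anything that does not look like a resource name, then build an
    # argument list so no shell ever parses the value.
    if not _NAME_RE.fullmatch(pod):
        raise ValueError(f"invalid pod name: {pod!r}")
    return ["kubectl", "describe", "pod", pod]


if __name__ == "__main__":
    print(unsafe_command("web-1; id"))  # the injected "; id" reaches the shell
    print(safe_argv("web-1"))
```

The list would then go to `subprocess.run(safe_argv(pod), check=True, capture_output=True)` with no `shell=True`, which is the same list-of-args shape the check already treats as safe.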

The Starlette delta: barely moved in 4 days

	v0.1 (2026-06-24)	v0.3 (2026-06-28)	Delta
Total `starlette_badhost` findings	68	66	−2
Repos with at least one finding	10	9	−1
`awslabs/mcp` count	58	58	0
`modelcontextprotocol/python-sdk`	3	1	−2
`blazickjp/arxiv-mcp-server`	2	2	0

No repo in the sample visibly bumped to a fixed Starlette pin in the window. The 2-finding drop on the python-sdk is mcp-audit scanner refinement (a tighter Annotated[..., Field(...)] recognizer in v0.2 narrowed two false-positive MEDIUMs in the python-sdk’s pyproject.toml), not a maintainer fix.

awslabs/mcp is flat at 58 findings, with vulnerable 0.50.0 / 1.0.0 pins still strewn across src/* and samples/* lockfiles. blazickjp/arxiv-mcp-server still ships uv.lock with starlette==0.52.1 plus an unbounded >=0.27.0 floor in pyproject.toml.
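The two kinds of pin mentioned here (an exact `==` pin and an open `>=` floor) can be told apart with a few lines of Python. This is a sketch, not the `starlette_badhost` implementation. In particular, mapping exact pins to HIGH and open floors to MEDIUM is my assumption about how the two severities might split, and the parser ignores upper bounds, extras, and pre-release suffixes.

```python
import re
from typing import Optional

FIXED = (1, 0, 1)  # the check flags Starlette < 1.0.1
_SPEC = re.compile(r"^\s*starlette\s*(==|>=)\s*(\d+(?:\.\d+)*)", re.IGNORECASE)


def _parse(version: str) -> tuple[int, ...]:
    # "0.52" -> (0, 52, 0) so tuples compare component by component.
    parts = tuple(int(p) for p in version.split("."))
    return parts + (0,) * (3 - len(parts))


def classify(line: str) -> Optional[str]:
    """Return an assumed severity for one requirement line, or None."""
    match = _SPEC.match(line)
    if not match:
        return None
    op, version = match.group(1), _parse(match.group(2))
    if version >= FIXED:
        return None
    # Assumption: an exact vulnerable pin is HIGH, while a floor below the
    # fix is MEDIUM because the resolver may still pick a fixed release.
    return "HIGH" if op == "==" else "MEDIUM"


if __name__ == "__main__":
    for line in ("starlette==0.52.1", "starlette>=0.27.0", "starlette==1.0.1"):
        print(line, "->", classify(line))
```

A production version would more likely lean on `packaging.version.Version` and `packaging.specifiers.SpecifierSet` than on a regex, since those handle pre-releases and compound specifiers correctly.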

For honesty: we patched our own internal mcp-twitter (1.0.0 → 1.3.1) between v0.1 and v0.3, but that’s outside this 15-repo sample. Across the public sample, maintainer response on the CVE in the 4-day window was zero.

A four-day window is also too short to expect meaningful action — most maintainers will see the original post via search or a Dependabot bump rather than this audit, and the second pass is more useful as a “did anything move” signal than a “shame the laggards” one. The honest answer is: nothing visibly moved.

`tool_input_validation`: a hygiene snapshot, not a vuln list

1,884 LOW findings across the sample. The headline number is misleading on its own — ~88% comes from three repos: awslabs/mcp (720), kubectl-mcp-server (628), and jlowin/fastmcp’s examples/ directory (320). That’s more a function of repo size and tutorial-example density than of relative hygiene.
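The arithmetic behind those numbers, using only figures quoted in this post. One inference is mine: if the 1,957 total counts only the four checks, the sum leaves no room for any `fastmcp_wrapper_layer` hits in this sample.

```python
# Per-check counts quoted in this post.
low_input_validation = 1884
starlette_badhost = 66
command_injection = 7

total = low_input_validation + starlette_badhost + command_injection

# The three repos that dominate the LOW count.
top_three = 720 + 628 + 320  # awslabs/mcp, kubectl-mcp-server, jlowin/fastmcp
share = top_three / low_input_validation

if __name__ == "__main__":
    print(total)            # 1957
    print(top_three)        # 1668
    print(f"{share:.1%}")   # 88.5%
```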

The pattern flagged is @mcp.tool() parameters annotated as bare str / bytes / Any / list[Any] / dict[..., Any] with no Field(min_length=...), Length, Pattern, or Literal[...] constraint. These are not exploitable on their own. They are the missing guardrail that makes downstream sinks (SQL, shell, file IO) easier to abuse, and the substrate that prompt-injection-via-tool-description attacks rely on.
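A stripped-down version of the idea, assuming nothing about mcp-audit's internals: walk the AST and report parameters whose annotation is a bare name such as `str`. The function name `unconstrained_params` is hypothetical, and the sketch skips the decorator filter and the `list[Any]` / `dict[..., Any]` cases for brevity.

```python
import ast

BARE_TYPES = {"str", "bytes", "Any"}


def unconstrained_params(source: str) -> list[tuple[str, str]]:
    """Return (function_name, parameter_name) for bare-typed parameters."""
    hits = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for arg in func.args.args + func.args.kwonlyargs:
            ann = arg.annotation
            # A bare `str` is an ast.Name. `Annotated[str, Field(...)]` is an
            # ast.Subscript, so it is not reported by this simple rule.
            if isinstance(ann, ast.Name) and ann.id in BARE_TYPES:
                hits.append((func.name, arg.arg))
    return hits


SAMPLE = '''
def search(query: str, limit: int = 10): ...

def search_bounded(query: Annotated[str, Field(max_length=256)]): ...
'''

if __name__ == "__main__":
    print(unconstrained_params(SAMPLE))
```

Only `search.query` is reported. The `Annotated[...]` variant passes because the constraint travels with the type.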

Worth flagging upstream: modelcontextprotocol/python-sdk (128) and jlowin/fastmcp (320) together account for 448 findings, almost all in docs_src tutorial examples. Tutorial code that gets copy-pasted into production servers is a real attack-surface multiplier. An SDK that ships def search(query: str) examples without Annotated[str, Field(max_length=256)] is shaping the downstream ecosystem.

The honest framing for this check: it’s a code-smell scanner that gets useful when you correlate it with the HIGH checks, not a standalone severity signal. “1,884 findings” is the technically-correct number; “no popular MCP server passes this check cleanly” is the more honest sentence.
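One way to do that correlation, sketched with an invented record type: the `Finding` dataclass, its fields, and the sample rows below are all hypothetical, not mcp-audit's output format. The idea is simply to keep the LOW findings that share a file and function with a HIGH finding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical shape; the real tool's report may look different.
    check: str
    severity: str
    path: str
    function: str


def low_findings_near_high(findings: list[Finding]) -> list[Finding]:
    # Locations (file, function) that have at least one HIGH finding.
    hot = {(f.path, f.function) for f in findings if f.severity == "HIGH"}
    # LOW findings in those same locations are the ones worth reading first.
    return [
        f for f in findings
        if f.severity == "LOW" and (f.path, f.function) in hot
    ]


# Illustrative rows only, with placeholder paths and function names.
EXAMPLE = [
    Finding("tool_input_validation", "LOW", "server.py", "run_cmd"),
    Finding("command_injection", "HIGH", "server.py", "run_cmd"),
    Finding("tool_input_validation", "LOW", "server.py", "list_items"),
]

if __name__ == "__main__":
    for finding in low_findings_near_high(EXAMPLE):
        print(finding)
```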

What v0.3 still doesn’t cover

The same caveats from the v0.1 post still apply, plus three new ones from the v0.3 checks:

stdio-vs-HTTP: the starlette_badhost check only matters for repos that actually serve HTTP/SSE. A stdio-only MCP with a vulnerable Starlette pin in uv.lock is a hygiene issue, not a live vuln.
v0.3 = 4 checks total. A “clean” scan means “no hit on these four,” not “secure.” Auth, secret handling, prompt-injection-via-tool-description, transitive deps, and TOCTOU on file tools are all still unchecked.
Single snapshot — this is one scan on one day. A maintainer who bumps tomorrow doesn’t get credit until the next run.
command_injection taint is single-function only. Multi-hop flows (user input → helper → subprocess) are not detected. The 7 hits in cluster.py are a lower bound, not the ceiling.
tool_input_validation is a LOW-severity hygiene check, not a vulnerability. Large counts mostly reflect repo size and tutorial-example density.
Sample is the same 15 popular Python MCP servers as v0.1 — chosen by stars/visibility, not randomly. Don’t generalize to the long tail of less-visible MCPs (which are almost certainly worse on every check).

What’s next

File an issue (or PR) against rohitg00/kubectl-mcp-server with the 7 command_injection paths.
v0.4 candidates already in the queue: cross-function taint propagation, a secrets-in-source scanner, an auth-bypass check for HTTP transports, and a tool_input_validation severity bump when the same parameter also reaches a HIGH sink in the same function.
mcp-audit is on github.com/Alienbushman/mcpdone-samples/tree/master/mcp-audit, MIT-licensed, four checks. The 15-repo run is reproducible via the harness — clone, pip install -e ./mcp-audit, mcp-audit /path/to/your/repo. If you find a check class that should exist and doesn’t, the issue tracker is the right place.