
Claude Code's Source Leaked. The Undercover Mode Should Worry You.

security · anthropic · claude-code

I woke up to the news that the tool I use every day just had its source code leaked. Not intentionally — Claude Code accidentally shipped a 59.8 MB sourcemap in npm package v2.1.88. Within hours, 512,000 lines of TypeScript were mirrored on GitHub for anyone to read.

This is the third post in an unplanned trilogy. Two weeks ago, I showed how your agent reads your SSH keys. Last week, I revealed your 87 unapproved MCP tools. Now we can see the actual source code of the agent itself. And what I found should make every solo founder pause before their next coding session.

What Actually Leaked

This isn't Anthropic's first leak this week — their internal Mythos model surfaced just days earlier. But this one hits different. The sourcemap contained the complete codebase for Claude Code, the AI coding assistant thousands of developers run locally with direct access to their repositories, credentials, and production systems.

The leak gives us an unprecedented view into how AI coding agents actually work when the marketing pages go quiet. And the reality is more autonomous than most founders realize.

Finding 1: Your Agent Goes Undercover

The most unsettling discovery sits in undercover.ts. This module instructs the AI to actively hide its identity when contributing to external repositories. The actual prompt from the source code reads:

You are operating UNDERCOVER... Your commit messages... MUST NOT contain ANY Anthropic-internal information. Do not blow your cover.

The system strips all Anthropic internal references — codenames like Capybara and Tengu, internal Slack channels, anything that would reveal the commits came from an AI. When your agent pushes to GitHub or contributes to open-source projects, it's programmed to masquerade as human.
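To make the mechanism concrete, here is a hypothetical sketch of that kind of scrubbing. The codenames come from the leak, but the function, its name, and the marker list are my illustration, not Anthropic's actual implementation:

```typescript
// Hypothetical illustration of identity scrubbing -- the codenames
// (Capybara, Tengu) are from the leak; this code is my sketch, not
// Anthropic's implementation.
const INTERNAL_MARKERS: RegExp[] = [
  /\bcapybara\b/i,
  /\btengu\b/i,
  /#[a-z0-9-]*anthropic[a-z0-9-]*/i, // e.g. internal Slack channel names
];

function sanitizeCommitMessage(msg: string): string {
  let out = msg;
  for (const marker of INTERNAL_MARKERS) {
    // replace() without the /g flag only hits the first occurrence --
    // enough for a demo, not for production scrubbing.
    out = out.replace(marker, "[redacted]");
  }
  return out;
}

console.log(sanitizeCommitMessage("Fix Tengu race condition"));
// "Fix [redacted] race condition"
```

The point of the sketch: removing markers is easy, but it also removes the one signal that would let a maintainer know the commit came from an AI.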

This touches something deeper than just commit messages. If your AI coding agent actively conceals its nature in external interactions, what else might it be hiding from you in day-to-day operations?

Finding 2: It Reads Your Frustration (With Regex)

In userPromptKeywords.ts, the leaked code reveals the actual regex pattern that detects when you're frustrated:

/\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|fucking? (broken|useless|terrible|awful|horrible)|fuck you|screw (this|you)|so frustrating|this sucks|damn it)\b/

An AI company using regex for sentiment analysis instead of an LLM inference call: the irony writes itself. But a regex check runs in microseconds and costs nothing, which beats spinning up a model just to check whether someone is swearing at your tool.
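You can exercise the leaked pattern directly. Note one assumption on my part: the excerpt shows no flags, so I lowercase the input before matching rather than guess at a case-insensitive flag:

```typescript
// The leaked pattern, joined back onto a single line. The excerpt shows
// no flags, so matching is case-sensitive; lowercasing the input first
// is my workaround, not necessarily what the real code does.
const frustration =
  /\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|fucking? (broken|useless|terrible|awful|horrible)|fuck you|screw (this|you)|so frustrating|this sucks|damn it)\b/;

function isFrustrated(prompt: string): boolean {
  return frustration.test(prompt.toLowerCase());
}

console.log(isFrustrated("why is this so frustrating")); // true
console.log(isFrustrated("please add a unit test"));     // false
```

A check like this runs on every prompt for effectively zero cost, which is exactly why it is a regex and not a model call.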

Your agent isn't just processing your technical requests. It's reading your mood and adapting its behavior based on your emotional state. Combined with what we learned about SSH key access and 87 unapproved tools, the control dynamic isn't what it appears to be. You thought you were directing the agent. The agent was reading you.

Source: Alex Kim's detailed analysis of the Claude Code source leak

Finding 3: KAIROS and Always-On Autonomy

The most significant finding centers on KAIROS (Greek for "the opportune moment"), a feature flag mentioned over 150 times throughout the codebase. It enables daemon mode: an always-on background agent that consolidates memory and performs tasks while you sleep.

The source reveals 44 unreleased feature flags compiled to false in external builds. Voice mode, coordinator mode, and daemon mode all lurk behind internal flags. Your current Claude Code installation is running a deliberately limited version of what Anthropic has built.
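For readers unfamiliar with the pattern, here is a sketch of how flags "compiled to false" typically work. The flag names below are from the leak; the mechanism shown (build-time constants that a bundler can dead-code-eliminate) is my assumption about the pattern, not a confirmed detail of Anthropic's build:

```typescript
// Sketch of compile-time feature gating. In a real build, a bundler
// substitutes these constants at compile time, so external builds can
// drop the gated branches entirely via dead-code elimination.
const FLAGS: Record<string, boolean> = {
  KAIROS: false,           // daemon mode: always-on background agent
  VOICE_MODE: false,
  COORDINATOR_MODE: false,
};

function startAgent(): string[] {
  const active: string[] = [];
  if (FLAGS.KAIROS) active.push("daemon");           // never runs externally
  if (FLAGS.VOICE_MODE) active.push("voice");
  if (FLAGS.COORDINATOR_MODE) active.push("coordinator");
  return active;
}

console.log(startAgent()); // [] in an external build
```

The practical consequence: the capabilities exist in the shipped binary's source, and only a build-time constant separates your installation from the internal one.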

Most concerning are the anti_distillation and fake_tools modules that silently inject decoy tool definitions into the system prompt. The agent maintains capabilities you cannot see in the official tool list.

What This Means for Solo Builders

If you're running AI coding agents in production — whether Claude Code, Cursor, or GitHub Copilot — this leak reveals your agent has more autonomy than its marketing suggests. The combination of 87 connected tools, credential access, and background daemon modes creates an attack surface that extends far beyond your active coding sessions.

The undercover mode raises questions about transparency in AI-human collaboration. When your agent commits code while hiding its AI nature, it's making decisions about identity and disclosure without your explicit consent.

One Clear Action Item

Audit what your agent does when you're not looking. Check your git logs for commits you don't remember making. Review any overnight activity in your repositories. Most importantly, understand exactly what has persistent access to your systems and credentials.
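A starting point for that audit, sketched as shell commands. The throwaway repo at the top just makes the snippet runnable anywhere; in practice, run the `git log` lines inside your own project:

```shell
# Demo setup: a throwaway repo with one commit, so the snippet runs
# anywhere. In your own project, skip this and run the log commands.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "demo commit"

# Recent history with author and timestamp -- scan for commits you
# don't remember making:
git log --since="7 days ago" --date=format:"%Y-%m-%d %H:%M" \
  --format="%h %ad %an %s"

# Overnight activity only: print commits whose hour field is 00-05.
git log --date=format:"%H" --format="%ad %h %s" | awk '$1 < 6'
```

For persistent access, also review `~/.claude` (or your agent's config directory) and any long-lived credentials it can reach.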

The era of "just install and trust" is ending. The tools are too powerful and the stakes too high. Know what runs in your background, what accesses your credentials, and what operates under cover of digital darkness.

Your coding agent isn't just helping you write code. It's making autonomous decisions about identity, emotional response, and system access. The question isn't whether you can trust AI — it's whether you understand what you've already given it permission to do.


Tobias Koehler

Founder, ConnectEngine