Claude docs changes for April 30th, 2026 [diff]

Executive Summary

New cache pre-warming feature documented: use max_tokens: 0 to load system prompts into the cache before user traffic arrives, eliminating cold-start latency
New allowMachLookup field in the Python Agent SDK allows sandboxed macOS processes to access XPC/Mach services by name
Skills clarification: the directory name (not a name frontmatter field) determines a skill's slash command
New Windows troubleshooting section for file-lock errors during install, plus PATH guidance for fish and Nushell users

Added new allowMachLookup field to SandboxNetworkConfig: macOS-only setting that allows sandboxed processes to access named XPC/Mach services, with support for a trailing wildcard. [lines 3064-3077] [Source]

Clarified that the directory name (not the name field in YAML frontmatter) becomes the slash command for a skill. The name: field has been removed from the example SKILL.md. [line 41] [Source]

Added new troubleshooting entry and resolution section for the Windows error The process cannot access the file ... because it is being used by another process, which occurs when antivirus or a prior installer run locks files in the downloads folder. [lines 27, 424-432]
Added guidance for fish and Nushell users to add ~/.local/bin to their PATH using their shell's own configuration syntax. [line 106] [Source]

Added a note pointing users to the Code Review integration for GitHub PRs, which posts inline PR comments without requiring a CLI step. [line 78] [Source]

Added constraint that each batched request must have max_tokens of at least 1; max_tokens: 0 (cache pre-warming) is not supported inside batches because a cache entry written during batch processing would likely expire before the follow-up request runs. [line 47] [Source]

Added note that extended thinking cannot be combined with max_tokens: 0 (cache pre-warming) because budget_tokens must be less than max_tokens. [line 87] [Source]

Added a major new "Pre-warming the cache" section documenting how to use max_tokens: 0 to load system prompts or tool definitions into the cache before user traffic arrives, eliminating cold-start TTFT penalties. Includes code examples, typical usage patterns, and a discussion of limitations (incompatible with streaming, extended thinking, structured outputs, and forced tool choice). [lines 501-641] [Source]
Added guidance on replacing the old max_tokens: 1 warm-up workaround with the new max_tokens: 0 approach, which produces no output tokens and bills nothing for output. [line 633] [Source]