key developments
claude mythos and the cybersecurity disclosure model. anthropic’s claude mythos is the first model since gpt-2 to be withheld from public access, but for concrete rather than precautionary reasons: it reportedly discovers zero-day exploits across essentially all major software, including operating systems and browsers. anthropic created “project glasswing” to give mythos exclusively to cybersecurity firms for patching before broader access. zvi maulshagen’s analysis takes anthropic at face value, noting that the cooperation of major tech and cybersecurity firms would make a bluff quickly obvious. nathan lambert at interconnects pushes back on the resulting anti-open-weight narrative, arguing the 6-18 month capability lag between closed and open models is the system working as intended, and that conflating general capability gaps with specific exploit risks leads to overbroad policy recommendations. the real question is whether governments will try to co-opt the offensive capabilities rather than use the defensive window. (zvi, interconnects)
anthropic reportedly at $30b annualized revenue. per saastr (citing 20vc), anthropic hit $30 billion in annualized revenue, up from $9 billion at the start of the year. if accurate, that is 3.3x growth in four months and would be the fastest revenue scaling in enterprise software history. the claim that anthropic’s training costs are a quarter of openai’s is harder to verify and is partly explained by narrower product scope (no video, no image generation). still, the combination of revenue acceleration and cost efficiency, if the numbers hold, represents a significant shift in competitive dynamics. treat with appropriate skepticism: the source is a podcast recap, not audited figures. (saastr)
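the growth arithmetic, taking the reported figures at face value over a four-month window:

```python
# implied growth from $9b to $30b annualized revenue over four months
# (figures from the saastr recap, not audited)
start, end, months = 9.0, 30.0, 4
total_multiple = end / start              # overall multiple over the window
monthly = total_multiple ** (1 / months)  # implied compound monthly multiple
print(f"total: {total_multiple:.2f}x, monthly: {monthly:.2f}x")
# prints: total: 3.33x, monthly: 1.35x
```

roughly 35% compound growth per month, which is why the "fastest in enterprise software history" framing is plausible if the endpoints are real.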
the atom report: chinese open models overtook u.s. counterparts in mid-2025. a new comprehensive analysis of the open model ecosystem (~1,500 mainline models) documents that chinese models (qwen, deepseek) overtook u.s. open models in hugging face downloads, derivatives, and inference market share during summer 2025 and have since widened the gap. this matters for policy, supply chain, and strategic planning: the open-weight ecosystem that most developers build on is increasingly chinese-origin. (arxiv)
ragen-2 identifies “template collapse” in agentic rl, invisible to entropy monitoring. this paper finds that rl-trained multi-turn agents can appear to maintain reasoning diversity (stable entropy) while actually relying on fixed, input-agnostic templates. they call this “template collapse” and show that mutual information between inputs and reasoning traces is a far better predictor of final performance than entropy. the proposed fix (snr-aware filtering) selects high-signal prompts per iteration. this is practically important for anyone training agents with rl; entropy monitoring alone is insufficient. (arxiv)
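the diagnostic gap is easy to see with toy distributions: a collapsed agent can emit diverse-looking templates (healthy entropy) that carry zero information about the input. a minimal sketch, with hypothetical template/plan strings, computing both metrics from empirical counts:

```python
from collections import Counter
import math

def entropy(samples):
    """shannon entropy (bits) of an empirical distribution."""
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in Counter(samples).values())

def mutual_information(pairs):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) from empirical counts."""
    xs, ys = zip(*pairs)
    return entropy(xs) + entropy(ys) - entropy(pairs)

n = 96
inputs    = [f"q{i % 4}" for i in range(n)]
# collapsed agent: templates cycle regardless of the input
collapsed = [f"tmpl_{(i // 4) % 4}" for i in range(n)]
# healthy agent: the reasoning trace is determined by the input
healthy   = [f"plan_for_q{i % 4}" for i in range(n)]

# both trace distributions have identical entropy (2 bits), but only the
# healthy agent's traces carry information about the input
print(entropy(collapsed), mutual_information(list(zip(inputs, collapsed))))  # 2.0 0.0
print(entropy(healthy),   mutual_information(list(zip(inputs, healthy))))    # 2.0 2.0
```

this is the paper's point in miniature: entropy monitoring sees two indistinguishable agents, mutual information separates them immediately.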
foundry: 99% cold-start reduction for llm serving via cuda graph materialization. serving large moe and dense models suffers from cuda graph capture taking minutes during cold start. foundry persists both graph topology and execution context offline, then reconstructs executable graphs in seconds. qwen3-235b-a22b initialization drops from 10 minutes to 3.9 seconds. for autoscaling and parallelism reconfiguration scenarios, this effectively removes cuda graphs as a cold-start bottleneck. (arxiv)
the depth ceiling: latent planning in llms hits hard limits. controlled experiments on graph path-finding show that llms have strict limits on multi-step latent planning (reasoning within a single forward pass without chain-of-thought). tiny transformers reach 3 steps, fine-tuned gpt-4o reaches 5, gpt-5.4 reaches 7. interestingly, strategies discovered during training at depth 5 generalize to depth 8 at test time, revealing a gap between strategy discovery and strategy execution. this matters for cot monitoring: if latent reasoning has reliable ceilings, monitoring externalized reasoning remains viable as a safety strategy. (arxiv)
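the probe style described above can be sketched as follows; the node names, distractor scheme, and prompt wording are hypothetical, not the paper's exact setup:

```python
import random

def make_path_probe(depth, n_nodes=26, seed=0):
    """build an edge list containing a start-to-goal path of length `depth`;
    answering without chain-of-thought requires chaining `depth` hops in a
    single forward pass (the latent planning being measured)."""
    rng = random.Random(seed)
    nodes = [chr(ord("a") + i) for i in range(n_nodes)]
    rng.shuffle(nodes)
    path = nodes[:depth + 1]
    edges = list(zip(path, path[1:]))
    # distractor edges among off-path nodes so the answer is not trivial
    for u in nodes[depth + 1:]:
        edges.append((u, rng.choice(nodes[depth + 1:])))
    rng.shuffle(edges)
    prompt = "edges: " + ", ".join(f"{u}->{v}" for u, v in edges)
    return prompt + f". does a path {path[0]} -> {path[-1]} exist? answer in one token.", path

probe, gold_path = make_path_probe(depth=5)
print(probe)
print("gold path:", " -> ".join(gold_path))
```

sweeping `depth` and scoring single-token answers is how a reliable ceiling (3 / 5 / 7 steps in the reported results) would show up.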
notable
- forkkv achieves 3x throughput for multi-lora agent serving by decoupling kv cache into shared and agent-specific components using copy-on-write semantics, directly addressing the memory bottleneck when multiple lora agents share context. (arxiv)
- eat (entropy after ) reduces reasoning token usage 12-22% without accuracy loss by monitoring entropy after appending a stop-thinking token; works even as a black-box method using small proxy models on claude 3.7. (arxiv)
- lrkv attention: low-rank key-value sharing across heads achieves lowest test loss vs mha/mqa/gqa/mla at 45-53% kv cache size across 128m-6.3b parameter models, reaching equivalent quality 18-25% faster. (arxiv)
- gemma 4 on llama.cpp is now stable after fixing all known issues; q5 quants of 31b working well, kv cache quantization with q5k/q4v shows no major degradation. note: do not use cuda 13.2, confirmed broken. (reddit)
- backend-agnostic tensor parallelism merged into llama.cpp, meaning multi-gpu acceleration no longer requires cuda. experimental but significant for heterogeneous hardware setups. (reddit)
- tracesafe-bench finds guardrail efficacy for agents depends on structural reasoning (json parsing), not safety alignment; performance correlates with structured-to-text benchmarks (ρ=0.79) but near-zero correlation with jailbreak robustness. architecture matters more than scale. (arxiv)
- langchain launches “deep agents deploy” as an open-source, model-agnostic alternative to claude managed agents, bundling orchestration, sandboxes, mcp/a2a endpoints into a single deploy command. (langchain)
- master key hypothesis: capability transfer across model scales without retraining. transferring cot reasoning from qwen1.5-14b to 7b yields +12.1% on math; transferring math reasoning from qwen3-4b-base to 14b-base surpasses the post-trained 14b model. (arxiv)
- improving sae robustness via masked regularization reduces feature absorption and improves ood performance in sparse autoencoders used for mechanistic interpretability, addressing a known failure mode. (arxiv)
- sol-rl uses fp4 quantized rollouts with bf16 training for diffusion model rl alignment, achieving 4.64x training speedup while maintaining bf16 training integrity. (arxiv)
- longwriter-zero achieves sota long-form writing via rl alone (no sft data), surpassing deepseek r1 and qwen3-235b on writingbench and arena-write. trained from qwen2.5-32b base. (arxiv)
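the copy-on-write idea behind forkkv's shared/agent-specific cache split can be sketched with a toy page cache; all names here are hypothetical, not forkkv's actual implementation:

```python
class CowKVCache:
    """toy copy-on-write kv cache: forked agents share read-only base
    pages and materialize a private copy of a page only on first write."""

    def __init__(self, base_pages):
        self.base = base_pages   # shared prefix pages, read-only
        self.overlays = {}       # agent_id -> {page_idx: private page}

    def fork(self, agent_id):
        # o(1): the fork records an empty overlay; no page data is copied
        self.overlays[agent_id] = {}

    def read(self, agent_id, idx):
        return self.overlays[agent_id].get(idx, self.base[idx])

    def append(self, agent_id, idx, kv_entry):
        ov = self.overlays[agent_id]
        if idx not in ov:
            ov[idx] = list(self.base[idx])  # copy the shared page on first write
        ov[idx].append(kv_entry)


cache = CowKVCache(base_pages=[["sys_kv"], ["ctx_kv"]])
cache.fork("agent_a")
cache.fork("agent_b")
cache.append("agent_a", 1, "a_private_kv")
print(cache.read("agent_a", 1))  # ['ctx_kv', 'a_private_kv']
print(cache.read("agent_b", 1))  # ['ctx_kv']  (shared page untouched)
```

the memory win comes from the fork being free and writes being localized: n agents sharing a long prefix pay for one copy of it plus only the pages they actually diverge on.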
papers
“the depth ceiling: on the limits of large language models in discovering latent planning” reveals hard limits on multi-step latent reasoning in llms, with implications for chain-of-thought monitoring as a safety strategy. (arxiv)
“ragen-2: reasoning collapse in agentic rl” introduces mutual information as a superior diagnostic for reasoning quality over entropy, identifying template collapse as a previously invisible failure mode. (arxiv)
“the illusion of stochasticity in llms” demonstrates that llms fundamentally fail to map internal probability estimates to stochastic outputs when sampling from specified distributions, a critical gap for agentic systems. (arxiv)
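one way to quantify that failure mode is total variation distance between a sampler's empirical frequencies and the specified target distribution; a minimal sketch with toy samplers (a faithful one and a "mode-collapsed" one that always emits the argmax):

```python
from collections import Counter

def tv_distance(samples, target):
    """total variation distance between empirical frequencies and a
    target distribution: 0 means a perfect match, higher means worse."""
    n = len(samples)
    counts = Counter(samples)
    support = set(counts) | set(target)
    return 0.5 * sum(abs(counts.get(x, 0) / n - target.get(x, 0.0))
                     for x in support)

target = {"a": 0.7, "b": 0.3}          # the distribution the model is asked to sample
faithful = ["a"] * 70 + ["b"] * 30     # matches the spec
mode_collapsed = ["a"] * 100           # always picks the most likely token
print(tv_distance(faithful, target))        # 0.0
print(tv_distance(mode_collapsed, target))  # ~0.3
```

mode collapse of exactly this kind (correct probability estimates, deterministic outputs) is the gap the paper points at, and it matters wherever an agent is supposed to randomize (exploration, load balancing, simulation).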
“the illusion of superposition? a principled analysis of latent thinking in language models” finds that only models trained from scratch exhibit superposition in continuous chain-of-thought; pretrained models collapse or bypass it due to natural language training bias. (arxiv)
“how to sketch a learning algorithm” presents a data deletion scheme for deep learning with vanishing error and only poly(1/ε) overhead, based on locally sketching arithmetic circuits via higher-order derivatives in random complex directions. (arxiv)
“computational bottlenecks for denoising diffusions” provides evidence that some tractable-to-sample distributions have intractable diffusion drifts, showing that superpolynomially-near-optimal drifts can yield samples far from the target. (arxiv)
“distributed interpretability and control for large language models” presents a practical multi-gpu implementation of logit lens and steering vectors with 7x memory reduction and 41x throughput increase, tested on models up to 70b. (arxiv)