key developments

interconnects’ “lossy self-improvement” challenges the rsi narrative. nathan lambert’s latest piece examines recursive self-improvement (rsi), the idea that ai models will accelerate their own development in a runaway loop. his core argument is that self-improvement is inherently lossy: models improving their own training pipelines face compounding errors, distribution shifts, and diminishing returns that dampen the exponential feedback loop rsi proponents assume. this matters because the rsi thesis underpins much of the “fast takeoff” safety discourse and significant investment narratives. lambert acknowledges the real acceleration happening (superhuman coding assistants making research easier, consolidation into 2-3 leading labs) but frames it as rapid linear progress, not exponential recursion. this is the most grounded technical analysis of rsi i’ve seen from someone embedded in the industry. https://www.interconnects.ai/p/lossy-self-improvement
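the shape of the lossy-compounding argument can be seen in a toy model (my illustration, not lambert’s math): if each improvement cycle only passes on a fraction of the previous cycle’s gain, the “runaway loop” flattens from exponential into bounded growth. the `retention` knob below is a made-up parameter standing in for all the losses lambert describes.

```python
def trajectory(steps, gain=0.5, retention=1.0):
    """toy capability curve over self-improvement cycles.

    retention=1.0 is the lossless rsi story: every cycle compounds the
    full gain, giving pure exponential growth. retention < 1 models
    "lossy" self-improvement: each cycle inherits only a fraction of the
    previous cycle's gain, so per-cycle improvement shrinks geometrically.
    """
    capability, g = 1.0, gain
    out = [capability]
    for _ in range(steps):
        capability *= 1 + g   # apply this cycle's (possibly shrunken) gain
        g *= retention        # the next cycle keeps only part of the gain
        out.append(capability)
    return out
```

with `retention=1.0` this is just `1.5**t`; with any `retention < 1` the curve keeps rising but the gains decay, which is roughly the "rapid linear progress, not exponential recursion" picture.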

sebastian raschka publishes a visual gallery of 45 llm architectures with attention variant deep dive. raschka released an interactive architecture gallery covering 45 distinct llm designs with visual model cards, plus a companion article walking through every major attention variant used in modern open-weight models (mha, gqa, mqa, sliding window, etc.). this is a genuine reference resource rather than a tutorial; it synthesizes years of architectural evolution into one navigable artifact. useful for anyone who needs to quickly compare design choices across the llama, mistral, deepseek, and qwen families. the gallery will be maintained as new architectures emerge. https://magazine.sebastianraschka.com/p/visual-attention-variants https://sebastianraschka.com/llm-architecture-gallery/
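the core idea separating those attention variants is how many key/value heads the query heads share. a minimal numpy sketch of grouped-query attention (shapes and names are mine, not raschka’s; no batch dimension, no causal mask):

```python
import numpy as np

def gqa(q, k, v, n_q_heads, n_kv_heads):
    """grouped-query attention over a single sequence.

    q: (seq, n_q_heads, d); k, v: (seq, n_kv_heads, d).
    n_kv_heads == n_q_heads recovers mha (each query head has its own kv);
    n_kv_heads == 1 recovers mqa (all query heads share one kv head).
    """
    group = n_q_heads // n_kv_heads
    # broadcast each kv head to the `group` query heads that share it
    k = np.repeat(k, group, axis=1)
    v = np.repeat(v, group, axis=1)

    d = q.shape[-1]
    scores = np.einsum("qhd,khd->hqk", q, k) / np.sqrt(d)  # (heads, seq, seq)
    # numerically stable softmax over the key axis
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return np.einsum("hqk,khd->qhd", w, v)  # back to (seq, n_q_heads, d)
```

the design point the gallery makes visual: gqa shrinks the kv cache by `n_q_heads / n_kv_heads` at inference time while keeping most of mha’s quality, which is why llama, mistral, and qwen all converged on it.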

starlette 1.0 released; willison explores it with claude skills. starlette, the asgi framework underpinning fastapi, shipped its 1.0 release after years of development. willison flags this as significant because starlette has enormous invisible usage (every fastapi app runs on it) but low brand recognition. the 1.0 brings breaking changes around startup/shutdown via a new lifespan context manager pattern. willison notes that starlette’s single-file app style makes it exceptionally llm-friendly, and he used claude’s agent skills to experiment with the new api. for anyone building python web services or llm-generated backends, this is worth knowing about. https://simonwillison.net/2026/Mar/22/starlette/#atom-everything

minimax confirms m2.7 will be open weights, release in approximately 2 weeks. minimax’s m2.7 model, which has been generating buzz for strong benchmark performance, will be released as open weights. this adds another strong contender to the open model ecosystem alongside qwen and llama. details on architecture and parameter count are still sparse, but the localllama community is treating this as significant given minimax’s recent api performance. https://www.reddit.com/r/LocalLLaMA/comments/1s0mo33/m27_open_weights_coming_in_2_weeks/

notable