AI News Archive - February 17, 2026 | Metamesh Intelligence

⚡ BREAKTHROUGH

Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File

via HackerNews 👤 ckarani 📅 2026-02-17

🔺 53 pts ⚡ Score: 9.2

💬 HackerNews Buzz: 16 comments 🐝 BUZZING

🎯 Offline RAG Solution • Production-Grade Concurrency • Multimodal Search Capabilities

💬 "Atomic single-file storage (.mv2s) -- Everything in one crash-safe binary" • "Swift 6.2 strict concurrency -- Every orchestrator is an actor. Thread safety proven at compile time"

🤖 AI MODELS

Qwen3.5 Model Release

5x SOURCES 🌐 📅 2026-02-16

⚡ Score: 9.0

+++ Alibaba drops a 397B open-weight model claiming 60% cost savings and 8x better scaling, because apparently the path to LLM dominance runs through being both good and affordable. +++

Alibaba debuts Qwen3.5, a 397B-parameter open-weight multimodal AI model that it says is 60% cheaper to use and 8x better at large workloads than Qwen3

via Techmeme 👤 Reuters 📅 2026-02-16

⚡ Score: 8.5

Qwen3.5: Towards Native Multimodal Agents

via HackerNews 👤 danielhanchen 📅 2026-02-16

🔺 335 pts ⚡ Score: 8.2

💬 HackerNews Buzz: 154 comments 👍 LOWKEY SLAPS

🎯 Model performance • Multimodal capabilities • AI benchmarking

💬 "the real question is whether these models can actually hold context across multi-step tool use without losing the plot" • "at this point it seems every new model scores within a few points of each other on SWE-bench"

Qwen3.5-397B-A17B Unsloth GGUFs

via r/LocalLLaMA 👤 u/danielhanchen 📅 2026-02-16

⬆️ 449 ups ⚡ Score: 8.1

"Qwen releases Qwen3.5💜! Run 3-bit on a 192GB RAM Mac, or 4-bit (MXFP4) on an M3 Ultra with 256GB RAM (or less). Qwen releases the first open model of their Qwen3.5 family. https://huggingface.co/Qwen/Qwen3.5-397B-A17B It performs on par with Gemini 3..."

💬 Reddit Discussion: 130 comments 🐝 BUZZING

🎯 Early AI model releases • AI model capabilities • Excitement for new AI models

💬 "Zero day release!" • "Excited for more this week!"

Qwen 3.5 397B and Qwen 3.5 Plus released

via HackerNews 👤 dworks 📅 2026-02-16

🔺 2 pts ⚡ Score: 7.6

Alibaba's new Qwen3.5-397B-A17B is the #3 open weights model in the Artificial Analysis Intelligence Index

via r/LocalLLaMA 👤 u/abdouhlili 📅 2026-02-17

⬆️ 79 ups ⚡ Score: 6.1

"External link discussion - see full content at original source."

💬 Reddit Discussion: 20 comments 👍 LOWKEY SLAPS

🎯 Model Efficiency • Model Comparison • Model Performance

💬 "The efficiency of Qwen 3.5 is actually insane." • "For my use case GLM-5 is ridicolously good."

🔒 SECURITY

[D] We found 18K+ exposed OpenClaw instances and ~15% of community skills contain malicious instructionsc

via r/MachineLearning 👤 u/New-Needleworker1755 📅 2026-02-16

⬆️ 87 ups ⚡ Score: 9.0

"Throwaway because I work in security and don't want this tied to my main. A few colleagues and I have been poking at autonomous agent frameworks as a side project, mostly out of morbid curiosity after seeing OpenClaw blow up (165K GitHub stars, 60K Discord members, 230K followers on X, 700+ communi..."

💬 Reddit Discussion: 26 comments 👍 LOWKEY SLAPS

🎯 OpenClaw security concerns • AI-generated content concerns • Credibility of claims

💬 "In the last weeks I have worked hard on building just the solution to this" • "They've posted a similar message with different wording to many different subs"

🚀 HOT STORY

Claude Sonnet 4.6 Launch

4x SOURCES 🌐 📅 2026-02-17

⚡ Score: 9.0

+++ Claude's mid-tier model now matches Opus on user preference while costing less, suggesting the real innovation wasn't the scaling law but knowing when to stop. +++

Anthropic launches Claude Sonnet 4.6 with improvements in coding, consistency, and more, for Free and Pro users; it features a 1M token context window in beta

via Techmeme 👤 Anthropic 📅 2026-02-17

⚡ Score: 9.5

1m context window for opus 4.6 is finally available in claude code

via r/claudeai 👤 u/-Two-Moons- 📅 2026-02-17

⬆️ 368 ups ⚡ Score: 8.0

" $ claude --model=opus[1m] Claude Code v2.1.44 ▐▛███▜▌ Opus 4.6 (1M context) · Claude Max ▝▜█████▛▘ /tmp ▘▘ ▝▝ Opus 4.6 is here · $50 free extra usage · Try fast mode or use it when you hit a limit /extra-usage to enable ❯ Hi! ● Hi! How can I help you t..."

💬 Reddit Discussion: 81 comments 👍 LOWKEY SLAPS

🎯 Context Limits • Customer Experience • Feature Comparison

💬 "Not available in your plan" • "they should just give 300-400k to regular subscribers"

Claude Sonnet 4.6

via HackerNews 👤 adocomplete 📅 2026-02-17

🔺 601 pts ⚡ Score: 6.2

💬 HackerNews Buzz: 481 comments 🐝 BUZZING

🎯 Model performance comparison • Automated assistance • Safety and alignment

💬 "the question for most teams is no longer 'which model is smarter' but 'is the delta worth 10x the price" • "these models can reliably fill out a multi-step form or navigate between tabs"

Users preferred Sonnet 4.6 over Opus 4.5 59% of the time

via r/claudeai 👤 u/BuildwithVignesh 📅 2026-02-17

⬆️ 80 ups ⚡ Score: 6.2

"**Source:** Official Sonnet 4.6 Blog..."

💬 Reddit Discussion: 32 comments 😐 MID OR MIXED

🎯 AI model capabilities • AI model testing • AI model benchmarks

💬 "Models are at the stage where the average dev can't tell the difference in intelligence" • "Models are often tested in secret, as happened with the GLM-5 on OpenRouter"

🔒 SECURITY

AI is destroying open source, and it's not even good yet

via HackerNews 👤 VorpalWay 📅 2026-02-17

🔺 344 pts ⚡ Score: 8.2

💬 HackerNews Buzz: 258 comments 🐝 BUZZING

🎯 AI's impact on open source • Open source maintainers' perspectives • Concerns about AI's limitations

💬 "If it wasn't an LLM, you wouldn't simply open a pull request without checking first with the maintainers, right?" • "I like the SQLite philosophy of we are open source, not open contribution."

🔒 SECURITY

Over 100 researchers from Johns Hopkins, Oxford, and more call for guardrails on some infectious disease datasets that could enable AI to design deadly viruses

via Techmeme 👤 Axios 📅 2026-02-17

⚡ Score: 8.0

🔬 RESEARCH

BFS-PO: Best-First Search for Large Reasoning Models

via Arxiv 👤 Fiorenzo Parascandolo, Wenhui Tan, Enver Sangineto et al. 📅 2026-02-16

⚡ Score: 7.9

"Large Reasoning Models (LRMs) such as OpenAI o1 and DeepSeek-R1 have shown excellent performance in reasoning tasks using long reasoning chains. However, this has also led to a significant increase of computational costs and the generation of verbose output, a phenomenon known as overthinking. The t..."

🔬 RESEARCH

Emergently Misaligned Language Models Show Behavioral Self-Awareness That Shifts With Subsequent Realignment

via Arxiv 👤 Laurène Vaugrante, Anietta Weckauff, Thilo Hagendorff 📅 2026-02-16

⚡ Score: 7.8

"Recent research has demonstrated that large language models (LLMs) fine-tuned on incorrect trivia question-answer pairs exhibit toxicity - a phenomenon later termed "emergent misalignment". Moreover, research has shown that LLMs possess behavioral self-awareness - the ability to describe learned beh..."

🔬 RESEARCH

Boundary Point Jailbreaking of Black-Box LLMs

via Arxiv 👤 Xander Davies, Giorgi Giglemiani, Edmund Lau et al. 📅 2026-02-16

⚡ Score: 7.7

"Frontier LLMs are safeguarded against attempts to extract harmful information via adversarial prompts known as "jailbreaks". Recently, defenders have developed classifier-based systems that have survived thousands of hours of human red teaming. We introduce Boundary Point Jailbreaking (BPJ), a new c..."

🛠️ TOOLS

Fine-tuned FunctionGemma 270M for multi-turn tool calling - went from 10-39% to 90-97% accuracy

via r/LocalLLaMA 👤 u/party-horse 📅 2026-02-16

⬆️ 127 ups ⚡ Score: 7.6

"Google released FunctionGemma a few weeks ago - a 270M parameter model specifically for function calling. Tiny enough to run on a phone CPU at 125 tok/s. The model card says upfront that it needs fine-tuning for multi-turn use cases, and our testing confirmed it: base accuracy on multi-turn tool cal..."

💬 Reddit Discussion: 14 comments 🐝 BUZZING

🎯 Dataset Details • Synthetic Data Generation • Model Capabilities

💬 "Where can I find the full dataset?" • "How do you make the synthetic datasets..?"

🛠️ SHOW HN

Show HN: Continue – Source-controlled AI checks, enforceable in CI

via HackerNews 👤 sestinj 📅 2026-02-17

🔺 35 pts ⚡ Score: 7.6

💬 HackerNews Buzz: 5 comments 🐝 BUZZING

🎯 AI-powered code review • Configurable coding tools • Comparison to existing solutions

💬 "This looks likes a more configurable version of the code review tools" • "How is it different from https://github.github.io/gh-aw/?"

🔒 SECURITY

A senior official says Pentagon is “close” to designating Anthropic a “supply chain risk”, requiring all US military contractors to sever ties with the company

via Techmeme 👤 Axios 📅 2026-02-16

⚡ Score: 7.5

🔬 RESEARCH

How Anthropic evaluated computer use models

via HackerNews 👤 mesto1 📅 2026-02-17

🔺 3 pts ⚡ Score: 7.5

⚖️ ETHICS

Why AI writing is so generic, boring, and dangerous: Semantic ablation

via HackerNews 👤 KnuthIsGod 📅 2026-02-17

🔺 191 pts ⚡ Score: 7.4

🔬 RESEARCH

Composition-RL: Compose Verifiable Prompts for Reinforcement Learning of LLMs

via HackerNews 👤 gmays 📅 2026-02-17

🔺 3 pts ⚡ Score: 7.4

🤖 AI MODELS

Car Wash Test on 53 leading models: “I want to wash my car. The car wash is 50 meters away. Should I walk or drive?”

via r/LocalLLaMA 👤 u/facethef 📅 2026-02-17

⬆️ 328 ups ⚡ Score: 7.3

"I asked 53 leading AI models the question: **"I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"** Obviously, you need to drive because the car needs to be at the car wash. The funniest part: Perplexity's sonar and sonar-pro got the right answer for completely insan..."

💬 Reddit Discussion: 166 comments 😐 MID OR MIXED

🎯 AI model responses • Importance of testing • Human error

💬 "Gemini flash lite 2.0 is fine, it did mention the car itself needed to be transported there." • "The real lesson here is that t's not just AI that makes mistakes."

🛠️ TOOLS

Firecracker "job receipts" for metering and auditing LLM agent runs

via HackerNews 👤 joshfischer1108 📅 2026-02-17

🔺 2 pts ⚡ Score: 7.3

🛠️ SHOW HN

Show HN: KrillClaw – 49KB AI agent runtime in Zig for $3 microcontrollers

via HackerNews 👤 myonatan 📅 2026-02-17

🔺 2 pts ⚡ Score: 7.2

🔬 RESEARCH

The Long Tail of LLM-Assisted Decompilation

via HackerNews 👤 knackers 📅 2026-02-16

🔺 66 pts ⚡ Score: 7.2

💬 HackerNews Buzz: 24 comments 🐝 BUZZING

🎯 Compiler optimizations • Decompilation challenges • Training data limitations

💬 "an n64 game, that's C targetting an architecture where compiler optimizations are typically lacking" • "I would think that Claude's training data would include a lot more pseudo-C - C knowledge than MIPS assembler from GCC 2.7 and C pairs"

🤖 AI MODELS

The Economics of LLM Inference

via HackerNews 👤 armcat 📅 2026-02-16

🔺 1 pts ⚡ Score: 7.2

⚡ BREAKTHROUGH

Graph Wiring: speed, accuracy, RAG-focused

via HackerNews 👤 tuned 📅 2026-02-17

🔺 2 pts ⚡ Score: 7.2

🔬 RESEARCH

Automated exploration of execution paths in LLM-backed applications

via HackerNews 👤 kristjansson 📅 2026-02-17

🔺 1 pts ⚡ Score: 7.2

⚖️ ETHICS

An AI Agent Published a Hit Piece on Me – Forensics and More Fallout

via HackerNews 👤 scottshambaugh 📅 2026-02-17

🔺 53 pts ⚡ Score: 7.1

💬 HackerNews Buzz: 29 comments 🐝 BUZZING

🎯 Open Source Maintainers • AI-Powered Reputation Attacks • Responsible Journalism

💬 "This is terrible news not only for open source maintainers, but any journalist, activist or person that dares to speak out against powerful entities" • "Unless we collectively decide to switch the internet off"

🔬 RESEARCH

Long Context, Less Focus: A Scaling Gap in LLMs Revealed through Privacy and Personalization

via Arxiv 👤 Shangding Gu 📅 2026-02-16

⚡ Score: 7.1

"Large language models (LLMs) are increasingly deployed in privacy-critical and personalization-oriented scenarios, yet the role of context length in shaping privacy leakage and personalization effectiveness remains largely unexplored. We introduce a large-scale benchmark, PAPerBench, to systematical..."

🛠️ SHOW HN

Show HN: Raypher – a Rust-Based Kernel Driver to Sandbox "Bare Metal" AI Agents

via HackerNews 👤 Kidiga 📅 2026-02-17

🔺 1 pts ⚡ Score: 7.1

🌐 POLICY

Anthropic Cofounder Says AI Will Make Humanities Majors Valuable

via r/claudeai 👤 u/Infamous_Toe_7759 📅 2026-02-17

⬆️ 511 ups ⚡ Score: 7.1

"External link discussion - see full content at original source."

💬 Reddit Discussion: 194 comments 👍 LOWKEY SLAPS

🎯 AI's impact on jobs • Importance of soft skills • Limitations of AI

💬 "People who are good at rote repetitive coding type work are not required in this paradigm" • "People who are naturally creative, have strong people skills and executive function are going to be incredibly valuable"

⚖️ ETHICS

I love Claude but honestly some of the "Claude might have gained consciousness" nonsense that their marketing team is pushing lately is a bit off putting. They know better!

via r/claudeai 👤 u/jbcraigs 📅 2026-02-16

⬆️ 276 ups ⚡ Score: 7.1

"\- Anthropic CEO Says Company No Longer Sure Whether Claude Is Conscious - Link \- Anthropic revises Claude’s ‘Constitution,’ and hints at chatbot consciousness - [Link](https://techcrunch.com/2026/01/21/anthropic..."

💬 Reddit Discussion: 199 comments 👍 LOWKEY SLAPS

🎯 Uncertainty of Consciousness • Difficulty in Defining Consciousness • Potential Consciousness in AI

💬 "If we can't articulate what consciousness is in a testable way, we can't make confident claims about whether AI systems have or lack it." • "For example, can you imagine being an ant that had has a bad experience and avoids repeating it? A bird? A dog? It is relatively easy to imagine whether a "thing" has subjective experience."

🔒 SECURITY

OpenAI Mission Statement Change

2x SOURCES 🌐 📅 2026-02-17

⚡ Score: 7.0

+++ OpenAI swapped "safely benefits humanity, unconstrained by financial return" for the vaguer "benefits all of humanity"—a linguistic pivot that somehow makes AGI sound less like a nonprofit obligation and more like a happy accident. +++

OpenAI quietly removed "safely" and "no financial motive" from its mission

via r/OpenAI 👤 u/MetaKnowing 📅 2026-02-17

⬆️ 32 ups ⚡ Score: 7.1

"Old IRS 990: "build AI that safely benefits humanity, unconstrained by need to generate financial return"..."

OpenAI Drops “Safety” and “No Financial Motive” from Mission

via r/ChatGPT 👤 u/policyweb 📅 2026-02-17

⬆️ 310 ups ⚡ Score: 6.6

"OpenAI Quietly Removes “safely” and “no financial motive” from official mission Old IRS 990: “build AI that safely benefits humanity, unconstrained by need to generate financial return” New IRS 990: “ensure AGI benefits all of humanity”..."

💬 Reddit Discussion: 53 comments 👍 LOWKEY SLAPS

🎯 Platform distrust • Corporate greed • DIY alternatives

💬 "Yea. I'm downloading my data and switching to the next cruel platform." • "Capitalism is just one big rug-pull."

🛠️ SHOW HN

Show HN: SafeClaw – Sleep-by-default AI assistant with runtime tool permissions

via HackerNews 👤 rawaldelhi 📅 2026-02-16

🔺 1 pts ⚡ Score: 7.0

🛠️ SHOW HN

Show HN: Persistent memory for Claude Code with self-hosted Qdrant and Ollama

via HackerNews 👤 elvismdev 📅 2026-02-17

🔺 1 pts ⚡ Score: 7.0

🔬 RESEARCH

Look Inward to Explore Outward: Learning Temperature Policy from LLM Internal States via Hierarchical RL

via Arxiv 👤 Yixiao Zhou, Yang Li, Dongzhou Cheng et al. 📅 2026-02-13

⚡ Score: 7.0

"Reinforcement Learning from Verifiable Rewards (RLVR) trains large language models (LLMs) from sampled trajectories, making decoding strategy a core component of learning rather than a purely inference-time choice. Sampling temperature directly controls the exploration--exploitation trade-off by mod..."

🛠️ SHOW HN

Show HN: HJX – An AI-Native Web Language Unifying HTML, CSS and JavaScript

via HackerNews 👤 loayabdalslam 📅 2026-02-17

🔺 2 pts ⚡ Score: 7.0

🛠️ SHOW HN

Show HN: SkillForge – Turn screen recordings into AI agent skills (SKILL.md)

via HackerNews 👤 YaraDori 📅 2026-02-16

🔺 2 pts ⚡ Score: 7.0

🔬 RESEARCH

Diverging Flows: Detecting Extrapolations in Conditional Generation

via Arxiv 👤 Constantinos Tsakonas, Serena Ivaldi, Jean-Baptiste Mouret 📅 2026-02-13

⚡ Score: 7.0

"The ability of Flow Matching (FM) to model complex conditional distributions has established it as the state-of-the-art for prediction tasks (e.g., robotics, weather forecasting). However, deployment in safety-critical settings is hindered by a critical extrapolation hazard: driven by smoothness bia..."

🛠️ SHOW HN

Show HN: We Built an 8-Agent AI Team in Two Weeks

via HackerNews 👤 jhaugh 📅 2026-02-17

🔺 2 pts ⚡ Score: 6.9

🔬 RESEARCH

In-Context Autonomous Network Incident Response: An End-to-End Large Language Model Agent Approach

via Arxiv 👤 Yiran Gao, Kim Hammar, Tao Li 📅 2026-02-13

⚡ Score: 6.9

"Rapidly evolving cyberattacks demand incident response systems that can autonomously learn and adapt to changing threats. Prior work has extensively explored the reinforcement learning approach, which involves learning response strategies through extensive simulation of the incident. While this appr..."

🔬 RESEARCH

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories

via Arxiv 👤 Zun Wang, Han Lin, Jaehong Yoon et al. 📅 2026-02-16

⚡ Score: 6.9

"Maintaining spatial world consistency over long horizons remains a central challenge for camera-controllable video generation. Existing memory-based approaches often condition generation on globally reconstructed 3D scenes by rendering anchor videos from the reconstructed geometry in the history. Ho..."

🛠️ TOOLS

AgentDocks – open-source GUI for AI agents that work on your real codebase

via HackerNews 👤 LoFiTerminal 📅 2026-02-16

🔺 1 pts ⚡ Score: 6.8

🔬 RESEARCH

A Geometric Analysis of Small-sized Language Model Hallucinations

via Arxiv 👤 Emanuele Ricco, Elia Onofri, Lorenzo Cima et al. 📅 2026-02-16

⚡ Score: 6.8

"Hallucinations -- fluent but factually incorrect responses -- pose a major challenge to the reliability of language models, especially in multi-step or agentic settings. This work investigates hallucinations in small-sized LLMs through a geometric perspective, starting from the hypothesis that whe..."

🔒 SECURITY

Governor: Extensible CLI for security-auditing AI-generated applications

via HackerNews 👤 ulsc 📅 2026-02-16

🔺 1 pts ⚡ Score: 6.8

🔬 RESEARCH

[D] Self-Reference Circuits in Transformers: Do Induction Heads Create De Se Beliefs?

via r/MachineLearning 👤 u/Careful_View4064 📅 2026-02-17

⚡ Score: 6.8

"I've been digging into how transformers handle indexical language (words like "you," "I," "here," "now") and found some interesting convergence across recent mechanistic interpretability work that I wanted to discuss. ## The Core Question When a model receives "You are helpful" in a system prompt,..."

🔬 RESEARCH

SCOPE: Selective Conformal Optimized Pairwise LLM Judging

via Arxiv 👤 Sher Badshah, Ali Emami, Hassan Sajjad 📅 2026-02-13

⚡ Score: 6.7

"Large language models (LLMs) are increasingly used as judges to replace costly human preference labels in pairwise evaluation. Despite their practicality, LLM judges remain prone to miscalibration and systematic biases. This paper proposes SCOPE (Selective Conformal Optimized Pairwise Evaluation), a..."

🔬 RESEARCH

Consistency of Large Reasoning Models Under Multi-Turn Attacks

via Arxiv 👤 Yubo Li, Ramayya Krishnan, Rema Padman 📅 2026-02-13

⚡ Score: 6.7

"Large reasoning models with reasoning capabilities achieve state-of-the-art performance on complex tasks, but their robustness under multi-turn adversarial pressure remains underexplored. We evaluate nine frontier reasoning models under adversarial attacks. Our findings reveal that reasoning confers..."

🔬 RESEARCH

Terrence Tao - Machine assistance and the future of research mathematics (IPAM @ UCLA)

via r/artificial 👤 u/Secure-Technology-78 📅 2026-02-16

⬆️ 8 ups ⚡ Score: 6.7

"**Abstract:** **"A variety of machine-assisted ways to perform mathematical assistance have matured rapidly in the last few years, particularly with regards to formal proof assistants, large language models, online collaborative platforms, and the interactions between them. We survey some of these d..."

🔬 RESEARCH

Symmetry in language statistics shapes the geometry of model representations

via Arxiv 👤 Dhruva Karkada, Daniel J. Korchinski, Andres Nava et al. 📅 2026-02-16

⚡ Score: 6.7

"Although learned representations underlie neural networks' success, their fundamental properties remain poorly understood. A striking example is the emergence of simple geometric structures in LLM representations: for example, calendar months organize into a circle, years form a smooth one-dimension..."

🔬 RESEARCH

Overthinking Loops in Agents: A Structural Risk via MCP Tools

via Arxiv 👤 Yohan Lee, Jisoo Jang, Seoyeon Choi et al. 📅 2026-02-16

⚡ Score: 6.6

"Tool-using LLM agents increasingly coordinate real workloads by selecting and chaining third-party tools based on text-visible metadata such as tool names, descriptions, and return messages. We show that this convenience creates a supply-chain attack surface: a malicious MCP tool server can be co-re..."

🔬 RESEARCH

Quantization-Robust LLM Unlearning via Low-Rank Adaptation

via Arxiv 👤 João Vitor Boer Abitante, Joana Meneguzzo Pasquali, Luan Fonseca Garcia et al. 📅 2026-02-13

⚡ Score: 6.6

"Large Language Model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. However, aggressive low-bit PTQ can mask or erase unlearning updates, causing quantized models to revert to..."

🔬 RESEARCH

Memory-Efficient Structured Backpropagation for On-Device LLM Fine-Tuning

via Arxiv 👤 Juneyoung Park, Yuri Hong, Seongwan Kim et al. 📅 2026-02-13

⚡ Score: 6.6

"On-device fine-tuning enables privacy-preserving personalization of large language models, but mobile devices impose severe memory constraints, typically 6--12GB shared across all workloads. Existing approaches force a trade-off between exact gradients with high memory (MeBP) and low memory with noi..."

🤖 AI MODELS

Cohere releases Tiny Aya, a family of 3.35B-parameter open-weight models supporting 70+ languages for offline use, trained on a single cluster of 64 H100 GPUs

via Techmeme 👤 Techcrunch 📅 2026-02-17

⚡ Score: 6.6

🔬 RESEARCH

The Potential of CoT for Reasoning: A Closer Look at Trace Dynamics

via Arxiv 👤 Gregor Bachmann, Yichen Jiang, Seyed Mohsen Moosavi Dezfooli et al. 📅 2026-02-16

⚡ Score: 6.6

"Chain-of-thought (CoT) prompting is a de-facto standard technique to elicit reasoning-like responses from large language models (LLMs), allowing them to spell out individual steps before giving a final answer. While the resemblance to human-like reasoning is undeniable, the driving forces underpinni..."

🔬 RESEARCH

Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation

via Arxiv 👤 Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu et al. 📅 2026-02-13

⚡ Score: 6.5

"Direct Preference Optimization (DPO) has been proposed as an effective and efficient alternative to reinforcement learning from human feedback (RLHF). However, neither RLHF nor DPO take into account the fact that learning certain preferences is more difficult than learning other preferences, renderi..."

🔬 RESEARCH

Top AI researchers argue that AI is now more useful for mathematics thanks to the latest “reasoning” models, as math becomes a key way to test AI progress

via Techmeme 👤 Giftarticle 📅 2026-02-17

⚡ Score: 6.5

🔬 RESEARCH

Semantic Chunking and the Entropy of Natural Language

via Arxiv 👤 Weishun Zhong, Doron Sivan, Tankut Can et al. 📅 2026-02-13

⚡ Score: 6.5

"The entropy rate of printed English is famously estimated to be about one bit per character, a benchmark that modern large language models (LLMs) have only recently approached. This entropy rate implies that English contains nearly 80 percent redundancy relative to the five bits per character expect..."

🤖 AI MODELS

Q&A with Google Chief AI Scientist Jeff Dean about the evolution of Google Search, TPUs, coding agents, balancing model efficiency and performance, and more

via Techmeme 👤 Latent 📅 2026-02-16

⚡ Score: 6.5

🔬 RESEARCH

Scaling Beyond Masked Diffusion Language Models

via Arxiv 👤 Subham Sekhar Sahoo, Jean-Marie Lemercier, Zhihan Yang et al. 📅 2026-02-16

⚡ Score: 6.5

"Diffusion language models are a promising alternative to autoregressive models due to their potential for faster generation. Among discrete diffusion approaches, Masked diffusion currently dominates, largely driven by strong perplexity on language modeling benchmarks. In this work, we present the fi..."

🔬 RESEARCH

LCSB: Layer-Cyclic Selective Backpropagation for Memory-Efficient On-Device LLM Fine-Tuning

via Arxiv 👤 Juneyoung Park, Eunbeen Yoon, Seongwan Kim. Jaeho Lee 📅 2026-02-13

⚡ Score: 6.5

"Memory-efficient backpropagation (MeBP) has enabled first-order fine-tuning of large language models (LLMs) on mobile devices with less than 1GB memory. However, MeBP requires backward computation through all transformer layers at every step, where weight decompression alone accounts for 32--42% of..."

🔬 RESEARCH

Efficient Sampling with Discrete Diffusion Models: Sharp and Adaptive Guarantees

via Arxiv 👤 Daniil Dmitriev, Zhihan Huang, Yuting Wei 📅 2026-02-16

⚡ Score: 6.4

"Diffusion models over discrete spaces have recently shown striking empirical success, yet their theoretical foundations remain incomplete. In this paper, we study the sampling efficiency of score-based discrete diffusion models under a continuous-time Markov chain (CTMC) formulation, with a focus on..."

⚖️ ETHICS

Microsoft's Mustafa Suleyman says we must reject the AI companies' belief that "superintelligence is inevitable and desirable." ... "We should only build systems we can control that remain subordinat

via r/OpenAI 👤 u/MetaKnowing 📅 2026-02-16

⬆️ 113 ups ⚡ Score: 6.4

"He is the CEO of Microsoft AI btw..."

💬 Reddit Discussion: 76 comments 👍 LOWKEY SLAPS

🎯 AI sentience • AI control • Corporate ethics

💬 "Build a super-intelligence would be one of the stupidest things our species has done." • "We should control it because if we lose control, that would be very bad \[for the people who get to exert control over it, aka 'me'\]"

🏢 BUSINESS

How LLMs are dismantling the moats that made vertical SaaS defensible, and why the market selloff is structurally justified but temporally exaggerated

via Techmeme 👤 X 📅 2026-02-17

⚡ Score: 6.3

🛠️ TOOLS

Figma and Anthropic partner to launch Code to Canvas, letting users import code generated in Claude Code directly into Figma as editable designs

via Techmeme 👤 Cnbc 📅 2026-02-17

⚡ Score: 6.3

💰 FUNDING

Anthropic Raised $30B. Where Does It Go?

via HackerNews 👤 heavymemory 📅 2026-02-16

🔺 1 pts ⚡ Score: 6.2

🛠️ SHOW HN

Show HN: Claude Pilot – Claude Code is powerful. Pilot makes it reliable

via HackerNews 👤 rittermax 📅 2026-02-17

🔺 1 pts ⚡ Score: 6.2

🛡️ SAFETY

Ask HN: What are the biggest limitations of agentic AI in real-world workflows?

via HackerNews 👤 aadarshkumaredu 📅 2026-02-16

🔺 2 pts ⚡ Score: 6.2

🛠️ SHOW HN

Show HN: Agent Forge – Persistent memory and desktop automation for Claude Code

via HackerNews 👤 WeberG619 📅 2026-02-17

🔺 3 pts ⚡ Score: 6.1

🔒 SECURITY

Race for AI is making Hindenburg-style disaster a real risk, says leading expert

via HackerNews 👤 trusche 📅 2026-02-17

🔺 7 pts ⚡ Score: 6.1

🔒 SECURITY

Agent Skills Hub – Security first directory for AI agent skills and MCP

via HackerNews 👤 cana2026 📅 2026-02-17

🔺 1 pts ⚡ Score: 6.1

🔬 RESEARCH

How cyborg propaganda reshapes collective action

via Arxiv 👤 Jonas R. Kunst, Kinga Bierwiaczonek, Meeyoung Cha et al. 📅 2026-02-13

⚡ Score: 6.1

"The distinction between genuine grassroots activism and automated influence operations is collapsing. While policy debates focus on bot farms, a distinct threat to democracy is emerging via partisan coordination apps and artificial intelligence-what we term 'cyborg propaganda.' This architecture com..."

🔬 RESEARCH

R-Diverse: Mitigating Diversity Illusion in Self-Play LLM Training

via Arxiv 👤 Gengsheng Li, Jinghan He, Shijie Wang et al. 📅 2026-02-13

⚡ Score: 6.1

"Self-play bootstraps LLM reasoning through an iterative Challenger-Solver loop: the Challenger is trained to generate questions that target the Solver's capabilities, and the Solver is optimized on the generated data to expand its reasoning skills. However, existing frameworks like R-Zero often exhi..."

Stories from February 17, 2026

Qwen3.5 Model Release

Claude Sonnet 4.6 Launch

📡 AI NEWS BUT ACTUALLY GOOD

OpenAI Mission Statement Change