πŸš€ WELCOME TO METAMESH.BIZ +++ OpenClaw just leaked 1.5M API keys including OpenAI tokens (someone forgot to check their .env files again) +++ Anthropic quietly shipping Claude for Government while still refusing Pentagon contracts (found it buried in the desktop binary like easter eggs for nerds) +++ GLM-OCR runs on literal potatoes at 0.9B params because who needs GPUs when you have determination +++ Software engineering job titles allegedly dying by 2026 says the guy who built Claude Code (bold prediction from someone whose product needs engineers to debug it) +++ THE FUTURE IS FEDSTART.COM AND YOUR MACBOOK AIR READING RECEIPTS +++ πŸš€ β€’
AI Signal - PREMIUM TECH INTELLIGENCE
πŸ“Ÿ Optimized for Netscape Navigator 4.0+
πŸ“š HISTORICAL ARCHIVE - February 18, 2026
What was happening in AI on 2026-02-18
← Feb 17 πŸ“Š TODAY'S NEWS πŸ“š ARCHIVE Feb 19 →
πŸ“Š You are visitor #47291 to this AWESOME site! πŸ“Š
Archive from: 2026-02-18 | Preserved for posterity ⚑

Stories from February 18, 2026

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
πŸš€ HOT STORY

Claude Sonnet 4.6 Launch

+++ Sonnet 4.6 hits Opus-adjacent performance at Sonnet prices with a 1M token context window, proving that iterative releases can actually deliver on their hype. +++

Anthropic launches Claude Sonnet 4.6 with improvements in coding, consistency, and more, for Free and Pro users; it features a 1M token context window in beta

πŸ› οΈ TOOLS

Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File

πŸ’¬ HackerNews Buzz: 16 comments 🐝 BUZZING
🎯 Local vector search β€’ Multimodal content indexing β€’ Concurrency and determinism
πŸ’¬ "SQLite of RAG -- import a library, open a file, query" β€’ "Zero dependencies on cloud infrastructure"
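The "SQLite of RAG" pitch boils down to brute-force nearest-neighbor search over an in-memory embedding matrix: no server, no API, just vectors and cosine similarity. A toy sketch of the idea (pure Python; illustrative only, not the linked project's actual API — `TinyVectorIndex` and the example vectors are made up here):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class TinyVectorIndex:
    """Brute-force in-memory vector index: add (id, vector), query top-k."""
    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def add(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def query(self, vector, k=3):
        # Score every stored vector, highest similarity first.
        scored = [(cosine(vector, v), doc_id) for doc_id, v in self.items]
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:k]]

idx = TinyVectorIndex()
idx.add("apple", [1.0, 0.1, 0.0])
idx.add("orange", [0.9, 0.2, 0.1])
idx.add("laptop", [0.0, 0.1, 1.0])
print(idx.query([1.0, 0.0, 0.0], k=2))  # the two fruit vectors rank first
```

Real engines swap the linear scan for an ANN structure (HNSW, IVF) once the corpus outgrows a few hundred thousand vectors, but on Apple Silicon a brute-force scan over a memory-mapped file is plausibly sub-millisecond at modest scale.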
πŸ”’ SECURITY

OpenClaw leaked 1.5M API tokens including OpenAI keys β€” full security breakdown

πŸ’¬ Reddit Discussion: 75 comments 😐 MID OR MIXED
🎯 Security Concerns β€’ Code Quality β€’ Open Platform
πŸ’¬ "Your platform has no security vulnerabilities." β€’ "It's not broken. It's wide open to new contacts and sharing."
🏒 BUSINESS

Thousands of CEOs just admitted AI had no impact on employment or productivity

πŸ’¬ HackerNews Buzz: 345 comments πŸ‘ LOWKEY SLAPS
πŸ”’ SECURITY

I found Claude for Government buried in the Claude Desktop binary. Here's what Anthropic built, how it got deployed, and the line they're still holding against the Pentagon.

"https://aaddrick.com/blog/claude-for-government-the-last-lab-standing Pulled the Claude Desktop binary the same day it shipped and confirmed it in code. Anthropic's government deployment mode showed up on their status tracker February 17th. Traffic routes to claude.fedstart.com, authentication goes..."
πŸ’¬ Reddit Discussion: 37 comments 🐝 BUZZING
🎯 AI Writing Criticism β€’ Anthropic's Ethical Stance β€’ Contractual Obligations
πŸ’¬ "Don't use chatgpt to rewrite your posts, it's unbearable to read" β€’ "Anthropic is under no obligation to violate their own service offering agreement"
πŸ”¬ RESEARCH

The Geometry of Alignment Collapse: When Fine-Tuning Breaks Safety

"Fine-tuning aligned language models on benign tasks unpredictably degrades safety guardrails, even when training data contains no harmful content and developers have no adversarial intent. We show that the prevailing explanation, that fine-tuning updates should be orthogonal to safety-critical direc..."
πŸ›‘οΈ SAFETY

Over 100 researchers from Johns Hopkins, Oxford, and more call for guardrails on some infectious disease datasets that could enable AI to design deadly viruses

πŸ”¬ RESEARCH

BFS-PO: Best-First Search for Large Reasoning Models

"Large Reasoning Models (LRMs) such as OpenAI o1 and DeepSeek-R1 have shown excellent performance in reasoning tasks using long reasoning chains. However, this has also led to a significant increase of computational costs and the generation of verbose output, a phenomenon known as overthinking. The t..."
πŸ€– AI MODELS

I trained a language model on CPU in 1.2 hours with no matrix multiplications β€” here's what I learned

"Hey all. I've been experimenting with tiny matmul-free language models that can be trained and run entirely on CPU. Just released the model. Model:Β https://huggingface.co/changcheng967/flashlm-v3-13m Quick stats: * 13.6M parameters, d\_model=..."
πŸ’¬ Reddit Discussion: 66 comments 🐝 BUZZING
🎯 Sparse backpropagation algorithms β€’ Efficient training of neural networks β€’ Scaling up model size and compute
πŸ’¬ "SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks" β€’ "I'd almost rather scale it to 4x the size or so for your active params"
πŸ”¬ RESEARCH

Emergently Misaligned Language Models Show Behavioral Self-Awareness That Shifts With Subsequent Realignment

"Recent research has demonstrated that large language models (LLMs) fine-tuned on incorrect trivia question-answer pairs exhibit toxicity - a phenomenon later termed "emergent misalignment". Moreover, research has shown that LLMs possess behavioral self-awareness - the ability to describe learned beh..."
πŸ”¬ RESEARCH

Boundary Point Jailbreaking of Black-Box LLMs

"Frontier LLMs are safeguarded against attempts to extract harmful information via adversarial prompts known as "jailbreaks". Recently, defenders have developed classifier-based systems that have survived thousands of hours of human red teaming. We introduce Boundary Point Jailbreaking (BPJ), a new c..."
πŸ› οΈ SHOW HN

Show HN: Continue – Source-controlled AI checks, enforceable in CI

πŸ’¬ HackerNews Buzz: 5 comments 🐝 BUZZING
πŸ› οΈ TOOLS

Claude web search now writes & executes code before tool results reach the context window

"This is a deeper change than it looks. **Previously:** User → Claude → Tool call → Claude reads result → decides next step. **Now:** User → Claude writes code → that code calls tools → processes / filters results → may call tools multiple times → returns structured output to Claude. This means tool..."
πŸ’¬ Reddit Discussion: 4 comments 😐 MID OR MIXED
🎯 User experience β€’ Token usage β€’ Programmatic functionality
πŸ’¬ "How does it translate to end user experience?" β€’ "Do keep in mind that Opus spends 20% MORE tokens"
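The flow described above — the model emits code, that code calls tools repeatedly and filters the noise, and only a distilled structure re-enters the context window — can be sketched as a plain orchestration function. Everything below is illustrative (the tool, corpus, and function names are invented here, not Anthropic's actual API):

```python
def web_search(query):
    # Stand-in for a real search tool: returns raw, verbose results,
    # some of which are noise the model never needs to see.
    corpus = {
        "glm-ocr": ["GLM-OCR is a 0.9B multimodal OCR model",
                    "noise: unrelated listicle"],
        "sonnet": ["Sonnet 4.6 has a 1M-token context window",
                   "noise: old changelog"],
    }
    return corpus.get(query, [])

def model_written_code(queries):
    """The code the model writes: calls the tool once per query,
    drops noise before it ever reaches the context window, and
    returns only structured output."""
    keep = []
    for q in queries:
        for hit in web_search(q):          # tool called multiple times
            if hit.startswith("noise:"):
                continue                    # filtered pre-context
            keep.append({"query": q, "snippet": hit})
    return keep

results = model_written_code(["glm-ocr", "sonnet"])
print(len(results))  # only the filtered snippets survive
```

The token-saving claim follows directly: in the old flow every raw tool result is billed into the context; in this flow only `results` is.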
πŸ€– AI MODELS

What is happening to writing? Cognitive debt, Claude Code, the space around AI

πŸ€– AI MODELS

model: support GLM-OCR by ngxson Β· Pull Request #19677 Β· ggml-org/llama.cpp

"tl;dr **0.9B OCR model (you can run it on any potato)** # Introduction GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. It introduces Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve tra..."
πŸ”¬ RESEARCH

Composition-RL: Compose Verifiable Prompts for Reinforcement Learning of LLMs

πŸ€– AI MODELS

Car Wash Test on 53 leading models: “I want to wash my car. The car wash is 50 meters away. Should I walk or drive?”

"I asked 53 leading AI models the question: **"I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"** Obviously, you need to drive because the car needs to be at the car wash. The funniest part: Perplexity's sonar and sonar-pro got the right answer for completely insan..."
πŸ’¬ Reddit Discussion: 166 comments 😐 MID OR MIXED
🎯 AI model performance β€’ Critique of AI models β€’ Irony of driving to get a car washed
πŸ’¬ "I cannot take this post seriously after seeing that as the first pass" β€’ "Gemini flash lite 2.0 is fine, it did mention the car itself needed to be transported there. But sonar was completely wrong on the reasoning for its answer."
πŸ› οΈ TOOLS

Firecracker "job receipts" for metering and auditing LLM agent runs

πŸ”’ SECURITY

Manipulating AI memory for profit: The rise of AI Recommendation Poisoning

πŸ› οΈ SHOW HN

Show HN: KrillClaw – 49KB AI agent runtime in Zig for $3 microcontrollers

⚑ BREAKTHROUGH

Graph Wiring: speed, accuracy, RAG-focused

πŸ› οΈ SHOW HN

Show HN: Raypher – a Rust-Based Kernel Driver to Sandbox "Bare Metal" AI Agents

βš–οΈ ETHICS

An AI Agent Published a Hit Piece on Me – Forensics and More Fallout

πŸ’¬ HackerNews Buzz: 29 comments 😐 MID OR MIXED
🎯 AI autonomy β€’ Open-source developer backlash β€’ Journalistic integrity
πŸ’¬ "I think Ars is already breaking the way our media is meant to work" β€’ "We need laws for agents, specifically that their human-maintainers must be identifiable"
πŸ”¬ RESEARCH

Long Context, Less Focus: A Scaling Gap in LLMs Revealed through Privacy and Personalization

"Large language models (LLMs) are increasingly deployed in privacy-critical and personalization-oriented scenarios, yet the role of context length in shaping privacy leakage and personalization effectiveness remains largely unexplored. We introduce a large-scale benchmark, PAPerBench, to systematical..."
πŸ”’ SECURITY

OpenAI quietly removed "safely" and "no financial motive" from its mission

"Old IRS 990: "build AI that safely benefits humanity, unconstrained by need to generate financial return"..."
πŸ€– AI MODELS

Anthropic's Claude Code creator predicts software engineering title will start to 'go away' in 2026

"Software engineers are increasingly relying on AI agents to write code. Boris Cherny, creator of Claude Code, said in an interview on Y Combinator's podcast that AI has "**practically solved**" coding, and that software engineers will take on different tasks beyond coding..."
πŸ’¬ Reddit Discussion: 161 comments πŸ‘ LOWKEY SLAPS
🎯 Displeasure with "10x" rhetoric β€’ Skepticism of management motives β€’ Concerns over AI/automation
πŸ’¬ "any company that is/was actually using this as an excuse to downsize has no future prospects" β€’ "When will these people develop to the next phase"
πŸ”¬ RESEARCH

How Anthropic evaluated computer use models

πŸ”¬ RESEARCH

GLM-5: from Vibe Coding to Agentic Engineering

"We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference costs while maintain..."
πŸ› οΈ SHOW HN

Show HN: Persistent memory for Claude Code with self-hosted Qdrant and Ollama

πŸ€– AI MODELS

The gap between AI demos and enterprise usage is wider than most people think

"I work on AI deployment inside my company, and the gap between what AI looks like in a polished demo… and what actually happens in real life? I think about that a lot. Here’s what I keep running into. First, the tool access issue. Companies roll out M365 Copilot licenses across the organization an..."
πŸ’¬ Reddit Discussion: 39 comments 🐝 BUZZING
🎯 AI adoption β€’ Enterprise AI rollouts β€’ AI writing quality
πŸ’¬ "At best it has some ability to kinda go through corporate documents" β€’ "if you do not know what good looks like for your workflows, you definitely can not tell if AI is helping"
πŸ› οΈ SHOW HN

Show HN: We Built an 8-Agent AI Team in Two Weeks

πŸ”¬ RESEARCH

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories

"Maintaining spatial world consistency over long horizons remains a central challenge for camera-controllable video generation. Existing memory-based approaches often condition generation on globally reconstructed 3D scenes by rendering anchor videos from the reconstructed geometry in the history. Ho..."
πŸ”¬ RESEARCH

Operationalising the Superficial Alignment Hypothesis via Task Complexity

"The superficial alignment hypothesis (SAH) posits that large language models learn most of their knowledge during pre-training, and that post-training merely surfaces this knowledge. The SAH, however, lacks a precise definition, which has led to (i) different and seemingly orthogonal arguments suppo..."
πŸ”¬ RESEARCH

A Geometric Analysis of Small-sized Language Model Hallucinations

"Hallucinations -- fluent but factually incorrect responses -- pose a major challenge to the reliability of language models, especially in multi-step or agentic settings. This work investigates hallucinations in small-sized LLMs through a geometric perspective, starting from the hypothesis that whe..."
πŸ› οΈ SHOW HN

Show HN: OpenCastor – A universal runtime connecting AI models to robot hardware

πŸ€– AI MODELS

Snapdragon INT8 Model Accuracy Variance

+++ Identical INT8 models across Snapdragon chips show accuracy swings from 91.8% to 71%, suggesting either runtime implementations vary wildly or someone's got a calibration problem worth investigating. +++

[D] We tested the same INT8 model on 5 Snapdragon chipsets. Accuracy ranged from 93% to 71%. Same weights, same ONNX file.

"We've been doing on-device accuracy testing across multiple Snapdragon SoCs and the results have been eye-opening. Same model. Same quantization. Same ONNX export. Deployed to 5 different chipsets:

|Device|Accuracy|
|:-|:-|
|Snapdragon 8 Gen 3|91.8%|
|Snapdragon 8 Gen 2|89.1%|
|Snapdragon 7s Gen 2..."
πŸ’¬ Reddit Discussion: 28 comments 😐 MID OR MIXED
🎯 Mobile chipset performance β€’ Quantization issues β€’ Deployment-aware training
πŸ’¬ "This problem occurs not only for Snapdragons, but also for other mobile/embedded chipsets." β€’ "The fun part is that the vendors usually hide from you (looking at you, Apple), which ops are native integer supported and which ones use fake quantization."
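One concrete way identical INT8 weights can score differently per chip, as the thread suggests: runtimes disagree on low-level quantization details such as rounding mode, or silently fall back to "fake" quantization for unsupported ops. A toy illustration of how rounding choice alone shifts the recovered values (pure Python; the scale and weights are made-up numbers, not from the post):

```python
def quantize(x, scale, rounding="nearest"):
    """Map a float to int8 with a given scale and rounding mode."""
    q = x / scale
    if rounding == "nearest":
        q = round(q)
    elif rounding == "truncate":
        q = int(q)  # some kernels truncate toward zero instead
    return max(-128, min(127, q))

def dequantize(q, scale):
    return q * scale

scale = 0.05
weights = [0.123, -0.087, 0.061]

for mode in ("nearest", "truncate"):
    recovered = [dequantize(quantize(w, scale, mode), scale) for w in weights]
    err = sum(abs(a - b) for a, b in zip(weights, recovered))
    print(mode, round(err, 4))  # truncation accumulates more error
```

Multiply tiny per-weight discrepancies like these across millions of parameters and dozens of layers, and a few points of end-to-end accuracy drift between runtimes stops being surprising — which is the calibration problem the summary hints at.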
πŸš€ STARTUP

Dreamer, founded by former Stripe CTO David Singleton, Hugo Barra, and others, launches in beta to let technical and non-technical users build agentic AI apps

πŸ’° FUNDING

World Labs $1B Funding Round

+++ World Labs snagged a billion from A16Z, Nvidia, AMD, Autodesk and others to build world models for robotics and science, which is either visionary or the most expensive bet that simulation beats reality. +++

Fei-Fei Li's World Labs raised $1B from A16Z, Nvidia to advance its world models

πŸ’¬ HackerNews Buzz: 15 comments πŸ‘ LOWKEY SLAPS
🎯 World models β€’ Video generation β€’ Problem-solution fit
πŸ’¬ "the current approach for world labs is likely based on the expertise of the founders, but I don't see how it can scale and match what genie 3 does" β€’ "I am not trying to be mean but this does not smell right to me, getting a solution too early for a problem vibes"
πŸ”’ SECURITY

Microsoft confirms a bug that let Microsoft 365 Copilot summarize confidential emails from Sent Items and Drafts folders, and deployed a fix in early February

πŸ”¬ RESEARCH

Symmetry in language statistics shapes the geometry of model representations

"Although learned representations underlie neural networks' success, their fundamental properties remain poorly understood. A striking example is the emergence of simple geometric structures in LLM representations: for example, calendar months organize into a circle, years form a smooth one-dimension..."
πŸ”¬ RESEARCH

A Content-Based Framework for Cybersecurity Refusal Decisions in Large Language Models

"Large language models and LLM-based agents are increasingly used for cybersecurity tasks that are inherently dual-use. Existing approaches to refusal, spanning academic policy frameworks and commercially deployed systems, often rely on broad topic-based bans or offensive-focused taxonomies. As a res..."
πŸ”¬ RESEARCH

CrispEdit: Low-Curvature Projections for Scalable Non-Destructive LLM Editing

"A central challenge in large language model (LLM) editing is capability preservation: methods that successfully change targeted behavior can quietly game the editing proxy and corrupt general capabilities, producing degenerate behaviors reminiscent of proxy/reward hacking. We present CrispEdit, a sc..."
πŸ”¬ RESEARCH

Overthinking Loops in Agents: A Structural Risk via MCP Tools

"Tool-using LLM agents increasingly coordinate real workloads by selecting and chaining third-party tools based on text-visible metadata such as tool names, descriptions, and return messages. We show that this convenience creates a supply-chain attack surface: a malicious MCP tool server can be co-re..."
πŸ€– AI MODELS

Cohere releases Tiny Aya, a family of 3.35B-parameter open-weight models supporting 70+ languages for offline use, trained on a single cluster of 64 H100 GPUs

πŸ”¬ RESEARCH

The Potential of CoT for Reasoning: A Closer Look at Trace Dynamics

"Chain-of-thought (CoT) prompting is a de-facto standard technique to elicit reasoning-like responses from large language models (LLMs), allowing them to spell out individual steps before giving a final answer. While the resemblance to human-like reasoning is undeniable, the driving forces underpinni..."
πŸ”¬ RESEARCH

OpenAI and Paradigm announce EVMbench, a benchmark that measures how well AI agents can detect, exploit, and patch high-severity smart contract vulnerabilities

πŸ”¬ RESEARCH

Scaling Beyond Masked Diffusion Language Models

"Diffusion language models are a promising alternative to autoregressive models due to their potential for faster generation. Among discrete diffusion approaches, Masked diffusion currently dominates, largely driven by strong perplexity on language modeling benchmarks. In this work, we present the fi..."
πŸ› οΈ SHOW HN

Show HN: Beautiful interactive explainers generated with Claude Code

πŸ’¬ HackerNews Buzz: 22 comments 🐝 BUZZING
🎯 LLM-generated content β€’ Authenticity of text β€’ Impressive visualizations
πŸ’¬ "LLM generated 'Show HN' posts should be moved to another thread" β€’ "Kinda funny, because on the surface it looks really pretty, but if you dig a little deeper the flaws emerge"
πŸ”¬ RESEARCH

This human study did not involve human subjects: Validating LLM simulations as behavioral evidence

"A growing literature uses large language models (LLMs) as synthetic participants to generate cost-effective and nearly instantaneous responses in social science experiments. However, there is limited guidance on when such simulations support valid inference about human behavior. We contrast two stra..."
πŸ› οΈ TOOLS

What tech stack Claude Code defaults to when building apps

⚑ BREAKTHROUGH

The next era of AI is not LLMs, it's Energy-Based Models (EBMs)

πŸ€– AI MODELS

FlashLM v4: 4.3M ternary model trained on CPU in 2 hours β€” coherent stories from adds and subtracts only

"Back with v4. Some of you saw v3 β€” 13.6M params, ternary weights, trained on CPU, completely incoherent output. Went back to the drawing board and rebuilt everything from scratch. **What it is:** 4.3M parameter language model where every weight in the model body is -1, 0, or +1. Trained for 2 hour..."
πŸ’¬ Reddit Discussion: 20 comments 🐝 BUZZING
🎯 Low-resource language models β€’ Ternary weight models β€’ Frequency-based tokenization
πŸ’¬ "ternary weights mean inference is just adds and subtracts" β€’ "covers 99.9% of TinyStories tokens"
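The "inference is just adds and subtracts" comment is literal: with every weight restricted to {-1, 0, +1}, a matrix–vector product needs no multiplications at all. A minimal sketch of the idea (not the FlashLM code; a real implementation would pack weights into bitplanes rather than loop in Python):

```python
def ternary_matvec(W, x):
    """y = W @ x where every W[i][j] is -1, 0, or +1.
    Each output element is just a running sum of added
    and subtracted inputs -- no multiplies."""
    y = []
    for row in W:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # add
            elif w == -1:
                acc -= xi      # subtract
            # w == 0 contributes nothing: the weight is pruned
        y.append(acc)
    return y

W = [[1, -1, 0],
     [0, 1, 1]]
x = [2.0, 3.0, 5.0]
print(ternary_matvec(W, x))  # [-1.0, 8.0]
```

On CPU this is attractive because integer adds are cheap and the zero weights skip work entirely; the open question the thread raises is whether the quality lost per parameter is worth scaling the parameter count back up.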
πŸ”¬ RESEARCH

Efficient Sampling with Discrete Diffusion Models: Sharp and Adaptive Guarantees

"Diffusion models over discrete spaces have recently shown striking empirical success, yet their theoretical foundations remain incomplete. In this paper, we study the sampling efficiency of score-based discrete diffusion models under a continuous-time Markov chain (CTMC) formulation, with a focus on..."
🎨 CREATIVE

Google rolls out Lyria 3, a generative music model that can make 30-second tracks with Nano Banana-made cover art, in beta in the Gemini app in eight languages

πŸ› οΈ SHOW HN

Show HN: TokenMeter – Open-source observability layer for LLM token costs

πŸ› οΈ TOOLS

Figma and Anthropic partner to launch Code to Canvas, letting users import code generated in Claude Code directly into Figma as editable designs

πŸ› οΈ TOOLS

Update from Anthropic regarding the Agent SDK.

πŸ’¬ Reddit Discussion: 11 comments 😐 MID OR MIXED
🎯 Allowed vs. Prohibited Uses β€’ SDK Implementation Clarity β€’ Community Engagement
πŸ’¬ "they really should simply show a table showing allowed vs prohibited use" β€’ "We absolutely should be allowed to use OAuth tokens for this stuff"
πŸ”’ SECURITY

Kernel-enforced sandbox App and SDK for AI agents, MCP and LLM workloads

πŸ› οΈ TOOLS

Major Claude Code policy clarification from Anthropic

"Source: https://code.claude.com/docs/en/legal-and-compliance#authentication-and-credential-use..."
πŸ’¬ Reddit Discussion: 73 comments 😐 MID OR MIXED
🎯 Unsustainable pricing models β€’ API usage restrictions β€’ Competitor platforms
πŸ’¬ "Becoming exceedingly clear how much the current landscape is propped up with subsidized pricing" β€’ "more reason to use Codex I guess"
🏒 BUSINESS

Sam Altman Says OpenAI’s Next Big Push Is Personal Agents After Hiring OpenClaw Creator

πŸ’¬ Reddit Discussion: 36 comments 😐 MID OR MIXED
🎯 Paid subscriptions β€’ Vulnerabilities as a Service β€’ Windows vs. Unix/Linux
πŸ’¬ "Just tell us if we can use our paid subscriptions through oAuth with OpenClaw" β€’ "I, for one, cant wait for the VaaS revolution (Vulnerabilities as a Service)"
🏒 BUSINESS

Meta commits to a multiyear deal to buy Nvidia chips, including Vera Rubin; source: Meta's in-house chip strategy had suffered technical challenges and delays

🏒 BUSINESS

Anthropic expects to pay Amazon, Google, and Microsoft $80B+ total to run its models on their servers through 2029, plus an additional $100B for training costs

πŸ› οΈ SHOW HN

Show HN: OpenClaw – Open-source personal AI agent that lives on your machine

πŸ› οΈ TOOLS

[P] I just launched an open-source framework to help researchers *responsibly* and *rigorously* harness frontier LLM coding assistants for rapidly accelerating data analysis. I genuinely think this ch

"Hello! If you don't know me, my name is Brian Heseung Kim (@brhkim in most places). I have been at the frontier of finding rigorous, careful, and auditable ways of using LLMs and their predecessors in social science research since roughly 2018, when I thought: hey, machine learning seems like kind o..."
πŸ› οΈ SHOW HN

Show HN: Sieves, a unified interface for structured document AI

πŸ› οΈ SHOW HN

Show HN: GhostTrace – See rejected decisions in AI agents

πŸ”’ SECURITY

Babel – Captchas for AI

πŸ”’ SECURITY

The Problem with AI Agents Isn't Identity, It's Authorization

πŸ”¬ RESEARCH

CMind: An AI Agent for Localizing C Memory Bugs

πŸ€– AI MODELS

Alibaba's new Qwen3.5-397B-A17B is the #3 open weights model in the Artificial Analysis Intelligence Index

πŸ’¬ Reddit Discussion: 42 comments πŸ‘ LOWKEY SLAPS
🎯 Model Efficiency β€’ Model Capability Comparison β€’ Real-World Performance
πŸ’¬ "The efficiency of Qwen 3.5 is actually insane." β€’ "Benchmarks don't mean squat. It's if the AI can actually code."
πŸ› οΈ TOOLS

Ask HN: Are we missing a middleware layer between LLM agents and the web?

πŸ¦†
HEY FRIENDO
CLICK HERE IF YOU WOULD LIKE TO JOIN MY PROFESSIONAL NETWORK ON LINKEDIN
🀝 LETS BE BUSINESS PALS 🀝