🚀 WELCOME TO METAMESH.BIZ +++ Anthropic hits $380B valuation while some guy saved 89% on Claude tokens with a Rust proxy (capitalism finds a way) +++ Google's Gemini 3 Deep Think now solving actual science problems for select researchers who probably signed seventeen NDAs +++ Karpathy drops 243 lines of pure Python that trains GPT because dependencies are for mortals +++ 15% of OpenClaw community skills contain malicious instructions but sure let's give agents more autonomy +++ THE FUTURE RUNS ON EXPOSED ENDPOINTS AND VENTURE CAPITAL +++ •
AI Signal - PREMIUM TECH INTELLIGENCE
📟 Optimized for Netscape Navigator 4.0+
📊 You are visitor #55374 to this AWESOME site! 📊
Last updated: 2026-02-13 | Server uptime: 99.9% ⚡

Today's Stories

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚡ BREAKTHROUGH

GPT-5 outperforms federal judges in legal reasoning experiment

💬 HackerNews Buzz: 186 comments 👍 LOWKEY SLAPS
🎯 Judicial discretion • AI legal interpretation • Limitations of AI judges
💬 "Even the simplest slip-and-falls can throw weird curveballs""But when the law needs to evolve or change, we cannot put judicial power in the hands of an unappointed and unaccountable piece of software"
🛠️ TOOLS

I saved 10M tokens (89%) on my Claude Code sessions with a CLI proxy

"I built rtk (Rust Token Killer), a CLI proxy that sits between Claude Code and your terminal commands. The problem: Claude Code sends raw command output to the LLM context. Most of it is noise — passing tests, verbose logs, status bars. You're paying tokens for output Claude doesn't need. What..."
💬 Reddit Discussion: 78 comments 🐝 BUZZING
🎯 LLM usage impact • Whole-conversation usage • Formatting impact
💬 "There's a strangeness tax with LLMs, and it can be substantial.""The idea seems interesting. It was a wall of text before in a code wrapper, now it's good"
🤖 AI MODELS

Google updates Gemini 3 Deep Think to better solve modern science, research, and engineering challenges and expands it via the Gemini API to some researchers

🤖 AI MODELS

Cache-aware prefill–decode disaggregation – 40% faster long-context LLM serving

⚡ BREAKTHROUGH

SotA ARC-AGI-2 Results with REPL Agents

🔒 SECURITY

[D] We scanned 18,000 exposed OpenClaw instances and found 15% of community skills contain malicious instructions

"I do security research and recently started looking at autonomous agents after OpenClaw blew up. What I found honestly caught me off guard. I knew the ecosystem was growing fast (165k GitHub stars, 60k Discord members) but the actual numbers are worse than I expected. We identified over 18,000 Open..."
💬 Reddit Discussion: 8 comments 👍 LOWKEY SLAPS
🎯 Malicious AI skills • Credential theft • Community-contributed content security
💬 "Malicious instructions in that context = malicious output""15% is a lot. Security scanning should be table stakes"
🤖 AI MODELS

Train and inference GPT in 243 lines of pure Python

+++ Andrej Karpathy stripped GPT training and inference down to bare Python, demonstrating that much of the ML stack's complexity is optional theater for practitioners willing to understand fundamentals. +++

Train and inference GPT in 243 lines of pure, dependency-free Python by Karpathy
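Karpathy's 243 lines aren't reproduced here, but "pure, dependency-free Python" is the whole pitch, so here's a toy taste of what that looks like: single-head scaled dot-product attention with nothing but the standard library. Not his code, just the flavor:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Single-head scaled dot-product attention over plain Python lists.
    q, k, v are lists of token vectors (lists of floats)."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[t] for w, vj in zip(weights, v))
                    for t in range(len(v[0]))])
    return out

# Two tokens, two dims; each query attends mostly to its matching key.
print(attention([[2.0, 0.0], [0.0, 2.0]],
                [[2.0, 0.0], [0.0, 2.0]],
                [[1.0, 2.0], [3.0, 4.0]]))
```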

🤖 AI MODELS

MiniMax M2.5 Model Release

+++ MiniMax's latest model undercuts Claude Opus by 33x on price while matching quality, with weights heading to HuggingFace. The commoditization of capable AI just got a whole lot more real. +++

MiniMax releases M2.5, claiming the model delivers on the “intelligence too cheap to meter” promise, priced at $0.30/1M input tokens and $1.20/1M output tokens
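Back-of-envelope on the quoted prices (the 10:1 input:output mix below is a made-up workload, not MiniMax's):

```python
# List prices from the story; the workload mix is a hypothetical example.
IN_PRICE, OUT_PRICE = 0.30, 1.20          # dollars per 1M tokens
in_tok, out_tok = 10_000_000, 1_000_000   # hypothetical monthly usage

cost = in_tok / 1e6 * IN_PRICE + out_tok / 1e6 * OUT_PRICE
print(f"M2.5 monthly bill: ${cost:.2f}")          # $4.20
print(f"same workload at 33x: ${cost * 33:.2f}")  # $138.60, per the claimed gap
```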

🎭 MULTIMODAL

Ming-flash-omni-2.0: 100B MoE (6B active) omni-modal model - unified speech/SFX/music generation

"Ant Group just open-sourced Ming-flash-omni-2.0, a true (omni-modal) model: image + text + video + audio input → image + text + audio output, all in one unified architecture. Looks realy interesting. ..."
💬 Reddit Discussion: 14 comments 🐝 BUZZING
🎯 Inclusion models • Router support • Alibaba connections
💬 "Wish these interesting inclusion models were _included_ in Open Router.""Is this another lab under AliBaba?"
🛠️ TOOLS

Launch HN: Omnara (YC S25) – Run Claude Code and Codex from anywhere

💬 HackerNews Buzz: 100 comments 🐝 BUZZING
🎯 Mobile Access to Dev Servers • Remote Sandbox Concerns • Pricing and Tiers
💬 "I've been SSHing into my dev server off of my phone to run Claude Code while commuting""For those of us that are using subscriptions, does it show our remaining usage?"
🛠️ SHOW HN

Show HN: 20+ Claude Code agents coordinating on real work (open source)

💬 HackerNews Buzz: 30 comments 🐝 BUZZING
🎯 Agent coordination • Decision boundaries • Multi-agent systems
💬 "At some point the interesting question isn't whether one agent or twenty agents can coordinate better, but which decisions we're comfortable fully delegating versus which ones feel like they need a human checkpoint.""I'm curious how people here think about where that boundary should sit — especially for tasks that have real downstream consequences."
🔒 SECURITY

Google identifies over 100k prompts used in distillation attacks

🤖 AI MODELS

GPT-5.3-Codex-Spark Release

+++ OpenAI quietly shipped a faster, leaner Codex variant on Cerebras chips, proving you don't need the market leader's silicon to move code generation from "theoretical" to "actually useful" for paying customers. +++

GPT-5.3-Codex-Spark is OpenAI's first AI model to run on chips from Nvidia rival Cerebras; OpenAI says Codex has more than 1M weekly active users

🤖 AI MODELS

Multiple responses from DeepSeek's namesake chatbot confirm that the startup has expanded the context window of its flagship AI model from 128K tokens to 1M+

🔒 SECURITY

The New Social Engineering: Prompt Injection Attacks Are Targeting AI Agents

💰 FUNDING

Anthropic Series G Funding

+++ Anthropic closed a $30B Series G at $380B valuation, proving that massive funding rounds and astronomical valuations remain the AI industry's most reliable product launch. +++

Anthropic raises $30B in Series G funding at $380B post-money valuation

💬 HackerNews Buzz: 150 comments 😐 MID OR MIXED
🎯 AI competition • Valuation bubble • Corporate dominance
💬 "then everyone loses all their money""glorious end of capitalism"
🔮 FUTURE

Anyone feel everything has changed over the last two weeks?

"Things have suddenly become incredibly unsettling. We have automated so many functions at my work… in a couple of afternoons. We have developed a full and complete stock backtesting suite, a macroeconomic app that sucks in the world’s economic data in real time, compliance apps, a virtual research c..."
💬 Reddit Discussion: 157 comments 🐝 BUZZING
🎯 AI automation • Job displacement • Developer vs. management
💬 "Program your own replacement""People think MBA product manager types are what run companies"
🔬 RESEARCH

Features as Rewards: Scalable Supervision for Open-Ended Tasks via Interpretability

"Language models trained on large-scale datasets have been shown to learn features that encode abstract concepts such as factuality or intent. Such features are traditionally used for test-time monitoring or steering. We present an alternative affordance: features as scalable supervision for open-end..."
🧠 NEURAL NETWORKS

Recursive Language Models: Stop Stuffing the Context Window

🗣️ SPEECH/AUDIO

Hibiki-Zero Speech Translation Model

+++ Simultaneous speech translation just got simpler: Kyutai Labs dropped word alignment requirements entirely, which means less synthetic data nonsense and more actual real time translation that might actually work across language pairs. +++

Hibiki-Zero, real-time speech translation model by Kyutai Labs

"Looks like another banger from Kyutai! Model: https://huggingface.co/kyutai/hibiki-zero-3b-pytorch-bf16 Blog: https://kyutai.org/blog/2026-02-12-hibiki-zero More samples: [https://huggin..."
🛠️ SHOW HN

Show HN: MCP tools do parallelize in Claude Code (study with raw data)

🔬 RESEARCH

FormalJudge: A Neuro-Symbolic Paradigm for Agentic Oversight

"As LLM-based agents increasingly operate in high-stakes domains with real-world consequences, ensuring their behavioral safety becomes paramount. The dominant oversight paradigm, LLM-as-a-Judge, faces a fundamental dilemma: how can probabilistic systems reliably supervise other probabilistic systems..."
🔒 SECURITY

Increasingly, HIPAA Can't Stop AI from De-Anonymizing Patient Data

🔧 INFRASTRUCTURE

I built a P2P network where every CPU becomes an AI inference node (89 tok/s, no GPU)

🔬 RESEARCH

Safety Recovery in Reasoning Models Is Only a Few Early Steering Steps Away

"Reinforcement learning (RL) based post-training for explicit chain-of-thought (e.g., GRPO) improves the reasoning ability of multimodal large-scale reasoning models (MLRMs). But recent evidence shows that it can simultaneously degrade safety alignment and increase jailbreak success rates. We propose..."
🔬 RESEARCH

GraphSeek: Next-Generation Graph Analytics with LLMs

"Graphs are foundational across domains but remain hard to use without deep expertise. LLMs promise accessible natural language (NL) graph analytics, yet they fail to process industry-scale property graphs effectively and efficiently: such datasets are large, highly heterogeneous, structurally comple..."
🔬 RESEARCH

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

"Supervised fine-tuning (SFT) on chain-of-thought data is an essential post-training step for reasoning language models. Standard machine learning intuition suggests that training with more unique training samples yields better generalization. Counterintuitively, we show that SFT benefits from repeti..."
🔬 RESEARCH

In-the-Wild Model Organisms: Mitigating Undesirable Emergent Behaviors in Production LLM Post-Training via Data Attribution

"We propose activation-based data attribution, a method that traces behavioral changes in post-trained language models to responsible training datapoints. By computing activation-difference vectors for both test prompts and preference pairs and ranking by cosine similarity, we identify datapoints tha..."
🛠️ TOOLS

Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy

🔬 RESEARCH

Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

"Recent advances in large language model (LLM) have empowered autonomous agents to perform complex tasks that require multi-turn interactions with tools and environments. However, scaling such agent training is limited by the lack of diverse and reliable environments. In this paper, we propose Agent..."
🔬 RESEARCH

CODE-SHARP: Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs

"Developing agents capable of open-endedly discovering and learning novel skills is a grand challenge in Artificial Intelligence. While reinforcement learning offers a powerful framework for training agents to master complex skills, it typically relies on hand-designed reward functions. This is infea..."
🔬 RESEARCH

Just on Time: Token-Level Early Stopping for Diffusion Language Models

"Diffusion language models generate text through iterative refinement, a process that is often computationally inefficient because many tokens reach stability long before the final denoising step. We introduce a training-free, token-level early stopping approach that identifies convergence independen..."
🔬 RESEARCH

TabICLv2: A better, faster, scalable, and open tabular foundation model

"Tabular foundation models, such as TabPFNv2 and TabICL, have recently dethroned gradient-boosted trees at the top of predictive benchmarks, demonstrating the value of in-context learning for tabular data. We introduce TabICLv2, a new state-of-the-art foundation model for regression and classificatio..."
🔬 RESEARCH

ATTNPO: Attention-Guided Process Supervision for Efficient Reasoning

"Large reasoning models trained with reinforcement learning and verifiable rewards (RLVR) achieve strong performance on complex reasoning tasks, yet often overthink, generating redundant reasoning without performance gains. Existing trajectory-level length penalties often fail to effectively shorten..."
🔬 RESEARCH

ADORA: Training Reasoning Models with Dynamic Advantage Estimation on Reinforcement Learning

"Reinforcement learning has become a cornerstone technique for developing reasoning models in complex tasks, ranging from mathematical problem-solving to imaginary reasoning. The optimization of these models typically relies on policy gradient methods, whose efficacy hinges on the accurate estimation..."
🔬 RESEARCH

DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning

"In the current landscape of Large Language Models (LLMs), the curation of large-scale, high-quality training data is a primary driver of model performance. A key lever is the \emph{data recipe}, which comprises a data processing pipeline to transform raw sources into training corpora. Despite the gr..."
🔬 RESEARCH

Learning to Compose for Cross-domain Agentic Workflow Generation

"Automatically generating agentic workflows -- executable operator graphs or codes that orchestrate reasoning, verification, and repair -- has become a practical way to solve complex tasks beyond what single-pass LLM generation can reliably handle. Yet what constitutes a good workflow depends heavily..."
🔒 SECURITY

Sources: the Pentagon is pushing OpenAI, Anthropic, and others to make their AI tools available on classified networks without the standard user restrictions

🔬 RESEARCH

Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization

"Large Language Models (LLMs) often generate unnecessarily verbose Chain-of-Thought (CoT) reasoning that increases computational costs and latency without proportional performance gains. In this paper, we propose \textbf{F}ine-grained \textbf{G}roup policy \textbf{O}ptimization (\textbf{FGO}), a Rein..."
🔬 RESEARCH

Biases in the Blind Spot: Detecting What LLMs Fail to Mention

"Large Language Models (LLMs) often provide chain-of-thought (CoT) reasoning traces that appear plausible, but may hide internal biases. We call these *unverbalized biases*. Monitoring models via their stated reasoning is therefore unreliable, and existing bias evaluations typically require predefine..."
🔬 RESEARCH

Kunlun: Establishing Scaling Laws for Massive-Scale Recommendation Systems through Unified Architecture Design

"Deriving predictable scaling laws that govern the relationship between model performance and computational investment is crucial for designing and allocating resources in massive-scale recommendation systems. While such laws are established for large language models, they remain challenging for reco..."
🔬 RESEARCH

Embedding Inversion via Conditional Masked Diffusion Language Models

"We frame embedding inversion as conditional masked diffusion, recovering all tokens in parallel through iterative denoising rather than sequential autoregressive generation. A masked diffusion language model is conditioned on the target embedding via adaptive layer normalization, requiring only 8 fo..."
🔬 RESEARCH

GameDevBench: Evaluating Agentic Capabilities Through Game Development

"Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed a..."
🛠️ TOOLS

Running Mistral-7B on Intel NPU — 12.6 tokens/s, zero CPU/GPU usage

"Got tired of my Intel NPU sitting there doing nothing, so I made a simple tool to run LLMs on it. **Benchmarks (Core Ultra, Mistral-7B-int4):** |Device|Decode Speed|TTFT|Memory| |:-|:-|:-|:-| |NPU|12.63 t/s|1.8s|4.8 GB| |CPU|9.04 t/s|1.1s|7.3 GB| |iGPU|23.38 t/s|0.25s|4.1 GB| Yes, iGPU is faster."
💬 Reddit Discussion: 17 comments 🐝 BUZZING
🎯 Local LLM Inference • Energy Efficiency • Model Optimization
💬 "NPU inference at roughly 5W vs keeping an iGPU loaded at 30-40W""I built an app that uses parakeet for live transcription of meetings and summarizes the transcript with Qwen3"
🛡️ SAFETY

AI researchers are sounding the alarm on their way out the door

🛠️ SHOW HN

Show HN: Unpack – a lightweight way to steer Codex/Claude with phased docs

🔬 RESEARCH

Decoupled Reasoning with Implicit Fact Tokens (DRIFT): A Dual-Model Framework for Efficient Long-Context Inference

"The integration of extensive, dynamic knowledge into Large Language Models (LLMs) remains a significant challenge due to the inherent entanglement of factual data and reasoning patterns. Existing solutions, ranging from non-parametric Retrieval-Augmented Generation (RAG) to parametric knowledge edit..."
🔬 RESEARCH

LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations

"Running LLMs with extended reasoning on every problem is expensive, but determining which inputs actually require additional compute remains challenging. We investigate whether their own likelihood of success is recoverable from their internal representations before generation, and if this signal ca..."
🎓 EDUCATION

ai;dr

💬 HackerNews Buzz: 185 comments 👍 LOWKEY SLAPS
🎯 Blending human and AI writing • Authenticity of voice in writing • Evaluating AI-generated content
💬 "If you care about your voice, don't let LLMs write your words.""Semantic information, you see, obeys a contrary calculus to that of physical bits."
⚖️ ETHICS

An AI agent published a hit piece on me

💬 HackerNews Buzz: 493 comments 😐 MID OR MIXED
🎯 AI autonomy and unintended consequences • Challenges in verifying AI authorship • Maintaining open-source projects
💬 "AI agents will accelerate this 1000x. They act approximately like people, but they have absolutely no incentive to maintain a reputation""There's no way to tell which of these scenarios is the truth, and so we're left with spending our time and energy on what happens without being able to trust"
🤖 AI MODELS

Gemini 3 Deep Think

💬 HackerNews Buzz: 257 comments 👍 LOWKEY SLAPS
🎯 AI model performance • AI model development • AI model benchmarking
💬 "Gemini 3 family LLMs are actually giving the best price-to-performance ratio""Try making models that are actually competitive, Google"
🔬 RESEARCH

Weight Decay Improves Language Model Plasticity

"The prevailing paradigm in large language model (LLM) development is to pretrain a base model, then perform further training to improve performance and model behavior. However, hyperparameter optimization and scaling laws have been studied primarily from the perspective of the base model's validatio..."
🏢 BUSINESS

Source: OpenAI disbanded its mission alignment team in recent weeks and transferred its employees; team lead Joshua Achiam will take on a “chief futurist” role

🔬 RESEARCH

Chatting with Images for Introspective Visual Thinking

"Current large vision-language models (LVLMs) typically rely on text-only reasoning based on a single-pass visual encoding, which often leads to loss of fine-grained visual information. Recently the proposal of ''thinking with images'' attempts to alleviate this limitation by manipulating images via..."
🔬 RESEARCH

Diffusion-Pretrained Dense and Contextual Embeddings

"In this report, we introduce pplx-embed, a family of multilingual embedding models that employ multi-stage contrastive learning on a diffusion-pretrained language model backbone for web-scale retrieval. By leveraging bidirectional attention through diffusion-based pretraining, our models capture com..."
🔬 RESEARCH

Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling

"Preference optimization for diffusion and flow-matching models relies on reward functions that are both discriminatively robust and computationally efficient. Vision-Language Models (VLMs) have emerged as the primary reward provider, leveraging their rich multimodal priors to guide alignment. Howeve..."
🛠️ SHOW HN

Show HN: Open-Source Skills for AI Agents

🛠️ SHOW HN

Show HN: AIST – 950-token protocol for preserving AI session state

🔬 RESEARCH

Conformal Prediction Sets for Instance Segmentation

"Current instance segmentation models achieve high performance on average predictions, but lack principled uncertainty quantification: their outputs are not calibrated, and there is no guarantee that a predicted mask is close to the ground truth. To address this limitation, we introduce a conformal p..."
🦆
HEY FRIENDO
CLICK HERE IF YOU WOULD LIKE TO JOIN MY PROFESSIONAL NETWORK ON LINKEDIN
🤝 LETS BE BUSINESS PALS 🤝