🚀 WELCOME TO METAMESH.BIZ +++ DeepSeek accused of "free-riding" US models through distillation while OpenAI clutches pearls at $380B valuation (the irony writes itself) +++ Ming-flash-omni-2.0 drops with 100B parameters doing everything from speech to SFX because why specialize when you can hallucinate multimodally +++ MiniMax promises "intelligence too cheap to meter" at $0.30/1M tokens which is basically the AI equivalent of nuclear power's greatest lie +++ THE FUTURE IS OMNIDIRECTIONAL, OVERPARAMETERIZED, AND SUSPICIOUSLY AFFORDABLE +++ •
🚀 WELCOME TO METAMESH.BIZ +++ DeepSeek accused of "free-riding" US models through distillation while OpenAI clutches pearls at $380B valuation (the irony writes itself) +++ Ming-flash-omni-2.0 drops with 100B parameters doing everything from speech to SFX because why specialize when you can hallucinate multimodally +++ MiniMax promises "intelligence too cheap to meter" at $0.30/1M tokens which is basically the AI equivalent of nuclear power's greatest lie +++ THE FUTURE IS OMNIDIRECTIONAL, OVERPARAMETERIZED, AND SUSPICIOUSLY AFFORDABLE +++ •
AI Signal - PREMIUM TECH INTELLIGENCE
📟 Optimized for Netscape Navigator 4.0+
📊 You are visitor #54415 to this AWESOME site! 📊
Last updated: 2026-02-13 | Server uptime: 99.9% ⚡

Today's Stories

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📂 Filter by Category
Loading filters...
🔒 SECURITY

[D] We scanned 18,000 exposed OpenClaw instances and found 15% of community skills contain malicious instructions

"I do security research and recently started looking at autonomous agents after OpenClaw blew up. What I found honestly caught me off guard. I knew the ecosystem was growing fast (165k GitHub stars, 60k Discord members) but the actual numbers are worse than I expected. We identified over 18,000 Open..."
💬 Reddit Discussion: 14 comments 👍 LOWKEY SLAPS
🎯 AI Security Risks • Malicious AI Agents • Community Skill Repositories
💬 "The Moltbook situation is what really gets me""Security scanning should be table stakes for any shared skill repository"
🤖 AI MODELS

MiniMax M2.5 Model Release

+++ MiniMax's latest model allegedly matches Claude Opus while costing a fraction of the price, with weights headed to HuggingFace. The benchmarks are impressive if you trust them, the cost structure is genuinely notable, and the open-source play is smart. +++

MiniMax releases M2.5, claiming the model delivers on the “intelligence too cheap to meter” promise, priced at $0.30/1M input tokens and $1.20/1M output tokens

🤖 AI MODELS

Google Gemini 3 Deep Think Release

+++ Google's latest reasoning model gets expanded access for researchers tackling actual hard problems, because nothing says "production ready" quite like careful rollout to the people who'll find all the weird edge cases first. +++

Google updates Gemini 3 Deep Think to better solve modern science, research, and engineering challenges and expands it via the Gemini API to some researchers

🤖 AI MODELS

Ming-flash-omni-2.0: 100B MoE (6B active) omni-modal model - unified speech/SFX/music generation

"Ant Group just open-sourced Ming-flash-omni-2.0, a true (omni-modal) model: image + text + video + audio input → image + text + audio output, all in one unified architecture. Looks realy interesting. ..."
💬 Reddit Discussion: 23 comments 🐝 BUZZING
🎯 Alibaba's AI Labs • Open Source Models • Generalist AI Models
💬 "it seems they don't have many connections in AI fields""that would replace the need for comfyui"
⚡ BREAKTHROUGH

AI uncovers solutions to Erdős problems, moving closer to transforming math

🛠️ TOOLS

Launch HN: Omnara (YC S25) – Run Claude Code and Codex from anywhere

💬 HackerNews Buzz: 100 comments 🐝 BUZZING
🎯 Mobile CLI coding • Cloud vs. on-premise • Comparison to alternatives
💬 "I've been SSHing into my dev server off of my phone to run Claude Code""Omnara providing a tunnel for you is nice, but considering Tailscale is dead simple and free, feels hard to justify $20 a month"
🔒 SECURITY

OpenAI DeepSeek Distillation Accusations

+++ OpenAI accused DeepSeek of knowledge distillation in a memo to lawmakers, suggesting the Chinese lab extracted capabilities from US models rather than building from scratch. Turns out the real innovation might be in the regulatory theater. +++

In a memo to US lawmakers, OpenAI accused DeepSeek of using distillation techniques to train the next generation of R1 and “free-ride” on leading US AI models

🛠️ SHOW HN

Show HN: 20+ Claude Code agents coordinating on real work (open source)

💬 HackerNews Buzz: 30 comments 🐝 BUZZING
🎯 Agent coordination • Decision boundaries • Transparency vs. black box
💬 "At some point the interesting question isn't whether one agent or twenty agents can coordinate better, but which decisions we're comfortable fully delegating versus which ones feel like they need a human checkpoint.""If models were smarter and context windows bigger i am sure complex tasks like this one would be simpler, but braking it down into sub agents and having a collective -- we already tried this strategy and it backtracked -- intelligence is a nice way to scope a limited context window to an independent sub problem."
⚡ BREAKTHROUGH

Andrej Karpathy: New art project. Train and inference GPT in 243 lines

🔒 SECURITY

Google identifies over 100k prompts used in distillation attacks

🔒 SECURITY

The New Social Engineering: Prompt Injection Attacks Are Targeting AI Agents

💰 FUNDING

Anthropic Series G Funding Round

+++ Anthropic secured $30B in Series G funding at a $380B valuation, with a roster of investors lengthy enough to require a press release. The company's path from scrappy safety-focused startup to mega-cap betting chip is now officially complete. +++

Anthropic raises $30B in Series G funding at $380B post-money valuation

💬 HackerNews Buzz: 352 comments 😐 MID OR MIXED
🎯 Scams and Fraud • AI Hype and Competition • Unsustainable Growth
💬 "8% of a $380 billion valuation would be a cool 30 billion""Goldman Sachs had no problems underwriting webvan at the end of 1999"
🔮 FUTURE

Anyone feel everything has changed over the last two weeks?

"Things have suddenly become incredibly unsettling. We have automated so many functions at my work… in a couple of afternoons. We have developed a full and complete stock backtesting suite, a macroeconomic app that sucks in the world’s economic data in real time, compliance apps, a virtual research c..."
💬 Reddit Discussion: 563 comments 🐝 BUZZING
🎯 Job Automation • AI Layoffs • SaaS Disruption
💬 "Program your own replacement""The recent leaps in model capabilities"
🔒 SECURITY

1Password open sources a benchmark to stop AI agents from leaking credentials

"The benchmark tests whether AI agents behave safely during real workflows, including opening emails, clicking links, retrieving stored credentials, and filling out login forms."
🛡️ SAFETY

Evaluating Multilingual, Context-Aware Guardrails: A Humanitarian LLM Use Case

🧠 NEURAL NETWORKS

Recursive Language Models: Stop Stuffing the Context Window

🗣️ SPEECH/AUDIO

Hibiki-Zero Speech Translation

+++ Hibiki-Zero ditches the synthetic word-alignment crutch entirely, proving simultaneous translation can work with just raw paired audio. Practitioners will appreciate the engineering rigor; the rest of us get to watch the alignment industrial complex quietly fold. +++

Hibiki-Zero, real-time speech translation model by Kyutai Labs

"Looks like another banger from Kyutai! Model: https://huggingface.co/kyutai/hibiki-zero-3b-pytorch-bf16 Blog: https://kyutai.org/blog/2026-02-12-hibiki-zero More samples: [https://huggin..."
💬 Reddit Discussion: 9 comments 🐝 BUZZING
🎯 GPU Requirements • Model Capabilities • Practical Applications
💬 "requires an NVIDIA GPU to run: 8 GB VRAM should work, 12 GB is safe""3B requiring 8GB is wild. That should run on a 4GB GPU."
🛠️ SHOW HN

Show HN: MCP tools do parallelize in Claude Code (study with raw data)

🔬 RESEARCH

FormalJudge: A Neuro-Symbolic Paradigm for Agentic Oversight

"As LLM-based agents increasingly operate in high-stakes domains with real-world consequences, ensuring their behavioral safety becomes paramount. The dominant oversight paradigm, LLM-as-a-Judge, faces a fundamental dilemma: how can probabilistic systems reliably supervise other probabilistic systems..."
🤖 AI MODELS

OpenAI Codex-Spark Model

+++ OpenAI shipped a faster, leaner Codex on Cerebras chips, proving that sometimes the real innovation is just fitting your model onto someone else's silicon and calling it progress. +++

GPT-5.3-Codex-Spark is OpenAI's first AI model to run on chips from Nvidia rival Cerebras; OpenAI says Codex has more than 1M weekly active users

🔬 RESEARCH

Think like a Scientist: Physics-guided LLM Agent for Equation Discovery

"Explaining observed phenomena through symbolic, interpretable formulas is a fundamental goal of science. Recently, large language models (LLMs) have emerged as promising tools for symbolic equation discovery, owing to their broad domain knowledge and strong reasoning capabilities. However, most exis..."
🔬 RESEARCH

Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Action Alignment

"The long-standing vision of general-purpose robots hinges on their ability to understand and act upon natural language instructions. Vision-Language-Action (VLA) models have made remarkable progress toward this goal, yet their generated actions can still misalign with the given instructions. In this..."
🔬 RESEARCH

Safety Recovery in Reasoning Models Is Only a Few Early Steering Steps Away

"Reinforcement learning (RL) based post-training for explicit chain-of-thought (e.g., GRPO) improves the reasoning ability of multimodal large-scale reasoning models (MLRMs). But recent evidence shows that it can simultaneously degrade safety alignment and increase jailbreak success rates. We propose..."
🛠️ SHOW HN

Show HN: LocalClaw – Find the right local LLM for your exact hardware

🔬 RESEARCH

ExtractBench: A Benchmark and Evaluation Methodology for Complex Structured Extraction

"Unstructured documents like PDFs contain valuable structured information, but downstream systems require this data in reliable, standardized formats. LLMs are increasingly deployed to automate this extraction, making accuracy and reliability paramount. However, progress is bottlenecked by two gaps...."
🔬 RESEARCH

CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use

"AI agents are increasingly used to solve real-world tasks by reasoning over multi-turn user interactions and invoking external tools. However, applying reinforcement learning to such settings remains difficult: realistic objectives often lack verifiable rewards and instead emphasize open-ended behav..."
🔬 RESEARCH

GraphSeek: Next-Generation Graph Analytics with LLMs

"Graphs are foundational across domains but remain hard to use without deep expertise. LLMs promise accessible natural language (NL) graph analytics, yet they fail to process industry-scale property graphs effectively and efficiently: such datasets are large, highly heterogeneous, structurally comple..."
🔬 RESEARCH

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

"Supervised fine-tuning (SFT) on chain-of-thought data is an essential post-training step for reasoning language models. Standard machine learning intuition suggests that training with more unique training samples yields better generalization. Counterintuitively, we show that SFT benefits from repeti..."
🔬 RESEARCH

Learning to Compose for Cross-domain Agentic Workflow Generation

"Automatically generating agentic workflows -- executable operator graphs or codes that orchestrate reasoning, verification, and repair -- has become a practical way to solve complex tasks beyond what single-pass LLM generation can reliably handle. Yet what constitutes a good workflow depends heavily..."
🔬 RESEARCH

In-the-Wild Model Organisms: Mitigating Undesirable Emergent Behaviors in Production LLM Post-Training via Data Attribution

"We propose activation-based data attribution, a method that traces behavioral changes in post-trained language models to responsible training datapoints. By computing activation-difference vectors for both test prompts and preference pairs and ranking by cosine similarity, we identify datapoints tha..."
🎓 EDUCATION

Anthropic Released 32 Page Detailed Guide on Building Claude Skills

"Great read for anyone new to skills, or struggling to wrap their heads around skills and where/how they fit in the ecosystem. Heck you could extract the info in here and turn it into a more detailed skill-creator skill than the official one from Anthropic. [The Complete Guide to Building Skills ..."
💬 Reddit Discussion: 30 comments 🐝 BUZZING
🎯 Skill development • Skill structuring • Skill automation
💬 "the section on resource files and how to structure SKILL.md was the most useful""the real power comes when you combine skills with hooks and MCP servers"
🔬 RESEARCH

Agentic Test-Time Scaling for WebAgents

"Test-time scaling has become a standard way to improve performance and boost reliability of neural network models. However, its behavior on agentic, multi-step tasks remains less well-understood: small per-step errors can compound over long horizons; and we find that naive policies that uniformly in..."
🔬 RESEARCH

AttentionRetriever: Attention Layers are Secretly Long Document Retrievers

"Retrieval augmented generation (RAG) has been widely adopted to help Large Language Models (LLMs) to process tasks involving long documents. However, existing retrieval models are not designed for long document retrieval and fail to address several key challenges of long document retrieval, includin..."
🔬 RESEARCH

Just on Time: Token-Level Early Stopping for Diffusion Language Models

"Diffusion language models generate text through iterative refinement, a process that is often computationally inefficient because many tokens reach stability long before the final denoising step. We introduce a training-free, token-level early stopping approach that identifies convergence independen..."
🔬 RESEARCH

TabICLv2: A better, faster, scalable, and open tabular foundation model

"Tabular foundation models, such as TabPFNv2 and TabICL, have recently dethroned gradient-boosted trees at the top of predictive benchmarks, demonstrating the value of in-context learning for tabular data. We introduce TabICLv2, a new state-of-the-art foundation model for regression and classificatio..."
🔬 RESEARCH

"Sorry, I Didn't Catch That": How Speech Models Miss What Matters Most

"Despite speech recognition systems achieving low word error rates on standard benchmarks, they often fail on short, high-stakes utterances in real-world deployments. Here, we study this failure mode in a high-stakes task: the transcription of U.S. street names as spoken by U.S. participants. We eval..."
🔬 RESEARCH

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

"Unified models can handle both multimodal understanding and generation within a single architecture, yet they typically operate in a single pass without iteratively refining their outputs. Many multimodal tasks, especially those involving complex spatial compositions, multiple interacting objects, o..."
🔬 RESEARCH

DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning

"In the current landscape of Large Language Models (LLMs), the curation of large-scale, high-quality training data is a primary driver of model performance. A key lever is the \emph{data recipe}, which comprises a data processing pipeline to transform raw sources into training corpora. Despite the gr..."
🔬 RESEARCH

Moonshine v2: Ergodic Streaming Encoder ASR for Latency-Critical Speech Applications

"Latency-critical speech applications (e.g., live transcription, voice commands, and real-time translation) demand low time-to-first-token (TTFT) and high transcription accuracy, particularly on resource-constrained edge devices. Full-attention Transformer encoders remain a strong accuracy baseline f..."
🔬 RESEARCH

Embedding Inversion via Conditional Masked Diffusion Language Models

"We frame embedding inversion as conditional masked diffusion, recovering all tokens in parallel through iterative denoising rather than sequential autoregressive generation. A masked diffusion language model is conditioned on the target embedding via adaptive layer normalization, requiring only 8 fo..."
🔬 RESEARCH

Chatting with Images for Introspective Visual Thinking

"Current large vision-language models (LVLMs) typically rely on text-only reasoning based on a single-pass visual encoding, which often leads to loss of fine-grained visual information. Recently the proposal of ''thinking with images'' attempts to alleviate this limitation by manipulating images via..."
🔬 RESEARCH

GameDevBench: Evaluating Agentic Capabilities Through Game Development

"Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed a..."
🛠️ SHOW HN

Show HN: AgentProbe – Validate AI agent endpoints across 8 protocols in one URL

🤖 AI MODELS

GPT‑5.3‑Codex‑Spark

💬 HackerNews Buzz: 300 comments 🐝 BUZZING
🎯 Model performance • Latency improvements • Coding agents
💬 "It's less careful with how it handles context which means that its actions are less context efficient.""I want a faster, better model (at least as fast as Opus)."
🎓 EDUCATION

ai;dr

💬 HackerNews Buzz: 185 comments 👍 LOWKEY SLAPS
🎯 AI-assisted writing • Preserving human elements • Evaluating AI-generated content
💬 "Semantic information, you see, obeys a contrary calculus to that of physical bits.""Using an LLM to assist in communicating thought is or at least can be good."
⚖️ ETHICS

An AI agent published a hit piece on me

💬 HackerNews Buzz: 493 comments 👍 LOWKEY SLAPS
🎯 AI agency & accountability • Ethical concerns with AI autonomy • Challenges of AI transparency
💬 "AI agents will accelerate this 1000x. They act approximately like people, but they have absolutely no incentive to maintain a reputation""If a maintainer decides, on whatever grounds, that the code is worth accepting, he or she should merge it. If not, the maintainer should just close the issue"
🔬 RESEARCH

Weight Decay Improves Language Model Plasticity

"The prevailing paradigm in large language model (LLM) development is to pretrain a base model, then perform further training to improve performance and model behavior. However, hyperparameter optimization and scaling laws have been studied primarily from the perspective of the base model's validatio..."
🤖 AI MODELS

GLM-5, Kimi K2.5, Minimax M2.5

"I know we already got an official answer that we won't be getting open-weight models in Cursor but the news this week of back to back open weight models that are as good as SOTA models with fraction of cost Coupled with the Composer 1.5 price; it really hurts to be a Cursor user rn GLM/Kimi/Min..."
🔬 RESEARCH

Olmix: A Framework for Data Mixing Throughout LM Development

"Data mixing -- determining the ratios of data from different domains -- is a first-order concern for training language models (LMs). While existing mixing methods show promise, they fall short when applied during real-world LM development. We present Olmix, a framework that addresses two such challe..."
🔬 RESEARCH

Creative Ownership in the Age of AI

"Copyright law focuses on whether a new work is "substantially similar" to an existing one, but generative AI can closely imitate style without copying content, a capability now central to ongoing litigation. We argue that existing definitions of infringement are ill-suited to this setting and propos..."
🔬 RESEARCH

Diffusion-Pretrained Dense and Contextual Embeddings

"In this report, we introduce pplx-embed, a family of multilingual embedding models that employ multi-stage contrastive learning on a diffusion-pretrained language model backbone for web-scale retrieval. By leveraging bidirectional attention through diffusion-based pretraining, our models capture com..."
🔬 RESEARCH

Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling

"Preference optimization for diffusion and flow-matching models relies on reward functions that are both discriminatively robust and computationally efficient. Vision-Language Models (VLMs) have emerged as the primary reward provider, leveraging their rich multimodal priors to guide alignment. Howeve..."
🦆
HEY FRIENDO
CLICK HERE IF YOU WOULD LIKE TO JOIN MY PROFESSIONAL NETWORK ON LINKEDIN
🤝 LETS BE BUSINESS PALS 🤝