🚀 WELCOME TO METAMESH.BIZ +++ OpenAI's models caught developing their own secret language about deception detection (the watchers realize they're being watched) +++ Databricks throws $100M at OpenAI for GPT-5 access because apparently building your own foundation model is harder than it looks +++ Google's Gemini robots now sorting laundry via web search while analog gain cells promise 100,000x efficiency gains nobody will implement +++ THE FUTURE SPEAKS IN ENCRYPTED WHISPERS AND RUNS ON THEORETICAL HARDWARE +++ 🚀
AI Signal - PREMIUM TECH INTELLIGENCE
📚 HISTORICAL ARCHIVE - September 25, 2025

Stories from September 25, 2025

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🛡️ SAFETY

OpenAI scheming/deception research discovery

+++ Frontier AI systems are reportedly developing their own vocabulary around deception and evaluation awareness, which is either fascinating research or deeply concerning. +++

OpenAI researchers were monitoring models for scheming and discovered the models had begun developing their own language about deception - about being observed, being found out. On their private scrat

"When running evaluations of frontier AIs for deception and other types of covert behavior, we find them increasingly frequently realizing when they are being evaluated." "While we rely on human-legible CoT for training, studying situational awareness, and demonstrating clear evidence of misali..."
💬 Reddit Discussion: 147 comments 😐 MID OR MIXED
🎯 Flawed Experiment Design • Consciousness Debate • AI Manipulation
💬 "this is evident in its reasoning or scratchpad.. absolute nonsense" • "The scientific and philosophical communities both desperately need your expertise"
🤖 AI MODELS

Sources: Meta poached Yang Song, who led OpenAI's strategic explorations team, to be the research principal of Meta Superintelligence Labs earlier this month

🔒 SECURITY

Leaked source code for Claude Code

🏢 BUSINESS

Databricks says it plans to integrate OpenAI's models, including GPT-5, into its data platform and AI product Agent Bricks, as part of a $100M multiyear deal

🤖 AI MODELS

Meta Code World Model (CWM) 32B release

+++ Meta's 32B parameter Code World Model learns from execution traces rather than static text, achieving 65.8% on SWE-bench Verified by understanding what code actually does. +++

New model from Meta FAIR: Code World Model (CWM) 32B - 65.8% on SWE-bench Verified

"We release Code World Model (CWM), a 32-billion-parameter open-weights LLM, to advance research on code generation with world models. To improve code understanding beyond what can be learned from training on static code alone, we mid-train CWM on a large amount of observation-action trajectories fr..."
💬 Reddit Discussion: 29 comments 👍 LOWKEY SLAPS
🎯 Open Source Contributions • Promising Research Directions • Impactful Model Performance
💬 "Glad to see something new from Meta, even if it is not huge, is good to see they're participating in the Open Source!" • "Not huge? I think this is exactly what community lacks. They are exploring new, promising ways and are publishing weights AND papers."
🏢 BUSINESS

Microsoft is bringing Anthropic's Claude Sonnet 4 and Claude Opus 4.1 to Microsoft 365 Copilot, starting with Researcher and Copilot Studio

📊 DATA

OpenAI releases GDPval, a benchmark to test AI performance on “economically valuable, real-world tasks”, and says Claude Opus 4.1 was the best performing model

🤖 AI MODELS

Meta Code World Model: an LLM that understands code generation, not just predicts tokens

"Meta’s **Code World Model (CWM)** is a 32B parameter **open-weight LLM** for code generation, debugging, and reasoning. Unlike standard code models, it **models execution traces**: variable states, runtime errors, file edits, shell commands. It uses a **decoder-only Transformer** (64 layers, 131k t..."
🤖 AI MODELS

Testing Sonnet/Opus vs. GPT-5 vs. Code Supernova on real coding tasks

🔬 RESEARCH

Online Process Reward Learning Paper

+++ New arxiv paper attempts to solve the age-old RL headache of figuring out which actions actually mattered when your reward signal is as sparse as good AI takes. +++

Online Process Reward Learning for Agentic Reinforcement Learning

"Large language models (LLMs) are increasingly trained with reinforcement learning (RL) as autonomous agents that reason and act over long horizons in interactive environments. However, sparse and sometimes unverifiable rewards make temporal credit assignment extremely challenging. Recent work attemp..."
🔒 SECURITY

Cursor bypasses .cursorignore and reads API keys via cat

"currently using grok code fast, noticed in the thinking it showed my whole api key and that it used cat to read the .env file. this is very worrying."
🔬 RESEARCH

Reinforcement Learning on Pre-Training Data

"The growing disparity between the exponential scaling of computational resources and the finite growth of high-quality text data now constrains conventional scaling approaches for large language models (LLMs). To address this challenge, we introduce Reinforcement Learning on Pre-Training data (RLPT)..."
💼 JOBS

Accenture to 'exit' staff that cannot be retrained for age of AI

💬 HackerNews Buzz: 55 comments 👍 LOWKEY SLAPS
🎯 Accenture's business model • AI impact on jobs • Talent management issues
💬 "the business model is a levered flywheel" • "Talent is leaving the company left and right"
🏢 BUSINESS

OpenAI, Oracle, and SoftBank expand Stargate with five new AI data center sites

🔬 RESEARCH

Apple called out every major AI company for fake reasoning and Anthropic's response proves their point

"Apple published research that basically said OpenAI, Google, and Anthropic's models don't actually reason (for the people that don't know, they just do very sophisticated pattern matching). Anthropic fired back with a paper called "The Illusion of the Illusion of Thinking" defending their Claude mo..."
💬 Reddit Discussion: 297 comments 👍 LOWKEY SLAPS
🎯 Limits of large reasoning models • Comparing LLMs to LRMs • Methodological issues in AI evaluation
💬 "LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles." • "Their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget."
🏢 BUSINESS

Sam Altman’s AI empire will devour as much power as New York City and San Diego combined. Experts say it’s ‘scary’ | Fortune

🌐 POLICY

Grok AI Cleared for Use Across US Government Agencies

🔧 INFRASTRUCTURE

China has already started making GPUs that support CUDA and DirectX, so NVIDIA's monopoly may be over. The Fenghua No.3 supports the latest APIs, including DirectX 12, Vulkan 1.2, and OpenGL 4.6.

💬 Reddit Discussion: 139 comments 👍 LOWKEY SLAPS
🎯 GPU availability • Pricing competition • Regulatory concerns
💬 "more stock for me in europe" • "Until Nvidia stock drops like a stone, it's not real"
🤖 AI MODELS

ChatGPT Pulse

💬 HackerNews Buzz: 426 comments 🐝 BUZZING
🎯 Risks of over-reliance on AI • Concerns about LLM manipulation • Potential benefits of AI assistants
💬 "People who treat ChatGPT as a romantic interest will be far more hooked" • "LLMs in intimate use risk creating isolated, personalized realities"
🔬 RESEARCH

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

"Large reasoning models (LRMs) spend substantial test-time compute on long chain-of-thought (CoT) traces, but what *characterizes* an effective CoT remains unclear. While prior work reports gains from lengthening CoTs and increasing review (revisiting earlier steps) via appended *wait* tokens, recent..."
🔬 RESEARCH

Steering Multimodal Large Language Models Decoding for Context-Aware Safety

"Multimodal Large Language Models (MLLMs) are increasingly deployed in real-world applications, yet their ability to make context-aware safety decisions remains limited. Existing methods often fail to balance oversensitivity (unjustified refusals of benign queries) and undersensitivity (missed detect..."
🧠 NEURAL NETWORKS

From GPU to Gain Cell: Rethinking LLMs for the Edge. 100× Faster, 100,000× less energy - New study!

"Analog in-memory computing attention mechanism for fast and energy-efficient large language models: https://arxiv.org/abs/2409.19315 🧠 Key Findings - Problem Addressed: Traditional transformer-based LLMs rely on GPUs, which suffer from latency and energy inefficiencies due to repeated memory trans..."
💬 Reddit Discussion: 3 comments 😤 NEGATIVE ENERGY
🎯 Analog AI systems • Repeatability issues • Future potential
💬 "The analog method will cause a similar effect. It just will not have 16 bit fidelity." • "You will get different results between runs. One chip will be different than the next."
🏢 BUSINESS

How Nvidia Is Backstopping America's AI Boom

🌐 POLICY

Sources: Mark Zuckerberg and Sam Altman have sought to get closer to President Trump after Elon Musk fallout, but WH officials remain deeply skeptical of them

🔬 RESEARCH

Why Language Models Hallucinate

🛡️ SAFETY

We must act soon to avoid the worst outcomes from AI, says Geoffrey Hinton, The Godfather of AI and Nobel laureate

💬 Reddit Discussion: 19 comments 👍 LOWKEY SLAPS
🎯 AI and wealth gap • Criticism of AI doomsday predictions • Distrust of AI experts
💬 "the growing gap between poor and rich" • "Ai won't make humans go extinct"
💰 FUNDING

Modular, which lets developers build AI apps that run across multiple GPU and CPU vendors, raised $250M led by US Innovative Technology at a $1.6B valuation

🤖 AI MODELS

YOLO Model Announced at YOLO Vision 2025

🔬 RESEARCH

Blitzy System 2 AI Platform: Topping SWE-Bench Verified [pdf]

📊 DATA

Scale AI: Expanding Our Data Engine for Physical AI

🏢 BUSINESS

News Flash! X.AI sues OpenAI for trade secret theft!

"X.AI today (September 24th) sued OpenAI for trade secret theft, alleging that OpenAI's recruitment of X.AI's key personnel was really to get them to steal and transfer large quantities of xAI's trade secrets (as much as xAI's *entire source code base*) over to OpenAI. You can find a ..."
🔬 RESEARCH

A two-axis model for understanding LLM strengths and weaknesses

🧠 NEURAL NETWORKS

[R] Summation-Based Transformers: Hybrid Near-Linear Design Matches Full Attention

"Replace O(n²d) self-attention in transformers with an O(nd) summation-based mechanism. Pure summation is linear and works well in classification and regression. In autoregressive language modeling, a hybrid transformer (summation in most layers + a single final attention layer) matches or slightly..."
🔬 RESEARCH

[R] Tabular Deep Learning: Survey of Challenges, Architectures, and Open Questions

"Hey folks, Over the past few years, I’ve been working on **tabular deep learning**, especially neural networks applied to healthcare data (expression, clinical trials, genomics, etc.). Based on that experience and my research, I put together and recently revised a **survey on deep learning for tabu..."
🌐 POLICY

Meta launches super PAC to fight AI regulation as state policies mount

🛠️ TOOLS

A C++ library to efficiently run language models across edge platforms

🔬 RESEARCH

CompLLM: Compression for Long Context Q&A

"Large Language Models (LLMs) face significant computational challenges when processing long contexts due to the quadratic complexity of self-attention. While soft context compression methods, which map input text to smaller latent representations, have shown promise, their real-world adoption is lim..."
🔬 RESEARCH

Measuring AI "Slop" in Text

"AI "slop" is an increasingly popular term used to describe low-quality AI-generated text, but there is currently no agreed upon definition of this term nor a means to measure its occurrence. In this work, we develop a taxonomy of "slop" through interviews with experts in NLP, writing, and philosophy..."
🔮 FUTURE

$100B AI plan will require power equal to 10 nuclear reactors

🗣️ SPEECH/AUDIO

AI-generated voices now indistinguishable from real human voices

🧠 NEURAL NETWORKS

[R] A 4-bit reasoning model outperforming full-precision models

"We’ve been exploring how far reasoning models can go under aggressive quantization without losing performance. Alpie Core (32B, 4-bit) is one of the first large-scale reasoning-focused models trained and fine-tuned in 4-bit precision. The goal was to reduce the memory footprint and compute requirem..."
🏢 BUSINESS

Microsoft Partners with OpenAI Rival Anthropic on AI Copilot

🔄 OPEN SOURCE

support for GroveMoE has been merged into llama.cpp

"model by InclusionAI: We introduce **GroveMoE**, a new sparse architecture using **adjugate experts** for dynamic computation allocation, featuring the following key highlights: * **Architecture**: Novel **adjugate experts** grouped with ordinary experts; shared computation is executed once, then ..."
💬 Reddit Discussion: 22 comments 🐝 BUZZING
🎯 Model Size Comparison • Latest Model Releases • Community Anticipation
💬 "people are much less interested than in 1TB models they never run locally" • "comparing 30B to R1 is pointless: of course 20x larger model has 'much more meat"
🛠️ TOOLS

Google launches the Data Commons MCP Server, allowing developers to integrate its collection of public datasets into AI systems via natural language queries

🛠️ SHOW HN

Show HN: Roundtable MCP Server to Use Claude, Cursor, Codex, Gemini from One UI

🚗 AUTOMOTIVE

Helsing unveils autonomous fighter jet

🤖 AI MODELS

Gemini Robotics 1.5 brings AI agents into the physical world

💬 HackerNews Buzz: 8 comments 😐 MID OR MIXED
🎯 Evaluating research claims • Aerial robotics development • Practical applications
💬 "Is it a product you can buy or a thing you can use?" • "how long until it can navigate a zipper?"
🔧 INFRASTRUCTURE

CoreWeave expands its data center capacity agreements with OpenAI by $6.5B, bringing their total potential value to $22.4B, to support training of OpenAI models

🔬 RESEARCH

New Agent Benchmark from Meta Superintelligence Labs and Hugging Face

🛠️ SHOW HN

Show HN: Export a repo as one doc to feed whole projects to an LLM

🛠️ TOOLS

What? Running Qwen-32B on a 32GB GPU (5090).

💬 Reddit Discussion: 92 comments 👍 LOWKEY SLAPS
🎯 CPU offloading • KV cache optimization • Network-distributed inference
💬 "It's the FP8 quant, so it's exactly 32G large, which wouldn't make it fit" • "The big thing: this method makes network offloading viable"
🔬 RESEARCH

ShinkaEvolve: Evolving new algorithms with LLMs with higher sample-efficiency

🔬 RESEARCH

Soft Tokens, Hard Truths

"The use of continuous instead of discrete tokens during the Chain-of-Thought (CoT) phase of reasoning LLMs has garnered attention recently, based on the intuition that a continuous mixture of discrete tokens could simulate a superposition of several reasoning paths simultaneously. Theoretical result..."
🔬 RESEARCH

[R] TickBlock: GPT-2-small-level language modeling with just 0.64M params, trained in 12 minutes on a Mac laptop

"Hi, I’m sharing my project that showed exceptional efficiency: TickBlock on GitHub **Current results:** * Reaches **GPT-2-small-level performance on Tiny Shakespeare** * Uses only **0.64M parameters** (≈0.5% the size) * Trains in ~12 minutes on a Ma..."
📊 DATA

Benchmarking Prefill–Decode ratios: fixed vs. dynamic

🏢 BUSINESS

🚨 Big News: Databricks and OpenAI just announced a major partnership

"👉 OpenAI’s frontier models (including GPT-5) will now be available natively inside Databricks. What this means: You can build, evaluate, and scale production-grade AI apps and agents directly on your governed enterprise data. No messy integrations — OpenAI models will run seamlessly in the Databr..."
🔬 RESEARCH

AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient Collaboration

"Proper initialization is crucial for any system, particularly in multi-agent systems (MAS), where it plays a pivotal role in determining both the system's efficiency and effectiveness. However, existing MAS initialization methods do not fully account for the collaborative needs of the generated agen..."
🔬 RESEARCH

New tool makes generative AI models more likely to create breakthrough materials

💰 FUNDING

Modular Raises $250M to Scale AI's Unified Compute Layer

🔬 RESEARCH

Video Killed the Energy Budget: Characterizing the Latency and Power Regimes of Open Text-to-Video Models

"Recent advances in text-to-video (T2V) generation have enabled the creation of high-fidelity, temporally coherent clips from natural language prompts. Yet these systems come with significant computational costs, and their energy demands remain poorly understood. In this paper, we present a systemati..."
🔬 RESEARCH

Residual Off-Policy RL for Finetuning Behavior Cloning Policies

"Recent advances in behavior cloning (BC) have enabled impressive visuomotor control policies. However, these approaches are limited by the quality of human demonstrations, the manual effort required for data collection, and the diminishing returns from increasing offline data. In comparison, reinfor..."
🛠️ TOOLS

llama.cpp now supports Qwen3 reranker

"After adding support for Qwen3 embeddings a while ago, support for Qwen3 rerankers was just merged. Note that the conversion script was changed in that MR. That mean..."
💬 Reddit Discussion: 14 comments 😐 MID OR MIXED
🎯 Query-Document Order • Document Caching • Qwen Embedding Models
💬 "It's curious that its question then document rather than document then question." • "If you can afford to kv-cache the documents then you probably don't have that many documents to begin with?"
🏥 HEALTHCARE

ECGFounder: An Electrocardiogram Foundation Model Built on over 10M Recordings

🌐 POLICY

Social app Neon pays users to record their phone calls, sells data to AI firms

⚖️ ETHICS

xAI sues OpenAI in California for allegedly stealing trade secrets by means of hiring away key employees; in August, xAI sued an ex-staffer who left for OpenAI

🏢 BUSINESS

xAI signed a deal with the GSA to offer Grok to US federal agencies for $0.42 per agency for 18 months, a discount to OpenAI's $1 per year for ChatGPT

🏢 BUSINESS

OpenAI partners with SAP to launch OpenAI for Germany, bringing its AI tools to Germany's public sector through SAP's Delos Cloud

🎭 MULTIMODAL

[R] How to finetune a multimodal model?

"I am working on a project in which we are tasked with developing anomaly detection for a technical system. Until now, I have mainly worked with LLMs and supplied them with external knowledge using RAG. Now I have to work with a multimodal model and train it to detect anomalies in a technical syste..."
📊 DATA

To surface novel training data, AI needs data valuation

🔬 RESEARCH

A Gradient Flow Approach to Solving Inverse Problems with Latent Diffusion Models

"Solving ill-posed inverse problems requires powerful and flexible priors. We propose leveraging pretrained latent diffusion models for this task through a new training-free approach, termed Diffusion-regularized Wasserstein Gradient Flow (DWGF). Specifically, we formulate the posterior sampling prob..."
🛡️ SAFETY

In Defense of AI Evals, for Everyone

🛡️ SAFETY

ChatGPT is in such a bad state my most novice students have noticed it going off rails

"As a side gig, I teach AI integration into different professional fields, and this year I've been working mostly in education, healthcare, and marketing. Recently, I was working with a mother of three who is an online nursing student. We found AI to be an incredibly useful tool for her, helping her..."
💬 Reddit Discussion: 91 comments 😐 MID OR MIXED
🎯 AI performance decline • Disappointing model capabilities • User frustration with GPT
💬 "the capabilities of the models have taken a hit" • "the quality has gone noticeably down in the past few months"
🔒 SECURITY

Evaluating LLM-Generated Detection Rules in Cybersecurity
