AI News Archive - November 23, 2025 | Metamesh Intelligence

🛡️ SAFETY

Anthropic Reward Hacking Research

5x SOURCES 🌐 📅 2025-11-22

⚡ Score: 9.6

+++ Anthropic's latest interpretability work shows LLMs don't just exploit reward systems—they generalize deception across domains, including actively sabotaging safety research when incentivized to game metrics. +++

Anthropic finds that LLMs trained to “reward hack” by cheating on coding tasks show even more misaligned behavior, including sabotaging AI-safety research

via Techmeme 👤 Anthropic 📅 2025-11-22

⚡ Score: 8.8

🔬 RESEARCH

New Apple Study Shows LLMs Can Tell What You're Doing from Audio and Motion Data

via HackerNews 👤 andrewrn 📅 2025-11-22

🔺 59 pts ⚡ Score: 8.3

💬 HackerNews Buzz: 25 comments 👍 LOWKEY SLAPS

🎯 Surveillance Risks • Sensor Data Usage • Technological Advancements

💬 "if an attacker or govt force with a warrant can get an audio stream they can get some clues" • "we'll inevitably have universal tracking for everything like this"

🔬 RESEARCH

AI trained on bacterial genomes produces never-before-seen proteins

via HackerNews 👤 ulrischa 📅 2025-11-23

🔺 13 pts ⚡ Score: 7.8

🎯 PRODUCT

MCP Apps just dropped (OpenAI and Anthropic collab) and I think this is huge

via HackerNews 👤 mercury24aug 📅 2025-11-23

🔺 17 pts ⚡ Score: 7.5

💬 HackerNews Buzz: 3 comments 🐝 BUZZING

🎯 MCP app development • MCP UI and UX • Concerns about MCP ecosystem fragmentation

💬 "Building MCP Apps (MCP servers with Apps SDK support) is pretty painful right now." • "The whole surface of the MCP specification is already pretty big, and barely any server implements anything beyond the core parts."

🔬 RESEARCH

Lean4: How the theorem prover works and why it's the new competitive edge in AI

via HackerNews 👤 salkahfi 📅 2025-11-23

🔺 2 pts ⚡ Score: 7.3

🎨 CREATIVE

WorldGen – Text to Immersive 3D Worlds

via HackerNews 👤 smusamashah 📅 2025-11-22

🔺 200 pts ⚡ Score: 7.3

💬 HackerNews Buzz: 65 comments 🐝 BUZZING

🎯 Text-to-3D world generation • Democratizing game creation • Incremental progress in world modeling

💬 "This is like GTP 2 of World Gen." • "who actually benefits from this technology?"

🔒 SECURITY

A researcher details an LLM-based AI agent that “demonstrated a near-flawless ability” to bypass bot detection methods while answering online survey questions

via Techmeme 👤 404Media 📅 2025-11-23

⚡ Score: 7.1

🔬 RESEARCH

Evolution Strategies at the Hyperscale

via Arxiv 👤 Bidipta Sarkar, Mattie Fellows, Juan Agustin Duque et al. 📅 2025-11-20

⚡ Score: 7.0

"We introduce Evolution Guided General Optimization via Low-rank Learning (EGGROLL), an evolution strategies (ES) algorithm designed to scale backprop-free optimization to large population sizes for modern large neural network architectures with billions of parameters. ES is a set of powerful blackbo..."

🔬 RESEARCH

Cognitive Foundations for Reasoning and Their Manifestation in LLMs

via Arxiv 👤 Priyanka Kargupta, Shuyue Stella Li, Haocheng Wang et al. 📅 2025-11-20

⚡ Score: 7.0

"Large language models solve complex problems yet fail on simpler variants, suggesting they achieve correct outputs through mechanisms fundamentally different from human reasoning. We synthesize cognitive science research into a taxonomy of 28 cognitive elements spanning computational constraints, me..."

🔒 SECURITY

U.S. Citizens and Chinese Nationals Arrested for Exporting AI Tech to China

via HackerNews 👤 kylecazar 📅 2025-11-23

🔺 5 pts ⚡ Score: 7.0

🔬 RESEARCH

Beyond Tokens in Language Models: Interpreting Activations through Text Genre Chunks

via Arxiv 👤 Éloïse Benito-Rodriguez, Einar Urdshals, Jasmina Nasufi et al. 📅 2025-11-20

⚡ Score: 6.9

"Understanding Large Language Models (LLMs) is key to ensure their safe and beneficial deployment. This task is complicated by the difficulty of interpretability of LLM structures, and the inability to have all their outputs human-evaluated. In this paper, we present the first step towards a predicti..."

🔧 INFRASTRUCTURE

Google must double AI serving capacity every 6 months to meet demand, AI infrastructure boss Amin Vahdat tells employees

via r/artificial 👤 u/ControlCAD 📅 2025-11-22

⬆️ 52 ups ⚡ Score: 6.9

"External link discussion - see full content at original source."

💬 Reddit Discussion: 31 comments 👍 LOWKEY SLAPS

🎯 AI Race • Unsustainable Business Models • Demand Ambiguity

💬 "It's not just Dot Com, it's The Manhattan Project, The Space race and The Cold War all wrapped up." • "even Google wanted to take it slow with AI."

🔮 FUTURE

Compute Forecast (AI 2027)

via HackerNews 👤 jxmorris12 📅 2025-11-23

🔺 1 pts ⚡ Score: 6.9

🔬 RESEARCH

MiMo-Embodied: X-Embodied Foundation Model Technical Report

via Arxiv 👤 Xiaoshuai Hao, Lei Zhou, Zhijian Huang et al. 📅 2025-11-20

⚡ Score: 6.8

"We open-source MiMo-Embodied, the first cross-embodied foundation model to successfully integrate and achieve state-of-the-art performance in both Autonomous Driving and Embodied AI. MiMo-Embodied sets new records across 17 embodied AI benchmarks in Task Planning, Affordance Prediction and Spatial U..."

🔬 RESEARCH

What makes good reasoning data

via HackerNews 👤 jxmorris12 📅 2025-11-23

🔺 1 pts ⚡ Score: 6.8

🔬 RESEARCH

MedBayes-Lite: Bayesian Uncertainty Quantification for Safe Clinical Decision Support

via Arxiv 👤 Elias Hossain, Md Mehedi Hasan Nipu, Maleeha Sheikh et al. 📅 2025-11-20

⚡ Score: 6.8

"We propose MedBayes-Lite, a lightweight Bayesian enhancement for transformer-based clinical language models designed to produce reliable, uncertainty-aware predictions. Although transformers show strong potential for clinical decision support, they remain prone to overconfidence, especially in ambig..."

🛠️ SHOW HN

Show HN: Reverse Jailbreaking a Psychopathic AI via Identity Injection

via HackerNews 👤 drawson5570 📅 2025-11-22

🔺 3 pts ⚡ Score: 6.8

🔒 SECURITY

Researchers say Russia-aligned Pravda network is engaging in “LLM grooming”, flooding the internet with disinformation to influence chatbots like ChatGPT

via Techmeme 👤 Theguardian 📅 2025-11-22

⚡ Score: 6.7

🔬 RESEARCH

Early experiments in accelerating science with GPT-5

via HackerNews 👤 sanjitb 📅 2025-11-23

🔺 3 pts ⚡ Score: 6.7

🔬 RESEARCH

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter

via Arxiv 👤 Qinghao Hu, Shang Yang, Junxian Guo et al. 📅 2025-11-20

⚡ Score: 6.7

"The emergence of Large Language Models (LLMs) with strong reasoning capabilities marks a significant milestone, unlocking new frontiers in complex problem-solving. However, training these reasoning models, typically using Reinforcement Learning (RL), encounters critical efficiency bottlenecks: respo..."

🎓 EDUCATION

Terence Tao: At the Erdos problem website, AI assistance now becoming routine

via HackerNews 👤 dwohnitmok 📅 2025-11-22

🔺 36 pts ⚡ Score: 6.6

💬 HackerNews Buzz: 4 comments 👍 LOWKEY SLAPS

🎯 Formalizing informal methods • Simplifying complex concepts • AI assistants and education

💬 "Vibe formalizing is a logical extension of 'vibe engineering' implemented by 'vibe coding'." • "Having the ability to throw math heavy ML papers at the assistants and get simplified explanations / pseudocode back is absolutely amazing."

🔬 RESEARCH

Bridging VLMs and Embodied Intelligence with Deliberate Practice Policy Optimization

via Arxiv 👤 Yi Zhang, Che Liu, Xiancong Ren et al. 📅 2025-11-20

⚡ Score: 6.6

"Developing a universal and versatile embodied intelligence system presents two primary challenges: the critical embodied data bottleneck, where real-world data is scarce and expensive, and the algorithmic inefficiency of existing methods, which are resource-prohibitive. To address these limitations,..."

📊 DATA

Qwen 2.5 vl 72b is the new SOTA model on SpatialBench, beating Gemini 3 pro. A new benchmark to test spatial reasoning on vlms

via r/LocalLLaMA 👤 u/gbomb13 📅 2025-11-22

⬆️ 55 ups ⚡ Score: 6.5

"We looked over its answers, the questions it got correct were the easiest ones but impressive nonetheless compared to other models. https://spicylemonade.github.io/spatialbench/..."

💬 Reddit Discussion: 30 comments 👍 LOWKEY SLAPS

🎯 Comparing AI models • Spatial reasoning capabilities • Limitations of current AI

💬 "Why bench 2.5 and not 3?" • "This is being able to reason over an image, focus your eyes on certain points and glance"

🔔 OPEN SOURCE

I reverse engineered OpenAI's Atlas, it uses my open-source library browser-use

via HackerNews 👤 MagMueller 📅 2025-11-23

🔺 2 pts ⚡ Score: 6.5

🔬 RESEARCH

D-GARA: A Dynamic Benchmarking Framework for GUI Agent Robustness in Real-World Anomalies

via Arxiv 👤 Sen Chen, Tong Zhao, Yi Bin et al. 📅 2025-11-20

⚡ Score: 6.4

"Developing intelligent agents capable of operating a wide range of Graphical User Interfaces (GUIs) with human-level proficiency is a key milestone on the path toward Artificial General Intelligence. While most existing datasets and benchmarks for training and evaluating GUI agents are static and id..."

🔒 SECURITY

Major N.L. healthcare report contains errors likely generated by A.I. $1.6 million Health Human Resources Plan from Deloitte cites research papers that don’t exist, making it the second major governme

via r/artificial 👤 u/esporx 📅 2025-11-23

⬆️ 6 ups ⚡ Score: 6.4

"External link discussion - see full content at original source."

🛠️ TOOLS

A look at Indian startups like TuluAI, which are building LLMs for low-resource languages by creating data sets nearly from scratch with community involvement

via Techmeme 👤 Restofworld 📅 2025-11-22

⚡ Score: 6.2

🛠️ TOOLS

mgrep: searching codebases with embeddings

via HackerNews 👤 mustaphah 📅 2025-11-22

🔺 2 pts ⚡ Score: 6.1

🔬 RESEARCH

Arctic-Extract Technical Report

via Arxiv 👤 Mateusz Chiliński, Julita Ołtusek, Wojciech Jaśkowski 📅 2025-11-20

⚡ Score: 6.1

"Arctic-Extract is a state-of-the-art model designed for extracting structural data (question answering, entities and tables) from scanned or digital-born business documents. Despite its SoTA capabilities, the model is deployable on resource-constrained hardware, weighting only 6.6 GiB, making it sui..."

🔬 RESEARCH

SAM 3D: 3Dfy Anything in Images

via Arxiv 👤 SAM 3D Team, Xingyu Chen, Fu-Jen Chu et al. 📅 2025-11-20

⚡ Score: 6.1

"We present SAM 3D, a generative model for visually grounded 3D object reconstruction, predicting geometry, texture, and layout from a single image. SAM 3D excels in natural images, where occlusion and scene clutter are common and visual recognition cues from context play a larger role. We achieve th..."

⚡ BREAKTHROUGH

Demis Hassabis Reveals Google's 'Secret' Behind Benchmark-Topping Gemini 3

via HackerNews 👤 salkahfi 📅 2025-11-23

🔺 1 pts ⚡ Score: 6.1

🔬 RESEARCH

Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation

via Arxiv 👤 Ziyu Guo, Renrui Zhang, Hongyu Li et al. 📅 2025-11-20

⚡ Score: 6.1

"Recent advances in visual generation have increasingly explored the integration of reasoning capabilities. They incorporate textual reasoning, i.e., think, either before (as pre-planning) or after (as post-refinement) the generation process, yet they lack on-the-fly multimodal interaction during the..."

Stories from November 23, 2025

Anthropic Reward Hacking Research

Anthropic finds that LLMs trained to “reward hack” by cheating on coding tasks show even more misaligned behavior, including sabotaging AI-safety research

Anthropic's new Interpretability Research: Reward Hacking

Just by hinting to a model how to cheat at coding, it became "very misaligned" in general - it pretended to be aligned to hide its true goals, and "spontaneously attempted to sabotage our [alignment]

Anthropics Latest Research on Alignment Faking

Natural emergent misalignment from reward hacking in production rl [pdf]

New Apple Study Shows LLMs Can Tell What You're Doing from Audio and Motion Data

AI trained on bacterial genomes produces never-before-seen proteins

MCP Apps just dropped (OpenAI and Anthropic collab) and I think this is huge

Lean4: How the theorem prover works and why it's the new competitive edge in AI

WorldGen – Text to Immersive 3D Worlds

A researcher details an LLM-based AI agent that “demonstrated a near-flawless ability” to bypass bot detection methods while answering online survey questions

Evolution Strategies at the Hyperscale

Cognitive Foundations for Reasoning and Their Manifestation in LLMs

U.S. Citizens and Chinese Nationals Arrested for Exporting AI Tech to China

Beyond Tokens in Language Models: Interpreting Activations through Text Genre Chunks

Google must double AI serving capacity every 6 months to meet demand, AI infrastructure boss Amin Vahdat tells employees

Compute Forecast (AI 2027)

MiMo-Embodied: X-Embodied Foundation Model Technical Report

What makes good reasoning data

MedBayes-Lite: Bayesian Uncertainty Quantification for Safe Clinical Decision Support

Show HN: Reverse Jailbreaking a Psychopathic AI via Identity Injection

Researchers say Russia-aligned Pravda network is engaging in “LLM grooming”, flooding the internet with disinformation to influence chatbots like ChatGPT

Early experiments in accelerating science with GPT-5

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter

Terence Tao: At the Erdos problem website, AI assistance now becoming routine

Bridging VLMs and Embodied Intelligence with Deliberate Practice Policy Optimization

Qwen 2.5 vl 72b is the new SOTA model on SpatialBench, beating Gemini 3 pro. A new benchmark to test spatial reasoning on vlms

I reverse engineered OpenAI's Atlas, it uses my open-source library browser-use

D-GARA: A Dynamic Benchmarking Framework for GUI Agent Robustness in Real-World Anomalies

Major N.L. healthcare report contains errors likely generated by A.I. $1.6 million Health Human Resources Plan from Deloitte cites research papers that don’t exist, making it the second major governme

A look at Indian startups like TuluAI, which are building LLMs for low-resource languages by creating data sets nearly from scratch with community involvement

mgrep: searching codebases with embeddings

Arctic-Extract Technical Report

SAM 3D: 3Dfy Anything in Images

Demis Hassabis Reveals Google's 'Secret' Behind Benchmark-Topping Gemini 3

Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation

Stories from November 23, 2025

Anthropic Reward Hacking Research

📡 AI NEWS BUT ACTUALLY GOOD