+++ WELCOME TO METAMESH.BIZ +++ Agentic systems cracking ARC-AGI while teen mental health chatbots can't crack basic warning signs (therapeutic breakthrough pending) +++ Stanford's ACE framework proves your local LLM can match GPT-4 if you just let it learn from its mistakes like a proper intern +++ White House drafting orders to sue states for AI regulation because federal preemption is the new federalism +++ Allen Institute's Olmo 3 joining the "we're better than Llama" support group while Meta ships SAM 3 for when you need AI to know where your cat ends and your couch begins +++ THE MACHINES ARE LEARNING TO LEARN WHILE WE'RE STILL LEARNING TO REGULATE +++
+++ Meta upgraded Segment Anything from "click pixels" to "describe what you want" across images and video, proving that foundation models work better when you stop making users think like programmers. +++
๐ฌ "This feels like a seminal moment for computer vision."
โข "It feels really magical to go from an unlabeled video to a fine-tuned realtime segmentation model with minimal human intervention in just a few minutes."
"**Abstract**: *We present Segment Anything Model (SAM) 3, a unified model that detects, segments, and tracks objects in images and videos based on concept prompts, which we define as either short noun phrases (e.g., โyellow school busโ), image exemplars, or a combination of both. Promptable Concept ..."
💬 Reddit Discussion: 19 comments
🐝 BUZZING
🎯 Model Evolution • Prompting Capabilities • Tracking Performance
💬 "It's a shame that Meta laid off some of the people on this team."
• "Insane how fast SAM is evolving."
🎯 Rapid prototyping and distillation • Transformative potential of SAM3 • Challenges of deploying SAM3
💬 "This feels like a seminal moment for computer vision."
• "You can use the big, powerful, expensive SAM3 model to create a dataset to train the small, fast, cheap RF-DETR model."
"Metaโs Segment Anything Model 3 (SAM 3) is a 848M parameter vision foundation model that upgrades Segment Anything from promptable visual segmentation to Promptable Concept Segmentation, unifying image and video detection, segmentation and tracking from text prompts, exemplars, points and boxes. Tra..."
"I implemented Stanford's Agentic Context Engineering paper. The framework makes agents learn from their own execution feedback through in-context learning instead of fine-tuning.
**How it works:**
Agent runs task → reflects on what worked/failed → curates strate..."
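The truncated loop above maps onto a small amount of code. A minimal sketch, assuming hypothetical `llm(prompt)` and `run_task(task, context)` helpers; the point is that the learned "playbook" lives in the context window, not in the weights:

```python
# Minimal sketch of the ACE-style loop described above. llm() and
# run_task() are hypothetical helpers, not a real API.

def ace_loop(task: str, llm, run_task, rounds: int = 3) -> list[str]:
    playbook: list[str] = []  # curated strategies carried across attempts
    for _ in range(rounds):
        context = "Strategies learned so far:\n" + "\n".join(playbook)
        result = run_task(task, context)           # generate: attempt the task
        reflection = llm(                          # reflect: mine the feedback
            f"Task: {task}\nOutcome: {result.feedback}\n"
            "What worked, what failed, and what rule should we keep?"
        )
        playbook.append(reflection)                # curate: grow the context
        if result.success:
            break
    return playbook
```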
via Arxiv 👤 Keya Hu, Ali Cy, Linlu Qiu et al. 📅 2025-11-18
⚡ Score: 7.0
"The Abstraction and Reasoning Corpus (ARC) is designed to promote research on abstract reasoning, a fundamental aspect of human intelligence. Common approaches to ARC treat it as a language-oriented problem, addressed by large language models (LLMs) or recurrent reasoning models. However, although t..."
via Arxiv 👤 Jing Bi, Filippos Bellos, Junjia Guo et al. 📅 2025-11-19
⚡ Score: 7.0
"Test-time thinking (that is, generating explicit intermediate reasoning chains) is known to boost performance in large language models and has recently shown strong gains for large vision language models (LVLMs). However, despite these promising results, there is still no systematic analysis of how..."
via Arxiv 👤 Medha Kumar, Zifei Xu, Xin Wang et al. 📅 2025-11-19
⚡ Score: 6.9
"Strong reasoning capabilities can now be achieved by large-scale reinforcement learning (RL) without any supervised fine-tuning. Although post-training quantization (PTQ) and quantization-aware training (QAT) are well studied in the context of fine-tuning, how quantization impacts RL in large reason..."
"I implemented the code execution mode that Anthropic talked about in a recent blog post. Here is how it works.
Basically I build a docker container with Claude code and a configured MCP server inside it. I had Claude create a wrapper.py script that essentially accepts TCP or http connection and us..."
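The post is cut off, but the described shape is reconstructable: a small TCP server inside the container that bridges each connection to the MCP server's stdio. A hypothetical sketch; the module name, port, and one-line-per-response framing are assumptions, not the actual wrapper.py:

```python
# wrapper.py -- hypothetical reconstruction of the truncated description:
# accept a TCP connection and bridge it to an MCP server speaking
# newline-delimited JSON-RPC over stdio.
import socketserver
import subprocess

class MCPBridge(socketserver.StreamRequestHandler):
    def handle(self):
        # One MCP server process per connection keeps sessions isolated.
        proc = subprocess.Popen(
            ["python", "-m", "my_mcp_server"],            # assumed server entrypoint
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        )
        try:
            for line in self.rfile:                       # JSON-RPC request lines
                proc.stdin.write(line)
                proc.stdin.flush()
                self.wfile.write(proc.stdout.readline())  # one-line responses assumed
        finally:
            proc.terminate()

if __name__ == "__main__":
    with socketserver.TCPServer(("0.0.0.0", 9000), MCPBridge) as srv:
        srv.serve_forever()
```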
via Arxiv 👤 Kevin Qinghong Lin, Siyuan Hu, Linjie Li et al. 📅 2025-11-19
⚡ Score: 6.9
"Computer-Use Agents (CUA) are becoming increasingly capable of autonomously operating digital environments through Graphical User Interfaces (GUI). Yet, most GUI remain designed primarily for humans--prioritizing aesthetics and usability--forcing agents to adopt human-oriented behaviors that are unn..."
via Arxiv 👤 Yushi Huang, Zining Wang, Zhihang Yuan et al. 📅 2025-11-19
⚡ Score: 6.8
"Mixture-of-Experts (MoE) Multimodal large language models (MLLMs) excel at vision-language tasks, but they suffer from high computational inefficiency. To reduce inference overhead, expert skipping methods have been proposed to deactivate redundant experts based on the current input tokens. However,..."
via Arxiv 👤 Tao Yang, Dandan Huang, Yunting Lin et al. 📅 2025-11-18
⚡ Score: 6.8
"Rare diseases affect hundreds of millions worldwide, yet diagnosis often spans years. Convectional pipelines decouple noisy evidence extraction from downstream inferential diagnosis, and general/medical large language models (LLMs) face scarce real world electronic health records (EHRs), stale domai..."
via Arxiv 👤 Alexis Audran-Reiss, Jordi Armengol Estapé, Karen Hambardzumyan et al. 📅 2025-11-19
⚡ Score: 6.7
"AI research agents offer the promise to accelerate scientific progress by automating the design, implementation, and training of machine learning models. However, the field is still in its infancy, and the key factors driving the success or failure of agent trajectories are not fully understood. We..."
via Arxiv 👤 Ali Amin, Raichelle Aniceto, Ashwin Balakrishna et al. 📅 2025-11-18
⚡ Score: 6.7
"We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL). We present a general-purpose method, RL with Experience and Corrections via Advantage-conditioned Policies (RECAP), that provides for RL training of VLAs via advantage conditi..."
via Arxiv 👤 Sirui Chen, Mengshi Zhao, Lei Xu et al. 📅 2025-11-19
⚡ Score: 6.7
"Recent advances in large language models (LLMs) have greatly improved their reasoning and decision-making abilities when deployed as agents. Richer reasoning, however, often comes at the cost of longer chain of thought (CoT), hampering interaction efficiency in real-world scenarios. Nevertheless, th..."
via Arxiv 👤 Yicheng He, Chengsong Huang, Zongxia Li et al. 📅 2025-11-19
⚡ Score: 6.6
"Reinforcement learning (RL) provides a principled framework for improving Vision-Language Models (VLMs) on complex reasoning tasks. However, existing RL approaches often rely on human-annotated labels or task-specific heuristics to define verifiable rewards, both of which are costly and difficult to..."
🎯 AI model capabilities • Challenges with AI code generation • Comparison of Codex and Claude
💬 "Codex is extremely, painfully, doggedly persistent in following every last character of them"
• "Hallucinations and ignored requirements are big problems that are very annoying to deal with"
via Arxiv 👤 Chia-Yu Hung, Navonil Majumder, Haoyuan Deng et al. 📅 2025-11-18
⚡ Score: 6.1
"Vision--language--action (VLA) models have recently shown promising performance on a variety of embodied tasks, yet they still fall short in reliability and generalization, especially when deployed across different embodiments or real-world environments. In this work, we introduce NORA-1.5, a VLA mo..."