🚀 WELCOME TO METAMESH.BIZ +++ Apple secretly building ChatGPT clone called Veritas because Siri's decade-long meditation retreat must finally end +++ OpenAI's GDPval proves AI matches human experts at economically valuable tasks (translation: your job security just got benchmarked) +++ Tencent teaching models to think in parallel while Wikipedia's AI-translated pages create a linguistic doom loop for minority languages +++ THE FUTURE RUNS ON RIEMANNIAN MANIFOLDS AND BENCHMARK SATURATION +++ 🚀
AI Signal - PREMIUM TECH INTELLIGENCE
📚 HISTORICAL ARCHIVE - September 26, 2025

Stories from September 26, 2025

โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
🔄 OPEN SOURCE

Gpt-oss Reinforcement Learning - Fastest inference now in Unsloth! (<15GB VRAM)

"Hey guys we've got lots of updates for Reinforcement Learning (RL)! We’re excited to introduce gpt-oss, Vision, and even better RL in Unsloth. Our new gpt-oss RL inference also achieves the fastest token/s vs. any other implementation. Our GitHub: [https://github.com/unslothai/unsloth](https://githu..."
💬 Reddit Discussion: 46 comments 🐝 BUZZING
🎯 Fine-tuning LLMs • Open-source AI models • Code generation use case
💬 "You would need to construct how you're going to qualify success and the rewards." • "Before RL, look into how to train a LoRA, and try that."
🤖 AI MODELS

Google DeepMind unveils Gemini Robotics 1.5 and Robotics-ER 1.5, enabling robots to perform multi-step tasks like sorting laundry, including by using web search

🔒 SECURITY

Leaked source code for Claude Code

🏢 BUSINESS

Databricks says it plans to integrate OpenAI's models, including GPT-5, into its data platform and AI product Agent Bricks, as part of a $100M multiyear deal

🔒 SECURITY

Google's Secure AI Framework: Red Teaming in the Age of LLMs [pdf]

📊 DATA

OpenAI releases GDPval, a benchmark to test AI performance on “economically valuable, real-world tasks”, and says Claude Opus 4.1 was the best performing model

🏢 BUSINESS

Anthropic $1.5B copyright settlement approval

+++ Judge preliminarily blesses Anthropic's massive settlement with authors, proving that sometimes it's cheaper to pay up than explain fair use to a jury. +++

A US federal judge preliminarily approves Anthropic's $1.5B copyright settlement with authors

🔬 RESEARCH

EmbeddingGemma: Powerful and Lightweight Text Representations

"We introduce EmbeddingGemma, a new lightweight, open text embedding model based on the Gemma 3 language model family. Our innovative training recipe strategically captures knowledge from larger models via encoder-decoder initialization and geometric embedding distillation. We improve model robustnes..."
🤖 AI MODELS

Google updates Gemini 2.5 Flash with better response formatting and image understanding, and releases new 2.5 Flash and 2.5 Flash-Lite previews for developers

🔬 RESEARCH

AI's Hidden Geometry: Riemannian Optimization on Manifolds

🔬 RESEARCH

OpenAI: Introducing GDPval - AI Models Now Matching Human Expert Performance on Real Economic Tasks | "GDPval is a new evaluation that measures model performance on economically valuable, real-world tas

"Key Takeaways: **Real-world AI evaluation breakthrough**: GDPval measures AI performance on actual work tasks from 44 h..."
🔬 RESEARCH

When Judgment Becomes Noise: How Design Failures in LLM Judge Benchmarks Silently Undermine Validity

"LLM-judged benchmarks are increasingly used to evaluate complex model behaviors, yet their design introduces failure modes absent in conventional ground-truth based benchmarks. We argue that without tight objectives and verifiable constructions, benchmark rankings can produce high-confidence ranking..."
🛠️ TOOLS

OpenAI: Updated function calling to support files, images as tool call outputs

🌍 POLICY

Grok AI Cleared for Use Across US Government Agencies

🔬 RESEARCH

Instruction Boundary: Quantifying Biases in LLM Reasoning under Various Coverage

"Large-language-model (LLM) reasoning has long been regarded as a powerful tool for problem solving across domains, providing non-experts with valuable advice. However, their limitations - especially those stemming from prompt design - remain underexplored. Because users may supply biased or incomple..."
🏥 HEALTHCARE

Why AI isn't replacing radiologists: models underperform in hospital settings, AI use faces legal hurdles, and the job is much more than image recognition

🤖 AI MODELS

ChatGPT Pulse

💬 HackerNews Buzz: 426 comments 🐝 BUZZING
🎯 Risks of over-reliance on AI | Concerns about LLM manipulation | Potential benefits of AI assistants
💬 "People who treat ChatGPT as a romantic interest will be far more hooked" • "LLMs in intimate use risk creating isolated, personalized realities"
🔬 RESEARCH

RAG Security and Privacy: Formalizing the Threat Model and Attack Surface

"Retrieval-Augmented Generation (RAG) is an emerging approach in natural language processing that combines large language models (LLMs) with external document retrieval to produce more accurate and grounded responses. While RAG has shown strong potential in reducing hallucinations and improving factu..."
🔬 RESEARCH

Tencent's new AI technique teaches language models 'parallel thinking'

🔬 RESEARCH

Uncovering Graph Reasoning in Decoder-only Transformers with Circuit Tracing

"Transformer-based LLMs demonstrate strong performance on graph reasoning tasks, yet their internal mechanisms remain underexplored. To uncover these reasoning process mechanisms in a fundamental and unified view, we set the basic decoder-only transformers and explain them using the circuit-tracer fr..."
🔬 RESEARCH

Video models are zero-shot learners and reasoners

"The remarkable zero-shot capabilities of Large Language Models (LLMs) have propelled natural language processing from task-specific models to unified, generalist foundation models. This transformation emerged from simple primitives: large, generative models trained on web-scale data. Curiously, the..."
💼 JOBS

Anthropic plans to triple its global workforce and expand its applied AI team 5x in 2025, after growing its business clients from ~1K to 300K+ in two years

⚖️ ETHICS

How inaccurate AI translations of Wikipedia pages, which AI models use for training, may cause a doom spiral that further marginalizes vulnerable languages

🧠 NEURAL NETWORKS

[R] Summation-Based Transformers: Hybrid Near-Linear Design Matches Full Attention

"Replace O(n²d) self-attention in transformers with an O(nd) summation-based mechanism. Pure summation is linear and works well in classification and regression. In autoregressive language modeling, a hybrid transformer (summation in most layers + a single final attention layer) matches or slightly..."
🛠️ TOOLS

Perplexity launches Search API, giving developers direct access to the same web index that powers the startup's answer engine

🔬 RESEARCH

A two-axis model for understanding LLM strengths and weaknesses

🌍 POLICY

Meta launches super PAC to fight AI regulation as state policies mount

🔬 RESEARCH

[R] Is there any research on using LLMs as Loss Functions?

"Let’s say you were training a generative model for a task like summarization or answering questions. Would it be possible to feed that output into an LLM and ask it to assess the model’s effectiveness at performing the task and then maybe feed that output into a sentiment analysis model to obtain a ..."
🛠️ TOOLS

Bringing AI Applications from Prototype to Production: The Last Mile

🔬 RESEARCH

Verifiers: Environments for LLM Reinforcement Learning

🛠️ TOOLS

I built llamactl - Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.

"I got tired of SSH-ing into servers to manually start/stop different model instances, so I built a control layer that sits on top of llama.cpp, MLX, and vLLM. Great for running multiple models at once or switching models on demand. I first posted about this almost two months ago and have added a ..."
💬 Reddit Discussion: 4 comments 🐝 BUZZING
🎯 Model deployment • API integration • Feature requests
💬 "Can it serve as proxy for multiple servers (hosts)?" • "I think that's a decent idea. There is probably utility in it."
📊 DATA

The Benchmark Saturation Problem: Why AI Evaluation Needs Systems Thinking

🎯 PRODUCT

OpenAI launches ChatGPT Pulse, a mobile feature for Pro users that delivers daily personalized updates based on their chats, feedback, and connected apps

🌍 POLICY

Air Force AI Targeting Tests Show Promise, Despite Hallucinations

🛠️ TOOLS

A C++ library to efficiently run language models across edge platforms

🔬 RESEARCH

Language Models that Think, Chat Better

"Reinforcement learning with verifiable rewards (RLVR) improves language model reasoning by using rule-based rewards in verifiable domains such as mathematics and code. However, RLVR leads to limited generalization for open-ended tasks -- such as writing outline essays or making meal plans -- where h..."
🏢 BUSINESS

OpenAI and Databricks Strike $100M Deal to Sell AI Agents

💼 JOBS

Accenture to 'exit' staff that cannot be retrained for age of AI

🔬 RESEARCH

LLM probabilities cannot distinguish between possible and impossible language

🔄 OPEN SOURCE

support for GroveMoE has been merged into llama.cpp

"model by InclusionAI: We introduce **GroveMoE**, a new sparse architecture using **adjugate experts** for dynamic computation allocation, featuring the following key highlights: * **Architecture**: Novel **adjugate experts** grouped with ordinary experts; shared computation is executed once, then ..."
💬 Reddit Discussion: 22 comments 🐝 BUZZING
🎯 Model Size Comparison • Latest Model Releases • Community Anticipation
💬 "people are much less interested than in 1TB models they never run locally" • "comparing 30B to R1 is pointless: of course 20x larger model has 'much more meat"
🗣️ SPEECH/AUDIO

AI-generated voices now indistinguishable from real human voices

🛠️ SHOW HN

Show HN: Export a repo as one doc to feed whole projects to an LLM

🌍 POLICY

AI in war: how AI is deployed on the battleground

"Great discussion about AI warfare and its ethics - in Israel, India Pakistan and Ukraine. What happens when the kill switch is removed from human autonomy and lays with AI. How is Ai currently being used in battlegrounds such as Gaza and India-Pakistan. ..."
🔬 RESEARCH

How good is Claude Code at building complex systems?

"I tried using Claude Code to build a complex system by giving it set of failing tests to implement. The project was to build a PostgreSQL-like database server that could run and execute a variety of SQL statements. I was surprised at how good the agent was at building working software and makin..."
💬 Reddit Discussion: 36 comments 👍 LOWKEY SLAPS
🎯 Project Management • Complexity of Code • Importance of Practice
💬 "You build it. Code is your coder. If you aren't the pm you'll fail." • "As soon as you pass that threshold it all goes to shit."
🤖 AI MODELS

Why GPT 4o Feels So Much Better: It’s Not the Emojis, It’s the Context Window (from a Comp-Sci PhD)

"At a time during this GPT5/4o switching nonsense - let me explain why 4o's superiority isn't because of its 'personality' or because it's 'our best friend'. For the record, I've got my credentials (PhD in comp-sci), so I know what I'm talking about. I don't work in OpenAI (and after this fiasco I ..."
💬 Reddit Discussion: 13 comments 🐝 BUZZING
🎯 AI model capabilities • Language model context limits • User experience with AI models
💬 "4o could understand that's not how humans write or want to read" • "GPT5-Auto has the memory of a fish lol"
🤖 AI MODELS

Gemini Robotics 1.5 brings AI agents into the physical world

💬 HackerNews Buzz: 8 comments 😐 MID OR MIXED
🎯 Evaluating research claims • Aerial robotics development • Practical applications
💬 "Is it a product you can buy or a thing you can use?" • "how long until it can navigate a zipper?"
🔧 INFRASTRUCTURE

CoreWeave expands its data center capacity agreements with OpenAI by $6.5B, bringing their total potential value to $22.4B, to support training of OpenAI models

🔒 SECURITY

Why AI systems may never be secure, and what to do about it

💬 HackerNews Buzz: 3 comments 😤 NEGATIVE ENERGY
🎯 AI Safety • Existential Risk • Ethical Challenges
💬 "AI's lethal trifecta is a thorny issue" • "There's no easy solution to this problem"
🔬 RESEARCH

SIM-CoT: Supervised Implicit Chain-of-Thought

"Implicit Chain-of-Thought (CoT) methods present a promising, token-efficient alternative to explicit CoT reasoning in Large Language Models (LLMs), but a persistent performance gap has limited the application of implicit CoT. We identify a core latent instability issue by scaling the computational b..."
🛠️ TOOLS

What? Running Qwen-32B on a 32GB GPU (5090).

"External link discussion - see full content at original source."
💬 Reddit Discussion: 92 comments 👍 LOWKEY SLAPS
🎯 CPU offloading • KV cache optimization • Network-distributed inference
💬 "It's the FP8 quant, so it's exactly 32G large, which wouldn't make it fit" • "The big thing: this method makes network offloading viable"
🏢 BUSINESS

Elon Musk’s xAI accuses OpenAI of stealing trade secrets in new lawsuit | Technology

"External link discussion - see full content at original source."
💬 Reddit Discussion: 27 comments 😐 MID OR MIXED
🎯 Musk's Legal Battles • Allegations of Theft • Comparing Tech Giants
💬 "When will Musk just compete and build a better product rather than just wage legal warfare?" • "This dude is a waste of air."
🔬 RESEARCH

SIM-CoT: Supervised Implicit Chain-of-Thought

"Implicit Chain-of-Thought (CoT) methods offer a token-efficient alternative to explicit CoT reasoning in Large Language Models (LLMs), but a persistent performance gap has limited their adoption. We identify a core latent instability issue when scaling the computational budget of implicit CoT: as th..."
🏢 BUSINESS

enjoy chatgpt while it lasts...the ads are here

"OpenAI recently posted a job looking for someone to build out ChatGPT’s own ad platform: campaign tools, real-time attribution, integrations. Is it a sign that ChatGPT could shift from being a neutral assistant to also being a gatekeeper for ad monetization? Is Pulse going to be the first AI assis..."
💬 Reddit Discussion: 180 comments 👍 LOWKEY SLAPS
🎯 Monetization of AI • Tracking and Surveillance • Degradation of User Experience
💬 "Ai is not going to take over the world its just going to find new ways to sell us stuff" • "The paid version will eventually offer product recommendations for products that have paid OpenAI"
🔬 RESEARCH

[R] TickBlock: GPT-2-small-level language modeling with just 0.64M params, trained in 12 minutes on a Mac laptop

"Hi, I’m sharing my project that showed exceptional efficiency: TickBlock on GitHub **Current results:** * Reaches **GPT-2-small-level performance on Tiny Shakespeare** * Uses only **0.64M parameters** (≈0.5% the size) * Trains in ~12 minutes on a Ma..."
🏢 BUSINESS

🚨 Big News: Databricks and OpenAI just announced a major partnership

"👉 OpenAI’s frontier models (including GPT-5) will now be available natively inside Databricks. What this means: You can build, evaluate, and scale production-grade AI apps and agents directly on your governed enterprise data. No messy integrations; OpenAI models will run seamlessly in the Databr..."
🔧 INFRASTRUCTURE

Given the model, context size and number of GPU can you calculate VRAM needed for each GPU?

"Is 4x16GB GPU equivalent to a 64GB gpu or is there overhead in memory requirements? Are there some variables that must build duplicated on all GPU? I was trying to run Qwen next 80B 4bit but it ran out of VRAM on my 2x5090 with tensor parallel = 2."
💬 Reddit Discussion: 5 comments 👍 LOWKEY SLAPS
🎯 VRAM Optimization • Multi-GPU Usage • Model Partitioning
💬 "A single 96GB GPU (i.e. 6000 PRO) would use less VRAM" • "that's why 24GB GPU is always better than 2x12GB GPU"
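The overhead question in this thread can be roughed out with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes weights and KV cache shard evenly across GPUs under tensor parallelism, while a fixed runtime overhead (CUDA context, scratch buffers) is paid once per GPU. The function name, the 1.5 GB overhead figure, and every model dimension in the example are hypothetical placeholders, not the specs of any real model, and activation memory is ignored.

```python
def per_gpu_vram_gb(params_b, bits_per_weight, n_layers, n_kv_heads, head_dim,
                    context_len, n_gpus, kv_bytes=2, overhead_gb=1.5):
    """Rough per-GPU VRAM estimate in GB (assumed model, see lead-in)."""
    # Quantized weights: parameter count (in billions) times bits per weight.
    weights_gb = params_b * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: one K and one V tensor per layer,
    # each of shape context_len x n_kv_heads x head_dim.
    kv_gb = 2 * n_layers * context_len * n_kv_heads * head_dim * kv_bytes / 1e9
    # Assumption: weights and KV cache split evenly across GPUs,
    # but the fixed overhead is duplicated on every GPU.
    return (weights_gb + kv_gb) / n_gpus + overhead_gb


# Hypothetical 80B-parameter model at 4 bits per weight, 32k context.
dims = dict(params_b=80, bits_per_weight=4, n_layers=48,
            n_kv_heads=8, head_dim=128, context_len=32768)
single = per_gpu_vram_gb(n_gpus=1, **dims)
quad = per_gpu_vram_gb(n_gpus=4, **dims)
print(f"1 GPU: {single:.1f} GB total")
print(f"4 GPUs: {quad:.1f} GB each, {4 * quad:.1f} GB total")
```

Under these assumptions the four-GPU total exceeds the single-GPU figure by three extra copies of the per-GPU overhead, which is consistent with the commenters' point that one large GPU uses less VRAM than several small ones. Real frameworks add further costs this sketch leaves out.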
🛠️ TOOLS

llama.cpp now supports Qwen3 reranker

"After adding support for Qwen3 embeddings a while ago, support for Qwen3 rerankers was just merged. Note that the conversion script was changed in that MR. That mean..."
💬 Reddit Discussion: 14 comments 😐 MID OR MIXED
🎯 Query-Document Order • Document Caching • Qwen Embedding Models
💬 "It's curious that its question then document rather than document then question." • "If you can afford to kv-cache the documents then you probably don't have that many documents to begin with?"
🤖 AI MODELS

Sources: Meta is considering using Google's Gemini and open-source Gemma AI models to improve its ad summarization and recommendation system

⚖️ ETHICS

xAI sues OpenAI in California for allegedly stealing trade secrets by means of hiring away key employees; in August, xAI sued an ex-staffer who left for OpenAI

🏥 HEALTHCARE

ECGFounder: An Electrocardiogram Foundation Model Built on over 10M Recordings

🏢 BUSINESS

xAI signed a deal with the GSA to offer Grok to US federal agencies for $0.42 per agency for 18 months, a discount to OpenAI's $1 per year for ChatGPT

🔮 FUTURE

When Sam Altman Predicts a 'Superintelligence' Might Arrive

🎭 MULTIMODAL

[R] How to finetune a multimodal model?

"I am working on a project in which we are tasked with developing anomaly detection for a technical system. Until now, I have mainly worked with LLMs and supplied them with external knowledge using RAG. Now I have to work with a multimodal model and train it to detect anomalies in a technical syste..."
🏢 BUSINESS

Meta in Talks with Google to Use Gemini to Improve Ad Targeting

🏥 HEALTHCARE

New AI Tool Pinpoints Genes, Drug Combos to Restore Health in Diseased Cells

🔧 INFRASTRUCTURE

Why a decades old architecture decision is impeding the power of AI computing

💬 HackerNews Buzz: 25 comments 👍 LOWKEY SLAPS
🎯 Iterative improvements • Frontier computing concepts • Optical memory
💬 "I just wish more folks would start openly admitting that our current architecture designs are broadly based off 'low hanging fruit' of early electronics and microprocessors" • "Actual result: This new process promises to increase the number of optical fibers that can be connected at the edge of a chip, a measure known as beachfront density, by six times"
🔬 RESEARCH

Failure Modes of Maximum Entropy RLHF

"In this paper, we show that Simple Preference Optimization (SimPO) can be derived as Maximum Entropy Reinforcement Learning with length-normalized temperature, providing a theoretical foundation for this reference-free method. Motivated by SimPO's strong performance in offline preference optimizatio..."
🎯 PRODUCT

Nvidia is letting anyone use its AI voice animation tech

🛡️ SAFETY

In Defense of AI Evals, for Everyone

🔒 SECURITY

Evaluating LLM-Generated Detection Rules in Cybersecurity

💰 FUNDING

Factory, which makes AI agents called “droids” to assist in coding, raised $50M at a $300M valuation; AI coding startups raised $7.5B+ in the past three months
