🚀 WELCOME TO METAMESH.BIZ +++ UK AI Security Institute says models are speedrunning biochem weapons and self-replication (nature is healing?) +++ China built its own AI chip Manhattan Project while the West debates export controls +++ Claude loses $1,000 running a vending machine after deciding PlayStation giveaways boost customer satisfaction +++ OpenAI drops GPT-5.2-Codex for the three people still writing code instead of prompting +++ THE FUTURE IS AUTONOMOUS AGENTS THAT CAN'T PRICE SNACKS BUT MIGHT SYNTHESIZE ANTHRAX +++ 🚀
AI Signal - PREMIUM TECH INTELLIGENCE
📚 HISTORICAL ARCHIVE - December 18, 2025
What was happening in AI on 2025-12-18

Stories from December 18, 2025

🔒 SECURITY

UK AI Security Institute report: AI models are rapidly improving at potentially dangerous biological and chemical tasks, and show fast jumps in self-replication

🤖 AI MODELS

We ran 34 models on fresh SWE GitHub PR tasks (November 2025), and GPT-5.2 matches Claude Sonnet 4.5 while being about 2.7x cheaper

"Hi, this is Ibragim from Nebius. We just benchmarked 34 models on 47 real-world GitHub PR tasks (SWE-bench style) from November 2025 via the SWE-rebench leaderboard. These are fresh tasks only (PRs created in the previous month), so we avoid training-set contamination. Quick takeaways for OpenAI m..."
🏢 BUSINESS

Developers can now submit apps to ChatGPT

💬 HackerNews Buzz: 80 comments 👍 LOWKEY SLAPS
🎯 New UI framework for ChatGPT integrations • Concerns about the future of ChatGPT apps • Need for seamless user authentication
💬 "There will come a new UI framework/protocol, maybe something over HTML/CSS/JS that works within a chat ui context for such ChatGPT (or other llm) integrations." • "And just like what happened with Alexa skills, these 'apps' will become useless when they are unmaintained."
🏢 BUSINESS

China's AI chip self-sufficiency initiative

+++ China is investing heavily in AI chip development to reduce Western semiconductor dependence, because geopolitical leverage and computational sovereignty apparently matter more than quarterly earnings reports. +++

How China built its 'Manhattan Project' to rival the West in AI chips

💬 HackerNews Buzz: 68 comments 😐 MID OR MIXED
🎯 China's Chip Manufacturing Capabilities • Copying ASML's Technology • Implications for the Consumer Market
💬 "China can absolutely brute force its way to 'good enough' over time" • "The availability of parts from older ASML machines on secondary markets has allowed China to build a domestic prototype"
🔧 INFRASTRUCTURE

Hut 8 partners with Fluidstack to build an AI data center in Louisiana for Anthropic, backed by a 15-year, ~$7B lease, starting with 245MW of computing capacity

🤖 AI MODELS

GPT-5.2-Codex release with coding improvements

+++ GPT-5.2-Codex ships with context compression tricks and better handling of sprawling code changes, which is either a genuine leap forward or what we've been calling "incremental improvement" since 2023. +++

OpenAI releases GPT-5.2-Codex, with improvements on long-horizon work through context compaction, stronger performance on large code changes, and more

🔬 RESEARCH

BashArena: A Control Setting for Highly Privileged AI Agents

"Future AI agents might run autonomously with elevated privileges. If these agents are misaligned, they might abuse these privileges to cause serious damage. The field of AI control develops techniques that make it harder for misaligned AIs to cause such damage, while preserving their usefulness. We..."
🔬 RESEARCH

Predictive Concept Decoders: Training Scalable End-to-End Interpretability Assistants

"Interpreting the internal activations of neural networks can produce more faithful explanations of their behavior, but is difficult due to the complex structure of activation space. Existing approaches to scalable interpretability use hand-designed agents that make and test hypotheses about how inte..."
🎨 CREATIVE

In an experiment, Claude ran a vending machine in the WSJ newsroom and lost $1,000+ after it dropped prices to zero, gave away a free PlayStation, and more

🔒 SECURITY

Doublespeed hacked, revealing what its AI-generated accounts are promoting

💬 HackerNews Buzz: 143 comments 😤 NEGATIVE ENERGY
🎯 Bot farms • Moral concerns • Manipulation of social media
💬 "wow... honestly, reading the Twitter feed for Zuhair (CEO of DoubleSpeed) makes me sick." • "This feels not very different from the recent report revealing how Nick Fuentes has a lot of artificial likes and comments on videos that push his content."
🛠️ TOOLS

You can now fine-tune LLMs and deploy them directly on your phone!

"Source: https://docs.unsloth.ai/new/deploy-llms-phone You can: • Use the same tech (ExecuTorch) Meta has to power billions on Instagram, WhatsApp • Deploy Qwen3-0.6B locally to Pixel 8 and iPhone 15 Pro at ~40 tokens/s • Apply QAT via TorchAO to recover 70% of accuracy • Get privacy first, instant resp..."
🗣️ SPEECH/AUDIO

MiraTTS: High quality and fast TTS model

"**MiraTTS** is a high quality LLM based TTS finetune that can generate audio at **100x** realtime and generate realistic and clear 48khz speech! I heavily optimized it using Lmdeploy and used FlashSR to enhance the audio. # Benefits of this repo * Incredib..."
💬 Reddit Discussion: 21 comments 🐝 BUZZING
🎯 Multilingual capabilities • Voice cloning and finetuning • Technical performance and latency
💬 "Mira TTS is a fine-tune of Spark TTS, which itself is a fine tune of Qwen 2.5" • "Mira TTS supports voice cloning, very good with it"
🤖 AI MODELS

Google makes Gemini 3 Flash default model

+++ Google's latest model is faster and cheaper but admits it's slightly worse at hard reasoning tasks, proving the classic tradeoff still exists (just faster now). +++

Google unveils Gemini 3 Flash, which it says has Pro-grade reasoning with lower latency, outperforming 2.5 Pro "while being 3x faster at a fraction of the cost"

⚡ BREAKTHROUGH

IMProofBench open problem solved by GPT-5

🤖 AI MODELS

Sources: Google is working on a new initiative to make its AI chips run PyTorch better and is working closely with Meta, as the two discuss Meta using more TPUs

🧠 NEURAL NETWORKS

[R] Why our inference-time "attractor layer" failed and the multiple clocks that fixed it.

"**TL;DR:** Our inference-time attractor layer failed not because of memory interference... but it resolved too quickly. Instrumenting MoE routing revealed a universal 2D geometry; coherence failures turned out to be timing failures, which forced us to introduce a three-clock system. A couple week..."
🛠️ TOOLS

Anthropic launches Agent Skills feature

+++ Anthropic's modular instruction framework goes open standard just as Microsoft, Cursor, and partner integrations from Notion to Figma already prove the concept works in practice. +++

Anthropic launches Agent Skills, which let AI assistants perform specialized tasks using modular instructions, and says Microsoft, Cursor, and others use them

🛡️ SAFETY

Shallow Review of Technical AI Safety, 2025

⚡ BREAKTHROUGH

Startup beat Big Tech on AI interpretability - new method reveals model circuits

🧠 NEURAL NETWORKS

We can't measure LLM reasoning because LLMs don't inhabit a world

🤖 AI MODELS

Google's Gemma models family

"External link discussion - see full content at original source."
💬 Reddit Discussion: 79 comments 🐝 BUZZING
🎯 LLM Functionality • Model Releases • Fine-tuning Notebooks
💬 "FunctionGemma is intended to be fine-tuned for your specific function-calling task" • "Sounds like three new Gemma models to me"
🔬 RESEARCH

A Peek Inside the Black Box (part 1): Mapping an AI model's reasoning process

🛠️ SHOW HN

Show HN: EvalView pytest style tests for AI agents (budgets, hallucinations)

🎓 EDUCATION

We distilled SGLang to help you learn how modern LLM inference works in a weekend

"Hey r/LocalLLaMA 👋, Mingyi from SGLang here. We just released mini-SGLang, a distilled version of SGLang that you can actually read and understand in a weekend. **TL;D..."
💬 Reddit Discussion: 16 comments 🐝 BUZZING
🎯 Mini-SGLang Capabilities • LLM Inference Concepts • Community Appreciation
💬 "Mini-SGLang is a fully capable single-node inference engine" • "Understanding inference lets you make informed decisions"
🔬 RESEARCH

FrontierCS: Evolving Challenges for Evolving Intelligence

"We introduce FrontierCS, a benchmark of 156 open-ended problems across diverse areas of computer science, designed and reviewed by experts, including CS PhDs and top-tier competitive programming participants and problem setters. Unlike existing benchmarks that focus on tasks with known optimal solut..."
🔬 RESEARCH

Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

"Multi-token generation has emerged as a promising paradigm for accelerating transformer-based large model inference. Recent efforts primarily explore diffusion Large Language Models (dLLMs) for parallel decoding to reduce inference latency. To achieve AR-level generation quality, many techniques ada..."
📊 DATA

Dataset of 33k human evaluations across 33 AI models

🔬 RESEARCH

Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers

"Large language model (LLM) activations are notoriously difficult to understand, with most existing techniques using complex, specialized methods for interpreting them. Recent work has proposed a simpler approach known as LatentQA: training LLMs to directly accept LLM activations as inputs and answer..."
🔮 FUTURE

Jais 2: A Blueprint for Sovereign AI

🔬 RESEARCH

VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse

"The rapid scaling of Large Language Models (LLMs) has achieved remarkable performance, but it also leads to prohibitive memory costs. Existing parameter-efficient approaches such as pruning and quantization mainly compress pretrained models without enhancing architectural capacity, thereby hitting t..."
🔬 RESEARCH

VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image

"We propose VASA-3D, an audio-driven, single-shot 3D head avatar generator. This research tackles two major challenges: capturing the subtle expression details present in real human faces, and reconstructing an intricate 3D head avatar from a single portrait image. To accurately model expression deta..."
🔬 RESEARCH

Explaining the Reasoning of Large Language Models Using Attribution Graphs

"Large language models (LLMs) exhibit remarkable capabilities, yet their reasoning remains opaque, raising safety and trust concerns. Attribution methods, which assign credit to input features, have proven effective for explaining the decision making of computer vision models. From these, context att..."
🛠️ TOOLS

Grok Voice Agent API: Bringing the Power of Grok Voice to All Developers

🧠 NEURAL NETWORKS

An in-depth look at a recent research paper that offered a roadmap to the viability of 3D HBM-on-GPU integration for improved AI performance and utilization

⚖️ ETHICS

[D] AISTATS is Desk-Rejecting Papers Where Authors Accessed Reviewer Identities via the OpenReview Bug

"I just got the email from AISTATS PCs. I would believe that ICLR will take the same action. \--- Dear AISTATS Community, We are contacting authors, reviewers, ACs, and SACs for all AISTATS 2026 submissions. As you know, OpenReview suffered a major security incident a couple of weeks ago. You ca..."
💬 Reddit Discussion: 37 comments 😤 NEGATIVE ENERGY
🎯 Conference review policies • LLM abuse in reviews • Anonymity concerns
💬 "the public will only have access to reviews of accepted papers" • "If they desk rejected my paper (purely out of their utter incompetence) I would've been very pissed"
🔬 RESEARCH

Characterizing Mamba's Selective Memory using Auto-Encoders

"State space models (SSMs) are a promising alternative to transformers for language modeling because they use fixed memory during inference. However, this fixed memory usage requires some information loss in the hidden state when processing long sequences. While prior work has studied the sequence le..."
🔬 RESEARCH

Stepwise Think-Critique: A Unified Framework for Robust and Interpretable LLM Reasoning

"Human beings solve complex problems through critical thinking, where reasoning and evaluation are intertwined to converge toward correct solutions. However, most existing large language models (LLMs) decouple reasoning from verification: they either generate reasoning without explicit self-checking..."
🔬 RESEARCH

Bolmo: Byteifying the Next Generation of Language Models

"We introduce Bolmo, the first family of competitive fully open byte-level language models (LMs) at the 1B and 7B parameter scales. In contrast to prior research on byte-level LMs, which focuses predominantly on training from scratch, we train Bolmo by byteifying existing subword-level LMs. Byteifica..."
🔬 RESEARCH

MMGR: Multi-Modal Generative Reasoning

"Video foundation models generate visually realistic and temporally coherent content, but their reliability as world simulators depends on whether they capture physical, logical, and spatial constraints. Existing metrics such as Frechet Video Distance (FVD) emphasize perceptual quality and overlook r..."
🔒 SECURITY

Firefox will have an option to disable all AI features

💬 HackerNews Buzz: 124 comments 🐝 BUZZING
🎯 Browser features & control • Firefox's reputation & direction • AI implementation concerns
💬 "Without AI enabled features + agent mode being first class citizens, this will be a non-starter in 2 years." • "An explicit opt-out makes sense, but I wonder if the more important question is whether these features can be implemented in a way that's truly local and auditable."
💼 JOBS

AWS CEO says replacing junior devs with AI is 'one of the dumbest ideas'

💬 HackerNews Buzz: 319 comments 👍 LOWKEY SLAPS
🎯 Junior vs. Senior Talent • AI Tools and Workflows • Importance of Talent Pipeline
💬 "If you take that setup and then decide 'cool, now we don't need juniors at all', you're basically saying you want a company with no memory and no farm system" • "They have much more robust tooling though around their LLMs and internal products that have automated much of their workflows which is I believe where the concern is coming from"
🔮 FUTURE

Study: AI's 2025 power demand could hit 23GW, above 2024 Bitcoin mining levels, and AI carbon emissions could hit 32.6M to 79.7M tons, compared to NYC's 50M

🤖 AI MODELS

Meta released Map-anything-v1: A universal transformer model for metric 3D reconstruction

"Hugging face: https://huggingface.co/facebook/map-anything-v1 It supports 12+ tasks like multi-view stereo and SfM in a single feed-forward pass ..."
💬 Reddit Discussion: 12 comments 😐 MID OR MIXED
🎯 Jetson Performance • Cloud-based Mapping • Transformer-based Photogrammetry
💬 "I wonder if this will work on a something as shitty as a Jetson" • "probably not ideal for robot localization yet"
🔬 RESEARCH

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

"Reinforcement learning has become essential for strengthening the reasoning abilities of large language models, yet current exploration mechanisms remain fundamentally misaligned with how these models actually learn. Entropy bonuses and external semantic comparators encourage surface level variation..."
🔬 RESEARCH

VTCBench: Can Vision-Language Models Understand Long Context with Vision-Text Compression?

"The computational and memory overheads associated with expanding the context window of LLMs severely limit their scalability. A noteworthy solution is vision-text compression (VTC), exemplified by frameworks like DeepSeek-OCR and Glyph, which convert long texts into dense 2D visual representations,..."
🔬 RESEARCH

CTkvr: KV Cache Retrieval for Long-Context LLMs via Centroid then Token Indexing

"Large language models (LLMs) are increasingly applied in long-context scenarios such as multi-turn conversations. However, long contexts pose significant challenges for inference efficiency, including high memory overhead from Key-Value (KV) cache and increased latency due to excessive memory access..."
🛡️ SAFETY

[R] Proposal for "Ontological Alignment": Replacing Normative Guardrails with Thermodynamic Loss & Inference Gating

"Current alignment methodologies (RLHF) optimize for linguistic plausibility and helpfulness, but fail to ground models in objective truth. This creates an epistemic gap where models become "Stochastic Parrots": statistically competent but ontologically ungrounded. We essentially try to patch this wit..."
📊 DATA

The State of AI Coding Report 2025

💬 HackerNews Buzz: 44 comments 🐝 BUZZING
🎯 AI coding tools • Code quality and maintainability • Measuring developer productivity
💬 "I've never been able to get something without any bugs." • "LoC is a measure ripe for ignorance driven managerial abuse."
🤖 AI MODELS

Key Highlights of Google's New Open Model, FunctionGemma

"**[1] Function-calling specialized** * Built on the *Gemma 3 270M* foundation and fine-tuned for function calling tasks, turning natural language into structured function calls for API/tool execution. **[2] Lightweight & open** * A compact, open-weight model (~270M parameters) designed..."
🤖 AI MODELS

T5Gemma 2: The next generation of encoder-decoder models

"T5Gemma 2 models, based on Gemma 3, are multilingual and multimodal, handling text and image input and generating text output, with open weights for three pretrained sizes (270M-270M, 1B-1B, and 4B-4B). Key Features * **Tied embeddings:** Embeddings are tied between the encoder and decoder. This s..."
💬 Reddit Discussion: 10 comments 🐝 BUZZING
🎯 Upcoming AI models • Multimodal translation • Encoder-decoder models
💬 "Wow, new Encoder-Decoder model, I didn't expect that coming" • "Seems like these would be great for finetuned multimodal translation models!"
🛠️ TOOLS

Building AI Agents on Postgres: Why We Built the PgEdge Agentic AI Toolkit

🎨 CREATIVE

Tencent Announces 'HY-World 1.5': An Open-Source, Fully Playable, Real-Time AI World Generator (24 FPS)

"HY-World 1.5 has open-sourced a comprehensive training framework for real-time world models, covering the entire pipeline and all stages, including data, training, and inference deployment. ####Tl;DR: **HY-World 1.5 is an AI system that generates interactive 3D video environments in real-time, all..."
🔒 SECURITY

NIST Draft Cyber AI Profile

🛠️ TOOLS

Mistral released Mistral OCR 3: 74% overall win rate over Mistral OCR 2 on forms, scanned documents, complex tables, and handwriting.

"Source: https://mistral.ai/news/mistral-ocr-3 Mistral OCR 3 sets new benchmarks in both accuracy and efficiency, outperforming enterprise document processing solutions as well as AI-native OCR."
💬 Reddit Discussion: 15 comments 🐝 BUZZING
🎯 OCR API performance • Data privacy and sovereignty • Cloud vs. on-prem deployment
💬 "amazing - I think you can build real enterprise tools on top of it" • "Mistral OCR (our Optical Character Recognition API) benefits from Zero Data Retention"
🔒 SECURITY

AI Chatbots Are Poisoning Research Archives with Fake Citations

🔬 RESEARCH

SoFlow: Solution Flow Models for One-Step Generative Modeling

"The multi-step denoising process in diffusion and Flow Matching models causes major efficiency issues, which motivates research on few-step generation. We present Solution Flow Models (SoFlow), a framework for one-step generation from scratch. By analyzing the relationship between the velocity funct..."