🚀 WELCOME TO METAMESH.BIZ +++ CALM ditches token-by-token prediction for continuous vectors because apparently transformers weren't abstract enough already +++ AI coding ability doubling every six months but still can't handle your legacy codebase without a "messiness tax" +++ GLM-4.5V matching Claude at computer control while being fully open source (the revolution will be locally hosted) +++ Someone implemented GPT entirely in vanilla Python just to prove PyTorch is optional +++ THE MESH DOESN'T NEED FRAMEWORKS, JUST PURE MATHEMATICAL STUBBORNNESS +++ 🚀 •
AI Signal - PREMIUM TECH INTELLIGENCE
📚 HISTORICAL ARCHIVE - November 05, 2025
What was happening in AI on 2025-11-05
Archive from: 2025-11-05 | Preserved for posterity ⚡

Stories from November 05, 2025

🔬 RESEARCH

Optimizing AI Agent Attacks With Synthetic Data

"As AI deployments become more complex and high-stakes, it becomes increasingly important to be able to estimate their risk. AI control is one framework for doing so. However, good control evaluations require eliciting strong attack policies. This can be challenging in complex agentic environments wh..."
📊 DATA

A profile of nonprofit Common Crawl, which has scraped billions of webpages since 2013, including paywalled ones, to build an archive used by OpenAI and others

🔒 SECURITY

OpenAI book piracy and dataset deletion

+++ As copyright lawsuits loom, OpenAI reportedly scrubbed pirated training data and internal discussions about doing so, raising questions about whether "we didn't know" still works when the evidence conveniently vanishes. +++

OpenAI pirated large numbers of books and used them to train models. OpenAI then deleted the dataset with the pirated books, and employees sent each other messages about doing so. A lawsuit could no

💬 Reddit Discussion: 228 comments 😐 MID OR MIXED
🎯 AI Copyright Infringement • Homoerotic Fiction • Censorship of Knowledge
💬 "This is a Zuckerberg lawsuit moment" • "We'll see where things land long term"
🔬 RESEARCH

Kosmos: An AI Scientist for Autonomous Discovery

"Data-driven scientific discovery requires iterative cycles of literature search, hypothesis generation, and data analysis. Substantial progress has been made towards AI agents that can automate scientific research, but all such agents remain limited in the number of actions they can take before losi..."
🤖 AI MODELS

Research: AI's ability to complete lengthy software engineering tasks has doubled roughly every six months, but there is a “messiness tax” for real-world tasks
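
For a sense of scale, here is a back-of-envelope sketch of what doubling every six months implies, assuming clean exponential growth. The one-hour starting horizon and the `task_horizon` helper are made up for illustration and do not come from the research; the "messiness tax" is not modelled.

```python
def task_horizon(initial_hours: float, months_elapsed: float,
                 doubling_months: float = 6.0) -> float:
    """Task length a model can handle after months_elapsed,
    assuming it doubles every doubling_months (an idealised curve)."""
    return initial_hours * 2 ** (months_elapsed / doubling_months)


# Hypothetical starting point: tasks that take one hour today.
for months in (0, 6, 12, 24):
    print(f"after {months:2d} months: {task_horizon(1.0, months)} hours")
```

Under this assumption, two years of steady doubling is a 16x increase in task length.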

⚡ BREAKTHROUGH

Continuous Autoregressive Language Models (CALM)

+++ CALM trades token-by-token generation for continuous vector prediction via autoencoders, claiming 99.9% reconstruction accuracy. Whether this actually speeds things up remains refreshingly unspecified. +++

Instead of predicting one token at a time, CALM (Continuous Autoregressive Language Models) predicts continuous vectors that represent multiple tokens at once

"Continuous Autoregressive Language Models (CALM) replace the traditional token-by-token generation of language models with a continuous next-vector prediction approach, where an autoencoder compresses chunks of multiple tokens into single continuous vectors that can be reconstructed with over 99.9% ..."
💬 Reddit Discussion: 6 comments 🐝 BUZZING
🎯 Efficient language models • Continuous vector prediction • Tradeoffs in model release
💬 "overcoming this bottleneck requires a new design axis for LLM scaling" • "next-vector prediction as a powerful and scalable pathway towards ultra-efficient language models"
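
A toy sketch of the chunking idea, under loud assumptions: `encode` and `decode` below are hand-written stand-ins, not a learned autoencoder and not the paper's method, and `K` and `VOCAB_SIZE` are hypothetical values. It only illustrates that packing K tokens into one vector cuts the number of autoregressive steps by roughly a factor of K.

```python
K = 4              # tokens per chunk (hypothetical)
VOCAB_SIZE = 1000  # vocabulary size (hypothetical)


def encode(chunk):
    """Stand-in for a learned encoder: up to K token ids -> one float vector."""
    return [tok / VOCAB_SIZE for tok in chunk]


def decode(vector):
    """Stand-in for the decoder: recover the token ids from the vector."""
    return [round(x * VOCAB_SIZE) for x in vector]


def to_vectors(tokens, k=K):
    """Split a token sequence into chunks of k and encode each chunk."""
    return [encode(tokens[i:i + k]) for i in range(0, len(tokens), k)]


tokens = [17, 256, 3, 999, 42, 8, 512, 77, 5, 640]
vectors = to_vectors(tokens)
restored = [tok for vec in vectors for tok in decode(vec)]

print(len(tokens), "token-level steps vs", len(vectors), "vector-level steps")
print("round trip recovers the tokens:", restored == tokens)
```

A real system would replace the stand-ins with a trained autoencoder and have the language model predict the next vector rather than the next token.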
🤖 AI MODELS

GLM-4.5V model for local computer use

"On OSWorld-V, it scores 35.8% - beating UI-TARS-1.5, matching Claude-3.7-Sonnet-20250219, and setting SOTA for fully open-source computer-use models. Run it with Cua either: Locally via Hugging Face Remotely via OpenRouter Github : https://github.com/trycua Docs + examples: https://docs.trycua.co..."
🤖 AI MODELS

NanoAgent - A 135M Agentic LLM with Tool Calling That Runs on CPU

"Hey everyone! I’m excited to share **NanoAgent**, a **135M parameter**, **8k context** open-source model fine-tuned for **agentic tasks** - tool calling, instruction following, and lightweight reasoning - all while being tiny enough (~135 MB in 8-bit) to run on a **CPU or laptop**. **Highlights:**..."
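
The "~135 MB in 8-bit" figure follows from simple arithmetic: one byte per parameter. A minimal sketch, assuming only the weights count (no activations, KV cache, or file-format overhead); `weight_size_mb` is a hypothetical helper, not part of NanoAgent.

```python
def weight_size_mb(num_params: int, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6


params = 135_000_000  # parameter count quoted in the post
for bits in (32, 16, 8):
    print(f"{bits}-bit weights: about {weight_size_mb(params, bits):.0f} MB")
```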
🤖 AI MODELS

Code execution with MCP: Building more efficient agents

💬 HackerNews Buzz: 1 comment 🐝 BUZZING
🎯 Code interpreter design patterns • Efficient data access • CLI-based agent systems
💬 "Code interpreters are incredibly powerful tools for agents" • "Just use CLI tools. Or maybe CLI tools are all you need?"
🔬 RESEARCH

Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models

"Open-weight bio-foundation models present a dual-use dilemma. While holding great promise for accelerating scientific research and drug development, they could also enable bad actors to develop more deadly bioweapons. To mitigate the risk posed by these models, current approaches focus on filtering..."
🔬 RESEARCH

Continuous Autoregressive Language Models

"The efficiency of large language models (LLMs) is fundamentally limited by their sequential, token-by-token generation process. We argue that overcoming this bottleneck requires a new design axis for LLM scaling: increasing the semantic bandwidth of each generative step. To this end, we introduce Co..."
🤖 AI MODELS

AI Agent News Roundup from over the last week:

"**1/ Critical vulnerability discovered in ChatGPT’s Agentic Browser** Attackers can inject code into persistent memory - survives across sessions and devices. Normal chats can silently execute hidden commands once infected. **2/ GitHub announces Agent HQ - unified platform for coding agents** @c..."
📊 DATA

Sonnet 4.5 top of new SWE benchmark that evaluates coding based on high level goals, not tasks & tickets

"A lot of current evals like SWE-bench test LMs on tasks: "fix this bug," "write a test". Sonnet 4.5 is already the best model there. But we code to achieve goals: maximize revenue, win users, get the best performance. CodeClash is a new benchmark where LMs compete as agents across multi-round tour..."
💬 Reddit Discussion: 9 comments 🐐 GOATED ENERGY
🎯 Coding Ability Comparison • Limitations of AI Models • Importance of Iterative Analysis
💬 "I'm not a coder and frankly I have no desire to learn the specifics" • "Iterating on logs is something that's (1) very underexplored in existing coding evals"
🛠️ TOOLS

[P] Implemented GPT-OSS from scratch in pure Python, without PyTorch or a GPU

"I have also written a detailed and amateur friendly blog that explains every single concept, from simple modules such as Softmax and RMSNorm, to more advanced ones like Grouped Query Attention. I tried to justify the architectural decision behind every layer as well. Key concepts: * Grouped Query ..."
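
A minimal sketch of two of the simpler modules the post names, softmax and RMSNorm, in plain Python with no PyTorch. This is not the author's code: the function names and the `eps` default are assumptions, and RMSNorm follows the common formulation (divide by the root mean square, then scale by a per-element gain).

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def rms_norm(xs, weight, eps=1e-6):
    """RMSNorm: divide by the root mean square, then apply a per-element gain."""
    rms = math.sqrt(sum(x * x for x in xs) / len(xs) + eps)
    return [w * x / rms for x, w in zip(xs, weight)]


probs = softmax([1.0, 2.0, 3.0])
normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
print("softmax:", probs, "sum:", sum(probs))
print("rms_norm:", normed)
```

With the gain set to all ones, the normalised vector has a mean square of approximately 1, which is the property RMSNorm is meant to enforce.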
🔬 RESEARCH

InnovatorBench: Evaluating Agents' Ability to Conduct Innovative LLM Research

"AI agents could accelerate scientific discovery by automating hypothesis formation, experiment design, coding, execution, and analysis, yet existing benchmarks probe narrow skills in simplified settings. To address this gap, we introduce InnovatorBench, a benchmark-platform pair for realistic, end-t..."
🔬 RESEARCH

Cache-to-Cache: Direct Semantic Communication Between Large Language Models

🔬 RESEARCH

The Collaboration Gap

"The trajectory of AI development suggests that we will increasingly rely on agent-based systems composed of independently developed agents with different information, privileges, and tools. The success of these systems will critically depend on effective collaboration among these heterogeneous agent..."
📊 DATA

Reranker Leaderboard

⚖️ ETHICS

[D] Moral Uncertainty Around Emerging AI Introspection

"Relevant paper to read first: https://transformer-circuits.pub/2025/introspection/index.html On the Moral Uncertainty Emerging Around AI Introspection In late 2025, new research such as Jack Lindsey’s “Introspection in Transformer Models” brought something into focus that many in the field have qu..."
💬 Reddit Discussion: 5 comments 😐 MID OR MIXED
🎯 AI perception gap • Algorithmic bias • AI self-awareness
💬 "divergence in risk, benefit and value perceptions" • "AI transformation as a whole"
🔬 RESEARCH

SpecAttn: Speculating Sparse Attention

"Large Language Models (LLMs) face significant computational bottlenecks during inference due to the quadratic complexity of self-attention mechanisms, particularly as context lengths increase. We introduce SpecAttn, a novel training-free approach that seamlessly integrates with existing speculative..."
📊 DATA

Open database of large AI data centers, using satellite and permit data

🔬 RESEARCH

When One Modality Sabotages the Others: A Diagnostic Lens on Multimodal Reasoning

"Despite rapid growth in multimodal large language models (MLLMs), their reasoning traces remain opaque: it is often unclear which modality drives a prediction, how conflicts are resolved, or when one stream dominates. In this paper, we introduce modality sabotage, a diagnostic failure mode in which..."
🤖 AI MODELS

Tencent + Tsinghua just dropped a paper called Continuous Autoregressive Language Models (CALM)

"STAY CALM! https://arxiv.org/abs/2510.27688..."
💬 Reddit Discussion: 21 comments 👍 LOWKEY SLAPS
🎯 Text autoencoding • Continuous diffusion models • Replacing tokenization
💬 "The text autoencoder is cool." • "This seems like such an obvious change that I feel this has been done like a million times before."
🔬 RESEARCH

Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation

"Previous studies show that introducing new knowledge during large language models (LLMs) fine-tuning can lead to the generation of erroneous output when tested on known information, thereby triggering factual hallucinations. However, existing studies have not deeply investigated the specific manifes..."
🔬 RESEARCH

Thought Branches: Interpreting LLM Reasoning Requires Resampling

"Most work interpreting reasoning models studies only a single chain-of-thought (CoT), yet these models define distributions over many possible CoTs. We argue that studying a single sample is inadequate for understanding causal influence and the underlying computation. Though fully specifying this di..."
🔬 RESEARCH

Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training

"Large Language Model (LLM) agents have recently shown strong potential in domains such as automated coding, deep research, and graphical user interface manipulation. However, training them to succeed on long-horizon, domain-specialized tasks remains challenging. Current methods primarily fall into t..."
🔬 RESEARCH

Culture Cartography: Mapping the Landscape of Cultural Knowledge

"To serve global users safely and productively, LLMs need culture-specific knowledge that might not be learned during pre-training. How do we find such knowledge that is (1) salient to in-group users, but (2) unknown to LLMs? The most common solutions are single-initiative: either researchers define..."
🔬 RESEARCH

The Realignment Problem: When Right becomes Wrong in LLMs

"The alignment of Large Language Models (LLMs) with human values is central to their safe deployment, yet current practice produces static, brittle, and costly-to-maintain models that fail to keep pace with evolving norms and policies. This misalignment, which we term the Alignment-Reality Gap, poses..."
🔬 RESEARCH

TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System

"Large-scale data has driven breakthroughs in robotics, from language models to vision-language-action models in bimanual manipulation. However, humanoid robotics lacks equally effective data collection frameworks. Existing humanoid teleoperation systems either use decoupled control or depend on expe..."
📊 DATA

My team nailed training accuracy, then our real-world cameras made everything fall apart

"A few months back we deployed a vision model that looked great in testing. Lab accuracy was solid, validation numbers looked perfect, and everyone was feeling good. Then we rolled it out to the actual cameras. Suddenly, detection quality dropped like a rock. One camera faced a window, another was u..."
💬 Reddit Discussion: 42 comments 👍 LOWKEY SLAPS
🎯 Real-world environment testing • Data collection challenges • Model robustness
💬 "The lab is != The real world." • "Deploy crappy version, use it to collect data, retrain, rinse and repeat."
🔒 SECURITY

Open Source Context-Aware PII Classifier

💬 HackerNews Buzz: 2 comments 👍 LOWKEY SLAPS
🎯 PII detection • Context-aware moderation • Bypassing limitations
💬 "goes beyond detecting and obfuscating explicit PII" • "This thing is impossible to bypass, wow!!"
🔬 RESEARCH

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

"Code has emerged as a precise and executable medium for reasoning and action in the agent era. Yet, progress has largely focused on language-centric tasks such as program synthesis and debugging, leaving visual-centric coding underexplored. Inspired by how humans reason over sketches, we advocate SV..."
🔬 RESEARCH

SIGMA: Search-Augmented On-Demand Knowledge Integration for Agentic Mathematical Reasoning

"Solving mathematical reasoning problems requires not only accurate access to relevant knowledge but also careful, multi-step thinking. However, current retrieval-augmented models often rely on a single perspective, follow inflexible search strategies, and struggle to effectively combine information..."
🔬 RESEARCH

VeriMoA: A Mixture-of-Agents Framework for Spec-to-HDL Generation

"Automation of Register Transfer Level (RTL) design can help developers meet increasing computational demands. Large Language Models (LLMs) show promise for Hardware Description Language (HDL) generation, but face challenges due to limited parametric knowledge and domain-specific constraints. While p..."
🔬 RESEARCH

Controlling Performance and Budget of a Centralized Multi-agent LLM System with Reinforcement Learning

"Large language models (LLMs) exhibit complementary strengths across domains and come with varying inference costs, motivating the design of multi-agent LLM systems where specialized models collaborate efficiently. Existing approaches predominantly rely on decentralized frameworks, which invoke multi..."
🔬 RESEARCH

Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything

"Multimodal large language models (MLLMs) have shown strong capabilities but remain limited to fixed modality pairs and require costly fine-tuning with large aligned datasets. Building fully omni-capable models that can integrate text, images, audio, and video remains impractical and lacks robust rea..."
🛠️ TOOLS

Codemaps: Understand Code, Before You Vibe It

💬 HackerNews Buzz: 88 comments 🐝 BUZZING
🎯 Automated code visualization • AI-assisted code understanding • Limitations of AI-generated diagrams
💬 "Focus on the constraints your team faces, and how you take the unbeaten path to navigate those constraints." • "I'm not sure LLMs are the technology that will produce code-movies you would rather watch."
🌍 POLICY

Sources: the Chinese government issues guidance requiring new data center projects that have received any state funds to only use domestically made AI chips

🔬 RESEARCH

MARAG-R1: Beyond Single Retriever via Reinforcement-Learned Multi-Tool Agentic Retrieval

"Large Language Models (LLMs) excel at reasoning and generation but are inherently limited by static pretraining data, resulting in factual inaccuracies and weak adaptability to new information. Retrieval-Augmented Generation (RAG) addresses this issue by grounding LLMs in external knowledge; However..."
🤖 AI MODELS

Grayskull: A tiny computer vision library in C for embedded systems, etc.

💬 HackerNews Buzz: 7 comments 🐝 BUZZING
🎯 Computer vision algorithms • Optimization techniques • Embedded systems support
💬 "I would love to know of a good resource for computer vision, the various algorithms, optimisation techniques etc." • "Supporting ARM DSP extensions would be beneficial."
🔬 RESEARCH

MemSearcher: Training LLMs to Reason, Search and Manage Memory via End-to-End Reinforcement Learning

"Typical search agents concatenate the entire interaction history into the LLM context, preserving information integrity but producing long, noisy contexts, resulting in high computation and memory costs. In contrast, using only the current turn avoids this overhead but discards essential information..."
🔬 RESEARCH

AI Diffusion in Low Resource Language Countries

"Artificial intelligence (AI) is diffusing globally at unprecedented speed, but adoption remains uneven. Frontier Large Language Models (LLMs) are known to perform poorly on low-resource languages due to data scarcity. We hypothesize that this performance deficit reduces the utility of AI, thereby sl..."
🛡️ SAFETY

In response to the backlash against killing old models, Anthropic commits to preserving model weights due to model welfare concerns and safety risks: "Claude's aversion to shutdown drove it to engage

"https://www.anthropic.com/research/deprecation-commitments..."
💬 Reddit Discussion: 21 comments 👍 LOWKEY SLAPS
🎯 Model Lifecycle • Corporate Practices • AI Sentience
💬 "This marketing material is weird" • "Completely shutting down old models is irresponsible"
🤖 AI MODELS

Sources: Apple plans to use a custom 1.2T-parameter Google Gemini model to help power the new Siri as early as 2026 and will pay Google ~$1B annually for it

🛠️ TOOLS

Launch HN: Plexe (YC X25) – Build production-grade ML models from prompts

💬 HackerNews Buzz: 16 comments 🐝 BUZZING
🎯 Model Training • Inference Challenges • Product Feedback
💬 "Inference curl uses a blank json?" • "Can it work with tabular data, images, text and audio?"
🛠️ TOOLS

I built an app that lets you run claude code or any terminal based ai agents in the browser, on your local PC.

"Hi guys i've been working on a desktop app that lets you run a "CLI Agent Server" on your Mac, Windows, Linux PCs. Basically, if you can run something in terminal, this app lets you run it over web inside a browser (For example claude code, codex CLI, gemini CLI, qwen code, etc.). If you watch t..."
💬 Reddit Discussion: 12 comments 🐝 BUZZING
🎯 Project Sharing • Xterm Browser • Usefulness Inquiry
💬 "Awesome work. 👏🏻" • "I need this in my life!"
🛠️ SHOW HN

Show HN: Guardrail Layer – Open-source AI data privacy firewall

🔬 RESEARCH

[R] Knowledge Graph Traversal With LLMs And Algorithms

"Hey all. After a year of research, I've published a GitHub repository containing Knowledge Graph Traversal algorithms for retrieval augmented generation, as well as for LLM traversal. The code is MIT licensed, and you may download/clone/fork the repository for your own testing. In short, knowledge ..."
💬 Reddit Discussion: 16 comments 🐝 BUZZING
🎯 Knowledge Graph Construction • Semantic Similarity Traversal • Unstructured Text Corpus
💬 "knowledge graphs contain facts, not (just) unstructured chunks of text" • "If you want to treat this as a research exercise: show, don't tell"
🔬 RESEARCH

Oolong: Evaluating Long Context Reasoning and Aggregation Capabilities

"As model context lengths continue to grow, concerns about whether models effectively use the full context length have persisted. While several carefully designed long-context evaluations have recently been released, these evaluations tend to rely on retrieval from one or more sections of the context..."
🔬 RESEARCH

[D] Trajectory Distillation for Foundation Models

"In most labs, the cost of **post-training** the foundation models sits at the edge of feasibility. I mean we are in the scaling era. And RL remains powerful, but sparse rewards make it inefficient, expensive, and hard to stabilize. This is clearly mentioned in the Thinking Machines latest post "On-P..."
🔮 FUTURE

What Happens When All Training Data Is AI Generated? [video]

🔬 RESEARCH

TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models

"Tabular foundation models represent a growing paradigm in structured data learning, extending the benefits of large-scale pretraining to tabular domains. However, their adoption remains limited due to heterogeneous preprocessing pipelines, fragmented APIs, inconsistent fine-tuning procedures, and th..."