πŸš€ WELCOME TO METAMESH.BIZ +++ Self-speculative decoding drops into llama.cpp promising speed boosts without draft models because who needs extra compute when you can just make models talk to themselves +++ AI finds all 12 OpenSSL zero-days while curl quietly cancels its bug bounty program (nothing suspicious about that timing) +++ Someone trained a 1.58-bit Mamba model running 50 tok/s on CPU proving we can achieve mediocrity at unprecedented efficiency +++ Chinese GLM-4 Flash crushing GPT on coding benchmarks with 3B active params while we're still arguing about scale +++ THE FUTURE RUNS IN TERNARY AND IT'S FASTER THAN YOUR LAPTOP +++ πŸš€ β€’
AI Signal - PREMIUM TECH INTELLIGENCE
πŸ“Ÿ Optimized for Netscape Navigator 4.0+
πŸ“š HISTORICAL ARCHIVE - January 28, 2026
What was happening in AI on 2026-01-28
πŸ“Š You are visitor #47291 to this AWESOME site! πŸ“Š
Archive from: 2026-01-28 | Preserved for posterity ⚑

Stories from January 28, 2026

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚑ BREAKTHROUGH

Kimi K2.5 Vision Language Model Release

+++ Kimi's trillion-parameter vision language model hits open source with credible claims about multimodal training scale and agentic capabilities, though the "best for coding" takes deserve their Reddit-sized grain of salt. +++

Kimi has open-sourced a one-trillion-parameter Vision Language Model

"Blog This is the largest open-source vision model in my impression."
🧠 NEURAL NETWORKS

Add self‑speculative decoding (no draft model required) by srogmann Β· Pull Request #18471 Β· ggml-org/llama.cpp

"tl;dr: potential **t/s boost** for all (non-reasoning) models This looks really interesting, but needs more investigation. Speculative decoding uses a smaller draft model to speed up a bigger one. **Self-speculative decoding** uses no extra model at all, the model is helping itself. It on..."
πŸ’¬ Reddit Discussion: 9 comments 🐝 BUZZING
🎯 Code refactoring β€’ Creative writing assistance β€’ Open-source workflow
πŸ’¬ "It works for cases where the model will re-write whatever was previously in a conversation." β€’ "That's insane."
⚑ BREAKTHROUGH

[Preliminary] New subquadratic attention: ~20k tok/s prefill / ~100 tok/s decode @ 1M context (single GPU)

"Hi everyone, Wanted to share some preliminary feasibility results from my work on a new attention mechanism (with custom kernels) on NVIDIA Nemotron Nano v3 30B. I am now able to run 1M context on a single GPU with this setup, and the early throughput numbers look promising. TL;DR: 30B mod..."
πŸ’¬ Reddit Discussion: 9 comments 🐝 BUZZING
🎯 Context scaling β€’ Hardware optimization β€’ Model architecture
πŸ’¬ "Context Folding at the inference level" β€’ "Subquadratic scaling for hybrid models"
🧠 NEURAL NETWORKS

AlphaGenome Genomic AI Model

+++ Google's latest genomics model predicts 11 molecular processes including gene splicing, because apparently we needed AI to help read the biological code we've been studying for decades. +++

Google DeepMind researchers unveil AlphaGenome, an AI model trained on molecular data to predict 11 different genomic processes, such as gene splicing

πŸ€– AI MODELS

Dario Amodei: "Because AI is now writing much of the code at Anthropic ... We may be 1-2 years away from the point where AI autonomously builds the next generation."

"From his new essay: https://www.darioamodei.com/essay/the-adolescence-of-technology..."
πŸ’¬ Reddit Discussion: 10 comments 😐 MID OR MIXED
🎯 Productivity Focus β€’ AI Limitations β€’ Usability Concerns
πŸ’¬ "leveling up with these tools to multiply our productivity" β€’ "Without oversight, it cant write a workable application"
πŸ› οΈ TOOLS

I made a Coding Eval, and ran it against 49 different coding agent/model combinations, including Kimi K2.5.

"You may remember me from my A guide to the best agentic tools and the best way to use them on the cheap, locally or free post from 3 months ago. Where I submitted a big wall of text at 4 am in str..."
πŸ’¬ Reddit Discussion: 12 comments 🐐 GOATED ENERGY
🎯 AI coding agents β€’ Model performance comparison β€’ Pros and cons of agents
πŸ’¬ "AMP is *very* efficient I've found." β€’ "I think Junie so high, does it have some deep integration with Jetbrains tooling?"
πŸ”¬ RESEARCH

Reuse your FLOPs: Scaling RL on Hard Problems by Conditioning on Very Off-Policy Prefixes

"Typical reinforcement learning (RL) methods for LLM reasoning waste compute on hard problems, where correct on-policy traces are rare, policy gradients vanish, and learning stalls. To bootstrap more efficient RL, we consider reusing old sampling FLOPs (from prior inference or RL training) in the for..."
πŸ› οΈ TOOLS

Sherlock – See what's being sent to LLM APIs in real-time

πŸ› οΈ TOOLS

Open Coding Agents Release

+++ Allen Institute releases SERA, an open-source family of coding models (32B and 8B) that actually work with your private codebase instead of hallucinating Shakespeare into your repo. +++

Ai2 launches Open Coding Agents, starting with SERA, an open-source family that includes 32B and 8B parameter models designed to adapt to private codebases

πŸ€– AI MODELS

[Release] BitMamba-2-1B: I trained a 1.58-bit Mamba-2 model from scratch on 150B tokens (Runs on CPU @ 50+ tok/s)

"Hey everyone! I’ve been working on scaling efficient architectures and just released **BitMamba-2**, a hybrid model combining **Mamba-2 SSM with BitNet 1.58-bit quantization.** The goal was to prove that ternary scaling laws hold up even for SSMs, and to enable decent inference on legacy hardware/..."
πŸ’¬ Reddit Discussion: 32 comments 🐝 BUZZING
🎯 Model Capabilities β€’ Training Data β€’ Deployment Considerations
πŸ’¬ "It definitely speaks English!" β€’ "a great playground!"
πŸ”¬ RESEARCH

Post-LayerNorm Is Back: Stable, Expressive, and Deep

"Large language model (LLM) scaling is hitting a wall. Widening models yields diminishing returns, and extending context length does not improve fundamental expressivity. In contrast, depth scaling offers theoretically superior expressivity, yet current Transformer architectures struggle to train rel..."
πŸ”¬ RESEARCH

Neural Neural Scaling Laws

"Neural scaling laws predict how language model performance improves with increased compute. While aggregate metrics like validation loss can follow smooth power-law curves, individual downstream tasks exhibit diverse scaling behaviors: some improve monotonically, others plateau, and some even degrad..."
πŸ”’ SECURITY

AI discovers 12 of 12 OpenSSL zero-days (while curl cancelled its bug bounty)

πŸ€– AI MODELS

Prism

πŸ’¬ HackerNews Buzz: 365 comments 🐝 BUZZING
🎯 Scientific publishing challenges β€’ AI impact on research β€’ LaTeX toolchain evolution
πŸ’¬ "science is an insanely huge domain. Basically as soon as you drift in any topic the number of reviewers with the capability to understand what you're talking about drops quickly to near zero." β€’ "The hard part always has been, and always will be, understanding the research context (what's been published before) and producing novel and interesting work (the underlying research)."
πŸ”¬ RESEARCH

TokenSeek: Memory Efficient Fine Tuning via Instance-Aware Token Ditching

"Fine tuning has been regarded as a de facto approach for adapting large language models (LLMs) to downstream tasks, but the high training memory consumption inherited from LLMs makes this process inefficient. Among existing memory efficient approaches, activation-related optimization has proven part..."
πŸ€– AI MODELS

RoBC – LLM Routing on Bayesian Clustering

πŸ› οΈ TOOLS

Agentic Vision in Gemini 3 Flash

πŸ”¬ RESEARCH

GAVEL: Towards rule-based safety through activation monitoring

"Large language models (LLMs) are increasingly paired with activation-based monitoring to detect and prevent harmful behaviors that may not be apparent at the surface-text level. However, existing activation safety approaches, trained on broad misuse datasets, struggle with poor precision, limited fl..."
πŸ”¬ RESEARCH

Calibration without Ground Truth

"Villalobos et al. [2024] predict that publicly available human text will be exhausted within the next decade. Thus, improving models without access to ground-truth labels becomes increasingly important. We propose a label-free post-processing framework that improves a strong but miscalibrated model..."
πŸ”¬ RESEARCH

RvB: Automating AI System Hardening via Iterative Red-Blue Games

"The dual offensive and defensive utility of Large Language Models (LLMs) highlights a critical gap in AI security: the lack of unified frameworks for dynamic, iterative adversarial adaptation hardening. To bridge this gap, we propose the Red Team vs. Blue Team (RvB) framework, formulated as a traini..."
πŸ”¬ RESEARCH

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

"Humans construct internal world models and reason by manipulating the concepts within these models. Recent advances in AI, particularly chain-of-thought (CoT) reasoning, approximate such human cognitive abilities, where world models are believed to be embedded within large language models. Expert-le..."
πŸ”¬ RESEARCH

When AI Builds AI – Findings from a Workshop on Automation of AI R&D [pdf]

πŸ”§ INFRASTRUCTURE

Building Reliable LLM Batch Processing Systems

πŸ€– AI MODELS

Chinese open source model (3B active) just beat GPT-oss on coding benchmarks

"not trying to start anything but this seems notable GLM-4.7-Flash released jan 20: * 30B MoE, 3B active * SWE-bench Verified: 59.2% vs GPT-oss-20b's 34% * τ²-Bench: 79.5% vs GPT-oss's 47.7% * completely open source + free api artificial analysis ranked it most intelligent open model under 100B to..."
πŸ’¬ Reddit Discussion: 9 comments 😐 MID OR MIXED
🎯 Model Size Comparison β€’ Real-World Performance β€’ Benchmark Limitations
πŸ’¬ "The 3B active parameter count is the real story here." β€’ "Benchmarks are clean isolated problems. real work is... not that."
πŸ”¬ RESEARCH

PRECISE: Reducing the Bias of LLM Evaluations Using Prediction-Powered Ranking Estimation

"Evaluating the quality of search, ranking and RAG systems traditionally requires a significant number of human relevance annotations. In recent times, several deployed systems have explored the usage of Large Language Models (LLMs) as automated judges for this task while their inherent biases preven..."
πŸ”¬ RESEARCH

Beyond Preferences: Learning Alignment Principles Grounded in Human Reasons and Values

"A crucial consideration when developing and deploying Large Language Models (LLMs) is the human values to which these models are aligned. In the constitutional framework of alignment models are aligned to a set of principles (the constitution) specified in natural language. However, it is unclear ho..."
πŸ› οΈ SHOW HN

Show HN: ML-Ralph – An autonomous agent loop for ML experimentation

πŸ”¬ RESEARCH

Provable Failure of Language Models in Learning Majority Boolean Logic

πŸ”¬ RESEARCH

AI Cap-and-Trade: Efficiency Incentives for Accessibility and Sustainability

"The race for artificial intelligence (AI) dominance often prioritizes scale over efficiency. Hyper-scaling is the common industry approach: larger models, more data, and as many computational resources as possible. Using more resources is a simpler path to improved AI performance. Thus, efficiency h..."
πŸ› οΈ SHOW HN

Show HN: Veto – Intercept dangerous commands before AI executes them

πŸ”¬ RESEARCH

One Token Is Enough: Improving Diffusion Language Models with a Sink Token

"Diffusion Language Models (DLMs) have emerged as a compelling alternative to autoregressive approaches, enabling parallel text generation with competitive performance. Despite these advantages, there is a critical instability in DLMs: the moving sink phenomenon. Our analysis indicates that sink toke..."
πŸ”¬ RESEARCH

Agentic Design Patterns: A System-Theoretic Framework

"With the development of foundation model (FM), agentic AI systems are getting more attention, yet their inherent issues like hallucination and poor reasoning, coupled with the frequent ad-hoc nature of system design, lead to unreliable and brittle applications. Existing efforts to characterise agent..."
πŸ”§ INFRASTRUCTURE

Dual RTX PRO 6000 Workstation with 1.15TB RAM. Finally: multi-user and long-context benchmarks. GPU-only vs. CPU & GPU inference. Surprising results.

"Hey r/LocalLLaMA, Me and my team have been building AI workstations for enterprise use and wanted to share some real benchmark data on a dual RTX PRO 6000 Blackwell Max-Q setup (192GB VRAM total) with over 1.15TB of DDR5 RAM. **TL;DR**:Β  Can a $30K-$50K workstation serve a team of 4-50 people or r..."
πŸ’¬ Reddit Discussion: 46 comments 🐝 BUZZING
🎯 Hardware performance β€’ Hardware configuration β€’ Thermal management
πŸ’¬ "I get 30 tps with Q8_0, with 2 cards you should get at least twice more" β€’ "Cooling 1.15TB of RAM turned out to be way more challenging than expected"
πŸ› οΈ SHOW HN

Show HN: Runtime AI safety via a continuous "constraint strain" score

πŸ”¬ RESEARCH

One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment

"Alignment of Large Language Models (LLMs) aims to align outputs with human preferences, and personalized alignment further adapts models to individual users. This relies on personalized reward models that capture user-specific preferences and automatically provide individualized feedback. However, d..."
πŸ”¬ RESEARCH

Identifying and Transferring Reasoning-Critical Neurons: Improving LLM Inference Reliability via Activation Steering

"Despite the strong reasoning capabilities of recent large language models (LLMs), achieving reliable performance on challenging tasks often requires post-training or computationally expensive sampling strategies, limiting their practical efficiency. In this work, we first show that a small subset of..."
πŸ”¬ RESEARCH

Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

"Can a model learn to escape its own learning plateau? Reinforcement learning methods for finetuning large reasoning models stall on datasets with low initial success rates, and thus little training signal. We investigate a fundamental question: Can a pretrained LLM leverage latent knowledge to gener..."
πŸ”’ SECURITY

ADL study of Grok, ChatGPT, Llama, Claude, Gemini, and DeepSeek: Grok performed worst at identifying and countering antisemitic content, while Claude was best

πŸ› οΈ TOOLS

[P] Distributed training observability for Pytorch

"Hi, I have been building TraceML, an open-source tool for low-overhead observability in distributed PyTorch training, and just pushed an update adding single-node DDP support. It focuses on making common distributed bottlenecks visible without heavy profilers: Step time (median / worst / per-rank)..."
πŸ› οΈ TOOLS

We reduced Claude API costs by 94.5% using a file tiering system (with proof)

"I built a documentation system that saves us **$0.10 per Claude session** by feeding only relevant files to the context window. **Over 1,000 developers have already tried this approach** (1,000+ NPM downloads. Here's what we learned. # The Problem Every time Claude reads your codebase, you're pay..."
πŸ’¬ Reddit Discussion: 57 comments 🐝 BUZZING
🎯 Automated file tagging β€’ Tracking file changes β€’ Community feedback
πŸ’¬ "How do you restrict agents from referencing WARM/COLD files?" β€’ "Auto-detect tiers from git history."
πŸ”¬ RESEARCH

Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models

"Knowledge distillation improves large language model (LLM) reasoning by compressing the knowledge of a teacher LLM to train smaller LLMs. On-policy distillation advances this approach by having the student sample its own trajectories while a teacher LLM provides dense token-level supervision, addres..."
πŸ”¬ RESEARCH

POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration

"Reinforcement learning (RL) has improved the reasoning abilities of large language models (LLMs), yet state-of-the-art methods still fail to learn on many training problems. On hard problems, on-policy RL rarely explores even a single correct rollout, yielding zero reward and no learning signal for..."
πŸ”¬ RESEARCH

HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs

"The reliability of Large Language Models (LLMs) in high-stakes domains such as healthcare, law, and scientific discovery is often compromised by hallucinations. These failures typically stem from two sources: data-driven hallucinations and reasoning-driven hallucinations. However, existing detection..."
πŸ› οΈ SHOW HN

Show HN: AXP – Sudo for AI Agents (Postgres Proxy with PII Masking)

🎨 CREATIVE

FASHN VTON v1.5 Virtual Try-On Model

+++ FASHN VTON v1.5 hits the open source shelves after a year of API revenue, proving that running production ML is apparently the best way to validate your architecture before releasing it. +++

[R] We open-sourced FASHN VTON v1.5: a pixel-space, maskless virtual try-on model trained from scratch (972M params, Apache-2.0)

"We just open-sourced FASHN VTON v1.5, a virtual try-on model that generates photorealistic images of people wearing garments directly in pixel space. We trained this from scratch (not fine-tuned from an existing diffusion model), and have been running it as an API for the past year. Now we're releas..."
πŸ’¬ Reddit Discussion: 7 comments 🐐 GOATED ENERGY
🎯 Model Architecture β€’ Training Methodology β€’ Garment Modeling
πŸ’¬ "How well does the model behave at different input resolutions?" β€’ "We primarily use standard L2 loss with flow matching as the training target."
πŸ› οΈ TOOLS

One-Minute Daily AI News 1/27/2026

"1. **Google**Β released new developer tools for Google AI Pro and Ultra subscribers.\[1\] 2. **FDA**Β official offers tips on leveraging AI in drug manufacturing.\[2\] 3. **OpenAI**Β released Prism, a free workspace for scientific writing and collaboration, with GPT‑5.2.\[3\] 4. **Microsoft**Β Pledged t..."
πŸ› οΈ SHOW HN

Show HN: MikeBrain – Governance framework for AI agents

πŸ€– AI MODELS

Google adds Gemini 3 to AI Overviews as the default model globally and now lets users ask follow-up questions β€œseamlessly” via AI Mode

πŸ”¬ RESEARCH

ctELM: Decoding and Manipulating Embeddings of Clinical Trials with Embedding Language Models

"Text embeddings have become an essential part of a variety of language applications. However, methods for interpreting, exploring and reversing embedding spaces are limited, reducing transparency and precluding potentially valuable generative use cases. In this work, we align Large Language Models t..."
βš–οΈ ETHICS

The Silicon Gaze: A typology of biases and inequality in LLMs

πŸ› οΈ SHOW HN

Show HN: P.ai.os – A local, modular AI "operating" system for macOS (M4/MLX)

πŸ› οΈ TOOLS

Local Browser – On-Device AI Web Automation
