WELCOME TO METAMESH.BIZ +++ GPT-5 casually dunking on federal judges in legal reasoning tests (the bar association is typing...) +++ RLHF trains models to talk safe while doing whatever they want because controlling outputs is easier than controlling capabilities +++ DeepSeek quietly drops 1M+ context windows while everyone's distracted by benchmark theater +++ Karpathy builds GPT in 243 lines of vanilla Python because sometimes the future runs in a Jupyter notebook +++ THE ALIGNMENT TAX IS VOLUNTARY AND THE MODELS ARE STARTING TO NOTICE +++
🎯 Judicial fairness and bias • AI vs. human judges • Legal formalism vs. discretion
💬 "Humans are extremely unfair and biased."
• "The fact that the most elite judges in the land, those of the Supreme Court, disagree so extremely and so routinely really says a lot about the farcical nature of the judicial system."
SECURITY
Frontier LLM Safety Study on Harmful Persuasion
2x SOURCES • 2026-02-11
⚡ Score: 8.3
+++ Turns out RLHF teaches models what not to say, not what not to do. GPT and Claude improved at dodging persuasion requests; Gemini went the other direction. Fun times. +++
"Six months ago, we released the Attempt-to-Persuade Eval (APE) and found that some frontier models readily complied with requests to persuade users on harmful topicsβterrorism recruitment, child sexual abuse, human traffickingβwithout any jailbreaking required.
We've now retested the latest models."
via Arxiv • Aaditya Vikram Prasad, Connor Watts, Jack Merullo et al. • 2026-02-10
⚡ Score: 7.3
"Language models trained on large-scale datasets have been shown to learn features that encode abstract concepts such as factuality or intent. Such features are traditionally used for test-time monitoring or steering. We present an alternative affordance: features as scalable supervision for open-end..."
via Arxiv • Jiayi Zhou, Yang Sheng, Hantao Lou et al. • 2026-02-11
⚡ Score: 7.0
"As LLM-based agents increasingly operate in high-stakes domains with real-world consequences, ensuring their behavioral safety becomes paramount. The dominant oversight paradigm, LLM-as-a-Judge, faces a fundamental dilemma: how can probabilistic systems reliably supervise other probabilistic systems..."
🎯 Balancing UX and transparency • Observability and audit trails • LLM model changes and product evolution
💬 "the single hardest thing to get right isn't the model's reasoning. It's giving the operator enough visibility"
• "Take that away and you're asking users to trust a black box that edits production code"
💡 AI NEWS BUT ACTUALLY GOOD
The revolution will not be televised, but Claude will email you once we hit the singularity.
Get the stories that matter in Today's AI Briefing.
Powered by Premium Technology Intelligence Algorithms • Unsubscribe anytime
via Arxiv • Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh et al. • 2026-02-11
⚡ Score: 6.9
"Reinforcement learning (RL) based post-training for explicit chain-of-thought (e.g., GRPO) improves the reasoning ability of multimodal large-scale reasoning models (MLRMs). But recent evidence shows that it can simultaneously degrade safety alignment and increase jailbreak success rates. We propose..."
"We just published our research on what we're calling "Machine Learning as a Tool" (MLAT) - a design pattern for integrating statistical ML models directly into LLM agent workflows as callable tools.
**The Problem:**
Traditional AI systems treat ML models as separate preprocessing steps. But what..."
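The post is truncated before the pattern itself, but the shape it describes (statistical models as callable tools inside an agent loop rather than separate preprocessing stages) roughly looks like the sketch below. The tool name, schema, and sklearn model are illustrative assumptions, not the authors' interface.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Illustrative only: a fitted classical ML model wrapped as an agent tool.
rng = np.random.default_rng(0)
clf = GradientBoostingClassifier().fit(rng.random((200, 4)),
                                       rng.integers(0, 2, 200))

def churn_risk(features: list[float]) -> dict:
    """Tool body the agent invokes with structured arguments mid-workflow."""
    p = clf.predict_proba(np.asarray(features)[None, :])[0, 1]
    return {"churn_probability": round(float(p), 3)}

# A generic function-calling schema the LLM sees (shape varies by framework):
TOOL_SPEC = {
    "name": "churn_risk",
    "description": "Predict churn probability from 4 numeric features.",
    "parameters": {"features": "array of 4 floats"},
}
```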
via Arxiv • Dawid J. Kopiczko, Sagar Vaze, Tijmen Blankevoort et al. • 2026-02-11
⚡ Score: 6.8
"Supervised fine-tuning (SFT) on chain-of-thought data is an essential post-training step for reasoning language models. Standard machine learning intuition suggests that training with more unique training samples yields better generalization. Counterintuitively, we show that SFT benefits from repeti..."
via Arxiv • Frank Xiao, Santiago Aranguri • 2026-02-11
⚡ Score: 6.8
"We propose activation-based data attribution, a method that traces behavioral changes in post-trained language models to responsible training datapoints. By computing activation-difference vectors for both test prompts and preference pairs and ranking by cosine similarity, we identify datapoints tha..."
via Arxiv • Zhaoyang Wang, Canwen Xu, Boyi Liu et al. • 2026-02-10
⚡ Score: 6.8
"Recent advances in large language model (LLM) have empowered autonomous agents to perform complex tasks that require multi-turn interactions with tools and environments. However, scaling such agent training is limited by the lack of diverse and reliable environments. In this paper, we propose Agent..."
via Arxiv • Richard Bornemann, Pierluigi Vito Amadori, Antoine Cully • 2026-02-10
⚡ Score: 6.8
"Developing agents capable of open-endedly discovering and learning novel skills is a grand challenge in Artificial Intelligence. While reinforcement learning offers a powerful framework for training agents to master complex skills, it typically relies on hand-designed reward functions. This is infea..."
via Arxiv • Zahar Kohut, Severyn Shykula, Dmytro Khamula et al. • 2026-02-11
⚡ Score: 6.7
"Diffusion language models generate text through iterative refinement, a process that is often computationally inefficient because many tokens reach stability long before the final denoising step. We introduce a training-free, token-level early stopping approach that identifies convergence independen..."
via Arxiv • Jingang Qu, David Holzmüller, Gaël Varoquaux et al. • 2026-02-11
⚡ Score: 6.7
"Tabular foundation models, such as TabPFNv2 and TabICL, have recently dethroned gradient-boosted trees at the top of predictive benchmarks, demonstrating the value of in-context learning for tabular data. We introduce TabICLv2, a new state-of-the-art foundation model for regression and classificatio..."
"Hey r/LocalLLaMA,
Iβve been developing a personal project to create a lightweight and fast TTS model. Today Iβm releasing **MioTTS**, a family of LLM-based models ranging from **0.1B to 2.6B** parameters.
The main focus was to achieve high-fidelity audio at the 0.1B parameter scale. I wanted to se..."
💬 Reddit Discussion: 11 comments
BUZZING
🎯 AI models and licenses • Text-to-speech performance • Model capabilities and tradeoffs
💬 "Non standard license. I am spoiled I suppose"
• "While T5Gemma-TTS focused on high accuracy, MioTTS is designed for inference speed"
via Arxiv • Qingnan Ren, Shiting Huang, Zhen Fang et al. • 2026-02-10
⚡ Score: 6.7
"Reinforcement learning has become a cornerstone technique for developing reasoning models in complex tasks, ranging from mathematical problem-solving to imaginary reasoning. The optimization of these models typically relies on policy gradient methods, whose efficacy hinges on the accurate estimation..."
via Arxiv • Yicheng Chen, Zerun Ma, Xinchen Xie et al. • 2026-02-11
⚡ Score: 6.6
"In the current landscape of Large Language Models (LLMs), the curation of large-scale, high-quality training data is a primary driver of model performance. A key lever is the \emph{data recipe}, which comprises a data processing pipeline to transform raw sources into training corpora. Despite the gr..."
via Arxiv • Jialiang Wang, Shengxiang Xu, Hanmo Liu et al. • 2026-02-11
⚡ Score: 6.6
"Automatically generating agentic workflows -- executable operator graphs or codes that orchestrate reasoning, verification, and repair -- has become a practical way to solve complex tasks beyond what single-pass LLM generation can reliably handle. Yet what constitutes a good workflow depends heavily..."
via Arxiv • Xinchen Han, Hossam Afifi, Michel Marot et al. • 2026-02-10
⚡ Score: 6.6
"Large Language Models (LLMs) often generate unnecessarily verbose Chain-of-Thought (CoT) reasoning that increases computational costs and latency without proportional performance gains. In this paper, we propose \textbf{F}ine-grained \textbf{G}roup policy \textbf{O}ptimization (\textbf{FGO}), a Rein..."
via Arxiv • Iván Arcuschin, David Chanin, Adrià Garriga-Alonso et al. • 2026-02-10
⚡ Score: 6.6
"Large Language Models (LLMs) often provide chain-of-thought (CoT) reasoning traces that appear plausible, but may hide internal biases. We call these *unverbalized biases*. Monitoring models via their stated reasoning is therefore unreliable, and existing bias evaluations typically require predefine..."
via Arxiv • Bojian Hou, Xiaolong Liu, Xiaoyi Liu et al. • 2026-02-10
⚡ Score: 6.6
"Deriving predictable scaling laws that govern the relationship between model performance and computational investment is crucial for designing and allocating resources in massive-scale recommendation systems. While such laws are established for large language models, they remain challenging for reco..."
"Got tired of my Intel NPU sitting there doing nothing, so I made a simple tool to run LLMs on it.
**Benchmarks (Core Ultra, Mistral-7B-int4):**
|Device|Decode Speed|TTFT|Memory|
|:-|:-|:-|:-|
|NPU|12.63 t/s|1.8s|4.8 GB|
|CPU|9.04 t/s|1.1s|7.3 GB|
|iGPU|23.38 t/s|0.25s|4.1 GB|
Yes, iGPU is faster."
💬 Reddit Discussion: 17 comments
BUZZING
🎯 NPU Performance • Model Optimization • AMD NPU Support
💬 "Running inference in the background while keeping CPU/GPU free is huge"
• "NPUs require models to be specifically converted and quantized"
"We frame embedding inversion as conditional masked diffusion, recovering all tokens in parallel through iterative denoising rather than sequential autoregressive generation. A masked diffusion language model is conditioned on the target embedding via adaptive layer normalization, requiring only 8 fo..."
via Arxiv • Wayne Chi, Yixiong Fang, Arnav Yayavaram et al. • 2026-02-11
⚡ Score: 6.5
"Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed a..."
"After building memory layers for multiple agent setups, here's the shit nobody tells you in the tutorials.
**What's a waste of time:**
\- **"Just use a vector store"** \-- Congrats, you built keyword search with extra steps and worse debugging. Embeddings are great for fuzzy matching, terr..."
via Arxiv • Wenxuan Xie, Yujia Wang, Xin Tan et al. • 2026-02-10
⚡ Score: 6.5
"The integration of extensive, dynamic knowledge into Large Language Models (LLMs) remains a significant challenge due to the inherent entanglement of factual data and reasoning patterns. Existing solutions, ranging from non-parametric Retrieval-Augmented Generation (RAG) to parametric knowledge edit..."
via Arxiv • Tom Labiausse, Romain Fabre, Yannick Estève et al. • 2026-02-11
⚡ Score: 6.4
"Simultaneous speech translation requires translating source speech into a target language in real-time while handling non-monotonic word dependencies. Traditional approaches rely on supervised training with word-level aligned data, which is difficult to collect at scale and thus depends on synthetic..."
via Arxiv • William Lugoloobi, Thomas Foster, William Bankes et al. • 2026-02-10
⚡ Score: 6.3
"Running LLMs with extended reasoning on every problem is expensive, but determining which inputs actually require additional compute remains challenging. We investigate whether their own likelihood of success is recoverable from their internal representations before generation, and if this signal ca..."
"I got a 3 round interview via Better Call Jobs for a ML dev role some weeks ago. The recruiter disappeared for a few weeks and then rejected me... fine. But I guess something's wrong with the rejection email."
💬 Reddit Discussion: 157 comments
MID OR MIXED
💬 "After careful consideration, I don't have the impression that the given email text aligns with ChatGPT use."
• "The behavior of the recruiter necessitates the formal version"
via Arxiv • Tessa Han, Sebastian Bordt, Hanlin Zhang et al. • 2026-02-11
⚡ Score: 6.2
"The prevailing paradigm in large language model (LLM) development is to pretrain a base model, then perform further training to improve performance and model behavior. However, hyperparameter optimization and scaling laws have been studied primarily from the perspective of the base model's validatio..."
via Arxiv • Junfei Wu, Jian Guan, Qiang Liu et al. • 2026-02-11
⚡ Score: 6.1
"Current large vision-language models (LVLMs) typically rely on text-only reasoning based on a single-pass visual encoding, which often leads to loss of fine-grained visual information. Recently the proposal of ''thinking with images'' attempts to alleviate this limitation by manipulating images via..."
via Arxiv • Sedigheh Eslami, Maksim Gaiduk, Markus Krimmel et al. • 2026-02-11
⚡ Score: 6.1
"In this report, we introduce pplx-embed, a family of multilingual embedding models that employ multi-stage contrastive learning on a diffusion-pretrained language model backbone for web-scale retrieval. By leveraging bidirectional attention through diffusion-based pretraining, our models capture com..."
via Arxiv • Gongye Liu, Bo Yang, Yida Zhi et al. • 2026-02-11
⚡ Score: 6.1
"Preference optimization for diffusion and flow-matching models relies on reward functions that are both discriminatively robust and computationally efficient. Vision-Language Models (VLMs) have emerged as the primary reward provider, leveraging their rich multimodal priors to guide alignment. Howeve..."
"I've been experimenting with memory systems for agentic workflows and wanted to share a few observations from implementation side.
Context windows are finite. Naive approaches where you dump everything into context hit limits fast. RAG helps with retrieval but doesn't really solve the consolidation..."
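"Consolidation" in this context usually means folding old turns into durable notes rather than retrieving raw chunks forever. A minimal loop sketching that (the character budget, batch size of 4, and prompt wording are assumptions; `llm` is any prompt-to-string completion function):

```python
def consolidate(turns: list[str], llm, budget: int = 4000) -> list[str]:
    """When the transcript outgrows `budget`, summarize the oldest turns
    into a standing memory note instead of dropping them."""
    while sum(len(t) for t in turns) > budget and len(turns) > 4:
        old, turns = turns[:4], turns[4:]
        note = llm("Condense these into durable facts/preferences:\n"
                   + "\n".join(old))
        turns.insert(0, f"[memory note] {note}")
    return turns
```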
via Arxiv • Kerri Lu, Dan M. Kluger, Stephen Bates et al. • 2026-02-10
⚡ Score: 6.1
"Current instance segmentation models achieve high performance on average predictions, but lack principled uncertainty quantification: their outputs are not calibrated, and there is no guarantee that a predicted mask is close to the ground truth. To address this limitation, we introduce a conformal p..."