WELCOME TO METAMESH.BIZ +++ TICKER ERROR: CONTENT TOO SPICY FOR ANTHROPIC'S USAGE POLICY +++ HERE'S WHAT'S HAPPENING +++ 'Western Qwen': IBM Wows with Granite 4 LLM Launch and Hybrid Mamba/Transformer +++ Sora 2: AI Video Generation with Realistic Sound +++ LoRA without regrets implemented in Hugging Face TRL [colab, and python scripts]
🎯 Monetization strategies • Competition from Chinese models • OpenAI's strategic dilemma
💬 "That VC loss playbook only works if you can corner the market and squeeze later to make up for the losses."
• "The biggest concern IMO is how good the open weight models coming out of China are, on consumer hardware."
+++ Big Blue drops open source LLM family mixing Mamba with transformers, betting enterprises care more about memory efficiency than benchmark leaderboards. +++
"Hi everyone, our team has been working nonstop on our open source project, LMCache, to reduce repetitive computation in LLM inference and make systems serve more people (3x more throughput in chat applications) and recently it has been implemented by NVIDIA's Inference project Dynamo.
In LLM servi..."
💬 Reddit Discussion: 4 comments
BUZZING
🎯 Implementing Llama Integration • Caching for Inference Costs • Caching Benefits for Models
💬 "How would we local llama-ers implement this?"
• "The reason they did not, from my best guess, is that for local workload, it has 1. less context reuse 2. usually runs smaller models which prefill very fast 3. the workload does not saturate the server (usually with lower qps)"
via Arxiv 👤 Yixuan Weng, Minjun Zhu, Qiujie Xie et al. • 2025-09-30
⚡ Score: 8.0
"While previous AI Scientist systems can generate novel findings, they often
lack the focus to produce scientifically valuable contributions that address
pressing human-defined challenges. We introduce DeepScientist, a system
designed to overcome this by conducting goal-oriented, fully autonomous
sci..."
via Arxiv 👤 Maël Macuglia, Paul Friedrich, Giorgia Ramponi • 2025-09-30
⚡ Score: 8.0
"Deploying reinforcement learning (RL) in robotics, industry, and health care
is blocked by two obstacles: the difficulty of specifying accurate rewards and
the risk of unsafe, data-hungry exploration. We address this by proposing a
two-stage framework that first learns a safe initial policy from a r..."
"# LoRA Without Regret
> [!WARNING]
> I wrote this page for the TRL docs, but thought I'd just drop it here in advance for anyone who can't wait.
I also made a colab notebook of this guide.
Recent res..."
💬 Reddit Discussion: 4 comments
BUZZING
🎯 LoRA training • LLM capabilities • Practical applications
💬 "For RL to be the next frontier of LLM training, it should be changing all parts of the system, not just tweak 0.0326% of model weights"
• "Choose a model that's well suited, train multiple LoRAs, let a backend decide which fine-tune to use and you quickly have experts at hand for very little cost"
"Quick paper highlight (adapted from TLDR thread):
Finds no special advantage using an LLM to predict its own correctness (a trend in prior work), instead finding that LLMs benefit from learning to predict the correctness of many other models, becoming a GCM.
--
Training 1 GCM is strictly mor..."
+++ Former Stripe CTO Rahul Patil takes the technical reins while cofounder McCandlish gets a shiny new "chief architect" title. Infrastructure era begins. +++
🎯 Custom silicon competition • Vertical integration risks • Analog ML hardware
💬 "The software titan is rather late to the custom silicon party"
• "If everyone is siloed into their own vertically integrated hardware+operating system stack, the results will be awful for free software"
via Arxiv 👤 Siddarth Venkatraman, Vineet Jain, Sarthak Mittal et al. • 2025-09-30
⚡ Score: 6.8
"Test-time scaling methods improve the capabilities of large language models
(LLMs) by increasing the amount of compute used during inference to make a
prediction. Inference-time compute can be scaled in parallel by choosing among
multiple independent solutions or sequentially through self-refinement..."
"We ran one of our hardest computer-use benchmarks on Anthropic Sonnet 4.5, side-by-side with Sonnet 4.
Ask: "Install LibreOffice and make a sales table".
Sonnet 4.5: 214 turns, clean trajectory
Sonnet 4: 316 turns, major detours
The difference shows up in multi-step sequences where errors compou..."
via Arxiv 👤 Yuyang Liu, Chuan Wen, Yihang Hu et al. • 2025-09-30
⚡ Score: 6.8
"Designing dense rewards is crucial for reinforcement learning (RL), yet in
robotics it often demands extensive manual effort and lacks scalability. One
promising solution is to view task progress as a dense reward signal, as it
quantifies the degree to which actions advance the system toward task
co..."
via Arxiv 👤 Chenxi Whitehouse, Sebastian Ruder, Tony Lin et al. • 2025-09-30
⚡ Score: 6.8
"Ensuring native-like quality of large language model (LLM) responses across
many languages is challenging. To address this, we introduce MENLO, a framework
that operationalizes the evaluation of native-like response quality based on
audience design-inspired mechanisms. Using MENLO, we create a datas..."
via Arxiv 👤 Jessica Bader, Mateusz Pach, Maria A. Bravo et al. • 2025-09-30
⚡ Score: 6.5
"Text-to-Image (T2I) generation models have advanced rapidly in recent years,
but accurately capturing spatial relationships like "above" or "to the right
of" poses a persistent challenge. Earlier methods improved spatial relationship
following with external position control. However, as architecture..."
via Arxiv 👤 Alexander Fishkov, Kajetan Schweighofer, Mykyta Ielanskyi et al. • 2025-09-30
⚡ Score: 6.3
"Quantifying uncertainty of machine learning model predictions is essential
for reliable decision-making, especially in safety-critical applications.
Recently, uncertainty quantification (UQ) theory has advanced significantly,
building on a firm basis of learning with proper scoring rules. However, t..."
via Arxiv 👤 Florian Grötschla, Longxiang Jiao, Luca A. Lanzendörfer et al. • 2025-09-30
⚡ Score: 6.3
"We introduce Panama, an active learning framework to train parametric guitar
amp models end-to-end using a combination of an LSTM model and a WaveNet-like
architecture. With Panama, one can create a virtual amp by recording samples
that are determined through an ensemble-based active learning strate..."
via Arxiv 👤 Seiji Maekawa, Jackson Hassell, Pouya Pezeshkpour et al. • 2025-09-30
⚡ Score: 6.3
"As language models gain access to external tools via structured function
calls, they become increasingly capable of solving complex, multi-step
tasks. However, existing benchmarks for tool-augmented language models (TaLMs)
provide insufficient control over factors such as the number of function..."
"As large language models (LLMs) begin to saturate existing benchmarks,
automated benchmark creation using LLMs (LLM as a benchmark) has emerged as a
scalable alternative to slow and costly human curation. While these generated
test sets have the potential to cheaply rank models, we demonstrate a crit..."
via Arxiv 👤 João Vitorino, Eva Maia, Isabel Praça et al. • 2025-09-30
⚡ Score: 6.3
"Due to the susceptibility of Artificial Intelligence (AI) to data
perturbations and adversarial examples, it is crucial to perform a thorough
robustness evaluation before any Machine Learning (ML) model is deployed.
However, examining a model's decision boundaries and identifying potential
vulnerabi..."
💬 "LeCun has correctly identified that LLM is only one type of intelligence"
• "This seems like the same exact talk LeCun has been giving for years"