WELCOME TO METAMESH.BIZ +++ OpenAI drops Sora 2 claiming it's the "GPT-3.5 moment for video" while shipping an iOS app for your cousins to deepfake themselves +++ Periodic Labs vacuum-cleaners 20+ researchers from the usual suspects to make AI do actual science instead of writing LinkedIn posts +++ Cerebras casually raises another $1.1B because training runs don't pay for themselves +++ THE FUTURE IS MULTIMODAL, VENTURE-FUNDED, AND GENERATING COMPREHENSION DEBT AT SCALE +++
+++ OpenAI releases its video generation sequel with improved physics and cinematic flair, only to discover users immediately started cloning SpongeBob. +++
"Finally got my hands on a Sora 2 access code (everyone's fighting for these rn) and dropped my first video.
First 3 image gens failed due to moderation..
Here the prompt I used for my text to video!
If you have any questions HMU or reply with a prompt request!
Prompt: A frantic 90s action movie ..."
"Please feel free to share, exchange or contact each other for Sora 2 invite codes. And if you used a code, please comment that it has been used. Thanks everyone for participating!"
💬 "If the code doesn't work for you, it's probably because your IP is not US/CA."
• "Please report anyone offering to sell codes. Do not attempt to buy codes; there is a very high chance you'll get scammed."
"A new Thinking Machines blog led by John Schulman (OpenAI co-founder) shows how LoRA in reinforcement learning (RL) can match full-finetuning performance when done right! And all while using 2/3 of the resources of FFT. Blog: [https://thinkingmachines.ai/blog/lora/](https://thinkingmachines.ai/blog/..."
"We just finished evaluating Sonnet 4.5 on SWE-bench verified with our minimal agent and it's quite a big leap, reaching 70.6% making it the solid #1 of all the models we have evaluated.
This is all independently run with a minimal agent with a very common sense prompt that is the same for all lang..."
🎯 Model Comparisons • Benchmark Methodology • Model Flexibility
💬 "GPT-5 mini price to performance is insane"
• "Need to compare high effort GPT-5 to Sonnet"
🏢 BUSINESS
OpenAI and Stripe Agentic Commerce Protocol
3x SOURCES • 2025-09-29
⚡ Score: 8.5
+++ The ChatGPT maker releases Agentic Commerce Protocol specs, letting AI agents buy things online without humans fumbling through checkout forms. +++
🎯 LLM impact on software engineering • Importance of code review and quality • Evolving software development practices
💬 "LLMs absolutely produce reams of hard-to-debug code. It's a real problem."
• "Teams that care about quality will take the time to review and understand LLM-generated code is already failing."
💬 "we run our agent process in a locked-down rootless podman container"
• "Exposing an API to the agent that specifically give it access to the above data, avoiding the risk altogether"
🎯 Cerebras' performance and adoption • Alternatives to Nvidia GPUs • Tradeoffs in model performance
💬 "Cerebras has been a true revelation when it comes to inference"
• "Sooner or later, lots of competitors including Cerebras are going to take apart Nvidia's data center market share"
🎯 Synthetic data evaluation • Model generalization • Fine-tuning for task-specific performance
💬 "Essentially, model trained on synthetic arXiv/PubMed/FDA extractions performs better on more synthetic arXiv/PubMed/FDA extractions than a model that never saw this distribution."
• "It's wild to me how many people still think that fine-tuning doesn't work."
💬 HackerNews Buzz: 175 comments
MID OR MIXED
🎯 Censorship and Regulation • AI Safety and Oversight • Unintended Consequences
💬 "The government doesn't get to create new categories of dangerous speech just because the technology is new."
• "Once you accept the premise that government can mandate content restrictions for safety, you've lost the argument."
🎯 Data privacy concerns • AI adoption challenges • AI startup landscape
💬 "We're testing the use of AI to aggregate and explain patterns in the data we have, but this is limited to our ticketing systems and Slack."
• "AI might be great. AI might be terrible. I'm not all convinced that most data aggregation features baked into AI and used by most normal companies couldn't be implemented in R or SQL."
via Arxiv 👤 Xiangxin Zhou, Zichen Liu, Haonan Wang et al. • 2025-09-26
⚡ Score: 7.7
"We introduce a variational reasoning framework for language models that
treats thinking traces as latent variables and optimizes them through
variational inference. Starting from the evidence lower bound (ELBO), we extend
it to a multi-trace objective for tighter bounds and propose a forward-KL
form..."
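For readers who want the starting point spelled out: if the thinking trace is treated as a latent variable $z$ between a question $x$ and an answer $y$, the textbook ELBO takes the form below. The symbols $x$, $y$, $z$, $p_\theta$ and $q_\phi$ are assumed notation for illustration and may not match the paper's; the multi-trace and forward-KL extensions mentioned in the abstract are not shown.

```latex
\log p_\theta(y \mid x)
  = \log \sum_{z} p_\theta(z \mid x)\, p_\theta(y \mid x, z)
  \;\ge\; \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
  - \mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x) \right)
```

The gap between the two sides is $\mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x, y) \right)$, so the bound is tight when the variational distribution matches the true posterior over traces.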
via Arxiv 👤 Xingyu Fu, Siyi Liu, Yinuo Xu et al. • 2025-09-26
⚡ Score: 7.3
"Can humans identify AI-generated (fake) videos and provide grounded reasons?
While video generation models have advanced rapidly, a critical dimension --
whether humans can detect deepfake traces within a generated video, i.e.,
spatiotemporal grounded visual artifacts that reveal a video as machine..."
via Arxiv 👤 Junkang Wu, Kexin Huang, Jiancan Wu et al. • 2025-09-26
⚡ Score: 7.3
"Reinforcement Learning with Verifiable Rewards (RLVR) strengthens LLM
reasoning, but training often oscillates between {entropy collapse} and
{entropy explosion}. We trace both hazards to the mean baseline used in
value-free RL (e.g., GRPO and DAPO), which improperly penalizes
negative-advantage sam..."
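For context, the "mean baseline" in value-free methods can be sketched as follows: several responses are sampled for one prompt, and each response's advantage is its reward minus the group's mean reward, commonly also divided by the group's standard deviation. This is a minimal illustration of that general idea, not the paper's proposed fix; the function name, the `normalize` flag and the reward values are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, normalize=True, eps=1e-6):
    """Advantage of each sampled response relative to its group's mean reward.

    rewards: scalar rewards for responses sampled from the same prompt.
    normalize: if True, also divide by the group's standard deviation
               (eps avoids division by zero when all rewards are equal).
    """
    baseline = mean(rewards)  # the mean baseline
    centered = [r - baseline for r in rewards]
    if not normalize:
        return centered
    scale = pstdev(rewards) + eps
    return [c / scale for c in centered]


# Hypothetical verifiable rewards: 1.0 for a correct answer, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards, normalize=False)
print(advantages)
```

Responses that score below the group mean receive a negative advantage, which is the category of samples the abstract says the mean baseline handles improperly.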
via Arxiv 👤 Amandeep Kumar, Nithin Gopalakrishnan Nair, Vishal M. Patel • 2025-09-26
⚡ Score: 6.9
"Autoregressive (AR) transformers have emerged as a powerful paradigm for
visual generation, largely due to their scalability, computational efficiency
and unified architecture with language and vision. Among them, next scale
prediction Visual Autoregressive Generation (VAR) has recently demonstrated..."
via Arxiv 👤 Siwei Wang, Yifei Shen, Haoran Sun et al. • 2025-09-26
⚡ Score: 6.8
"Recent reinforcement learning (RL) methods have substantially enhanced the
planning capabilities of Large Language Models (LLMs), yet the theoretical
basis for their effectiveness remains elusive. In this work, we investigate
RL's benefits and limitations through a tractable graph-based abstraction,..."
"A PyTorch add-on that shows *GPU/CPU/memory usage per layer* while training. The goal: make efficiency problems visible without digging into Nsights or heavy profilers. Github link
Training runs often crash with CUDA OOM errors but it’s hard to know which l..."
via Arxiv 👤 Luc Boudier, Loris Manganelli, Eleftherios Tsonis et al. • 2025-09-26
⚡ Score: 6.8
"Few-shot image classification remains challenging due to the limited
availability of labeled examples. Recent approaches have explored generating
synthetic training data using text-to-image diffusion models, but often require
extensive model fine-tuning or external information sources. We present a..."
via Arxiv 👤 Ke Wang, Houxing Ren, Zimu Lu et al. • 2025-09-26
⚡ Score: 6.8
"The growing capabilities of large language models and multimodal systems have
spurred interest in voice-first AI assistants, yet existing benchmarks are
inadequate for evaluating the full range of these systems' capabilities. We
introduce VoiceAssistant-Eval, a comprehensive benchmark designed to as..."
💬 "Running full containerized applications with many versions of Postgres at the same time sounds very heavy for a dev laptop."
• "I found the diffs, Sculptor's internal to-do list, and summaries all helpful to this end."
via Arxiv 👤 Xingyu Shen, Yingfa Chen, Zhen Leng Thai et al. • 2025-09-26
⚡ Score: 6.6
"While Transformer-based models have demonstrated remarkable language modeling
performance, their high complexities result in high costs when processing long
contexts. In contrast, recurrent neural networks (RNNs) such as linear
attention and state space models have gained popularity due to their con..."
"Iβve never written a real line of code in my life. I ran a SaaS years ago (outsourced devs), Iβm tech-curious, and I figured AI IDEs might finally let me build stuff myself.
**Round 1: The dopamine prototypes**
Bolt, Lovable, Replit. Looked amazing in hours. “Working”? Not really. I’d spend **wee..."
"You can now track your usage in real time across Claude Code and the Claude apps.
* Claude Code: /usage slash command
* Claude apps: Settings -> Usage
The weekly rate limits we announced in July ..."
"1. California Governor Newsom signs landmark AI safety bill SB 53.\[1\]
2. **Anthropic** launches Claude Sonnet 4.5, its latest AI model that’s “more of a colleague”\[2\]
3. **OpenAI** takes on Google, Amazon with new agentic shopping system.\[3\]
4. U.S. rejects international AI oversight at U.N. G..."
via Arxiv 👤 Yasmine Omri, Connor Ding, Tsachy Weissman et al. • 2025-09-26
⚡ Score: 6.3
"Modern vision language pipelines are driven by RGB vision encoders trained on
massive image text corpora. While these pipelines have enabled impressive zero
shot capabilities and strong transfer across tasks, they still inherit two
structural inefficiencies from the pixel domain: (i) transmitting de..."
via Arxiv 👤 Renjie Luo, Zichen Liu, Xiangyan Liu et al. • 2025-09-26
⚡ Score: 6.3
"LLMs are often trained with RL from human or AI feedback, yet such methods
typically compress nuanced feedback into scalar rewards, discarding much of
their richness and inducing scale imbalance. We propose treating verbal
feedback as a conditioning signal. Inspired by language priors in text-to-ima..."
via Arxiv 👤 Chih Yao Hu, Yang-Sen Lin, Yuna Lee et al. • 2025-09-26
⚡ Score: 6.3
"We present See, Point, Fly (SPF), a training-free aerial vision-and-language
navigation (AVLN) framework built atop vision-language models (VLMs). SPF is
capable of navigating to any goal based on any type of free-form instructions
in any kind of environment. In contrast to existing VLM-based approa..."
💬 HackerNews Buzz: 19 comments
GOATED ENERGY
🎯 Secure data access • Permissions and confidentiality • GPU-powered search and processing
💬 "How can I be the one to set up the system for our company, but ensure that only files that I've explicitly shared with the company are ingested?"
• "Being able to categorize by likely confidentiality, and allowing an administrator to partition access on a project and sub-project basis based on that, might be crucial for growth."
via Arxiv 👤 Ziyu Liu, Yuhang Zang, Shengyuan Ding et al. • 2025-09-26
⚡ Score: 6.1
"Recent Large Language Models (LLMs) and Large Vision-Language Models (LVLMs)
increasingly use Reinforcement Learning (RL) for post-pretraining, such as RL
with Verifiable Rewards (RLVR) for objective tasks and RL from Human Feedback
(RLHF) for subjective tasks. However, RLHF incurs high costs and po..."