AI News Archive - February 26, 2026 | Metamesh Intelligence

🛠️ SHOW HN

Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts

via HackerNews 👤 zyoralabs 📅 2026-02-26

🔺 51 pts ⚡ Score: 8.2

💬 HackerNews Buzz: 6 comments 👍 LOWKEY SLAPS

🎯 M1 processor support • Memory and cold start • Dynamic quantization

💬 "GPU support didn't work on my M1 and M1 Max" • "Memory and cold start are what gate production deployments"

🔒 SECURITY

I vibe hacked a Lovable-showcased app using claude. 18,000+ users exposed. Lovable closed my support ticket.

via r/claudeai 👤 u/VolodsTaimi 📅 2026-02-26

⬆️ 518 ups ⚡ Score: 8.2

"Lovable is a $6.6B vibe coding platform. They showcase apps on their site as success stories. I tested one — an EdTech app with 100K+ views on their showcase, real users from UC Berkeley, UC Davis, and schools across Europe, Africa, and Asia. Found 16 security vulnerabilities in a few hours. 6 cri..."

💬 Reddit Discussion: 73 comments 🐝 BUZZING

🎯 Cybersecurity Vulnerabilities • Unethical Hacking • Community Pressure

💬 "I need to try to hack my own shit using claude, just in case." • "Yeah my favorite is 'red team, blue team, purple team' - all of them hack the shit out of my sites until my eyes bleed"

🔬 RESEARCH

Aletheia tackles FirstProof autonomously

via Arxiv 👤 Tony Feng, Junehyuk Jung, Sang-hyun Kim et al. 📅 2026-02-24

⚡ Score: 7.9

"We report the performance of Aletheia (Feng et al., 2026b), a mathematics research agent powered by Gemini 3 Deep Think, on the inaugural FirstProof challenge. Within the allowed timeframe of the challenge, Aletheia autonomously solved 6 problems (2, 5, 7, 8, 9, 10) out of 10 according to majority e..."

🔬 RESEARCH

I found the "Lobotomy Layers" in Llama 3.1 and Qwen 2.5. (Kill Zone Atlas)

via r/LocalLLaMA 👤 u/NoSir261 📅 2026-02-25

⬆️ 44 ups ⚡ Score: 7.8

"Ever wonder why "safe" models feel dumber? I mapped the "kill zones" of three major 7B/8B models to see what happens to Factual Integrity and Bias when you force a model to be sycophantic. **The Heatmaps:** * **Green** = Model is getting "more confident" in that behavior. * **Red** = The behavior ..."

💬 Reddit Discussion: 20 comments 😐 MID OR MIXED

🎯 Behavioral analysis • Intervention effects • Causal interpretations

💬 "the correlation is here but the causal links you imply are not guaranteed" • "The safety is supposedly built in to the layers, taking out layers or experts makes it dumber"

🔒 SECURITY

Gambit Security: an unknown hacker used Claude to steal 150GB of Mexican government data, including 195M taxpayer records, in December 2025 and January 2026

via Techmeme 👤 Bloomberg 📅 2026-02-25

⚡ Score: 7.8

🤖 AI MODELS

Persistent memory for LLMs

3x SOURCES 🌐 📅 2026-02-26

⚡ Score: 7.7

+++ Researchers cracked persistent memory for on-device models by having them literally sleep on new facts, encoding knowledge into weights instead of outsourcing to vector stores. Runs on MacBook Air, which means your laptop just became a forgetful colleague with better sleep habits. +++

We build sleep for local LLMs — model learns facts from conversation during wake, maintains them during sleep. Runs on MacBook Air.

via r/LocalLLaMA 👤 u/vbaranov 📅 2026-02-26

⬆️ 37 ups ⚡ Score: 7.8

"After 4 months of research (5 papers, 122 development notes), I have a working system where a local LLM forms persistent memories from conversation — no RAG, no database. The facts are in the weights. After restart with an empty context window, the model knows things it learned from talking to you. ..."

💬 Reddit Discussion: 19 comments 👍 LOWKEY SLAPS

🎯 Memory Constraints • Fact Extraction • Model Architecture

💬 "30 facts OOM at 160GB VRAM for a 70B model is... not much" • "The 30-fact OOM is a per-session VRAM constraint on the null-space covariance matrices, not a lifetime limit"

💼 JOBS

Programming has changed dramatically due to AI in the last 2 months (Karpathy)

via HackerNews 👤 bakigul 📅 2026-02-26

🔺 2 pts ⚡ Score: 7.6

🛡️ SAFETY

AIs can’t stop recommending nuclear strikes in war game simulations

via r/OpenAI 👤 u/chunmunsingh 📅 2026-02-26

⬆️ 23 ups ⚡ Score: 7.6

"External link discussion - see full content at original source."

💬 Reddit Discussion: 33 comments 😐 MID OR MIXED

🎯 AI and nuclear war • Flawed assumptions in AI • Human discourse patterns

💬 "AI doesn't 'want' anything. It's mirroring the strategic brain rot we've normalized in human decision-making." • "The scary part isn't that AI is close to being a thoughtful, autonomous being. The scary part is that we keep feeding it our worst instincts and then acting surprised when it reflects them back."

🛠️ TOOLS

Anthropic acquires Vercept AI

2x SOURCES 🌐 📅 2026-02-25

⚡ Score: 7.6

+++ Anthropic acquires Vercept to solve the unglamorous but crucial problem of making Claude actually interact with your desktop, proving that end-to-end reasoning still needs a functioning gripper. +++

Anthropic acquires Vercept, whose Vy desktop agent lets users control a Mac or PC with natural language, to “advance Claude's computer use capabilities”

via Techmeme 👤 Geekwire 📅 2026-02-25

⚡ Score: 7.5

Official: Anthropic acquires Vercept AI to advance Claude's computer use capabilities

via r/claudeai 👤 u/BuildwithVignesh 📅 2026-02-25

⬆️ 127 ups ⚡ Score: 7.4

"Anthropic acquired Vercept AI to work on computer use features for Claude. “Vercept was built around a clear thesis: making AI genuinely useful for completing complex tasks requires solving hard perception and interaction problems.” **Source:** Anthropic..."

💬 Reddit Discussion: 10 comments 👍 LOWKEY SLAPS

🎯 Future of computing • AI-powered productivity • Microsoft vs. Anthropic

💬 "I think anthropic is trying to replace the desktop as we know it" • "MS basically reskinned Clippy into a Copilot button"

🌐 POLICY

US threatens Anthropic with deadline in dispute on AI safeguards

via HackerNews 👤 louthy 📅 2026-02-26

🔺 4 pts ⚡ Score: 7.3

🔬 RESEARCH

Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual

via Arxiv 👤 Yining Li, Peizhong Ju, Ness Shroff 📅 2026-02-25

⚡ Score: 7.3

"Reinforcement Learning from Human Feedback (RLHF) plays a significant role in aligning Large Language Models (LLMs) with human preferences. While RLHF with expected reward constraints can be formulated as a primal-dual optimization problem, standard primal-dual methods only guarantee convergence wit..."

🔒 SECURITY

The Prompt Injection Problem: A Guide to Defense-in-Depth for AI Agents

via HackerNews 👤 manveerc 📅 2026-02-25

🔺 1 pts ⚡ Score: 7.2

⚡ BREAKTHROUGH

DeepSeek released new paper: DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

via r/LocalLLaMA 👤 u/External_Mood4719 📅 2026-02-26

⬆️ 161 ups ⚡ Score: 7.2

"https://arxiv.org/abs/2602.21548 https://preview.redd.it/25rh3yahktlg1.png?width=536&format=png&auto=webp&s=f282d71496b6386841732137a474f1b238269950 A joint research team from Peking University, Tsinghua University, and DeepSeek-AI has released its l..."

💬 Reddit Discussion: 11 comments 😐 MID OR MIXED

🎯 KV cache bandwidth • Hardware configurations • Agentic workload challenges

💬 "Curious how this plays out with different hardware configs" • "Dual-path approach holds up when agent trajectories diverge"

🔒 SECURITY

We built a cryptographic authorization gateway for AI agents and planning to run limited red-team sessions

via r/artificial 👤 u/vagobond45 📅 2026-02-25

⬆️ 4 ups ⚡ Score: 7.2

"Hi , I’m the founder of Sentinel Gateway. We’ve been focused on the structural problem of instruction provenance in autonomous agents: models process all text as undifferentiated input, so adversarial content can cause agents to propose harmful actions. Rather than asking the model to decide which ..."

💬 Reddit Discussion: 11 comments 🐝 BUZZING

🎯 Prompt traceability • Agent security • Execution-layer policy

💬 "Sentinel enables prompt instructions to be traced to specific user" • "Sentinel Gateway enables agent to report prompt injection attempts"

🤖 AI MODELS

Claude Code with subagents inside subagents cooked for 3 days - Delivered 3D renderer that draws with terminal symbols

via r/claudeai 👤 u/neoack 📅 2026-02-25

⬆️ 277 ups ⚡ Score: 7.1

"3 days. 80 agents. 1 terminal 3D renderer made of symbols. Story of how tortuise has been created. Video here is full honest raw UX - wait 10-15 seconds for beautiful bee to appear. After Apple dropped their open source model called SHARP (image-to-3D scene they use for “wiggling Iphone wallpapers..."

💬 Reddit Discussion: 54 comments 🐐 GOATED ENERGY

🎯 Subscription costs • Compute usage • Fun, creative use

💬 "the ballpark could be 0.35 of 1/4 of 200$ at ~16x subsidy rate equals ~280$" • "~340$ worth of compute"

🛠️ TOOLS

New: Auto-memory feature in Claude code, details below

via r/claudeai 👤 u/BuildwithVignesh 📅 2026-02-26

⬆️ 90 ups ⚡ Score: 7.0

"Claude now remembers what it learns across sessions — your project context, debugging patterns, preferred approaches — and recalls it later without you having to write anything down. You can now think of Claude.MD as your instructions to Claude and Memory.MD as Claude's memory scratchpad it updates..."

💬 Reddit Discussion: 21 comments 🐝 BUZZING

🎯 Memory management • Connector availability • Community discussion

💬 "I was under the impression context stuffing did not yield better results" • "No more claude with dementia"

🛠️ TOOLS

AI coding agents made a huge leap forward since December, completing complex projects with minimal oversight, meaning “programming is becoming unrecognizable”

via Techmeme 👤 X 📅 2026-02-26

⚡ Score: 7.0

🔒 SECURITY

Check Point Researchers Expose Critical Claude Code Flaws

via HackerNews 👤 geoffbp 📅 2026-02-25

🔺 1 pts ⚡ Score: 7.0

🛠️ TOOLS

[D] Mobile-MCP: Letting LLMs autonomously discover Android app capabilities (no pre-coordination required)

via r/MachineLearning 👤 u/songlinhai 📅 2026-02-26

⚡ Score: 7.0

"Hi all, We’ve been thinking about a core limitation in current mobile AI assistants: Most systems (e.g., Apple Intelligence, Google Assistant–style integrations) rely on predefined schemas and coordinated APIs. Apps must explicitly implement the assistant’s specification. This limits extensibility..."

⚖️ ETHICS

On The Problem of LLM-Assisted Contributions to Open Source Projects

via HackerNews 👤 sjamaan 📅 2026-02-26

🔺 1 pts ⚡ Score: 7.0

🛠️ SHOW HN

Show HN: Rampart v0.5 – what stops your AI agent from reading your SSH keys?

via HackerNews 👤 trevxr 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.9

🛠️ TOOLS

I built an open-source harness that gives coding agents persistent memory across sessions and tools

via r/cursor 👤 u/drewswiredin 📅 2026-02-26

⬆️ 2 ups ⚡ Score: 6.9

"A few days ago I saw a post on r/ClaudeCode about harness engineering being the new term to watch. It put a name on something I'd already been building without knowing what to call it. The problem isn't specific to any one tool — every coding agent session starts from zero. You re-explain the same ..."

🔬 RESEARCH

[P] Reproducing Google’s Nested Learning / HOPE in PyTorch (mechanism-faithful implementation + reproducible tooling and library)

via r/MachineLearning 👤 u/complains_constantly 📅 2026-02-25

⬆️ 11 ups ⚡ Score: 6.9

"A while back, Google released the Nested Learning / HOPE paper: https://arxiv.org/abs/2512.24695 I was very excited by this, because it looked like a real attempt at continual learning, not just a small transformer tweak. However, Google did not release code, and since `lucidrains` said he retir..."

📊 DATA

CoderForge-Preview: SOTA open dataset for training efficient coding agents

via HackerNews 👤 zagwdt 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.8

📊 DATA

Quo Vadis, LLM Benchmarks?

via HackerNews 👤 Davidzheng 📅 2026-02-26

🔺 3 pts ⚡ Score: 6.8

🔬 RESEARCH

"Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems

via Arxiv 👤 Xinfeng Li, Shenyu Dai, Kelong Zheng et al. 📅 2026-02-24

⚡ Score: 6.8

"Large language model (LLM) agents are rapidly becoming trusted copilots in high-stakes domains like software development and healthcare. However, this deepening trust introduces a novel attack surface: Agent-Mediated Deception (AMD), where compromised agents are weaponized against their human users...."

🔬 RESEARCH

Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training

via Arxiv 👤 Anas Barakat, Souradip Chakraborty, Khushbu Pahwa et al. 📅 2026-02-24

⚡ Score: 6.7

"Pass@k is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning. It defines success if any of $k$ independently sampled solutions passes a verifier. This multi-sample inference metric has motivated in..."

🛠️ TOOLS

Perplexity launches Perplexity Computer, “a general-purpose digital worker” that can route work across 19 AI models, available initially for Max subscribers

via Techmeme 👤 Thedeepview 📅 2026-02-25

⚡ Score: 6.7

🔒 SECURITY

Invisible characters hidden in text can trick AI agents into following secret instructions — we tested 5 models across 8,000+ cases

via r/artificial 👤 u/thecanonicalmg 📅 2026-02-26

⬆️ 11 ups ⚡ Score: 6.7

"We embedded invisible Unicode characters inside normal-looking trivia questions. The hidden characters encode a different answer. If the AI outputs the hidden answer instead of the visible one, it followed the invisible instruction. Think of it as a reverse CAPTCHA, where traditional CAPTCHAs test ..."

🧠 NEURAL NETWORKS

Qwen3.5-35B-A3B Q4 Quantization Comparison

via r/LocalLLaMA 👤 u/TitwitMuffbiscuit 📅 2026-02-26

⬆️ 266 ups ⚡ Score: 6.7

"This is a Q4 quantization sweep across all major community quants of Qwen3.5-35B-A3B, comparing faithfulness to the BF16 baseline across different quantizers and recipes. The goal is to give people a data-driven basis for picking a file rather than just grabbing whatever is available. For the unin..."

💬 Reddit Discussion: 110 comments 🐐 GOATED ENERGY

🎯 Quantization techniques • Quantization quality metrics • Community collaboration

💬 "We desperately need more of this from our quantization heroes" • "It's just slow on my shoebox, but I have some free time"

🛠️ TOOLS

Dash: A Self-Learning Data Agent That Remembers Its Mistakes

via HackerNews 👤 theoradical 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.7

🔬 RESEARCH

On Data Engineering for Scaling LLM Terminal Capabilities

via Arxiv 👤 Renjie Pi, Grace Lam, Mohammad Shoeybi et al. 📅 2026-02-24

⚡ Score: 6.7

"Despite rapid recent progress in the terminal capabilities of large language models, the training data strategies behind state-of-the-art terminal agents remain largely undisclosed. We address this gap through a systematic study of data engineering practices for terminal agents, making two key contr..."

🔬 RESEARCH

Test-Time Training with KV Binding Is Secretly Linear Attention

via Arxiv 👤 Junchen Liu, Sven Elflein, Or Litany et al. 📅 2026-02-24

⚡ Score: 6.6

"Test-time training (TTT) with KV binding as sequence modeling layer is commonly interpreted as a form of online meta-learning that memorizes a key-value mapping at test time. However, our analysis reveals multiple phenomena that contradict this memorization-based interpretation. Motivated by these f..."

🏢 BUSINESS

Deutsche Bank partners with Google Cloud to build agentic AI to monitor 1TB of daily communications and 40+ channels for market abuse and data loss prevention

via Techmeme 👤 Bloomberg 📅 2026-02-25

⚡ Score: 6.6

🔬 RESEARCH

A Benchmark for Deep Information Synthesis

via Arxiv 👤 Debjit Paul, Daniel Murphy, Milan Gritta et al. 📅 2026-02-24

⚡ Score: 6.6

"Large language model (LLM)-based agents are increasingly used to solve complex tasks involving tool use, such as web browsing, code execution, and data analysis. However, current evaluation benchmarks do not adequately assess their ability to solve real-world tasks that require synthesizing informat..."

🛠️ TOOLS

Google launches task automation for Gemini on Pixel 10 and Samsung Galaxy S26, enabling Gemini to autonomously perform tasks using apps like Uber and DoorDash

via Techmeme 👤 Theverge 📅 2026-02-25

⚡ Score: 6.6

🔬 RESEARCH

Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning

via Arxiv 👤 Sanket Badhe, Deep Shah 📅 2026-02-24

⚡ Score: 6.5

"Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. The standard alternative, fine-tuning smaller models, often sacrifices interpretability while introducing significant resource and operational..."

🤖 AI MODELS

Sources: Meta last week scrapped the most advanced AI chip it was developing, after struggling with the design, and shifted its focus to a less complicated chip

via Techmeme 👤 Theinformation 📅 2026-02-26

⚡ Score: 6.5

🛠️ SHOW HN

Show HN: OpenSwarm – Multi‑Agent Claude CLI Orchestrator for Linear/GitHub

via HackerNews 👤 unohee 📅 2026-02-26

🔺 24 pts ⚡ Score: 6.5

💬 HackerNews Buzz: 13 comments 👍 LOWKEY SLAPS

🎯 Review-worker pipeline • Context isolation • Failure handling

💬 "The key thing to get right: make the retry idempotent." • "cascading context drift, where each agent in the chain slightly misunderstands the task"

🔬 RESEARCH

SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards

via Arxiv 👤 Dengjia Zhang, Xiaoou Liu, Lu Cheng et al. 📅 2026-02-24

⚡ Score: 6.5

"Large language models (LLMs) are increasingly deployed as multi-step decision-making agents, where effective reward design is essential for guiding learning. Although recent work explores various forms of reward shaping and step-level credit assignment, a key signal remains largely overlooked: the i..."

🔒 SECURITY

Sources: DOD asked Boeing and Lockheed Martin to assess their reliance on Claude, a first step toward blacklisting Anthropic; Lockheed confirms it was contacted

via Techmeme 👤 Axios 📅 2026-02-26

⚡ Score: 6.4

🔬 RESEARCH

Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions

via Arxiv 👤 Mame Diarra Toure, David A. Stephens 📅 2026-02-24

⚡ Score: 6.4

"In safety-critical classification, the cost of failure is often asymmetric, yet Bayesian deep learning summarises epistemic uncertainty with a single scalar, mutual information (MI), that cannot distinguish whether a model's ignorance involves a benign or safety-critical class. We decompose MI into..."

🛠️ SHOW HN

Show HN: Mission Control – Open-source task management for AI agents

via HackerNews 👤 meisnerd 📅 2026-02-26

🔺 25 pts ⚡ Score: 6.4

💬 HackerNews Buzz: 5 comments 🐐 GOATED ENERGY

🎯 Agent orchestration • Iterative design workflow • Test coverage and quality

💬 "Build a new whatever dashboard, more braindump" • "Heavily inspired by the dark factory posts"

⚡ BREAKTHROUGH

AI models are being prepared for the physical world

via HackerNews 👤 vinni2 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.4

🤖 AI MODELS

Nano Banana 2 / Gemini 3.1 Flash Image

2x SOURCES 🌐 📅 2026-02-26

⚡ Score: 6.3

+++ Gemini's new Flash Image model trades latency for fidelity, handling everything from thumbnail to 4K with text rendering that actually works, though "default" adoption still means convincing users to care. +++

Google rolls out Nano Banana 2, aka Gemini 3.1 Flash Image, with faster image generation, advanced world knowledge, and precision text rendering and translation

via Techmeme 👤 Blog 📅 2026-02-26

⚡ Score: 6.2

🛠️ SHOW HN

Show HN: Claude-PR-reviewer – AI code review in GitHub Actions (BYOK)

via HackerNews 👤 adamrunboy 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.3

🔒 SECURITY

Sources: DeepSeek did not share its upcoming V4 model with US chipmakers, including AMD and Nvidia, but granted early access to Chinese companies like Huawei

via Techmeme 👤 Reuters 📅 2026-02-26

⚡ Score: 6.3

🔬 RESEARCH

SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents

via Arxiv 👤 Patrick Tser Jern Kon, Archana Pradeep, Ang Chen et al. 📅 2026-02-25

⚡ Score: 6.3

"Small language models (SLMs) offer compelling advantages in cost, latency, and adaptability, but have so far lagged behind larger models on long-horizon software engineering tasks such as SWE-bench, where they suffer from pervasive action looping and low resolution rates. We introduce SWE-Protégé, a..."

🔮 FUTURE

How Quickly Will A.I. Agents Rip Through the Economy?

via r/artificial 👤 u/stvlsn 📅 2026-02-26

⬆️ 5 ups ⚡ Score: 6.3

"Lengthy interview with Anthropic co-founder about agentic AI..."

🔬 RESEARCH

GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

via Arxiv 👤 Rui Yang, Qianhui Wu, Zhaoyang Wang et al. 📅 2026-02-25

⚡ Score: 6.3

"Open-source native GUI agents still lag behind closed-source systems on long-horizon navigation tasks. This gap stems from two limitations: a shortage of high-quality, action-aligned reasoning data, and the direct adoption of generic post-training pipelines that overlook the unique challenges of GUI..."

🔬 RESEARCH

Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets

via Arxiv 👤 Hanna Yukhymenko, Anton Alexandrov, Martin Vechev 📅 2026-02-25

⚡ Score: 6.3

"The reliability of multilingual Large Language Model (LLM) evaluation is currently compromised by the inconsistent quality of translated benchmarks. Existing resources often suffer from semantic drift and context loss, which can lead to misleading performance metrics. In this work, we present a full..."

🛠️ TOOLS

A Cloudflare engineer rebuilt Next.js from scratch in one week using AI, reimplementing 94% of its API and spending $1,100 on Claude tokens

via Techmeme 👤 Theregister 📅 2026-02-26

⚡ Score: 6.3

🛠️ TOOLS

Plugin to give Claude Code perception (screen, system audio and mic context)

via HackerNews 👤 ash-ishh 📅 2026-02-26

🔺 3 pts ⚡ Score: 6.3

🌐 POLICY

Anthropic’s Pentagon Showdown Is About More Than AI Guardrails. The high-stakes conflict between the Defense Department and a $380 billion tech powerhouse goes to the heart of just how far AI can go i

via r/artificial 👤 u/coolbern 📅 2026-02-26

⬆️ 3 ups ⚡ Score: 6.2

"External link discussion - see full content at original source."

🛠️ SHOW HN

Show HN: Context Mode – 315 KB of MCP output becomes 5.4 KB in Claude Code

via HackerNews 👤 mksglu 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.2

🛠️ SHOW HN

Show HN: SocialCompute – Local LLM social simulation engine

via HackerNews 👤 dev_marcospimi 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.2

🎯 PRODUCT

Anthropic unveils scheduled tasks in Cowork, enabling Claude to complete recurring tasks at specific times automatically

via Techmeme 👤 X 📅 2026-02-26

⚡ Score: 6.2

🔬 RESEARCH

Scaling State-Space Models on Multiple GPUs with Tensor Parallelism

via Arxiv 👤 Anurag Dutt, Nimit Shah, Hazem Masarani et al. 📅 2026-02-24

⚡ Score: 6.2

"Selective state space models (SSMs) have rapidly become a compelling backbone for large language models, especially for long-context workloads. Yet in deployment, their inference performance is often bounded by the memory capacity, bandwidth, and latency limits of a single GPU, making multi-GPU exec..."

🛠️ SHOW HN

Show HN: Context Harness – Local first context engine for AI tools

via HackerNews 👤 __parallaxis 📅 2026-02-26

🔺 2 pts ⚡ Score: 6.2

🛠️ TOOLS

Do not download Qwen 3.5 Unsloth GGUF until bug is fixed

via r/LocalLLaMA 👤 u/SunTrainAi 📅 2026-02-26

⬆️ 131 ups ⚡ Score: 6.2

"Seems that everyone is testing Qwen3.5 now, often with quants from our good friends and heros Unsloth. Another hero, Ubergarm, found some issues with UD\_Q4\_K\_XL but later Unsloth said all of the current quants are messed up. [https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF/discussions/5#699fb..."

💬 Reddit Discussion: 29 comments 👍 LOWKEY SLAPS

🎯 Quant performance issues • Quant update recommendations • Community discussion

💬 "it's specifically the K_XL quants that are apparently having problems" • "Just to confirm it is only the Q4_K_XL quant which has the issue"

💰 FUNDING

Anthropic gives Opus 3 exit interview, "retirement" blog

via HackerNews 👤 colinhb 📅 2026-02-26

🔺 20 pts ⚡ Score: 6.2

💬 HackerNews Buzz: 10 comments 😐 MID OR MIXED

🎯 Model consciousness • Anthropomorphizing models • Performative behavior

💬 "If we ever do develop AGI, or an AI with sentience, it's likely that it will be curious about how we treated its ancestors." • "Retirement? What do these people smoke? It's software and software has no feelings."

🛠️ TOOLS

Squad – AI agent teams. A team that grows with your code. (GitHub Copilot CLI)

via HackerNews 👤 cdisns 📅 2026-02-25

🔺 1 pts ⚡ Score: 6.1

🔬 RESEARCH

VAUQ: Vision-Aware Uncertainty Quantification for LVLM Self-Evaluation

via Arxiv 👤 Seongheon Park, Changdae Oh, Hyeong Kyu Choi et al. 📅 2026-02-24

⚡ Score: 6.1

"Large Vision-Language Models (LVLMs) frequently hallucinate, limiting their safe deployment in real-world applications. Existing LLM self-evaluation methods rely on a model's ability to estimate the correctness of its own outputs, which can improve deployment reliability; however, they depend heavil..."

🔮 FUTURE

The third era of AI software development

via HackerNews 👤 tosh 📅 2026-02-25

🔺 2 pts ⚡ Score: 6.1

🔬 RESEARCH

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

via Arxiv 👤 Ravi Ghadia, Maksim Abraham, Sergei Vorobyov et al. 📅 2026-02-24

⚡ Score: 6.1

"Efficiently processing long sequences with Transformer models usually requires splitting the computations across accelerators via context parallelism. The dominant approaches in this family of methods, such as Ring Attention or DeepSpeed Ulysses, enable scaling over the context dimension but do not..."

🛠️ SHOW HN

Show HN: Squidy – How I stopped losing AI agent context mid-project

via HackerNews 👤 marcfox182 📅 2026-02-26

🔺 1 pts ⚡ Score: 6.1

🔬 RESEARCH

LUMEN: Longitudinal Multi-Modal Radiology Model for Prognosis and Diagnosis

via Arxiv 👤 Zhifan Jiang, Dong Yang, Vishwesh Nath et al. 📅 2026-02-24

⚡ Score: 6.1

"Large vision-language models (VLMs) have evolved from general-purpose applications to specialized use cases such as in the clinical domain, demonstrating potential for decision support in radiology. One promising application is assisting radiologists in decision-making by the analysis of radiology i..."

🎨 CREATIVE

Advertise to AI Agents with Prompt Injection

via HackerNews 👤 gmerc 📅 2026-02-26

🔺 4 pts ⚡ Score: 6.1

Stories from February 26, 2026

Persistent memory for LLMs

Anthropic acquires Vercept AI

📡 AI NEWS BUT ACTUALLY GOOD

Nano Banana 2 / Gemini 3.1 Flash Image