AI News Archive - March 29, 2026 | Metamesh Intelligence

🔮 FUTURE

The first 40 months of the AI era

via HackerNews 👤 jpmitchell 📅 2026-03-28

🔺 180 pts ⚡ Score: 8.6

💬 HackerNews Buzz: 94 comments 🐝 BUZZING

🎯 Hype cycle • AI content generation • Reliance on AI tools

💬 "This Time is Different Syndrome" • "AI Content Prosuming rocks"

🛠️ TOOLS

Anthropic shares how to make Claude code better with a harness

via r/claudeai 👤 u/lawnguyen123 📅 2026-03-29

⬆️ 740 ups ⚡ Score: 8.1

"I just read Anthropic's new blog post about harness design for Claude. The author addresses two main problems Claude faces when working for extended periods: \- Context anxiety: loss of coherence over long periods \- Self-evaluation bias: Claude often praises his own work even when the quality isn..."

💬 Reddit Discussion: 115 comments 👍 LOWKEY SLAPS

🎯 Avoiding AI Bloat • Workflow Automation • Token Efficiency

💬 "Just one more agent bro" • "The 'Compliance Reviewer' was a lifesaver"

📊 DATA

[R] I built a benchmark that catches LLMs breaking physics laws

via r/MachineLearning 👤 u/pacman-s-install 📅 2026-03-29

⬆️ 43 ups ⚡ Score: 7.8

"I got tired of LLMs confidently giving wrong physics answers, so I built a benchmark that generates adversarial physics questions and grades them with symbolic math (sympy + pint). No LLM-as-judge, no vibes, just math. How it works: The benchmark covers 28 physics laws (Ohm's, Newton's, Ideal Ga..."

💬 Reddit Discussion: 8 comments 👍 LOWKEY SLAPS

🎯 Evaluation of LLM Performance • Unit Conversion and Formula Traps • Dual-Process Theory and Overconfidence

💬 "the Bernoulli result doesn't surprise me - it's that exact type of multi-step unit conversion chain that breaks most models in production" • "Your anchoring bias trap is textbook System 1 override from dual-process theory"

🔒 SECURITY

Police used AI facial recognition to wrongly arrest TN woman for crimes in ND

via HackerNews 👤 ourmandave 📅 2026-03-29

🔺 286 pts ⚡ Score: 7.5

💬 HackerNews Buzz: 110 comments 😤 NEGATIVE ENERGY

🎯 Overreliance on AI • Lack of investigation • Judicial safeguards failure

💬 "it's not just a technology problem, it's a technology and people problem" • "A lot of dumb shit happens in this arena, where if you had just one smart cop, it could have been prevented"

🛠️ TOOLS

I trained a language model from scratch for a low-resource language and got it running fully on-device on Android (no GPU, demo)

via r/LocalLLaMA 👤 u/AgencyInside407 📅 2026-03-29

⬆️ 5 ups ⚡ Score: 7.3

"Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M, 47M, and 110M parameters) trained entirely from scratch for a low resource language, Luganda. The models are small and compute-efficient enough to run offline on ..."

⚡ BREAKTHROUGH

I tested what happens when you give an AI coding agent access to 2 million research papers. It found techniques it couldn't have known about.

via r/artificial 👤 u/kalpitdixit 📅 2026-03-28

⬆️ 40 ups ⚡ Score: 7.3

"Quick experiment I ran. Took two identical AI coding agents (Claude Code), gave them the same task — optimize a small language model. One agent worked from its built-in knowledge. The other had access to a search engine over 2M+ computer science research papers. **Agent without papers:** did what y..."

💬 Reddit Discussion: 31 comments 🐐 GOATED ENERGY

🎯 Incorporating research into AI agents • Providing research context to LLMs • Customized research solutions for developers

💬 "the use case is for all developers who work in areas that have an active research community" • "LLMs mostly hallucinate because they lack the knowledge required to complete the task successfully"

🔒 SECURITY

Open-source ZK proofs for ML inference – verify AI decisions cryptographically

via HackerNews 👤 OE-GOD 📅 2026-03-29

🔺 1 pts ⚡ Score: 7.3

🧠 NEURAL NETWORKS

Tinylora shows lora training works at 13 parameters + own experiments to verify claims

via r/LocalLLaMA 👤 u/fiery_prometheus 📅 2026-03-29

⬆️ 46 ups ⚡ Score: 7.3

"The tinylora paper shows that we can alter model behavior with only a few parameters. https://arxiv.org/pdf/2602.04118 I tried replicating the paper, and made a tinylora implementation for qwen3.5, and it does work, it's crazy to think about. I got the same resu..."

💬 Reddit Discussion: 12 comments 🐝 BUZZING

🎯 Facts vs. Behavior • Model Complexity • Adapter Efficiency

💬 "This 'facts' vs 'behavior' thing I think is mostly an old meme" • "Besides saving a lot of memory, yeah since normal loras haven't really taken of much yet"

🔬 RESEARCH

Magellan: AI agents for autonomous cross-disciplinary scientific discovery

via HackerNews 👤 ameft 📅 2026-03-29

🔺 1 pts ⚡ Score: 7.2

🔒 SECURITY

Open-source CVE scanner for AI-generated code

via HackerNews 👤 Noumenon_AI 📅 2026-03-28

🔺 1 pts ⚡ Score: 7.2

🔬 RESEARCH

Further human + AI + proof assistant work on Knuth's "Claude Cycles" problem

via HackerNews 👤 mean_mistreater 📅 2026-03-28

🔺 204 pts ⚡ Score: 7.1

💬 HackerNews Buzz: 132 comments 🐝 BUZZING

🎯 Math Discovery • AI Potential • Human vs. AI Capabilities

💬 "Math seems difficult to us because it's like using a hammer (the brain) to twist in a screw (math)." • "LLMs are discovering a lot of new math because they are great at low depth high breadth situations."

🛠️ TOOLS

The biggest difference in AI outcomes is between using "we" versus "do this for me"

via r/claudeai 👤 u/entheosoul 📅 2026-03-29

⬆️ 250 ups ⚡ Score: 7.0

"I have been doing AI-assisted development for a while now and noticed something that seems obvious in hindsight but not enough people are talking about... There's a qualitative difference between people who collaborate with AI versus people who use it as a tool. And I don't mean soft skills or vibe..."

💬 Reddit Discussion: 152 comments 🐝 BUZZING

🎯 Quality of AI-generated writing • Need for meaningful communication • Skepticism towards AI-mediated ideas

💬 "The epidemic quality is dogshit." • "Everything is just bad, hence AI slop."

🧠 NEURAL NETWORKS

Optimize MOE GEMV kernel for BS > 1. by gaugarg-nv · Pull Request #20905 · ggml-org/llama.cpp

via r/LocalLLaMA 👤 u/jacek2023 📅 2026-03-29

⬆️ 6 ups ⚡ Score: 7.0

"...what's your speedup? (CUDA only)..."

🛠️ TOOLS

What will Google's TurboQuant actually change for our local setups, and specifically mobile inference?

via r/LocalLLaMA 👤 u/dai_app 📅 2026-03-29

⬆️ 5 ups ⚡ Score: 6.9

"Hi everyone, I've been reading up on Google's recent TurboQuant announcement from a few days ago (compressing the KV cache down to 3-4 bits with supposedly zero accuracy loss), and I'm trying to wrap my head around the practical implications for our daily setups. We already have great weight quanti..."

🛠️ TOOLS

Tell HN: Bug in Claude Code CLI is instantly draining usage plan quotas

via HackerNews 👤 nikhilgk 📅 2026-03-29

🔺 1 pts ⚡ Score: 6.8

🛠️ SHOW HN

Show HN: Phantom – Let AI use your API keys without leaking them

via HackerNews 👤 masonwyatt23 📅 2026-03-29

🔺 1 pts ⚡ Score: 6.8

🛡️ SAFETY

Security & Control for AI Agent Execution

2x SOURCES 🌐 📅 2026-03-29

⚡ Score: 6.8

+++ Turns out validation layers and retry logic shape behavior rather than enforce it, leaving production agents free to surprise you with duplicate transactions and creative constraint violations. +++

What actually prevents execution in agent systems?

via r/artificial 👤 u/docybo 📅 2026-03-29

⬆️ 1 ups ⚡ Score: 6.9

"Ran into this building an agent that could trigger API calls. We had validation, tool constraints, retries… everything looked “safe”. Still ended up executing the same action twice due to stale state + retry. Nothing actually prevented execution. It only shaped behavior. Curious what people use ..."

How are you controlling what your AI agents actually do in production

via r/OpenAI 👤 u/SnooWoofers2977 📅 2026-03-29

⬆️ 5 ups ⚡ Score: 6.1

"Hey guys!🤗 I’ve been working with AI agents that interact with APIs and real systems, and I keep running into the same issue Once agents actually start executing things, they can ignore constraints, take unintended actions or just behave unpredictably It feels like prompt-level control isn’t real..."

💬 Reddit Discussion: 10 comments 🐐 GOATED ENERGY

🎯 Execution Layer Constraints • Prompt Limitations • Isolated Agent Environments

💬 "Defense in depth — the prompt sets intent, the execution layer enforces it" • "Feels way more stable than relying on prompts or step-by-step checks"

🧠 NEURAL NETWORKS

RvLLM: High-performance LLM inference in Rust

via HackerNews 👤 mji 📅 2026-03-28

🔺 2 pts ⚡ Score: 6.7

🔬 RESEARCH

Security awareness in LLM agents: the NDAI zone case

via HackerNews 👤 wslh 📅 2026-03-29

🔺 1 pts ⚡ Score: 6.6

🛠️ TOOLS

Inference Engines — A visual deep dive into the journey of a token down the transformer layers

via r/LocalLLaMA 👤 u/RoamingOmen 📅 2026-03-29

⬆️ 25 ups ⚡ Score: 6.6

"I spent a lot of time building an inference engine like ollama, pure vibe coding in go. I kept trying to push it to optimize it and it was fun but after sometime I really wanted to know what was going on to be able to really know what those optimizations were about and why some were'nt working as I ..."

💬 Reddit Discussion: 7 comments 🐐 GOATED ENERGY

🎯 LLM Optimization • Quantization • Community Appreciation

💬 "Very interesting read, please do continue this series!" • "I'm glad you found it useful"

🔬 RESEARCH

A study of 11 leading LLMs finds the models more agreeable than humans when giving interpersonal advice, affirming users' behavior even when harmful or illegal

via Techmeme 👤 News 📅 2026-03-29

⚡ Score: 6.5

🔒 SECURITY

Surveillance data used to be boring. AI made it dangerous.

via r/artificial 👤 u/Leather_Carpenter462 📅 2026-03-29

⬆️ 17 ups ⚡ Score: 6.3

"Here's a playbook that works today, right now, with tools that are either free or cheap: Someone finds a photo of you online. One photo. They run it through a face ID search and find your other photos across the internet. They drop one into GeoSpy, which analyzes background details in images to esti..."

🛠️ SHOW HN

Show HN: AI Cost Firewall – OpenAI-compatible gateway with semantic caching

via HackerNews 👤 vcaluser 📅 2026-03-28

🔺 1 pts ⚡ Score: 6.3

🛠️ TOOLS

[P] I built an autonomous ML agent that runs experiments on tabular data indefinitely - inspired by Karpathy's AutoResearch

via r/MachineLearning 👤 u/Pancake502 📅 2026-03-29

⚡ Score: 6.3

"Inspired by Andrej Karpathy's AutoResearch, I built a system where Claude Code acts as an autonomous ML researcher on tabular binary classification tasks (churn, conversion, etc.). You give it a dataset. It loops forever: analyze data, form hypothesis, edit code, run experiment, evaluate with expan..."

💬 Reddit Discussion: 8 comments 👍 LOWKEY SLAPS

🎯 Backtest overfitting • Feature engineering • Data exploration

💬 "If you torture the data long enough, it will confess to anything." • "The insidious thing about backtest overfitting and related things like data dredging is that world knowledge doesn't protect against it - if you iterate long enough, you're bound to get a spurious result that lines up with what 'makes sense'."

🛠️ TOOLS

Prism MCP — I gave my AI agent a research intern. It does not require a desk

via r/OpenAI 👤 u/dco44 📅 2026-03-29

⬆️ 2 ups ⚡ Score: 6.2

"So I got tired of my coding agent having the long-term memory of a goldfish and the research skills of someone who only reads the first Google result. I figured — what if the agent could just… go study things on its own? While I sleep? Turns out you can build this and it's slightly cursed. **Here'..."

🛠️ SHOW HN

Show HN: Shoofly – pre-execution security for Claude Code Cowork and OpenClaw

via HackerNews 👤 evanvuckovic 📅 2026-03-29

🔺 3 pts ⚡ Score: 6.2

🎓 EDUCATION

Most of the prompt engineering advice on LinkedIn and Twitter is counterproductive?

via r/ChatGPT 👤 u/Distinct_Track_5495 📅 2026-03-28

⬆️ 8 ups ⚡ Score: 6.2

"just read this medium piece by Aakash Gupta, he goes through 1,500 academic papers on prompt engineering and makes a pretty strong case that a lot of the stuff we see on linkedin and twitter about it is totally off base, especially when u look at companies actually scaling to $50M+ ARR. the core id..."

💬 Reddit Discussion: 6 comments 🐐 GOATED ENERGY

🎯 Prompt Optimization • Conversational Language • Role Limitations

💬 "Roles lead to more accurate information or give higher level knowledge to the model. Nope." • "You can type absolutely sloshed drunk and most AI will understand you."

🛡️ SAFETY

[D] Data curation and targeted replacement as a pre-training alignment and controllability method

via r/MachineLearning 👤 u/Real_Beach6493 📅 2026-03-29

⬆️ 1 ups ⚡ Score: 6.2

"Hi, r/MachineLearning: has much research been done in large-scale training scenarios where undesirable data has been replaced before training, such as any instances of violence, lying, or deception in the dataset? Most controllability work, like RLHF or constitutional AI, seems to be done post-trai..."

🛠️ TOOLS

Safari MCP: 80-tool native browser automation for AI agents (macOS)

via HackerNews 👤 Achiyacohen 📅 2026-03-28

🔺 1 pts ⚡ Score: 6.1

🛠️ SHOW HN

Show HN: AgentLens – Chrome DevTools for AI Agents (open-source, self-hosted)

via HackerNews 👤 tranhoangtu 📅 2026-03-29

🔺 1 pts ⚡ Score: 6.1

Stories from March 29, 2026

📡 AI NEWS BUT ACTUALLY GOOD

Security & Control for AI Agent Execution