📚 HISTORICAL ARCHIVE - March 30, 2026
What was happening in AI on 2026-03-30
🛠️ TOOLS
"Claude can open your apps, click through your UI, and test what it built, right from the CLI.
It works on anything you can open on your Mac: a compiled SwiftUI app, a local Electron build, or a GUI tool that doesn't have a CLI.
Now available in research preview on Pro and Max on macOS. Enable it..."
🎯 Frustration with Limits • Lack of Transparency • Disappointment with Functionality
💬 "It's a great product, but the limitations are bringing that reputation down..."
• "Maybe someday I will have enough tokens to try this feature at least once without burning 100% of my weekly rate"
🔒 SECURITY
"I spent the past few days reverse-engineering the Claude Code standalone binary (228MB ELF, Ghidra + MITM proxy + radare2) and found two independent bugs that cause prompt cache to break, silently inflating costs by 10-20x. Posting this so others can protect themselves.
## Bug 1: Sentinel replaceme..."
🎯 Cost spikes • AI-generated bugs • Reverse engineering
💬 "10x costs with zero changelogs is a bold strategy"
• "Reverse engineering the bun fork with Ghidra is next level"
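The failure mode described above generalizes: prompt caches match on exact prefixes, so any value that changes per request near the top of the context invalidates everything after it. A toy model of the billing effect (the character-level cache granularity and pricing here are simplifications for illustration, not Anthropic's actual scheme):

```python
import uuid

def lcp_len(a: str, b: str) -> int:
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class ToyPrefixCache:
    """Bills only the suffix after the longest previously seen prefix."""
    def __init__(self):
        self.seen = []

    def bill(self, prompt: str) -> int:
        cached = max((lcp_len(prompt, p) for p in self.seen), default=0)
        self.seen.append(prompt)
        return len(prompt) - cached  # characters charged at the uncached rate

system = "You are a helpful assistant. " + "x" * 1000  # large static system prompt

stable = ToyPrefixCache()
stable_cost = [stable.bill(system + f"\nuser turn {i}") for i in range(2)]

# A per-request sentinel (e.g. a fresh UUID) injected ahead of the static
# block breaks the shared prefix, so every request pays full price again.
broken = ToyPrefixCache()
broken_cost = [broken.bill(f"req-{uuid.uuid4()}\n" + system + f"\nuser turn {i}")
               for i in range(2)]

print(stable_cost, broken_cost)
```

In the stable setup the second request pays for only the new suffix; with the sentinel it pays for essentially the whole context again, which is the kind of silent inflation the post describes.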
🔒 SECURITY
"https://shapingrooms.com/research
I published a paper today on something I've been calling postural manipulation. The short version: ordinary language buried in prior context can shift how an AI reasons about a decision before any instruction arrives. No adversa..."
🎯 Documenting language bias • Defending against LLM attacks • Interpreting LLM behavior
💬 "The reason to document it is exactly what you said: new layers of protection are needed."
• "It turns out the LLM picks up that bias and uses it, not the instructions part of what you said, not the data, the biased wording, and it is using that in its reasoning 50 turns later."
🛠️ TOOLS
🎯 Open source sustainability • AI and open source • Open source development models
💬 "Open source is great because of the people creating value with it"
• "Commitment is exactly what they don't want, rather they want the fast sugar high"
🛠️ TOOLS
"Built a tool that profiles your GGUF model's layer shapes on your AMD GPU and generates optimal kernel configs that llama.cpp loads at runtime. No recompilation needed.
**The problem:** llama.cpp's MMVQ kernels use the same thread/block configuration for every layer regardless of shape. A 1024-row ..."
🎯 Inference performance improvements • RDNA3/4 support • Patch details and verification
💬 "12 tok/s -> 27 tok/s (2.25x)"
• "943 GB/s on the XTX (98% bandwidth utilization vs 622 GB/s stock)"
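For intuition, a shape-aware config picker might look like the sketch below. The tile sizes and heuristic are purely illustrative, not the actual MMVQ configurations llama.cpp uses:

```python
# Hypothetical shape-aware kernel config selection, in the spirit of the
# tool above. Numbers are illustrative, not tuned RDNA3/4 values.

def pick_config(rows: int, cols: int, wavefront: int = 64) -> dict:
    """Pick a threads/rows-per-block split for a matrix-vector kernel.

    cols is unused in this toy heuristic; a real profiler would weigh it too.
    """
    # Small row counts: fewer rows per block so all compute units stay busy.
    if rows <= 1024:
        rows_per_block = 1
    elif rows <= 8192:
        rows_per_block = 2
    else:
        rows_per_block = 4
    threads = wavefront * rows_per_block
    blocks = (rows + rows_per_block - 1) // rows_per_block
    return {"threads": threads, "rows_per_block": rows_per_block, "blocks": blocks}

cfg_small = pick_config(1024, 4096)
cfg_large = pick_config(32000, 4096)  # e.g. an LM-head-sized projection
print(cfg_small, cfg_large)
```

The point of the tool is that one fixed choice cannot be right for both of these layer shapes, which is why per-layer configs loaded at runtime help.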
🔄 OPEN SOURCE
🎯 Local LLM inference • Project recognition • AI community impact
💬 "llama.cpp is one of the most influential project"
• "This community owes so much to your dedication"
🛡️ SAFETY
"Most of the current “AI security” stack seems focused on:
• prompts
• identities
• outputs
After an agent deleted a prod database on me a year ago, I saw the gap and started building:
a control layer directly in the execution path between agents and tools. We are to market but I don’t want ..."
🎯 Execution risk mitigation • Credential management • Cross-session optimization
💬 "Fail-closed beats smart recovery"
• "Credential starvation works, but re-granting is the bottleneck"
🧠 NEURAL NETWORKS
"Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models (20M, 47M, and 110M parameters) trained entirely from scratch for a low-resource language, Luganda. The models are small and compute-efficient enough to run offline on ...
🧠 NEURAL NETWORKS
"The tinylora paper shows that we can alter model behavior with only a few parameters.
https://arxiv.org/pdf/2602.04118
I tried replicating the paper, and made a tinylora implementation for qwen3.5, and it does work, it's crazy to think about. I got the same resu..."
🎯 Human vs. Bot • Facts vs. Behavior • Model Optimization
💬 "Do you type anything yourself anymore?"
• "It probably got hacked or sold."
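The underlying LoRA mechanics explain why so few parameters can shift behavior: the weight update is constrained to a rank-r product B·A. A minimal numpy sketch (shapes and rank are illustrative; this is not the paper's or the poster's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256  # hidden size
r = 2    # tiny rank: only 2*d*r trainable params instead of d*d

W = rng.standard_normal((d, d)) / np.sqrt(d)  # frozen base weight
A = rng.standard_normal((r, d)) * 0.01        # trainable down-projection
B = np.zeros((d, r))                          # trainable up-projection (zero init)

def forward(x, scale=1.0):
    # Base path plus low-rank adapter path: (W + scale * B @ A) @ x
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d)
# Zero-initialized B means the adapter starts as a behavioral no-op:
assert np.allclose(forward(x), W @ x)

trainable = A.size + B.size
full = W.size
print(f"trainable params: {trainable} ({100 * trainable / full:.2f}% of a full fine-tune)")
```

Even at this toy scale the adapter is about 1.6% of the layer's parameters; at tinylora's reported extremes the fraction is far smaller still.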
🔒 SECURITY
"Link:
https://m.youtube.com/watch?v=1sd26pWhfmg
The Linux exploit is especially interesting because it was introduced in 2003 and was never found until now. It was a buffer overflow, a class of bug so hard to exploit that Carlini himself has never done one before.
H..."
🎯 Security Vulnerabilities • Model Capabilities • Binary Exploitation
💬 "Buffer overflows on modern OS with things like stack randomization is quite hard to pull off"
• "There are potentially hundreds of these kinds of vulnerabilities in popular software which people that have little security experience can exploit with these capabilities"
🔬 RESEARCH
🎯 AI Impact on Society • Education Challenges • Limitations of AI Claims
💬 "technological system as a force that human society could contain"
• "how we might restructure our educational institutions"
🛠️ TOOLS
"been following this community for a while — you all produced
the clearest signals about the context drift problem I've
seen anywhere. wanted to share what I built from that.
the core problem: your CLAUDE.md or AGENTS.md goes stale
after commits and your AI tool proceeds on incorrect context..."
🎯 Galactic architecture • Cursor indexing • Context drift detection
💬 "the drift problem gets messier when multiple agents are running simultaneously"
• "context files that grow into noise are worse than no context at all"
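One way to detect that staleness mechanically: store a digest of the files the context document describes, and flag drift when they change. The manifest name and format below are assumptions for illustration, not the tool's actual design:

```python
import hashlib
import json
import pathlib
import tempfile

def digest(paths):
    """Combined SHA-256 over the watched files, in a stable order."""
    h = hashlib.sha256()
    for p in sorted(str(p) for p in paths):
        h.update(pathlib.Path(p).read_bytes())
    return h.hexdigest()

def write_manifest(manifest_path, watched):
    pathlib.Path(manifest_path).write_text(json.dumps({"digest": digest(watched)}))

def is_stale(manifest_path, watched):
    saved = json.loads(pathlib.Path(manifest_path).read_text())["digest"]
    return saved != digest(watched)

# Demo in a temp dir: edit a watched file, and the context file is flagged stale.
with tempfile.TemporaryDirectory() as d:
    src = pathlib.Path(d, "app.py")
    src.write_text("def main(): pass\n")
    manifest = pathlib.Path(d, ".claude-manifest.json")
    write_manifest(manifest, [src])
    fresh = is_stale(manifest, [src])         # False: nothing changed yet
    src.write_text("def main(): return 1\n")  # a commit lands
    drifted = is_stale(manifest, [src])       # True: CLAUDE.md needs regenerating
    print(fresh, drifted)
```

A real tool would hook this into git (post-commit) rather than hashing on demand, but the contract is the same: the context file carries a fingerprint of what it was written against.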
🔧 INFRASTRUCTURE
"...what's your speedup? (CUDA only)..."
🛡️ SAFETY
"Ran into this building an agent that could trigger API calls.
We had validation, tool constraints, retries… everything looked “safe”.
Still ended up executing the same action twice due to stale state + retry.
Nothing actually prevented execution. It only shaped behavior.
Curious what people use ..."
🎯 Agent Execution Reliability • Safety Layers vs Structural Controls • Deterministic Intent Validation
💬 "Nothing actually prevented execution. It only shaped behavior."
• "Validation that runs inside the same reasoning loop that produced the action is checking its own work with the same blind spots."
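A common structural answer to the duplicate-execution problem is an idempotency key enforced outside the agent's reasoning loop: key on the action's content, so a stale-state retry replays the recorded result instead of re-executing. Minimal in-memory sketch (the key scheme is illustrative):

```python
import hashlib
import json

class IdempotentExecutor:
    """Executes each distinct action at most once; retries replay the result."""
    def __init__(self):
        self._done = {}

    def execute(self, action: dict, fn):
        # Key on the action's content, not on which retry produced it.
        key = hashlib.sha256(json.dumps(action, sort_keys=True).encode()).hexdigest()
        if key in self._done:
            return self._done[key]  # replay recorded result, no re-execution
        result = fn(action)
        self._done[key] = result
        return result

calls = []
def charge(action):
    calls.append(action)  # the side effect we must not repeat
    return {"status": "charged", "amount": action["amount"]}

ex = IdempotentExecutor()
action = {"type": "charge", "customer": "c_123", "amount": 50}
r1 = ex.execute(action, charge)
r2 = ex.execute(action, charge)  # stale-state retry resubmits the same intent
print(len(calls), r1 == r2)
```

This is enforcement rather than behavior-shaping: the second execution is structurally impossible, regardless of what the model's validation step concluded.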
🔬 RESEARCH
"I published a model you can use now to help detect sycophantic AI responses. It rejects 100% of the sycophantic, delusion-affirming responses from
psychosis-bench. It also does well on the [AISI Harmful Advice](
https://huggingface.co/datasets/ai-safety-ins..."
🛠️ TOOLS
"Hi everyone,
I've been reading up on Google's recent TurboQuant announcement from a few days ago (compressing the KV cache down to 3-4 bits with supposedly zero accuracy loss), and I'm trying to wrap my head around the practical implications for our daily setups.
We already have great weight quanti..."
🎯 Quantization techniques • Model performance benchmarks • Practical considerations for deployment
💬 "TurboQuant (as people understand it) is 3.0 / 3.5 / 4.0 bits"
• "Now turboquant enables safe use of q4, q3.5 or q3 kv cache"
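Mechanically, low-bit KV-cache quantization means rounding each block of cached activations to 2^bits levels. The per-block affine scheme below is a generic illustration of the trade-off, not TurboQuant's actual algorithm:

```python
import numpy as np

def quantize(x, bits, block=32):
    """Per-block affine quantization to 2**bits levels."""
    xb = x.reshape(-1, block)
    lo = xb.min(axis=1, keepdims=True)
    hi = xb.max(axis=1, keepdims=True)
    scale = (hi - lo) / (2**bits - 1)
    scale = np.where(scale == 0, 1, scale)  # guard constant blocks
    q = np.round((xb - lo) / scale)
    return q, scale, lo

def dequantize(q, scale, lo):
    return (q * scale + lo).reshape(-1)

rng = np.random.default_rng(0)
kv = rng.standard_normal(4096).astype(np.float32)  # a slice of K/V activations

errs = {}
for bits in (4, 3):
    recon = dequantize(*quantize(kv, bits))
    errs[bits] = float(np.abs(recon - kv).mean())
    print(f"{bits}-bit KV: mean abs error {errs[bits]:.4f}, "
          f"~{16 / bits:.1f}x smaller than fp16")
```

The interesting claim in the announcement is precisely that the accuracy loss visible in a naive scheme like this one can supposedly be driven to zero at 3-4 bits.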
🔬 RESEARCH
via Arxiv
👤 Mo Li, L. H. Xu, Qitai Tan et al.
📅 2026-03-27
⚡ Score: 6.8
"Large language model (LLM)-based coding agents achieve impressive results on controlled benchmarks yet routinely produce pull requests that real maintainers reject. The root cause is not functional incorrectness but a lack of organicity: generated code ignores project-specific conventions, duplicate..."
🛠️ TOOLS
"I'm releasing TRACER (Trace-Based Adaptive Cost-Efficient Routing), a library for learning cost-efficient routing policies from LLM traces.
The setup: you have an LLM handling classification tasks. You want to replace a fraction of calls with a cheap local surrogate, with a formal guarantee that th..."
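My reading of that setup, as a sketch rather than TRACER's actual API: from logged traces, choose the widest confidence region you can hand to the cheap surrogate while the measured agreement with the LLM stays above a target.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
conf = rng.uniform(0, 1, n)  # surrogate confidence per logged trace
# Synthetic traces: the surrogate agrees with the LLM more when confident.
agree = rng.uniform(0, 1, n) < (0.6 + 0.4 * conf)

def calibrate(conf, agree, target=0.95):
    """Lowest confidence threshold whose routed subset meets the agreement target."""
    for t in np.linspace(0, 1, 101):
        routed = conf >= t
        if routed.any() and agree[routed].mean() >= target:
            return t
    return 1.0  # route nothing if no threshold qualifies

t = calibrate(conf, agree)
routed_frac = float((conf >= t).mean())      # calls offloaded to the surrogate
routed_agree = float(agree[conf >= t].mean())
print(f"threshold={t:.2f}, offloaded {routed_frac:.0%}, agreement {routed_agree:.3f}")
```

The formal-guarantee part of the library presumably replaces this naive empirical scan with a statistically valid bound; the routing shape is the same.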
🏥 HEALTHCARE
"Published drug safety studies take months of specialized work and end up behind paywalls. Commercial pharmacovigilance platforms cost about 50K-500K/year if the AI is right about the cost.
The FDA's public dashboard shows raw report counts but not the disproportionality statistics (PRR, ROR, chi-..."
🎯 Software Limitations • Quality Assurance • Unrealistic Expectations
💬 "This sw does not cost 50-500K per year"
• "Is it even a proof of concept if you can't search ibuprofen??"
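The disproportionality statistics named above (PRR, ROR) are standard pharmacovigilance formulas over a 2x2 contingency table of spontaneous reports. Worked example with made-up counts, not real FAERS data:

```python
import math

# 2x2 table of report counts:
# a: target drug & target event     b: target drug & other events
# c: other drugs & target event     d: other drugs & other events
a, b, c, d = 20, 980, 100, 98900

prr = (a / (a + b)) / (c / (c + d))  # proportional reporting ratio
ror = (a / b) / (c / d)              # reporting odds ratio

# 95% confidence interval for the ROR on the log scale
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci = (math.exp(math.log(ror) - 1.96 * se),
      math.exp(math.log(ror) + 1.96 * se))

print(f"PRR={prr:.2f} ROR={ror:.2f} 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```

A PRR this far above 1 with a CI excluding 1 is the kind of signal the FDA dashboard's raw counts do not surface directly, which is the gap the post is pointing at.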
🛠️ TOOLS
"Been using Claude Code heavily across multiple projects and got tired of the same issues everyone complains about.
So I built a fix. One file. Drop it in your project root. No code changes.
Full disclosure - the entire thing was researched, built, benchmarked, and validated in one session with Cla..."
🎯 Discussion quality • AI-generated content • Prompt-based solutions
💬 "You and the 100 people who posted something like this this week should work together"
• "They should definitely 'do something' together… maybe not work… 😮💨"
🔒 SECURITY
🎯 Privacy concerns • Regulation of smart glasses • Accessibility and inclusivity
💬 "Nobody expects someone's eyeglasses to be recording them."
• "Absolutely fuck these things and anyone who advocates for them."
🔄 OPEN SOURCE
"Hey r/ClaudeAI,
Garry Tan (CEO of Y Combinator) just open-sourced gstack — his own personal pack of slash commands/skills for Claude Code.
Instead of treating Claude as one generic assistant, gstack turns it into a structured virtual team with specialized roles:
• CEO (product strategy & vis..."
🎯 Code Quality • Workflow Criticism • Hype vs Substance
💬 "more code is always better"
• "AI-generated code is verbose by default"
🛠️ TOOLS
"Inspired by Andrej Karpathy's AutoResearch, I built a system where Claude Code acts as an autonomous ML researcher on tabular binary classification tasks (churn, conversion, etc.).
You give it a dataset. It loops forever: analyze data, form hypothesis, edit code, run experiment, evaluate with expan..."
🎯 Data analysis • Backtest overfitting • Feature engineering
💬 "If you torture the data long enough, it will confess to anything."
• "The insidious thing about backtest overfitting and related things like data dredging is that world knowledge doesn't protect against it - if you iterate long enough, you're bound to get a spurious result that lines up with what 'makes sense'."
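The data-dredging point is easy to demonstrate: score enough random features against random labels and the best one always looks predictive in-sample, then evaporates on data the loop never saw.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_features = 200, 500
y_train = rng.integers(0, 2, n)  # pure noise labels
y_test = rng.integers(0, 2, n)
X_train = rng.standard_normal((n, n_features))  # pure noise features
X_test = rng.standard_normal((n, n_features))

def acc(col, y):
    # Best of the two sign conventions, as an eager search loop would use.
    return max(((col > 0) == y).mean(), ((col <= 0) == y).mean())

train_scores = [acc(X_train[:, j], y_train) for j in range(n_features)]
best = int(np.argmax(train_scores))      # the "hypothesis" the loop keeps
in_sample = train_scores[best]
held_out = acc(X_test[:, best], y_test)  # honest check on unseen data
print(f"best of {n_features} features: in-sample {in_sample:.2f}, "
      f"held-out {held_out:.2f}")
```

With 500 tries on noise, the winner looks well above chance in-sample purely from selection; an autonomous research loop iterating "forever" runs exactly this lottery unless the evaluation data stays locked away from the search.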
🔄 OPEN SOURCE
"My research suggests github is seeing a new (public) MCP server added every 20 minutes. A new Claude skill every 7.5 minutes. Who here has tried to build either so far? I'd love to ask you some questions if you'd be willing. ..."
🎯 Skill development challenges • Skill ecosystem maturity • Validation of skill ideas
💬 "the skill side is more interesting to me because it's less about code and more about prompt architecture"
• "the whitespace finder is an interesting question for skills specifically"
📊 DATA
"Last week I asked for some feedback about what extra models I should test. I've added them all and now the benchmark is available at
https://sql-benchmark.nicklothian.com/
I didn't say a lot about what the agent did at the time, but in simple terms it takes an ..."
🎯 AI model performance • Hardware requirements • Self-hosting
💬 "Qwen 3.5-27B is the goat. You can run it on a RTX 3090 at 40 tok/s."
• "Qwen 3.5-27B is probably the best for most people to self host."
🔬 RESEARCH
🎯 Reconciling digital and continuous mathematics • Challenges of learning advanced mathematics • Practical applications of control theory
💬 "fundamental issue that is always swept under the rug"
• "I am unsure of the next course of action"
🔧 INFRASTRUCTURE
"This just showed up a couple of days ago on GitHub. Note that **ANE is the NPU in all Apple Silicon**, *not* the new 'Neural Accelerator' GPU cores that are only in M5.
(ggml-org/llama.cpp#10453) \- Comment by **arozano..."
🎯 Chip Architecture • Model Limitations • Neural Processing
💬 "the 4GB addressing limit on older M chips is the real caveat here"
• "useful for small models and maybe a few MoE expert layers"
⚖️ ETHICS
🎯 Appropriate AI usage • AI-assisted writing process • Preserving human thinking
💬 "Don't let an LLM do your thinking, or interfere with processes essential to you thinking things clearly through."
• "When I send somebody a document that whiffs of LLM, I'm only demonstrating that the LLM produced something approximating what others want to hear. I'm not showing that I contended with the ideas."
🛡️ SAFETY
"Hi, r/MachineLearning: has much research been done in large-scale training scenarios where undesirable data has been replaced before training, such as any instances of violence, lying, or deception in the dataset?
Most controllability work, like RLHF or constitutional AI, seems to be done post-trai..."
🛠️ TOOLS
" We built an open-source prototype that applies Unix philosophy to retrieval pipelines. Each stage (PII redaction, chunking, dedup, embeddings, eval) is its own plugin with a typed contract, like pipes between Unix tools.
The motivation: we swapped a chunker and retrieval got worse, but ..."
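The typed-contract idea can be sketched with Python Protocols: each stage declares its input and output types, and composition fails at build time when contracts don't line up. Stage names and types here are illustrative, not the project's actual plugin API:

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Doc:
    text: str

@dataclass
class Chunk:
    text: str
    doc_id: int

class Stage(Protocol):
    in_type: type
    out_type: type
    def run(self, items: list) -> list: ...

class Redact:
    in_type, out_type = Doc, Doc
    def run(self, docs):
        return [Doc(d.text.replace("555-0100", "[PII]")) for d in docs]

class Chunker:
    in_type, out_type = Doc, Chunk
    def run(self, docs):
        return [Chunk(s, i) for i, d in enumerate(docs) for s in d.text.split(". ")]

def compose(stages):
    """Check stage contracts when the pipeline is built, not mid-run."""
    for a, b in zip(stages, stages[1:]):
        if a.out_type is not b.in_type:
            raise TypeError(f"{type(a).__name__} -> {type(b).__name__} mismatch")
    def pipeline(items):
        for s in stages:
            items = s.run(items)
        return items
    return pipeline

pipe = compose([Redact(), Chunker()])
chunks = pipe([Doc("Call 555-0100 today. Thanks")])
print(chunks)
```

The Unix analogy holds: like mismatched pipes failing loudly, a swapped chunker that changes its output type is rejected at composition time instead of silently degrading retrieval.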
🛡️ SAFETY
"Hey guys!🤗
I’ve been working with AI agents that interact with APIs and real systems, and I keep running into the same issue
Once agents actually start executing things, they can ignore constraints, take unintended actions or just behave unpredictably
It feels like prompt-level control isn’t real..."
🎯 Prompt Constraints • Execution Layer Control • Isolated Environment
💬 "Defense in depth — the prompt sets intent, the execution layer enforces it regardless of what the LLM says."
• "Feels like combining both approaches (isolation + execution control) gives a lot more stability in practice"
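A minimal version of "the execution layer enforces it regardless of what the LLM says": a gate between the agent and its tools that fails closed on unknown tools and enforces per-tool budgets. The policy schema is an assumption for illustration:

```python
class PolicyViolation(Exception):
    pass

POLICY = {
    "http_get":   {"allow": True},
    "send_email": {"allow": True, "max_calls": 1},
    "drop_table": {"allow": False},
}

class Gate:
    """Sits between the agent and its tools; denies anything not explicitly allowed."""
    def __init__(self, policy):
        self.policy = policy
        self.counts = {}

    def call(self, tool, fn, *args):
        rule = self.policy.get(tool, {"allow": False})  # fail closed on unknown tools
        self.counts[tool] = self.counts.get(tool, 0) + 1
        if not rule["allow"] or self.counts[tool] > rule.get("max_calls", float("inf")):
            raise PolicyViolation(f"blocked: {tool}")
        return fn(*args)

gate = Gate(POLICY)
gate.call("send_email", lambda to: f"sent to {to}", "a@example.com")  # within budget

blocked = []
for tool in ("send_email", "drop_table", "unknown_tool"):
    try:
        gate.call(tool, lambda: None)
    except PolicyViolation:
        blocked.append(tool)
print(blocked)
```

Because the gate runs outside the model's reasoning loop, an agent that ignores its prompt constraints still cannot exceed the email budget or touch the denied tool, the defense-in-depth split the quoted comment describes.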