Last updated: 2026-03-15 | Server uptime: 99.9% ⚡
🔬 RESEARCH
🔺 57 pts
⚡ Score: 8.5
🎯 Inference Compute • MCTS Popularity • Compute Budget Comparison
💬 "MCTS uses more inference compute on a per-sample basis than GRPO"
• "Why MCTS is not more popular as a test time compute harness"
🛠️ SHOW HN
🔺 72 pts
⚡ Score: 8.5
🎯 Agent portability • Agent discovery • Security concerns
💬 "the abstractions don't line up 1:1"
• "the only thing standing between your plaintext secrets and the rest of the world is a .gitignore rule"
🔧 INFRASTRUCTURE
🔺 1 pt
⚡ Score: 7.5
🏢 BUSINESS
⬆️ 334 ups
⚡ Score: 7.3
"Dragging the controllers of the 3 parameters left or right automatically adjusts the chart in real time. And you get that from a six-word prompt."
🎯 AI Disruption of Education • Dismissal of EdTech • Skepticism of Hype
💬 "wiping out an entire educational startup sector"
• "These kids think educators and online teaching services do nothing more than display random chats all day"
🔬 RESEARCH
via Arxiv
👤 Yushi Bai, Qian Dong, Ting Jiang et al.
📅 2026-03-12
⚡ Score: 7.3
"Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention (DSA) is a representative production-grad..."
🔬 RESEARCH
via Arxiv
👤 Ninghui Li, Kaiyuan Zhang, Kyle Polley et al.
📅 2026-03-12
⚡ Score: 7.3
"This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI agents. These insights are informed by Perplexity's experience operating general-purpose agentic syste..."
🛠️ TOOLS
⬆️ 974 ups
⚡ Score: 7.2
"I want to share something I built with Claude Code this past week because I think it shows what AI-assisted development can actually do when pointed at a genuinely hard problem.
Disney Infinity 1.0 (2013) is a game where you place physical figures on a base to play as characters. Each character is ..."
🎯 Binary Reverse Engineering • Workflow Automation • Community Reaction
💬 "binary RE on a stripped commercial game engine with no symbols is genuinely hard work"
• "the ability to hold that much context while reasoning about control flow is where it really shines"
🔬 RESEARCH
via Arxiv
👤 Krishnakumar Balasubramanian, Shiva Prasad Kasiviswanathan
📅 2026-03-12
⚡ Score: 7.2
"Continual post-training of generative models is widely used, yet a principled understanding of when and why forgetting occurs remains limited. We develop theoretical results under a two-mode mixture abstraction (representing old and new tasks), proposed by Chen et al. (2025) (arXiv:2510.18874), and..."
🔬 RESEARCH
via Arxiv
👤 Alexandre Le Mercier, Thomas Demeester, Chris Develder
📅 2026-03-12
⚡ Score: 7.1
"State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while maintaining competitive performance. However, Hidden State Poisoning Attacks (HiSPAs), a recently discovered vulnerability that corrupts SSM memory throu..."
🛠️ TOOLS
⬆️ 22 ups
⚡ Score: 7.0
"https://github.com/ggml-org/llama.cpp/releases/tag/b8338
Lots of work done by the Intel team, I'm looking forward to trying this out on the 255H with the Arc 140T iGPU..."
🛠️ TOOLS
⬆️ 73 ups
⚡ Score: 7.0
"If you train Graph Neural Networks on large datasets (like Papers100M), you already know the pain: trying to load the edge list and feature matrix usually results in an instant 24GB+ OOM allocation crash before the GPU even gets to do any work.
I just open-sourced **GraphZero v0.2**, a custom C++ d..."
🎯 GNN neighbor sampling • Memory-efficient data structures • CPU/CUDA message passing ops
💬 "np.memmap is fine for basic arrays, but using it for GNN neighbor sampling ("fancy indexing") triggers implicit RAM copies in Python, causing OOMs anyway."
• "Another easy win from a throughput perspective is if you use any edge -> node pooling message passing ops, you can write a pretty nice CPU/CUDA implementation that bypasses storing the full edge feature list in memory and instead consumes on-the-fly."
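The memmap claim in the quote above is easy to check in a few lines: a basic slice of an `np.memmap` stays file-backed, while fancy indexing (the gather pattern neighbor sampling needs) silently materializes an in-RAM copy. A minimal sketch with a toy feature matrix (sizes are illustrative, nowhere near Papers100M scale):

```python
import os
import tempfile
import numpy as np

# Toy stand-in for a large on-disk feature matrix (sizes are illustrative).
path = os.path.join(tempfile.mkdtemp(), "features.bin")
feats = np.memmap(path, dtype=np.float32, mode="w+", shape=(100_000, 32))

# Basic slicing stays backed by the file mapping: writes go through to disk.
view = feats[:10]
view[:] = 2.0
assert float(feats[0, 0]) == 2.0

# Fancy indexing -- the gather that neighbor sampling needs -- returns a fresh
# in-RAM copy of every selected row; at Papers100M scale that copy is the OOM.
idx = np.random.randint(10, 100_000, size=4_096)
batch = feats[idx]
batch[:] = 1.0                          # mutate the copy...
assert float(feats[idx[0], 0]) == 0.0   # ...the mapping is untouched
```

This is just NumPy's general rule (advanced indexing always copies) showing up at a scale where the copy itself no longer fits in RAM.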
🛠️ SHOW HN
🔺 7 pts
⚡ Score: 7.0
🎯 Programmatic email accounts • Domain rotation model • Abuse prevention
💬 "how often do agents need to get an email address?"
• "What's the argument for letting agents create accounts?"
🔬 RESEARCH
"Large language models struggle to catch errors in their own outputs when the review happens in the same session that produced them. This paper introduces Cross-Context Review (CCR), a straightforward method where the review is conducted in a fresh session with no access to the production conversatio..."
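The CCR setup the abstract describes is simple to sketch: the reviewer runs in a fresh context and receives only the artifact, never the conversation that produced it. The function names below are our own illustrative stand-ins for model calls, not the paper's API:

```python
# Sketch of Cross-Context Review (CCR) as the abstract describes it: review
# happens in a fresh session with no access to the production conversation.
# `generate` and `review` are hypothetical stand-ins for LLM calls.

def generate(task: str, history: list) -> str:
    history.append(f"user: {task}")
    answer = f"draft solution for {task} (TODO: handle empty input)"
    history.append(f"assistant: {answer}")
    return answer

def review(artifact: str) -> str:
    # Fresh session: the artifact is the reviewer's ONLY input. There is no
    # history parameter, so the reviewer cannot be anchored by the reasoning
    # (or self-justifications) that produced the draft.
    return "revise" if "TODO" in artifact else "approve"

history = []
draft = generate("parse ISO-8601 dates", history)
verdict = review(draft)     # note: history is deliberately not passed
print(verdict)              # -> revise
```

The whole idea lives in the signature of `review`: isolation is enforced structurally, not by prompting the model to "be objective".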
🔬 RESEARCH
🔺 2 pts
⚡ Score: 6.9
🔬 RESEARCH
via Arxiv
👤 Samy Jelassi, Mujin Kwun, Rosie Zhao et al.
📅 2026-03-12
⚡ Score: 6.8
"Cross-entropy (CE) training provides dense and scalable supervision for language models, but it optimizes next-token prediction under teacher forcing rather than sequence-level behavior under model rollouts. We introduce a feature-matching objective for language-model fine-tuning that targets sequen..."
⚡ BREAKTHROUGH
⬆️ 22 ups
⚡ Score: 6.7
"Robert Lange, founding researcher at Sakana AI, joins Tim to discuss **Shinka Evolve**, a framework that combines LLMs with evolutionary algorithms to do open-ended program search. The core claim: systems like AlphaEvolve can optimize solutions to fixed problems, but real scientific progress requir..."
🔒 SECURITY
🔺 1 pt
⚡ Score: 6.7
🔬 RESEARCH
via Arxiv
👤 Yixin Liu, Yue Yu, DiJia Su et al.
📅 2026-03-12
⚡ Score: 6.7
"Reasoning LLMs-as-Judges, which can benefit from inference-time scaling, provide a promising path for extending the success of reasoning models to non-verifiable domains where the output correctness/quality cannot be directly checked. However, while reasoning judges have shown better performance on..."
🏢 BUSINESS
⬆️ 348 ups
⚡ Score: 6.5
"External link discussion - see full content at original source."
🎯 Concerns about ArXiv changes • CEO compensation for non-profits • Potential monetization of ArXiv
💬 "Looks like it's over for Arxiv too."
• "300K is not a lot to be CEO of a big non-profit."
🔬 RESEARCH
⬆️ 1 up
⚡ Score: 6.5
"We're sharing ZeroProofML, a small framework for scientific ML problems where the target can be genuinely undefined or non-identifiable: poles, assay censoring boundaries, kinematic locks, etc. The underlying issue is division by zero. Not as a numerical bug, but as a semantic event that shows up wh..."
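The "division by zero as a semantic event" framing can be illustrated with a tiny masked-division helper: instead of letting inf/nan propagate silently into a loss, the zero-denominator locations are returned as an explicit event mask. The names and behavior below are our illustration of the idea, not ZeroProofML's actual API:

```python
import numpy as np

# Illustration only: treat a zero denominator as a flagged semantic event
# (a pole) rather than a numerical accident. Names are ours, not ZeroProofML's.
def div_with_events(num, den):
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    pole = den == 0.0                        # where the target is undefined
    safe = np.where(pole, 1.0, den)          # dummy denominator at poles
    vals = np.where(pole, 0.0, num / safe)   # masked value at poles
    return vals, pole                        # the event mask travels with the value

vals, pole = div_with_events([1.0, 1.0, 1.0, 1.0], [2.0, 0.0, 4.0, 0.0])
# vals -> [0.5, 0.0, 0.25, 0.0]; pole -> [False, True, False, True]
```

A training loss can then exclude the pole entries, or supervise them separately, instead of averaging over values that are not meaningful.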
🛠️ TOOLS
🔺 1 pt
⚡ Score: 6.5
🔬 RESEARCH
🔺 2 pts
⚡ Score: 6.5
🛠️ TOOLS
🔺 2 pts
⚡ Score: 6.5
📊 DATA
🔺 1 pt
⚡ Score: 6.5
🛠️ SHOW HN
🔺 2 pts
⚡ Score: 6.5
🔧 INFRASTRUCTURE
🔺 1 pt
⚡ Score: 6.5
🔬 RESEARCH
via Arxiv
👤 Yuetian Du, Yucheng Wang, Rongyu Zhang et al.
📅 2026-03-12
⚡ Score: 6.3
"Recent advances in Multi-modal Large Language Models (MLLMs) have predominantly focused on enhancing visual perception to improve accuracy. However, a critical question remains unexplored: Do models know when they do not know? Through a probing experiment, we reveal a severe confidence miscalibratio..."
🔬 RESEARCH
via Arxiv
👤 Yulu Gan, Phillip Isola
📅 2026-03-12
⚡ Score: 6.3
"Pretraining produces a learned parameter vector that is typically treated as a starting point for further iterative adaptation. In this work, we instead view the outcome of pretraining as a distribution over parameter vectors, whose support already contains task-specific experts. We show that in sma..."
🛠️ TOOLS
🔺 1 pt
⚡ Score: 6.2
🛠️ SHOW HN
🔺 1 pt
⚡ Score: 6.1