Last updated: 2026-05-25 | Server uptime: 99.9%
NEWS
185 pts
Score: 8.3
RESEARCH
141 pts
Score: 8.2
NEWS
169 pts
Score: 8.1
RESEARCH
via Arxiv
Xu Ouyang, Deyi Liu, Yuhang Cai et al.
2026-05-22
Score: 7.9
"Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute.
We propose the Shannon Scal..."
NEWS
62 ups
Score: 7.8
"Paper:
https://github.com/OpenBMB/MiniCPM/blob/main/docs/BitCPM_CANN.pdf
### Abstract
>We present BitCPM-CANN, a systematic family-level study of 1.58-bit (ternary)
quantization-aware training (QAT) on the Huawei Ascend NPU platform. To address
two practical gaps for extreme low-bit LLMs - whethe..."
RESEARCH
via Arxiv
Yunpeng Dong, Jingkai He, Yuze Hou et al.
2026-05-21
Score: 7.7
"LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid checkpoint and rollback (C/R) of the complete sandbox state, including files and process state (e.g., memory, contexts, etc.). Existing mechanisms duplicate the e..."
NEWS
221 pts
Score: 7.3
RESEARCH
via Arxiv
Piercosma Bisconti, Matteo Prandi, Federico Pierucci et al.
2026-05-21
Score: 7.3
"Background. Traditional safety benchmarks for language models evaluate generated text: whether a model outputs toxic language, reproduces bias, or follows harmful instructions. When models are deployed as agents, the safety-relevant object shifts from what the system says to what it does within an e..."
RESEARCH
via Arxiv
Long Phan, Devin Kim, Alexander Pan et al.
2026-05-21
Score: 7.2
"Large language models (LLMs) exhibit systematic political bias across a variety of sensitive contexts. We find that LLMs handle counterpart topics from opposing political sides asymmetrically. We refer to this phenomenon as covert political bias and identify 7 categories of techniques through which..."
NEWS
1 pts
Score: 7.1
NEWS
1 pts
Score: 7.0
NEWS
328 ups
Score: 6.9
RESEARCH
3 pts
Score: 6.9
RESEARCH
via Arxiv
Qianshu Cai, Yonggang Zhang, Xianzhang Jia et al.
2026-05-21
Score: 6.9
"Autonomous agentic systems are largely static after deployment: they do not learn from user interactions, and recurring failures persist until the next human-driven update ships a fix. Self-evolving agents have emerged in response, but all confine evolution to text-mutable artifacts -- skill files,..."
NEWS
8 pts
Score: 6.8
NEWS
2 pts
Score: 6.8
NEWS
58 ups
Score: 6.7
"A few weeks ago, after finishing FastDMS, I started toying around writing some RDNA3 kernels again to see how fast I could get Qwen 3.6 MoE running. It turned out well enough, so over the past cou..."
NEWS
1 ups
Score: 6.7
"If you use Cursor heavily, you've probably hit this: you have internal patterns, boilerplate, team conventions - and every new chat you spend the first few messages re-establishing context. Rules files help but they load everything upfront, which burns context fast.
I built **knowledge-shelf** to f..."
RESEARCH
via Arxiv
Sadia Asif, Mohammad Mohammadi Amiri, Momin Abbas et al.
2026-05-21
Score: 6.7
"Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. While most existing systems communicate through natural language, recent work shows that latent communication, particularly through transformer key-value (KV) caches, can..."
NEWS
75 pts
Score: 6.7
RESEARCH
via Arxiv
Stuart Bladon, Brinnae Bent
2026-05-22
Score: 6.6
"It has generally been assumed that geopolitical bias in language models originates from the training data used during the pre-training phase. We tested seven open-weight LLM pairs consisting of the base model (pre-training only) and the chat model (pre-training and post-training) from seven labs on..."
RESEARCH
"Large language models are routinely used as automated evaluators: to review code, moderate content, or score outputs, often with many items passing through one conversation. We ask whether the polarity of prior conversation history biases subsequent judgments, an effect we call the accumulated messa..."
RESEARCH
via Arxiv
George Tsoukalas, Anton Kovsharov, Sergey Shirobokov et al.
2026-05-21
Score: 6.6
"Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in mathematics research. A mitigation is using LLMs to generate formal proofs in languages like Lean. We perform the first large-scale evaluation of this method's ability to solve..."
RESEARCH
via Arxiv
Taiming Lu, Zhuang Liu
2026-05-22
Score: 6.5
"Knowledge distillation generally assumes a strong-to-weak relationship where stronger teachers yield better students. In this work, we examine this assumption about distillation in large language model pretraining. By varying architecture sizes and training token budgets, we create strong-to-weak, s..."
RESEARCH
via Arxiv
Ryan Bahlous-Boldi, Isha Puri, Idan Shenfeld et al.
2026-05-21
Score: 6.4
"Language models must now generalize out of the box to novel environments and work inside inference-scaling search procedures, such as AlphaEvolve, that select rollouts with a variety of task-specific reward functions. Unfortunately, the standard paradigm of LLM post-training optimizes a pre-specifie..."
NEWS
9 ups
Score: 6.2
"Repo:
https://github.com/jeongmk522-netizen/agentlas_org_chart
Almost every multi-agent setup I have shipped or tested eventually hits the same wall. Agents bouncing between each other, reviewers asking for one more polish pass forever, research workers spawning indefinite subtopics, tool calls s..."
NEWS
1 pts
Score: 6.2
NEWS
2 ups
Score: 6.1
"I've been working with agents for months now, and I haven't found a sandbox environment that "just works" so I built it!
My requirements were as follows:
1. Agent is unable to destroy my host OS but able to install software and run sudo commands
2. Agent is able to browse the web autonomously and ..."