Last updated: 2026-06-12 | Server uptime: 99.9% ⚡
📰 NEWS
🔺 5 pts
⚡ Score: 7.5
🔬 RESEARCH
via Arxiv
👤 Elias Lumer, Sahil Sen, Kevin Paul et al.
📅 2026-06-11
⚡ Score: 7.3
"Recursive language models (RLMs) showed that recursion over model calls is an effective strategy for long-context reasoning, and production coding agents have begun to write code that spawns subagents at scale, most recently in Anthropic's dynamic workflows. We name and study the pattern between the..."
📰 NEWS
🔺 2 pts
⚡ Score: 7.3
🔬 RESEARCH
via Arxiv
👤 Sanjay Adhikesaven, Haoxiang Sun, Sewon Min
📅 2026-06-10
⚡ Score: 7.3
"Modern LLM training pipelines increasingly rely on other models to generate data, filter corpora, judge outputs, and guide development decisions. These dependencies are recursive: a model may depend on an upstream artifact whose own dependencies are documented only in separate releases and artifacts..."
🔬 RESEARCH
via Arxiv
👤 Jundong Xu, Qingchuan Li, Jiaying Wu et al.
📅 2026-06-11
⚡ Score: 7.1
"Large language model (LLM) agents have achieved strong performance on a wide range of benchmarks, yet most evaluations assume static environments. In contrast, real-world deployment is inherently dynamic, requiring agents to continually align their knowledge, skills, and behavior with changing envir..."
📰 NEWS
🔺 2 pts
⚡ Score: 7.1
🔬 RESEARCH
via Arxiv
👤 Leon Bergen, Usha Bhalla, Sidharth Baskaran et al.
📅 2026-06-10
⚡ Score: 7.1
"Language-model post-training is the main stage at which model behavior is shaped, yet it still largely involves optimization of scalar rewards that summarize diverse desiderata. This abstraction gives practitioners little visibility into what their data actually teaches models, allowing spurious cor..."
📰 NEWS
🔺 1 pt
⚡ Score: 7.0
🔬 RESEARCH
via Arxiv
👤 Amy Xin, Jiening Siow, Junjie Wang et al.
📅 2026-06-11
⚡ Score: 7.0
"LLM-based agents have shown increasing potential in automating scientific discovery. Given an optimizable metric and an execution environment, they can propose, validate, and iterate scientific solutions, and have produced results that outperform human-designed approaches. As model capabilities cont..."
🔬 RESEARCH
via Arxiv
👤 Noémi Éltető, Nathaniel D. Daw, Kimberly L. Stachenfeld et al.
📅 2026-06-10
⚡ Score: 7.0
"Advancing scientific understanding through mechanistic modeling requires posing the right experimental questions to yield maximally informative data. To automate this pursuit within cognitive science, we introduce ATLAS (Active Theory Learning for Automated Science), an active learning framework for..."
🛠️ SHOW HN
🔺 1 pt
⚡ Score: 6.9
🔬 RESEARCH
via Arxiv
👤 Minghao Luo, Liang Chen
📅 2026-06-11
⚡ Score: 6.9
"Search-augmented LLMs increasingly mediate everyday consumer recommendations by retrieving live web content. This creates a new risk: generative recommenders may consume polluted web content, such as fake reviews and promotional pages crafted to mislead recommendations. We ask: to what extent do sea..."
🔬 RESEARCH
via Arxiv
👤 Xiaoyuan Liu, Jianhong Tu, Yuqi Chen et al.
📅 2026-06-11
⚡ Score: 6.9
"Agent systems are advancing quickly across domains, but their evaluation remains fragmented. Most benchmarks rely on fixed, LLM-centric harnesses that require heavy integration, create test-production mismatch, and limit fair comparison across diverse agent designs. The root problem is the lack of a..."
📰 NEWS
🔺 2 pts
⚡ Score: 6.9
🔬 RESEARCH
via Arxiv
👤 Xingjian Diao, Wenbo Li, Yashas Malur Saidutta et al.
📅 2026-06-10
⚡ Score: 6.9
"Long input sequences are central to document understanding and multi-step reasoning in Large Language Models, yet the quadratic cost of attention makes inference both memory-intensive and slow. Context distillation mitigates this by compressing contextual information into model parameters, and recen..."
📰 NEWS
🔺 223 pts
⚡ Score: 6.8
🔬 RESEARCH
via Arxiv
👤 Yaxin Du, Yifan Zhou, Yujie Ge et al.
📅 2026-06-11
⚡ Score: 6.8
"Tool-augmented LLM agents commonly rely on step-wise atomic tool calls, where each invocation, observation, and value transfer is exposed in the main reasoning trace. This creates an execution-granularity mismatch: locally deterministic tool workflows are unfolded into repeated model-visible..."
🔬 RESEARCH
via Arxiv
👤 Zongsheng Cao, Bihao Zhan, Jinxin Shi et al.
📅 2026-06-11
⚡ Score: 6.8
"Current LLM-based research agents have advanced through agent orchestration, yet largely overlook scientific knowledge orchestration. Existing works often reduce papers to abstracts, surface mentions, and flat 'cites' edges, omitting key entities, claims, evidence, mechanisms, and method line..."
📰 NEWS
🔺 2 pts
⚡ Score: 6.8
🔬 RESEARCH
via Arxiv
👤 Hongjian Zhou, Xinyu Zou, Jinge Wu et al.
📅 2026-06-10
⚡ Score: 6.8
"Large language models (LLMs) now reach expert-level scores on medical licensing exams, encouraging the assumption that high scores imply safe medical judgment while patients increasingly use them for health advice. We show this assumption is fragile: when misleading context is injected into question..."
🔬 RESEARCH
via Arxiv
👤 Anamaria-Roberta Hartl, Levente Zólyomi, David Stap et al.
📅 2026-06-10
⚡ Score: 6.8
"Transformers dominate modern sequence modeling, but their quadratic attention incurs substantial computational cost. Subquadratic architectures offer a scalable alternative. However, it remains unclear which designs yield the most effective sequence models. We compare three leading approaches: xLSTM..."
🔬 RESEARCH
via Arxiv
👤 Zilin Xiao, Qi Ma, Chun-cheng Jason Chen et al.
📅 2026-06-11
⚡ Score: 6.7
"Retrieval-augmented generation (RAG) has become a standard mechanism for grounding language models in external knowledge, yet conventional retrieval based on lexical or semantic similarity is poorly suited for complex reasoning tasks: a semantically similar problem may demand an entirely different s..."
🔬 RESEARCH
via Arxiv
👤 Chirag Chawla, Pratinav Seth, Vinay Kumar Sankarapu
📅 2026-06-10
⚡ Score: 6.7
"Domain fine-tuning degrades the safety of large language models: fine-tuned specialists readily comply with harmful prompts framed in domain language. Existing inference-time defenses that mix logits from a safe anchor model require both models to share a vocabulary, which rules them out for the cro..."
🔬 RESEARCH
via Arxiv
👤 Mengyu Zheng, Kai Han, Boxun Li et al.
📅 2026-06-10
⚡ Score: 6.7
"General-purpose agents such as OpenClaw are increasingly used as autonomous tool users, but their coding ability is difficult to measure under SWE-bench: a generic agent does not by itself satisfy the clean Docker workspace, patch, and prediction contract required for scoring. We introduce Claw-SWE-..."
🔬 RESEARCH
via Arxiv
👤 Xucong Wang, Ziyu Ma, Yong Wang et al.
📅 2026-06-10
⚡ Score: 6.7
"Recent advances in agentic Reinforcement Learning (RL) have substantially improved the multi-turn tool-use capabilities of large language model agents. However, most existing methods assign credit over coarse heuristic units, such as tool-call boundaries or fixed workflows, making it difficult to id..."
🔬 RESEARCH
via Arxiv
👤 King Yeung Tsang, Zihao Zhao, Vishal Venkataramani et al.
📅 2026-06-11
⚡ Score: 6.6
"Multi-Agent Systems (MAS) built on Large Language Models (LLMs) require effective orchestration to coordinate specialized agents, yet training such orchestrators is hindered by limited supervision and high computational cost. We propose Orchestration Reward Modeling (OrchRM), a self-supervised frame..."
🔬 RESEARCH
via Arxiv
👤 Yucheng Li, Huiqiang Jiang, Yang Xu et al.
📅 2026-06-10
⚡ Score: 6.6
"Reinforcement learning (RL) has become a key component in modern large language models, yet the rollout stage remains the key bottleneck in RL training pipelines. Although Multi-Token Prediction (MTP) offers a natural solution to accelerate rollouts through speculative decoding, many studies have ob..."
📰 NEWS
🔺 7 pts
⚡ Score: 6.4
🔬 RESEARCH
🔺 2 pts
⚡ Score: 6.4