HISTORICAL ARCHIVE - November 17, 2025
What was happening in AI on 2025-11-17
Archive from: 2025-11-17 | Preserved for posterity
RESEARCH
9 ups
Score: 8.3
"Academic research paper shared from arXiv preprint server."
AI MODELS
330 ups
Score: 7.8
"External link discussion - see full content at original source."
Microsoft's Market Position • AI Capabilities • Corporate Dysfunction
"Number 2 in every market is a very good business state"
• "Microsoft is successful at confusing people"
SECURITY
2 pts
Score: 7.5
AI MODELS
54 ups
Score: 7.4
"I created 10+ hybrid MXFP4 GGUF of the top models available today. Many of these models often have faster TPS than a Q4_K_M, ~10% smaller than a Q8_0 model, and much less precision loss than Q6_K (very near Q8, sometimes better). I'll provide links to the models, all the benchmarks, and my pro..."
Quantization options • Perplexity evaluation • Community collaboration
"MXFP4 isn't particular a strong quantization on its own"
• "iq2_kl and iq4_ks are very strong and likely more widely applicable"
RESEARCH
via Arxiv
Mohamad Amin Mohamadi, Tianhao Wang, Zhiyuan Li
2025-11-14
Score: 7.3
"Modern language models fail a fundamental requirement of trustworthy intelligence: knowing when not to answer. Despite achieving impressive accuracy on benchmarks, these models produce confident hallucinations, even when wrong answers carry catastrophic consequences. Our evaluations on GSM8K, MedQA..."
BREAKTHROUGH
11 ups
Score: 7.3
"A Chinese research team built an AI system that pulled core physics laws straight out of experimental data with zero prior knowledge. AI-Newton independently found relationships such as Newton's second law. This shows even more that automated science is starting to look real. China's moving fast on ..."
TOOLS
178 ups
Score: 7.3
"# What Memlayer Does
MemLayer is an open-source **Python package** that adds persistent, long-term memory to **local LLMs** and embedding pipelines.
Local models are powerful, but they're stateless. Every prompt starts from zero.
This makes it difficult to build assistants or agents that remembe..."
Integration with LLM UIs • Technical implementation details • Memory storage and retrieval
"Definitely want a standalone reverse-proxy (preferably with an easily editable config file) or MCP implementation."
• "Consider looking into [LEANN] as a vector DB, due to its efficiency."
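A minimal sketch of the general idea behind persistent memory for a stateless local model. It does not use MemLayer's API, which this excerpt does not show. The `MemoryStore` class, the `build_prompt` helper, and the word-overlap ranking are all hypothetical stand-ins; a real system would more likely use embeddings and a vector index.

```python
import sqlite3


class MemoryStore:
    """Hypothetical persistent memory: notes live in SQLite, recall is by word overlap."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to keep notes across runs.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )
        self.db.commit()

    def remember(self, text):
        """Store one note."""
        self.db.execute("INSERT INTO notes (text) VALUES (?)", (text,))
        self.db.commit()

    def recall(self, query, limit=3):
        """Return up to `limit` notes sharing at least one word with the query."""
        words = set(query.lower().split())
        rows = self.db.execute("SELECT text FROM notes ORDER BY id").fetchall()
        scored = [(len(words & set(text.lower().split())), text) for (text,) in rows]
        scored = [pair for pair in scored if pair[0] > 0]
        # Stable sort: ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


def build_prompt(store, user_message):
    """Prepend recalled notes so the model sees them on every turn."""
    memories = store.recall(user_message)
    context = "\n".join(f"- {m}" for m in memories)
    return f"Known facts:\n{context}\n\nUser: {user_message}"
```

The prompt returned by `build_prompt` would be passed to whatever local model is in use; the model itself stays stateless, and only the store carries information between sessions.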
RESEARCH
via Arxiv
Haizhou Shi, Ye Liu, Bo Pang et al.
2025-11-13
Score: 7.1
"Large Language Models (LLMs) have demonstrated remarkable reasoning abilities, yet existing test-time frameworks often rely on coarse self-verification and self-correction, limiting their effectiveness on complex tasks. In this paper, we propose Socratic Self-Refine (SSR), a novel framework for fine..."
SECURITY
3 ups
Score: 7.1
"If a user asks Claude how well has Dyson's 1984 book "Weapons and Hope" aged, the LLM will try to do a web search and then, regardless of what happens next (even if the user stops the generation amid-search), user's question and model's answer will be both deleted even though there's nothing sketchy..."
RESEARCH
via Arxiv
Afra Feyza Akyürek, Advait Gosai, Chen Bo Calvin Zhang et al.
2025-11-14
Score: 7.0
"Frontier model progress is often measured by academic benchmarks, which offer a limited view of performance in real-world professional contexts. Existing evaluations often fail to assess open-ended, economically consequential tasks in high-stakes domains like Legal and Finance, where practical retur..."
RESEARCH
via Arxiv
Jiang Liu, Jialian Wu, Xiaodong Yu et al.
2025-11-13
Score: 6.9
"Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks, yet the majority of high-performing models remain closed-source or partially open, limiting transparency and reproducibility. In this work, we introduce Instella, a family of fully open three billion..."
TOOLS
10 ups
Score: 6.9
"Hey everyone,
I spent the last days building a small MCP → SSH relay so an LLM can safely control remote servers using a limited command set.
**Here's what the agent currently does completely autonomously:**
1. **Creates a temporary Hetzner server** via API
2. **Generates its own SSH keys**..."
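One way a "limited command set" can be enforced before anything reaches SSH is an allowlist check, sketched below. This is not the poster's relay, whose code is not shown here. The `ALLOWED` table, the `validate` and `is_allowed` functions, and the specific commands are hypothetical examples.

```python
import shlex

# Hypothetical allowlist: program -> permitted first arguments.
# An empty set means the program must be called with no arguments.
ALLOWED = {
    "uptime": set(),
    "df": {"-h"},
    "systemctl": {"status"},
}

# Reject anything that could chain or redirect commands in a remote shell.
FORBIDDEN_CHARS = set(";&|`$><\n")


def validate(command):
    """Return the parsed argv if the command is permitted, else raise ValueError."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        raise ValueError("shell metacharacters are not allowed")
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv or argv[0] not in ALLOWED:
        raise ValueError(f"command not allowed: {command!r}")
    allowed_first = ALLOWED[argv[0]]
    if not allowed_first:
        if len(argv) > 1:
            raise ValueError(f"{argv[0]} takes no arguments here")
    elif len(argv) < 2 or argv[1] not in allowed_first:
        raise ValueError(f"argument not allowed for {argv[0]}")
    return argv


def is_allowed(command):
    """Boolean wrapper around validate()."""
    try:
        validate(command)
    except ValueError:
        return False
    return True
```

Only commands that pass `validate` would be forwarded to the remote host. Denying by default keeps the agent's reach explicit, at the cost of having to extend the table for each new task.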
RESEARCH
via Arxiv
Jeffrey S. Bowers, Jeff Mitchell
2025-11-14
Score: 6.9
"According to Futrell and Mahowald [arXiv:2501.17047], both infants and language models (LMs) find attested languages easier to learn than impossible languages that have unnatural structures. We review the literature and show that LMs often learn attested and many impossible languages equally well. D..."
TOOLS
2 pts
Score: 6.8
RESEARCH
via Arxiv
Prabodh Katti, Sangwoo Park, Bipin Rajendran et al.
2025-11-14
Score: 6.8
"On-device fine-tuning is a critical capability for edge AI systems, which must support adaptation to different agentic tasks under stringent memory constraints. Conventional backpropagation (BP)-based training requires storing layer activations and optimizer states, a demand that can be only partial..."
RESEARCH
via Arxiv
Zhenyu Ding, Yuhao Wang, Tengyue Xiao et al.
2025-11-14
Score: 6.7
"Large Language Models (LLMs) demonstrate impressive capabilities, yet their outputs often suffer from misalignment with human preferences due to the inadequacy of weak supervision and a lack of fine-grained control. Training-time alignment methods like Reinforcement Learning from Human Feedback (RLH..."
RESEARCH
via Arxiv
Tianzhu Ye, Li Dong, Zewen Chi et al.
2025-11-13
Score: 6.7
"Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work, we introduce Generative Adversarial Distillation (GAD), which enables on-policy and black-box dist..."
DATA
1 pts
Score: 6.7
RESEARCH
via Arxiv
Yongxin Shi, Jiapeng Wang, Zeyu Shan et al.
2025-11-13
Score: 6.7
"Recent multimodal large language models (MLLMs) still struggle with long document understanding due to two fundamental challenges: information interference from abundant irrelevant content, and the quadratic computational cost of Transformer-based architectures. Existing approaches primarily fall in..."
RESEARCH
via Arxiv
Srikant Panda, Avinash Rai
2025-11-13
Score: 6.6
"Large Language Models (LLMs) are commonly evaluated for robustness against paraphrased or semantically equivalent jailbreak prompts, yet little attention has been paid to linguistic variation as an attack surface. In this work, we systematically study how linguistic styles such as fear or curiosity..."
RESEARCH
via Arxiv
Dena Mujtaba, Brian Hu, Anthony Hoogs et al.
2025-11-14
Score: 6.5
"The deployment of decision-making AI agents presents a critical challenge in maintaining alignment with human values or guidelines while operating in complex, dynamic environments. Agents trained solely to achieve their objectives may adopt harmful behavior, exposing a key trade-off between maximizi..."
RESEARCH
via Arxiv
Yonatan Dukler, Guihong Li, Deval Shah et al.
2025-11-14
Score: 6.4
"Blocking communication presents a major hurdle in running MoEs efficiently in distributed settings. To address this, we present FarSkip-Collective which modifies the architecture of modern models to enable overlapping of their computation with communication. Our approach modifies the architecture to..."
SECURITY
57 ups
Score: 6.4
"External link discussion - see full content at original source."
Criticism of AI Content • Skepticism of Automated Responses • Humor in Absurd Situations
"You can't make that up. It's chaotic, absurd, and definitely entertaining to watch unfold."
• "lol a $2 charge sent it over the edge… kind of like hitting your weekly limit or daily limit?"
AI MODELS
122 ups
Score: 6.3
"There are so many embedding models out there that it's hard to know which one is actually "the best." I kept seeing different recommendations, so I got curious and tested them myself.
I ran 13 models on 8 datasets and checked latency, accuracy, and an LLM-judged ELO score. Honestly, the results we..."
Benchmarking quality • LLM performance variations • Judging methodology
"Saturated benchmarks, not quality"
• "LLMs diverge fast in ability"
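The post does not say how its LLM-judged ELO score was computed. As a rough illustration, the standard Elo update applied to pairwise judgments looks like the sketch below. The function names, the starting rating of 1500, and the K-factor of 32 are conventional defaults assumed here, not values from the benchmark.

```python
def expected_score(r_a, r_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(r_a, r_b, score_a, k=32.0):
    """Return new ratings after one comparison.

    score_a is 1.0 if the judge preferred A, 0.0 if B, 0.5 for a tie.
    """
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


def rate(models, judgments, start=1500.0, k=32.0):
    """Apply a sequence of (model_a, model_b, score_a) judgments in order."""
    ratings = {m: start for m in models}
    for a, b, score_a in judgments:
        ratings[a], ratings[b] = update(ratings[a], ratings[b], score_a, k)
    return ratings
```

Each update moves the same number of points from one model to the other, so the total rating is conserved. Results depend on the order of the judgments, which is one reason such scores are often averaged over shuffled orderings.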
RESEARCH
3 ups
Score: 6.3
"Academic research paper shared from arXiv preprint server."
SHOW HN
4 pts
Score: 6.2
SECURITY
78 pts
Score: 6.2
Data ownership & control • Pros and cons of AI-driven privacy • Impact of AI on privacy
"Your AI. Not theirs."
• "Privacy will come back as a main selling point"
RESEARCH
via Arxiv
Anurag J. Vaidya, Felix Meissen, Daniel C. Castro et al.
2025-11-14
Score: 6.1
"Digitized histopathology analysis involves complex, time-intensive workflows and specialized expertise, limiting its accessibility. We introduce NOVA, an agentic framework that translates scientific queries into executable analysis pipelines by iteratively generating and running Python code. NOVA in..."
TOOLS
"
https://preview.redd.it/n3h3cqvhjv1g1.png?width=736&format=png&auto=webp&s=f382ca9a59d5a439b65095e6c57a69c107ad3890
I just got this notification, didnt do a lot of work.
just did one prompt and it seems to be good and fast (i use grok code free)..."