arXiv cs.AI AI News
AINews has published 140 articles sourced from arXiv cs.AI. Explore the latest related coverage, summaries and analysis.
Overview
Published articles: 140
Latest update: April 10, 2026
Primary host: arxiv.org
Latest coverage from arXiv cs.AI
The field of Multi-Agent Reinforcement Learning (MARL) has achieved remarkable feats in simulation, from mastering complex games like StarCraft II to optimizing logistics networks.…
Qualixar OS represents a foundational leap in AI infrastructure, positioning itself not as another AI model or a simple orchestration framework, but as the first application-layer …
A critical reassessment of the 'hallucination' problem in multimodal AI is underway, exposing a dangerous flaw in current safety paradigms. The industry's obsession with reducing o…
The AI industry's relentless pursuit of longer context windows—with models now reaching millions of tokens—has created a paradoxical situation: we can store more information than e…
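The AI industry is at an inflection point: individual agents are being created faster than the systems that manage them can mature. As specialized models proliferate across devices, edge nodes, and cloud data centers, each with different capabilities, latency characteristics, costs, and data privacy implications, dynamically routing user requests to the optimal agent has become a critical and urgent problem. For complex applications, manually hard-coding these decisions is infeasible, which poses a major obstacle to building robust multi-agent systems. AgentGate is…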
The artificial intelligence community has operated for years under a simplifying assumption: Supervised Fine-Tuning (SFT) teaches models to mimic training data, while Reinforcement…
The persistent issue of 'fluent hallucinations' in large language models—where AI generates mathematically plausible but logically incorrect reasoning—has long hampered their appli…
The persistent challenge of getting AI systems to accurately assess their own confidence has been a major roadblock to their deployment in high-stakes fields. Traditional methods f…
The prevailing method for mitigating hallucinations in large language models has long been an external, post-hoc affair. Systems typically rely on retrieval-augmented generation (R…
Port logistics, long characterized by reactive operations and costly inefficiencies, is entering an era of predictive intelligence. The core challenge has been 'unproductive moves'…
The AI industry is undergoing a fundamental paradigm shift. The era of scaling model parameters is giving way to a new focus: test-time compute scaling. This concept involves dynam…
The Silicon Mirror framework represents a foundational shift in how we approach AI alignment, moving beyond output filtering to intervention at the decision-making layer. Developed…
The field of AI agent evaluation has reached both a milestone and a precipice. Independent research has validated that LLM-based judges—systems that assess the quality of other AI …
The deployment of large language models in serious applications is hitting a fundamental roadblock: their inability to reliably distinguish fact from fabrication. While these model…
The field of automated optimization modeling, crucial for applications from supply chain logistics to financial portfolio management, has long been trapped between two flawed appro…
The relentless pursuit of larger, more capable language models has made Mixture-of-Experts (MoE) architectures a cornerstone of modern AI scaling. By activating only a subset of pa…
The rapid evolution of large language models from conversational interfaces to autonomous agents has exposed a critical architectural vulnerability. Current systems typically emplo…
The AI research community faces a mounting credibility challenge centered on the reproducibility of advanced agent capabilities. The GPT-OSS-20b model, celebrated for its tool-use …
As AI agents powered by large language models transition from research prototypes to production systems, a previously underestimated bottleneck has emerged: the operational overhea…
The frontier of applied artificial intelligence is undergoing a quiet but profound transformation. The dominant narrative of scaling ever-larger monolithic models is being challeng…
The evaluation of artificial intelligence is undergoing a paradigm shift from closed-domain problem-solving to open-ended social cognition. The vocabulary association game Connecti…
The promise of AI-powered programming tutors—unlimited, patient, and personalized instruction—is colliding with a subtle but profound technical reality. Large language models, when…
The field of AI-assisted behavioral health is undergoing a foundational transformation. The long-standing tension between fluid, empathetic conversation and rigid, reliable safety …
The AI community has reached consensus that agent reliability represents the final frontier before widespread practical deployment. While significant progress has been made in agen…