Recent AI research papers with accessible summaries. Updated daily from arXiv, summarized for developers who don't read papers regularly.
Rishi Bommasani, Sarah H. Bana, Kathleen A. Creel et al.
When many employers use the same hiring algorithm, it amplifies bias rather than spreading risk—the same people get rejected everywhere, and racial disparities compound across the job market.
This paper analyzes what happens when many employers use hiring algorithms from a single vendor and finds that the shared algorithms produce unfair outcomes.
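As a toy illustration of why a shared algorithm concentrates rejections instead of spreading them, the sketch below compares two hypothetical job markets: one where every employer reuses a single vendor decision, and one where each employer screens independently. This is not the paper's model or data. The function name `rejected_everywhere_rate`, the 50% rejection probability, the five employers, and the assumption that independent screeners make statistically independent decisions are all illustrative choices, and the sketch does not model the racial disparities the paper reports.

```python
import random


def rejected_everywhere_rate(num_applicants, num_employers, reject_prob, shared, seed=0):
    """Return the fraction of applicants rejected by every employer."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rejected_all = 0
    for _ in range(num_applicants):
        if shared:
            # One vendor model: a single decision is reused by every employer.
            decisions = [rng.random() < reject_prob] * num_employers
        else:
            # Independent screeners (assumption): each employer draws its own decision.
            decisions = [rng.random() < reject_prob for _ in range(num_employers)]
        if all(decisions):
            rejected_all += 1
    return rejected_all / num_applicants


shared_rate = rejected_everywhere_rate(10_000, 5, 0.5, shared=True)
independent_rate = rejected_everywhere_rate(10_000, 5, 0.5, shared=False)
print(f"rejected everywhere, shared algorithm: {shared_rate:.3f}")
print(f"rejected everywhere, independent:      {independent_rate:.3f}")
```

Under these assumptions, roughly half of the applicants are locked out of every employer in the shared case, while in the independent case the expected share is 0.5 to the fifth power, or about 3%.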
Huawei Lin, Peng Li, Jie Song et al.
Treating AI agent skills as long-lived, testable assets with persistent memory—rather than disposable code—significantly improves task success rates and enables skills to transfer between agents and tasks.
This paper introduces MUSE-Autoskill, a framework that helps AI agents continuously improve by creating, storing, and refining reusable skills over time. Instead of treating skills as one-time solutions, the system manages them like software—organizing them in memory, testing them, and learning from experience to make them more reliable and effective across different tasks.
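The idea of managing skills like software can be sketched as a small library that stores each skill alongside its track record and only surfaces skills that have proven reliable. This is a minimal sketch, not MUSE-Autoskill's actual design: `SkillRecord`, `SkillLibrary`, the in-memory dictionary standing in for persistent memory, and the pass-rate threshold are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SkillRecord:
    """A stored skill plus the outcomes of the times it was used."""
    name: str
    body: str
    passes: int = 0
    failures: int = 0

    def success_rate(self) -> float:
        runs = self.passes + self.failures
        return self.passes / runs if runs else 0.0


class SkillLibrary:
    """Hypothetical skill store; an in-memory dict stands in for persistent memory."""

    def __init__(self) -> None:
        self._skills: Dict[str, SkillRecord] = {}

    def add(self, name: str, body: str) -> None:
        self._skills[name] = SkillRecord(name, body)

    def record_outcome(self, name: str, passed: bool) -> None:
        # Each test or task run updates the skill's track record.
        skill = self._skills[name]
        if passed:
            skill.passes += 1
        else:
            skill.failures += 1

    def reliable_skills(self, min_rate: float, min_runs: int) -> List[str]:
        # Keep only skills with enough recorded runs and a high enough pass rate.
        return sorted(
            s.name for s in self._skills.values()
            if s.passes + s.failures >= min_runs and s.success_rate() >= min_rate
        )


library = SkillLibrary()
library.add("parse_csv", "Read the header row, then parse each line.")
library.add("fetch_url", "Request the page and return its body.")
for outcome in (True, True, False):
    library.record_outcome("parse_csv", outcome)
for outcome in (False, False, True):
    library.record_outcome("fetch_url", outcome)

reliable = library.reliable_skills(min_rate=0.6, min_runs=3)
print(reliable)
```

With these made-up outcomes, only `parse_csv` (two passes out of three runs) clears the 0.6 threshold.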
Yifan Yang, Ziyang Gong, Weiquan Huang et al.
Skills can be trained like model parameters: instead of generating a skill once, use a separate optimizer to iteratively edit the skill text based on validation feedback. This approach is reproducible, stable, and transfers across models.
SkillOpt treats agent skills like neural network weights—optimizing them systematically through an external optimizer model that suggests bounded edits to skill documents based on scored rollouts.
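The loop described above (propose a bounded edit, score it, keep it only if it helps) can be sketched as follows. This is a minimal sketch under stated assumptions, not SkillOpt's implementation: `optimize_skill`, `toy_propose`, and `toy_score` are hypothetical names, the proposer is a fixed rule standing in for the optimizer model, and the score is a keyword check standing in for scored rollouts on a validation set.

```python
from typing import Callable, List, Tuple

Skill = List[str]  # a skill document represented as a list of instruction lines


def optimize_skill(
    skill: Skill,
    propose_edit: Callable[[Skill, int], Skill],
    score: Callable[[Skill], float],
    steps: int,
    max_changed_lines: int = 1,
) -> Tuple[Skill, float]:
    """Greedy loop: accept a proposed edit only if it is small and scores higher."""
    best, best_score = list(skill), score(skill)
    for step in range(steps):
        candidate = propose_edit(best, step)
        # Bound the edit: count lines present in one version but not the other.
        changed = len(set(candidate) ^ set(best))
        if changed > max_changed_lines:
            continue
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Toy stand-ins (assumptions, not part of the paper).
REQUIRED = ["validate input", "retry on failure", "log result"]


def toy_score(skill: Skill) -> float:
    # Stand-in for scored rollouts: fraction of required behaviors mentioned.
    return sum(any(req in line for line in skill) for req in REQUIRED) / len(REQUIRED)


def toy_propose(skill: Skill, step: int) -> Skill:
    # Stand-in for the optimizer model: append one instruction per step.
    return skill + [REQUIRED[step % len(REQUIRED)]]


final_skill, final_score = optimize_skill(["read the task"], toy_propose, toy_score, steps=3)
print(final_skill, final_score)
```

Accepting only edits that improve the validation score is one simple acceptance rule; the summary does not say which rule the paper uses.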
Xu Ouyang, Deyi Liu, Yuhang Cai et al.
LLMs have a fundamental capacity limit based on signal-to-noise ratio: scaling parameters or data without maintaining sufficient signal clarity causes performance degradation, explaining phenomena like catastrophic overtraining and quantization failures that standard scaling laws can't capture.
This paper explains why large language models sometimes get worse, not just better, with more training or lower precision. Using information theory, the authors model LLM training as sending a signal through a noisy channel. When the model or data is scaled up without keeping the signal clear relative to the noise, performance stops improving and starts to degrade, giving a U-shaped curve.
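To make the noisy-channel intuition concrete, the sketch below uses the textbook Shannon-Hartley formula, C = B log2(1 + S/N), which gives the capacity of a channel with bandwidth B, signal power S, and noise power N. The summary does not give the paper's own equations, so this is a stand-in and not the authors' formulation; `channel_capacity` and the sample numbers are illustrative.

```python
import math


def channel_capacity(bandwidth: float, signal_power: float, noise_power: float) -> float:
    """Shannon-Hartley capacity (bits per second) of a noisy channel."""
    if noise_power <= 0:
        raise ValueError("noise_power must be positive")
    return bandwidth * math.log2(1 + signal_power / noise_power)


# Hold the signal fixed and raise the noise: capacity falls as the
# signal-to-noise ratio falls.
capacities = [channel_capacity(1.0, 7.0, noise) for noise in (1.0, 7.0)]
print(capacities)
```

In this analogy, adding parameters or data without improving the signal-to-noise ratio does not raise the ceiling on what can be transmitted reliably.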