arXiv cs.AI papers May 29 2026

50 items · 1 source · updated 19d ago · trend 0

This feed lists 50 papers posted to arXiv's cs.AI category on May 29, 2026, spanning reinforcement learning, language models, AI safety, and applications in engineering, education, and clinical research. Topics range from temporal-difference learning methods and diffusion model concept erasure to LLM evaluation frameworks, hallucination mitigation, and autonomous agent deployment challenges.

[BLG] blog/rss · 50
Behavior-Induced Mirror-Prox Temporal-Difference Learning for Faster Off-Policy Prediction
arXiv cs.AI · Xingguo Chen, Yuchen Shen, Shangdong Yang, Chao Li, Guang Yang, Wenhao Wang · 19d
Behavior-Aware Auxiliary Corrections for Off-Policy Temporal-Difference Prediction
arXiv cs.AI · Xingguo Chen, Zhiang He, Yuchen Shen, Shangdong Yang, Chao Li, Guang Yang, Wenhao Wang · 19d
The Cognitive Categorical Transformer: Category-Theoretic Inductive Biases for Language Modeling
arXiv cs.AI · Al Kari · 19d
Ultra-Reduced-Impact-Encased-Logging (URIEL): propose a new method for selective sustainable logging and post-harvest silvicultural treatment in tropical forest using airborne robotics systems
arXiv cs.AI · Daniel Albiero, Gelton Fernando de Morais, Daniela Han, Flávio Roberto de Freitas Gonçalves, Artur Vitório Andrade Santos, Wesllen Lins de Araújo, Alessandra Maia Freire, Cláudio Kiyoshi Umezu, Mateus Peressin, Francesco Toscano, Admilson Írio Ribeiro, Alfeu J. Sguarezi Filho, Américo Ferraz Dias Neto, Angel Pontin Garcia · 19d
Review Arcade: On the Human Alignment and Gameability of LLM Reviews
arXiv cs.AI · Hans Ole Hatzel, Sebastian Steindl, Jan Strich · 19d
Orthogonal Concept Erasure for Diffusion Models
arXiv cs.AI · Yuhao Sun, Lingyun Yu, Haoxiang Xu, Fengyuan Miao, Zhuoer Xu, Hongtao Xie · 19d
Frontier LLM-based agents can overcome the ontology curation bottleneck for natural phenotypes
arXiv cs.AI · James P. Balhoff, Hilmar Lapp · 19d
VFEAgent: A Multimodal Agent Framework for End-to-End Automated Finite Element Analysis
arXiv cs.AI · Jiachen Zhang (Peking University, China Agricultural University), Junyi Lao (Peking University), Chenghao Liu (Peking University), Siyuan Liu (Peking University), Shixin Wu (Peking University), Linsen Zhang (Peking University), Boyu Wang (Peking University), Songfang Huang (Peking University) · 19d
BEAMS: Benchmarking and Evaluating AI for Modeling and Simulation
arXiv cs.AI · Sara Metcalf, William Schoenberg · 19d
Adopt ≠ Adapt: Longitudinal Analyses of LLM Conversations in the Wild
arXiv cs.AI · Rebecca M. M. Hicke, Kiran Tomlinson · 19d
When Models Disagree: Rethinking LLM Evaluation for Public Comment Analysis
arXiv cs.AI · Aisha Najera, Alvin Moon, Vedant Srinivasan, Rajesh Veeraraghavan · 19d
Mind Your Tone: Does Tone Alter LLM Performance?
arXiv cs.AI · Om Dobariya, Akhil Kumar · 19d
Practitioner Beliefs and Behaviors in AI-Enhanced Education: DOT Framework Survey Evidence
arXiv cs.AI · David Gibson (Curtin University), M. Elizabeth Azukas (Georgia Institute of Technology), Gerald Knezek (University of North Texas) · 19d
Differentiable Belief-based Opponent Shaping
arXiv cs.AI · Aarav G Sane, Karthik Sivachandran, Rohan Paleja · 19d
Hallucination Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching
arXiv cs.AI · Diego Gosmar, Deborah A. Dahl · 19d
Robust and Efficient Guardrails with Latent Reasoning
arXiv cs.AI · Siddharth Sai, Xiaofei Wen, Muhao Chen · 19d
Bridging the Sim-to-Real Gap in Reinforcement Learning-Based Industrial Dispatching through Execution Semantics
arXiv cs.AI · Jonathan Hoss, Noah Klarmann · 19d
The Importance of Out-of-Band Metadata for Safe Autonomous Agents: The Redpanda Agentic Data Plane
arXiv cs.AI · Tyler Akidau, Tyler Rockwood, Johannes Brüderl, Marc Millstone · 19d
The Chain Holds, the Answer Folds: Trace-Answer Dissociation in Reasoning Models Under Adversarial Pressure
arXiv cs.AI · Yubo Li, Ramayya Krishnan, Rema Padman · 19d
Trends in AI and Human-AI Interaction in Clinical Trials -- A Hybrid Human-AI Exploration
arXiv cs.AI · Sandra Woolley, Tim Collins, Khalid Khattak, Illia Chernomorets, Ariane Arevalo, Chris Richardson · 19d
Beyond Consensus: Trace-Level Synthesis in Mixture of Agents
arXiv cs.AI · Shreyas Fadnavis, Praitayini Kanakaraj, Felix Wyss · 19d
PRO-CUA: Process-Reward Optimization for Computer Use Agents
arXiv cs.AI · Yifei He, Rui Yang, Hao Bai, Tong Zhang, Han Zhao · 19d
The Confidence Shortcut: A Reasoning Failure Mode of Masked Diffusion Models
arXiv cs.AI · Dueun Kim, Albert No · 19d
Governing Technical Debt in Agentic AI Systems
arXiv cs.AI · Muhammad Zia Hydari, Raja Iqbal, Narayan Ramasubbu · 19d
Better Later Than Sooner: Neuro-Symbolic Knowledge Graph Construction via Ontology-grounded Post-extraction Correction
arXiv cs.AI · Lorenzo Loconte, Timothy Hospedales, Cristina Cornelio · 19d
Paper Agents, Paper Gains: An Empirical Analysis of DeFi Investment Agents
arXiv cs.AI · Jay Yu, Amy Zhao, Danning Sui · 19d
ReasonOps: Operator Segmentation for LLM Reasoning Traces
arXiv cs.AI · Daniel Lee, Owen Queen, James Zou · 19d
GTA: Generating Long-Horizon Tasks for Web Agents at Scale
arXiv cs.AI · Tenghao Huang, Kung-Hsiang Huang, Prafulla Kumar Choubey, Yilun Zhou, Muhao Chen, Jonathan May, Chien-Sheng Wu · 19d
BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents
arXiv cs.AI · Jiahao Huang, Fei Cheng, Junfeng Jiang, Zefan Yu, Akiko Aizawa · 19d
Tailoring the Curriculum: Student-Centered Reasoning Distillation via Dynamic Data-Model Compatibility
arXiv cs.AI · Jiahao Huang, Fei Cheng, Junfeng Jiang, Akiko Aizawa · 19d
Rethinking Literature Search Evaluation: Deep Research Helps, and Human Citation Lists Are Not a Ground Truth
arXiv cs.AI · Gaurav Sahu, Laurent Charlin, Christopher Pal · 19d
Surfacing Isolated Learners with Outcome-Independent Mediation of Feedback between Teachers and Students Using AI
arXiv cs.AI · Junsoo Park, Youssef Medhat, Htet Phyo Wai, Ploy Thajchayapong, Ashok K. Goel · 19d
DenseSteer: Steering Small Language Models towards Dense Math Reasoning
arXiv cs.AI · Yang Ouyang, Shuhang Lin, Jung-Eun Kim · 19d
Provably Secure Agent Guardrail
arXiv cs.AI · Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang, Nenghai Yu · 19d
OpenClawBench: Benchmarking Process-side Anomalies in Real-world Agent Execution Trajectories
arXiv cs.AI · Yibing Liu, Yangze Liu, Xiaolong Yin, Bin Wang, Chong Zhang, Hao Yin, Zhongyi Han · 19d
Harmonizing Real-Time Constraints and Long-Horizon Reasoning: An Asynchronous Agentic Framework for Dynamic Scheduling
arXiv cs.AI · Shijie Cao, Yuan Yuan, Jing Liu · 19d
When and How Human Curation Backfires: Preference Alignment under Multi-Model Self-Consuming Loop
arXiv cs.AI · Yang Zhang, Xiukun Wei, Xueru Zhang · 19d
Indexing the Unreadable: LLM-Native Recursive Construction and Search of Service Taxonomies
arXiv cs.AI · Wei Zheng, Yang Yan, Yiyang Shao, Jinyang Li, Zeze Chang, Yukuang Jia, Qiming Mao, Chihyung Wang, Jingbin Zhou · 19d
CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval
arXiv cs.AI · Vaishali Senthil, Ashutosh Hathidara, Sebastian Schreiber · 19d
Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces
arXiv cs.AI · Chen He, Yuhao Wu, Lei Wang, Wenxuan Zhang, Fumin Shen · 19d
Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models
arXiv cs.AI · Qi Liu, Mingdi Sun, Yongyi He, Zhi Zheng, Tong Xu, Yi Zheng, Zhefeng Wang, Enhong Chen · 19d
Rubric-Guided Process Reward for Stepwise Model Routing
arXiv cs.AI · Shenghao Ye, Yu Guo, Zhengheng Li, Shuangwu Chen, Jian Yang · 19d
ConMoE: Expert-Pool Consolidation via Prototype Reassignment for MoE Compression
arXiv cs.AI · Yilun Yao, Jiaming Pan, Elsie Dai, Peizhuang Cong, Yaoming Li, Tong Yang · 19d
PassNet: Scaling Large Language Models for Graph Compiler Pass Generation
arXiv cs.AI · Yiqun Liu, Yingsheng Wu, Ruqi Yang, Enrong Zheng, Honglei Qiu, Sijun He, Tai Liang, Jingjing Wu, Yuhan Zhou, Yiwei Zhang, Dongyan Chen, Weihan Yi, Xinqi Li, Siqi Bao · 19d
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
arXiv cs.AI · Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey, Trenton Bricken, Brian Chen, Adam Pearce, Craig Citro, Emmanuel Ameisen, Andy Jones, Hoagy Cunningham, Nicholas L Turner, Callum McDougall, Monte MacDiarmid, Alex Tamkin, Esin Durmus, Tristan Hume, Francesco Mosconi, C. Daniel Freeman, Theodore R. Sumers, Edward Rees, Joshua Batson, Adam Jermyn, Shan Carter, Chris Olah, Tom Henighan · 19d
MiraBench: Evaluating Action-Conditioned Reliability in Robotic World Models
arXiv cs.AI · Tianzhuo Yang, Zihan Shen, Zirui Mi, Zhaoyi Zhang, Jiayi Zhou, Jiaming Ji, Juntao Dai, Jiawei Chen, Boyuan Chen, Yaodong Yang · 19d
EvoMD-LLM: Learning the Language of Species Evolution in Reactive Molecular Dynamics
arXiv cs.AI · Zhichen Tang, Zhengzheng Dang, Yulin Chen, Jixin Wu, Haiwen Li, Yanming Wang · 19d
Aligned but Fragile: Enhancing LLM Safety Robustness via Zeroth-Order Optimization
arXiv cs.AI · Zhihao Liu, Yifan Wu, Jian Lou, Di Wang, Yuxi Zhou, Yuke Hu · 19d
Architecture-Sensitive Supervised Fine-Tuning for Screen-Conditioned Action Prediction: A PiSAR Benchmark
arXiv cs.AI · Rahul Bissa, Abhishek Vyas, Yash Jain · 19d
When Does Persona Prompting Actually Help? A Retrieval and Metric Analysis of Expert Role Injection in LLMs
arXiv cs.AI · Shuai Xiao, Su Liu, Weikai Zhou, Jialun Wu, Xinjie He, Zhiyuan Lin, Qiyang Xie · 19d