arXiv cs.AI papers May 29 2026

50 items ● 1 source ● updated 19d ago ● trend 0

┌─ summary ─────────────────────────────┐

This feed collects 50 papers posted to arXiv's cs.AI category on May 29, 2026, spanning reinforcement learning, language models, AI safety, and applications in engineering, education, and clinical research. Topics range from temporal-difference learning methods and diffusion model concept erasure to LLM evaluation frameworks, hallucination mitigation, and autonomous agent deployment challenges.

┌─ items (50) ──────────────────────────┐

Behavior-Induced Mirror-Prox Temporal-Difference Learning for Faster Off-Policy Prediction

arXiv cs.AI · Xingguo Chen, Yuchen Shen, Shangdong Yang, Chao Li, Guang Yang, Wenhao Wang · 19d

Behavior-Aware Auxiliary Corrections for Off-Policy Temporal-Difference Prediction

arXiv cs.AI · Xingguo Chen, Zhiang He, Yuchen Shen, Shangdong Yang, Chao Li, Guang Yang, Wenhao Wang · 19d

The Cognitive Categorical Transformer: Category-Theoretic Inductive Biases for Language Modeling

arXiv cs.AI · Al Kari · 19d

Ultra-Reduced-Impact-Encased-Logging (URIEL): propose a new method for selective sustainable logging and post-harvest silvicultural treatment in tropical forest using airborne robotics systems

arXiv cs.AI · Daniel Albiero, Gelton Fernando de Morais, Daniela Han, Flávio Roberto de Freitas Gonçalves, Artur Vitório Andrade Santos, Wesllen Lins de Araújo, Alessandra Maia Freire, Cláudio Kiyoshi Umezu, Mateus Peressin, Francesco Toscano, Admilson Írio Ribeiro, Alfeu J. Sguarezi Filho, Américo Ferraz Dias Neto, Angel Pontin Garcia · 19d

Review Arcade: On the Human Alignment and Gameability of LLM Reviews

arXiv cs.AI · Hans Ole Hatzel, Sebastian Steindl, Jan Strich · 19d

Orthogonal Concept Erasure for Diffusion Models

arXiv cs.AI · Yuhao Sun, Lingyun Yu, Haoxiang Xu, Fengyuan Miao, Zhuoer Xu, Hongtao Xie · 19d

Frontier LLM-based agents can overcome the ontology curation bottleneck for natural phenotypes

arXiv cs.AI · James P. Balhoff, Hilmar Lapp · 19d

VFEAgent: A Multimodal Agent Framework for End-to-End Automated Finite Element Analysis

arXiv cs.AI · Jiachen Zhang (Peking University, China Agricultural University), Junyi Lao (Peking University), Chenghao Liu (Peking University), Siyuan Liu (Peking University), Shixin Wu (Peking University), Linsen Zhang (Peking University), Boyu Wang (Peking University), Songfang Huang (Peking University) · 19d

BEAMS: Benchmarking and Evaluating AI for Modeling and Simulation

arXiv cs.AI · Sara Metcalf, William Schoenberg · 19d

Adopt ≠ Adapt: Longitudinal Analyses of LLM Conversations in the Wild

arXiv cs.AI · Rebecca M. M. Hicke, Kiran Tomlinson · 19d

When Models Disagree: Rethinking LLM Evaluation for Public Comment Analysis

arXiv cs.AI · Aisha Najera, Alvin Moon, Vedant Srinivasan, Rajesh Veeraraghavan · 19d

Mind Your Tone: Does Tone Alter LLM Performance?

arXiv cs.AI · Om Dobariya, Akhil Kumar · 19d

Practitioner Beliefs and Behaviors in AI-Enhanced Education: DOT Framework Survey Evidence

arXiv cs.AI · David Gibson (Curtin University), M. Elizabeth Azukas (Georgia Institute of Technology), Gerald Knezek (University of North Texas) · 19d

Differentiable Belief-based Opponent Shaping

arXiv cs.AI · Aarav G Sane, Karthik Sivachandran, Rohan Paleja · 19d

Hallucination Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching

arXiv cs.AI · Diego Gosmar, Deborah A. Dahl · 19d

Robust and Efficient Guardrails with Latent Reasoning

arXiv cs.AI · Siddharth Sai, Xiaofei Wen, Muhao Chen · 19d

Bridging the Sim-to-Real Gap in Reinforcement Learning-Based Industrial Dispatching through Execution Semantics

arXiv cs.AI · Jonathan Hoss, Noah Klarmann · 19d

The Importance of Out-of-Band Metadata for Safe Autonomous Agents: The Redpanda Agentic Data Plane

arXiv cs.AI · Tyler Akidau, Tyler Rockwood, Johannes Brüderl, Marc Millstone · 19d

The Chain Holds, the Answer Folds: Trace-Answer Dissociation in Reasoning Models Under Adversarial Pressure

arXiv cs.AI · Yubo Li, Ramayya Krishnan, Rema Padman · 19d

Trends in AI and Human-AI Interaction in Clinical Trials -- A Hybrid Human-AI Exploration

arXiv cs.AI · Sandra Woolley, Tim Collins, Khalid Khattak, Illia Chernomorets, Ariane Arevalo, Chris Richardson · 19d

Beyond Consensus: Trace-Level Synthesis in Mixture of Agents

arXiv cs.AI · Shreyas Fadnavis, Praitayini Kanakaraj, Felix Wyss · 19d

PRO-CUA: Process-Reward Optimization for Computer Use Agents

arXiv cs.AI · Yifei He, Rui Yang, Hao Bai, Tong Zhang, Han Zhao · 19d

The Confidence Shortcut: A Reasoning Failure Mode of Masked Diffusion Models

arXiv cs.AI · Dueun Kim, Albert No · 19d

Governing Technical Debt in Agentic AI Systems

arXiv cs.AI · Muhammad Zia Hydari, Raja Iqbal, Narayan Ramasubbu · 19d

Better Later Than Sooner: Neuro-Symbolic Knowledge Graph Construction via Ontology-grounded Post-extraction Correction

arXiv cs.AI · Lorenzo Loconte, Timothy Hospedales, Cristina Cornelio · 19d

Paper Agents, Paper Gains: An Empirical Analysis of DeFi Investment Agents

arXiv cs.AI · Jay Yu, Amy Zhao, Danning Sui · 19d

ReasonOps: Operator Segmentation for LLM Reasoning Traces

arXiv cs.AI · Daniel Lee, Owen Queen, James Zou · 19d

GTA: Generating Long-Horizon Tasks for Web Agents at Scale

arXiv cs.AI · Tenghao Huang, Kung-Hsiang Huang, Prafulla Kumar Choubey, Yilun Zhou, Muhao Chen, Jonathan May, Chien-Sheng Wu · 19d

BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents

arXiv cs.AI · Jiahao Huang, Fei Cheng, Junfeng Jiang, Zefan Yu, Akiko Aizawa · 19d

Tailoring the Curriculum: Student-Centered Reasoning Distillation via Dynamic Data-Model Compatibility

arXiv cs.AI · Jiahao Huang, Fei Cheng, Junfeng Jiang, Akiko Aizawa · 19d

Rethinking Literature Search Evaluation: Deep Research Helps, and Human Citation Lists Are Not a Ground Truth

arXiv cs.AI · Gaurav Sahu, Laurent Charlin, Christopher Pal · 19d

Surfacing Isolated Learners with Outcome-Independent Mediation of Feedback between Teachers and Students Using AI

arXiv cs.AI · Junsoo Park, Youssef Medhat, Htet Phyo Wai, Ploy Thajchayapong, Ashok K. Goel · 19d

DenseSteer: Steering Small Language Models towards Dense Math Reasoning

arXiv cs.AI · Yang Ouyang, Shuhang Lin, Jung-Eun Kim · 19d

Provably Secure Agent Guardrail

arXiv cs.AI · Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang, Nenghai Yu · 19d

OpenClawBench: Benchmarking Process-side Anomalies in Real-world Agent Execution Trajectories

arXiv cs.AI · Yibing Liu, Yangze Liu, Xiaolong Yin, Bin Wang, Chong Zhang, Hao Yin, Zhongyi Han · 19d

Harmonizing Real-Time Constraints and Long-Horizon Reasoning: An Asynchronous Agentic Framework for Dynamic Scheduling

arXiv cs.AI · Shijie Cao, Yuan Yuan, Jing Liu · 19d

When and How Human Curation Backfires: Preference Alignment under Multi-Model Self-Consuming Loop

arXiv cs.AI · Yang Zhang, Xiukun Wei, Xueru Zhang · 19d

Indexing the Unreadable: LLM-Native Recursive Construction and Search of Service Taxonomies

arXiv cs.AI · Wei Zheng, Yang Yan, Yiyang Shao, Jinyang Li, Zeze Chang, Yukuang Jia, Qiming Mao, Chihyung Wang, Jingbin Zhou · 19d

CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval

arXiv cs.AI · Vaishali Senthil, Ashutosh Hathidara, Sebastian Schreiber · 19d

Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces

arXiv cs.AI · Chen He, Yuhao Wu, Lei Wang, Wenxuan Zhang, Fumin Shen · 19d

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

arXiv cs.AI · Qi Liu, Mingdi Sun, Yongyi He, Zhi Zheng, Tong Xu, Yi Zheng, Zhefeng Wang, Enhong Chen · 19d

Rubric-Guided Process Reward for Stepwise Model Routing

arXiv cs.AI · Shenghao Ye, Yu Guo, Zhengheng Li, Shuangwu Chen, Jian Yang · 19d

ConMoE: Expert-Pool Consolidation via Prototype Reassignment for MoE Compression

arXiv cs.AI · Yilun Yao, Jiaming Pan, Elsie Dai, Peizhuang Cong, Yaoming Li, Tong Yang · 19d

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

arXiv cs.AI · Yiqun Liu, Yingsheng Wu, Ruqi Yang, Enrong Zheng, Honglei Qiu, Sijun He, Tai Liang, Jingjing Wu, Yuhan Zhou, Yiwei Zhang, Dongyan Chen, Weihan Yi, Xinqi Li, Siqi Bao · 19d

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

arXiv cs.AI · Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey, Trenton Bricken, Brian Chen, Adam Pearce, Craig Citro, Emmanuel Ameisen, Andy Jones, Hoagy Cunningham, Nicholas L Turner, Callum McDougall, Monte MacDiarmid, Alex Tamkin, Esin Durmus, Tristan Hume, Francesco Mosconi, C. Daniel Freeman, Theodore R. Sumers, Edward Rees, Joshua Batson, Adam Jermyn, Shan Carter, Chris Olah, Tom Henighan · 19d

MiraBench: Evaluating Action-Conditioned Reliability in Robotic World Models

arXiv cs.AI · Tianzhuo Yang, Zihan Shen, Zirui Mi, Zhaoyi Zhang, Jiayi Zhou, Jiaming Ji, Juntao Dai, Jiawei Chen, Boyuan Chen, Yaodong Yang · 19d

EvoMD-LLM: Learning the Language of Species Evolution in Reactive Molecular Dynamics

arXiv cs.AI · Zhichen Tang, Zhengzheng Dang, Yulin Chen, Jixin Wu, Haiwen Li, Yanming Wang · 19d

Aligned but Fragile: Enhancing LLM Safety Robustness via Zeroth-Order Optimization

arXiv cs.AI · Zhihao Liu, Yifan Wu, Jian Lou, Di Wang, Yuxi Zhou, Yuke Hu · 19d

Architecture-Sensitive Supervised Fine-Tuning for Screen-Conditioned Action Prediction: A PiSAR Benchmark

arXiv cs.AI · Rahul Bissa, Abhishek Vyas, Yash Jain · 19d

When Does Persona Prompting Actually Help? A Retrieval and Metric Analysis of Expert Role Injection in LLMs

arXiv cs.AI · Shuai Xiao, Su Liu, Weikai Zhou, Jialun Wu, Xinjie He, Zhiyuan Lin, Qiyang Xie · 19d