arXiv cs.CL papers May 26 2026

29 items · 1 source · updated 22d ago · trend 0

On May 26, 2026, arXiv's cs.CL (Computation and Language) section listed 29 papers spanning neural speech decoding, harmful content detection, retrieval-augmented generation, legal NLP, multimodal document processing, and LLM interpretability. Topics include end-to-end intracortical speech decoding without external language models, dialect bias in language models, long-context memory diagnostics, and medical reasoning in Hindi.

End-to-End Intracortical Speech Decoding from Neural Activity
arXiv cs.CL · Owais Mujtaba Khanday, Jose A. Gonzalez-Lopez, Marc Ouellet, Alberto Galdon, Gonzalo Olivares Granados · 22d
Distinguishing Right from Wrong in Debates: Attribution Analysis of Chinese Harmful Memes
arXiv cs.CL · Weiming Wang, Junyu Lu, Han Wang, Xiaokun Zhang, Zewen Bai, Bo Xu, Liang Yang, Hongfei Lin · 22d
How Much Structure Do LLMs Need? Evaluating LLMs for Bibliometric Cluster Description
arXiv cs.CL · Abraham Camelo-Guerrero, Jairo Diaz-Rodriguez · 22d
Structure-Aware RAG: Structured Retrieval Augmented Generation from Noisy Data for Conversational Agents
arXiv cs.CL · Kaiqiao Han, LuAn Tang, Renliang Sun, Peng Yuan, Wei Cheng, Haoyu Wang, Wei Wang, Yizhou Sun, Haifeng Chen · 22d
Side-by-side Comparison Amplifies Dialect Bias in Language Models
arXiv cs.CL · Kritee Kondapally, Claire J. Smerdon, Pooja C. Patel, Ogheneyoma Akoni, Jevon Torres, Jaspreet Ranjit, Matthew Finlayson, Swabha Swayamdipta · 22d
SEAL: Synergistic Co-Evolution of Agents and Learning Environments
arXiv cs.CL · Yihao Hu, Zhihao Wen, Xiujin Liu, Pan Wang, Xin Zhang, Wei Wu · 22d
Found in Conversation: LLMs Teach Themselves to Close the Multi-Turn Gap
arXiv cs.CL · Tianlang Chen, Shirley Wu, Jure Leskovec · 22d
Phonetic Modeling of Dialectal Variation in Vietnamese Speech
arXiv cs.CL · Quan Ngoc Hoang, Long Hoang Huu Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen · 22d
Temporal Concept Drift in Legal Judgment Prediction: Neural Baselines Across Three Epochs of Ukrainian Court Decisions
arXiv cs.CL · Volodymyr Ovcharov · 22d
Decompose-and-Refine: Structured Legal Question Answering with Parametric Retrieval
arXiv cs.CL · Jihyung Lee, Hyounghun Kim, Gary Lee · 22d
Grammatically-Guided Sparse Attention for Efficient and Interpretable Transformers
arXiv cs.CL · Spandan Pratyush · 22d
Unveil: Unified Visual-Textual Integration and Distillation for Multi-modal Document Retrieval
arXiv cs.CL · Hao Sun, Yingyan Hou, Jiayan Guo, Bo Wang, Chunyu Yang, Jinsong Ni, Yan Zhang · 22d
Generating Legal Commentaries from Case Databases via Retrieval, Clustering, and Generation
arXiv cs.CL · Max Prior, Niklas Wais, Matthias Grabmair · 22d
AstroMind: A High-Fidelity Benchmark for Spacecraft Behavior Reasoning Based on Large Language Models
arXiv cs.CL · Hao Liu, Siyuan Yang, Qinglei Hu, Dongyu Li · 22d
WhenLoss: Diagnosing Write and Retrieval Bottlenecks in Long-Context Memory Systems
arXiv cs.CL · Jiangnan Yu, Kisson Songqi Lin, Jilong Wu · 22d
Word Class Representations Spontaneously Emerge from Successor Representations Trained on Natural Language
arXiv cs.CL · Mathis Immertreu, Achim Schilling, Thomas Kinfe, Patrick Krauss · 22d
CSP-Atlas: Concept-Specific Neural Circuits in a Sparse Python Transformer
arXiv cs.CL · Piotr Wilam · 22d
Guarded Repair for Harm-Aware Post-hoc Replacement of LLM Mathematical Reasoning
arXiv cs.CL · Haizhou Xia · 22d
Measuring the Depth of LLM Unlearning via Activation Patching
arXiv cs.CL · Jaeung Lee, Dohyun Kim, Jaemin Jo · 22d
HiMed: Incentivizing Hindi Reasoning in Medical LLMs
arXiv cs.CL · Dingfeng Jiang, Han Yan, Chenze Ma, Amit Kumar Jaiswal, Ang Li, Yunxiang Jiang, Xinlei Xiong, Juhao Liang, Hongru Xiao, Xiang Li, Fan Bu, Jiale Han, Ruchir Gupta, Prayag Tiwari, Benyou Wang · 22d
Know You Before You Speak: User-State Modeling for LLM Personalization in Multi-Turn Conversation
arXiv cs.CL · Jiani Luo, Xiaoyan Zhao, Yang Zhang, Shuyi Miao, Bingbing Xu, Stefan Konigorski, Tat-Seng Chua · 22d
Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs
arXiv cs.CL · Bo Li, Tianyu Dong, Shaolin Zhu, Deyi Xiong · 22d
CP-Agent: A Calibrated Risk-Controlled Agent for Feedback-Driven Competitive Programming
arXiv cs.CL · Peisong Wang, Bowen Liu, Zehua Li, Yuyao Wang, Zhiwei Ma, Yuhan Li, Jia Li · 22d
The Path Matters: Learning a Token-Commitment Policy for Diffusion Language Models
arXiv cs.CL · Bohang Sun, Max Zhu, Francesco Caso, Jindong Gu, Junchi Yu, Philip Torr, Pietro Liò, Jialin Yu · 22d
TS-Skill: A Benchmark for Evaluating Analytical Skills in Time-Series Question Answering
arXiv cs.CL · Liying Han, Kang Yang, Oliver Wang, Jason Wu, Pengrui Quan, Gaofeng Dong, Ozan Baris Mulayim, Sizhe Ma, Yuyang Yuan, Dezhi Hong, Mario Berges, Mani Srivastava · 22d
The Tokenizer Tax Across 25 European Languages: Domain Invariance, Cross-Lingual Few-Shot Effects, and the Ukrainian Penalty
arXiv cs.CL · Volodymyr Ovcharov · 22d
World-State Transformations for Neuro-symbolic Interactive Storytelling
arXiv cs.CL · Santiago Góngora, Luis Chiruzzo, Gonzalo Méndez, Pablo Gervás · 22d
ROC Analysis for Evaluating Translation Quality Estimation Systems
arXiv cs.CL · Evelyn Y. Garland (Acta-Transphere), Carola F. Berger (CFB Scientific Translations LLC) · 22d
StepGap: A Hybrid NLI-LLM Checker for Step-Level Evidence-Gap Detection in Multi-Hop Question Answering
arXiv cs.CL · Yuelyu Ji, Zhuochun Li, Hui Ji, Daqing He · 22d