arXiv cs.CL papers May 26 2026

29 items ● 1 source ● updated 22d ago ● trend 0

┌─ summary ─────────────────────────────┐

This feed lists 29 papers posted to arXiv's Computation and Language (cs.CL) section on May 26, 2026. They span neural speech decoding, harmful content detection, retrieval-augmented generation, legal NLP, multimodal document retrieval, and LLM interpretability. Topics include end-to-end intracortical speech decoding from neural activity, dialect bias in language models, long-context memory diagnostics, and medical reasoning in Hindi.

┌─ items (29) ──────────────────────────┐

End-to-End Intracortical Speech Decoding from Neural Activity

arXiv cs.CL · Owais Mujtaba Khanday, Jose A. Gonzalez-Lopez, Marc Ouellet, Alberto Galdon, Gonzalo Olivares Granados · 22d

Distinguishing Right from Wrong in Debates: Attribution Analysis of Chinese Harmful Memes

arXiv cs.CL · Weiming Wang, Junyu Lu, Han Wang, Xiaokun Zhang, Zewen Bai, Bo Xu, Liang Yang, Hongfei Lin · 22d

How Much Structure Do LLMs Need? Evaluating LLMs for Bibliometric Cluster Description

arXiv cs.CL · Abraham Camelo-Guerrero, Jairo Diaz-Rodriguez · 22d

Structure-Aware RAG: Structured Retrieval Augmented Generation from Noisy Data for Conversational Agents

arXiv cs.CL · Kaiqiao Han, LuAn Tang, Renliang Sun, Peng Yuan, Wei Cheng, Haoyu Wang, Wei Wang, Yizhou Sun, Haifeng Chen · 22d

Side-by-side Comparison Amplifies Dialect Bias in Language Models

arXiv cs.CL · Kritee Kondapally, Claire J. Smerdon, Pooja C. Patel, Ogheneyoma Akoni, Jevon Torres, Jaspreet Ranjit, Matthew Finlayson, Swabha Swayamdipta · 22d

SEAL: Synergistic Co-Evolution of Agents and Learning Environments

arXiv cs.CL · Yihao Hu, Zhihao Wen, Xiujin Liu, Pan Wang, Xin Zhang, Wei Wu · 22d

Found in Conversation: LLMs Teach Themselves to Close the Multi-Turn Gap

arXiv cs.CL · Tianlang Chen, Shirley Wu, Jure Leskovec · 22d

Phonetic Modeling of Dialectal Variation in Vietnamese Speech

arXiv cs.CL · Quan Ngoc Hoang, Long Hoang Huu Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen · 22d

Temporal Concept Drift in Legal Judgment Prediction: Neural Baselines Across Three Epochs of Ukrainian Court Decisions

arXiv cs.CL · Volodymyr Ovcharov · 22d

Decompose-and-Refine: Structured Legal Question Answering with Parametric Retrieval

arXiv cs.CL · Jihyung Lee, Hyounghun Kim, Gary Lee · 22d

Grammatically-Guided Sparse Attention for Efficient and Interpretable Transformers

arXiv cs.CL · Spandan Pratyush · 22d

Unveil: Unified Visual-Textual Integration and Distillation for Multi-modal Document Retrieval

arXiv cs.CL · Hao Sun, Yingyan Hou, Jiayan Guo, Bo Wang, Chunyu Yang, Jinsong Ni, Yan Zhang · 22d

Generating Legal Commentaries from Case Databases via Retrieval, Clustering, and Generation

arXiv cs.CL · Max Prior, Niklas Wais, Matthias Grabmair · 22d

AstroMind: A High-Fidelity Benchmark for Spacecraft Behavior Reasoning Based on Large Language Models

arXiv cs.CL · Hao Liu, Siyuan Yang, Qinglei Hu, Dongyu Li · 22d

WhenLoss: Diagnosing Write and Retrieval Bottlenecks in Long-Context Memory Systems

arXiv cs.CL · Jiangnan Yu, Kisson Songqi Lin, Jilong Wu · 22d

Word Class Representations Spontaneously Emerge from Successor Representations Trained on Natural Language

arXiv cs.CL · Mathis Immertreu, Achim Schilling, Thomas Kinfe, Patrick Krauss · 22d

CSP-Atlas: Concept-Specific Neural Circuits in a Sparse Python Transformer

arXiv cs.CL · Piotr Wilam · 22d

Guarded Repair for Harm-Aware Post-hoc Replacement of LLM Mathematical Reasoning

arXiv cs.CL · Haizhou Xia · 22d

Measuring the Depth of LLM Unlearning via Activation Patching

arXiv cs.CL · Jaeung Lee, Dohyun Kim, Jaemin Jo · 22d

HiMed: Incentivizing Hindi Reasoning in Medical LLMs

arXiv cs.CL · Dingfeng Jiang, Han Yan, Chenze Ma, Amit Kumar Jaiswal, Ang Li, Yunxiang Jiang, Xinlei Xiong, Juhao Liang, Hongru Xiao, Xiang Li, Fan Bu, Jiale Han, Ruchir Gupta, Prayag Tiwari, Benyou Wang · 22d

Know You Before You Speak: User-State Modeling for LLM Personalization in Multi-Turn Conversation

arXiv cs.CL · Jiani Luo, Xiaoyan Zhao, Yang Zhang, Shuyi Miao, Bingbing Xu, Stefan Konigorski, Tat-Seng Chua · 22d

Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs

arXiv cs.CL · Bo Li, Tianyu Dong, Shaolin Zhu, Deyi Xiong · 22d

CP-Agent: A Calibrated Risk-Controlled Agent for Feedback-Driven Competitive Programming

arXiv cs.CL · Peisong Wang, Bowen Liu, Zehua Li, Yuyao Wang, Zhiwei Ma, Yuhan Li, Jia Li · 22d

The Path Matters: Learning a Token-Commitment Policy for Diffusion Language Models

arXiv cs.CL · Bohang Sun, Max Zhu, Francesco Caso, Jindong Gu, Junchi Yu, Philip Torr, Pietro Liò, Jialin Yu · 22d

TS-Skill: A Benchmark for Evaluating Analytical Skills in Time-Series Question Answering

arXiv cs.CL · Liying Han, Kang Yang, Oliver Wang, Jason Wu, Pengrui Quan, Gaofeng Dong, Ozan Baris Mulayim, Sizhe Ma, Yuyang Yuan, Dezhi Hong, Mario Berges, Mani Srivastava · 22d

The Tokenizer Tax Across 25 European Languages: Domain Invariance, Cross-Lingual Few-Shot Effects, and the Ukrainian Penalty

arXiv cs.CL · Volodymyr Ovcharov · 22d

World-State Transformations for Neuro-symbolic Interactive Storytelling

arXiv cs.CL · Santiago Góngora, Luis Chiruzzo, Gonzalo Méndez, Pablo Gervás · 22d

ROC Analysis for Evaluating Translation Quality Estimation Systems

arXiv cs.CL · Evelyn Y. Garland (Acta-Transphere), Carola F. Berger (CFB Scientific Translations LLC) · 22d

StepGap: A Hybrid NLI-LLM Checker for Step-Level Evidence-Gap Detection in Multi-Hop Question Answering

arXiv cs.CL · Yuelyu Ji, Zhuochun Li, Hui Ji, Daqing He · 22d