arXiv cs.CL papers June 2 2026

50 items · 1 source · updated 15d ago · trend 0

The June 2, 2026 batch from arXiv's cs.CL section lists 50 papers. Topics include dialogue parsing, LLM robustness, AI-generated text detection, Chinese grammar correction, speculative decoding, humor generation, language diffusion models, medical LLM safety, knowledge-grounded generation, financial sentiment analysis, self-supervised learning, LLM evaluation metrics, legal document processing, AI disclosure, fake news detection, knowledge base question answering, and LLM effects on student writing, among others.

  • DraDDP: first multimodal multi-party dialogue discourse parsing dataset, with 495 segments, 6,374 utterances, and 9.1 hours of video from TV dramas
  • CSRP framework for Chinese grammatical error correction uses continual pre-training on 5.9M samples plus reinforcement learning with efficiency-aware rewards
  • TrustLDM benchmark evaluates safety, privacy, and fairness across language diffusion model architectures
  • RealityTest: large-scale multimodal multilingual benchmark testing whether AI systems disclose their identity when asked
  • BOUTEF: multilingual corpus for fake news in North Africa covering Algeria and Tunisia with fake narratives, genuine narratives, and debunking information
DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset
arXiv cs.CL · Shannan Liu, Peifeng Li, Yaxin Fan, Qiaoming Zhu · 15d
Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval
arXiv cs.CL · Hao Xu, Rite Bo, Fausto Giunchiglia, Yingji Li, Rui Song · 15d
AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection
arXiv cs.CL · Aria Nourbakhsh, Adelaide Danilov, Christoph Schommer, Salima Lamsiyah · 15d
CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards
arXiv cs.CL · Wei Tian, Yuhao Zhou, Man Lan · 15d
SENSE: Semantic Embedding Navigation with Soft-gated Evaluation for Retrieval-based Speculative Decoding
arXiv cs.CL · Shaowen Chen, Zhicheng Liao, Hongwei Wang · 15d
lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation
arXiv cs.CL · Alexey Tikhonov, Alexey Ivanov · 15d
TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models
arXiv cs.CL · Yichuan Mo, Yukun Jiang, Yanbo Shi, Mingjie Li, Michael Backes, Yang Zhang, Yisen Wang · 15d
ART: Attention Run-time Termination for Efficient Large Language Model Decoding
arXiv cs.CL · Chen Qiu, Guozhong Li, Panos Kalnis · 15d
Cognitive-Linguistic Indicators of Depression in Online Communities: Analysed by DistilBERT and Holographic Reduced Representation
arXiv cs.CL · Brian Van Steen · 15d
A Multi-Domain Red Teaming Framework for Safety, Robustness, and Fairness Evaluation of Medical Large Language Models
arXiv cs.CL · Andrei Marian Feier, Veysel Kocaman, Yigit Gul, Ahmet Korkmaz, Alexander Thomas, Aleksei Zakharov, Jay Gil, Mehmet Butgul, David Talby · 15d
TCAR-Gen: Temporal Graph Retrieval with Evidence Fusion for Knowledge-Grounded Generation
arXiv cs.CL · Sidra Nasir, Muhammad Noman Zahid, Rizwan Ahmed Khan · 15d
LLMs for Cardiovascular Risk Prediction from Structured Clinical Data
arXiv cs.CL · Jeba Maliha, Md Rafiul Kabir · 15d
Graph-Augmented Retrieval for Cross-Entity Financial Sentiment Analysis: A Comparative Study
arXiv cs.CL · Rajan Bastakoti, Sagar Bhetwal, Nirajan Acharya, Gaurav Kumar Gupta · 15d
DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models
arXiv cs.CL · Sangdae Nam · 15d
Agreement Metrics for LLM-as-Judge Evaluation: What to Report and Why
arXiv cs.CL · Delip Rao, Chris Callison-Burch · 15d
Enhancing BiGRU with a KAN Block for Legal Document Classification and Summarization
arXiv cs.CL · Ahmed Faizul Haque Dhrubo, Souvik Pramanik, Most. Aysha Siddika Sumona, Shahnewaz Siddique, Mohammad Ashrafuzzaman Khan, Mohammad Abdul Qayum, Mohsin Sajjad · 15d
RealityTest: How People Probe AI Identity and Whether Models Disclose It
arXiv cs.CL · Anna Gausen, Sarenne Wallbridge, Bessie O'Dell, Christopher Summerfield, Hannah Rose Kirk · 15d
BOUTEF: A Multilingual Corpus for FakeNews in North Africa -- Language as a Weapon
arXiv cs.CL · Kamel Smaili, Yassine Toughrai, Amina Laggoun, David Langlois · 15d
DeSQ: Decomposition-based SPARQL Query Generation
arXiv cs.CL · Papa Abdou Karim Karou Diallo, Aditya Sharma, Neshat Elhami Fard, Amal Zouaq · 15d
Effects of Varying LLM Access on Essay Writing Behavior
arXiv cs.CL · Julia Christenson, Karin de Langis, Shirley Anugrah Hayati, Dongyeop Kang · 15d
Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models
arXiv cs.CL · Sanchit Ahuja, Terra Blevins · 15d
Model-Based Quality Assessment for Massively Multilingual Parallel Data
arXiv cs.CL · Abdelaziz M. A. Ibrahim, Zihao Li, Jörg Tiedemann, Shaoxiong Ji · 15d
Uncovering Temporal Framing in the News
arXiv cs.CL · Tarek Mahmoud, Veronika Solopova, Premtim Sahitaj, Ariana Sahitaj, Max Upravitelev, Mervat Abassy, Hana Fatima Shaikh, Neda Foroutan, Vera Schmitt, Preslav Nakov · 15d
Bridging Reasoning Trajectories in On-Policy Distillation via Near-Future Guidance
arXiv cs.CL · Yuxuan Jiang, Francis Ferraro · 15d
Which Institutional Frameworks Do Chatbots Assume? Auditing Jurisdictional Defaults in Multilingual LLMs
arXiv cs.CL · Zhizhi Wang, Harini Suresh · 15d
Isolating LLM Lexical Bias: A Curation-Free Triangulated Metric for Preference-Stage Learning
arXiv cs.CL · Xiaoyang Ming, Jose Hernandez, Thomas Stephan Juzek · 15d
How Far Do Auto-Interpretation Labels Generalize: A Controlled Study Across Languages, Scripts, and Rewordings
arXiv cs.CL · Sripad Karne · 15d
Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism
arXiv cs.CL · Haoxiang Zhang, Qixin Xu, Zhuofeng Li, Lei Zhang, Pengcheng Jiang, Yu Zhang, Julian McAuley · 15d
ProtStructQA: A Denotation Threshold in Protein Structural Reasoning
arXiv cs.CL · Aravind Mandiga, Guoming Li, Jin Lu, Ismailcem Budak Arpinar, Khaled Rasheed, Samuel E. Aggrey · 15d
SALSA: Speech Aware LLM Adaptation via Learned Steering Activation Vectors
arXiv cs.CL · Yekaterina Yegorova, Argyrios Gerogiannis, Haolong Zheng, Julia Hockenmaier, Chang D. Yoo, Mark A. Hasegawa-Johnson · 15d
Short-form Text Rewriting with Phi Silica
arXiv cs.CL · Divya Tadimeti, Shawn Pan, Sameera Lanka, Chenghui Zhou, Sadid Hasan · 15d
On the Limits of LLM Adaptability: Impact of Model-Internalized Priors on Annotation Task Performance
arXiv cs.CL · Etienne Casanova, Rafal Kocielnik, R. Michael Alvarez · 15d
Do Text Edits Generalize to Visual Generation? Benchmarking Cross-Modal Knowledge Editing in UMMs
arXiv cs.CL · Xin Gao, Cheng Yang, Chufan Shi, Taylor Berg-Kirkpatrick · 15d
LaSR: Context-Aware Speech Recognition via Latent Reasoning
arXiv cs.CL · Heyang Liu, Ziyang Cheng, Jiayi Huang, Wenyang Xiao, Ronghua Wu, Qunshan Gu, Yanfeng Wang, Yu Wang · 15d
Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning
arXiv cs.CL · Chishui Chen, Jiaye Lin, Te Sun, Junxi Wang, Yi Yang, Cong Qin, Yangen Hu, Lu Pan, Ke Zeng · 15d
ProactiveLLM: Learning Active Interaction for Streaming Large Language Models
arXiv cs.CL · Junlong Tong, Yao Zhang, Anhao Zhao, Yingqi Fan, Yunpu Ma, Xiaoyu Shen · 15d
Learning to Retrieve: Dual-Level Long-Term Memory for Text-to-SQL Agents
arXiv cs.CL · Yibo Wang, Nikki Lijing Kuang, Philip S. Yu, Zhewei Yao, Yuxiong He · 15d
Revisiting Parameter-Based Knowledge Editing in Large Language Models: Theoretical Limits and Empirical Evidence
arXiv cs.CL · Wanying Ren, Xin Song, Futing Wang, Guoxiu He, Aixin Sun · 15d
Sandboxed Coding Agents are Competitive Omni-modal Task Solvers
arXiv cs.CL · Dongping Chen, Xuanao Huang, Zhihan Hu, Qingyuan Shi, Dianqi Li, Tianyi Zhou · 15d
SPADER: Step-wise Peer Advantage with Diversity-Aware Exploration Rewards for Multi-Answer Question Answering
arXiv cs.CL · Qiming Shi, Zhaolu Kang, Yunfan Zhou, Di Weng, Yingcai Wu · 15d
Toward Responsible and Epistemically Grounded Multilingual LLMs for Computational Social Science and Humanities
arXiv cs.CL · Wajdi Zaghouani · 15d
Linguistics-Aware Non-Distortionary LLM Watermarking
arXiv cs.CL · Shinwoo Park, Hyejin Park, Hyeseon An, Yo-Sub Han · 15d
MemPro: Agentic Memory Systems as Evolvable Programs
arXiv cs.CL · Qingshan Liu, Guoqing Wang, Wen Wu, Jingqi Huang, Xinqi Tao, Dejia Song, Jie Zhou, Liang He · 15d
Robust Reasoning via Dynamic Token Selection for Distribution-Aligned Self-Distillation
arXiv cs.CL · Ruiqi Zhang, Lingxiang Wang, Hainan Zhang, Zhiming Zheng · 15d
French parsing enhanced with a word clustering method based on a syntactic lexicon
arXiv cs.CL · Anthony Sigogne, Matthieu Constant, Eric Laporte · 15d
LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification
arXiv cs.CL · Shefayat E Shams Adib, Ahmed Alfey Sani, Md Hasibur Rahman Alif, Ajwad Abrar · 15d
FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search
arXiv cs.CL · James Xu Zhao, Hui Chen, Bryan Hooi, See-Kiong Ng · 15d
OCC-RAG: Optimal Cognitive Core for Faithful Question Answering
arXiv cs.CL · Maksim Savkin, Mikhail Goncharov, Alexander Gambashidze, Alla Chepurova, Dmitrii Tarasov, Nikita Andriianov, Daria Pugacheva, Vasily Konovalov, Andrey Galichin, Ivan Oseledets · 15d
EPIC: Efficient and Parallel Inference under CFG Constraints for Diffusion Language Models
arXiv cs.CL · Hyundong Jin, Yo-Sub Han · 15d
WaveFilter: Enhancing the Long-Context Capability of Diffusion LLMs via Wavelet-Guided KV Cache Filtering
arXiv cs.CL · Jinnan Yang, Yan Wang, Zhen Bi, Kehao Wu, Xiaojie Li, Jungang Lou, Zechao Li, Jing Liu · 15d