arXiv cs.CL papers June 11 2026

50 items · 1 source · updated 6d ago · trend 0

On June 11, 2026, 50 new papers appeared in arXiv's cs.CL listing, covering a range of topics in language models and NLP, including quality evaluation frameworks for decentralized inference, retrieval-augmented generation improvements, multilingual jailbreak detection, structured sequence generation, safety data extraction benchmarks, fine-tuning methods, biomedical reasoning, multimodal reasoning with process rewards, and multilingual safety evaluation.

PoQ-Judge: A Multi-Architecture Evaluation Framework for Cost-Aware Proof-of-Quality in Decentralized LLM Inference
arXiv cs.CL · Arther Tian, Alex Ding, Frank Chen, Simon Wu, Aaron Chan · 6d
The Structural Attention Tax: How Retrieval Format Hijacks In-Context Learning Independent of Content
arXiv cs.CL · Yuqi Zhang, Di Zhang · 6d
NightFeats @ MMU-RAGent NeurIPS 2025: A Context-Optimized Multi-Agent RAG System for the Text-to-Text Track
arXiv cs.CL · Quentin Fever, Naziha Aslam · 6d
Detecting AI-Generated Content on Social Media with Multi-modal Language Models
arXiv cs.CL · Chenyang Yang, Shen Yan, Yibo Yang, Litao Hu, Yuchen Liu, Yuan Zeng, Hanchao Yu, Yinan Zhu, Sumedha Singla, Brian Vanover, Huijun Qian, Zihao Wang, Fujun Liu, Aashu Singh, Jianyu Wang, Xuewen Zhang · 6d
One Jailbreak, Many Tongues: Learning Language-Insensitive Intention Representations for Multilingual Jailbreak Detection
arXiv cs.CL · Shuyu Jiang, Kaiyu Xu, Xingshu Chen, Hao Ren, Rui Tang, Yi Zhang, Tianwei Zhang, Hongwei Li · 6d
LatticeBridge: Rare-Event Sequential Inference for Faithful Structured Sequence Synthesis
arXiv cs.CL · Faruk Alpay, Bugra Kilictas · 6d
Benchmarking Large Language Models for Safety Data Extraction
arXiv cs.CL · Jonas Grill, Thomas Bayer, Sören Berlinger · 6d
Compatibility-Aware Dynamic Fine-Tuning for Large Language Models
arXiv cs.CL · Yucheng Zhou, Junwei Sheng, Qianning Wang, Jianbing Shen · 6d
BioDivergence: A Benchmark and Evaluation Framework for Hidden Contextual Contradictions in Biomedical Abstracts
arXiv cs.CL · Elias Hossain, Sanjeda Sara Jennifer, Sabera Akter Bushra, Niloofar Yousefi · 6d
ProcessThinker: Enhancing Multi-modal Large Language Models Reasoning via Rollout-based Process Reward
arXiv cs.CL · Jingpei Wu, Xiao Han, Weixiang Shen, Boer Zhang, Zifeng Ding, Volker Tresp · 6d
T2MM: An LLM Supported Architecture For Inquiry-Based Modeling
arXiv cs.CL · John Kos, Rudra Singh, Ashok Goel · 6d
Calibration Drift Under Reasoning: How Chain-of-Thought Budgets Induce Overconfidence in Large Language Models
arXiv cs.CL · Prakul Sunil Hiremath, Harshit R. Hiremath · 6d
EverydayGPT: Confidence-Gated Routing for Efficient and Safe Hybrid GPT-RAG Conversational QA
arXiv cs.CL · Jaspreet Singh Nahal · 6d
Beyond Compaction: Structured Context Eviction for Long-Horizon Agents
arXiv cs.CL · Andrew Semenov, Svyatoslav Dorofeev · 6d
Afrispeech Semantics: Evaluating Audio Semantic Reasoning in Spoken Language Models Across Domains and Accents
arXiv cs.CL · Chibuzor Okocha, Christan Grant · 6d
LifeSentence: Language models can encode human life course trajectories from longitudinal panel data
arXiv cs.CL · Samuel Liu, Muchen Xi, William Yeoh, Joshua J. Jackson · 6d
A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries
arXiv cs.CL · Dmitriy Kompaneets · 6d
Every Act Has Its Price: Compressed Moral Composition in Frontier LLMs
arXiv cs.CL · Weijia Zhang, Ruiqi Chen, Yunze Xiao, Weihao Xuan · 6d
Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite
arXiv cs.CL · Zhiyuan Cheng, Longying Lai · 6d
Schützen: Evaluating LLM Safety in Bulgarian and German Contexts
arXiv cs.CL · Kiril Georgiev, Yuxia Wang, Dimitar Iliyanov Dimitrov, Preslav Nakov, Ivan Koychev · 6d
When More Documents Hurt RAG: Mitigating Vector Search Dilution with Domain-Scoped, Model-Agnostic Retrieval
arXiv cs.CL · Nabaraj Subedi, Ahmed Abdelaty, Shivanand Venkanna Sheshappanavar · 6d
The Dynamics of Human and AI-Generated Language: How Semantics Fluctuates across Different Timescales
arXiv cs.CL · Han-Jen Chang, Yasir Çatal, Angelika Wolman, Agustín Ibáñez, David Smith, I-Wen Su, Kai-Yuan Cheng, Georg Northoff · 6d
When Probing Accuracy Saturates, Fragility Resolves: A Complementary Metric for LLM Pre-Training Analysis
arXiv cs.CL · Orion Reblitz-Richardson · 6d
Overcoming State Inertia in Full-Duplex Spoken Language Models via Activation Steering
arXiv cs.CL · Cheng-Kuang Chang, Kai-Wei Chang, Alexander H. Liu, James Glass · 6d
Small Experiments, Cheaper Decisions: A Case Study in Staged Promotion for Micro-Pretraining
arXiv cs.CL · Felipe Chavarro Polania · 6d
Scenario-based Probing and Steering Cultural Values in Large Language Models--Extended Version
arXiv cs.CL · Trung Duc Anh Dang, Tung Kieu, Sarah Masud · 6d
Context-Aware Multimodal Claim Verification in Spoken Dialogues
arXiv cs.CL · Chaewan Chun, Delvin Ce Zhang, Dongwon Lee · 6d
SOMA-SQL: Resolving Multi-Source Ambiguity in NL-to-SQL via Synthetic Log and Execution Probing
arXiv cs.CL · Sai Ashish Somayajula, Marianne Menglin Liu, Chuan Lei, Fjona Parllaku, Daniel Garcia, Rongguang Wang, Syed Fahad Allam Shah, Ankan Bansal, Sujeeth Bharadwaj, Tao Sheng, Sujith Ravi, Dan Roth · 6d
Agent Skill Evaluation and Evolution: Frameworks and Benchmarks
arXiv cs.CL · Kexin Ding, Yang Zhou, Can Jin, Feng Tong, Mu Zhou, Dimitris N. Metaxas · 6d
AI Coding Agents Can Reproduce Social Science Findings
arXiv cs.CL · Meysam Alizadeh, Mohsen Mosleh, Fabrizio Gilardi, Atoosa Kasirzadeh, Joshua Tucker · 6d
AI Coding Agents in Social Science: Methodologically Diverse, Empirically Consistent, Interpretively Vulnerable
arXiv cs.CL · Meysam Alizadeh, Fabrizio Gilardi, Mohsen Mosleh, Enkelejda Kasneci · 6d
APEX: Automated Prompt Engineering eXpert with Dynamic Data Selection
arXiv cs.CL · Fei Wang, Si Si, Cho-Jui Hsieh, Inderjit S. Dhillon · 6d
The Periodic Table of LLM Reasoning: A Structured Survey of Reasoning Paradigms, Methods, and Failure Modes
arXiv cs.CL · Avinash Anand, Mahisha Ramesh, Avni Mittal, Ashutosh Kumar, Erik Cambria, Zhengkui Wang, Timothy Liu, Aik Beng Ng, Simon See, Rajiv Ratn Shah · 6d
Hubs or Fringes: Pretraining Data Selection via Web Graph Centrality
arXiv cs.CL · Vedant Badoni, Danqi Chen, Xinyi Wang · 6d
When Roleplaying, Do Models Believe What They Say?
arXiv cs.CL · Benjamin Sturgeon, David Africa, Sid Black · 6d
SAGE: Answer-Conditioned Uncertainty Targets for Verbal Uncertainty Alignment
arXiv cs.CL · Kaiwen Shi, Zheyuan Zhang, Yanfang Ye · 6d
ISE: An Execution-Grounded Recipe for Multi-Turn OS-Agent Trajectories
arXiv cs.CL · Siyuan Luo, Nairong Zheng, Lin Zhou, Tiankuo Yao, Shengyou Yuan, Haojia Yu, Cong Pang, Jiapeng Luo, Lewei Lu · 6d
Measuring language complexity from hierarchical reuse of recurring patterns
arXiv cs.CL · Junyi Zhou, Rui Liu, Pengyu Liu, Yu Liu · 6d
Pretrained self-supervised speech models can recognize unseen consonants
arXiv cs.CL · Chihiro Taguchi, Éric Le Ferrand, Hirosi Nakagawa, Hitomi Ono, Kanji Kato, Emily Prud'hommeaux, David Chiang · 6d
Teaching Diffusion to Speculate Left-to-Right
arXiv cs.CL · Lexington Whalen, Yuki Ito, Ryo Sakamoto · 6d
When is Your LLM Steerable?
arXiv cs.CL · Chenrui Fan, Yize Cheng, Ming Li, Soheil Feizi, Tianyi Zhou · 6d
Multi-Agent Reasoning with Adaptive Worker Allocation for Stance Detection
arXiv cs.CL · Meysam Sabbaghan, Arman Zareian Jahromi, Doina Caragea · 6d
Evaluating Bias in Phoneme-Based Automatic Speech Recognition Systems: An Analysis of IPA Transcription Models
arXiv cs.CL · Catherine Bao, Maneesha Rani Saha, Neal Patwari · 6d
Improving Cross-Format Robustness in Language Models with Multi-Format Training
arXiv cs.CL · June M. Liu, Shaomian Zheng, He Cao, Dingnan Jin, Qing Cui, Jun Zhou · 6d
Can AI Reason Like an Urban Planner? Benchmarking Large Language Models Against Professional Judgment
arXiv cs.CL · Yijie Deng, He Zhu, Wen Wang, Junyou Su, Minxin Chen, Wenjia Zhang · 6d
UR-BERT: Scaling Text Encoders for Massively Multilingual TTS Through Universal Romanization and Speech Token Prediction
arXiv cs.CL · Sangmin Lee, Eekgyun Ahn, Woongjib Choi, Hong-Goo Kang · 6d
Layer-Isolated Evaluation: Gating the Deterministic Scaffold of a Production LLM Agent with a No-LLM, Regression-Locked Test Harness
arXiv cs.CL · Sawyer Zhang, Alexander Wang, Sophie Lei · 6d
Goal-Autopilot: A Verifiable Anti-Fabrication Firewall for Unattended Long-Horizon Agents
arXiv cs.CL · Youwang Deng · 6d
Substrate Asymmetry in User-Side Memory: A Diagnostic Framework
arXiv cs.CL · Youwang Deng · 6d
Hey Chat, Can You Teach Me? Structuring Socratic Dialogue for Human Learning in the Wild
arXiv cs.CL · Sidney Tio, Arunesh Sinha, Pradeep Varakantham · 6d