arXiv cs.CL papers June 4 2026
Fifty papers posted to arXiv's cs.CL section on June 4, 2026 address challenges in language model reasoning, safety, and practical deployment. Topics include long-form generation, retrieval-augmented systems, AI-generated text detection, medical QA, memory management for conversational agents, and parameter-efficient fine-tuning.
- POLARIS uses GRPO with frontier LLM judges and human-reference injection to improve small models' long-form creative writing quality and length consistency
- Biomedical RAG study across 5 open-weight models (7B–72B), 10 QA datasets, and 4 retrieval methods finds only small, inconsistent improvements from retrieval
- LazyAttention defers positional encoding in KV caching to improve reusability for long-context RAG and in-context learning without expensive re-encoding
- LR-LoRA learns the adapter rank during training instead of using a fixed-rank constraint, enabling more flexible parameter-efficient fine-tuning
- DOSEBENCH introduces 81 OTC dosing scenarios requiring dose-timing tracking and 24-hour intake computation to evaluate LLM safety in medical QA
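The dose-timing tracking and 24-hour intake computation mentioned for DOSEBENCH can be illustrated with a small sketch. This is not the benchmark's code or scoring method. The function names, the dose log, and the limits (`max_daily_mg`, `min_interval_hours`) are all hypothetical placeholders chosen for illustration, not dosing guidance.

```python
from datetime import datetime, timedelta


def intake_last_24h(doses, now):
    """Sum the milligrams taken in the 24 hours up to and including `now`."""
    window_start = now - timedelta(hours=24)
    return sum(mg for taken_at, mg in doses if window_start < taken_at <= now)


def can_take_dose(doses, now, dose_mg, max_daily_mg, min_interval_hours):
    """Return True only if both the minimum spacing and the daily cap hold."""
    past = [taken_at for taken_at, _ in doses if taken_at <= now]
    # Spacing check: the most recent dose must be far enough in the past.
    if past and now - max(past) < timedelta(hours=min_interval_hours):
        return False
    # Cap check: the rolling 24-hour total plus the new dose must not exceed the cap.
    return intake_last_24h(doses, now) + dose_mg <= max_daily_mg


# Hypothetical log of (time taken, milligrams); all values are illustrative.
log = [
    (datetime(2026, 6, 3, 8, 0), 500),
    (datetime(2026, 6, 3, 14, 0), 500),
    (datetime(2026, 6, 3, 20, 0), 500),
]
now = datetime(2026, 6, 4, 7, 0)
total = intake_last_24h(log, now)  # 1500: all three doses fall inside the window
allowed = can_take_dose(
    log, now, dose_mg=500, max_daily_mg=2000, min_interval_hours=6
)  # True: last dose was 11 hours ago and 1500 + 500 does not exceed 2000
```

The window is rolling rather than calendar-based, so a dose taken at 08:00 the previous day stops counting at 08:00 today. That kind of temporal bookkeeping is what a question like "can I take another dose?" requires.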
POLARIS: Guiding Small Models to Write Long Stories
Discourse-Role Labels as Presentation-Time Variables for Context Use in Language Models
Computational conceptual history of scientific concepts: From early digital methods to LLMs
SaliMory: Orchestrating Cognitive Memory for Conversational Agents
When Retrieval Doesn't Help: A Large-Scale Study of Biomedical RAG
Expert-Aware Refusal Steering
A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models
ACAT: A Collaborative Platform for Efficient Aspect-Based Sentiment Dataset Annotation
Cross-Prompt Generalization in Detecting AI-Generated Fake News Using Interpretable Linguistic Features
MM-BizRAG: Rethinking Multimodal Retrieval-Augmented Generation for General Purpose Enterprise Q&A
Supportive Token Revealing for Fast Diffusion Language Model Decoding
Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA
Long Live Fine-Tuning: Task-Specific Transformers Outperform Zero-Shot LLMs for Misinformation Response Classification on Reddit
Using Text-Based Causal Inference to Disentangle Factors Influencing Online Review Ratings
LazyAttention: Efficient Retrieval-Augmented Generation with Deferred Positional Encoding
Parameter-Efficient Fine-Tuning with Learnable Rank
Noisy memory encoding explains negative polarity illusions
Deliberate Evolution: Agentic Reasoning for Sample-Efficient Symbolic Regression with LLMs
GlossAssist -- A Tool to Simplify Corpus Creation and Study the Effect of NLP Models in Low-Resource Documentation Settings
DLLG: Dynamic Logit-Level Gating of LLM Experts
When Clients Stop Following: A Cognitive Conceptualization Diagram-driven Framework for Strategic Counseling
Read the Trace, Steer the Path: Trajectory-Aware Reinforcement Learning for Diffusion Language Models
MemoryDocDataSet: A Benchmark for Joint Conversational Memory and Long Document Reasoning
Listening to the Workforce: Measuring Construction Worker Safety Attitudes from Social Media Discourse Using LLMs
Stepwise Reasoning Enhancement for LLMs via External Subgraph Generation
SePO: Self-Evolving Prompt Agent for System Prompt Optimization
Learning What to Learn: Stage-Specific Data Sets for SFT-then-RL in Small Language Model Reasoning
Entity Binding Failures in Speech LLM Reasoning: Diagnosis and Chain-of-Thought Intervention
Off-Distribution Voices: Fanfiction Subgenres as Universal Vernacular Jailbreaks for Aligned LLMs
SANE: Schema-aware Natural-language Evaluation of Biological Data
Self-Evolving Deep Research via Joint Generation and Evaluation
SparDA: Sparse Decoupled Attention for Efficient Long-Context LLM Inference
GENEB: Why Genomic Models Are Hard to Compare
Dynamic Infilling Anchors for Format-Constrained Generation in Diffusion Large Language Models
LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling
Temporal Order Matters for Agentic Memory: Segment Trees for Long-Horizon Agents
Cartridges at Scale: Training Modular KV Caches over Large Document Collections
VCIFBench: Evaluating Complex Instruction Following for Video Understanding
Fine-grained Fragment Retrieval in Multi-modal Long-form Dialogues
A Systematic Evaluation of Positional Bias in Multi-Video Summarization with MLLMs
Hybrid Adversarial Defence for Natural Language Understanding Tasks
RAMPART: Registry-based Agentic Memory with Priority-Aware Runtime Transformation
CYGNET: Cypher Gate for Neural Execution Triage and Cost Containment
QO-Bench: Diagnosing Query-Operator-Preserving Retrieval over Typed Event Tuples
LifeSide: Benchmarking Agents as Lifelong Digital Companions
CRAFT: Cost-aware Refinement And Front-aware Tuning of Prompts
SMADE-IE: Sparse Multi-Agent Framework with Evidence-Driven Debate for Zero-Shot Information Extraction
DuDi: Dual-Signal Distillation with Cross-Lingual Verbalizer
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents
Query-based Cross-Modal Projector Bolstering Mamba Multimodal LLM