arXiv cs.LG papers May 27 2026
On May 27, 2026, 15 new machine learning papers were posted to arXiv's cs.LG category, covering topics ranging from data curation and quantization for large language models to anomaly detection, federated learning, weather forecasting, and medical signal classification. The papers address practical deployment challenges including output constraints for small models, low-bit quantization efficiency, and contamination auditing in foundation models.
- GEM reformulates LLM data curation as a variational problem on the hypersphere using Minorize-Maximize optimization to address embedding anisotropy.
- AirCast-SR downscales global weather forecasts from 0.25-degree (~28 km) to 1 km resolution using latent consistency diffusion for 67-hour predictions.
- InfoQuant shapes activation distributions for low-bit LLM quantization by matching distributions to uniform quantizers rather than just suppressing outliers.
- ARBITER clusters reasoning trajectories into basins to identify wrong-majority failures in test-time sampling, where correct answers are outvoted.
- TSFMAudit introduces the first pretraining contamination auditing framework for time series foundation models to detect evaluation dataset exposure.
- HRVConformer uses a hybrid convolution-Transformer architecture to classify neonatal hypoxic-ischemic encephalopathy directly from raw heart rate signals.
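The "~28 km" figure quoted for the 0.25-degree grid in the AirCast-SR item can be checked with simple spherical arithmetic. The snippet below is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km; the function name `grid_spacing_km` is illustrative and does not come from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def grid_spacing_km(degrees: float, latitude_deg: float = 0.0) -> float:
    """Approximate east-west width of a lat/lon grid cell at a given latitude."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360.0  # about 111.2 km
    return degrees * km_per_degree * math.cos(math.radians(latitude_deg))


coarse_km = grid_spacing_km(0.25)  # spacing at the equator
print(round(coarse_km, 1))         # 27.8
print(round(coarse_km / 1.0))      # 28, linear refinement factor to a 1 km grid
```

The east-west spacing shrinks with the cosine of latitude, so the same 0.25-degree cell is narrower away from the equator; the ~28 km value is the equatorial figure.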
Source: blog/rss (15 items)
GEM: Geometric Entropy Mixing for Optimal LLM Data Curation
The Constraint Tax: Measuring Validity-Correctness Tradeoffs in Structured Outputs for Small Language Models
AirCast-SR: A Foundation Model for Kilometer-Scale Atmospheric Super-Resolution via Latent Consistency Diffusion
SilIF: Silhouette-Augmented Isolation Forest for Unsupervised Transaction Fraud Detection
Neural Bayesian Sequential Routing
TSFMAudit: Data Contamination Auditing in Forecasting Time Series Foundation Models
On the Push-Based Asynchronous Federated Learning: A Bias-Correction Aggregation Approach
Planning Neural Dynamics with Lie Group Embedding through Supervised Projective Manifold Learning
When Rule Violations Are Rare: Chimera Training for Logical Anomaly Detection
ARBITER: Reasoning Trajectory Basins and Majority Vote Failures in Test-Time Sampling
InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
GAC: Noise-Aware Adaptive Mixing for Hybrid SFT-RL Post-Training
Max-Window Scale Estimation for Near-Lossless HiF8 W8A8 Quantization-Aware Training
HRVConformer: Neonatal Hypoxic-Ischemic Encephalopathy Classification from the Heart Rate signals
Modeling Dynamic Mixtures of Time-Delay Systems from Streaming Time Series