Beyond Generative Priors: Minority Sampling with JEPA-Guided Diffusion
Sol Park, So · 2026-05-26 · via cs.AI updates on arXiv.org

Abstract: Minority sampling aims to generate low-density instances on a data manifold and is of central importance in applications such as medical diagnosis, anomaly detection, and creative AI. Existing approaches, however, define minority samples relative to generative priors learned from training data, confining rarity to model-specific notions that may poorly reflect real-world semantics. In this work, we propose a world-centric perspective on minority sampling, which defines rarity with respect to real-world priors rather than generator-induced densities. To this end, we introduce JEPA guidance, a diffusion sampling framework guided by a Joint-Embedding Predictive Architecture (JEPA) -- a class of world models that encode broad, semantically rich representations. JEPA guidance steers diffusion trajectories toward low-density regions under the implicit density induced by the JEPA, thereby aligning generated minorities with real-world semantic rarity. To make JEPA guidance computationally practical, we develop principled approximation strategies accompanied by theoretical error bounds, significantly reducing the overhead of guidance computation. Extensive experiments across unconditional, class-conditional, and text-to-image generation demonstrate that JEPA guidance consistently improves the fidelity and semantic validity of minority samples, outperforming generator-centric baselines in capturing real-world notions of rarity. Code is available at this https URL.
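
The idea of steering samples toward regions that are rare under an external density, rather than under the generator's own prior, can be illustrated with a toy sketch. The example below is not the paper's algorithm: it replaces the diffusion model with plain Langevin-style updates on a 1-D Gaussian mixture, and it replaces the JEPA-induced density with a hand-written mixture. All names (`mixture_score`, `guided_score`, `minority_fraction`), the mode locations, the weights, and the guidance scale are assumptions chosen for illustration only.

```python
import math
import random

# Toy 1-D setup. Every value here is an illustrative assumption, not from the paper.
MEANS = (-2.0, 2.0)          # two modes of the data
GEN_WEIGHTS = (0.5, 0.5)     # generator prior: both modes look equally likely
WORLD_WEIGHTS = (0.9, 0.1)   # stand-in "world" density: the right mode is rare


def mixture_score(x, weights, means=MEANS, sigma=1.0):
    """Return d/dx log p(x) for a 1-D Gaussian mixture with a shared sigma."""
    # The Gaussian normalising constant is shared by all components, so it cancels.
    dens = [w * math.exp(-0.5 * ((x - m) / sigma) ** 2)
            for w, m in zip(weights, means)]
    total = sum(dens)
    return sum(d * (m - x) for d, m in zip(dens, means)) / (total * sigma ** 2)


def guided_score(x, scale):
    """Generator score minus scale times the world score.

    Subtracting the world score pushes samples toward regions where the
    world density is low; scale = 0 recovers the unguided generator.
    """
    return mixture_score(x, GEN_WEIGHTS) - scale * mixture_score(x, WORLD_WEIGHTS)


def minority_fraction(scale, n=1000, steps=200, step_size=0.05, seed=0):
    """Run Langevin-style updates; return the share of samples ending at x > 0."""
    rng = random.Random(seed)
    noise_std = math.sqrt(2.0 * step_size)
    hits = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        for _ in range(steps):
            x += step_size * guided_score(x, scale) + noise_std * rng.gauss(0.0, 1.0)
        if x > 0:
            hits += 1
    return hits / n


if __name__ == "__main__":
    print("unguided share in the rare mode:", minority_fraction(0.0))
    print("guided share in the rare mode:  ", minority_fraction(0.5))
```

In this sketch the generator treats both modes symmetrically, so without guidance it has no preference between them. With a positive guidance scale the drift at the midpoint points toward the mode that the stand-in world density marks as rare. The scale is kept below 1 so that the generator term still pulls samples back toward the data modes.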
Comments: ICML 2026, 21 pages, 9 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2605.24631 [cs.LG]
  (or arXiv:2605.24631v1 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2605.24631

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Soobin Um
[v1] Sat, 23 May 2026 15:40:56 UTC (16,404 KB)