
Few-shot Adaptation to Distribution Shifts By Mixing Source and Target Embeddings
Yihao Xue, Ali Payani, Yu Yang, Baharan Mirzasoleiman · 2023-05-24 · via cs.CL updates on arXiv.org

Pretrained machine learning models need to be adapted to distribution shifts when deployed in new target environments. When obtaining labeled data from the target distribution is expensive, few-shot adaptation with only a few examples from the target distribution becomes essential. In this work, we propose MixPro, a lightweight and highly data-efficient approach for few-shot adaptation. MixPro first generates a relatively large dataset by mixing (linearly combining) pretrained embeddings of large source data with those of the few target examples. This process preserves important features of both source and target distributions, while mitigating the specific noise in the small target data. Then, it trains a linear classifier on the mixed embeddings to effectively adapt the model to the target distribution without overfitting the small target data. Theoretically, we demonstrate the advantages of MixPro over previous methods. Our experiments, conducted across various model architectures on 8 datasets featuring different types of distribution shifts, reveal that MixPro can outperform baselines by up to 7% with only 2-4 target examples.
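
The two steps described in the abstract (mixing embeddings, then fitting a linear classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes embeddings are already extracted as NumPy arrays, that each source embedding is paired with a randomly chosen target embedding of the same class, and that a single fixed mixing weight `s` is used; the abstract specifies none of these details. The function names `mix_embeddings`, `train_linear_classifier`, and `predict` are hypothetical, and the classifier is a plain softmax regression written in NumPy.

```python
import numpy as np


def mix_embeddings(src_emb, src_labels, tgt_emb, tgt_labels, s=0.5, seed=0):
    """Linearly combine each source embedding with a target embedding.

    Assumption of this sketch: the target partner is drawn at random from
    the target examples sharing the source example's label, and the mixed
    point is (1 - s) * source + s * target.
    """
    rng = np.random.default_rng(seed)
    mixed = np.empty_like(src_emb, dtype=float)
    for i, label in enumerate(src_labels):
        candidates = np.flatnonzero(tgt_labels == label)
        j = rng.choice(candidates)
        mixed[i] = (1.0 - s) * src_emb[i] + s * tgt_emb[j]
    # One mixed point per source example, so the labels carry over.
    return mixed, src_labels.copy()


def train_linear_classifier(x, y, num_classes, lr=0.05, steps=2000):
    """Fit a softmax-regression (linear) classifier by gradient descent."""
    n, d = x.shape
    w = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[y]
    for _ in range(steps):
        logits = x @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b


def predict(x, w, b):
    """Return the predicted class index for each row of x."""
    return np.argmax(x @ w + b, axis=1)
```

In use, `src_emb` would hold the embeddings of the large source dataset and `tgt_emb` those of the 2-4 labeled target examples per class; the classifier is trained only on the output of `mix_embeddings`. Setting `s=0` reduces to training on source embeddings alone, while `s=1` collapses every mixed point onto one of the few target embeddings, so intermediate values trade off between the two.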