Phonetic Modeling of Dialectal Variation in Vietnamese Speech
Quan Ngoc Ho · 2026-05-26 · via cs.CL updates on arXiv.org

Abstract: Vietnamese exhibits substantial dialectal phonetic variation across Northern, Central, and Southern regions, where identical lexical items may be realized with markedly different pronunciations. Such variation poses challenges for automatic speech recognition (ASR) and remains difficult to model computationally because of the complex relationship between Vietnamese orthography and phonology. Existing approaches typically address dialect variability at the word level and assume dialect-invariant mappings between spelling and pronunciation, which limits their ability to capture systematic phonetic differences. We propose a dialect-aware phonetic framework that explicitly models Vietnamese phonological structure and dialectal variation at both the vocabulary and decoding levels. The framework introduces a phonetic vocabulary that decomposes each syllable into structured phonetic components and maps them to dialect-specific IPA representations, together with a phonetic-structure decoder that jointly predicts these components. Experiments on UIT-ViMD, the only available multi-dialect Vietnamese dataset, show that the proposed approach outperforms various pre-trained baselines and, notably, matches the performance of the strongest pretrained model, wav2vec2-base-vi-250h, across dialects while using substantially fewer parameters and no external pretraining. Code for reproducing the experiments will be made publicly available upon acceptance of this paper.
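As a rough illustration of what decomposing a syllable into structured components and mapping them to dialect-specific IPA could look like, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say which components the framework uses, so the onset/rhyme/tone split, the function names `decompose` and `realize`, and the small `ONSET_IPA` and `TONE_MERGE` tables are all assumptions made for illustration. The tables are deliberately simplified textbook-style correspondences, cover only Northern and Southern onsets, and leave the rhyme in orthographic form.

```python
import unicodedata

# Combining marks that encode Vietnamese tones; no mark means the level tone.
TONE_MARKS = {
    "\u0300": "huyen",  # grave
    "\u0301": "sac",    # acute
    "\u0309": "hoi",    # hook above
    "\u0303": "nga",    # tilde
    "\u0323": "nang",   # dot below
}

# Orthographic onsets, longest first so that "ngh" wins over "ng" and "n".
ONSETS = sorted(
    ["ngh", "ng", "nh", "gh", "gi", "kh", "ph", "th", "tr", "ch", "qu",
     "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "r", "s",
     "t", "v", "x"],
    key=len,
    reverse=True,
)

# ASSUMPTION: simplified, illustrative onset-to-IPA table, not the paper's.
ONSET_IPA = {
    "northern": {"tr": "tɕ", "s": "s", "r": "z", "d": "z", "gi": "z"},
    "southern": {"tr": "ʈ", "s": "ʂ", "r": "r", "d": "j", "gi": "j"},
}

# ASSUMPTION: models only the commonly described Southern hoi/nga merger.
TONE_MERGE = {"northern": {}, "southern": {"nga": "hoi"}}


def decompose(syllable):
    """Split one orthographic syllable into onset, rhyme, and tone."""
    tone = "ngang"
    kept = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps vowel-quality marks (circumflex, breve, horn)
    base = unicodedata.normalize("NFC", "".join(kept))
    onset = next((o for o in ONSETS if base.startswith(o)), "")
    return {"onset": onset, "rhyme": base[len(onset):], "tone": tone}


def realize(syllable, dialect):
    """Return (onset, rhyme, tone) with dialect-specific onset and tone.

    Onsets missing from the illustrative table fall back to their spelling.
    """
    parts = decompose(syllable)
    onset = ONSET_IPA[dialect].get(parts["onset"], parts["onset"])
    tone = TONE_MERGE[dialect].get(parts["tone"], parts["tone"])
    return onset, parts["rhyme"], tone
```

Under these assumptions, `realize("trời", "northern")` and `realize("trời", "southern")` share the same rhyme and tone but differ in the onset symbol, which is the kind of systematic difference a word-level vocabulary with a single spelling-to-sound mapping cannot express.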
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2605.24451 [cs.CL]
  (or arXiv:2605.24451v1 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2605.24451

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Nghia Hieu Nguyen
[v1] Sat, 23 May 2026 08:00:26 UTC (917 KB)