惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

MongoDB | Blog
MongoDB | Blog
IT之家
IT之家
J
Java Code Geeks
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
Recent Announcements
Recent Announcements
博客园 - 三生石上(FineUI控件)
博客园_首页
MyScale Blog
MyScale Blog
腾讯CDC
I
InfoQ
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
人人都是产品经理
人人都是产品经理
Vercel News
Vercel News
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
量子位
爱范儿
爱范儿
U
Unit 42
aimingoo的专栏
aimingoo的专栏
B
Blog RSS Feed
云风的 BLOG
云风的 BLOG
M
MIT News - Artificial intelligence
A
About on SuperTechFans
T
The Blog of Author Tim Ferriss
Blog — PlanetScale
Blog — PlanetScale
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
Engineering at Meta
Engineering at Meta
博客园 - 叶小钗
小众软件
小众软件
Jina AI
Jina AI
Hugging Face - Blog
Hugging Face - Blog
Google DeepMind News
Google DeepMind News
The Cloudflare Blog
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
D
Docker
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
博客园 - 【当耐特】
博客园 - Franky
H
Help Net Security
Stack Overflow Blog
Stack Overflow Blog
阮一峰的网络日志
阮一峰的网络日志
C
Check Point Blog
C
CERT Recently Published Vulnerability Notes
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
Cisco Talos Blog
Cisco Talos Blog
H
Hackread – Cybersecurity News, Data Breaches, AI and More
I
Intezer
Latest news
Latest news
D
Darknet – Hacking Tools, Hacker News & Cyber Security
博客园 - 司徒正美
Microsoft Security Blog
Microsoft Security Blog

cs.CR updates on arXiv.org

A Security Analysis of Long-Horizon Agentic AI Systems: Threats, Evaluation, and Framework Development Is Your Agent Playing Dead? Deployed LLM Agents Exhibit Constraint-Evasive Fabrication and Thanatosis AutoDojo: Adaptive Attacks Expose Superficial Defenses and User-Underspecification Limits in LLM Agents Benign in Isolation, Harmful in Composition: Security Risks in Agent Skill Ecosystems Defending against Adaptive Prompt Injection Attacks via Reasoning-enabled Task Alignment CmdNeedle: Measuring the Incompleteness of Command Denylists for AI Agents FragFuse: Bypassing Access Control of Large Language Model Agents via Memory-Based Query Fragmentation and Fusion AnonShield: Scalable On-Premise Pseudonymization for CSIRT Vulnerability Data Odds Law: The Decomposition Algebra On How Intelligence Organizes Itself to Solve Difficult Problems Reliably Snyk VulnBench JS 1.0: Can LLMs Find the Same Bugs Twice? GAS-Leak-LLM: Genetic Algorithm-Based Suffix Optimization for Black-Box LLM Jailbreaking Let Them Steal: Trapping Large Language Model Extraction Attacks with Knowledge Honeypot SkillVetBench: LLM-as-Judge for Multi-Dimensional Security Risk Evaluation in Open-Source LLM Agent Skills MASCOT-Android: A Curated Dataset and Automated Collection Pipeline for Android Malware Source Code Specimens SPARK: Security Knowledge Priming and Representation-Guided Knowledge Activation for LLM-based Secure Code Generation The Proxy Knows Too Much: Sealing LLM API Routers with Attested TEEs Your "Pro" LLM Subscription May Actually Be "Free": Exposing Fingerprint Spoofing Risks in LLM Inference Services Censorship-Resistant Sealed-Bid Auctions on Blockchains Continual Backdoor Training in IoT/CPS Security Engineering of OpenClaw: Analyzing Attack Surface Expansion and Trust-Boundary Violations Semantic Integrity Failures in Document-to-LLM Supply Chains BT-MTD: Bus Traversal-based Moving Target Defense for Smart Grid Fuzzy PSI from Symmetric Primitives with Exact Logarithmic Dependence on Distance Threshold Data-Centric Benchmarking of Exploit Generation in LLMs: Understanding the Impact of Fine-Tuning VLALeaks: Membership Inference Attacks against Vision-Language-Action Models Robust and Precise Application Fingerprinting on 5G Physical Uplink Channel LLM: LSTM Look-Ahead Moving Target Defense Based on Historical Malicious Scan The Audit Gap in Blockchain Security: A Four-Year Empirical Study of Public Audit Findings and Real-World Exploit Incidents Multi-tier Differential Private Query Release FEnc$^2$: Unifying Data Packing for Efficient Private Inference via Convolution and Architecture-Aware Fragment Encoding did:crdt: Coordination-Free Decentralised Identifiers via Signed CRDTs FuseChain: Runtime Evidence Reconstruction for Software Supply-Chain Attacks Stickel-type key exchange with hidden subspaces New Ideas on a New Old Type of Cipher:The Mixed-Radix One-Time Pad The Anatomy of Scam Scenarios: Large-Scale Characterization and Conversation-Aware Detection Invisible Manipulation Channels in AI-Assisted Financial Advisory: Implications for Market Integrity and Regulatory Design Scalable Malware Family Classification Using Quantum Kernel Based Machine Learning Dynamic Malicious Skills in Agentic AI From Refusal Geometry to Safety Geometry: Harmfulness--Refusal Coupling under Dynamic Adversarial Fine-Tuning MIPSBLEED: Uncovering Microarchitectural Timing Leaks in Pervasive Embedded Processors MPX: A Unified Systolic Array for Matrix and Polynomial Multiplication Transferable Self-Evolving Playbooks for Agentic Security Auditing A Formal Resilience Framework for Cyber-Physical Embodied Systems under Device-Level Cyberattacks Measurement Study of Post-Quantum Readiness of Internet: 2026 A data-driven security quantification framework for IoT-based systems SoK: Taxonomizing the Low-Level Attack Surface of Modern Web Browsers From Third-Party to First-Party: Measuring and Protecting Against Modern Web Tracking Mechanisms
AttackonCTF: Defending Hardware Security Competition Benchmarks in the Age of LLMs
[Submitted on 14 Jun 2026] · 2026-06-16 · via cs.CR updates on arXiv.org

View PDF HTML (experimental)

Abstract:Hardware security competitions such as HackTheSilicon serve as benchmarking platforms for evaluating vulnerability detection methods and for training humans and AI. However, our study reveals that LLMs threaten their validity. Instead of genuine security reasoning, detectors exploit a diff-style syntactic comparison, achieving an 83% detection rate, undermining fair evaluation. To mitigate this, we propose the first LLM-oriented, semantics-preserving obfuscation framework for these benchmarks. Unlike IP-protection approaches, it applies human-readable transformations and controlled diff-noise while preserving functionality. On HackTheSilicon, the framework reduces LLM-based detection accuracy by 50% with only 10% obfuscation and by 78.6% under complete obfuscation, restoring benchmark reliability.

Submission history

From: Mohamadreza Rostami [view email]
[v1] Sun, 14 Jun 2026 13:22:27 UTC (2,435 KB)