The Value of Adaptivity in LSM Bloom-Filter Tuning: A Log-Law and a Two-Clock Frontier

[Submitted on 16 Jun 2026] · 2026-06-17 · via cs.DB updates on arXiv.org
