

Prompts Are Programs Too! Understanding How Developers Build Software Containing Prompts
Jenny T. Liang, Melissa Lin, Nikitha Rao, Brad A. Myers · 2024-09-19 · via cs.SE updates on arXiv.org

Generative pre-trained models power intelligent software features that are used by millions of users and are controlled by developer-written natural language prompts. Despite the impact of prompt-powered software, little is known about its development process and its relationship to programming. In this work, we argue that some prompts are programs and that the development of prompts is a distinct phenomenon in programming known as "prompt programming". We develop an understanding of prompt programming using Straussian grounded theory through interviews with 20 developers engaged in prompt development across a variety of contexts, models, domains, and prompt structures. We contribute 15 observations to form a preliminary understanding of current prompt programming practices. For example, rather than building mental models of code, prompt programmers develop mental models of the foundation model's (FM's) behavior on the prompt by interacting with the FM. While prior research shows that experts have well-formed mental models, we find that prompt programmers who have developed dozens of prompts still struggle to develop reliable mental models. Our observations show that prompt programming differs from traditional software development, which motivates the creation of prompt programming tools and has implications for software engineering stakeholders.