PromptMN: Pseudo Prompting Language
[Submitted on 15 Jun 2026] · 2026-06-17 · via cs.SE updates on arXiv.org

Abstract: Prompting has become the primary interface between humans and generative AI, yet many natural language prompts remain fragile: roles, goals, constraints, and expected outputs are often buried in prose or left implicit. In agentic and software development workflows, a misreading at the first handoff can propagate through every subsequent step, since a significant portion of agent failures stems from context ambiguities rather than model limitations. This paper introduces PromptMN, a pseudo-prompting domain-specific language that annotates natural language with compact, %-prefixed typed directives covering roles, goals, requirements, priorities, constraints, plans, inputs, and outputs. Semantic resolution lets authors write directives in any order, while the model interprets each directive by its function. PromptMN sits between informal prompting and programming-style pseudocode: structured enough to be inspectable and reusable, yet lightweight enough for analysts, managers, developers, and stakeholders across the software development lifecycle (SDLC). PromptMN also pairs with reverse prompt engineering: asking a model to restate a desired outcome as PromptMN lets users inspect the inferred roles, goals, constraints, and missing assumptions before acting, which reduces repair cycles and yields a reusable artifact for aligning people and AI tools. PromptMN's feasibility is evaluated across several frontier models, including Claude Fable 5, Claude Opus 4.8, Gemini 3.1 Pro, and GPT-5.5. Without fine-tuning, the models correctly resolved PromptMN instructions, including complex structures such as repetition, conditionals, and methods, as well as a prime-checking task. In the SDLC scenarios presented, the same vocabulary applies across new codebases, maintenance, and redesign. While large-scale validation remains future work, these early results suggest PromptMN is a practical step toward clearer, more reviewable human-to-AI interaction.
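To make the idea of %-prefixed typed directives more concrete, the following is a minimal illustrative sketch, not PromptMN's actual grammar. The abstract does not give the concrete syntax, so the directive keywords (`%role`, `%goal`, and so on), the one-directive-per-line layout, and the `parse_directives` helper are all assumptions made for illustration. The sketch only shows how an annotated prompt could be grouped by directive type regardless of the order in which the author wrote the lines.

```python
import re
from collections import defaultdict

# Hypothetical keywords, derived from the categories named in the abstract.
# The real PromptMN directive names and syntax may differ.
KNOWN_DIRECTIVES = {
    "role", "goal", "requirement", "priority",
    "constraint", "plan", "input", "output",
}

# Assumed layout: "%name value" on a single line.
DIRECTIVE_RE = re.compile(r"^\s*%(\w+)\s+(.*\S)\s*$")


def parse_directives(prompt: str) -> dict[str, list[str]]:
    """Group %-prefixed lines by directive name; keep other lines as free text."""
    grouped = defaultdict(list)
    for line in prompt.splitlines():
        match = DIRECTIVE_RE.match(line)
        if match and match.group(1).lower() in KNOWN_DIRECTIVES:
            grouped[match.group(1).lower()].append(match.group(2))
        elif line.strip():
            # Plain natural language stays available alongside the directives.
            grouped["text"].append(line.strip())
    return dict(grouped)


# Directives are deliberately written out of any "logical" order.
example = """\
%output A Python function with a docstring
%role Senior Python developer
Please keep the solution short.
%goal Check whether an integer is prime
%constraint No third-party libraries
"""

parsed = parse_directives(example)
for name in ("role", "goal", "constraint", "output", "text"):
    print(name, "->", parsed.get(name, []))
```

In the paper's framing, the model itself resolves directives by function. A small pre-processing step like this one would serve only as an inspection aid, for example to check that a prompt states a role and an expected output before it is sent.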

Submission history

From: Enkhzol Dovdon
[v1] Mon, 15 Jun 2026 18:04:50 UTC (746 KB)