When Cucumber Grows Too Big: Pain Points, Lessons Learned, and Alternatives
Luis Iñesta · 2026-05-27 · via DEV Community

What Is Cucumber, and Why Do Teams Love It?

If you have spent any time in behavior-driven development (BDD), you have almost certainly encountered Cucumber. First released in 2008, it has become the de facto standard for writing executable specifications in plain language.

The core idea is elegant: tests are written in Gherkin, a structured natural language format built around three keywords — Given, When, and Then. A product owner, a tester, and a developer can all read the same file and agree on what the system is supposed to do.

Feature: User login

  Scenario: Successful login with valid credentials
    Given the user "alice" exists with password "secret"
    When she submits the login form
    Then she is redirected to the dashboard
    And a session cookie is set


Under the hood, each step is matched by a regular expression or a cucumber expression to a step definition — a method in Java, Ruby, JavaScript, or whatever language your project uses. Cucumber finds the right method, runs it, and aggregates the results into a report.
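The matching mechanism is easy to picture with a few lines of code. The following is a minimal Python sketch of the idea, not Cucumber's actual implementation: the `step` decorator, `STEP_REGISTRY`, and the in-memory "login" are hypothetical stand-ins introduced for illustration only.

```python
import re

# Hypothetical registry of (compiled pattern, handler) pairs, filled by @step.
STEP_REGISTRY = []

def step(pattern):
    """Register a function as the definition for steps matching `pattern`."""
    def decorator(func):
        STEP_REGISTRY.append((re.compile(pattern), func))
        return func
    return decorator

@step(r'the user "(\w+)" exists with password "(\w+)"')
def create_user(context, name, password):
    context.setdefault("users", {})[name] = password

@step(r"she submits the login form")
def submit_login(context):
    # Stand-in for a real HTTP call: any known user logs in successfully.
    context["redirect"] = "/dashboard" if context.get("users") else "/login"

@step(r"she is redirected to the dashboard")
def check_redirect(context):
    assert context["redirect"] == "/dashboard"

def run_step(context, text):
    """Call the first definition whose pattern matches the whole step text."""
    for pattern, func in STEP_REGISTRY:
        match = pattern.fullmatch(text)
        if match:
            return func(context, *match.groups())
    raise LookupError(f"undefined step: {text!r}")

def run_scenario(lines):
    """Strip the leading Gherkin keyword from each line and run the steps."""
    context = {}
    for line in lines:
        _keyword, _, text = line.strip().partition(" ")
        run_step(context, text)
    return context

run_scenario([
    'Given the user "alice" exists with password "secret"',
    "When she submits the login form",
    "Then she is redirected to the dashboard",
])
```

Even in this toy version, the feature file knows nothing about `create_user` or `submit_login`; the only link between the two sides is the text of the step.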

The promise is compelling: business-readable tests that are also executable. The gap between what stakeholders describe and what testers automate, closed forever. In small projects and well-contained modules, it genuinely works.


Where Things Start to Break Down

I spent several years working on large backend systems where Cucumber was adopted as the standard integration testing tool. Early on, things were manageable. Over time, a set of recurring problems emerged that no amount of team discipline could fully solve.

1. The Glue Code Explosion

Every Gherkin step needs a step definition. In a project with hundreds of scenarios covering REST APIs, databases, message queues, and background jobs, this means hundreds — sometimes thousands — of step definition methods spread across dozens of classes.

The immediate problem is discoverability. When a new developer writes a step, how do they know whether it already exists? The IDE can sometimes help, but step matching relies on regular expressions that are not always obvious to navigate. You end up with:

  • Duplicate step definitions that do subtly different things
  • Slightly different phrasing that bypasses an existing step and creates a new one
  • Inconsistent abstractions because different people solved the same problem independently
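One cheap mitigation is to scan the existing step phrasings for near-duplicates before adding a new one. The sketch below uses Python's standard `difflib` for that; the step texts and the 0.8 threshold are assumptions chosen for illustration, and a real suite would need its own tuning.

```python
from difflib import SequenceMatcher
from itertools import combinations

def near_duplicates(step_texts, threshold=0.8):
    """Return pairs of step phrasings whose similarity ratio meets `threshold`."""
    pairs = []
    for a, b in combinations(step_texts, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs

# Hypothetical step phrasings collected from a glue-code package.
steps = [
    "the response status is {int}",
    "the response status code is {int}",
    "the user {string} exists",
    "a session cookie is set",
]

for a, b, ratio in near_duplicates(steps):
    print(f"{ratio}: {a!r} ~ {b!r}")
```

With these inputs only the two "response status" phrasings are flagged, which is exactly the kind of pair that tends to hide two subtly different implementations.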

The deeper problem is coupling. Step definitions are not unit-tested; they are integration plumbing. When a REST client is refactored, you find that fifteen step definitions directly instantiate it. When a database fixture format changes, you discover that nobody documented which step methods touch which tables.

2. Shared State and Context Passing

Cucumber scenarios are supposed to be independent, but step definitions need to share state: the HTTP response from the When step needs to be inspectable in the Then step. The standard solution is a World object (or @ScenarioScoped beans in Java) — a bag of shared state injected into step definition classes.

This works until you have fifteen step definition classes all mutating the same World, nobody owns the contract for what each field means, and a flaky test appears because some scenario left dirty state that wasn't cleaned up. Debugging it means reading glue code, not feature files — which defeats half the purpose of BDD.
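The failure mode can be reproduced in a few lines. This is a simplified Python sketch: Cucumber normally creates a fresh World per scenario, so in practice the leak tends to come from state that outlives it (static fields, singletons, database rows). Here the `fresh_world_per_scenario` flag models that difference, and all names are hypothetical.

```python
class World:
    """A bag of shared scenario state, as step definition classes often use."""
    def __init__(self):
        self.response = None
        self.auth_token = None

def login_step(world):
    # Stand-in for a "Given the user is logged in" step.
    world.auth_token = "token-for-alice"

def call_profile_step(world):
    # Stand-in for an HTTP call: succeeds only if a token is present.
    world.response = 200 if world.auth_token else 401

def run(scenarios, fresh_world_per_scenario):
    """Run each scenario (a list of steps) and record the response it saw."""
    results = []
    world = World()
    for steps in scenarios:
        if fresh_world_per_scenario:
            world = World()
        for step in steps:
            step(world)
        results.append(world.response)
    return results

scenarios = [
    [login_step, call_profile_step],  # scenario 1: logged-in user
    [call_profile_step],              # scenario 2: anonymous user, should see 401
]

print(run(scenarios, fresh_world_per_scenario=False))  # [200, 200]
print(run(scenarios, fresh_world_per_scenario=True))   # [200, 401]
```

In the first run the second scenario passes or fails depending on what ran before it, which is why this kind of bug usually shows up as an order-dependent flaky test.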

3. The Feature File Drift Problem

In a healthy BDD process, feature files are living documents co-authored by business and technical people. In practice, after the initial sprint, product owners stop reading them. They become developer-only artifacts, written with the same mindset as JUnit tests: exhaustive, technical, and opaque to anyone outside the team.

You end up with scenarios like:

Given the user entity with id 42 exists in schema "core" table "users" with status "ACTIVE"
When the endpoint POST /api/v2/auth/session is called with payload from fixture "auth_fixtures/alice_valid.json"
Then the response body path "$.data.token" matches regex "[A-Za-z0-9-_]{40}"


This is not BDD. It is JUnit with extra ceremony.

4. Step Definition Scope Creep

In Cucumber, step definitions are global. There is no namespacing, no module boundary, no way to say "these steps belong to the payments domain." As the test suite grows, you inherit every step ever written, and step expressions start colliding.

Teams work around this with naming conventions, careful phrasing, and tribal knowledge. That is technical debt in disguise.
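A collision of this kind is simple to demonstrate. The Python sketch below checks step texts against a flat list of patterns, which is roughly the situation a global step registry creates; the domains and patterns are hypothetical examples.

```python
import re

# Hypothetical step patterns, labelled with the domain whose team wrote them.
STEP_PATTERNS = [
    ("payments", r"the (\w+) is cancelled"),
    ("orders", r"the order is cancelled"),
    ("auth", r'the user "(\w+)" is logged in'),
]

def matching_definitions(step_text):
    """Return the domains of every pattern that fully matches `step_text`."""
    return [domain for domain, pattern in STEP_PATTERNS
            if re.fullmatch(pattern, step_text)]

def find_collisions(step_texts):
    """Map each step text to its matching domains when more than one matches."""
    collisions = {}
    for text in step_texts:
        domains = matching_definitions(text)
        if len(domains) > 1:
            collisions[text] = domains
    return collisions

print(find_collisions(["the order is cancelled", "the payment is cancelled"]))
```

The payments team wrote a generic pattern for its own needs, and it now also claims the orders team's step. Cucumber reports such a step as ambiguous at run time, but only once a scenario actually uses it.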

5. Maintenance Cost Compounds Over Time

Every refactor of production code ripples through glue code. A renamed endpoint, a changed response schema, a migrated database table — each one can silently break dozens of step definitions, or worse, fail to break them because a step is now asserting against stale data that happens to still match.

The test suite that was supposed to give you confidence becomes a maintenance burden that slows releases down. At some point the question stops being "how do we fix the tests?" and becomes "is this the right tool for this job?"


The Specific Pain That Made Me Reconsider

The breaking point for me was the combination of two things happening simultaneously.

First, onboarding friction. A new team member joining the project needed days to understand the glue code before they could write a single new test. The feature files were not self-explanatory; they were a surface sitting on top of an iceberg of implementation. That is the opposite of what BDD promises.

Second, the semantic gap for API testing. Our integration tests were almost entirely black-box: send an HTTP request, assert on the response, check the database state. For this use case, Cucumber adds a translation layer (Gherkin step → step definition → HTTP client call) that provides no value. The "business readable" framing makes no sense for a step like "Then the response status is 200". Nobody is showing those files to a product owner.

We were paying the full cost of Cucumber's glue code model while getting almost none of its BDD benefits.


Alternatives Worth Considering

Depending on what is actually causing your pain, different tools address different problems.

Karate is the closest direct alternative for API testing. It uses a Gherkin-like syntax but eliminates step definitions entirely — steps are interpreted directly by the framework. You get zero glue code for REST and GraphQL testing, plus built-in mocking and performance testing. If your Cucumber usage is primarily API testing, Karate is worth a serious look.

REST-assured (with JUnit or TestNG) takes the opposite position: abandon the DSL entirely and write your tests as code. You lose the business-readable layer, but you gain the full power of a proper programming language — real abstractions, composable helpers, IDE support, type safety. For teams that have already given up on non-developer readers, this is often the pragmatic choice.

Playwright / Cypress are not Cucumber replacements, but if your integration tests are UI-heavy, their built-in test organization and recording capabilities may do more for you than Cucumber's BDD layer.

SpecFlow (for .NET) and Behave (for Python) are Cucumber-family tools that sometimes have better ecosystem integration for their respective stacks, though they share the same architectural tradeoffs.

The rule of thumb I arrived at: Cucumber earns its keep when the scenario files are genuinely read and validated by non-developers on a regular basis. If that is not happening — and it often is not — you are paying glue-code tax for a benefit you are not receiving.


A Tool Built From These Lessons

After working through these problems repeatedly, I designed Azertio as an attempt to take what Cucumber gets right (human-readable Gherkin, structured scenario files) and eliminate what causes the most pain in large projects.

The central bet: for black-box testing of REST APIs and databases, there should be no glue code at all. Steps are provided by plugins loaded at runtime (rest, db, and others), and every step in those plugins is available in any feature file without any wiring. You declare which plugins you use in a YAML config file and start writing tests.

Scenario: Creating an order reduces stock
  Given db table stock has:
    | sku   | units |
    | P-001 | 10    |
  When I make a POST request to "orders" with body:
    """json
    { "sku": "P-001", "quantity": 3 }
    """
  Then the HTTP status code is equal to 201
  And db table stock row where sku = "P-001" has units = 7


No step definitions. No World objects. No regex to maintain.
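To make the plugin idea concrete, here is a minimal Python sketch of how plugin-provided steps could be resolved from a config. It is an illustration of the general pattern only and does not show Azertio's actual code or configuration format: `PLUGINS`, `load_steps`, the handlers, and the `config` dictionary (standing in for a parsed YAML file) are all hypothetical.

```python
import re

def _http_status(ctx, code):
    assert ctx["status"] == int(code), f"expected {code}, got {ctx['status']}"

def _db_has_units(ctx, sku, units):
    assert ctx["stock"][sku] == int(units)

# Hypothetical plugin catalogue: each plugin contributes ready-made steps.
PLUGINS = {
    "rest": {r"the HTTP status code is equal to (\d+)": _http_status},
    "db": {r'db table stock row where sku = "([\w-]+)" has units = (\d+)': _db_has_units},
}

def load_steps(enabled_plugins):
    """Collect the steps of every plugin named in the (hypothetical) config."""
    steps = {}
    for name in enabled_plugins:
        steps.update(PLUGINS[name])
    return steps

def run_step(steps, ctx, text):
    """Run the first plugin step whose pattern matches the whole step text."""
    for pattern, handler in steps.items():
        match = re.fullmatch(pattern, text)
        if match:
            return handler(ctx, *match.groups())
    raise LookupError(f"no enabled plugin provides: {text!r}")

# Stand-in for a parsed config file listing the plugins in use.
config = {"plugins": ["rest", "db"]}
steps = load_steps(config["plugins"])

# Stand-in for the state a real run would read from the API and the database.
ctx = {"status": 201, "stock": {"P-001": 7}}
run_step(steps, ctx, "the HTTP status code is equal to 201")
run_step(steps, ctx, 'db table stock row where sku = "P-001" has units = 7')
```

The point of the design is that the project itself contains no handlers at all: the step vocabulary is fixed by the plugins, so there is nothing project-specific to duplicate or drift.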

It also directly addresses the definition/implementation split that Cucumber struggles with: you can write a definition feature (business-readable, owned by the product team) and a separate implementation feature (technical, owned by testers), linked by a tag. The execution report shows the business structure; the implementation details stay out of the way.
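As a rough illustration of linking by tag, the Python sketch below pairs business-level scenarios with technical ones that carry the same tag and reports definitions that have no implementation yet. The `Scenario` class, the tag names, and the step texts are hypothetical and are not taken from Azertio.

```python
from dataclasses import dataclass, field

@dataclass
class Scenario:
    tag: str
    steps: list = field(default_factory=list)

def link_by_tag(definitions, implementations):
    """Pair each definition scenario with the implementation sharing its tag."""
    impl_by_tag = {scenario.tag: scenario for scenario in implementations}
    linked, missing = [], []
    for definition in definitions:
        implementation = impl_by_tag.get(definition.tag)
        if implementation is None:
            missing.append(definition.tag)
        else:
            linked.append((definition, implementation))
    return linked, missing

# Business-readable scenarios, owned by the product team (hypothetical).
definitions = [
    Scenario("@ID-ORDER-1", ["When a customer places an order",
                             "Then stock is reduced"]),
    Scenario("@ID-ORDER-2", ["When an order is cancelled",
                             "Then stock is restored"]),
]

# Technical scenarios, owned by testers (hypothetical).
implementations = [
    Scenario("@ID-ORDER-1", ['When I make a POST request to "orders"',
                             "Then the HTTP status code is equal to 201"]),
]

linked, missing = link_by_tag(definitions, implementations)
for definition, implementation in linked:
    print(definition.tag, "->", len(implementation.steps), "technical steps")
print("not yet implemented:", missing)
```

A report built this way can show the definition steps as the headline and keep the technical steps as drill-down detail, while the `missing` list doubles as a backlog of specified but untested behavior.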

The project is open source, still early, and genuinely shaped by the frustrations described in this article. If any of this resonates with your own Cucumber experience, I would be glad to hear your feedback at azertio.org.


Have you hit any of these Cucumber pain points in your own projects? Which solutions worked for you? Let me know in the comments.