I built a free audit tool that runs 12 checks in parallel against any domain. Here is the architecture.
Adam McClarin · 2026-05-23 · via DEV Community

I spent the past few months building Canopy Guard, a free website audit tool that combines SEO, AEO, and GEO visibility scoring with a full security posture check. One scan, one report, about 15 seconds.
This is the technical breakdown of how it works.
The problem
I audit websites for clients as part of my regular work. Every engagement started with the same routine: run the site through an SEO checker, then a separate security header scanner, then manually check for structured data, then look at robots.txt. Four tools, four tabs, four different report formats, and none of them cross-referenced their findings.
I wanted a single scan that checked everything and surfaced the gaps between visibility and security.
Architecture
The backend is a Node.js Express server written in TypeScript, deployed on Railway. The frontend is a React app on Vercel.
When a user enters a domain, the frontend POSTs to /api/scan on the Railway backend. The backend runs 12 scan modules in parallel using Promise.all:
const [dns, tls, headers, htmlStructure, schema, qa, geo,
  crawlRisk, endpoints, links, vulns, bizLogic] =
  await Promise.all([
    checkDNS(domain),
    checkTLS(domain),
    checkSecurityHeaders(domain),
    checkHTMLStructure(domain),
    checkSchemaMarkup(domain),
    checkQADensity(domain),
    checkGEO(domain),
    checkAICrawlRisk(domain),
    checkExposedEndpoints(domain),
    checkInternalLinking(domain),
    checkVulnerabilities(domain),
    checkBusinessLogic(domain),
  ]);
Each module is an async function that fetches specific data from the target domain and returns structured results.
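To make the module contract concrete, here is a minimal sketch of what one such async module could look like, using the security headers check as the example. The `ModuleResult` shape, the `Fetcher` type, and the `checkHeadersSketch` name are assumptions for illustration, not the actual Canopy Guard code. The sketch catches its own errors because Promise.all rejects as soon as any single promise rejects, so a module that throws would otherwise fail the whole scan.

```typescript
// Hypothetical result shape; the article does not show the real one.
interface ModuleResult<T> {
  module: string;
  ok: boolean;
  data: T | null;
  error?: string;
}

// The six headers the Security Headers module is described as checking.
const REQUIRED_HEADERS = [
  "content-security-policy",
  "strict-transport-security",
  "x-frame-options",
  "x-content-type-options",
  "referrer-policy",
  "permissions-policy",
] as const;

type HeaderName = (typeof REQUIRED_HEADERS)[number];
type HeaderReport = Record<HeaderName, boolean>;

// Minimal structural types so the sketch does not depend on DOM or Node typings.
interface HeaderBag {
  get(name: string): string | null;
}
type Fetcher = (url: string) => Promise<{ headers: HeaderBag }>;

// Pure step: map response headers to a present/missing report.
function evaluateSecurityHeaders(headers: HeaderBag): HeaderReport {
  const report = {} as HeaderReport;
  for (const name of REQUIRED_HEADERS) {
    report[name] = headers.get(name) !== null;
  }
  return report;
}

// Async module: resolves with ok=false instead of throwing,
// so one failing check cannot reject the surrounding Promise.all.
async function checkHeadersSketch(
  domain: string,
  fetcher: Fetcher,
): Promise<ModuleResult<HeaderReport>> {
  try {
    const res = await fetcher(`https://${domain}/`);
    return { module: "headers", ok: true, data: evaluateSecurityHeaders(res.headers) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { module: "headers", ok: false, data: null, error: message };
  }
}
```

In a Node 18+ runtime the global `fetch` can be passed as the `fetcher` argument. Injecting it keeps the header evaluation testable without network access.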
The scan modules
DNS: Resolves the domain via Google's public DNS API (dns.google/resolve). Returns whether the domain resolves and the IP address.
TLS: Checks HTTPS reachability, HSTS header presence and max-age value, and whether HTTP redirects to HTTPS.
Security Headers: Checks for all six critical headers: Content-Security-Policy, Strict-Transport-Security, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, and Permissions-Policy.
HTML Structure: Fetches the full page HTML and parses it for H1 count, meta description presence and length, canonical URL match, and page title.
Schema Markup: Extracts all JSON-LD blocks, parses them, identifies FAQPage and Organization types, and flags structural errors like missing @context.
Q&A Density: Strips HTML tags, splits into sentences, and calculates the ratio of question-pattern sentences to total sentences. This measures how "answer engine ready" the content is.
GEO: Measures chunking efficiency (how well content divides into ~350-token blocks based on header/paragraph structure), citation precision (ratio of specific data points to generic text), and checks for llms.txt at the domain root.
AI Crawl Risk: Fetches robots.txt, classifies the policy as PERMISSIVE/BALANCED/RESTRICTIVE/NONE, checks for AI-specific bot blocks (GPTBot, Anthropic, Google-Extended, CCBot, ByteSpider), and looks for crawl-delay directives.
Exposed Endpoints: This one was interesting to build. It probes 12 common sensitive paths (/.env, /.git/config, /graphql, etc.). The tricky part: sites with catch-all redirects return 200 for every path. So the module first fetches a guaranteed-nonsense path to detect catch-all behavior. If detected, it compares each probe's response body length and content-type against the catch-all fingerprint to filter out false positives.
Internal Linking: Counts unique internal links on the homepage and samples a few to estimate link depth.
Vulnerabilities: Checks server headers for version disclosure and outdated software signatures.
Business Logic: Checks for author/publisher attribution markup and cross-references sitemap URLs against homepage links to find orphaned pages.
Scoring
Each module feeds into a scoring function that normalizes results to a 0-1 range:
const seo_score = scoreSEO(htmlStructure, links);
const aeo_score = scoreAEO(schema, qa);
const geo_score = scoreGEO(geo);
const security_posture_score = scoreSecurity(
  tls, headers, crawlRisk, endpoints, vulns
);
The scoring weights are calibrated based on what actually impacts discoverability and security posture. For example, in SEO scoring, crawlability gets the highest weight (0.25) because nothing else matters if bots cannot reach your page. In security scoring, TLS validity (0.15) and security headers (0.25 distributed across 6 headers) carry the most weight.
Cross-Reference Intelligence
This is the differentiator. After scoring, the report engine maps findings across layers:
geo_branch.llms_txt_status vs ai_crawl_risk.robots_policy: If llms.txt is MISSING and robots is PERMISSIVE, flag as CRITICAL. AI scrapers have access with no citation guidance.
application_security.exposed_endpoints vs GEO context: If endpoints are exposed, AI RAG parsers can index internal routes from JavaScript bundles.
business_logic_gaps.data_provenance_leak vs overall visibility: If content has no attribution markup, AI training sets can ingest it without linking back.
Lead capture
When a user wants their PDF report, they enter their email. The frontend sends the lead data to the Railway backend, which writes it to a Notion database via the Notion API. Each record stores the name, email, domain, all four scores, the full report JSON, and a Status field (New/Reviewed/Booked/Closed).
The PDF is generated entirely in-browser using a print-ready HTML template opened in a new window.
What I would do differently
If I were starting over, I would add a headless browser module (Playwright) for JavaScript-rendered sites. The current HTML parser uses server-side fetch, which misses content rendered client-side. That is the biggest gap in the current scan accuracy.
I would also add a competitor comparison feature: scan two domains side by side and diff the results.
Try it
Free, no signup: https://thecanopyguard.com
The code is not open source yet, but I am considering it. Would love feedback on the scoring methodology, especially the GEO layer.
Adam McClarin, CISSP
Meraki is Love Digital | Soulful Tech
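As a closing illustration of the catch-all filtering described under Exposed Endpoints, here is a minimal sketch of the comparison step. The `ProbeResponse` shape, the function names, and the 5% length tolerance are assumptions chosen for the example; the article does not state how close two responses must be to count as the same page.

```typescript
// Hypothetical summary of a fetched response; only the fields the
// comparison needs (status, content-type, body length).
interface ProbeResponse {
  status: number;
  contentType: string;
  bodyLength: number;
}

// Assumed tolerance: bodies within 5% of the baseline length are
// treated as the same catch-all page.
const LENGTH_TOLERANCE = 0.05;

// Drop parameters such as "; charset=utf-8" before comparing types.
function baseType(contentType: string): string {
  return (contentType.split(";")[0] ?? "").trim().toLowerCase();
}

// True when a probe looks like the response to the nonsense path.
function matchesCatchAll(probe: ProbeResponse, baseline: ProbeResponse): boolean {
  if (baseType(probe.contentType) !== baseType(baseline.contentType)) {
    return false;
  }
  const larger = Math.max(probe.bodyLength, baseline.bodyLength, 1);
  const relativeDiff = Math.abs(probe.bodyLength - baseline.bodyLength) / larger;
  return relativeDiff <= LENGTH_TOLERANCE;
}

// baseline is null when the nonsense path did not return 200,
// meaning no catch-all behavior was detected.
function isLikelyExposed(probe: ProbeResponse, baseline: ProbeResponse | null): boolean {
  if (probe.status !== 200) return false;
  if (baseline === null) return true;
  return !matchesCatchAll(probe, baseline);
}

// Example: a site that serves the same HTML shell for any unknown path.
const catchAllBaseline: ProbeResponse = {
  status: 200,
  contentType: "text/html; charset=utf-8",
  bodyLength: 4200,
};
const envProbe: ProbeResponse = { status: 200, contentType: "text/plain", bodyLength: 310 };
const shellProbe: ProbeResponse = { status: 200, contentType: "text/html", bodyLength: 4300 };

console.log(isLikelyExposed(envProbe, catchAllBaseline)); // true: differs from the fingerprint
console.log(isLikelyExposed(shellProbe, catchAllBaseline)); // false: same shell, filtered out
```

Comparing both content-type and length means a probe only survives the filter when it differs from the catch-all page on at least one of the two signals.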