Key learnings from the 2025 State of DevSecOps study

Datadog | The Monitor blog

2025-04-23 · via Datadog | The Monitor blog

Rory McCune

Seth Art

Christina DePinto

We recently released the 2025 State of DevSecOps study, in which we analyzed tens of thousands of applications and container images across thousands of cloud environments in order to reveal trends in security posture and best practices across the software development life cycle.

In this post, we outline several best practices that organizations should implement based on the study’s findings:

Prioritize vulnerabilities with runtime context
Deploy guardrails within your software supply chain
Deploy frequently to stay current on patches
Adopt minimal container images
Expand IaC usage and rein in ClickOps

In addition, we explain how you can use Datadog Code Security, Cloud Security, Cloud SIEM, Workload Protection, and other products within the Datadog platform to improve your security posture and defend against threats.

Prioritize vulnerabilities with runtime context

Most security teams have a finite set of resources they can leverage to remediate security vulnerabilities, which means that prioritization is key. Luckily, not all vulnerabilities are equally urgent. CVSS base scores help with prioritization, but these scores do not account for environmental attributes, such as public accessibility or whether or not the vulnerability has a known exploit available. Unfortunately, trying to manually apply this type of context to thousands of vulnerabilities is a time- and labor-intensive task that is simply not possible for most teams.

However, by mapping environmental information about your cloud assets to the vulnerabilities associated with each resource, you can use runtime context to effectively prioritize the vulnerabilities that are the most likely to get exploited. Here are some key characteristics to consider when adjusting severity ratings:

Public accessibility: Vulnerabilities present in internet-facing systems are more likely to be identified and exploited. Vulnerabilities in systems that are not public-facing are certainly relevant, as they can be exploited once an attacker has already gained initial access to your environment. But when it comes to prioritization, organizations should focus first on vulnerabilities that affect public-facing systems.
Environment: A vulnerability exploited in a production environment is much more likely to lead to service disruptions, sensitive data breaches, and further compromise of additional production services, as production systems contain the most sensitive data and live user sessions. A vulnerability in a development or staging environment is generally less critical than one in production.
Active attack status: Prioritize vulnerabilities in any environments that you know are currently being targeted by malicious activity, including automated scanners, exploit attempts, or other indicators of compromise.
Highly privileged identity: Vulnerabilities present on systems or endpoints that have highly privileged identities associated with them pose a higher risk because they make post-exploitation tasks significantly easier for the attacker. Highly privileged identities, such as admin users, grant access to more sensitive data and critical system functions. If these assets are compromised, attackers can escalate privileges and gain control over significant portions of the infrastructure.
Known exploit proof of concept: Vulnerabilities with known exploits are significantly more dangerous, as attackers have readily available tools and methods to exploit them. These vulnerabilities are actively targeted and can lead to rapid compromise, even if the underlying system is not directly internet-facing. Prioritizing these vulnerabilities is crucial due to their high likelihood of exploitation and the potential for widespread impact.

By taking these characteristics into account, you can effectively prioritize vulnerabilities and allocate resources to address the most critical risks first. This approach ensures that your security efforts are focused on protecting your most valuable assets and mitigating the most significant threats.
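To make the idea concrete, here is a minimal sketch of how the characteristics above could adjust a CVSS base score. The field names, weights, and penalty are illustrative assumptions chosen for this example; they are not a standard and not how any particular product computes severity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Finding:
    """A vulnerability plus runtime context (all field names are hypothetical)."""
    finding_id: str
    cvss_base: float           # CVSS base score, 0.0 to 10.0
    public_facing: bool        # reachable from the internet
    in_production: bool        # deployed to a production environment
    under_attack: bool         # attack activity observed against the service
    privileged_identity: bool  # workload carries a highly privileged identity
    exploit_available: bool    # a public exploit or proof of concept exists


# Illustrative weights only: assumptions for this sketch, not a standard.
ADJUSTMENTS = {
    "public_facing": 1.0,
    "in_production": 1.0,
    "under_attack": 1.5,
    "privileged_identity": 1.0,
    "exploit_available": 1.5,
}
# Amount subtracted when none of the risk factors above apply (also assumed).
NO_CONTEXT_PENALTY = 3.0


def contextual_score(finding: Finding) -> float:
    """Raise or lower the base score using runtime context, clamped to 0..10."""
    bonus = sum(w for name, w in ADJUSTMENTS.items() if getattr(finding, name))
    if bonus:
        score = finding.cvss_base + bonus
    else:
        score = finding.cvss_base - NO_CONTEXT_PENALTY
    return round(max(0.0, min(10.0, score)), 1)


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=contextual_score, reverse=True)
```

With these assumed weights, a 6.5 finding on a public, production workload with a privileged identity ends up ranked above a 7.5 finding that has no risk factors at all, which is the kind of reordering that runtime context is meant to produce.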

Use Datadog to prioritize vulnerabilities and threats using runtime context

Datadog’s security portfolio enriches and correlates security findings with runtime insights from your environment. This helps you understand the actual impact of vulnerabilities on critical workloads and make smarter remediation decisions. Datadog applies runtime context in two key areas to reduce alert noise: Security Inbox and Datadog Severity Scoring.

Datadog Severity Scoring layers critical environment context, such as exploitability, accessibility, and likelihood of attack, onto base CVSS scores. Datadog may increase or decrease a score based on context from your specific environment. This helps teams prioritize what to fix first.

You can see this in action in the security posture funnel within Datadog Code Security. In the example below, looking only at vulnerabilities that are running in production, have an exploit available, and were exposed to attacks has reduced the number of critical vulnerabilities by more than 92 percent.

Here’s another example of Datadog Severity Scoring. The image below shows a vulnerability within an aiohttp library that had a base CVSS score of 7.5. However, Datadog had the environmental context to see the service was not in production, not under attack, had no exploit available, and was unlikely to be exploited based on the Exploit Prediction Scoring System (EPSS). Based on this context, Datadog reclassified the vulnerability from high severity to low.

Here’s an example of the same prioritization happening in Datadog Cloud Security. This time, we’re looking at a vulnerability on an EC2 instance that had a base CVSS score of 6.5. Normally, this wouldn’t raise any alarms or be considered an urgent remediation. However, Datadog can see that this EC2 instance has a privileged role, which an attacker could abuse to gain additional access to cloud infrastructure. This raises the score significantly and informs security teams that this vulnerability could be more impactful than it initially seemed.

Datadog also prioritizes findings using Security Inbox, which provides an actionable list of your most critical findings and their relationship to each other. Security Inbox automatically correlates insights from across Datadog, including vulnerabilities, signals, misconfigurations, and identity risks. This allows you to see the most critical combinations of risks that could create viable attack paths for a threat actor to leverage.

Deploy guardrails within your software supply chain

Attackers continually evolve techniques to trick developers into installing malicious libraries by targeting the ecosystems that support modern software development, such as npm and PyPI. Successful execution of these attacks can occur anywhere, from developer laptops to your organization’s CI/CD pipelines.

Mitigating this risk requires a multi-pronged approach. You can take actions to prevent or detect the execution of malicious packages within your trusted environments, and you can also protect software repositories to harden your ecosystem against attacks.

In terms of actions you can take, you’ll want to scan software packages as they are used to determine if they are malicious. For example, a solution like Datadog’s Supply-Chain Firewall can use known databases, such as Datadog Security Research’s public malicious packages dataset and OSV.dev, and prevent the execution of any packages marked as malicious within those databases. Additionally, you can use GuardDog to identify malicious packages holistically and even extend its capabilities with additional signatures to detect malicious patterns in packages.
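As a rough illustration of the "scan packages as they are used" idea, the sketch below checks pinned Python requirements against a list of known-malicious packages before anything is installed. It is not how Supply-Chain Firewall or GuardDog work internally; the blocklist format and the package names in the usage note are hypothetical placeholders.

```python
from __future__ import annotations

import re


def normalize(name: str) -> str:
    """Normalize a PyPI project name (PEP 503): lowercase, runs of -_. become -."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_requirements(text: str) -> list[tuple[str, str]]:
    """Parse 'name==version' lines, skipping blanks, comments, and unpinned entries."""
    pins = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue  # this sketch only handles exact pins
        name, version = line.split("==", 1)
        pins.append((normalize(name.strip()), version.strip()))
    return pins


def find_blocked(
    pins: list[tuple[str, str]], blocklist: dict[str, set[str]]
) -> list[tuple[str, str]]:
    """Return the pins that appear in the blocklist.

    blocklist maps a normalized package name to a set of bad versions;
    an empty set means every version of that package is blocked.
    """
    blocked = []
    for name, version in pins:
        if name in blocklist:
            bad_versions = blocklist[name]
            if not bad_versions or version in bad_versions:
                blocked.append((name, version))
    return blocked
```

A CI step could call `find_blocked(parse_requirements(open("requirements.txt").read()), blocklist)` and fail the build when the result is non-empty, with `blocklist` loaded from whichever malicious-package dataset you trust.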

As security practitioners, we must always operate with the assumption that prevention eventually fails, so it is prudent to also deploy traditional endpoint detection and response (EDR) tools to detect malicious activity on developer workstations. Additionally, you’ll want to ensure you have visibility in your CI/CD pipelines and cloud workloads so you can detect malicious activity there as well.

The other approach to tackling these threats is to apply pressure on the owners of software repositories and marketplaces to deploy the types of controls that prevent attacks at the source. For example, an ecosystem that provides package maintainer attestations or provenance statements allows users to verify these attestations at time of use, which helps prevent the execution of any software that has not been signed by the author.

Secure your pipeline by eliminating long-lived credentials

Leaks of long-lived cloud credentials are one of the most common causes of cloud breaches. Because these credentials never expire, any leak in configuration files, source code, or another source could provide a viable entry point for attackers. That’s why it’s critical to use short-lived cloud credentials through OpenID Connect (OIDC) or a similar protocol to authenticate both workloads and human users to cloud resources.

It’s particularly important to use short-lived credentials in CI/CD pipelines. These workloads often run on external infrastructure and need to authenticate to cloud environments to deploy resources, typically through infrastructure-as-code (IaC) or some sort of automation. Here’s a look at how you can use short-lived cloud credentials in your CI/CD provider through the OIDC protocol:

The CI/CD provider injects into the running pipeline a signed JSON Web Token (JWT) that provides a verifiable identity to the pipeline.
The CI/CD pipeline uses this JWT to request cloud credentials. How this is done depends on the cloud provider. On AWS, for instance, this is achieved using a call to AssumeRoleWithWebIdentity.
If the cloud environment has been properly configured to trust and distribute credentials to this pipeline, it returns short-lived credentials that are valid for a few hours.
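The steps above can be sketched for AWS as follows. This minimal example only inspects the JWT's claims and builds the request parameters for AssumeRoleWithWebIdentity; it does not send anything. The role ARN and session name in the usage note are placeholders, and how the token reaches your pipeline depends on your CI/CD provider.

```python
import base64
import json
from urllib.parse import urlencode

# Global AWS STS endpoint; AssumeRoleWithWebIdentity is an unsigned call.
STS_ENDPOINT = "https://sts.amazonaws.com/"


def decode_jwt_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature.

    Useful only for inspecting which identity the CI/CD provider asserted;
    the cloud provider performs the real signature verification.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def build_sts_request(
    token: str, role_arn: str, session_name: str, duration_seconds: int = 3600
) -> str:
    """Build the form-encoded body for an AssumeRoleWithWebIdentity call."""
    return urlencode(
        {
            "Action": "AssumeRoleWithWebIdentity",
            "Version": "2011-06-15",
            "RoleArn": role_arn,
            "RoleSessionName": session_name,
            "WebIdentityToken": token,
            "DurationSeconds": str(duration_seconds),
        }
    )
```

For example, `build_sts_request(token, "arn:aws:iam::123456789012:role/example-deploy", "ci-run-1")` produces a body that could be sent as a POST to `STS_ENDPOINT`. In practice, most pipelines rely on the provider's official action or an AWS SDK to make this call rather than building it by hand.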

In addition, you’ll want to monitor all of your systems for identity risks such as overprivileged users and unused credentials.

Use Datadog to monitor authentication events

Datadog provides multiple ways to monitor your cloud environment for suspicious authentication activity or potential misconfigurations. For example, Cloud Security provides out-of-the-box rules to identify risky IAM users with long-lived, unused credentials. In addition, you can gain visibility into accessible resources with access insights. This allows you to see what entities a resource can access and who can access the resource. You can also get granular insights into identities that have direct or indirect access to a given resource and take immediate action with autogenerated policies that help you right-size permissions.

In addition, Datadog Cloud SIEM can help you identify compromised access keys through numerous out-of-the-box rules, including:

Compromised AWS IAM user access key
TruffleHog user agent observed in AWS, which identifies an attacker who successfully uses the popular TruffleHog tool to discover leaked access keys
The AWS managed policy AWSCompromisedKeyQuarantineV2 has been attached, which identifies when AWS itself quarantines an IAM user whose access key was publicly leaked

You can also easily identify gaps to strengthen detection coverage with the MITRE ATT&CK Map and create custom Cloud SIEM rules using new value, anomaly detection, impossible travel, and other detection methods to identify unexpected activity.

For example, you can create a rule to alert when an attempt to access Amazon Bedrock is made using a long-term AWS access key.
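The logic behind such a rule can be sketched outside of Datadog as a check on raw CloudTrail records. This is not Cloud SIEM rule syntax. It relies on the fact that long-term AWS access key IDs start with `AKIA` while temporary STS keys start with `ASIA`; the set of Bedrock event sources is an assumption you should adjust to what appears in your own logs.

```python
# Event sources treated as "Amazon Bedrock" (assumed; adjust to your own logs).
BEDROCK_EVENT_SOURCES = {"bedrock.amazonaws.com"}

# Long-term IAM user access key IDs begin with AKIA; temporary ones with ASIA.
LONG_TERM_KEY_PREFIX = "AKIA"


def is_long_term_key_bedrock_call(event: dict) -> bool:
    """True when a CloudTrail record shows a Bedrock call made with a long-term key."""
    if event.get("eventSource") not in BEDROCK_EVENT_SOURCES:
        return False
    access_key_id = event.get("userIdentity", {}).get("accessKeyId", "")
    return access_key_id.startswith(LONG_TERM_KEY_PREFIX)
```

Running this over a batch of CloudTrail records gives a quick offline view of which calls a detection rule built on the same condition would flag.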

Lastly, App and API Protection helps filter through the noise and find suspicious authentication patterns that may indicate credential stuffing and account takeover campaigns.

Deploy frequently to stay current on patches

Services that are deployed more frequently contain fewer out-of-date dependencies. One straightforward takeaway here could be that—even if your code has not changed—simply executing CI tasks that update dependencies, build, and test your software on a regular cadence can significantly improve the security of your services without much manual effort. That said, there is plenty of debate about the risks and rewards of automating dependency updates, so it’s worth spending time to understand if this type of solution can work for you.

As impactful as automated dependency updates can be, there’s still another area that needs to be addressed. We found that one in two services use libraries that are not actively maintained, which means they will no longer receive security updates, even if new vulnerabilities are identified. To tackle this problem, we recommend keeping track of all end-of-life libraries currently in use within your organization and starting with these end-of-life dependencies when prioritizing remediation efforts.
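One simple way to start that inventory is to flag libraries that have not published a release in a long time. The sketch below assumes you already have a mapping of library name to last release date (for example, exported from your package registry). The one-year threshold is an arbitrary assumption, and a long gap between releases is only a hint: confirmed end-of-life status should come from the maintainers.

```python
from __future__ import annotations

from datetime import date, timedelta


def stale_libraries(
    last_release: dict[str, date], today: date, max_age_days: int = 365
) -> list[str]:
    """Return, in sorted order, libraries whose last release is older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, released in last_release.items() if released < cutoff)
```

The resulting list is a reasonable first queue for manual review before deciding which dependencies to replace or upgrade.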

Monitor service health with Datadog

Datadog provides a range of capabilities that enable you to view the health of services so you can accelerate deployments and upgrade dependencies more frequently. CI Pipeline Visibility supports development velocity by giving you a high-level view into the health of your CI/CD systems.

You can also use Software Catalog to attain a centralized, real-time view of your service ecosystem, so your teams can track ownership, monitor performance, and enforce security and compliance standards all in one place. From here, you can also see the security posture of each service, including active threats like SQL injection and vulnerabilities in application code or libraries.

Adopt minimal container images

Most modern applications run in cloud-native containerized systems. Whether they run on top of Kubernetes or provider-specific services such as Amazon ECS or Azure Container Instances, these applications are packaged within a container image. When building such applications, it’s important to attempt to build small, minimal container images: They are faster to deploy, remove complexity, and drastically reduce the number of OS-level vulnerabilities.

There are a number of possible approaches to building minimal container images. The first might be to use profiling software like slim to analyze images and remove unnecessary software from them. This kind of approach works well with existing images, where re-architecting from scratch might not be possible.

When building new images, it’s important to use a minimal base image. If the application doesn’t require a full Linux distribution to run, there are options like Docker’s scratch image, which provides an effectively empty starting point for applications. Alternatively, Distroless provides a starting point with some more base Linux files, which some applications need in order to operate effectively.

If your application requires more operating system support files, be careful to avoid installing unnecessary packages, such as development dependencies. You can use techniques like multi-stage builds to keep the final container image as small as possible.

Monitor the security of your container images with Datadog

From development through production, Datadog offers deep visibility into the security posture of running container images. With security insights embedded in Infrastructure Monitoring views, engineers and resource owners are better equipped to secure containers as part of their standard optimization and deployment workflows.

The Container Images view provides rich information about the size of container images, their age, and any detected vulnerabilities. By clicking into specific images, you can immediately see all containers where the image is deployed, alongside their associated status and CPU usage. This ensures you have full context on where each image is running and how it’s performing, so you can make faster, more informed remediation decisions.

When investigating vulnerabilities on a container, users can see detailed insights like the repo digest, resource tags, and a timeline of when vulnerabilities were introduced. They can also automate remediation workflows with built-in workflow automation tools. This allows engineers to proactively remediate vulnerabilities that could impact performance and reliability without having to wait for the security team to come to them with a list of critical remediations.

Additionally, Datadog’s integration with Chainguard allows users to see widely adopted vulnerable container images and identify places where Chainguard container images could be used to lower that risk. This ensures that high-value container images are minimal and secured before they go into production.

Expand IaC usage and rein in ClickOps

The concept of Zero Touch production environments was first introduced in 2019 by Michał Czapiński and Rainer Wolafka at a USENIX conference. At a high level, these environments rely heavily on reliable automation as opposed to manual, error-prone actions performed by humans. Zero Touch production also promotes the principle of least privilege: When an automation can perform a workflow (such as deploying a cloud workload), human operators don’t need direct access to the production environment and can instead rely on safe, indirect, and user-friendly interfaces.

The first step toward implementing Zero Touch (or “low-touch”) production environments is to automate the deployment process, typically using IaC technologies. Tools like Terraformer can also help convert an existing environment into its code representation.

Once automation is in place, operators typically need only minimal permissions (such as view only) to the production environment, since provisioning new infrastructure now happens through automation. It’s helpful to plan “break-glass” roles ahead of time—these are roles that can be activated with approvals and audits when a specific situation, such as an incident, requires an operator to use privileged permissions in a production environment.

Track manual cloud actions with Datadog

It’s important to be notified when there are manually initiated changes in your cloud environment. You can use the following query in Datadog Cloud SIEM to identify and alert on manual actions performed from the AWS Console (adjusted from Arkadiy Tetelman’s methodology):

source:cloudtrail @http.useragent:("console.amazonaws.com" OR "Coral/Jakarta" OR "Coral/Netty4" OR "AWS CloudWatch Console" OR S3Console/* OR \[S3Console* OR Mozilla/* OR console.*.amazonaws.com OR aws-internal*AWSLambdaConsole/*) -@evt.name:(Get* OR Describe* OR List* OR Head* OR DownloadDBLogFilePortion OR TestScheduleExpression OR TestEventPattern OR LookupEvents OR listDnssec OR Decrypt OR REST.GET.OBJECT_LOCK_CONFIGURATION OR ConsoleLogin) -@userIdentity.invokedBy:"AWS Internal"
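If you want to test the same idea against exported CloudTrail records, the query's logic can be approximated in a few lines of Python. This sketch uses the raw CloudTrail field names (`userAgent`, `eventName`, `userIdentity.invokedBy`) rather than Datadog's attribute names, treats `*` as "any characters", and writes the regional console pattern as `console.*.amazonaws.com`. The matching semantics are an assumption and may differ from Datadog's search behavior.

```python
import re

# User agent patterns associated with AWS Console activity ('*' is a wildcard).
CONSOLE_USER_AGENTS = [
    "console.amazonaws.com", "Coral/Jakarta", "Coral/Netty4",
    "AWS CloudWatch Console", "S3Console/*", "[S3Console*", "Mozilla/*",
    "console.*.amazonaws.com", "aws-internal*AWSLambdaConsole/*",
]

# Read-only or noisy event names that are excluded from alerting.
EXCLUDED_EVENT_NAMES = [
    "Get*", "Describe*", "List*", "Head*", "DownloadDBLogFilePortion",
    "TestScheduleExpression", "TestEventPattern", "LookupEvents", "listDnssec",
    "Decrypt", "REST.GET.OBJECT_LOCK_CONFIGURATION", "ConsoleLogin",
]


def _compile(pattern: str) -> re.Pattern:
    """Turn a '*' wildcard pattern into a regex, escaping everything else."""
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")), re.DOTALL)


_CONSOLE = [_compile(p) for p in CONSOLE_USER_AGENTS]
_EXCLUDED = [_compile(p) for p in EXCLUDED_EVENT_NAMES]


def is_manual_console_action(event: dict) -> bool:
    """True for a non-read-only CloudTrail event that came from the AWS Console."""
    if event.get("userIdentity", {}).get("invokedBy") == "AWS Internal":
        return False
    if any(p.fullmatch(event.get("eventName", "")) for p in _EXCLUDED):
        return False
    return any(p.fullmatch(event.get("userAgent", "")) for p in _CONSOLE)
```

For instance, a `RunInstances` event with a browser user agent is flagged, while the same call made through an SDK, or a `DescribeInstances` call made from the console, is not.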

In addition, Datadog provides advanced real-time monitoring and security capabilities across application, API, and workload layers. At the workload level, Datadog Workload Protection utilizes eBPF-based, in-kernel instrumentation to provide deep visibility into system-level activity. This includes monitoring sensitive file access, network connections, and process execution with minimal performance overhead. These activity signals are correlated to detect suspicious behaviors and unauthorized actions, supporting threat detection, drift prevention, and incident response at runtime.

Secure your SDLC with Datadog

To ensure your software development practices support high release velocity while keeping applications and systems protected against threats, you need end-to-end visibility into vulnerabilities in your code, misconfigurations across your cloud resources, and the context to prioritize risks to remediate. Datadog surfaces these risks in the same platform where you monitor system health and performance, breaking down silos between teams so you can accelerate your adoption of DevSecOps best practices and improve security outcomes.

Check out our documentation to get started. If you’re not a customer, you can get started today with a 14-day free trial.
