DataDog State Of DevSecOps Report: The Dog That Isn’t Barking
DataDog recently released its State of DevSecOps research findings, which you can find here. In it, they identify seven key facts based on the systems they analyzed. The premise is that prioritization matters for keeping systems safe, and the report offers a few tips to help limit vulnerabilities in general.
We all know that prioritization is an important part of software security. There are lots of different tools and data sources that claim to help software development teams prioritize, including the Common Vulnerabilities and Exposures (CVE) list, the Common Weakness Enumeration (CWE), the Exploit Prediction Scoring System (EPSS), the Known Exploited Vulnerabilities (KEV) catalog, the reachability analysis that security analyzers have added, and many others.
Let’s review some of the major points from the report.
Fact #1: Java Services Are the Most Impacted by Third-Party Vulnerabilities
The report notes that 90% of Java services are vulnerable to one or more critical or high severity vulnerabilities, versus 47% for services in other languages, and that 55% of these have an exploit listed in KEV. The fact identifies the three most common vulnerabilities: Spring4Shell, Log4Shell, and an Apache ActiveMQ vulnerability. The report also notes that over 60% of the vulnerabilities are indirect, coming in through transitive dependencies, which means that even upgrading the direct third-party library won't necessarily address the underlying vulnerability.
This fact might be accurate, but it also points at one of the challenges with reachability and SCA tools: even if an exploit exists for a vulnerability, that does not mean a given application is exploitable. Right off the top, two of the three vulnerabilities are only exploitable in certain configurations. Those of us with battle scars from the Log4Shell incident remember that many organizations had only one or two applications out of a full suite that were directly impacted, and many of the cases where organizations were impacted were remediated by measures other than updating the library.
Another issue with this “fact” is that Spring and Log4j upgrades are a challenge, and most of these applications were built before microservices were really a thing. These applications are older, and the upgrade, along with the work that comes with it, may be far more costly than a different remediation method.
To be clear, that isn’t to say the code should not be upgraded. It is simply to call out that the context of that statistic is very important.
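If you want to see whether a vulnerable library is reaching you indirectly, the dependency tree is the place to look. Below is a minimal sketch, assuming a Maven project whose tree has already been exported with `mvn dependency:tree -DoutputFile=tree.txt`; the focus on log4j-core and the 2.17.1 version floor are illustrative, not a complete vulnerability check.

```python
import re
import sys

# Sketch: flag log4j-core entries in a Maven dependency tree and note whether
# each one is a direct or transitive (indirect) dependency.
# Assumes the tree was produced with: mvn dependency:tree -DoutputFile=tree.txt
VULNERABLE_BELOW = (2, 17, 1)  # illustrative floor for Log4Shell-era fixes

def parse_version(text):
    """Turn '2.14.1' into a comparable tuple, ignoring qualifiers."""
    return tuple(int(p) for p in re.findall(r"\d+", text)[:3])

def main(path="tree.txt"):
    findings = []
    with open(path) as fh:
        for line in fh:
            # Tree lines look like: "|  \- org.apache.logging.log4j:log4j-core:jar:2.14.1:compile"
            match = re.match(r"^([|\\+\- ]*)(\S+:\S+)", line)
            if not match:
                continue
            prefix, coords = match.groups()
            parts = coords.split(":")
            if len(parts) < 4 or parts[1] != "log4j-core":
                continue
            depth = len(prefix) // 3  # each tree level is three characters wide
            kind = "direct" if depth == 1 else "indirect"
            if parse_version(parts[3]) < VULNERABLE_BELOW:
                findings.append((kind, coords))
    for kind, coords in findings:
        print(f"{kind}: {coords}")
    return 1 if findings else 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "tree.txt"))
```

Build tools can also pin a patched transitive version directly (Maven's dependencyManagement, Gradle's resolution rules), which is often the faster remediation when a full framework upgrade is too costly.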
Fact #2: Attack Attempts from Automated Security Scanners are Mostly Unactionable Noise
The report found that a significant share of requests comes from automated scanning tools. These tools bombard software with attacks, but only 0.0065% of those attempts successfully trigger a vulnerability. The attacks are typically executed by pen testers, bug bounty hunters, red teamers, and attackers alike. The report's guidance is that a WAF and log monitoring can filter out most of these requests.
While this fact is handy to know, the report offers little detail on how to actually handle that noise.
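One practical starting point is simply measuring how much of your own traffic is scanner noise before deciding what to block or tune in the WAF. The sketch below assumes a combined-format access log; the file name and the user-agent substrings are placeholders for whatever signatures your environment actually sees.

```python
import re
from collections import Counter

# Sketch: estimate how much access-log traffic comes from automated scanners,
# using user-agent substrings as a rough heuristic. The substrings and the
# log path are illustrative, not an authoritative signature list.
SCANNER_HINTS = ("nuclei", "sqlmap", "nikto", "zgrab", "masscan", "nessus")

# Combined log format ends with: ... "referer" "user-agent"
LINE_RE = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$')

def summarize(path="access.log"):
    total, scanner_hits = 0, Counter()
    with open(path, errors="replace") as fh:
        for line in fh:
            match = LINE_RE.search(line.strip())
            if not match:
                continue
            total += 1
            agent = match.group("agent").lower()
            for hint in SCANNER_HINTS:
                if hint in agent:
                    scanner_hits[hint] += 1
                    break
    noisy = sum(scanner_hits.values())
    print(f"{noisy}/{total} requests look like scanner traffic")
    for hint, count in scanner_hits.most_common():
        print(f"  {hint}: {count}")

if __name__ == "__main__":
    summarize()
```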
Fact #3: Only a Small Portion of Identified Vulnerabilities are Worth Prioritizing
The report identifies that the average service is vulnerable to only 19 of the more than 5,000 critical and high CVEs that have been cataloged, and previous research showed only 5% of these have been exploited. The report also showed that when EPSS was factored in, the number of organizations with critical CVEs dropped by 63%, and 30% of organizations had their critical CVEs drop by more than 50%.
The assumption here is that all of those criticals and highs without known exploits aren't being exploited. There's also no analysis of medium and low severity CVEs that do have exploits.
This analysis can give teams a false sense of security: the belief that their system is safer than it really is and that improvements aren't needed because the risk is much lower.
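For what it's worth, the kind of filtering the report describes is easy to sketch, and just as easy to extend so exploited mediums and lows aren't silently dropped. Everything below, the findings, the EPSS scores, and the 0.1 threshold, is a made-up placeholder.

```python
from dataclasses import dataclass

# Sketch of exploit-aware prioritization: keep all criticals, plus anything
# in KEV or with a high EPSS score regardless of its severity label.
# All findings, scores, and the threshold are illustrative placeholders.
EPSS_THRESHOLD = 0.1

@dataclass
class Finding:
    cve: str
    severity: str   # "critical", "high", "medium", "low"
    epss: float     # probability of exploitation, per FIRST's EPSS model
    in_kev: bool    # listed in CISA's Known Exploited Vulnerabilities catalog

def prioritize(findings):
    keep = []
    for f in findings:
        exploit_signal = f.in_kev or f.epss >= EPSS_THRESHOLD
        # This keeps exploited mediums/lows, which a severity-only filter drops.
        if exploit_signal or f.severity == "critical":
            keep.append(f)
    return sorted(keep, key=lambda f: (not f.in_kev, -f.epss))

if __name__ == "__main__":
    sample = [
        Finding("CVE-0000-0001", "critical", 0.02, False),
        Finding("CVE-0000-0002", "medium", 0.45, True),   # exploited medium
        Finding("CVE-0000-0003", "high", 0.01, False),
    ]
    for f in prioritize(sample):
        print(f.cve, f.severity, f.epss, "KEV" if f.in_kev else "")
```

The point of the sort order is that anything in KEV floats to the top, regardless of what severity label it carries.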
Fact #4: Lightweight Container Images Lead to Fewer Vulnerabilities
The report notes that the larger the container image, the greater the likelihood of critical vulnerabilities, using image size as the yardstick. With 100 MB, 250 MB, and 500 MB as reference points, the report shows the average vulnerability count climbing from 4.4 to 42.2 to almost 80.
On the surface this makes sense. A larger image typically contains more services, and the more services there are, the greater the likelihood of vulnerabilities; the more vulnerabilities there are, the greater the likelihood that some of them are critical.
My issue is that the systems themselves are only one part of the equation. The other part of the equation is being able to maintain the images and keep them up to date. And the previous fact on prioritization acknowledges that not all vulnerabilities are created equal.
- If a larger image has 80 critical vulnerabilities but 0 are exploitable, does it matter? (See the sketch after this list.)
- If an organization lacks the skill set, resources, or understanding to deploy smaller containers, then is it hurting itself?
- Can the resources be used elsewhere with greater impact?
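A rough way to answer that first question is to cross-reference scanner output with an exploit list instead of stopping at severity counts. The sketch below assumes Docker and Trivy are installed; the image name and the tiny KEV set are placeholders.

```python
import json
import subprocess

# Sketch: for a given image, report its size alongside how many of its critical
# findings appear in an exploited-vulnerability list (e.g. CISA KEV).
# Assumes Docker and Trivy are installed; image name and KEV set are placeholders.
KEV_IDS = {"CVE-2021-44228"}  # in practice, load the full KEV catalog

def image_size_mb(image):
    out = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", image],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip()) / (1024 * 1024)

def critical_findings(image):
    out = subprocess.run(
        ["trivy", "image", "--format", "json", "--quiet", image],
        capture_output=True, text=True, check=True,
    )
    report = json.loads(out.stdout)
    criticals = []
    for result in report.get("Results", []):
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") == "CRITICAL":
                criticals.append(vuln["VulnerabilityID"])
    return criticals

if __name__ == "__main__":
    image = "registry.example.com/app:latest"  # placeholder
    criticals = critical_findings(image)
    exploited = [c for c in criticals if c in KEV_IDS]
    print(f"{image}: {image_size_mb(image):.0f} MB, "
          f"{len(criticals)} critical, {len(exploited)} in KEV")
```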
Fact #5: Adoption of Infrastructure as Code is High But Varies Across Cloud Providers
The report states that Infrastructure as Code (IaC) is a de facto standard for deploying cloud environments. It discusses the benefits, including the ability to peer review changes, limiting manual operations by humans, and the ability to scan the code for weaknesses. It also notes that IaC usage sits at 71% in Amazon Web Services but only 55% in Google Cloud, and that it was unable to provide data for Azure due to logging limitations.
What is missing here are the overall numbers. From a market share perspective, Google holds only 11% while Amazon holds 34%, so the raw counts behind those percentages are very different, especially when combined with the types of systems deployed on each platform.
The critical statistics missing from this fact are the cost of mistakes in IaC and the cost of finding people with IaC skills. Infrastructure as code is a significant skill set, and mistakes can be costly: a single variable defined with a typo can cost a company millions because of the number of resources it spins up.
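One cheap guardrail against that kind of mistake is to inspect the plan before it applies. Below is a minimal sketch, assuming the plan has been exported with `terraform show -json plan.out > plan.json`; the threshold of 50 new resources is an arbitrary placeholder.

```python
import json
import sys

# Sketch: fail a pipeline step if a Terraform plan creates a suspicious number
# of resources, a cheap guard against the "typo spins up a fleet" mistake.
# Assumes the plan was exported with: terraform show -json plan.out > plan.json
MAX_NEW_RESOURCES = 50  # arbitrary placeholder threshold

def count_creates(plan_path="plan.json"):
    with open(plan_path) as fh:
        plan = json.load(fh)
    return [
        change["address"]
        for change in plan.get("resource_changes", [])
        if "create" in change.get("change", {}).get("actions", [])
    ]

if __name__ == "__main__":
    creates = count_creates(sys.argv[1] if len(sys.argv) > 1 else "plan.json")
    print(f"plan creates {len(creates)} resources")
    if len(creates) > MAX_NEW_RESOURCES:
        print("refusing to proceed: resource count exceeds guardrail")
        sys.exit(1)
```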
Fact #6: Manual Cloud Deployments are Still Widespread
Upon reviewing CloudTrail (Amazon Web Services' audit logs), DataDog found that at least 38% of organizations used manual operations (ClickOps) to deploy workloads or take sensitive actions, including in production environments.
This seems like it might be high, but when you combine it with the previous infrastructure as code stats, it probably is not. It also lumps a bunch of operations together. It is not surprising that some organizations use ClickOps in development environments; some may find that a better approach than going straight to automation. There may also be certain actions for which the API does not expose its output in an easily accessible way.
Should this number be lower? Probably. But there is a lot of context missing there.
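If you want to know where your own organization falls, CloudTrail gives you enough to roughly approximate the measurement. The sketch below assumes boto3 credentials are configured and treats the console user agent as a signal for manual activity; the heuristic and the 7-day window are illustrative.

```python
import json
from datetime import datetime, timedelta, timezone

import boto3

# Sketch: approximate how many recent write (non-read-only) CloudTrail events
# came from the AWS console, as a rough proxy for ClickOps activity.
# Assumes boto3 credentials/region are configured; the user-agent heuristic
# and the 7-day window are illustrative.
LOOKBACK = timedelta(days=7)
CONSOLE_HINT = "console.amazonaws.com"

def console_write_counts():
    client = boto3.client("cloudtrail")
    end = datetime.now(timezone.utc)
    start = end - LOOKBACK
    total = console_writes = 0
    paginator = client.get_paginator("lookup_events")
    pages = paginator.paginate(
        LookupAttributes=[{"AttributeKey": "ReadOnly", "AttributeValue": "false"}],
        StartTime=start,
        EndTime=end,
    )
    for page in pages:
        for event in page["Events"]:
            total += 1
            detail = json.loads(event["CloudTrailEvent"])
            if CONSOLE_HINT in (detail.get("userAgent") or ""):
                console_writes += 1
    return console_writes, total

if __name__ == "__main__":
    writes, total = console_write_counts()
    print(f"{writes} of {total} write events in the last 7 days came from the console")
```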
Fact #7: Usage of Short-Lived Credentials in CI/CD Pipelines is Still Too Low
The report identified that only 37% of organizations use short-lived credentials, such as those issued through OIDC. The data showed that 63% used IAM users (Amazon's Identity and Access Management) and 42% used IAM users exclusively. The report notes that using long-lived users in the CI/CD pipeline can lead to leaked credentials and potential data breaches through the pipeline.
Automating credentials in the CI/CD pipeline is a must. Tying the pipeline to long-lived users creates difficulty with password rotation and key management. Moving to short-lived credentials does require having the skills to support it, but it is mostly a matter of setting up the documentation and infrastructure once; after that, there is very little to update and maintain.
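For AWS specifically, the mechanics come down to exchanging the CI provider's OIDC token for short-lived STS credentials rather than storing an IAM user's keys. A minimal sketch, assuming the CI system exposes its OIDC token in an environment variable (the variable name and role ARN are placeholders) and that the role already trusts the provider's OIDC issuer:

```python
import os

import boto3

# Sketch: exchange a CI provider's OIDC token for short-lived AWS credentials
# via STS, instead of storing long-lived IAM user keys in the pipeline.
# The env var name and role ARN are placeholders; the role's trust policy is
# assumed to already trust the CI provider's OIDC issuer.
ROLE_ARN = "arn:aws:iam::123456789012:role/ci-deploy"  # placeholder

def short_lived_session():
    token = os.environ["CI_OIDC_TOKEN"]  # placeholder: injected by the CI system
    sts = boto3.client("sts")
    resp = sts.assume_role_with_web_identity(
        RoleArn=ROLE_ARN,
        RoleSessionName="ci-deploy",
        WebIdentityToken=token,
        DurationSeconds=3600,  # credentials expire on their own; nothing to rotate
    )
    creds = resp["Credentials"]
    print("credentials expire at", creds["Expiration"])
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

if __name__ == "__main__":
    session = short_lived_session()
    print("assumed", session.client("sts").get_caller_identity()["Arn"])
```

Because the credentials expire on their own, there is nothing in the pipeline to rotate or revoke when someone leaves.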
The DataDog report was supposed to describe the state of DevSecOps. Instead, it offers a somewhat random set of stats in support of a vague conclusion that the stats don't necessarily back up. There's no relationship identified between automation and breaches, and none between image size and breaches.
The dog that isn't barking here is not that the stats are incorrect, nor that the stats aren't valid. The dog that isn't barking is that automation and prioritization are only part of the problem. These things require skill and resources to implement and manage, and they require understanding business impact.
However, perhaps most importantly, they require communication, experience, and an overarching security program to manage them.