5 Types of Reachability Analysis (and Which is Right for You)
Explore the five key categories of reachability and their practical applications in AppSec and development. Learn the differences between SCA and container scanning, and understand how various tools like Function-Level Reachability, Package Baselining, and Internet Reachability play crucial roles in identifying and prioritizing security risks.
Explore the five key categories of reachability and their practical applications in AppSec and development. Learn the differences between SCA and container scanning, and understand how various tools like Function-Level Reachability, Package Baselining, and Internet Reachability play crucial roles in identifying and prioritizing security risks.
Explore the five key categories of reachability and their practical applications in AppSec and development. Learn the differences between SCA and container scanning, and understand how various tools like Function-Level Reachability, Package Baselining, and Internet Reachability play crucial roles in identifying and prioritizing security risks.
Explore the five key categories of reachability and their practical applications in AppSec and development. Learn the differences between SCA and container scanning, and understand how various tools like Function-Level Reachability, Package Baselining, and Internet Reachability play crucial roles in identifying and prioritizing security risks.
Explore the five key categories of reachability and their practical applications in AppSec and development. Learn the differences between SCA and container scanning, and understand how various tools like Function-Level Reachability, Package Baselining, and Internet Reachability play crucial roles in identifying and prioritizing security risks.
CSPM! ASPM! CNAPP! CWPP! SCA! Supply Chain! Code to Cloud! If you know all of those buzzwords, I extend my apologies for your endless barrage of marketing content. As the VC-backed alphabet soup continues to pour forth, the latest phrase for which as many definitions exist as there are vendors is “reachability analysis.”
As part of building Latio Tech, I’ve met with countless vendors selling “reachability analysis.” Over that time, I’ve learned that there are five categories of reachability. Some tools do multiple categories, but none of them do them all. In this blog, I’ll provide an overview of each category and when each can be useful.
SCA vs Container Scanning
Before getting into the different types of reachability, it’s helpful to differentiate between software composition analysis (SCA) and container scanning because vendors use “reachability” for both types of scanning. These are distinct but commonly-confused tools.
SCA
SCA is a tool that analyzes open source code (from other repositories, which are most commonly public but can also be internal dependencies) that have been imported by a developer and used in applications. Here’s an example from an arbitrary public repository where the author is importing a ton of third-party code. The purpose of an SCA tool is to create a Software Bill of Materials (SBOM), primarily in order to surface potential vulnerabilities and risks with those OSS packages.
Container Scanning
Container vulnerabilities affect libraries that are installed onto the operating system. Here’s an example of a container built from the same repository. In this example, the author started with a common public Linux image and installed a bunch of stuff onto it. Similar to an SCA tool, the container scanner is looking for vulnerabilities in those imported images.
Where Things Get Confusing
The confusion between these two tools comes from this line of code: `RUN npm run build`, where the author is building their JavaScript app on the image before publishing it for use. Because of this, some third-party dependencies would show up in both a container and SCA scans. However, most dependencies instead just add a bunch of extra code into your project and only show up in an SCA scan.
It's best practice to have distinct scanning against both the docker container and third-party libraries because neither type of tool guarantees total visibility, and different teams within organizations typically care about images versus dependencies. Some vendors do both, while others choose to focus purely on one.What they have in common is the usage of terms “supply chain security” and “reachability analysis.”
Types of Reachability Ordered by Usefulness
Reachability has become a must-have feature for AppSec and developer teams because it helps separate exploitable risks from false positives. The type of reachability that’s relevant to you depends on how much time is spent manually researching risks and where those risks originate. Developers also care about improving security and know that when the mountain of risks is too high, almost nothing ends up getting fixed. So in addition to saving time, implementing the right kind of reachability will increase your rate of remediation.
- Function-level reachability
- Package baselining
- Internet reachability
- Dependency-level reachability
- Package used in image
Function-Level Reachability (SCA)
Function-level reachability was revolutionary: if you didn’t have it, you became immediately irrelevant. It’s a trustworthy way to instantly categorize a slough of false positives, so the time-saving potential is huge. Here’s an arbitrary example of why this functionality is critical in a tool but difficult for providers to build: CVE-2014-6071. This CVE is the kind of vulnerability that makes your head spin as an AppSec engineer. In brief, it’s a jQuery XSS vulnerability that is only exploitable when using an anti-pattern that almost no one does. On the one hand, there’s a published exploit, and JQuery was patched to fix it, so filtering by “exploit” or “fixable” would not be enough; but on the other hand, it requires using `text()` inside of `after()` which almost no one does. Here’s an example of what that might look like:
This kind of code would almost never actually happen because a developer typically wouldn’t use jQuery to receive user input and then render it back out to the DOM. However, as a security engineer, you’re really forced to treat this like any other issue because maybe your devs implemented the anti-pattern.
This is where function reachability is a game changer: An SCA tool knows that this kind of exploit requires the `text()` function to be used inside of `after()`. It actually scans your code to see if the library is being used in this way and if it’s really good at reachability analysis, it even traces entire application contexts to see if this happening outside of simple uses. With function-level reachability, if jQuery is not used in this way, then the CVE is a false positive that you don’t need to send for remediation.
Not many vendors do real function reachability because it’s really really hard to do. Some are trying via LLM’s reading CVE descriptors, but unfortunately CVEs often don’t contain this data. Others are trying via regex lookups, but there’s not enough code context to catch realistic use cases. While this isn’t an ad for Endor Labs, they’re the only vendor I’ve seen reliably catch this scenario.
The con of function level reachability is that more CVEs than you’d expect either don’t contain function-level data or rely on more general application context to know if an exploit is possible. For example, CVE–2022-23307 is against particular functions within Log4j 1.2.x, but only when used with Apache Chainsaw - something that reachability alone would never know. This is why Endor Labs is manually mapping all CVEs on the vulnerable functions, in all versions affected by the CVE.
When researching SCA products, you should always do a PoC. Never trust a demo account by itself — it’s just too easy to cherry-pick vulnerabilities that fit the story a little too perfectly. Some things to watch out for include:
- Does reachability stop at the class level while others go as deep as they can to the actual vulnerable function and context of its usage? While lighter functionality is useful for removing unused third-party code from your app, it does little to tell you if you’re actually vulnerable or not.
- Does reachability apply to direct and transitive dependencies, or just direct dependencies? There can be several (even dozens) of transitive dependencies for each direct, so the efficacy of the reachability will be limited if it only does direct dependencies.
- How far back does the database go for those reachable functions? SCA vendors correlate dependencies with a proprietary database that is different from the standard database of vulnerabilities and other information. Reachability that goes back to 2018 vs. to reachability back to 2023 can make a big difference!
- How precise is the reachability analysis? Does the analysis descend into call chains spanning direct and transitive dependencies? An imprecise analysis can make developer’s lifes miserable by bombarding them with FPs while it may miss some cases (FNs) giving a false sense of security.
Package Baselining (SCA and Container)
Package baselining works as a prioritization method for both SCA and containers by looking at the normal actions of a third-party package, and notifying or blocking when those actions change. An intuitive example is that Log4J should only be calling file systems, not executing code or making network calls. This is a step beyond detecting that a dependency is in use because it shows that the package is being used and also what it’s doing.
Package baselining is groundbreaking in what it’s doing, but it’s hard to tell its usefulness as a prioritization method as opposed to a runtime blocking mechanism for third party dependencies.
I’m also skeptical about this type of reachability (by itself) to provide the silver bullet for prioritization. For example, CVE-2023-22809 can be exploited if a user has sudo capabilities to any file - but whether or not sudo is “running” doesn’t matter. Marking this CVE as a false positive would require more reliance on user context analysis than any “is it running and what is it doing” type of analysis.
Ultimately, I’m more excited about this technology as a runtime protection tool than a reachability analysis and prioritization methodology. Runtime detection of what a package is doing is a better fit for container vulnerability prioritization than SCA prioritization, because it’s so rare that you’d have a dependency imported in code, called by the code, but not actually running anything; conversely, it’s common practice to implement bloated Docker images that aren't using half their packages. Blocking malicious package actions is a very cool security feature, but the value from an SCA prioritization perspective isn’t clear to me. For example, if I’m importing and calling Log4j, I’m probably also actually using it. Its runtime behavior doesn’t help me prioritize it more.
Internet Reachability (SCA and Container)
Internet reachability is probably the hottest and most misunderstood aspect of vulnerability prioritization. Most security teams rightly think that if an application is facing the internet, its vulnerabilities should be most highly-prioritized. However, things are rarely so simple. At its core, this type of reachability is asking “Which assets should I patch first?”
Take the sudo vulnerability from earlier: Should I fix that on my load balancer just because it’s internet-facing? My Kafka cluster because it’s the first hop from the load balancer? The Java service reading those messages? Or the orchestration platform for the Kafka service?
The real answer has nothing to do with how many hops from the internet the service is, it’s wherever I have over-permissioned user or application contexts with limited sudo permissions. Same with Log4J vulnerabilities - I should fix that wherever logs are being processed by the package, not arbitrarily wherever the package exists if it’s closer to the internet.
Overall, this type of prioritization is a better pitch than reality, especially from the major CNAPP providers who really only use it in demos. It’s useful in the context of understanding your own application, but most CVEs should not be patched in order of “How close to the internet are they?” Instead, significantly more application context is needed than internet reachability to have any hope of reducing vulnerabilities based solely on their runtime environment.
That’s not to say the functionality is useless. There are some vulnerabilities where this is a good prioritization method, especially for monolithic services that may be directly accessible from the web. These are certainly higher-risk assets, but they’re not automatically more vulnerable to every CVE. It’s also a useful filter once you already know how the vulnerability is exploitable, for example, if I knew that user interaction was required within an application context for the exploit, then internet facing would certainly be a valid filter.
Dependency-Level Reachability (SCA)
Early in the market, a lot of providers realized they needed an equivalent to reachability analysis and simply looked for the imported package being used anywhere in the app. This is a coarse grain lens (many steps higher than function-level) that is better than no reachability. But the scanner isn’t able to fully discern the package’s utilization at this level, so you’re still left with manual research of “potentially reachable” vulnerabilities. Looking through recent updates in the market, providers are less likely to fudge this level of reachability analysis than they used to be.
While not as detailed or robust as function-level reachability, dependency-level reachability can be used as a good indicator for prioritization. If you're not actually using the dependency at all, then removing that dependency could be a consideration. Determining whether or not a dependency is being called or used is another layer of prioritization you can add to your remediation process.
Package Used in Image (Container)
Alongside internet reachability, the other major prioritization method touted by most CNAPP providers is checking if the package is used by the image. For the same reasons discussed in the Package Baselining section, this method alone is ineffective for prioritization. This data is useful in the context of application delivery vulnerabilities, such as if Django has a direct vulnerability, but not for most Linux packages that can be easily called into usage by an attacker if they’re installed.
However, this functionality is becoming more useful when combined with other contexts, like publicly accessible and other configuration vulnerabilities. The real prioritization value is when combined with the user interaction score from CVSS. While not totally foolproof, typically if user interaction is required for the exploit, then the package needs to be running in order to be exploited. A cool use case for this functionality is ripping the dependency out of the image if it’s not used, such as what RapidFort and Slim are doing - in this case, you would actually make yourself protected.
Concluding Thoughts
A lot of the vulnerability overload problem exists because it’s incredibly difficult to overcome the gap between being 95% sure something doesn’t apply and 100%-sure. No one wants to be on the hook for a breach they miscategorized.
Function-Level reachability is the closest we can get to certainty of a false positive and therefore provides tremendous value. I’m excited about the future of Package Baselining as a runtime protection mechanism because it offers hope in a world where no matter what, you’re going to have vulnerabilities in your system. The other types of “reachability analysis” I view as marketing more than anything because it’s essentially one extra filter in a dashboard. You’re now equipped to navigate the market and discover what a vendor really means when they say “reachability analysis!”
References
See also: https://www.endorlabs.com/blog/what-is-reachability-based-dependency-analysis