By clicking “Accept”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage, and assist in our marketing efforts. View our Privacy Policy for more information.
18px_cookie
e-remove

Why Reachability Analysis for JavaScript Is Hard (and How We Fixed It)

JavaScript reachability is tricky for SCA tools because of how JavaScript approaches dependency resolution, dependency imports, and functions.

JavaScript reachability is tricky for SCA tools because of how JavaScript approaches dependency resolution, dependency imports, and functions.

JavaScript reachability is tricky for SCA tools because of how JavaScript approaches dependency resolution, dependency imports, and functions.

Written by
A photo of Danny Kim — technical staff at Endor Labs.
Danny Kim
Published on
December 17, 2024

JavaScript reachability is tricky for SCA tools because of how JavaScript approaches dependency resolution, dependency imports, and functions.

JavaScript reachability is tricky for SCA tools because of how JavaScript approaches dependency resolution, dependency imports, and functions.

Why is reachability important?

Software Composition Analysis (SCA) is a method of identifying vulnerabilities in 3rd-party dependencies. Traditional SCA relies solely on manifest files, files that list the dependencies for a project, which can result in a high number of false positives and false negatives. The primary issue with traditional SCA is that without the analysis of the project’s code, it is impossible to determine which portions of the 3rd-party dependencies are being used.

Endor Labs takes a unique approach where call graphs are used to determine the reachability of the vulnerabilities found in 3rd-party dependencies. Endor Labs performs static analysis on the 1st-party code to generate a call graph. A call graph is a graph where the nodes represent functions and the edges represent calls between functions. Using this graph, Endor Labs is capable of determining which vulnerable functions are being used by the 1st-party code, which helps reduce the noise of traditional SCA by up to 92%.

Why is generating a call graph for JavaScript difficult?

JavaScript presents unique challenges when it comes to generating call graphs. Through the development of Endor Labs’ JavaScript call graph generator, Endor Labs has made some observations that help explain why generating accurate JavaScript call graphs is so difficult.

Dependency resolution logic

NodeJS, a popular JavaScript runtime platform, uses a unique dependency resolution approach that allows multiple versions of the same dependency. When the dependencies are downloaded for a project, NodeJS places them in the project’s root directory under the node_modules directory. The catch is that there can be nested node_modules directories within the root’s node_modules directory, allowing multiple versions of the same dependency. 

Let’s take an example project to illustrate how dependencies are resolved. Assume we have a project ‘test_project’ that has 3 package dependencies: foo@v1.0, bar@v1.0, and coo@v1.0.

When npm install is run in the test_project directory, NodeJS creates a node_modules directory which contains the dependencies. 

Let’s assume the foo dependency relies on bar@v1.1 and coo@v1.0.

When npm install ran, NodeJS created a nested node_modules directory inside of the test_project/node_modules/foo directory to install the dependency bar@v1.1.

NodeJS does this to allow for the same project to rely on multiple versions of the same dependency. With this approach, foo@v1.0 can use bar@v1.1 while the test_project can use bar@1.0.

NodeJS performs dependency resolution with the following logic:

  1. Starting from the source file’s root directory, look for a node_modules directory. If present, look in the node_modules directory for a sub-directory that matches the imported dependency name. 
  2. Repeat step 1, moving to the parent directory each time after a search is not successful until the root directory of the system is reached.

Let’s look at an example of NodeJS’s dependency resolution logic for package foo.

  1. When foo imports bar@v1.1, NodeJS first looks in the test_project/node_modules/foo directory for a sub node_modules directory.
  2. Since it is found, NodeJS then looks in the test_project/node_modules/foo/node_modules directory for a directory called bar. Since it is found, the source code located in test_project/node_modules/foo/node_modules/bar is used for the bar@v1.1 dependency.
  3. When foo imports coo@v1.0, NodeJS looks in the test_project/node_modules/foo/node_modules directory (as explained in step 1) for a directory called coo.
  4. Since there is not a test_project/node_modules/foo/node_modules/coo directory, NodeJS moves up one directory from test_project/node_modules/foo to test_project/node_modules to look for a directory coo. Since it is found, the source code located in test_project/node_modules/coo is used for the coo@v1.0 dependency.

While NodeJS’s approach for dependency resolution enables multiple versions of the same dependency, it does make dependency resolution difficult. Determining the exact version of a dependency that the code is using is paramount to an accurate call graph. This is also why phantom dependencies can exist in JavaScript projects.

Module-level code and dynamic imports

In most other programming languages, when a dependency is imported none of its code is executed until a method or a function from that dependency is executed. In JavaScript, this is not the case. As soon as a dependency is imported, its module-level code is executed immediately. Module-level code is code that does not have a parent function.

Let’s look at an example. Let’s say the dependency foo imports dependency bar.

Let’s say bar’s index.js file (typically the default main file for any JavaScript package) had the following:

As soon as the code const b = require(‘bar’) executes, the console prints “Hello world!” since this code in bar does not have any parent function nor is it located in a module export.

In addition to module-level code, JavaScript also has a concept of module exports. Module exports allow for functions to be exported so that they are externally available. Dependencies can also just export one main function, which can then be called like:

To add to this, dependencies can be imported dynamically in functions. This can result in situations where a phantom dependency is found to be used in source, but is deemed unreachable because the parent function is not called. Lastly, the dependency name that is imported can also be a runtime value that is impossible to know statically.

Anonymous functions + function pointers

JavaScript allows users to declare functions without names, also known as anonymous functions.

Here the function is directly assigned to a variable. The function itself does not have a name. Anonymous functions present several challenges when it comes to generating accurate call graphs. Because the function itself does not have a name, generating accurate vulnerability information for vulnerable anonymous JavaScript functions is difficult. Because they do not have a name, alternative mapping schemes have to be generated such as looking at the line number and offset of the anonymous function declaration. Even this approach presents its own challenge since the line numbers of offsets (oftentimes just white space) can change between dependency versions.

Anonymous functions are also commonly used as arguments into other functions as function pointers. Function pointers are not unique to JavaScript, but do present issues when trying to accurately generate a call graph. This is because tracking function pointers statically is complex and requires a level of data flow tracking. Performing data flow tracking statically is a notoriously difficult problem, which leads to some missed edges in static call graphs compared to dynamic ones.

JavaScript reachability analysis with Endor Labs

Let’s take a look at an example JavaScript project and how Endor Labs’ reachability analysis identifies exploitable vulnerabilities. In this example JavaScript project, there are 21 direct dependencies:

Although there are only 21 direct dependencies for this project, when npm install is run the resulting dependency tree includes 236 transitive dependencies, for a total of 257 total dependencies! This type of dependency sprawl is common amongst JavaScript projects and results in a higher-than-expected number of vulnerabilities. In this example project, there are 124 findings that include different forms of open source risks, including vulnerabilities, license, and operational risks. In total Endor Labs evaluates more than 150 factors for security and quality. 

124 findings is a lot of things to fix, so you need a way to prioritize what matters. With just function-level reachability as a filter you can reduce that number to 11 reachable vulnerabilities, a reduction of roughly 91%.

And if we add additional filters of EPSS (1%), fix available, and in production, we end up with just 2 findings! This focus helps the AppSec team prioritize the most pressing issues and reduces the workload on developers. With reachability, organizations are able to streamline their vulnerability management workflows and address critical open source vulnerabilities efficiently.

Try JavaScript reachability with Endor Labs Today!

JavaScript reachability was introduced in 2023, and is one of several modern languages we support. Book a 20-minute demo to learn how Endor Labs turns your vulnerability prioritization workflows dreams into a reality.

The Challenge

The Solution

The Impact

Book a Demo

Book a Demo

Book a Demo

Welcome to the resistance
Oops! Something went wrong while submitting the form.

Book a Demo

Book a Demo

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Book a Demo