NST Cyber - Blogs - Can EPSS Address Today's Vulnerability Management Challenges Effectively?

NST Research Team

Vulnerability management, the decades-old task of finding, ranking, and fixing software vulnerabilities, gets harder every year. MITRE reported 25,068 new vulnerabilities in 2022 alone, a 24.3% increase over 2021, so practitioners must contend with sheer volume on top of the complexity of remediation. The Common Vulnerability Scoring System (CVSS) has long been the industry standard for vulnerability prioritization, but it has limitations. The most notable is that its Base metric group cannot take post-disclosure developments into account, such as the release of new exploits. Here are some examples of how these constraints can affect vulnerability prioritization:

If no exploit code is available, a high CVSS score vulnerability may be less serious than it appears.

A vulnerability with a low CVSS score may be more harmful than it appears if a high-quality exploit code is available.

A vulnerability with a high CVSS score may become less hazardous over time if it is patched and the exploit code is no longer available.

The Exploit Prediction Scoring System (EPSS) was created in response to the demand for a more adaptable approach that better estimates the likelihood of exploitation in the wild. The EPSS Special Interest Group (SIG) was formed in 2020 under the auspices of the Forum of Incident Response and Security Teams (FIRST) to refine a scoring model that addresses practitioner needs and incorporates up-to-date information after a vulnerability is disclosed. Their joint efforts produced a public, more thorough scoring system with widely available scores, opening the door to more nuanced and responsive vulnerability prioritization.

In this article, we will trace the evolution of the Exploit Prediction Scoring System (EPSS) from version 1 to version 3 from a capability and feature standpoint. We will look into how this vulnerability ranking standard has grown and improved its risk prediction methodologies.

1. EPSS ML models

EPSS Version 1 used a logistic regression model for scoring vulnerabilities. Logistic regression is a type of machine learning model that is commonly used for classification tasks. It works by predicting the probability of a given input belonging to a particular class.
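As a rough illustration of how a logistic regression turns inputs into a probability, the sketch below computes a weighted sum of the features and passes it through the logistic function. The weights, bias, and three binary features are hypothetical and chosen only for the example; they are not the coefficients of any EPSS model.

```python
import math


def logistic_probability(features, weights, bias):
    """Return the logistic-regression probability for one feature vector."""
    # Weighted sum of the inputs (the log-odds), squashed into (0, 1).
    log_odds = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical weights for three made-up binary features.
# These are NOT the coefficients of any EPSS model.
WEIGHTS = [2.0, 1.0, 0.5]
BIAS = -4.0

no_signals = logistic_probability([0, 0, 0], WEIGHTS, BIAS)
all_signals = logistic_probability([1, 1, 1], WEIGHTS, BIAS)
print(f"no signals: {no_signals:.3f}, all signals: {all_signals:.3f}")
```

Each feature contributes independently to the log-odds, which keeps the model easy to interpret but means it cannot represent interactions between features unless they are added by hand.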

EPSS Version 2 transitioned to XGBoost for improved predictive power. XGBoost is a gradient-boosted decision tree library that supports both classification and regression tasks and is known for its accuracy and efficiency.

EPSS Version 3 continued to use XGBoost for vulnerability scoring. This suggests that the approach performed well in Version 2 and that the EPSS developers remain confident in its ability to predict which vulnerabilities will be exploited.

Comparison of ML models used in EPSS versions 1, 2, and 3:

The XGBoost models used in EPSS versions 2 and 3 offer better predictive power than the logistic regression model used in Version 1. This is likely because gradient-boosted trees can capture non-linear relationships and interactions between features that a logistic regression cannot represent without manual feature engineering.

2. EPSS Variables or Features

EPSS variables, or features, are the inputs to the EPSS ML models. They give the models the information needed to assess the risk that a vulnerability will be exploited. In general, a richer set of relevant features gives the model more signal to work with, although more features do not guarantee a more accurate assessment on their own.

EPSS Version 1 used a feature set of 16 independent variables extracted at the time of vulnerability disclosure. These included factors such as the affected vendor, the number of published references, descriptive tags (for example, remote code execution), and the availability of exploit code.

EPSS Version 2 expanded the feature set to 1,164 variables. This included additional factors such as the complexity of the vulnerability, the popularity of the affected software, and the presence of public exploit kits.

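EPSS Version 3 further expanded the feature set to 1,477 variables. These are drawn from the data source categories listed in section 4, including exploitation activity observed in the wild, publicly available exploit code, CVE mentions on lists and websites, social media, and offensive security tools and scanners.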

By taking a greater number of factors into account, EPSS Version 3 can provide a more accurate assessment of the risk that a vulnerability will be exploited.
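To make the idea of a feature set concrete, here is a minimal sketch of how one vulnerability record might be turned into a numeric vector that a model can consume. The record fields, vendor names, and tag list are hypothetical and far smaller than any real EPSS feature set; the actual EPSS encoding is not described in this article.

```python
def encode_features(vuln, vendors, tags):
    """Turn one vulnerability record into a flat numeric feature vector."""
    vector = []
    # One-hot encode the vendor: 1 for the matching vendor, 0 otherwise.
    vector.extend(1 if vuln["vendor"] == v else 0 for v in vendors)
    # One binary flag per descriptive tag.
    vector.extend(1 if t in vuln["tags"] else 0 for t in tags)
    # Boolean and numeric inputs are appended directly.
    vector.append(1 if vuln["exploit_code_public"] else 0)
    vector.append(vuln["age_days"])
    return vector


# Hypothetical vocabularies and record, for illustration only.
VENDORS = ["vendor_a", "vendor_b"]
TAGS = ["remote", "code execution", "denial of service"]
example = {
    "vendor": "vendor_b",
    "tags": {"remote", "code execution"},
    "exploit_code_public": True,
    "age_days": 12,
}

vector = encode_features(example, VENDORS, TAGS)
print(vector)
```

Every additional vendor, tag, or data source adds columns to this vector, which is one way a feature set can grow from a handful of variables to more than a thousand.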

3. EPSS Data Collection

EPSS Version 1 relied on a manual data collection process, which was time-consuming and error-prone. Version 2 adopted a centralized and automated data collection process, which improved efficiency and accuracy. Version 3 further expanded data sources by partnering with multiple organizations, providing a more comprehensive view of vulnerability exploit activity.

EPSS Version 3 therefore has the most advanced data collection process of the three, with gains in efficiency, accuracy, and coverage.

4. Data sources used in EPSS

EPSS uses a variety of data sources to assess the likelihood of a vulnerability being exploited in the wild. These data sources include:

Exploitation activity in the wild: This data is collected from a variety of sources, such as honeypots, intrusion detection systems, and malware analysis platforms.
Publicly available exploit code: This data is collected from exploit repositories and vulnerability disclosure platforms.
CVE mentions on lists or websites: This data is collected from vulnerability databases, security blogs, and other online sources.
Social media: This data is collected from Twitter and other social media platforms.
Offensive security tools and scanners: This data is collected from security tools used by penetration testers and other security professionals.
References with labels: This data is collected from the MITRE CVE List and the National Vulnerability Database (NVD). The labels categorize each reference attached to a CVE, for example as a patch, a vendor advisory, or an exploit.
Keyword description of vulnerability: This data is extracted from the MITRE CVE List and the NVD. It includes the vulnerability name, description, and affected products.
CVSS metrics: This data is collected from the NVD. It consists of the individual metrics in the CVSS base vector, such as attack vector and attack complexity.
CWE: This data is collected from the NVD. It includes the Common Weakness Enumeration (CWE) identifier for the vulnerability.
Vendor labels: This data is collected from the NVD. It includes labels from vendors such as Microsoft, Apple, and Linux.
Age of the vulnerability: This data is collected from the NVD. It indicates the number of days since the vulnerability was first published.

EPSS uses this data to train the machine learning model to predict the likelihood of a vulnerability being exploited in the wild. The model is updated regularly with new data, so it can provide up-to-date insights into the exploit landscape.
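Because the scores change as the model is refreshed, consumers usually read them programmatically rather than copying them by hand. The sketch below parses a JSON payload into a lookup table. It assumes a response shape like the one FIRST documents for its public EPSS API (a `data` array whose rows carry `cve`, `epss`, `percentile`, and `date`, with the scores as strings); the CVE IDs and values in the sample are invented, and no network call is made.

```python
import json

# Invented sample payload; the shape is an assumption based on the
# public FIRST EPSS API, and the CVE IDs and scores are not real.
SAMPLE_RESPONSE = """
{
  "status": "OK",
  "data": [
    {"cve": "CVE-2099-0001", "epss": "0.912340000",
     "percentile": "0.990000000", "date": "2099-01-01"},
    {"cve": "CVE-2099-0002", "epss": "0.000430000",
     "percentile": "0.080000000", "date": "2099-01-01"}
  ]
}
"""


def parse_epss_scores(payload):
    """Map each CVE ID to (probability, percentile) as floats."""
    body = json.loads(payload)
    scores = {}
    for row in body.get("data", []):
        # Scores are strings in this assumed shape, so convert them.
        scores[row["cve"]] = (float(row["epss"]), float(row["percentile"]))
    return scores


scores = parse_epss_scores(SAMPLE_RESPONSE)
for cve, (probability, percentile) in sorted(scores.items()):
    print(cve, probability, percentile)
```

The probability is the model's estimate of exploitation, while the percentile ranks that estimate against all other scored vulnerabilities, so it helps to keep both values.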

5. Data Sources and Variables

EPSS uses a comprehensive set of data sources to assess the likelihood of a vulnerability being exploited in the wild. This data helps organizations to prioritize their patching efforts and reduce their risk of being attacked.

6. Prediction Window

Version 1 of EPSS predicted exploitation activity within the first year following the publication of a vulnerability. This longer-term window gave a view of vulnerability risk over an extended period, but the score was computed from information available at disclosure and did not reflect later developments. In Version 2, the prediction window was reduced to 30 days from the time of scoring. This adjustment aligned with the typical remediation cycle for practitioners, making the score more practical for prioritization decisions and timely action.

Similarly, Version 3 continued with a 30-day prediction window as of scoring. This shorter time frame was considered relevant for organizations to take quick and effective actions in response to newly discovered vulnerabilities.

7. Performance

The first version of EPSS outperformed the Common Vulnerability Scoring System (CVSS) at predicting exploitation, but its small feature set and manual data collection hindered practical adoption. These limitations prompted the development of subsequent versions.

In Version 2, EPSS adopted a centralized architecture and expanded its feature set, resulting in a significant improvement in predictive performance. This version captured higher-order interactions and provided more accurate predictions, making it a valuable tool for vulnerability prioritization.

Version 3 of EPSS focused on precision, aiming to identify vulnerabilities likely to be exploited in the wild more effectively. Through the incorporation of additional data sources and rigorous model fine-tuning, it achieved an impressive 82% improvement in classifier performance over Version 2. This substantial boost in prediction performance equips organizations with enhanced capabilities for prioritization practices and data-driven patching strategies.
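One common way to judge a scoring model for prioritization is to pick a threshold, remediate everything scored at or above it, and measure precision (how much of the remediated set was actually exploited) and recall (how much of the exploited set was remediated). The sketch below does this on a toy data set. The identifiers, scores, and exploitation labels are invented for illustration and say nothing about real EPSS performance.

```python
def precision_recall(scores, exploited, threshold):
    """Evaluate the strategy 'remediate everything scored >= threshold'."""
    flagged = {vuln for vuln, score in scores.items() if score >= threshold}
    true_positives = len(flagged & exploited)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(exploited) if exploited else 0.0
    return precision, recall


# Invented scores and ground truth, for illustration only.
SCORES = {
    "vuln-a": 0.90,
    "vuln-b": 0.40,
    "vuln-c": 0.15,
    "vuln-d": 0.02,
    "vuln-e": 0.01,
}
EXPLOITED = {"vuln-a", "vuln-c"}

precision, recall = precision_recall(SCORES, EXPLOITED, threshold=0.10)
strict_precision, strict_recall = precision_recall(SCORES, EXPLOITED, threshold=0.50)
print(f"threshold 0.10: precision={precision:.2f}, recall={recall:.2f}")
print(f"threshold 0.50: precision={strict_precision:.2f}, recall={strict_recall:.2f}")
```

Raising the threshold shrinks the remediation workload and tends to raise precision at the cost of recall, so the right cut-off depends on how much remediation capacity an organization has.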

The Exploit Prediction Scoring System (EPSS) has emerged as a vital tool in the ever-evolving landscape of vulnerability management. As software vulnerabilities continue to grow in number and complexity, EPSS has responded with adaptability and innovation. From its initial logistic regression model to the XGBoost-based approach in its latest version, EPSS has steadily improved its predictive power. With an expanding feature set, refined data collection processes, and a focus on up-to-date information, EPSS helps organizations make more accurate risk assessments and prioritize their actions effectively. This journey from its inception to its current state reflects the importance of staying at the forefront of vulnerability management to safeguard digital ecosystems against emerging threats.

The NST Assure Continuous Threat Exposure Management platform uses EPSS to improve the precision and efficacy of its threat management services, enabling clients to make data-driven decisions with confidence.

NST Assure CTEM (Continuous Threat Exposure Management) is an AI and ML-powered enterprise platform for cyber risk detection and management. It continually identifies exposed assets and vulnerabilities, utilizing machine learning to detect potential attack patterns. When threats arise, NST Assure CTEM initiates security assessments for validation and response.

NST Assure Continuous Threat Exposure Management is not just a toolkit but a full-service solution. It combines Machine Learning and Expert Penetration testers in four steps: Discover and analyze, Contextualize and prioritize, Validate, and Mitigate risks that could lead to data breaches or reputation damage if neglected.