Methodology behind project patchwindow
A breakdown of what’s in scope, what data sources are used, how dates are sourced, how confidence is assigned, and where the data has limits. If you find an error or have an idea to improve this, please reach out (LinkedIn) — this dataset is meant to be corrected and improved over time.
Purpose
patchwindow’s purpose is to provide a data-driven, defensible, and auditable measurement of how vulnerability exploitation has changed over time, and of how the current threat landscape compares. It aims to separate hype from reality and to challenge commonly accepted narratives, such as the claims that mean time-to-exploit has been trending downwards for years or is rapidly declining due to AI-enabled exploitation.
It is also intended as a practical data source for patch management programs, and as a guide for threat intelligence briefings to senior stakeholders and technical teams alike. The current form is phase 1 of the project; phase 2 is planned to include additional KEV sources and more granular filtering and data analysis.
Scope
The dataset contains every CVE in CISA's Known Exploited Vulnerabilities (KEV) catalog
with a CVE identifier of CVE-2018-* or later. Pre-2018 entries are
excluded because data reliability for older incidents degrades significantly — many
first-exploitation dates for early-2010s CVEs are not credibly documented in public
sources.
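The scope rule can be sketched as a small filter on the CVE identifier. This is a minimal illustration, not the project's actual code: the function names `cve_year` and `in_scope` are hypothetical, and the identifiers in the usage lines are there only as format examples.

```python
import re

# CVE identifiers have the form CVE-YYYY-NNNN, with four or more digits
# in the sequence part.
CVE_PATTERN = re.compile(r"^CVE-(\d{4})-\d{4,}$")


def cve_year(cve_id: str) -> int:
    """Return the year embedded in a CVE identifier."""
    match = CVE_PATTERN.match(cve_id)
    if match is None:
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return int(match.group(1))


def in_scope(cve_id: str, min_year: int = 2018) -> bool:
    """True when the identifier is CVE-2018-* or later (hypothetical helper)."""
    return cve_year(cve_id) >= min_year


print(in_scope("CVE-2021-44228"))  # True
print(in_scope("CVE-2017-0144"))   # False
```

Note that the filter keys on the year in the identifier, as the scope statement does, rather than on the disclosure date.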
Data Sources
The CISA KEV catalog is currently the sole data source for exploited vulnerabilities. Other sources were reviewed and considered, but were assessed either as having too high a false positive rate or as being poor predictive indicators of actual exploitation (PoCs on GitHub, for example).
Dashboard
The graphs and key data points featured in the dashboard are based on key fields in the database, primarily Disclosure, First Exploited and First Reported (described below). The Disclosure and First Exploited fields establish the time-to-exploit (TTE) for each CVE, and First Reported establishes when exploitation of a vulnerability became publicly known. This is key to performing like-for-like comparisons across years, which are the only way to accurately understand how exploitation statistics such as mean TTE, zero-day rate, and total CVEs exploited are changing over time.
Other representations of how mean TTE has changed over time typically measure every year from the current date, which severely biases recent years’ mean TTE downwards because less time has elapsed in which exploitation could be observed. By instead comparing snapshots across years, we get actual insight into how CVE exploitation has changed, and a potential opportunity for predictive forecasting within a measurable error range, as the same predictive methodology can be validated against past years.
The dashboard graphs also show the mean TTE per year, both excluding and including zero-days. The first gives the patch window defenders actually have for everything that isn’t a zero-day (a zero-day’s patch window is zero by definition), and the second gives a more definitionally accurate mean TTE. Both graphs use a value of zero for any CVE exploited before its disclosure date, rather than a negative TTE value. For snapshots that include negative TTE values, refer to the Snapshots page.
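The two mean TTE views described above can be sketched as follows. This is an illustration under stated assumptions: TTE is measured in whole days, rows are grouped by CVE year, and the function name `mean_tte_by_year` and the sample values are hypothetical rather than taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def mean_tte_by_year(rows):
    """rows: iterable of (cve_year, unfloored TTE in days).

    Returns, per year, the mean including zero-days (negative values
    floored to 0) and the mean excluding zero-days (values > 0 only).
    """
    by_year = defaultdict(list)
    for year, raw_days in rows:
        by_year[year].append(raw_days)

    result = {}
    for year, raws in sorted(by_year.items()):
        floored = [max(0, r) for r in raws]
        non_zero_day = [r for r in raws if r > 0]
        result[year] = {
            "incl_zero_days": mean(floored),
            # None when every CVE in the year was a zero-day.
            "excl_zero_days": mean(non_zero_day) if non_zero_day else None,
        }
    return result


# Made-up values, for illustration only.
sample = [(2023, -10), (2023, 0), (2023, 20), (2023, 40)]
print(mean_tte_by_year(sample))
# {2023: {'incl_zero_days': 15, 'excl_zero_days': 30}}
```

In the sample, two of the four CVEs are zero-days, so the mean including zero-days (15) is half the mean excluding them (30).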
Database Field Definitions
CVE, Vendor, Product
Sourced from the KEV catalog. Vendor and product names are normalised lightly for grouping (e.g. "Microsoft Corporation" → "Microsoft") but otherwise left as provided.
Disclosure
The CVE's NVD published date — the moment the CVE entry became public. For most CVEs this aligns within a day of vendor advisory publication; for a small number it diverges, and where the divergence is material the source notes flag it.
First Exploited
The earliest credible date of in-wild exploitation, drawn from primary sources: vendor advisories, incident-response writeups, threat intelligence publications, and post-incident law-enforcement disclosures. Where multiple sources give different dates, the earliest defensible one is taken, with the reasoning captured in source notes.
First Reported
The date the exploitation became publicly known — typically a vendor PSA, a research blog, or a news report. Always equal to or later than First Exploited.
TTE
Calculated as max(0, First Exploited − Disclosure). Values of 0
cover both "exploited on the day of disclosure" and "exploited some time before
disclosure" — i.e. the zero-day population. Use TTE inc negatives when this
distinction matters.
TTE inc negatives
First Exploited − Disclosure with no flooring. Negative values
indicate exploitation that pre-dated public disclosure of the vulnerability.
Zero-day
Defined as TTE inc negatives ≤ 0 — i.e. exploited at or before
public disclosure. This is a deliberately inclusive definition; it covers both
true pre-disclosure exploitation and "exploited on day 0" incidents where the
disclosure was effectively reactive to observed exploitation.
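The three derived fields above (TTE, TTE inc negatives, and Zero-day) can be sketched in a few lines. This is a minimal illustration that assumes values are whole days; the names `Timing` and `derive_timing` and the dates in the usage lines are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Timing:
    tte_inc_negatives: int  # First Exploited - Disclosure, in days, unfloored
    tte: int                # max(0, tte_inc_negatives)
    zero_day: bool          # tte_inc_negatives <= 0


def derive_timing(disclosure: date, first_exploited: date) -> Timing:
    """Derive the three timing fields from the two source dates."""
    raw = (first_exploited - disclosure).days
    return Timing(tte_inc_negatives=raw, tte=max(0, raw), zero_day=raw <= 0)


# Exploited nine days before disclosure: a zero-day with TTE floored to 0.
print(derive_timing(date(2024, 3, 10), date(2024, 3, 1)))
# Timing(tte_inc_negatives=-9, tte=0, zero_day=True)

# Exploited fifteen days after disclosure.
print(derive_timing(date(2024, 3, 10), date(2024, 3, 25)))
# Timing(tte_inc_negatives=15, tte=15, zero_day=False)
```

Keeping the unfloored value alongside the floored one preserves the distinction between same-day and pre-disclosure exploitation, which the floored TTE alone cannot express.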
Confidence
Each row carries a high / medium / low confidence rating reflecting the documentary strength of the dates:
- High — Date is established by a primary source (vendor advisory, IR firm report, official disclosure) with internal consistency between sources.
- Medium — Date is established by a single primary source, or by multiple secondary sources without a primary citation.
- Low — Date is inferred from weaker signals (community speculation, KEV addition date used as proxy, single tertiary source). Read the source notes carefully.
KEV Added
The date CISA added the CVE to the KEV catalog. Useful as a publication-cadence signal but distinct from the exploitation timeline.
Snapshot tables
Each snapshot table is a point-in-time view. For example, the 2018 row of the May month-end table shows only CVE-2018-* entries that had been disclosed, and whose exploitation had been publicly reported, by May 31, 2018. This is the most useful comparator because it removes the bias from later-added entries: every row in a given table views its CVE year at the same vintage of data.
Why exclude later entries? If you simply look at "what does the 2018 catalog look like today" you're folding in seven years of subsequent enrichment. Comparing "May 2018 view of 2018 CVEs" to "May 2024 view of 2024 CVEs" gives a much cleaner read on how the patch window has actually moved.
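The snapshot rule can be sketched as a filter on Disclosure and First Reported. This is an illustration only: the function name `snapshot_rows`, the dictionary keys, and every identifier and date in the sample are hypothetical placeholders, not rows from the dataset.

```python
from datetime import date


def snapshot_rows(rows, cve_year, cutoff):
    """Keep the rows for one CVE year that were visible as of the cutoff:
    already disclosed, and exploitation already publicly reported."""
    prefix = f"CVE-{cve_year}-"
    return [
        r for r in rows
        if r["cve"].startswith(prefix)
        and r["disclosure"] <= cutoff
        and r["first_reported"] <= cutoff
    ]


# Placeholder rows with made-up identifiers and dates.
rows = [
    {"cve": "CVE-2018-90001", "disclosure": date(2018, 2, 1), "first_reported": date(2018, 4, 15)},
    {"cve": "CVE-2018-90002", "disclosure": date(2018, 3, 1), "first_reported": date(2019, 1, 10)},
    {"cve": "CVE-2018-90003", "disclosure": date(2018, 8, 1), "first_reported": date(2018, 8, 5)},
    {"cve": "CVE-2024-90001", "disclosure": date(2024, 1, 10), "first_reported": date(2024, 2, 1)},
]

may_2018 = snapshot_rows(rows, 2018, date(2018, 5, 31))
print([r["cve"] for r in may_2018])  # ['CVE-2018-90001']

may_2024 = snapshot_rows(rows, 2024, date(2024, 5, 31))
print([r["cve"] for r in may_2024])  # ['CVE-2024-90001']
```

The second placeholder row is dropped from the May 2018 view because its exploitation was reported later, even though the CVE itself was disclosed before the cutoff. That is the later-added enrichment the snapshot is designed to exclude.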
Update cadence
The database is updated daily, and the monthly snapshots are updated at the end of each month.
Known limits
- Survivorship. The dataset only contains CVEs CISA has chosen to add to KEV. Some exploited vulnerabilities never make the catalog, and their omission shapes any aggregate.
- "First exploited" is an upper bound, not an exact date. The earliest known exploitation often post-dates the actual first exploitation, so the true date may be earlier than the one recorded. Treat values close to zero as "no later than" estimates.
- Vendor labelling is imperfect. Products span multiple vendors over their lifetime; the dataset uses the most current canonical vendor where reasonable.
- Low-confidence rows are kept, not dropped. Filtering them out would inflate the apparent precision of aggregates. Use the confidence column to weight your reading.
- Not all KEV entries have reliable exploitation dates. Where possible, evidence of exploitation was taken from reliable sources to establish the date. Where no independent source was available, the date CISA added the CVE to KEV (KEV Added) was used as a fallback first exploited date.
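The fallback in the last point can be sketched as below. This is an assumption-laden illustration: the function name `resolve_first_exploited` and its parameters are hypothetical, and assigning "low" confidence to the fallback follows the Confidence definitions above, where a KEV date used as a proxy is listed as a low-confidence signal.

```python
from datetime import date
from typing import Optional, Tuple


def resolve_first_exploited(
    sourced_date: Optional[date],
    sourced_confidence: Optional[str],
    kev_date: date,
) -> Tuple[date, str]:
    """Return (first_exploited, confidence) for one row.

    Uses the independently sourced date when one exists; otherwise falls
    back to the KEV date and marks the row as low confidence.
    """
    if sourced_date is not None and sourced_confidence is not None:
        return sourced_date, sourced_confidence
    return kev_date, "low"


# Made-up dates, for illustration only.
print(resolve_first_exploited(date(2023, 5, 2), "high", date(2023, 6, 1)))
# (datetime.date(2023, 5, 2), 'high')
print(resolve_first_exploited(None, None, date(2023, 6, 1)))
# (datetime.date(2023, 6, 1), 'low')
```

Rows resolved through the fallback stay in the dataset, in line with the point above about keeping low-confidence rows rather than dropping them.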
Source links
Every row has a primary Source URL and a short narrative in
Source Notes. Expand any row in the database
to read them. If you find a row where the date is wrong or the source is broken,
please get in touch
(LinkedIn).
License
The aggregated dataset is published under CC BY 4.0. CVE identifiers are managed by MITRE; the KEV catalog is maintained by CISA. This site is an independent analytical layer over those public sources.