Project Overview
CVE Forecast is an automated, self-improving platform that combines hyperparameter optimization with multiple time series forecasting models to predict the number of Common Vulnerabilities and Exposures (CVEs) published. It provides a data-driven view of future trends in vulnerability disclosures through an interactive web dashboard.
Real-World Validation
Historical backtest on 2025 data (Jan-Sep) with actual vs. predicted comparisons. MAPE ranges from 6.22% (LightGBM) to 21.65% (Croston).
Unified Pipeline
Single command execution for CVE + CNA forecasting. Automated daily updates and monthly hyperparameter tuning via GitHub Actions.
Accuracy Tracking
ForecastTracker accumulates prediction snapshots over time, enabling long-term accuracy analysis and model stability assessment.
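As a rough illustration of snapshot accumulation, the sketch below appends one record per run to a JSON file and later scores every stored prediction for a month against the published count. The class name `SnapshotTracker`, its methods, and the file layout are hypothetical; they are not taken from forecast_tracker.py.

```python
import json
from datetime import date
from pathlib import Path


class SnapshotTracker:
    """Hypothetical stand-in for ForecastTracker: one snapshot per run."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        # Return all stored snapshots, or an empty list on the first run.
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text(encoding="utf-8"))

    def record(self, run_date, model, predictions):
        # predictions maps "YYYY-MM" to the forecast CVE count for that month.
        snapshots = self.load()
        snapshots.append({
            "run_date": run_date.isoformat(),
            "model": model,
            "predictions": predictions,
        })
        self.path.write_text(json.dumps(snapshots, indent=2), encoding="utf-8")

    def errors_for(self, month, actual):
        # Absolute percentage error of every stored prediction for one month.
        return [
            (s["run_date"], 100.0 * abs(s["predictions"][month] - actual) / actual)
            for s in self.load()
            if month in s["predictions"]
        ]
```

Because each run only appends, earlier forecasts for the same month stay available, which is what makes it possible to see whether a model's predictions drift or stabilize as the target month approaches.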
Modular Architecture
Clean separation between data loading, training, forecasting, and validation. BaseForecaster and ValidationMixin provide extensible framework.
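One way such a base class and mixin could fit together is sketched below. Only the names BaseForecaster and ValidationMixin come from the project; the method names `fit`, `predict`, and `backtest`, and the toy `NaiveForecaster`, are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class BaseForecaster(ABC):
    """Abstract interface; the method names here are assumptions."""

    @abstractmethod
    def fit(self, series):
        """Train on a list of historical monthly counts."""

    @abstractmethod
    def predict(self, horizon):
        """Return a list of `horizon` forecast values."""


class ValidationMixin:
    """Adds a holdout backtest to any class exposing fit() and predict()."""

    def backtest(self, series, holdout):
        # Train on everything except the last `holdout` points, then score.
        train, test = series[:-holdout], series[-holdout:]
        self.fit(train)
        predicted = self.predict(holdout)
        errors = [abs(p - a) / a for p, a in zip(predicted, test)]
        return 100.0 * sum(errors) / len(errors)  # MAPE in percent


class NaiveForecaster(ValidationMixin, BaseForecaster):
    """Toy model that repeats the last observed value."""

    def fit(self, series):
        self.last = series[-1]
        return self

    def predict(self, horizon):
        return [self.last] * horizon
```

Keeping validation in a mixin means a new model only has to implement `fit` and `predict` to become backtestable.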
CVE Forecasting Pipeline
The CVE forecasting pipeline uses 13 production-ready models with optimized hyperparameters, historical backtest validation, and transparent performance metrics.
Core Components
- run_production_forecast.py: Unified pipeline entry point
- cve_adapter.py: CVE forecasting implementation
- base_forecaster.py: Abstract base class
- validation_mixin.py: Backtest validation
- forecast_tracker.py: Accuracy tracking over time
- data_loader.py: Processes 297K+ CVE JSON files
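The loader's internals are not described here. As a minimal sketch of turning a tree of CVE JSON files into a monthly series, the function below assumes records follow the CVE Record Format 5.x, where the publication timestamp is stored under `cveMetadata.datePublished`. The function name and the `CVE-*.json` file pattern are illustrative assumptions.

```python
import json
from collections import Counter
from pathlib import Path


def monthly_cve_counts(root):
    """Count published CVE records per "YYYY-MM" under a directory tree."""
    counts = Counter()
    for path in Path(root).rglob("CVE-*.json"):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        if not isinstance(record, dict):
            continue
        published = record.get("cveMetadata", {}).get("datePublished")
        if published:
            counts[published[:7]] += 1  # ISO 8601 timestamp -> "YYYY-MM"
    return dict(sorted(counts.items()))
```

Records without a publication date (for example, reserved IDs) are skipped, so only published CVEs contribute to the series.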
Production Models (13 Optimized)
Top Performers (2025 Backtest): LightGBM (6.22% MAPE), KalmanFilter (6.26%), TBATS (7.21%)
Historical Backtest Validation
Real-World Accuracy Testing: Each model is backtested by training on data through 2024 and forecasting Jan-Sep 2025, then comparing predictions against actual published CVE counts.
- Forecast vs Published Table: Month-by-month comparison with error percentages and performance ratings
- Model Rankings: Real-time leaderboard sorted by backtest MAPE
- Transparent Metrics: MAE, MAPE, and performance badges (Excellent < 5%, Good < 10%, Fair < 20%)
- Historical Tracking: ForecastTracker accumulates snapshots for long-term accuracy analysis
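The metrics and badge thresholds above can be sketched as follows. The function names are illustrative, and the "Poor" label for a MAPE of 20% or more is an assumption, since only three bands are named.

```python
def mae(actual, predicted):
    """Mean absolute error, in CVEs per month."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes non-zero actuals)."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def badge(mape_pct):
    """Map a MAPE value to a performance badge."""
    if mape_pct < 5:
        return "Excellent"
    if mape_pct < 10:
        return "Good"
    if mape_pct < 20:
        return "Fair"
    return "Poor"  # assumed label; no name is given for MAPE >= 20%


def rank_models(backtest_mape):
    """Sort a {model: MAPE} mapping into a leaderboard, best first."""
    return sorted(backtest_mape.items(), key=lambda item: item[1])
```

With the backtest figures quoted in this document, `badge(6.22)` rates LightGBM as "Good", while Croston at 21.65% falls outside all three named bands.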
Automated Workflows
GitHub Actions Integration:
- Daily Forecast: Runs at midnight UTC, generates fresh forecasts, deploys to GitHub Pages
- Monthly Tuning: Runs 1st of each month, optimizes hyperparameters, updates config.json
- Zero Downtime: Continuous deployment with automatic rollback on failure
- Artifact Storage: 90-day retention of tuning results and execution logs
CNA Forecast
CNA Forecast provides organization-specific vulnerability disclosure predictions through a dedicated pipeline optimized for individual CVE Numbering Authorities.
Core Components
- cna_main.py: Orchestrates CNA-specific forecasting workflow
- cna_config.json: CNA-optimized model configurations
- cna.js: Interactive visualization and table management
- cna_forecast.html: Dedicated CNA dashboard interface
- cna-*.js: Specialized chart and utility modules
Model Selection (CPU-Optimized)
Intelligent Model Selection
Organization-Specific Optimization: Each CNA's unique vulnerability disclosure patterns are analyzed to select the best-performing model based on validation MAPE scores.
- Validation-Based Selection: Models tested on historical data with automatic fallback mechanisms
- Performance Tracking: MAPE scores recorded for transparency and model comparison
- Adaptive Configuration: Hyperparameters optimized per organization's data characteristics
- Robust Error Handling: Graceful degradation when models fail on insufficient data
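A minimal sketch of validation-based selection with a fallback is shown below. It assumes each candidate is a plain function taking a training series and a horizon; the function names, the toy candidates, and the choice of fallback are all hypothetical, not the logic in cna_main.py.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def select_best_model(candidates, train, validation, fallback="last_value"):
    """Return (model_name, validation_mape) for the lowest-MAPE candidate."""
    scores = {}
    for name, forecast_fn in candidates.items():
        try:
            score = mape(validation, forecast_fn(train, len(validation)))
        except Exception:
            continue  # graceful degradation: a failing model is skipped
        if math.isfinite(score):
            scores[name] = score
    if not scores:
        return fallback, None  # nothing could be validated
    best = min(scores, key=scores.get)
    return best, scores[best]


def last_value(train, horizon):
    return [train[-1]] * horizon


def mean_value(train, horizon):
    return [sum(train) / len(train)] * horizon


def needs_long_history(train, horizon):
    # Stands in for a model that cannot fit a short series.
    if len(train) < 24:
        raise ValueError("insufficient data")
    return [train[-1]] * horizon
```

For a CNA with only a few months of history, the third candidate raises and is skipped, so selection proceeds among the models that did produce a score.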
CNA-Specific Features
Organization-Centric Analytics:
- Individual Forecasts: Dedicated predictions for 166+ CNAs with interactive charts
- Sortable Interface: Dynamic table with forecast values, historical data, and model selection
- Cumulative Projections: Timeline visualization showing both historical and predicted trends
- Model Transparency: Clear indication of which model was selected for each organization
- Performance Metrics: MAPE scores displayed for forecast confidence assessment
Deployment & Automation
The system features a fully automated CI/CD pipeline with daily updates and integrated hyperparameter optimization.
GitHub Actions Workflow
- Daily scheduled execution (midnight UTC)
- Automatic CVE data fetching and processing
- Model training and forecast generation
- Intelligent hyperparameter optimization
- Automated deployment and configuration updates
Production Features
- Processes 300K+ CVE JSON files daily
- Dynamic forecasting through January 2026
- Self-improving optimization workflow
- Automatic configuration backups
- Comprehensive validation and error handling
Release History
v0.10 - Phoenix (October 2025)
Complete Architectural Rebirth - Production-ready unified pipeline with historical validation and accuracy tracking
Major Features
- Unified Pipeline: Single command (run_production_forecast.py) for CVE + CNA forecasting
- Historical Backtest: Real-world validation on 2025 data (Jan-Sep) with actual vs. predicted comparisons
- Forecast Tracking: ForecastTracker accumulates prediction snapshots for long-term accuracy analysis
- Model Rankings: Real-time leaderboard based on backtest MAPE (LightGBM: 6.22%, KalmanFilter: 6.26%)
- Forecast vs Published Table: Month-by-month comparison with error percentages and performance ratings
Architecture Improvements
- Modular Design: BaseForecaster, ValidationMixin, CVEForecaster, CNAForecaster adapters
- Clean Separation: Data loading, training, forecasting, and validation in separate modules
- Extensible Framework: Easy to add new models, data sources, or validation strategies
- Production-Ready: Robust error handling, comprehensive logging, and monitoring
Automated Workflows
- Daily Forecast: Midnight UTC execution with automatic deployment to GitHub Pages
- Monthly Tuning: 1st of each month hyperparameter optimization with config updates
- Zero Downtime: Continuous deployment with automatic rollback on failure
- Artifact Storage: 90-day retention of tuning results and execution logs
Documentation
- Architecture Guide: System design, components, and data flow
- API Reference: Classes, methods, and configuration options
- Deployment Guide: GitHub Actions, hosting, and CI/CD
- Development Guide: Contributing, testing, and best practices
- Tuning Guide: Hyperparameter optimization workflows
v0.09 - Edinburgh (October 2025)
Year Rollover Automation & Enhanced Forecasting - Complete 2026 readiness with zero manual intervention
Year Rollover Automation
- Fully dynamic YoY growth calculations that automatically compare current year vs previous year
- Automatic chart axis updates - date ranges adapt seamlessly across year boundaries
- Smart forecast end year detection with automatic rollover when config becomes outdated
- Dynamic chart descriptions and labels that update based on current year
- Backend time series processing fully dynamic with current_datetime.year throughout
Enhanced Dashboard Features
- Improved "Projected Full Year Growth" card with explicit year comparisons (e.g., "2025 vs 2024")
- Detailed growth metrics showing actual numbers: "45,000 vs 39,970 (Full Year)"
- Chart x-axis automatically spans from Jan of current year to Jan of next year
- All summary statistics and cumulative timelines update dynamically
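The year-relative behaviour described in this release can be sketched as follows: nothing is hardcoded, and every label and range is derived from the current date. The function names and return shapes are assumptions; only the label formats follow the examples quoted above.

```python
from datetime import date


def yoy_growth(projected_current, previous_total, today=None):
    """Build the growth card text and percentage for current vs previous year."""
    today = today or date.today()
    label = f"{today.year} vs {today.year - 1}"
    detail = f"{projected_current:,} vs {previous_total:,} (Full Year)"
    pct = 100.0 * (projected_current - previous_total) / previous_total
    return label, detail, pct


def chart_range(today=None):
    """X-axis bounds: January of the current year to January of the next."""
    today = today or date.today()
    return date(today.year, 1, 1), date(today.year + 1, 1, 1)
```

Using the example figures above, `yoy_growth(45000, 39970, date(2025, 10, 15))` gives the label "2025 vs 2024" and a growth of roughly 12.6%. On the first run in 2026 the same call produces "2026 vs 2025" without any code change.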
Code Quality & Maintenance
- Comprehensive year rollover audit identifying all hardcoded year references
- Main forecast page 100% automatic - zero manual intervention needed for 2026
- CNA forecast page requires minimal annual label updates (a 5-minute task)
- Annual maintenance checklist added to README for December 31, 2025
- Consolidated documentation for better repository organization
v.08 - Opening Drive (September 2025)
Launch of Individual CNA Forecasts - Revolutionary organization-specific vulnerability prediction system
CNA Forecasts Platform Launch
- Dedicated forecasting pipeline for 166+ CVE Numbering Authorities (CNAs)
- Organization-specific vulnerability disclosure predictions with interactive visualizations
- Advanced model ensemble including LightGBM, XGBoost, Prophet, and ExponentialSmoothing
- Intelligent model selection based on validation performance for each CNA's unique patterns
- Comprehensive sortable table interface with real-time forecast data and historical trends
- Dynamic chart generation with organization-specific timelines and cumulative projections
Technical Architecture
- CPU-optimized forecasting pipeline designed for production scalability
- Automated model validation with MAPE scoring and fallback mechanisms
- JSON-based data architecture supporting real-time updates and historical analysis
- Responsive web interface with Chart.js integration for interactive data exploration
- Configurable forecast horizons and model hyperparameters via JSON configuration
- Enterprise-grade error handling and logging for reliable automated execution
Data & Analytics
- Historical CVE data analysis spanning multiple years per organization
- Statistical model performance tracking with validation metrics
- Forecast confidence intervals and uncertainty quantification
- Cross-organizational trend analysis and comparative insights
v.07 - Security Summer Camp Prep (August 2025)
Fixed a critical month transition bug in cumulative total calculations, ensuring accurate data representation across month boundaries.
Bug Fix Details
- Replaced hard-coded month references with dynamic month detection
- Ensured cumulative totals properly build upon the previous month's values
- Fixed inconsistencies in cumulative statistics when crossing month boundaries
- Implemented future-proof solution that works reliably for all calendar transitions
- Added comprehensive logging to track cumulative total calculations
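The principle behind the fix, deriving month order from the data and building each cumulative total on the previous month's value, can be sketched as follows. This is an illustration under the assumption that monthly counts are keyed by "YYYY-MM" strings; it is not the project's actual code.

```python
from itertools import accumulate


def cumulative_totals(monthly_counts):
    """Map each "YYYY-MM" key to the running total up to and including it."""
    months = sorted(monthly_counts)  # ISO "YYYY-MM" keys sort chronologically
    running = accumulate(monthly_counts[m] for m in months)
    return dict(zip(months, running))
```

Because no month name or index appears in the code, the same function works unchanged when a new month is added to the input.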
v.06 - Karlův most (July 2025)
Revolutionary self-improving forecasting system with intelligent hyperparameter optimization
Intelligent Optimization
- Comprehensive hyperparameter tuner for 19+ models
- Self-improving workflow that learns from previous runs
- Adaptive grid/random search selection
- Intelligent timeout management and progress tracking
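How the tuner chooses between grid and random search is not specified here. One plausible rule, shown purely as an assumption, is to enumerate the full grid when it fits within a trial budget and to sample randomly otherwise. The function name, the `max_trials` budget, and the search-space format are hypothetical.

```python
import itertools
import random


def candidate_configs(space, max_trials=20, seed=0):
    """Return (strategy, configs) for a {param: [values]} search space."""
    keys = sorted(space)
    grid_size = 1
    for key in keys:
        grid_size *= len(space[key])
    if grid_size <= max_trials:
        # Small space: exhaustive grid search is affordable.
        combos = itertools.product(*(space[key] for key in keys))
        return "grid", [dict(zip(keys, combo)) for combo in combos]
    # Large space: draw a fixed number of random configurations instead.
    rng = random.Random(seed)
    configs = [{key: rng.choice(space[key]) for key in keys} for _ in range(max_trials)]
    return "random", configs
```

A fixed seed keeps the random draw reproducible between runs, which makes tuning results easier to compare.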
Automated Infrastructure
- Daily GitHub Actions integration with tuner
- Automatic configuration backup and management
- End-to-end validation pipeline
- Complete self-optimization workflow
- Support for 25+ models across Statistical, Tree-Based, and Deep Learning categories
- Enterprise-grade modular architecture with 7 focused modules
- Enhanced model stability with comprehensive error handling
- Dynamic forecasting with automatic period adaptation
v.05 - Adolfo Suárez Madrid-Barajas
- Fixed a critical bug that prevented the cumulative graph from rendering due to an incorrect data structure in data.json.
- Restored frontend compatibility by correcting the data generation logic, ensuring all charts now load correctly.
v.04 ORD to MAD
- Enhanced model stability with improved error handling.
- Added input validation and scaling for better numerical stability.
- Optimized for CPU-only environments.
- Implemented dynamic forecast period calculation.
- Improved model selection based on MAPE scores.