ML Plan: VMA/Volumetric Prediction¶

Context¶

Problem: Voids in Mineral Aggregate (VMA) is a critical parameter that determines whether a mix design has room for adequate binder film thickness. Currently, VMA is known only after lab compaction testing. If it is too low, the gradation must be adjusted and the testing repeated.

Goal: Predict VMA from gradation and aggregate properties before mixing, allowing engineers to adjust the blend upfront and reduce failed trial batches.

Data Available: 100+ historical Superpave mix designs with actual VMA results from lab testing.

What It Predicts¶

Output	Description
Predicted VMA	Expected VMA at N-design (e.g., 14.8%)
Confidence Range	Likely range (e.g., 14.2% - 15.4%)
Minimum VMA Check	Pass/fail against Superpave minimum
Improvement Suggestions	How to increase VMA if predicted too low

Output	Description
Predicted VFA	Voids Filled with Asphalt
Predicted Gmb	Bulk specific gravity at N-design
Dust-to-Binder Ratio	Fines to effective binder ratio
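
The related outputs follow from the standard Superpave volumetric relationships. The sketch below assumes air voids and effective binder content are available as inputs; the function names are illustrative and not part of the existing codebase.

```python
def vfa_percent(vma: float, air_voids: float) -> float:
    """Voids Filled with Asphalt: share of VMA occupied by effective binder."""
    if vma <= 0:
        raise ValueError("VMA must be positive")
    return 100.0 * (vma - air_voids) / vma


def dust_to_binder(p200: float, effective_binder: float) -> float:
    """Ratio of % passing the 0.075 mm sieve to effective binder content (%)."""
    if effective_binder <= 0:
        raise ValueError("effective binder content must be positive")
    return p200 / effective_binder


# Illustrative values only, not taken from a real design
print(round(vfa_percent(vma=14.0, air_voids=4.0), 1))
print(round(dust_to_binder(p200=5.0, effective_binder=5.0), 2))
```

Because VFA depends only on VMA and air voids, it can be derived from the VMA prediction rather than modeled separately, if the design air voids are treated as fixed.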

Data Requirements¶

Inputs¶

Category	Fields
Gradation	14 sieve values (% passing)
Aggregate SG	Gsb (bulk), Gsa (apparent) for coarse and fine
Bailey Ratios	CA ratio, FAc, FAf
Design Parameters	NMAS, N-design, target air voids
Binder Content	Design AC%
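
One way to turn the sieve values into named model inputs is sketched below. The keys `grad_19mm` and `grad_0_075mm` appear elsewhere in this plan; extending that pattern to every sieve, and the helper names themselves, are assumptions.

```python
def sieve_feature_name(size_mm: float) -> str:
    """Map a sieve opening in mm to a feature key, e.g. 0.075 -> 'grad_0_075mm'."""
    label = f"{size_mm:g}".replace(".", "_")
    return f"grad_{label}mm"


def gradation_features(percent_passing: dict[float, float]) -> dict[str, float]:
    """Flatten {sieve size (mm): % passing} into named model features."""
    for size, value in percent_passing.items():
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"% passing out of range for {size} mm sieve: {value}")
    return {sieve_feature_name(s): v for s, v in percent_passing.items()}
```

Validating the 0 to 100 range at this point keeps data-entry errors out of both the Bailey estimate and the ML correction.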

Target Variables¶

vma_ndesign - VMA at N-design gyrations
vfa_ndesign - VFA at N-design gyrations
gmb_ndesign - Bulk specific gravity at N-design

Current State: Bailey Method¶

You already have a Bailey Method VMA estimator in /app/services/bailey_method.py:

def estimate_vma(self, gradation, gsb, nmas):
    """Bailey Method VMA estimation"""
    # Uses aggregate packing theory
    # Calculates unit weight and void space

Limitation: Bailey is theoretical; ML can learn from actual lab results to improve accuracy.

Implementation Approach¶

Hybrid Model: Bailey + ML Correction¶

Rather than replacing Bailey, use ML to learn the error between Bailey's prediction and actual VMA:

predicted_vma = bailey_estimate + ml_correction

# Where ml_correction learns:
# - Aggregate-specific behaviors not captured by Bailey
# - N-design effects
# - Binder content interactions

Benefits:
- Interpretable (Bailey physics + learned adjustments)
- Works with limited data (only learning residuals)
- Graceful degradation (if ML fails, Bailey still works)
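
The graceful-degradation property can be made explicit in code. This is a minimal sketch, assuming the corrector is any callable that maps a feature mapping to a correction in VMA percentage points; `hybrid_vma` is a hypothetical name.

```python
import logging
from typing import Callable, Mapping, Optional

logger = logging.getLogger(__name__)


def hybrid_vma(
    bailey_estimate: float,
    features: Mapping[str, float],
    corrector: Optional[Callable[[Mapping[str, float]], float]],
) -> tuple[float, float]:
    """Return (predicted_vma, ml_correction); fall back to Bailey on any ML failure."""
    correction = 0.0
    if corrector is not None:
        try:
            correction = float(corrector(features))
        except Exception:
            # Graceful degradation: keep the Bailey estimate unchanged
            logger.warning("ML correction failed; using Bailey estimate only", exc_info=True)
            correction = 0.0
    return bailey_estimate + correction, correction
```

Returning the correction alongside the prediction keeps the two components separately visible, which is what makes the hybrid interpretable.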

Training Data Structure¶

# One row per historical mix design (values shown are illustrative)
features = {
    # Gradation (% passing)
    'grad_19mm': 95.0,
    # ... all sieves

    # Aggregate properties
    'gsb_combined': 2.62,
    'gsa_combined': 2.70,
    'absorption': 1.2,

    # Bailey ratios
    'ca_ratio': 0.72,
    'fa_c': 0.42,
    'fa_f': 0.45,
    'bailey_vma_estimate': 14.2,  # Include Bailey's prediction

    # Design parameters
    'nmas': 12.5,
    'n_design': 75,
    'binder_content': 5.4,
    'target_air_voids': 4.0,
}

# The target is the residual (lab VMA minus Bailey), not VMA itself;
# actual_vma_ndesign is the lab-measured VMA for this design
target = actual_vma_ndesign - features['bailey_vma_estimate']  # Learn the correction
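
Assembling the full training set from many such rows might look like the sketch below. `HistoricalDesign` and `build_training_set` are hypothetical names, and the record layout is an assumption rather than the actual schema.

```python
from dataclasses import dataclass


@dataclass
class HistoricalDesign:
    """Hypothetical record: model inputs plus the lab-measured VMA."""
    features: dict[str, float]
    actual_vma_ndesign: float


def build_training_set(designs: list[HistoricalDesign]):
    """Return (feature_names, X, y), where y is lab VMA minus the Bailey estimate."""
    if not designs:
        raise ValueError("no historical designs supplied")
    feature_names = sorted(designs[0].features)  # fixed column order
    X, y = [], []
    for d in designs:
        missing = set(feature_names) - set(d.features)
        if missing:
            raise ValueError(f"design is missing features: {sorted(missing)}")
        X.append([d.features[name] for name in feature_names])
        y.append(d.actual_vma_ndesign - d.features["bailey_vma_estimate"])
    return feature_names, X, y
```

Fixing the column order once and reusing it at prediction time avoids silently misaligned features.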

Model Choice: Ridge Regression with Polynomial Features¶

Why:
- Small correction to Bailey (not full prediction)
- Need interpretability to understand adjustments
- Limited training data for complex models

Alternative: XGBoost if corrections are highly non-linear
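
To show the mechanics of ridge regression on degree-2 polynomial features, here is a dependency-free sketch. In practice a library implementation (for example scikit-learn's `Ridge` with `PolynomialFeatures`) would normally be used; the function names below are illustrative, and the sketch assumes features have already been scaled to comparable ranges, since the ridge penalty is scale-sensitive.

```python
from itertools import combinations_with_replacement


def poly2(row):
    """Degree-2 expansion: original terms plus all pairwise products."""
    pairs = combinations_with_replacement(range(len(row)), 2)
    return list(row) + [row[i] * row[j] for i, j in pairs]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_ridge(X, y, alpha=1.0):
    """Fit ridge on degree-2 features; the intercept is not penalized."""
    Z = [poly2(r) for r in X]
    n, p = len(Z), len(Z[0])
    means = [sum(z[j] for z in Z) / n for j in range(p)]
    y_mean = sum(y) / n
    Zc = [[z[j] - means[j] for j in range(p)] for z in Z]  # centered features
    # Normal equations: (Zc^T Zc + alpha * I) w = Zc^T (y - y_mean)
    A = [[sum(r[i] * r[j] for r in Zc) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * (yi - y_mean) for r, yi in zip(Zc, y)) for i in range(p)]
    w = solve(A, b)
    intercept = y_mean - sum(wi * mi for wi, mi in zip(w, means))
    return w, intercept


def predict_correction(model, row):
    """Predicted VMA correction for one feature row."""
    w, intercept = model
    return intercept + sum(wi * zi for wi, zi in zip(w, poly2(row)))
```

A larger `alpha` shrinks the learned weights toward zero, so the hybrid prediction moves back toward the plain Bailey estimate plus a constant offset. With roughly 100 designs, that is a reasonable guard against overfitting.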

Service Architecture¶

# /app/services/ml/vma_predictor.py

class VMAPredictor:
    def __init__(self):
        self.bailey = BaileyMethod()
        self.ml_corrector = load_model('vma_correction')

    def predict(self, mix_design_version) -> VMAPrediction:
        # Step 1: Get Bailey estimate
        gradation = self._extract_gradation(mix_design_version)
        features = self._extract_features(mix_design_version)
        gsb = features['gsb_combined']  # combined bulk specific gravity
        nmas = features['nmas']
        bailey_vma = self.bailey.estimate_vma(gradation, gsb, nmas)

        # Step 2: Get ML correction
        features['bailey_vma_estimate'] = bailey_vma
        correction = self.ml_corrector.predict(features)

        # Step 3: Combine
        predicted_vma = bailey_vma + correction

        # Step 4: Check against minimums
        min_vma = self.get_min_vma(nmas)
        passes = predicted_vma >= min_vma

        return VMAPrediction(
            predicted_vma=predicted_vma,
            bailey_estimate=bailey_vma,
            ml_correction=correction,
            min_required=min_vma,
            passes_minimum=passes,
            suggestions=self._generate_suggestions(predicted_vma, min_vma, features)
        )

    def _generate_suggestions(self, predicted, minimum, features):
        """Suggest how to increase VMA if too low"""
        suggestions = []

        if predicted < minimum:
            gap = minimum - predicted

            # Check Bailey ratios
            if features['ca_ratio'] > 0.8:
                suggestions.append("Reduce CA ratio (increase coarse aggregate gap)")

            if features['fa_c'] < 0.35:
                suggestions.append("Increase FAc ratio (add more material at SCS)")

            # Check gradation shape
            if features['grad_0_075mm'] > 6.0:
                suggestions.append("Reduce fines content (P200)")

        return suggestions

API Endpoints¶

Endpoint	Method	Description
`/api/ml/predict-vma`	POST	Get VMA prediction with suggestions
`/api/ml/vma-sensitivity`	POST	Sensitivity analysis (what if I change X?)
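
The sensitivity endpoint could be backed by a simple one-at-a-time perturbation, sketched below. The function name and the result shape are assumptions; `predict` stands in for whatever callable produces a VMA value from a feature mapping.

```python
from typing import Callable, Mapping


def vma_sensitivity(
    predict: Callable[[Mapping[str, float]], float],
    features: Mapping[str, float],
    field: str,
    deltas: list[float],
) -> list[dict[str, float]]:
    """Re-predict VMA with one input shifted by each delta, others held fixed."""
    if field not in features:
        raise KeyError(f"unknown feature: {field}")
    baseline = predict(features)
    results = []
    for delta in deltas:
        changed = dict(features)  # copy so the caller's inputs stay intact
        changed[field] = features[field] + delta
        vma = predict(changed)
        results.append({"delta": delta, "predicted_vma": vma, "change": vma - baseline})
    return results
```

One caveat: shifting a single sieve value also changes the Bailey ratios and the Bailey estimate, so the `predict` callable would need to recompute those derived features for the result to be meaningful.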

Response Example¶

{
  "predicted_vma": 14.8,
  "confidence_interval": [14.2, 15.4],
  "bailey_estimate": 14.5,
  "ml_correction": 0.3,
  "minimum_required": 14.0,
  "passes_minimum": true,
  "margin": 0.8,
  "suggestions": [],
  "related_predictions": {
    "vfa": 72.5,
    "dust_to_binder": 0.95
  }
}
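
The derived fields in this payload (`ml_correction`, `passes_minimum`, `margin`) can be computed from the other values rather than stored separately. A sketch, assuming one-decimal rounding and taking the confidence interval as an input, since this plan does not specify how it is computed; `build_response` is a hypothetical helper.

```python
import json


def build_response(predicted_vma, interval, bailey_estimate, min_required,
                   suggestions, related=None):
    """Assemble the response payload; derived fields are computed, not passed in."""
    low, high = interval
    return {
        "predicted_vma": round(predicted_vma, 1),
        "confidence_interval": [round(low, 1), round(high, 1)],
        "bailey_estimate": round(bailey_estimate, 1),
        "ml_correction": round(predicted_vma - bailey_estimate, 1),
        "minimum_required": min_required,
        "passes_minimum": predicted_vma >= min_required,
        "margin": round(predicted_vma - min_required, 1),
        "suggestions": list(suggestions),
        "related_predictions": dict(related or {}),
    }


payload = build_response(14.8, (14.2, 15.4), 14.5, 14.0, [],
                         {"vfa": 72.5, "dust_to_binder": 0.95})
print(json.dumps(payload, indent=2))
```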

UI Integration¶

Mix Design Form¶

Real-time VMA gauge as gradation is entered
Warning indicator if predicted VMA < minimum
Suggestion panel with actionable fixes

Gradation Entry Enhancement¶

[Sieve Entry] → [Live Bailey Analysis] → [ML-Corrected VMA] → [Pass/Fail Indicator]

Visual Elements¶

VMA gauge (green/yellow/red zones)
Comparison: Bailey vs. ML-corrected
Historical accuracy chart

Verification¶

Accuracy Targets¶

MAE < 0.5 percentage points of VMA
95% of predictions within ±1.0 percentage points of the lab-measured VMA
Improvement over Bailey alone: reduce MAE by at least 30%
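
These targets can be checked mechanically on held-out designs. A minimal sketch, assuming three equal-length lists of lab-measured, Bailey-only, and Bailey+ML values; the function names are illustrative.

```python
def mae(actual, predicted):
    """Mean absolute error in VMA percentage points."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fraction_within(actual, predicted, tolerance=1.0):
    """Share of predictions within +/- tolerance of the lab value."""
    hits = sum(abs(a - p) <= tolerance for a, p in zip(actual, predicted))
    return hits / len(actual)


def evaluate(actual, bailey_only, corrected):
    """Compare held-out predictions against the three accuracy targets."""
    mae_bailey = mae(actual, bailey_only)
    mae_hybrid = mae(actual, corrected)
    reduction = (mae_bailey - mae_hybrid) / mae_bailey if mae_bailey else 0.0
    within = fraction_within(actual, corrected)
    return {
        "mae_bailey": mae_bailey,
        "mae_hybrid": mae_hybrid,
        "mae_reduction": reduction,
        "within_1pt": within,
        "meets_targets": mae_hybrid < 0.5 and within >= 0.95 and reduction >= 0.30,
    }
```

Running `evaluate` on each cross-validation fold, rather than on the training data, gives the honest comparison between Bailey alone and the hybrid.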

Validation Workflow¶

Cross-validation on historical data
Compare Bailey-only vs. Bailey+ML accuracy
Shadow mode on new designs
Engineer feedback on suggestions quality

Value Proposition¶

Benefit	Impact
Fewer failed trial batches	Save 1-2 iterations per design
Earlier VMA feedback	Catch problems before mixing
Actionable guidance	Know HOW to fix, not just that it failed
Improved Bailey	Leverage lab data to enhance theory

Minimum VMA Requirements (Superpave)¶

NMAS (mm)	Minimum VMA (%)
37.5	11.0
25.0	12.0
19.0	13.0
12.5	14.0
9.5	15.0
4.75	16.0
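
The table above maps directly to the lookup that the predictor calls as `get_min_vma`. The sketch below is one possible implementation; the constant name and the choice to reject non-standard sizes are assumptions.

```python
# Minimum VMA (%) by nominal maximum aggregate size (mm), from the table above
MIN_VMA_BY_NMAS = {37.5: 11.0, 25.0: 12.0, 19.0: 13.0, 12.5: 14.0, 9.5: 15.0, 4.75: 16.0}


def get_min_vma(nmas: float) -> float:
    """Return the minimum VMA for a standard NMAS; reject sizes not in the table."""
    for size, minimum in MIN_VMA_BY_NMAS.items():
        if abs(size - nmas) < 1e-6:  # tolerate float noise in stored values
            return minimum
    raise ValueError(f"no Superpave minimum VMA defined for NMAS {nmas} mm")


def passes_minimum(predicted_vma: float, nmas: float) -> bool:
    """Pass/fail check used for the Minimum VMA output."""
    return predicted_vma >= get_min_vma(nmas)
```

Raising on an unknown NMAS, instead of interpolating, avoids reporting a pass against a threshold the specification does not define.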

Timeline¶

Week	Focus
1	Data extraction, feature engineering
2	Train correction model, validate vs Bailey
3	API endpoints
4	UI integration
5	Shadow mode, validation

Dependencies on Existing Code¶

File	Usage
`/app/services/bailey_method.py`	Use existing VMA estimator as baseline
`/app/models/superpave_mix_design.py`	Access gradation and volumetric data
`/app/constants.py`	Standard sieve sizes