Stepwise Selection Performance
StepwiseSelector includes a high-performance Rust engine that leverages parallel processing to test candidate features simultaneously. This provides a massive speedup over traditional serial implementations (like statsmodels).
Key Improvements
- Parallel Fitting: Utilizes all available CPU cores via
Rayonin Rust. - Zero Python Overhead: The entire inner loop (fitting hundreds of candidate models) happens inside the native Rust extension.
- Numerical Parity: Guaranteed 100% consistency with
statsmodelsfor AIC, BIC, and p-values.
Benchmark Results
Test Environment: - Rows: 10,000 - Features: 150 (32 selected) - CPU: 10 Cores (macOS)
| Engine | Execution Mode | Time (s) | Speedup | Result Parity |
|---|---|---|---|---|
| Python (statsmodels) | Serial | 185.86 | 1.0x | - |
| Rust (Newt) | Parallel | 8.02 | 23.2x | 100% Match |
Usage
By default, StepwiseSelector uses engine='auto' (prefer Rust; fallback to Python when Rust is unavailable).
from newt.features.selection import StepwiseSelector
# Uses auto engine by default (prefer Rust)
selector = StepwiseSelector(
direction='forward',
criterion='aic',
engine='auto',
verbose=True
)
selector.fit(X_transformed, y)
For debugging or comparing with legacy results, you can manually switch back to the Python engine:
# Fallback to statsmodels engine
selector = StepwiseSelector(engine='python')
Reproducibility
You can run your own benchmark using the script provided in the repository:
uv run python benchmarks/stepwise_benchmark.py --samples 10000 --features 150