Vector-Based Backtesting with VectorBT

Core idea

Vectorize the backtest, accept the trade-off

A backtest at heart is a function from (price_series, entry_signal, exit_signal, position_sizing, costs) to a stream of P&L numbers. The naive way to write that function is a for loop over each bar, updating positions and cash; it is easy to reason about and slow as molasses. The vector-based approach VectorBT pioneers does the opposite: compute every entry signal for every parameter combination as one big boolean array, compute every exit similarly, and apply the resulting trade ledger to the price array in vectorized NumPy / Numba code. The cost is realism — you cannot model order types, slippage that depends on volume, or path-dependent state without contortions. The gain is speed: tens of thousands of parameter combinations per minute on a laptop.
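As a rough illustration of the two styles, the sketch below runs the same long/flat moving-average crossover twice in plain NumPy: once bar by bar and once as array operations. It is not VectorBT's implementation; the helper names (sma, backtest_loop, backtest_vectorized), the synthetic price series, and the no-cost, fully-invested sizing are all assumptions made for the example.

```python
import numpy as np

def sma(x, n):
    """Simple moving average; the first n-1 values are NaN."""
    out = np.full(x.shape, np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[n - 1:] = (c[n:] - c[:-n]) / n
    return out

def backtest_loop(close, fast, slow):
    """Bar-by-bar version: final equity of a long/flat crossover, no costs."""
    f, s = sma(close, fast), sma(close, slow)
    equity, in_market = 1.0, False
    for t in range(1, len(close)):
        if in_market:                      # position was decided on the previous bar
            equity *= close[t] / close[t - 1]
        in_market = bool(f[t] > s[t])      # comparisons with NaN are False
    return equity

def backtest_vectorized(close, fast, slow):
    """Same logic as array operations: no Python loop over bars."""
    position = sma(close, fast) > sma(close, slow)   # True = long, False = flat
    gross = close[1:] / close[:-1]                   # bar-to-bar growth factors
    # The position held at bar t-1 earns the move from t-1 to t.
    return np.where(position[:-1], gross, 1.0).prod()

# Synthetic random-walk prices, for illustration only.
rng = np.random.default_rng(0)
close = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 500)))
print(backtest_loop(close, 10, 30), backtest_vectorized(close, 10, 30))
```

Both functions should print the same final equity up to floating-point rounding. The vectorized form is the one that extends to many parameter combinations at once, because the position array can simply gain an extra axis.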

The cookbook positions VectorBT as the iteration-loop tool, not the production backtester. You use it to brute-force sweep moving-average windows, ATR multipliers, lookback periods, or any other numeric parameter, and you find the parameter regions where the strategy works. Then you take the winning combinations to an event-based backtester (Zipline, in the next topic) where order modelling is real but iteration is slow.

Walk-forward is how you distinguish skill from overfitting

The deepest idea in the topic is the difference between in-sample performance (which is always good if you tuned for it) and out-of-sample performance (which is the only signal that matters). Walk-forward optimization formalizes this: split the data into rolling (train, test) windows; tune parameters on the training window; evaluate them, untouched, on the next test window; slide both forward; repeat. The distribution of out-of-sample Sharpes — not the best in-sample Sharpe — is the strategy's true expected performance.

The cookbook drives the point home with a t-test: for the AAPL moving-average crossover, brute-force-optimized in-sample, the mean out-of-sample Sharpe is negative, and the one-sided "out-of-sample > in-sample" test returns a p-value of 0.86. The strategy is overfit. The right response is not to keep searching; it is to discard it.

Why it matters

A backtest that takes hours kills iteration

If a single backtest takes 10 minutes, sweeping 30 fast-MA × 30 slow-MA parameter combinations on 5 stocks across 30 walk-forward splits is 30 × 30 × 5 × 30 = 135,000 backtests. At 10 minutes each that is 1,350,000 minutes, or about 22,500 hours: roughly 940 days of compute. With VectorBT's Numba-JIT-compiled core, the same sweep takes seconds. The change in turnaround time is the change in what's possible: from "I can test one strategy this week" to "I can test thirty strategies before lunch." Speed compounds: a tool that's 1000× faster doesn't just save time, it changes which questions are worth asking.

Vectorization makes overfitting more dangerous, not less

The dark side of fast iteration: when you can sweep ten thousand parameter combinations cheaply, you will sweep ten thousand parameter combinations, and one of them will produce a backtest curve so beautiful you start believing it. Some of them are statistical accidents — exactly what you'd expect from drawing ten thousand samples. VectorBT's speed makes the temptation worse. The walk-forward discipline is the antibody. Always.
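A small simulation can make the "statistical accidents" point concrete. The sketch below draws ten thousand return streams with no edge at all and reports the best Sharpe ratio among them. The return distribution (normal, 1% daily volatility), the one-year horizon, and the zero risk-free rate are assumptions chosen for illustration, not anything from the cookbook.

```python
import numpy as np

rng = np.random.default_rng(42)
n_strategies, n_days = 10_000, 252

# Daily returns with zero true edge: every "strategy" is pure noise.
returns = rng.normal(loc=0.0, scale=0.01, size=(n_strategies, n_days))

# Annualized Sharpe ratio per strategy (risk-free rate assumed zero).
sharpes = returns.mean(axis=1) / returns.std(axis=1, ddof=1) * np.sqrt(252)

print(f"mean Sharpe:        {sharpes.mean():.2f}")
print(f"best Sharpe:        {sharpes.max():.2f}")
print(f"share above 2.0:    {(sharpes > 2.0).mean():.1%}")
```

The average sits near zero, as it should, yet the best of the batch typically looks like an excellent strategy. Picking the maximum of a large sweep is exactly this selection step, which is why the in-sample winner needs an out-of-sample check.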

Mental model

Vector-based vs event-based backtesting

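The two paradigms sit at opposite ends of a speed/realism trade-off. Vector-based backtesting computes signals and P&L as whole-array operations: it is fast enough for large parameter sweeps, but it cannot easily model order types, volume-dependent slippage, or path-dependent state. Event-based backtesting processes the data step by step with realistic order modelling, at the cost of much slower iteration.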

Walk-forward optimization, step by step

A walk-forward harness has four moving parts: data, splits, optimization, validation. The rolling_split method takes the data and produces the splits; the user supplies the optimization and the validation. Its main arguments:

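n: number of (in-sample, out-of-sample) windows.
window_len: total length of each window, in rows (days for daily bars).
set_lens: tuple of set lengths within each window; one set is left variable and takes whatever remains. The cookbook uses (180,), meaning 180 days reserved for the test set.
left_to_right: whether set_lens is resolved from the left edge of the window. Passing False fixes the last (test) set at 180 days and lets the training set take the rest.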
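To make the window geometry concrete, here is a minimal sketch of a rolling splitter in NumPy. rolling_windows is a hypothetical helper, not vectorbt's implementation: it assumes window starts are spread evenly across the data and that the last test_len rows of each window form the test set.

```python
import numpy as np

def rolling_windows(n_rows, n, window_len, test_len):
    """Return n (train_idx, test_idx) pairs of integer row positions.

    Hypothetical helper: evenly spaced window starts, with the last
    `test_len` rows of every window reserved for testing.
    """
    if not 0 < test_len < window_len <= n_rows:
        raise ValueError("need 0 < test_len < window_len <= n_rows")
    starts = np.linspace(0, n_rows - window_len, num=n).round().astype(int)
    splits = []
    for start in starts:
        window = np.arange(start, start + window_len)
        splits.append((window[:-test_len], window[-test_len:]))
    return splits

# Same shape of call as the cookbook's settings: 30 two-year windows, 180 test days.
splits = rolling_windows(n_rows=2500, n=30, window_len=365 * 2, test_len=180)
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))
```

Each training slice ends exactly where its test slice begins, so parameters chosen on the training rows never see the rows they are later judged on.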

For each split:

Optimize the strategy on the in-sample slice. The cookbook brute-forces all (fast, slow) MA combinations and picks the pair with the highest Sharpe.
Run that winning pair on the out-of-sample slice. Record its Sharpe.
Slide the window forward and repeat.

You end up with two distributions: in-sample Sharpes (always high, because you tuned for them) and out-of-sample Sharpes (the honest signal). A one-sided t-test on out > in is the cookbook's evidence test.
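The loop described above can be sketched end to end in NumPy. This is an illustrative stand-in rather than the cookbook's VectorBT code: the helpers (sma, crossover_sharpe, walk_forward), the long/flat position rule, the zero-cost assumption, and the synthetic prices are all assumptions made for the example.

```python
import itertools
import numpy as np

def sma(x, n):
    """Simple moving average; the first n-1 values are NaN."""
    out = np.full(x.shape, np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[n - 1:] = (c[n:] - c[:-n]) / n
    return out

def crossover_sharpe(close, fast, slow):
    """Annualized Sharpe of a long/flat MA crossover (no costs, zero risk-free rate)."""
    position = (sma(close, fast) > sma(close, slow)).astype(float)
    daily = position[:-1] * (close[1:] / close[:-1] - 1.0)
    sd = daily.std(ddof=1)
    return 0.0 if sd == 0 else daily.mean() / sd * np.sqrt(252)

def walk_forward(close, n, window_len, test_len, windows):
    """Tune (fast, slow) on each training slice, then score it on the next test slice."""
    pairs = list(itertools.combinations(windows, 2))   # ascending input gives fast < slow
    starts = np.linspace(0, len(close) - window_len, num=n).round().astype(int)
    in_sharpes, out_sharpes = [], []
    for start in starts:
        split = start + window_len - test_len
        train, test = close[start:split], close[split:start + window_len]
        scores = [crossover_sharpe(train, f, s) for f, s in pairs]
        best = int(np.argmax(scores))                  # brute-force optimum in-sample
        in_sharpes.append(scores[best])
        # Simplification: moving averages warm up again inside the test slice.
        out_sharpes.append(crossover_sharpe(test, *pairs[best]))
    return np.array(in_sharpes), np.array(out_sharpes)

# Synthetic random-walk prices, for illustration only.
rng = np.random.default_rng(1)
close = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 1500)))
in_s, out_s = walk_forward(close, n=10, window_len=500, test_len=120,
                           windows=range(5, 30, 5))
print(f"mean in-sample Sharpe:     {in_s.mean():.2f}")
print(f"mean out-of-sample Sharpe: {out_s.mean():.2f}")
```

On prices with no real structure, the in-sample figure is typically flattering because it is a maximum over the grid, while the out-of-sample figure has no such advantage. The two arrays are the inputs to the t-test.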

The indicator factory pattern

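VectorBT (and VectorBT PRO) ships a factory class called IndicatorFactory that turns any function (*inputs, *params) -> outputs into a first-class indicator with vectorized parameter sweeps, MultiIndex outputs, and built-in plotting. The structure below uses the PRO spelling (vbt.IF with with_apply_func); the open-source package exposes the same idea as vbt.IndicatorFactory with from_apply_func: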

SuperTrend = vbt.IF(
    class_name="SuperTrend",
    short_name="st",
    input_names=["high", "low", "close"],
    param_names=["period", "multiplier"],
    output_names=["supert", "superd", "superl", "supers"],
).with_apply_func(supertrend_fn, takes_1d=True)

st = SuperTrend.run(high, low, close,
                    period=np.arange(4, 20),
                    multiplier=np.arange(20, 41) / 10,
                    param_product=True)

The factory runs supertrend_fn over every (period, multiplier) combination and returns an indicator object whose outputs are DataFrames with MultiIndex columns, one column per combination. That object integrates with the rest of VectorBT (signals, portfolios, heatmaps).
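The core of what a parameter-product run produces can be imitated in a few lines of pandas. The sketch below is not IndicatorFactory: run_product and upper_band are hypothetical names, and the toy indicator (rolling mean plus a multiple of rolling standard deviation) stands in for SuperTrend only to show how the swept parameters become column levels.

```python
import itertools
import numpy as np
import pandas as pd

def upper_band(close, period, multiplier):
    """Toy indicator: rolling mean plus `multiplier` rolling standard deviations."""
    roll = close.rolling(int(period))
    return roll.mean() + multiplier * roll.std()

def run_product(close, func, **param_grid):
    """Evaluate func for every combination of the given parameter values.

    Returns a DataFrame whose columns are a MultiIndex with one level per parameter.
    """
    names = list(param_grid)
    combos = list(itertools.product(*param_grid.values()))
    outputs = [func(close, **dict(zip(names, combo))) for combo in combos]
    return pd.concat(outputs, axis=1, keys=combos, names=names)

# Synthetic prices, for illustration only.
rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 300))))

bands = run_product(close, upper_band,
                    period=np.arange(4, 20),
                    multiplier=np.arange(20, 41) / 10)
print(bands.shape)                                   # rows x combinations
print(bands.xs(10, level="period", axis=1).shape)    # every multiplier for period 10
```

Keeping the parameters in the column index is what lets later steps slice, group, and plot results by parameter without any bookkeeping of their own.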

The pattern generalizes: any indicator can become a factory-built sweep-ready VectorBT object. Pair it with Numba's @njit for compiled speed, and a moderately complex indicator (SuperTrend, Bollinger Bands, custom factor) sweeps tens of thousands of combinations in seconds.

Practical application

The parameter-sweep workflow

For a quant who wants to know "is there a profitable moving-average crossover on this asset?", the cookbook's pattern looks like:

Pull prices for the asset (or asset universe).
Define a parameter grid: windows = np.arange(10, 40) — every integer from 10 to 39.
Use vbt.MA.run_combs(price, windows, r=2) to generate every (fast, slow) pair where fast < slow.
Compute crossovers in one call: fast.ma_crossed_above(slow) → entries; fast.ma_crossed_below(slow) → exits.
pf = vbt.Portfolio.from_signals(price, entries, exits, freq="D") runs every combination's backtest.
Inspect: pf.sharpe_ratio(), pf.total_return(), pf.max_drawdown().
Plot as a heatmap. Look for stable parameter regions, not single peaks.

The whole loop is 20-ish lines and finishes in seconds, even for hundreds of combinations.
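Put together, the steps above might look like the following sketch. It assumes the open-source vectorbt package and a pandas Series of daily closes named price; the wrapper function sweep_ma_crossover is a hypothetical name, and the import is placed inside it so the grid arithmetic at the top runs even where vectorbt is not installed.

```python
import itertools
import numpy as np

windows = np.arange(10, 40)                               # 10, 11, ..., 39
n_pairs = len(list(itertools.combinations(windows, 2)))   # pairs with fast < slow
print(f"{n_pairs} (fast, slow) pairs")

def sweep_ma_crossover(price):
    """Backtest every (fast, slow) pair on `price`, a pandas Series of closes.

    Assumes the open-source vectorbt package is installed.
    """
    import vectorbt as vbt  # lazy import: only needed when the sweep is run

    # One indicator object for the fast legs, one for the slow legs.
    fast_ma, slow_ma = vbt.MA.run_combs(
        price, window=windows, r=2, short_names=["fast", "slow"]
    )
    entries = fast_ma.ma_crossed_above(slow_ma)
    exits = fast_ma.ma_crossed_below(slow_ma)

    # A single call simulates one portfolio per column, i.e. per pair.
    pf = vbt.Portfolio.from_signals(price, entries, exits, freq="1D")
    return pf.sharpe_ratio(), pf.total_return(), pf.max_drawdown()
```

Each returned Series is indexed by the two window levels, so it can be passed to the heatmap accessor, for example pf.total_return().vbt.heatmap(x_level="fast_window", y_level="slow_window"), assuming those level names. No fees or slippage are set here, matching the call in the steps; results with costs will look worse.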

Walk-forward — the disciplined add-on

To gate the parameter sweep against overfitting:

Generate splits: (in_price, in_idx), (out_price, out_idx) = prices.vbt.rolling_split(n=30, window_len=365*2, set_lens=(180,), left_to_right=False).
Sweep all parameters on each in-sample slice; pick the best for that slice.
Run those best parameters on the corresponding out-of-sample slice.
Compare: scipy.stats.ttest_ind(out_sample_sharpes, in_sample_sharpes, alternative="greater").
If the test fails, meaning the p-value is large and there is no evidence that out-of-sample Sharpe holds up against in-sample Sharpe, discard the strategy.
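The comparison step can be tried in isolation. The Sharpe arrays below are synthetic numbers generated for illustration, not results from any backtest; they are shaped like the overfit case, with a high in-sample mean and an out-of-sample mean near zero.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic Sharpe ratios, one per walk-forward split (illustration only).
in_sample_sharpes = rng.normal(loc=1.5, scale=0.5, size=30)
out_sample_sharpes = rng.normal(loc=-0.1, scale=0.8, size=30)

# One-sided test; alternative hypothesis: mean(out-of-sample) > mean(in-sample).
result = stats.ttest_ind(out_sample_sharpes, in_sample_sharpes, alternative="greater")
t_stat, p_value = result.statistic, result.pvalue

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
if p_value > 0.05:
    print("No evidence that out-of-sample performance holds up: discard.")
```

A p-value close to 1, as this kind of input produces, is the same pattern as the cookbook's 0.86: the data give no support to the claim that the tuned parameters carry over. The 0.05 threshold is a conventional choice, not one taken from the cookbook.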

This is the discipline that separates a researcher from a curve-fitter.

Example

Consider a quant building a "volatility-targeted moving-average crossover" strategy on a basket of 50 large-cap US equities. The hypothesis: the crossover works on stocks where 21-day realized volatility is between 15% and 35% annualized — too quiet and there are no trends; too loud and the crossover lags too much.

The VectorBT workflow:

Pull 5 years of daily bars for the 50 tickers via OpenBB.
Compute the 21-day rolling annualized volatility for each ticker.
Build a time-varying universe mask: only include each ticker on days when its rolling vol is in [0.15, 0.35].
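Generate fast_ma and slow_ma with a sweep over (fast=5..20, slow=20..60): roughly 270 combinations per ticker, which implies a coarser-than-unit step, since every integer pair in those ranges would give about 650.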
Build entries and exits, masked by the vol-regime gate.
Run Portfolio.from_signals(prices, entries, exits, freq="D") — a single call generates ~13,500 backtest results (270 × 50).
Wrap the whole thing in a walk-forward harness with 12 rolling 18-month windows.
For each split, find the parameter combination with the highest in-sample mean Sharpe across the 50 tickers; apply those same parameters to the out-of-sample slice.
Plot the in-sample-vs-out-of-sample Sharpe distributions side by side.

If the out-of-sample distribution has a positive median and the t-test is significant, the volatility gate is doing something real. If not, the strategy is overfit and the trader saves themselves from putting capital on it. Either way, the answer arrives in minutes — that's the point of vector-based backtesting.
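The vol-regime gate from the workflow above is the one piece that is not a stock VectorBT call, so a sketch may help. vol_regime_mask is a hypothetical helper written in pandas; it assumes close is a DataFrame of daily closes with one column per ticker and uses 252 trading days to annualize.

```python
import numpy as np
import pandas as pd

def vol_regime_mask(close, window=21, low=0.15, high=0.35, periods_per_year=252):
    """True where a ticker's rolling annualized volatility lies in [low, high].

    Rows without a full window of returns are False (NaN comparisons fail).
    """
    daily_returns = close.pct_change()
    vol = daily_returns.rolling(window).std() * np.sqrt(periods_per_year)
    return (vol >= low) & (vol <= high)

# Synthetic prices for three hypothetical tickers with different volatility.
rng = np.random.default_rng(3)
daily_vol = np.array([0.005, 0.015, 0.04])          # quiet, in range, loud
rets = rng.normal(0.0, daily_vol, size=(500, 3))
close = pd.DataFrame(100 * np.exp(rets.cumsum(axis=0)),
                     columns=["QUIET", "MID", "LOUD"])

mask = vol_regime_mask(close)
print(mask.mean())    # share of days each ticker spends inside the regime

# Gating a boolean entries frame of the same shape is then elementwise:
# gated_entries = entries & mask
```

Whether exits are masked the same way or forced when a ticker leaves the regime is a design choice that changes the results, so it is worth stating explicitly in the harness.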

