But does it work?

Evaluating the Performance of the stockz.ai Scoring Model (Jan 2024 – Nov 2025)

1. Executive Summary

This report documents the first quantitative backtest of the stockz.ai composite scoring system, designed to evaluate whether the platform’s combined valuation and financial metrics can consistently identify outperforming equities.

Between January 2024 and November 2025, the simulated portfolio generated a cumulative return of 62.36%, compared to 45.64% for the S&P 500 benchmark. Despite the limited timeframe, the results indicate that the scoring model captured genuine predictive signals across varying market conditions while maintaining a moderate drawdown profile and robust risk-adjusted returns.

Key outcome: The backtest supports the hypothesis that a fundamentally-driven, multi-factor scoring approach—filtered by a conservative risk layer—can yield superior returns compared to a passive index allocation, even without access to forward-looking data or machine-learning prediction.

2. Objective

The goal of this initial backtest was to answer a simple question:

“Would a portfolio constructed purely from stockz.ai scores have outperformed the market over the past 22 months?”

This test serves as a proof-of-concept validation of the scoring logic’s real-world efficiency. It is not an attempt at performance optimization but rather an integrity check of the underlying model structure:

  • Do the composite scores correlate with future price performance?

  • Does the risk-score filter successfully exclude unstable or overleveraged equities?

  • Are the results robust across multiple rolling periods?

3. Methodology

3.1 Data Sources

  • Price data: Yahoo Finance (yfinance API).

  • Fundamental data: stockz.ai database, derived from the yfinance API.

  • Benchmark: S&P 500 Index (^GSPC).

3.2 Universe Selection

At each monthly rebalance date, the eligible universe consisted of all companies with valid stockz.ai scores and available historical prices.

3.3 Scoring Model

Each stock received an aggregated Total Score, computed as the product of its stockz.ai financial score and valuation score.
Additionally, each stock carries a stockz.ai Risk Score (0–5) capturing factors such as liquidity risk, leverage, and price volatility.

For this test:

  • Only stocks with risk_score == 0 were considered.

  • Among these, the Top 10 by total score were selected each month.

  • Allocation was equal-weighted across all holdings.
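The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual implementation: the `ScoredStock` class, its field names, and the `select_portfolio` helper are hypothetical, and the Total Score is assumed to be the product of the two sub-scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredStock:
    """Hypothetical record for one stock on a rebalance date."""
    ticker: str
    financial_score: float
    valuation_score: float
    risk_score: int  # 0 to 5, as described in the report

    @property
    def total_score(self) -> float:
        # Assumption: Total Score = financial score * valuation score.
        return self.financial_score * self.valuation_score


def select_portfolio(universe, top_n=10):
    """Return {ticker: weight} for the top_n stocks with risk_score == 0."""
    eligible = [s for s in universe if s.risk_score == 0]
    ranked = sorted(eligible, key=lambda s: s.total_score, reverse=True)
    picks = ranked[:top_n]
    if not picks:
        return {}  # nothing passes the risk filter: hold no positions
    weight = 1.0 / len(picks)  # equal weighting across all holdings
    return {s.ticker: weight for s in picks}
```

If fewer than ten stocks pass the risk filter, this sketch simply spreads the capital equally over whatever remains; the report does not say how that case was handled.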

3.4 Rebalancing Logic

Frequency: Monthly (1st trading day of each month).

  • Sell all holdings at month-end and reinvest equally into the new Top 10 selection.

  • Transaction costs: Ignored (assumed negligible at this stage).

  • Starting capital: $300.00 (arbitrary normalized baseline).
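The rebalancing loop can be illustrated as follows. This is a sketch under the stated rules (full liquidation each month, equal weights, no transaction costs); the function name and the input layout are assumptions made for illustration.

```python
def simulate_monthly_rebalance(monthly_holdings_returns, start_capital=300.0):
    """Compound capital through monthly equal-weight rebalances.

    monthly_holdings_returns: one inner list per month, holding the simple
    one-month returns (0.05 means +5%) of the stocks held during that month.
    Returns the equity curve, beginning with start_capital.
    """
    equity = [start_capital]
    for returns in monthly_holdings_returns:
        if returns:
            # Equal weighting: the portfolio return is the plain average.
            portfolio_return = sum(returns) / len(returns)
        else:
            # Assumption: with no eligible stocks the capital stays in cash.
            portfolio_return = 0.0
        equity.append(equity[-1] * (1.0 + portfolio_return))
    return equity


# Two months with two holdings each (made-up returns).
curve = simulate_monthly_rebalance([[0.10, 0.00], [-0.05, -0.05]])
print(curve)
```

Because every position is sold and rebought each month, the equity curve depends only on the average return of that month's holdings.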

3.5 Evaluation Metrics

To assess the strategy comprehensively, the following statistics were computed:

  • Cumulative Return and CAGR

  • Sharpe Ratio (annualized)

  • Sortino Ratio (annualized)

  • Max Drawdown

  • Volatility (stdev of monthly returns)

  • Calmar Ratio

  • Percentage of Profitable Trades

  • Rolling Window Performance (semiannual overlapping windows)
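For reference, these metrics can be computed from a series of monthly portfolio returns roughly as shown below. The report does not spell out its exact conventions, so this sketch assumes a zero risk-free rate, the sample standard deviation, annualization by the square root of 12, and a Sortino downside deviation taken over all months. All function names are hypothetical.

```python
import math
from statistics import mean, stdev

PERIODS_PER_YEAR = 12  # monthly returns


def cumulative_return(returns):
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def cagr(returns):
    years = len(returns) / PERIODS_PER_YEAR
    return (1.0 + cumulative_return(returns)) ** (1.0 / years) - 1.0


def sharpe_ratio(returns, risk_free=0.0):
    # risk_free is a per-month rate; assumed zero here.
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(PERIODS_PER_YEAR)


def sortino_ratio(returns, target=0.0):
    # Downside deviation: root mean square of shortfalls below the target.
    # Raises ZeroDivisionError if no month falls below the target.
    shortfalls = [min(0.0, r - target) ** 2 for r in returns]
    downside_dev = math.sqrt(sum(shortfalls) / len(returns))
    return (mean(returns) - target) / downside_dev * math.sqrt(PERIODS_PER_YEAR)


def max_drawdown(returns):
    """Largest peak-to-trough decline, as a positive fraction."""
    peak = value = 1.0
    worst = 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = max(worst, 1.0 - value / peak)
    return worst


def calmar_ratio(returns):
    return cagr(returns) / max_drawdown(returns)
```

Different conventions (for example a nonzero risk-free rate or a population standard deviation) would shift the ratios slightly, so the figures are only comparable when portfolio and benchmark use the same definitions.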

4. Results Overview

Metric               stockz.ai Portfolio    S&P 500 Benchmark
Cumulative Return    62.36%                 45.64%
CAGR                 30.26%                 22.76%
Sharpe Ratio         1.61                   1.65
Sortino Ratio        2.21                   1.32
Max Drawdown         10.73%                 10.8%
Volatility           5.08%                  3.76%
Calmar Ratio         2.82                   2.11

4.1 Aggregate Performance

The strategy achieved a cumulative gain of 62.36%, outperforming the S&P 500 by 16.72 percentage points.
The average monthly return was 2.35% with a volatility of 5.08%, resulting in an annualized Sharpe ratio of 1.61.
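The reported figures can be cross-checked against each other with simple arithmetic. The sketch below assumes the 22-month horizon stated in the objective, a zero risk-free rate for the Sharpe ratio, and the usual definition of the Calmar ratio as CAGR divided by maximum drawdown; the `annualize` helper is a hypothetical name.

```python
import math

MONTHS = 22  # horizon stated in the report's objective


def annualize(cumulative_return, n_months):
    """Convert a cumulative return over n_months into a CAGR."""
    return (1.0 + cumulative_return) ** (12.0 / n_months) - 1.0


cagr_portfolio = annualize(0.6236, MONTHS)
cagr_benchmark = annualize(0.4564, MONTHS)

# Outperformance in percentage points.
excess_pp = (0.6236 - 0.4564) * 100

# Calmar ratio = CAGR / max drawdown (reported drawdowns: 10.73% and 10.8%).
calmar_portfolio = cagr_portfolio / 0.1073
calmar_benchmark = cagr_benchmark / 0.108

# Annualized Sharpe from the average monthly return and monthly volatility,
# assuming a zero risk-free rate.
sharpe_portfolio = 0.0235 / 0.0508 * math.sqrt(12)

print(f"CAGR portfolio:   {cagr_portfolio:.2%}")
print(f"CAGR benchmark:   {cagr_benchmark:.2%}")
print(f"Excess return:    {excess_pp:.2f} pp")
print(f"Calmar portfolio: {calmar_portfolio:.2f}")
print(f"Calmar benchmark: {calmar_benchmark:.2f}")
print(f"Sharpe portfolio: {sharpe_portfolio:.2f}")
```

Under these assumptions the CAGR, excess return, and Calmar values match the table. The Sharpe ratio comes out at roughly 1.60 rather than the reported 1.61, a gap that rounding in the published monthly mean and volatility could plausibly explain.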

4.2 Rolling Windows

Of six overlapping semiannual backtest windows, four delivered positive results; the strongest window (Apr 2025 – Oct 2025) showed a 43.94% absolute gain with minimal drawdown.
Periods of underperformance coincided with broader market contractions, which suggests that the weakness stems from the model's market (beta) exposure rather than from a structural flaw.
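One way to produce such overlapping windows is shown below. The report does not state how far apart the windows start, so the three-month step is an assumption; it happens to yield exactly six six-month windows over a 22-month series. The function name is hypothetical.

```python
def rolling_window_returns(monthly_returns, window=6, step=3):
    """Cumulative return of each overlapping window.

    Returns a list of (start_index, cumulative_return) tuples, where
    start_index is the position of the window's first month in the series.
    """
    results = []
    last_start = len(monthly_returns) - window
    for start in range(0, last_start + 1, step):
        growth = 1.0
        for r in monthly_returns[start:start + window]:
            growth *= 1.0 + r
        results.append((start, growth - 1.0))
    return results


# Placeholder series: 22 months of a flat 1% return (not the real data).
windows = rolling_window_returns([0.01] * 22)
print(len(windows), [start for start, _ in windows])
```

Because the windows overlap, their results are not independent observations, which matters when reading "four of six" as evidence of consistency.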

4.3 Trade Distribution

Across 220 total trades:

  • Profitable trades: 57.34%

  • Largest single gain: 66.54% (ZIM)

  • Largest loss: -28.53% (STM)

High-beta but fundamentally sound companies (e.g. ZIM, TSM) frequently appeared among top performers, suggesting that the composite score efficiently captures value + momentum blends.

5. Interpretation

5.1 Evidence of Predictive Validity

The backtest provides encouraging evidence that the stockz.ai scoring framework can rank equities in a way correlated with future excess returns.
While the sample covers only 22 months, the consistent outperformance across different time slices and the statistically reasonable Sharpe and Sortino ratios indicate that the signal is not random noise.

5.2 Strengths Observed

  • High capital efficiency: strong CAGR with controlled volatility.

  • Resilience: max drawdown comparable to the benchmark despite higher average returns.

  • Risk filter success: excluding risk-score > 0 stocks improved downside protection.

  • Transparency: methodology is rule-based and replicable, not discretionary.

5.3 Weaknesses & Limitations

  • Limited timespan: 22 months is insufficient for full-cycle validation.

  • U.S. bias: predominantly U.S. stocks tested (international universes slightly underrepresented).

  • Neglected transaction costs: could slightly reduce real-world performance.

  • Lagged fundamentals: relies on past filings, with no real-time adjustments.

Nonetheless, even with these conservative assumptions, the model demonstrated clear alpha generation relative to a large-cap benchmark.

6. Next Steps

  1. Five-Year Historical Expansion
    Integrate a 5-Year Dataset to extend the testing horizon back to 2019+, covering multiple market environments.

  2. Sector-Specific Sub-Tests
    Evaluate model behavior across cyclical vs. defensive sectors to measure regime sensitivity.

  3. Cost-Adjusted Simulation
    Introduce 5–10 bps monthly friction to approximate transaction costs and check robustness.
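A rough sense of the cost-adjusted simulation can be had before rerunning the backtest. The sketch below assumes friction is charged as a fixed fraction of capital at each of the 22 monthly rebalances; both helper names are hypothetical, and the actual cost model may differ.

```python
def apply_monthly_friction(monthly_returns, cost_bps):
    """Reduce each monthly return by a proportional cost in basis points."""
    cost = cost_bps / 10_000.0
    return [(1.0 + r) * (1.0 - cost) - 1.0 for r in monthly_returns]


def cost_adjusted_cumulative(gross_cumulative, n_months, cost_bps):
    """Net cumulative return after n_months of proportional friction.

    With a proportional cost the drag factors out of the compounding,
    so only the gross cumulative return is needed, not the monthly path.
    """
    drag = (1.0 - cost_bps / 10_000.0) ** n_months
    return (1.0 + gross_cumulative) * drag - 1.0


for bps in (5, 10):
    net = cost_adjusted_cumulative(0.6236, 22, bps)
    print(f"{bps} bps/month -> net cumulative return {net:.2%}")
```

Under this assumption the reported 62.36% gross return would fall to roughly 60.6% at 5 bps and 58.8% at 10 bps per month, still above the benchmark's 45.64%. A full simulation would also need to account for bid-ask spreads and turnover, which this sketch ignores.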

7. Conclusion

Even at this early stage, the backtest indicates that stockz.ai’s data-driven scoring framework can translate fundamental analysis into measurable investment outperformance.
The approach appears especially effective at identifying under-appreciated yet financially sound companies, while avoiding high-risk outliers.

Although additional long-term validation is required, the results already demonstrate the core value proposition of stockz.ai:

Structured fundamentals can be systematically leveraged to build smarter portfolios.

The next iterations will extend this analysis over five-plus years of data and across broader markets.
If current trends persist, the stockz.ai model could serve as a transparent, quantitative foundation for modern fundamental investing.