Is Your System Broken?

The Ultimate Test of Efficacy: Time

At Futures Truth, we focused on one thing: walk-forward results across hundreds of trading systems. Back then, the industry was unrecognizable compared to today—if you can even call what exists now an “industry.”

Just this morning, I was reminiscing about vendors who sold strategies for upwards of $5,000. Those packages usually consisted of a set of rules (disclosed or not) and a rudimentary DOS-based program to generate simple charts and next-day orders. This was well before the heyday of TradeStation; those days are gone forever, for better or worse.

Back then, if a system performed well in walk-forward analysis, it earned a ranking. In those days, a Futures Truth ranking meant something. Sometimes, however, that ranking felt like a curse. Critics even argued you could “fade” a top-ranked system and make more money by taking the opposite trade. The reality is that many of those legendary “Top Ten” systems—built for a different era—simply wouldn’t survive in today’s markets.

Side Bar: Services such as Striker and The Collective still monitor trading systems. Striker shows only real execution, which is mostly a good thing: real execution costs are included. But so is broker error. Heck, we are all human and we all make mistakes. Striker is upfront about this and states that they will do the best they can; execution can be a beast.

Speaking of execution – The Sunday Evening Gap

A Trader’s Nightmare. Most likely, trend followers were short during the huge gap here!

Imagine having a stop order in crude oil on a Sunday evening when the market opens thousands of dollars through your stop. If a system-derived position is short, the broker’s only directive is to GET OUT, no matter what, at the first off-ramp. The above chart shows the slippage on a single contract of crude futures at the genesis of the Iran War.

The Million-Dollar Question

The second most common question I was asked at Futures Truth was: “How do I know if the system I’m trading is broken?”

Can you guess the most common one? “If it were your money, which system would you trade?”

We never answered that one directly; there was never a simple black-and-white answer. But the second question—the one about “broken” systems—usually surfaced when a trader was deep in a drawdown.

Seasoned traders know that systems ebb and flow; drawdown is simply the “tax” we pay to play the game. Others, however, get “married” to a system and stick with it until “death do us part.” My standard response back then was always:

“Is your current performance still within the boundaries of the backtest?”

In other words, has the system exceeded its historical maximum drawdown by a meaningful margin? Or does the current real-time performance look like other “rough patches” in the historical equity curve? If the performance was still inside those boundaries, the trader would usually give it a little more room.
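The “boundaries of the backtest” question can be turned into a simple check. What follows is a minimal sketch, not the method we used at Futures Truth: it assumes equity curves are plain lists of cumulative dollar values, and the function names and the 1.5 tolerance multiplier are hypothetical choices for illustration.

```python
def max_drawdown(equity):
    """Largest peak-to-valley decline (in dollars) in an equity curve."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)           # highest equity seen so far
        worst = max(worst, peak - value)  # deepest decline from that peak
    return worst


def within_backtest_bounds(backtest_equity, live_equity, tolerance=1.5):
    """True while the live drawdown stays under tolerance x the backtest maximum.

    tolerance is a hypothetical cushion; pick one that suits your own risk limits.
    """
    historical = max_drawdown(backtest_equity)
    live = max_drawdown(live_equity)
    return live <= tolerance * historical


# Hypothetical curves for illustration only.
backtest = [0, 1000, 500, 2000, 1200, 3000]  # worst drawdown: 2000 - 1200 = 800
live = [3000, 3400, 2500, 2900]              # worst drawdown: 3400 - 2500 = 900

print(max_drawdown(backtest), max_drawdown(live))
print(within_backtest_bounds(backtest, live))
```

With a cushion of 1.5, a $900 live drawdown against an $800 historical maximum still counts as a rough patch; tighten the cushion to 1.0 and the same system is flagged.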

Into Unexplored Waters

Those were simpler days, but the core problem remains: we cannot see the future. No forward analysis can tell you with certainty what a trading system will do next. What it can do is tell you when the system has moved into “unexplored waters.” Once you know that, the decision to stay, abandon, or pause becomes much easier.

All trading systems oscillate between success and failure. If a system has a genuine technical edge, that edge may eventually reassert itself—provided you have the time and capital to wait. But most traders don’t have unlimited resources.

Recently, the market activity surrounding the Iran conflict has pushed many systems into intense drawdowns. The same old questions have reared their heads again. This time, however, I wanted to provide a more empirically derived analysis. I’ve developed a disciplined approach to measuring risk and reward that moves beyond “gut feel.”

To illustrate, I pulled a system off my shelf that I originally designed for retail consumption back in June 2018. Here is how it’s currently holding up:

Hypothetical Results: Before and after development.

The Profit Mirage

This is a mean-reversion approach. I remember thinking back in 2018: How much longer can this bull market continue? With that in mind, I utilized a simple regime filter. After just a few weeks of trading later that year, I genuinely thought the system might be broken. The market spent a large portion of that fall and winter below its 200-day moving average, and shorting simply wasn’t working.
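For readers who have not built one, a regime filter of this kind can be sketched in a few lines. This is not the rule set of the 2018 system, which I am not disclosing here; it is a generic illustration that assumes a simple moving average of daily closes, with the 200-day lookback taken from the text and everything else (function name, labels) hypothetical.

```python
def regime(closes, lookback=200):
    """Label the market 'bull' or 'bear' from a simple moving average.

    Returns None until there are at least `lookback` closes to average.
    """
    if len(closes) < lookback:
        return None
    average = sum(closes[-lookback:]) / lookback
    return "bull" if closes[-1] > average else "bear"


# Hypothetical price series for illustration only.
rising = list(range(1, 251))      # steadily rising closes
falling = list(range(250, 0, -1)) # steadily falling closes

print(regime(rising))
print(regime(falling))
print(regime(rising[:50]))
```

A mean-reversion system might only take short setups in the “bear” regime and long setups in the “bull” regime, which is why a long stay below the 200-day average matters so much to the results.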

The system then went dormant for a long stretch, finally “waking up” at the height of the pandemic. Had you stuck with it, you would be up significantly today (though we stopped trading right at the onset of the pandemic).

But here is the catch: Profit, by itself, is not enough to determine if a system is broken. As long as they are making money, most traders never bother to peek below the surface. This system would likely be sitting near the top of the Futures Truth rankings today. But let’s dive in and see how it actually performed on its “test of time.”

Test 1 – Baseline Projection

Baseline projection based on in-sample average monthly return.

Is Outperformance Always a Good Thing?

Looking at the chart above, you see a steep deviation below the baseline initially, followed by a rocket-ship move to the upside. By late 2025, the actual equity is sitting way above the red dashed expectation line.

Most traders would see this and think they’ve struck gold. “Isn’t this what we want—a strong positive deviation?” they’d ask.

In a world of simple “bottom-line” thinking, the answer is yes. But in the world of professional algorithmic trading, this chart is screaming a different story. To a seasoned developer, deviation is deviation. Whether it’s to the upside or downside, moving this far away from the “expected” path suggests that the strategy’s original statistical model is no longer in control.

When a system starts generating 190% of its expected return, it’s often because it has inadvertently stepped into a high-volatility regime it wasn’t designed to navigate. If the “upside” is this aggressive, you can bet the “downside” risk has scaled right along with it.
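The baseline idea can be sketched as follows. This is a simplified illustration, not the TS-SystemChecker code: it assumes monthly gains in dollars, a straight-line (non-compounded) projection, and forward equity measured as cumulative profit from zero. All names and numbers are hypothetical.

```python
def baseline_projection(in_sample_monthly_gains, months_forward):
    """Straight-line cumulative profit implied by the in-sample average monthly gain."""
    average = sum(in_sample_monthly_gains) / len(in_sample_monthly_gains)
    return [average * month for month in range(1, months_forward + 1)]


def deviation_from_baseline(actual_equity, projected_equity):
    """Month-by-month gap (actual minus projected), in dollars."""
    return [a - p for a, p in zip(actual_equity, projected_equity)]


# Hypothetical data for illustration only.
in_sample = [300, -100, 500, 700, -200, 600]  # in-sample monthly gains, average 300
projected = baseline_projection(in_sample, months_forward=4)
actual = [-200, 100, 1500, 2280]              # forward cumulative profit by month

gaps = deviation_from_baseline(actual, projected)
attainment = actual[-1] / projected[-1]       # share of the expected gain delivered

print(projected)
print(gaps)
print(round(attainment, 3))
```

In this toy example the forward path starts $500 below the line and finishes well above it, the same shape as the chart: an early shortfall followed by a large overshoot. Both are deviations from the model.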

Test 2 – Monte Carlo Analysis on Walk Forward

Return and Drawdown sitting on the tails.

The Statistical Reality Check

To understand why I called this system “Degraded” despite the profits, we have to look at the Monte Carlo Walk Forward distributions. This is where we compare real-time performance against thousands of simulated “alternate realities” based on the system’s history.

1. The Equity Distribution (The Good News… or is it?)

In the top chart, our actual forward equity of $67,575 (the dashed red line) sits at the 97th percentile.

  • Interpretation: Out of 1,000 possible outcomes, the system performed better than 970 of them. While this looks great, being this far out on the “tail” of the distribution suggests we are no longer operating in a normal environment.

2. The Drawdown Distribution (The Warning)

The bottom chart is the real story. The actual worst drawdown reached $18,588, placing it in the 98th–99th percentile of severity.

  • The Comparison: The median expected drawdown was only $8,125.

  • The Verdict: We have blown past the 90th percentile of $13,611 and are deep into the “danger zone.”
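To show where percentile figures like these come from, here is a minimal Monte Carlo sketch. It assumes a bootstrap, that is, resampling historical trade results with replacement, which may differ from the resampling scheme my software uses. The trade list, the forward results, and every function name are hypothetical.

```python
import random


def simulate(trades, n_trades, n_paths=1000, seed=42):
    """Resample historical trades with replacement to build alternate equity paths.

    Returns two lists: the final equity and the worst drawdown of each path.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    finals, drawdowns = [], []
    for _ in range(n_paths):
        equity = peak = worst = 0.0
        for _ in range(n_trades):
            equity += rng.choice(trades)
            peak = max(peak, equity)
            worst = max(worst, peak - equity)
        finals.append(equity)
        drawdowns.append(worst)
    return finals, drawdowns


def percentile_rank(samples, value):
    """Percent of simulated outcomes at or below `value`."""
    return 100.0 * sum(1 for s in samples if s <= value) / len(samples)


# Hypothetical historical trade results, in dollars.
history = [450, -300, 820, -150, 600, -700, 1200, -250, 300, -400]
finals, drawdowns = simulate(history, n_trades=60)

# Hypothetical forward results to place on the two distributions.
actual_equity, actual_drawdown = 14000, 3500
print(percentile_rank(finals, actual_equity))
print(percentile_rank(drawdowns, actual_drawdown))
```

A forward result that lands near the 50th percentile on both distributions is behaving like its history. One that lands in the high 90s on both, as this system did, is telling you the environment has changed.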

The Bottom Line

When a system hits the 97th percentile for gains but simultaneously hits the 99th percentile for drawdown stress, the math is telling you that the character of the strategy has changed. You aren’t just trading a system in a “rough patch”; you are trading a system that has moved into a risk regime it was never built to survive. This is the disciplined approach, the one that moves beyond “gut feel,” that I mentioned earlier. Without these charts, you’re just guessing. With them, you have the data to justify stepping aside.

The Verdict: Test 3 – Overall Assessment

This is where the “gut feel” ends and empirical analysis begins. The following commentary is generated by my new software, a quasi-expert system designed to pair raw statistical results with descriptive, actionable interpretation.

Status: DEGRADED

Risk Assessment: CRITICAL RISK

Incubation/Trading Readiness Score: 4 / 8

Return delivery is running ahead of baseline: expected annual return is 18% versus actual annual return of 34% and expected annual gain of $4,477 compares with actual annual gain of $8,536. However, that stronger return delivery has come with a less stable path and/or materially heavier risk than history would suggest. Risk is materially worse than the historical profile: actual worst drawdown of $18,588 is 2.670 times the historical drawdown of $6,962. Risk conditions are now in the Critical Risk range. The realized monthly path is poorly aligned with the baseline, based on monthly equity correlation of 0.960, projection RMSE of $21,789, normalized RMSE of 4.867, and path wander ratio of 0.621. Correlation still describes directional similarity, but RMSE and path wander ratio show how far the realized path has wandered from projection over the same window. Monte Carlo context is cautionary: actual forward equity is $67,575, gain percentile is 97% (near the top of the simulated distribution), and drawdown percentile is 99% (in the worst decile for drawdown stress). Taken together, the system shows meaningful deterioration in incubation.

A readiness reading of 4 out of 8 indicates that this system has degraded to the point where caution, and I mean extreme caution, should be used in deciding whether to trade this strategy. The major factors influencing the rating are:

  • Return Attainment: 1.907. This indicates the system has achieved 190.7% of its historically expected return during the incubation/trading period. In practical terms, the strategy is generating returns at nearly twice the pace implied by its historical baseline.
  • Path Wander Ratio: 0.621. This indicates that the actual equity path is deviating from the projected path by a meaningful amount relative to the total expected move over the incubation/trading period. Put simply, the system is not wildly off course, but it is no longer tracking the historical baseline tightly.
  • Drawdown Stress: 2.67. This indicates that the actual worst drawdown has expanded to roughly 2.7 times the level suggested by the system’s historical baseline. Put simply, the strategy may still be generating return, but it is doing so while absorbing far more pain than its historical profile would justify.
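The first and third factors are plain ratios, and the figures quoted in the assessment reproduce them. The sketch below also includes a path wander calculation, but note the assumption: it treats the ratio as RMSE divided by the total projected move, which fits the description above but is not confirmed as the exact formula in the software. Function names and the sample paths are hypothetical.

```python
import math


def return_attainment(actual_annual_gain, expected_annual_gain):
    """Actual return as a multiple of the historically expected return."""
    return actual_annual_gain / expected_annual_gain


def drawdown_stress(actual_worst_dd, historical_worst_dd):
    """Actual worst drawdown as a multiple of the historical worst drawdown."""
    return actual_worst_dd / historical_worst_dd


def rmse(actual, projected):
    """Root-mean-square gap between the actual and projected equity paths."""
    squared = [(a - p) ** 2 for a, p in zip(actual, projected)]
    return math.sqrt(sum(squared) / len(squared))


def path_wander_ratio(actual, projected):
    """ASSUMED definition: RMSE divided by the total projected move.

    Both paths are expected to include the starting point at index 0.
    """
    total_move = abs(projected[-1] - projected[0])
    return rmse(actual, projected) / total_move


# Figures quoted in the assessment above.
print(round(return_attainment(8536, 4477), 3))
print(round(drawdown_stress(18588, 6962), 3))

# Hypothetical monthly paths for illustration only.
projected = [0, 100, 200, 300, 400]
actual = [0, 50, 150, 400, 500]
print(round(rmse(actual, projected), 2))
print(round(path_wander_ratio(actual, projected), 3))
```

The hypothetical paths rise together from start to finish, so their correlation would be high, yet the RMSE shows the actual path sitting well away from the projection. That is why correlation alone is not enough to judge alignment.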

Hindsight is always 20/20, and it is easy to say now that you should have stuck with the system. But with a large sample of out-of-sample trades showing this degree of deterioration, if I had to choose between staying with it, abandoning it entirely, or temporarily shutting it down, I would probably step aside and wait for conditions to settle down. Remember, not trading is an algorithm too. Having the right tools to make this kind of decision is paramount, because they give you an evidence-based framework for explaining and defending your reasoning.

The Power of Incubation: Is Your Best System Collecting Dust?

How many systems do you have sitting on your shelf? I personally have at least a hundred, probably more. In this business, a little dust is not always a bad thing.

Figuratively speaking, the more “dust” a system has collected, the more real-time incubation it has endured. And that gives you something far more valuable than any backtest: pure, unadulterated out-of-sample evidence.

Seeds Waiting to Be Planted

Algorithmic trading development rarely produces just one system. It creates a trail of offshoots—versions that may have looked unremarkable or even mediocre during the initial build. Yet many of these forgotten systems are really just seeds waiting for the right environment.

When you revisit them months or even years later, you may find that a strategy which struggled in 2022 has bloomed into a powerhouse in 2026. Without a framework to measure that growth, you would never know.

Why You Need an Incubation Framework

Most traders revisit old systems by simply eyeballing an equity curve. An incubation framework goes much deeper by providing:

Historical Context: Does the dusty system’s recent performance still match its original DNA?

Regime Readiness: Has the market finally moved into the regime this offshoot was designed for?

The Go/No-Go Signal: A quantitative way to decide whether a seed is finally ready to be moved from the shelf to the server.

TS-SystemChecker software

TS-System Check Control Panel

A Tool Built for the Journey

I didn’t design TS-SystemChecker to be a black box or some kind of get-rich-quick shortcut. I built it because, after decades at Futures Truth and years of developing my own strategies, I wanted a better way to cut through the emotional fog that surrounds system evaluation.

Whether you are a retail trader focused on refining one core system, or a developer like me with a shelf full of offshoots, this framework was built for that journey.

For the specialist: If you have one system you live and die by, the deep-dive analysis helps define its boundaries of truth. You can begin to see whether a drawdown is simply part of the system’s normal character or evidence of something more structural.

For the portfolio manager: If you are tracking a library of ideas, the Batch Analysis feature helps you monitor many systems at once. Import the trade files, review the evidence, and identify which seeds may finally be ready to move from the shelf to a live account.

Looking Beneath the Surface

At the end of the day, this is why I built TS-SystemChecker. Traders need more than opinions, hope, or fear. They need a framework grounded in evidence.

That is the real purpose of this tool: not to make decisions for you, but to help you make better ones.

TS-SystemChecker Batch Mode
