Is Your System Broken?

The Ultimate Test of Efficacy: Time

At Futures Truth, we focused on one thing: walk-forward results across hundreds of trading systems. Back then, the industry was unrecognizable compared to today—if you can even call what exists now an “industry.”

Just this morning, I was reminiscing about vendors who sold strategies for upwards of $5,000. Those packages usually consisted of a set of rules (disclosed or not) and a rudimentary DOS-based program to generate simple charts and next-day orders. This was well before the heyday of TradeStation; those days are gone forever, for better or worse.

Back then, if a system performed well in walk-forward analysis, it earned a ranking. In those days, a Futures Truth ranking meant something. Sometimes, however, that ranking felt like a curse. Critics even argued you could “fade” a top-ranked system and make more money by taking the opposite trade. The reality is that many of those legendary “Top Ten” systems—built for a different era—simply wouldn’t survive in today’s markets.

Sidebar: Services such as Striker and The Collective still monitor trading systems. Striker shows only real execution, which is mostly a good thing because real execution costs are included. But so is broker error. Heck, we are all human and we all make mistakes. Striker is upfront about this and states they will do the best they can; execution can be a beast.

Speaking of execution – The Sunday Evening Gap

A Trader’s Nightmare. Most likely, trend followers were short during the huge gap here!

Imagine having a stop order in crude oil on a Sunday evening when the market opens thousands of dollars through your stop. If a system-derived position is short, the broker’s only directive is to GET OUT, no matter what, at the first off-ramp. The above chart shows the slippage on a single crude futures contract at the genesis of the Iran War.

The Million-Dollar Question

The second most common question I was asked at Futures Truth was: “How do I know if the system I’m trading is broken?”

Can you guess the most common one? “If it were your money, which system would you trade?”

We never answered that one directly; there was never a simple black-and-white answer. But the second question—the one about “broken” systems—usually surfaced when a trader was deep in a drawdown.

Seasoned traders know that systems ebb and flow; drawdown is simply the “tax” we pay to play the game. Others, however, get “married” to a system and stick with it until “death do us part.” My standard response back then was always:

“Is your current performance still within the boundaries of the backtest?”

In other words, has the system exceeded its historical maximum drawdown by a meaningful margin? Does the current real-time performance look like other “rough patches” in the historical equity curve? If the answer was yes, the trader would usually give it a little more room.
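The boundary question can be reduced to a simple comparison. The following is a minimal sketch, not the method used at Futures Truth: the function names, the sample equity values, and the `tolerance` cushion are all hypothetical, chosen only to illustrate "exceeded the historical maximum drawdown by a meaningful margin."

```python
def max_drawdown(equity):
    """Largest peak-to-valley decline in a cumulative equity series, in dollars."""
    peak = equity[0]
    worst = 0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def within_backtest_bounds(live_equity, backtest_max_dd, tolerance=1.25):
    """True if the live drawdown is still inside the backtest boundary.

    tolerance is a hypothetical cushion: 1.25 lets the live drawdown exceed
    the historical maximum by 25% before the system is flagged.
    """
    return max_drawdown(live_equity) <= backtest_max_dd * tolerance


# Made-up cumulative live equity; the worst decline is 2200 -> -800 = 3000
live = [0, 1500, 900, 2200, -800, 400]
print(max_drawdown(live))                   # 3000
print(within_backtest_bounds(live, 2500))   # True  (3000 <= 3125)
print(within_backtest_bounds(live, 2000))   # False (3000 > 2500)
```

How much cushion counts as "meaningful" is a judgment call; the point is only that the question has a yes-or-no answer once the backtest boundary is written down.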

Into Unexplored Waters

Those were simpler days, but the core problem remains: we cannot see the future. No forward analysis can tell you with certainty what a trading system will do next. What it can do is tell you when the system has moved into “unexplored waters.” Once you know that, the decision to stay, abandon, or pause becomes much easier.

All trading systems oscillate between success and failure. If a system has a genuine technical edge, that edge may eventually reassert itself—provided you have the time and capital to wait. But most traders don’t have unlimited resources.

Recently, the market activity surrounding the Iran conflict has pushed many systems into intense drawdowns. The same old questions have reared their heads again. This time, however, I wanted to provide a more empirically derived analysis. I’ve developed a disciplined approach to measuring risk and reward that moves beyond “gut feel.”

To illustrate, I pulled a system off my shelf that I originally designed for retail consumption back in June 2018. Here is how it’s currently holding up:

Hypothetical Results: Before and after development.

The Profit Mirage

This is a mean-reversion approach. I remember thinking back in 2018: How much longer can this bull market continue? With that in mind, I utilized a simple regime filter. After just a few weeks of trading later that year, I genuinely thought the system might be broken. The market spent a large portion of that fall and winter below its 200-day moving average, and shorting simply wasn’t working.

The system then went dormant for a long stretch, finally “waking up” at the height of the pandemic. Had you stuck with it, you would be up significantly today (though we stopped trading right at the onset of the pandemic).

But here is the catch: Profit, by itself, is not enough to determine if a system is broken. As long as they are making money, most traders never bother to peek below the surface. This system would likely be sitting near the top of the Futures Truth rankings today. But let’s dive in and see how it actually performed on its “test of time.”

Test 1 – Baseline Projection

Baseline projection based on in-sample average monthly return.

Is Outperformance Always a Good Thing?

Looking at the chart above, you see a steep deviation below the baseline initially, followed by a rocket-ship move to the upside. By late 2025, the actual equity is sitting way above the red dashed expectation line.

Most traders would see this and think they’ve struck gold. “Isn’t this what we want—a strong positive deviation?” they’d ask.

In a world of simple “bottom-line” thinking, the answer is yes. But in the world of professional algorithmic trading, this chart is screaming a different story. To a seasoned developer, deviation is deviation. Whether it’s to the upside or downside, moving this far away from the “expected” path suggests that the strategy’s original statistical model is no longer in control.

When a system starts generating 190% of its expected return, it’s often because it has inadvertently stepped into a high-volatility regime it wasn’t designed to navigate. If the “upside” is this aggressive, you can bet the “downside” risk has scaled right along with it.
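A baseline of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the code behind the chart: it assumes the baseline is a straight line that grows by the in-sample average monthly profit, and the monthly figures below are made up.

```python
def baseline_projection(in_sample_monthly_pnl, months_forward):
    """Project cumulative equity forward at the in-sample average monthly P&L."""
    avg = sum(in_sample_monthly_pnl) / len(in_sample_monthly_pnl)
    return [avg * m for m in range(1, months_forward + 1)]


def return_attainment(actual_equity, projected_equity):
    """Actual cumulative gain divided by the projected gain on the same date."""
    return actual_equity[-1] / projected_equity[-1]


in_sample = [400, -200, 600, 300, -100, 500]     # made-up monthly P&L, average 250
projection = baseline_projection(in_sample, 4)   # [250.0, 500.0, 750.0, 1000.0]
actual = [-300, -150, 900, 1900]                 # made-up forward equity
print(return_attainment(actual, projection))     # 1.9
```

In this toy series the equity first falls below the line and then finishes at 190% of it, the same shape of deviation described above.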

Test 2 – Monte Carlo Analysis on Walk Forward

Return and Drawdown sitting on the tails.

The Statistical Reality Check

To understand why I called this system “Degraded” despite the profits, we have to look at the Monte Carlo Walk Forward distributions. This is where we compare real-time performance against thousands of simulated “alternate realities” based on the system’s history.

1. The Equity Distribution (The Good News… or is it?)

In the top chart, our actual forward equity of $67,575 (the dashed red line) sits at the 97th percentile.

  • Interpretation: Out of 1,000 possible outcomes, the system performed better than 970 of them. While this looks great, being this far out on the “tail” of the distribution suggests we are no longer operating in a normal environment.

2. The Drawdown Distribution (The Warning)

The bottom chart is the real story. The actual worst drawdown reached $18,588, placing it in the 98th–99th percentile of severity.

  • The Comparison: The median expected drawdown was only $8,125.

  • The Verdict: We have blown past the 90th percentile of $13,611 and are deep into the “danger zone.”
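For readers who want to see how such percentiles can be produced, here is a minimal sketch. It assumes the "alternate realities" are built by resampling historical trade results with replacement, which is one common approach and not necessarily the one used for these charts. The trade list, the run count, and the live values passed to `percentile_rank` are hypothetical.

```python
import random


def percentile_rank(samples, value):
    """Percent of simulated outcomes less than or equal to value."""
    return 100.0 * sum(1 for s in samples if s <= value) / len(samples)


def monte_carlo(trade_pnl, n_trades, n_runs=1000, seed=42):
    """Resample trade P&L with replacement.

    Returns two lists of length n_runs: final equity and worst drawdown.
    """
    rng = random.Random(seed)
    finals, drawdowns = [], []
    for _ in range(n_runs):
        equity, peak, worst = 0.0, 0.0, 0.0
        for _ in range(n_trades):
            equity += rng.choice(trade_pnl)
            peak = max(peak, equity)
            worst = max(worst, peak - equity)
        finals.append(equity)
        drawdowns.append(worst)
    return finals, drawdowns


history = [350, -200, 500, -450, 800, -150, 275, -300, 600, -100]  # made-up trades
finals, drawdowns = monte_carlo(history, n_trades=60)

# Where would a hypothetical live result fall among the simulated outcomes?
print(percentile_rank(finals, 12000))
print(percentile_rank(drawdowns, 2500))
```

A live drawdown that ranks near the top of the simulated drawdown list is the situation flagged in the bottom chart.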

The Bottom Line

When a system hits the 97th percentile for gains but simultaneously hits the 99th percentile for drawdown stress, the math is telling you that the character of the strategy has changed. You aren’t just trading a system in a “rough patch”; you are trading a system that has moved into a risk regime it was never built to survive. This is the disciplined, evidence-based approach I promised earlier. Without these charts, you’re just guessing. With them, you have the data to justify stepping aside.

The Verdict: Test 3 – Overall Assessment

This is where the “gut feel” ends and empirical analysis begins. The following commentary is generated by my new software, a quasi-expert system designed to pair raw statistical results with descriptive, actionable interpretation.

Status: DEGRADED

Risk Assessment: CRITICAL RISK

Incubation/Trading Readiness Score: 4 / 8

Return delivery is running ahead of baseline: expected annual return is 18% versus actual annual return of 34% and expected annual gain of $4,477 compares with actual annual gain of $8,536. However, that stronger return delivery has come with a less stable path and/or materially heavier risk than history would suggest. Risk is materially worse than the historical profile: actual worst drawdown of $18,588 is 2.670 times the historical drawdown of $6,962. Risk conditions are now in the Critical Risk range. The realized monthly path is poorly aligned with the baseline, based on monthly equity correlation of 0.960, projection RMSE of $21,789, normalized RMSE of 4.867, and path wander ratio of 0.621. Correlation still describes directional similarity, but RMSE and path wander ratio show how far the realized path has wandered from projection over the same window. Monte Carlo context is cautionary: actual forward equity is $67,575, gain percentile is 97% (near the top of the simulated distribution), and drawdown percentile is 99% (in the worst decile for drawdown stress). Taken together, the system shows meaningful deterioration in incubation.

A readiness reading of 4 out of 8 indicates that the system has degraded to the point where caution, and I mean extreme caution, should be used in deciding whether to trade this strategy. The major factors influencing the rating are:

  • Return Attainment: 1.907. This indicates the system has achieved 190.7% of its historically expected return during the incubation/trading period. In practical terms, the strategy is generating returns at nearly twice the pace implied by its historical baseline.
  • Path Wander Ratio: 0.621. This indicates that the actual equity path is deviating from the projected path by a meaningful amount relative to the total expected move over the incubation/trading period. Put simply, the system is not wildly off course, but it is no longer tracking the historical baseline tightly.
  • Drawdown Stress: 2.67. This indicates that the actual worst drawdown has expanded to roughly 2.7 times the level suggested by the system’s historical baseline. Put simply, the strategy may still be generating return, but it is doing so while absorbing far more pain than its historical profile would justify.
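Two of these ratios can be checked by hand from the dollar figures quoted in the report. The short calculation below uses only those published numbers; the variable names are mine. The third ratio is an observation, not a documented definition: dividing the projection RMSE by the expected annual gain happens to reproduce the reported normalized RMSE. The path wander ratio is not reproduced here because the report does not state the total expected move.

```python
# Figures quoted in the report
expected_annual_gain = 4477
actual_annual_gain = 8536
historical_max_dd = 6962
actual_max_dd = 18588
projection_rmse = 21789

return_attainment = actual_annual_gain / expected_annual_gain
drawdown_stress = actual_max_dd / historical_max_dd

# Assumption: normalized RMSE = RMSE / expected annual gain (matches the report)
rmse_over_expected_gain = projection_rmse / expected_annual_gain

print(round(return_attainment, 3))        # 1.907
print(round(drawdown_stress, 2))          # 2.67
print(round(rmse_over_expected_gain, 3))  # 4.867
```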

Hindsight is always 20/20, and it is easy to say now that you should have stuck with the system. But with a large sample of out-of-sample trades showing this degree of deterioration, if I had to choose between staying with it, abandoning it entirely, or temporarily shutting it down, I would probably step aside and wait for conditions to settle down. Remember, not trading is an algorithm too.   Having the right tools to make this kind of decision is paramount, because they give you an evidence-based framework for explaining and defending your reasoning.

The Power of Incubation: Is Your Best System Collecting Dust?

How many systems do you have sitting on your shelf? I personally have at least a hundred, probably more. In this business, a little dust is not always a bad thing.

Figuratively speaking, the more “dust” a system has collected, the more real-time incubation it has endured. And that gives you something far more valuable than any backtest: pure, unadulterated out-of-sample evidence.

Seeds Waiting to Be Planted

Algorithmic trading development rarely produces just one system. It creates a trail of offshoots—versions that may have looked unremarkable or even mediocre during the initial build. Yet many of these forgotten systems are really just seeds waiting for the right environment.

When you revisit them months or even years later, you may find that a strategy which struggled in 2022 has bloomed into a powerhouse in 2026. Without a framework to measure that growth, you would never know.

Why You Need an Incubation Framework

Most traders revisit old systems by simply eyeballing an equity curve. An incubation framework goes much deeper by providing:

Historical Context: Does the dusty system’s recent performance still match its original DNA?

Regime Readiness: Has the market finally moved into the regime this offshoot was designed for?

The Go/No-Go Signal: A quantitative way to decide whether a seed is finally ready to be moved from the shelf to the server.

TS-SystemChecker software

TS-System Check Control Panel

A Tool Built for the Journey

I didn’t design TS-SystemChecker to be a black box or some kind of get-rich-quick shortcut. I built it because, after decades at Futures Truth and years of developing my own strategies, I wanted a better way to cut through the emotional fog that surrounds system evaluation.

Whether you are a retail trader focused on refining one core system, or a developer like me with a shelf full of offshoots, this framework was built for that journey.

For the specialist: If you have one system you live and die by, the deep-dive analysis helps define its boundaries of truth. You can begin to see whether a drawdown is simply part of the system’s normal character or evidence of something more structural.

For the portfolio manager: If you are tracking a library of ideas, the Batch Analysis feature helps you monitor many systems at once. Import the trade files, review the evidence, and identify which seeds may finally be ready to move from the shelf to a live account.

Looking Beneath the Surface

At the end of the day, this is why I built TS-SystemChecker. Traders need more than opinions, hope, or fear. They need a framework grounded in evidence.

That is the real purpose of this tool: not to make decisions for you, but to help you make better ones.

TS-SystemChecker Batch Mode

Generated Code Is Not the Same as Engineered Code

AI can write structure, but experienced programmers still supply the craft

The more we rely on generated code, the more disciplined we must become in questioning it.

AI and modern frameworks now provide valuable insights that, just a few years ago, would have required significant time and effort to obtain. However, while they offer tremendous macro-level leverage, they can also introduce subtle assumptions that lead to impossible scenarios and misleading downstream analysis. This is especially true in environments designed for rapid idea testing, where convenience can come at the expense of deeper, microscopic introspection.

For example, in my PatternSmasher framework, I use constructs like BarsSinceEntry to control trade duration and evaluate pattern efficacy. This makes it very easy to test thousands of ideas quickly. But that convenience comes with a responsibility. If you rely on these abstractions without thinking through the details, you can end up with behavior that looks perfectly valid in code but could never occur in the real world.

I have seen this problem in other frameworks and in AI-generated code as well. This is why it is so important to continue to hone your craft and take a deep dive into the results produced by generated code. In the quant world, the first step is to study the trades and isolate problems such as what I call simultaneous same-direction exit and reentry. Once you see it, the job is to fix it without changing the intent of the algorithm.

Let me show you exactly what I mean. The logic behind this example looks perfectly fine on the surface. But when you dig into the trades, you see the problem immediately. In this chart, the system exits a long position and then turns right around and buys again at the same time and price. That is not a reversal. It is a same-direction exit and reentry that simply cannot happen in the real world, and it pollutes the backtest with trades that should not exist.

Example of simultaneous same-direction exit and reentry at the same time and price—an impossible trade sequence that distorts backtest results – 640 minute bar on Gold – why 640?

We entered a long position, the trade expired, immediately re-entered on the next setup, that trade expired as well, and then entered again—only to get stopped out. That’s three round turns, each incurring commission and slippage.
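Studying the trades can be partly automated once the trade list is exported. The sketch below is one hypothetical way to flag the pattern; the `Trade` fields, the timestamps, and the prices are made up and do not correspond to any particular export format.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    direction: int       # +1 long, -1 short
    entry_time: str
    entry_price: float
    exit_time: str
    exit_price: float


def find_same_direction_reentries(trades):
    """Return index pairs (i, i + 1) where trade i exits and trade i + 1
    enters in the same direction at the identical time and price."""
    flagged = []
    for i in range(len(trades) - 1):
        a, b = trades[i], trades[i + 1]
        if (a.direction == b.direction
                and a.exit_time == b.entry_time
                and a.exit_price == b.entry_price):
            flagged.append((i, i + 1))
    return flagged


# Made-up trade list: two expirations followed by instant long reentries
trades = [
    Trade(1, "d1 0640", 2000.0, "d4 0640", 2010.0),
    Trade(1, "d4 0640", 2010.0, "d7 0640", 2005.0),
    Trade(1, "d7 0640", 2005.0, "d8 1720", 1975.0),
    Trade(-1, "d9 0640", 1980.0, "d12 0640", 1970.0),
]
print(find_same_direction_reentries(trades))   # [(0, 1), (1, 2)]
```

A reversal (long exit, short entry at the same price) is deliberately not flagged, because that sequence can occur in the real world.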

// Simple code that is of course mean reversion.
// However, since we seem to be in this regime
// let's hone our craft to make this work as intended.

input: movAvgLen(50), consCloses(1), exitAfterNBars(5), stopLoss(3000);

// Count down closes and up closes over the last consCloses bars
value1 = countIF(c < c[1], consCloses);
value2 = countIF(c > c[1], consCloses);

// Buy a pullback above the moving average; short a rally below it
if value1 = consCloses and close > average(c, movAvgLen) then
    buy ("lentry") next bar at open;
if value2 = consCloses and close < average(c, movAvgLen) then
    sellShort ("sentry") next bar at open;

// Time-based exit once the trade has expired
if barsSinceEntry > exitAfterNBars then
begin
    sell ("lx-exp") next bar at open;
    buyToCover ("sx-exp") next bar at open;
end;

setStopLoss(stopLoss);
Simple entry with expiration exit

The code that produced this looks pretty clean. You have your entry logic, a BarsSinceEntry exit, and a stop loss. On the surface, everything seems fine.

But you don’t find this kind of problem by staring at the code. You find it by looking at the trades.  This is the one thing AI or a framework doesn’t examine.  At first, the natural reaction is to slap a MarketPosition “gate” on the entry logic. The word “gate” may be a dead giveaway that AI has influenced the discussion. But I like it. It has been around since the early days of electrical circuits, and it is very appropriate here.  I’ve noticed that many of the words AI uses have started to creep into my own vocabulary. Funny how that happens.

Fix #1

input: movAvgLen(50), consCloses(1), exitAfterNBars(5), stopLoss(3000);

value1 = countIF(c < c[1], consCloses);
value2 = countIF(c > c[1], consCloses);

// MarketPosition gate: only buy when not already long
if marketPosition <> 1 and
    value1 = consCloses and close > average(c, movAvgLen) then
    buy ("lentry") next bar at open;

// MarketPosition gate: only sell short when not already short
if marketPosition <> -1 and
    value2 = consCloses and close < average(c, movAvgLen) then
    sellShort ("sentry") next bar at open;

// Time-based exit once the trade has expired
if barsSinceEntry > exitAfterNBars then
begin
    sell ("lx-exp") next bar at open;
    buyToCover ("sx-exp") next bar at open;
end;

setStopLoss(stopLoss);
Fix #1 - solves the simultaneous exit and re-entry same direction glitch

So what does that MarketPosition “gate” actually do?

It fixes the symptom. The same-bar exit and reentry disappears, and the trades look cleaner.

But it also changes the algorithm in a much deeper way.

In the original design, a new long signal while already long reaffirmed the position and should have kept the trade alive. The gate removes that behavior. Now the strategy must exit first and then wait until the next bar to reenter.

And that delay matters.

By the time the next bar arrives, the setup may be gone. What should have been one continuous trade is now split into pieces—or missed entirely.

You didn’t just clean up the trades. You changed which trades exist. The new strategy may or may not be more efficient, but just know the algorithm is now different.

Fix #2

We bought, suppressed the expiration exit twice because a new buy setup appeared each time, and were ultimately stopped out at the level where the stop loss of the final simulated trade would have been triggered. That trade never actually occurred, but we were monitoring its properties.

Could most quants who aren’t programmers solve this riddle? Probably not. My 40 years of programming experience certainly played a role, and my familiarity with EasyLanguage—especially its limitations—helped guide me down the right path. But more importantly, I was able to recognize the nature of the problem, apply targeted fixes, and then analyze the resulting trades. I repeated this process—wash, rinse, repeat—until the issue was resolved.

Much of the knowledge I relied on has been documented by myself and others over the years. Investing time in books, videos, and webcasts specific to your programming language remains essential—it forms the foundation. But ultimately, refining your own skills and developing your craft is a time-consuming process that pays lasting dividends.

Groundwork for the Fix

Solving what initially appears to be a simple riddle requires recognizing several underlying behaviors. I was able to correct the issue because I could anticipate when a new trade was about to occur. When both the exit gate for an existing position and the entry gate for a new position in the same direction were simultaneously open, I prevented the transition by closing both gates.

However, simply blocking the transition was not enough. I had to simulate the trade that would have occurred. This meant marking the hypothetical entry price, resetting the stop-loss based on that price, and reinitializing my own bars-in-trade counter.

At this point, I could no longer rely on EasyLanguage’s built-in functions such as BarsSinceEntry or SetStopLoss. Those functions assume an actual executed trade and therefore could not reflect the internal state I needed to maintain. To solve the problem correctly, I had to take full control of trade state management and explicitly track these values myself.

input: movAvgLen(50), consCloses(1), exitAfterNBars(5), stopLoss(3000);

vars: mp(0), barsMult(1), barsInTrade(0), lStopLevel(0), sStopLevel(0), closedTrades(0);
vars: canGoLong(False), canGoShort(False);

// Setups are evaluated regardless of the current position
canGoLong = countIF(c < c[1], consCloses) = consCloses and close > average(c, movAvgLen);
canGoShort = countIF(c > c[1], consCloses) = consCloses and close < average(c, movAvgLen);

mp = marketPosition;

// Exit Technology

closedTrades = totalTrades;

// Long exit setup on the bar after entry: restart the counter and
// set the stop from this bar's open, which is the entry price
if (mp[1] <> mp and mp = 1) or (closedTrades > closedTrades[1]) then
begin
    barsInTrade = 0;
    lStopLevel = open[0] - stopLoss / bigPointValue;
end;

// Short exit setup on the bar after entry
if (mp[1] <> mp and mp = -1) or (closedTrades > closedTrades[1]) then
begin
    barsInTrade = 0;
    sStopLevel = open[0] + stopLoss / bigPointValue;
end;

// Long reentry: the trade has expired but a new long setup exists, so
// keep the position and reset the stop from the hypothetical entry price
if mp = 1 and canGoLong and barsInTrade > exitAfterNBars then
begin
    lStopLevel = open of tomorrow - stopLoss / bigPointValue;
    // print(d," ",t," should exit and reenter long tomorrow ",barsInTrade," ",barsSinceEntry," ",open of tomorrow);
    barsInTrade = -1;   // becomes 0 after the increment at the bottom
end;

// Short reentry: same treatment for the short side
if mp = -1 and canGoShort and barsInTrade > exitAfterNBars then
begin
    sStopLevel = open of tomorrow + stopLoss / bigPointValue;
    // print(d," ",t," should exit and reenter short tomorrow ",barsInTrade," ",barsSinceEntry," ",open of tomorrow);
    barsInTrade = -1;   // becomes 0 after the increment at the bottom
end;

// Self-managed stop-loss orders
if mp = 1 then
    sell ("lx-stopLoss") next bar at lStopLevel stop;

if mp = -1 then
    buyToCover ("sx-stopLoss") next bar at sStopLevel stop;

// Entry Logic
if canGoLong then
    buy ("lentry") next bar at open;
if canGoShort then
    sellShort ("sentry") next bar at open;

// Bars in trade expiration exit
if barsInTrade > exitAfterNBars then
begin
    sell ("lx-exp") next bar at open;
    buyToCover ("sx-exp") next bar at open;
end;

// Day of entry protection
setStopLoss(stopLoss);

// Increment barsInTrade - mimic TradeStation here too!
if mp <> 0 then barsInTrade = barsInTrade + 1;
Fix #2 - difficult initially but reusable

This version fixes the problem by taking control of the trade state instead of relying on EasyLanguage’s built-in functions.

First, I define whether I can go long or short, independent of my current position. Then I track my own state variables—market position, bars in trade, stop levels, and trade count—so I know exactly what the system is doing at all times.

The key occurs when a same-direction signal appears after the trade has technically expired. Instead of allowing an exit and immediate reentry, I suppress both actions and simulate the renewed trade. I mark the hypothetical entry price, reset the stop based on that level, and restart my bars-in-trade counter.

Because of this, I can no longer rely on BarsSinceEntry or SetStopLoss—they depend on actual trades. I manage everything explicitly.

The result is a continuous position that preserves the original intent of the algorithm without introducing impossible trades into the backtest.
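The effect on trade count can be seen in a toy model. The Python sketch below is not a translation of the EasyLanguage above: it is long-only, ignores stops and prices, and uses a made-up signal sequence. It only contrasts "exit and reenter on the same open" with "keep the position and reset the counter."

```python
def count_round_turns(signals, hold_bars, renew):
    """Count completed long trades in a toy bar loop.

    signals[i] is True when a long setup appears on bar i. A trade expires
    once it has been open for more than hold_bars bars. With renew=False an
    expiring trade is closed and, if a setup is present, reopened at once
    (the impossible sequence). With renew=True the position is kept and only
    the bar counter is reset.
    """
    in_trade = False
    bars_in_trade = 0
    round_turns = 0
    for setup in signals:
        if in_trade:
            bars_in_trade += 1
            if bars_in_trade > hold_bars:
                if setup and renew:
                    bars_in_trade = 0     # simulate the renewed entry
                else:
                    round_turns += 1      # expiration exit
                    in_trade = setup      # naive mode reenters immediately
                    bars_in_trade = 0
        elif setup:
            in_trade = True
            bars_in_trade = 0
    # Count a position still open at the end as one more round turn
    return round_turns + (1 if in_trade else 0)


signals = [True, False, False, True, False, False, True, False, False, False]
print(count_round_turns(signals, hold_bars=2, renew=False))   # 3
print(count_round_turns(signals, hold_bars=2, renew=True))    # 1
```

Three round turns collapse into one continuous position, which is why the trade count and every statistic tied to it change between the two versions.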

EasyLanguage also has its share of esoteric nuances. Code order can matter in some places and not in others, particularly with order execution. Even detecting position changes requires a bit of finesse. These details matter, but they are beyond the scope of this discussion.

This is where the difference between generated code and engineered code becomes clear.

A programmer who is not willing to put in the work—and instead relies on AI to solve the problem—will likely stop at the first acceptable fix. The code will run, the trades will look cleaner, and the issue will appear resolved. But the deeper problem remains: the structure has changed, trades may be missing, and the original intent of the algorithm has been compromised.

As we become more dependent on code generation through AI and frameworks, it becomes even more important to validate that the output is reasonable and reflects something that could occur in the real world. That responsibility does not go away—it increases. And it requires us to continue honing our craft.

AI can generate code and even suggest reasonable fixes, but it does not truly understand the nuances of the language, the sequencing of events, or the intent behind the strategy. It cannot look at a trade and say, “that shouldn’t have happened.” It does not debug by questioning reality—it follows patterns.

Arriving at the correct solution required recognizing the problem, iterating through possible fixes, examining the trades, and refining the logic until the behavior matched the intent. That process—wash, rinse, repeat—is the craft.

Generated code can get you started. Engineered code is what gets you to the truth.  Take a look at the two following reports.  Similar results, but look at the number of trades and those statistics tied to this number.