Category Archives: TS-PortfolioMerge

Monte Carlo: Garbage or Gold?

A quick history of Monte Carlo (the short version)

 

Monte Carlo tool link (read the blog first): Monte Carlo Trade Flight Simulator · Streamlit

YouTube video on the Monte Carlo tool: Youtube Monte Carlo Tool Video

Monte Carlo methods took off in the 1940s during wartime research at Los Alamos, when scientists needed a practical way to estimate outcomes for complex systems that couldn’t be solved with a single clean equation. Trading has the same problem: there’s no tidy formula that can tell you the order your wins and losses will arrive in—and that order is where luck lives.

So we do the next best thing: we add randomness on purpose. Monte Carlo repeatedly reshuffles the same trade outcomes to create many plausible equity paths, revealing how smooth (or brutal) the ride can be even when the system’s edge stays the same.

“No matter how sophisticated our choices, how good we are at dominating the odds, randomness will have the last word.” — Nassim Nicholas Taleb, Fooled by Randomness

What Monte Carlo does in trading

In trading system analysis, the “Monte Carlo move” is straightforward: you take your historical list of trade results (P/L per trade), then you randomly reshuffle (or resample) that list thousands of times. Each reshuffle produces a new, plausible equity curve built from the same underlying trade outcomes. From there, you measure the things that actually determine whether a system is tradeable—how deep the drawdowns get, how long the slumps can last, and how often the path gets ugly enough to force you to quit or cut size.

This doesn’t predict the future. It answers a different (and more practical) question:

Given the trade outcomes you’ve already seen, how good—or how bad—can the ride get due to sequencing alone?
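As a minimal sketch of that reshuffle (not the simulator's actual code), the Python below shuffles a hypothetical trade list 1,000 times and measures the maximum drawdown of each path. The trade values, the seed, and the function names are assumptions for illustration. Note one detail: a pure shuffle reuses every trade exactly once, so every path ends at the same equity and only the ride changes. Resampling with replacement is what lets the ending equity vary from run to run.

```python
import random

def max_drawdown_pct(equity_path):
    """Largest peak-to-valley decline along a path, as a fraction of the peak."""
    peak = equity_path[0]
    worst = 0.0
    for equity in equity_path:
        peak = max(peak, equity)
        if peak > 0:
            worst = max(worst, (peak - equity) / peak)
    return worst

def shuffled_equity_path(trade_pnl, start_equity, rng):
    """Build one equity curve from a random reordering of the same trades."""
    order = list(trade_pnl)
    rng.shuffle(order)
    path = [start_equity]
    for pnl in order:
        path.append(path[-1] + pnl)
    return path

# Hypothetical P/L per trade in dollars (not taken from either system in this post).
trades = [450, -300, 1200, -650, 220, -400, 900, -150, -700, 1100]
rng = random.Random(42)  # fixed seed so the run is repeatable
paths = [shuffled_equity_path(trades, 25_000, rng) for _ in range(1000)]

drawdowns = sorted(max_drawdown_pct(p) for p in paths)
print(f"median max drawdown: {drawdowns[len(drawdowns) // 2]:.1%}")
print(f"worst max drawdown:  {drawdowns[-1]:.1%}")
# Same trades, different order: every path finishes at the same equity.
print({p[-1] for p in paths})
```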

That’s the value, or is it?

Why traders debate the value of Monte Carlo

Why some traders love it

  • It exposes fragility that a single backtest can hide.
    A backtest is one historical path—one specific order of wins and losses. Monte Carlo reshuffles that order to show other plausible paths. If a strategy only “works” when winners show up early, Monte Carlo will expose that quickly.

  • It turns vague fear into a measurable risk.
    Traders feel risk but struggle to quantify it. Monte Carlo lets you define a failure line (for example, “equity falls below 60% of starting capital”) and estimate how often that happens across thousands of simulated lives. You may still trade it—but now you’re choosing with a probability, not a gut feeling.

  • It helps you size the system rationally.
    Most blow-ups aren’t caused by a bad system—they’re caused by a decent system traded too big. By running the same trades under different starting capital (or leverage), Monte Carlo shows where the strategy becomes survivable. It often reveals a capital/size “threshold” where ruin risk drops and drawdowns become tolerable.
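The failure line and the sizing idea can be sketched together. The Python below is an illustration under stated assumptions, not the simulator's implementation: it resamples a hypothetical trade list with replacement, counts a life as ruined if equity ever drops below 60% of the starting value, and repeats the test at several starting equities. The function name, trade values, run count, and seed are all made up for the example.

```python
import random

def risk_of_ruin(trade_pnl, start_equity, ruin_fraction=0.60, runs=2000, seed=7):
    """Share of simulated lives whose equity ever falls below the ruin line."""
    rng = random.Random(seed)
    ruin_line = start_equity * ruin_fraction
    ruined = 0
    for _ in range(runs):
        equity = start_equity
        # Resample with replacement: each life draws len(trade_pnl) trades.
        for pnl in rng.choices(trade_pnl, k=len(trade_pnl)):
            equity += pnl
            if equity < ruin_line:
                ruined += 1
                break
    return ruined / runs

# Hypothetical 1-contract trade list in dollars (illustration only).
trades = [1800, -1200, 2500, -1500, -900, 3100, -2000, 700, -1100, 1600] * 10

for start in (25_000, 50_000, 75_000):
    print(f"${start:,}: risk of ruin {risk_of_ruin(trades, start):.1%}")
```

Because the trade size is fixed at one contract, the dollar swings are the same at every starting equity. More capital simply moves the ruin line further away, which is why the ruin percentage can only fall as start equity rises.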

Why some traders hate it

  • It assumes the future behaves like the past.
    Monte Carlo can’t detect regime change. If your edge only works in certain “market moods” (trending vs choppy, low-vol vs high-vol), the simulation may look great right up until the market stops playing that game.

  • It assumes trades can be shuffled like a deck of cards.
    Many Monte Carlo runs treat each trade as an independent draw from the same bag of outcomes. Real systems aren’t that clean—markets come in streaks and clusters (volatility spikes, choppy stretches, correlation breaks), and those dependencies don’t always survive a simple reshuffle. Monte Carlo still helps measure sequence risk, but it isn’t a full market simulator.

  • It can punish good systems—and flatter lucky ones.
    A solid system can look worse if its history includes a few rare “tail” events—Monte Carlo will replay those tails in many sequences. Meanwhile, a strategy that enjoyed an unusually favorable historical run can look sturdier than it deserves, because the simulation is only as honest as the sample you feed it.
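One common way to soften the "deck of cards" objection is a block bootstrap: instead of drawing single trades, draw short runs of consecutive trades so that some of the original streakiness survives. This is a different technique from the plain reshuffle described above, and nothing in this post says the simulator uses it. The sketch below is a hedged illustration with hypothetical trade values and an arbitrary block size of 4.

```python
import random

def block_resample(trade_pnl, block_size, rng):
    """Resample contiguous blocks (with replacement) up to the original length."""
    n = len(trade_pnl)
    if not 1 <= block_size <= n:
        raise ValueError("block_size must be between 1 and len(trade_pnl)")
    out = []
    while len(out) < n:
        start = rng.randrange(0, n - block_size + 1)
        out.extend(trade_pnl[start:start + block_size])
    return out[:n]  # trim the final block if it overshoots

# Hypothetical P/L per trade in dollars, in historical order.
trades = [450, -300, 1200, -650, 220, -400, 900, -150, -700, 1100, 380, -520]
rng = random.Random(1)
sample = block_resample(trades, 4, rng)
print(sample)
```

Larger blocks keep more of the historical clustering but produce less variety between runs, so the block size is a judgment call.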

So, it’s not garbage… but it’s not gold automatically either.

Monte Carlo is a tool. Like any tool, it can be used well or used blindly.

The setup: how I ran these simulations


For these tests, I used my Streamlit-based Monte Carlo “Trade Flight Simulator.” You paste a column of trade P/L and the simulator generates:

  • Risk of Ruin based on a user-defined ruin line
  • Median drawdown across thousands of randomized equity paths
  • Worst-case outcomes (1st percentile)
  • Distribution visuals (“broom chart” equity fan + destination histogram)
  • Scaling table across start equity levels

Key settings used here

  • Ruin threshold: 60% of starting equity
  • Position size: 1 contract per trade
  • Execution costs: $40 per trade included in the trade list results you pasted
  • Horizon: number of trades pasted (the simulator runs “N trades” each life)
  • Optional CAGR is computed from first/last trade dates when provided
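The CAGR setting explains a pattern in the tables that follow: annual return falls as start equity rises. With one contract per trade, the dollar profit does not change, so a larger starting balance means a smaller percentage gain. The sketch below uses the standard CAGR formula. The simulator's exact calculation is not shown in this post, and the dates and the $100,000 net profit are hypothetical.

```python
from datetime import date

def cagr(start_equity, end_equity, first_trade, last_trade):
    """Compound annual growth rate between two dates (365.25-day years)."""
    years = (last_trade - first_trade).days / 365.25
    if years <= 0 or start_equity <= 0 or end_equity <= 0:
        raise ValueError("need a positive time span and positive equity values")
    return (end_equity / start_equity) ** (1 / years) - 1

first, last = date(2005, 1, 1), date(2025, 1, 1)
net_profit = 100_000  # hypothetical: same dollar profit at every start equity

for start in (25_000, 50_000, 87_500):
    rate = cagr(start, start + net_profit, first, last)
    print(f"${start:,} start: CAGR {rate:.1%}")
```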

System #1: Mean Reversion on the Mini Nasdaq (MNQ)

~$40 execution costs, ~20 years

Start Equity   Risk of Ruin   Median DD   Annual Return   Worst Case (1st percentile)
$25,000        30%            42.9%       12.9%           $85,019
$31,250        14%            36.2%       11.7%           $80,747
$37,500        8%             32.2%       10.8%           $93,072
$43,750        4%             30.1%       10.0%           $78,308
$50,000        1%             26.4%       9.6%            $81,352
$56,250        1%             25.2%       8.8%            $93,765
$62,500        0%             24.4%       8.4%            $77,907
$68,750        0%             23.0%       8.1%            $81,076
$75,000        0%             20.6%       7.7%            $81,108
$81,250        0%             19.1%       7.4%            $89,028
$87,500        0%             18.9%       7.0%            $81,982

System #1 — Mean Reversion (MNQ)

  • At $25,000 start equity
    • Risk of Ruin: 30%
    • Median Drawdown: 42.9%
    • Annual Return: 12.9%
    • Worst Case (1%): +$85,019
    • Prob > 0: 99.9%
  • At $50,000 start equity (“still not comfortable”)
    • Risk of Ruin: 1%
    • Median Drawdown: 26.4%
    • Annual Return: 9.6%
  • At $62,500 start equity (“stability zone”)
    • Risk of Ruin: 0%
    • Median Drawdown: 24.4%
    • Annual Return: 8.4%

What Monte Carlo reveals about this system

This is what a tradeable but under-capitalized system looks like.

The edge is real (the probability of finishing positive is essentially 100%), but the sequence risk at low starting equity is still brutal:

  • A 30% risk of ruin at $25k (with a 60% ruin line) is not a rounding error.

  • Even at $31,250, ruin risk is still 14%.

  • The system doesn’t start to feel “professional” until you get into the $60k+ range, where ruin drops to 0% and median drawdowns settle into the mid-20% area.

Monte Carlo’s message:
If you want this system to behave like something you can actually stick with, you don’t optimize parameters — you capitalize it properly.

System #2: Trend Following on Crude Oil

~$40 execution costs, ~20 years

System #2 — Trend Following (Crude)

Start Equity   Risk of Ruin   Median DD   Annual Return   Worst Case (1st percentile)
$25,000        68%            89.7%       8.7%            -$99,928
$31,250        58%            80.7%       7.8%            -$95,663
$37,500        47%            71.3%       7.1%            -$100,991
$43,750        40%            64.1%       6.5%            -$111,564
$50,000        31%            58.3%       6.2%            -$98,846
$56,250        28%            57.0%       5.5%            -$94,594
$62,500        22%            51.1%       5.3%            -$100,349
$68,750        19%            49.9%       4.8%            -$103,016
$75,000        16%            47.2%       4.5%            -$112,968
$81,250        12%            42.3%       4.5%            -$98,345
$87,500        11%            42.1%       4.0%            -$98,228

  • At $25,000 start equity
    • Risk of Ruin: 68%
    • Median Drawdown: 89.7%
    • Annual Return: 8.7%
    • Worst Case (1%): -$99,928
    • Prob > 0: 87.3%
  • At $50,000 start equity
    • Risk of Ruin: 31%
    • Median Drawdown: 58.3%
    • Annual Return: 6.2%
    • Worst Case (1%): -$98,846
    • Prob > 0: 90.1%
  • At $81,250 start equity
    • Risk of Ruin: 12%
    • Median Drawdown: 42.3%
    • Annual Return: 4.5%
    • Worst Case (1%): -$98,345
    • Prob > 0: 86.4%
  • At $87,500 start equity
    • Risk of Ruin: 11%
    • Median Drawdown: 42.1%
    • Annual Return: 4.0%
    • Worst Case (1%): -$98,228
    • Prob > 0: 87.1%

What Monte Carlo reveals about this system

This is a classic crude trend-following signature: the system can be profitable over time, but the path can be violently unforgiving—especially when under-capitalized.

  • The probability of finishing positive is only about 86% to 90%, not “near-certain.”
  • At $25,000, the system is living on the edge: 68% risk of ruin with an 89.7% median drawdown.
  • Even after you scale up, the ride is still rough. At $81,250, ruin risk is still 12% with a 42.3% median drawdown.

The most important tell is the left tail: the 1% worst-case outcome is negative at every starting equity tested (roughly -$94,594 to -$112,968). That means there are plausible sequences where the system not only suffers deep drawdowns but also ends the run down money, even with larger starting capital.

Monte Carlo’s message: This isn’t just a “start with more money” situation. Increasing capital helps, but the strategy’s tail risk remains severe. If you trade this, you need materially more capitalization, smaller sizing, or a risk overlay—because crude can deliver adverse sequences that this system does not comfortably absorb.

Which system is “superior” under the 60% ruin rule?

The question: with a 60% ruin line and 1 contract per trade, does Monte Carlo reveal superiority?

Yes — because it reframes superiority as:

Which system survives at realistic starting equity levels with tolerable drawdowns?

Under-capitalized start: both are dangerous

At $25k, both systems are dangerous under the 60% ruin definition:

  • MNQ MR: 30% ruin, 42.9% median DD
  • Crude TF: 68% ruin, 89.7% median DD

So if someone insists on $25k and 1 contract, System #1 is clearly less fragile than System #2.

Once you move into realistic capital, System #1 stabilizes sooner

MNQ MR drops into “sane” ruin probabilities faster:

  • MNQ MR is down to 1% ruin at $50,000 and reaches 0% by $62,500
  • Crude TF never fully calms down in the range tested: ruin is still 16% at $75,000 and 11% at $87,500

That’s not a judgment against trend following — it’s a reminder that instrument volatility matters and crude can be a different animal.

If you define “superior” as best risk-adjusted scaling

Based on the tables above:

  • MNQ mean reversion looks easier to scale under these assumptions
  • Crude trend following can still be very viable, but it demands more capitalization to get into the same comfort zone

Monte Carlo didn’t make either system “good” or “bad.”
It made the capital requirements and sequence risk visible.

The visuals I include from the Streamlit app

System #1 Mean Reversion on the MNQ

The Journey (“broom chart”)
Shows the median equity path with a confidence band.
Great for communicating “how rough can the ride get?”

Broom Chart

The Destination (ending equity histogram)
Think of each green bar as a bucket of endings. After 1,000 randomized runs, some endings cluster in the middle (the “typical” outcomes), while a smaller number land in the tails (the “lucky” and “unlucky” sequences). The dashed line marks your starting equity ($25,000). If the histogram sits mostly to the right, the system usually finishes positive. If a meaningful chunk sits to the left, that’s your “this can end down” reality—even with the same system and the same trades, just a different order.

Destination Histogram

Efficiency cloud (drawdown vs net profit scatter)

  • Each dot = one simulated run (“one life”).
  • Left to right (x-axis) = max drawdown during that run (more right = more pain).
  • Down to up (y-axis) = net profit at the end (higher = more gain).
  • The dotted lines mark the median drawdown and median profit, splitting the plot into four zones.
  • Best zone: upper-left (good profit with smaller drawdowns).
  • Worst zone: lower-right (big drawdowns with poor outcomes).

If most dots sit upper-left, the system is efficient. If the cloud spreads far right, the system’s edge may be real, but the ride can be brutal unless you reduce size or add capital.
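The four zones can be reproduced numerically. The sketch below is an illustration, not the app's code: it simulates 1,000 resampled lives, records max drawdown and net profit for each, and counts how many dots fall in each quadrant around the two medians. The trade list, seed, and the use of dollar drawdown (rather than percent) are assumptions.

```python
import random
from statistics import median

def simulate_run(trade_pnl, start_equity, rng):
    """Return (max drawdown in dollars, net profit) for one resampled life."""
    equity = peak = start_equity
    max_dd = 0.0
    for pnl in rng.choices(trade_pnl, k=len(trade_pnl)):
        equity += pnl
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
    return max_dd, equity - start_equity

# Hypothetical P/L per trade in dollars (illustration only).
trades = [900, -600, 1500, -800, 300, -450, 1100, -700, 250, -350] * 5
rng = random.Random(11)
runs = [simulate_run(trades, 25_000, rng) for _ in range(1000)]

med_dd = median(dd for dd, _ in runs)
med_profit = median(profit for _, profit in runs)

zones = {"upper-left": 0, "upper-right": 0, "lower-left": 0, "lower-right": 0}
for dd, profit in runs:
    vertical = "upper" if profit >= med_profit else "lower"
    horizontal = "left" if dd <= med_dd else "right"
    zones[f"{vertical}-{horizontal}"] += 1

print(f"median drawdown ${med_dd:,.0f}, median profit ${med_profit:,.0f}")
print(zones)
```

Because the split lines are medians, roughly half the dots always land on each side of each line. The informative part is the diagonal: a heavy lower-right count means big drawdowns and poor endings tend to arrive together.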

Efficiency Cloud

Pros and cons of Monte Carlo (in plain English)

Pros

  • It highlights sequence risk that backtests hide
  • It gives you a practical scaling map
  • It converts drawdown fear into probability
  • It forces you to confront whether your system is truly robust or just lucky

Cons

  • Garbage in, garbage out (your trade list must be clean)
  • It assumes your future trade distribution resembles the past
  • It doesn’t simulate regime shifts (it’s not a market model)
  • It can create false confidence if you treat it as prophecy

Monte Carlo is not a crystal ball. It’s a stress test.

Conclusion: Garbage or Gold?

Monte Carlo is gold when it’s used as a risk lens.

It’s garbage only when people use it as a substitute for thinking — or when they treat it as a promise about the future.

For me, the biggest takeaway from these two systems is simple:

  • A profitable system can be untradeable if it’s under-capitalized.
  • Monte Carlo makes that obvious — quickly and brutally.
  • And it gives you something most trading metrics do not:
    a realistic map from “this looks good” to “this can survive.”

If you want to know what your system really feels like under stress, run it through my free Monte Carlo Trade Flight Simulator (Streamlit). Paste your trade list, set a starting equity, and it will generate a distribution of possible equity paths—so you can see the range of outcomes, not just the single backtest line. In a minute or two you’ll know whether your strategy is sturdy (most paths survive and grow) or fragile (too many paths crater early), and you’ll get practical numbers like “typical drawdown,” “worst-case runs,” and “probability of finishing above zero.”

Trades vs. Time: Two Monte Carlo Styles

Monte Carlo has to “shuffle” something: you can shuffle trades or you can shuffle time periods (daily/weekly/monthly returns). Trade-shuffling is great for a single system because it keeps each trade intact—entry and exit stay married—so you’re mainly testing how sensitive results are to the order trades arrive.

Devil’s Advocate: shuffling time can feel less “real,” because it breaks those trade narratives. A multi-day trade becomes a series of daily fragments, and once you reshuffle daily P/L you can build equity paths that no single set of trades could have produced exactly.

That’s the tradeoff TS-PortfolioMerge makes on purpose. It builds a daily mark-to-market equity curve (open positions are revalued each day), then resamples those daily equity changes so every system stays aligned to the same calendar. This isn’t about “reinvesting” or scaling up contracts—it’s about equity path risk: the way good and bad stretches of days create drawdowns, recovery difficulty, and survival pressure for a portfolio even when trade size stays constant.
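A minimal sketch of the time-based style, assuming two systems with daily mark-to-market P/L already aligned to the same calendar. TS-PortfolioMerge's internals are not shown in this post, so the data layout, names, and values below are hypothetical. The key step is summing across systems for each day before resampling, so a day's gains and losses across the portfolio always travel together.

```python
import random

# Hypothetical daily mark-to-market P/L per system, same calendar days in order.
daily_pnl_by_system = {
    "system_a": [120.0, -40.0, 0.0, 310.0, -95.0, 60.0],
    "system_b": [-30.0, 85.0, -150.0, 40.0, 200.0, -10.0],
}

# Combine the systems day by day so each calendar day stays intact.
portfolio_daily = [sum(day) for day in zip(*daily_pnl_by_system.values())]

def resampled_path(daily_changes, start_equity, rng):
    """One portfolio equity path built from daily changes drawn with replacement."""
    path = [start_equity]
    for change in rng.choices(daily_changes, k=len(daily_changes)):
        path.append(path[-1] + change)
    return path

rng = random.Random(3)
path = resampled_path(portfolio_daily, 100_000.0, rng)
print(portfolio_daily)
print(path)
```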

 

A Turtle Thermometer for Trend-Following: 2025 Results

A Bare-Bones Turtle Algorithm for Gauging Trend-Following Conditions

The core Turtle rules were fully mechanical, but several operational choices, such as position sizing nuances, market selection, roll/contract handling, and execution practices, were left to judgment or circumstance. Many would argue the philosophy behind the Turtles mattered as much as the rules themselves, and differing interpretations of that philosophy go a long way toward explaining why their results diverged so widely. The mechanics, however, are straightforward: you can distill them from the published books and courses, strip them down even further, and apply the resulting rule set across a broad portfolio to take the pulse of trend following today.

I’ve worked with the Turtle framework for many years, coded numerous variants, and even compared notes with a handful of original Turtles. If any method can “take the temperature” of market trendiness, this one can. This system combines a shorter-term trend mechanism (one that limits execution based on the prior outcome) with a true longer-term trend-following entry and exit method (two months of data are used to determine entry). Short-term trading is difficult and often falls victim to overtrading. The shorter-term entry tries to capitalize on the genesis of a big trend. Preventing another trade after a winner is one way to reduce trading and chop. If the short-term breakout turns into a trend and entry is prevented, the 55-day breakout is there to capture it. Below are the rules I extracted to build a fully mechanical, bare-bones algorithm for that purpose.

Rules Used in This Analysis

Conventions & Definitions

  • Breakout (stop basis): Enter on a stop when price exceeds the specified lookback extreme by 1 tick (or exactly at the extreme if your platform supports that).
  • N: 20-day weighted Average True Range (WATR) used for both systems.
  • Risk stop (volatility stop): A stop placed 2×N from the entry price.
  • Swing stop: For System #1, use the 10-day highest high/lowest low; for System #2, use the 20-day highest high/lowest low.
  • Closest stop wins: The active protective stop at any time is the tighter of the risk stop and the swing stop.
  • Loser vs. non-loser (for System #1’s gating rule):
    • A trade that is stopped out by 2×N is a loser.
    • A trade that exits via the 10-day swing stop (even if it’s a loss) is not counted as a loser for gating.
    • Profitable exits via the 10-day swing stop are obviously not losers.
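The “closest stop wins” convention reduces to a one-line comparison per side. In the sketch below, the function name and the example prices are hypothetical. For a long, the tighter stop is the higher of the two candidate prices. For a short, it is the lower.

```python
def active_stop(side, entry_price, n, swing_low, swing_high):
    """Return the tighter of the 2xN risk stop and the swing stop.

    swing_low / swing_high are the lowest low and highest high over the
    system's swing lookback (10 days for System #1, 20 for System #2).
    """
    if side == "long":
        risk_stop = entry_price - 2 * n
        return max(risk_stop, swing_low)   # higher stop is closer for a long
    if side == "short":
        risk_stop = entry_price + 2 * n
        return min(risk_stop, swing_high)  # lower stop is closer for a short
    raise ValueError("side must be 'long' or 'short'")

# Hypothetical example: entry at 100.0 with N = 3.0.
print(active_stop("long", 100.0, 3.0, swing_low=92.0, swing_high=105.0))   # risk stop at 94.0 wins
print(active_stop("long", 100.0, 3.0, swing_low=97.0, swing_high=105.0))   # swing stop at 97.0 wins
print(active_stop("short", 100.0, 3.0, swing_low=92.0, swing_high=105.0))  # swing stop at 105.0 wins
```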

System #1 — 20-Day Breakout (Conditional)

Purpose: Only take the next 20-day breakout if the most recent 20-day breakout resulted in a 2×N loss.

  • Entry condition (gated):
    • Compute the 20-day Donchian channel.
    • You may only take a long (new 20-day high) or short (new 20-day low) breakout if the last System #1 trade ended with a 2×N risk stop.
    • If the last System #1 trade exited via the 10-day swing stop, it does not unlock the gate.
  • Initial protective stops (at entry):
    • Risk stop: 2×N from entry.
    • Swing stop: Opposite extreme of the past 10 days (lowest low for longs, highest high for shorts).
    • Use the tighter of the two stops at all times (“closest stop wins”).
  • Exit rules:
    • Exit if price hits the active stop (risk or swing).
  • Bookkeeping for gating:
    • If exit was the 2×N risk stop, mark the trade as a loser (this unlocks the gate for the next entry).
    • If exit was the 10-day swing stop, do not mark as a loser (gate remains locked).
  • Always evaluating: System #1’s breakout logic runs continuously, but entries are allowed only when the gate is unlocked by a prior 2×N loss.
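The gate is a small state machine, sketched below. Two points are assumptions rather than stated rules: the gate starts unlocked so the first breakout can be taken, and (following my reading of “always evaluating”) the exits of skipped, hypothetical breakouts are recorded the same way as real ones, since otherwise a locked gate could never reopen. The class and label names are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class System1Gate:
    """Tracks whether the next 20-day breakout may be taken."""
    unlocked: bool = True  # assumption: the very first breakout is allowed

    def may_enter(self) -> bool:
        return self.unlocked

    def record_exit(self, exit_reason: str) -> None:
        """Update the gate from the latest System #1 exit (real or hypothetical)."""
        if exit_reason == "risk_stop":     # 2xN stop hit: counts as a loser
            self.unlocked = True
        elif exit_reason == "swing_stop":  # 10-day swing exit: not a loser
            self.unlocked = False
        else:
            raise ValueError("exit_reason must be 'risk_stop' or 'swing_stop'")

gate = System1Gate()
for reason in ["swing_stop", "swing_stop", "risk_stop"]:
    gate.record_exit(reason)
    print(f"after {reason}: next breakout allowed = {gate.may_enter()}")
```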


System #2 — 55-Day Breakout (Always On)

Purpose: Classic trend capture that runs regardless of System #1’s state; does not affect System #1’s gating.

  • Entry condition (ungated):
    • Enter long on a 55-day high breakout; enter short on a 55-day low breakout.
  • Initial protective stops (at entry):
    • Risk stop: 2×N from entry.
    • Swing stop: Opposite extreme of the past 20 days (lowest low for longs, highest high for shorts).
    • Use the tighter of the two stops at all times.
  • Exit rules:
    • Exit if price hits the active stop (risk or swing).
  • Isolation from System #1:
    • System #2 trades and outcomes do not influence System #1’s “last-trade-was-a-loser” gate (also known as a filter).
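Both systems share the same breakout test and differ only in lookback (20 or 55 days). The sketch below checks whether the latest bar exceeds the highest high or lowest low of the prior bars. It is a simplified illustration: it works on daily highs and lows rather than intraday stop orders, ignores the 1-tick offset, and checks the long side first if an outside bar breaks both extremes. The function name and the sample prices are hypothetical.

```python
def donchian_breakout(highs, lows, lookback):
    """Return 'long', 'short', or None for the most recent bar.

    The channel is built from the `lookback` bars before the latest bar,
    so the latest bar is compared against prior extremes only.
    """
    if len(highs) != len(lows):
        raise ValueError("highs and lows must have the same length")
    if len(highs) <= lookback:
        return None  # not enough history to form the channel
    prior_high = max(highs[-lookback - 1:-1])
    prior_low = min(lows[-lookback - 1:-1])
    if highs[-1] > prior_high:
        return "long"
    if lows[-1] < prior_low:
        return "short"
    return None

# Tiny hypothetical series with a 3-bar lookback, for readability.
print(donchian_breakout([10, 11, 12, 13], [9, 10, 11, 12], lookback=3))
print(donchian_breakout([10, 9, 8, 7], [9, 8, 7, 6], lookback=3))
print(donchian_breakout([10, 11, 12, 11.5], [9, 10, 11, 10.5], lookback=3))
```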

The Portfolio

Currencies (CME FX)

Preferred name       Short   Futures ticker
Australian Dollar    AUD     @AD (6A)
British Pound        GBP     @BP (6B)
Canadian Dollar      CAD     @CD (6C)
Euro                 EUR     @EC (6E)
Japanese Yen         JPY     @JY (6J)
Swiss Franc          CHF     @SF (6S)

Rates (CBOT)

Preferred name                Short   Futures ticker
30-Year U.S. Treasury Bond    30Y     @US (ZB)
10-Year U.S. Treasury Note    10Y     @TY (ZN)
5-Year U.S. Treasury Note     5Y      @FV (ZF)

Equity/Index

Preferred name       Short   Futures ticker
E-mini S&P 500       ES      @ES (CME)
U.S. Dollar Index    DXY     @DX (ICE)

Metals (COMEX/NYMEX)

Preferred name   Short   Futures ticker
Gold             XAU     @GC
Copper           Cu      @HG
Silver           XAG     @SI
Palladium        Pd      @PA=11INC
Platinum         Pt      @PL

Energies (NYMEX)

Preferred name           Short    Futures ticker
RBOB Gasoline            RBOB     @RB
Heating Oil              HO       @HO
WTI Crude Oil            WTI      @CL
Henry Hub Natural Gas    NatGas   @NG

Grains/Oilseeds (CBOT)

Preferred name   Short   Futures ticker
Soybeans         Beans   @ZS
Corn             Corn    @ZC
Rough Rice       Rice    @ZR
Wheat (SRW)      Wheat   @ZW
Soybean Meal     Meal    @ZM

Livestock (CME)

Preferred name   Short     Futures ticker
Feeder Cattle    Feeders   @FC (GF)
Live Cattle      LiveCat   @LC (LE)
Lean Hogs        Hogs      @LH (HE)

Softs (ICE)

Preferred name                      Short    Futures ticker
Frozen Concentrated Orange Juice    OJ       @OJ
Sugar No. 11                        Sugar    @SB
Cotton No. 2                        Cotton   @CT
Coffee “C”                          Coffee   @KC
Lumber                              Lumber   @LBR (legacy: @LB)

Market Normalization (Fixed-Fractional Sizing)

To level the playing field across markets, I used fixed-fractional position sizing keyed to the Turtle “quick” 20-day ATR.

Risk budget per trade

  • Account equity = $250,000
  • Fraction at risk per trade = 2%
  • Dollar risk per trade = Fraction at risk × Account equity = 0.02 × $250,000 = $5,000

Contracts to trade

  • Let ATR = 20-day (Turtle quick) Average True Range in price units
  • Let BPV = Big Point Value (dollars per 1.0 move)
  • Dollar risk per 1 contract: ATR × BPV
  • Position size (contracts): Contracts = ⌊Dollar risk per trade / (ATR × BPV)⌋

In words: allocate $5,000 of risk to each trade and size the position by dividing that risk by the market’s expected dollar move (ATR×BPV).   Round down to an integer.

Notes & conventions
  • ATR is the 20-day Turtle quick ATR (same used in the rules).
  • Use the correct BPV for each contract (e.g., ES $50/pt, CL $1,000/pt, SI $5,000/pt).
  • Enforce at least one contract per signal.
  • $50 slippage and $10 commission per round turn.
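The sizing rule, including the one-contract minimum, can be sketched in a few lines. The BPV figures come from the note above. The ATR readings are hypothetical, chosen only to show the rounding and the floor at one contract.

```python
import math

def contracts_to_trade(equity, risk_fraction, atr, bpv, minimum=1):
    """Fixed-fractional size: floor(dollar risk / (ATR x BPV)), at least `minimum`."""
    dollar_risk = equity * risk_fraction
    risk_per_contract = atr * bpv
    if risk_per_contract <= 0:
        raise ValueError("ATR and BPV must be positive")
    return max(minimum, math.floor(dollar_risk / risk_per_contract))

EQUITY, RISK_FRACTION = 250_000, 0.02  # $5,000 of risk per trade

# (symbol, hypothetical 20-day ATR in price units, big point value in dollars)
examples = [("ES", 60.0, 50), ("CL", 1.8, 1_000), ("SI", 1.2, 5_000)]
for symbol, atr, bpv in examples:
    size = contracts_to_trade(EQUITY, RISK_FRACTION, atr, bpv)
    print(f"{symbol}: ATR x BPV = ${atr * bpv:,.0f} per contract, trade {size} contract(s)")
```

In the SI example the raw calculation rounds down to zero, so the minimum forces one contract. That one position then carries more than the $5,000 risk budget, which is worth keeping in mind when reading results for high-volatility markets.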

Results

Large portfolio performance on Bare Bones Turtle

This equity curve is typical of most trend-following systems. Big years have kept the trend-following momentum going, most recently 2008, 2010, 2014, 2018, and 2020.

Big Years – pushes the popularity of Trend Following

Many times, the futures and commodity markets are there to benefit from global events such as the banking collapse (2008) and the pandemic (2020).

Over the years, markets have fallen in and out of favor with Trend Following.  The best market over the past twenty years or so turned out to be sugar.  With its smaller size and associated volatility and trends it was the clear winner.

Many HOT SPOTS on the Correlation Heat Map

Pearson Correlation Matrix

But what about smaller accounts?

There exist sub-portfolios with better profit-to-drawdown ratios. If you could only choose ten markets and wanted to know the best combination, you can do this with my TS-PortfolioMerge software. In fact, all the metrics and images I have shown in this post were generated with TS-PortfolioMerge. If your budget only allows for 10 markets and you want to evaluate every combination, you will need to wait a while for TS-PM to run through all of them.

Search space C(37,10) = 348,330,136 (subset). Yes, that is nearly 350 million combinations. Just set up your computer for overnight processing.

But if you want a speedy answer that approximates the entire search space, you can do that as well.

Sampled (limit 50,000; randomized).
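To put the sampled mode in perspective, the sketch below counts the full search space with `math.comb` and draws 50,000 distinct random 10-market subsets. It only illustrates the sampling idea. How TS-PortfolioMerge samples and scores each subset is not shown in this post, and the 37 placeholder symbols are made up.

```python
import math
import random

markets = [f"M{i:02d}" for i in range(1, 38)]  # 37 placeholder symbols
total = math.comb(len(markets), 10)            # size of the exhaustive search

def sample_subsets(symbols, size, limit, seed=0):
    """Draw `limit` distinct random subsets of `size` symbols each."""
    if limit > math.comb(len(symbols), size):
        raise ValueError("limit exceeds the number of possible subsets")
    rng = random.Random(seed)
    seen = set()
    while len(seen) < limit:
        seen.add(tuple(sorted(rng.sample(symbols, size))))
    return seen

subsets = sample_subsets(markets, 10, 50_000)
print(f"{total:,} possible portfolios")
print(f"sampled {len(subsets):,} of them ({len(subsets) / total:.4%} of the space)")
```

A 50,000-portfolio sample covers well under a tenth of one percent of the space, which is why repeated runs may surface different winners.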

#   P/DD     Net Profit ($)   Max DD ($)   Symbols
1   12.071   1,615,758.97     133,857.91   @EC, @HG, @HO, @JY, @LB=11INC, @LH, @RB, @SB, @SM, @TY
2   11.676   1,333,359.10     114,194.85   @FC, @GC, @HO, @KC, @LB=11INC, @LBR=11INC, @LH, @OJ, @SB, @SM
3   11.563   1,473,201.00     127,410.20   @C, @CL, @EC, @ES, @GC, @HG, @HO, @LBR=11INC, @LH, @SB
4   11.545   1,601,806.50     138,745.50   @C, @CL, @CT, @GC, @HO, @LH, @RB, @SB, @SM, @US
5   11.444   1,427,082.85     124,705.40   @CL, @GC, @HO, @KC, @LB=11INC, @LBR=11INC, @LH, @OJ, @S, @SB

Run the speedy version multiple times to see if the same portfolio bubbles to the top.  If you continue getting different portfolios, you can run the exhaustive mode.  Here the best 10 markets were:

@EC, @HG, @HO, @JY, @LB=11INC, @LH, @RB, @SB, @SM, @TY

Here you have two currencies (EC and JY), one metal (HG), two energies (HO and RBOB), one interest rate (TY), sugar, lumber, soybean meal and lean hogs.  But are we guilty of cherry picking?  Maybe the Monte Carlo analysis will provide some insight.

Monte Carlo analysis of the best 10-market combination from the Speedy output.

Conclusion

Trend following as of late October 2025 is doing well and performing as expected. The pandemic pulled the algorithm out of the doldrums, and positive years have been banked since. The current year looks like the exception, but we still have two months left. With the gold move, you would think 2025 would have been a banner year. The Trend is STILL OUR FRIEND.

Email me if you would like the code for the bare-bones Turtle system I used to create all these results. The code includes the human-curated LAST TRADE WAS A LOSER function.